draft-ietf-rtcweb-security-11.txt   draft-ietf-rtcweb-security-12.txt
RTC-Web E. Rescorla RTC-Web E. Rescorla
Internet-Draft RTFM, Inc. Internet-Draft RTFM, Inc.
Intended status: Standards Track February 1, 2019 Intended status: Standards Track July 5, 2019
Expires: August 5, 2019 Expires: January 6, 2020
Security Considerations for WebRTC Security Considerations for WebRTC
draft-ietf-rtcweb-security-11 draft-ietf-rtcweb-security-12
Abstract Abstract
WebRTC is a protocol suite for use with real-time applications that WebRTC is a protocol suite for use with real-time applications that
can be deployed in browsers - "real time communication on the Web". can be deployed in browsers - "real time communication on the Web".
This document defines the WebRTC threat model and analyzes the This document defines the WebRTC threat model and analyzes the
security threats of WebRTC in that model. security threats of WebRTC in that model.
Status of This Memo Status of This Memo
skipping to change at page 1, line 33 skipping to change at page 1, line 33
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 5, 2019. This Internet-Draft will expire on January 6, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 48 skipping to change at page 2, line 48
4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 15 4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 15
4.3. Communications Security . . . . . . . . . . . . . . . . . 15 4.3. Communications Security . . . . . . . . . . . . . . . . . 15
4.3.1. Protecting Against Retrospective Compromise . . . . . 16 4.3.1. Protecting Against Retrospective Compromise . . . . . 16
4.3.2. Protecting Against During-Call Attack . . . . . . . . 17 4.3.2. Protecting Against During-Call Attack . . . . . . . . 17
4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . 17 4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . 17
4.3.2.2. Short Authentication Strings . . . . . . . . . . 18 4.3.2.2. Short Authentication Strings . . . . . . . . . . 18
4.3.2.3. Third Party Identity . . . . . . . . . . . . . . 19 4.3.2.3. Third Party Identity . . . . . . . . . . . . . . 19
4.3.2.4. Page Access to Media . . . . . . . . . . . . . . 19 4.3.2.4. Page Access to Media . . . . . . . . . . . . . . 19
4.3.3. Malicious Peers . . . . . . . . . . . . . . . . . . . 20 4.3.3. Malicious Peers . . . . . . . . . . . . . . . . . . . 20
4.4. Privacy Considerations . . . . . . . . . . . . . . . . . 20 4.4. Privacy Considerations . . . . . . . . . . . . . . . . . 20
4.4.1. Correlation of Anonymous Calls . . . . . . . . . . . 20 4.4.1. Correlation of Anonymous Calls . . . . . . . . . . . 21
4.4.2. Browser Fingerprinting . . . . . . . . . . . . . . . 21 4.4.2. Browser Fingerprinting . . . . . . . . . . . . . . . 21
5. Security Considerations . . . . . . . . . . . . . . . . . . . 21 5. Security Considerations . . . . . . . . . . . . . . . . . . . 21
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
8. Changes Since -04 . . . . . . . . . . . . . . . . . . . . . . 21 8. Changes Since -04 . . . . . . . . . . . . . . . . . . . . . . 21
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 22 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 22
9.1. Normative References . . . . . . . . . . . . . . . . . . 22 9.1. Normative References . . . . . . . . . . . . . . . . . . 22
9.2. Informative References . . . . . . . . . . . . . . . . . 22 9.2. Informative References . . . . . . . . . . . . . . . . . 22
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 25 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 25
skipping to change at page 3, line 45 skipping to change at page 3, line 45
| Browser |<---------->| Browser | | Browser |<---------->| Browser |
| | | | | | | |
+-----------+ +-----------+ +-----------+ +-----------+
Alice Bob Alice Bob
Figure 1: A simple WebRTC system Figure 1: A simple WebRTC system
In the system shown in Figure 1, Alice and Bob both have WebRTC- In the system shown in Figure 1, Alice and Bob both have WebRTC-
enabled browsers and they visit some Web server which operates a enabled browsers and they visit some Web server which operates a
calling service. Each of their browsers exposes standardized calling service. Each of their browsers exposes standardized
JavaScript calling APIs (implementated as browser built-ins) which JavaScript calling APIs (implemented as browser built-ins) which are
are used by the Web server to set up a call between Alice and Bob. used by the Web server to set up a call between Alice and Bob. The
The Web server also serves as the signaling channel to transport Web server also serves as the signaling channel to transport control
control messages between the browsers. While this system is messages between the browsers. While this system is topologically
topologically similar to a conventional SIP-based system (with the similar to a conventional SIP-based system (with the Web server
Web server acting as the signaling service and browsers acting as acting as the signaling service and browsers acting as softphones),
softphones), control has moved to the central Web server; the browser control has moved to the central Web server; the browser simply
simply provides API points that are used by the calling service. As provides API points that are used by the calling service. As with
with any Web application, the Web server can move logic between the any Web application, the Web server can move logic between the server
server and JavaScript in the browser, but regardless of where the and JavaScript in the browser, but regardless of where the code is
code is executing, it is ultimately under control of the server. executing, it is ultimately under control of the server.
It should be immediately apparent that this type of system poses new It should be immediately apparent that this type of system poses new
security challenges beyond those of a conventional VoIP system. In security challenges beyond those of a conventional VoIP system. In
particular, it needs to contend with malicious calling services. For particular, it needs to contend with malicious calling services. For
example, if the calling service can cause the browser to make a call example, if the calling service can cause the browser to make a call
at any time to any callee of its choice, then this facility can be at any time to any callee of its choice, then this facility can be
used to bug a user's computer without their knowledge, simply by used to bug a user's computer without their knowledge, simply by
placing a call to some recording service. More subtly, if the placing a call to some recording service. More subtly, if the
exposed APIs allow the server to instruct the browser to send exposed APIs allow the server to instruct the browser to send
arbitrary content, then they can be used to bypass firewalls or mount arbitrary content, then they can be used to bypass firewalls or mount
skipping to change at page 5, line 8 skipping to change at page 5, line 8
For instance, an attacker can purchase display advertisements which For instance, an attacker can purchase display advertisements which
direct the user (either automatically or via user clicking) to their direct the user (either automatically or via user clicking) to their
site, at which point the browser will execute the attacker's scripts. site, at which point the browser will execute the attacker's scripts.
Thus, it is important that it be safe to view arbitrarily malicious Thus, it is important that it be safe to view arbitrarily malicious
pages. Of course, browsers inevitably have bugs which cause them to pages. Of course, browsers inevitably have bugs which cause them to
fall short of this goal, but any new WebRTC functionality must be fall short of this goal, but any new WebRTC functionality must be
designed with the intent to meet this standard. The remainder of designed with the intent to meet this standard. The remainder of
this section provides more background on the existing Web security this section provides more background on the existing Web security
model. model.
In this model, then, the browser acts as a TRUSTED COMPUTING BASE In this model, then, the browser acts as a Trusted Computing Base
(TCB) both from the user's perspective and to some extent from the (TCB) both from the user's perspective and to some extent from the
server's. While HTML and JavaScript (JS) provided by the server can server's. While HTML and JavaScript (JS) provided by the server can
cause the browser to execute a variety of actions, those scripts cause the browser to execute a variety of actions, those scripts
operate in a sandbox that isolates them both from the user's computer operate in a sandbox that isolates them both from the user's computer
and from each other, as detailed below. and from each other, as detailed below.
Conventionally, we refer to either WEB ATTACKERS, who are able to Conventionally, we refer to either web attackers, who are able to
induce you to visit their sites but do not control the network, and induce you to visit their sites but do not control the network, and
NETWORK ATTACKERS, who are able to control your network. Network network attackers, who are able to control your network. Network
attackers correspond to the [RFC3552] "Internet Threat Model". Note attackers correspond to the [RFC3552] "Internet Threat Model". Note
that for non-HTTPS traffic, a network attacker is also a Web that in some cases, a network attacker is also a web attacker, since
attacker, since it can inject traffic as if it were any non-HTTPS Web transport protocols that do not provide integrity protection allow
site. Thus, when analyzing HTTP connections, we must assume that the network to inject traffic as if it were any communications
traffic is going to the attacker. peer. TLS, and HTTPS in particular, protect against these attacks,
but when analyzing HTTP connections, we must assume that traffic is
going to the attacker.
3.1. Access to Local Resources 3.1. Access to Local Resources
While the browser has access to local resources such as keying While the browser has access to local resources such as keying
material, files, the camera, and the microphone, it strictly limits material, files, the camera, and the microphone, it strictly limits
or forbids web servers from accessing those same resources. For or forbids web servers from accessing those same resources. For
instance, while it is possible to produce an HTML form which will instance, while it is possible to produce an HTML form which will
allow file upload, a script cannot do so without user consent and in allow file upload, a script cannot do so without user consent and in
fact cannot even suggest a specific file (e.g., /etc/passwd); the fact cannot even suggest a specific file (e.g., /etc/passwd); the
user must explicitly select the file and consent to its upload. user must explicitly select the file and consent to its upload.
[Note: in many cases browsers are explicitly designed to avoid [Note: in many cases browsers are explicitly designed to avoid
dialogs with the semantics of "click here to bypass security checks", dialogs with the semantics of "click here to bypass security checks",
as extensive research shows that users are prone to consent under as extensive research [cranor-wolf] shows that users are prone to
such circumstances.] consent under such circumstances.]
Similarly, while Flash programs (SWFs) [SWF] can access the camera Similarly, while Flash programs (SWFs) [SWF] can access the camera
and microphone, they explicitly require that the user consent to that and microphone, they explicitly require that the user consent to that
access. In addition, some resources simply cannot be accessed from access. In addition, some resources simply cannot be accessed from
the browser at all. For instance, there is no real way to run the browser at all. For instance, there is no real way to run
specific executables directly from a script (though the user can of specific executables directly from a script (though the user can of
course be induced to download executable files and run them). course be induced to download executable files and run them).
3.2. Same-Origin Policy 3.2. Same-Origin Policy
Many other resources are accessible but isolated. For instance, Many other resources are accessible but isolated. For instance,
while scripts are allowed to make HTTP requests via the while scripts are allowed to make HTTP requests via the
XMLHttpRequest() API (see [XmlHttpRequest]) those requests are not XMLHttpRequest() API (see [XmlHttpRequest]) those requests are not
allowed to be made to any server, but rather solely to the same allowed to be made to any server, but rather solely to the same
ORIGIN from whence the script came [RFC6454] (although CORS [CORS] ORIGIN from whence the script came [RFC6454] (although CORS [CORS]
and WebSockets [RFC6455] provide a escape hatch from this and WebSockets [RFC6455] provide an escape hatch from this
restriction, as described below.) This SAME ORIGIN POLICY (SOP) restriction, as described below.) This SAME ORIGIN POLICY (SOP)
prevents server A from mounting attacks on server B via the user's prevents server A from mounting attacks on server B via the user's
browser, which protects both the user (e.g., from misuse of his browser, which protects both the user (e.g., from misuse of his
credentials) and the server B (e.g., from DoS attack). credentials) and the server B (e.g., from DoS attack).
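As a rough illustration of the comparison that underlies the same-origin policy, the sketch below reduces a URL to a (scheme, host, port) triple in the spirit of [RFC6454] and compares two such triples. The function names and the default-port table are assumptions made for this example; they are not taken from any browser implementation.

```python
from urllib.parse import urlsplit

# Assumed default ports for the schemes used in this example.
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}


def origin_of(url):
    """Return a (scheme, host, port) triple for the given URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs reduce to the same origin triple."""
    return origin_of(url_a) == origin_of(url_b)
```

Under this sketch, https://example.com/a and https://example.com:443/b share an origin, while the http and https variants of the same host do not.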
More generally, SOP forces scripts from each site to run in their More generally, SOP forces scripts from each site to run in their
own, isolated, sandboxes. While there are techniques to allow them own, isolated, sandboxes. While there are techniques to allow them
to interact, those interactions generally must be mutually consensual to interact, those interactions generally must be mutually consensual
(by each site) and are limited to certain channels. For instance, (by each site) and are limited to certain channels. For instance,
multiple pages/browser panes from the same origin can read each multiple pages/browser panes from the same origin can read each
skipping to change at page 7, line 36 skipping to change at page 7, line 38
to Entity A (e.g., your mother). to Entity A (e.g., your mother).
2. Entity A (e.g., a calling service) asks to access the user's 2. Entity A (e.g., a calling service) asks to access the user's
devices with the assurance that it will transfer the media to devices with the assurance that it will transfer the media to
entity B (e.g., your mother) entity B (e.g., your mother)
In either case, identity is at the heart of any consent decision. In either case, identity is at the heart of any consent decision.
Moreover, the identity of the party the browser is connecting to is Moreover, the identity of the party the browser is connecting to is
all that the browser can meaningfully enforce; if you are calling A, all that the browser can meaningfully enforce; if you are calling A,
A can simply forward the media to C. Similarly, if you authorize A A can simply forward the media to C. Similarly, if you authorize A
to place a call to B, A can call C instead. In either case, all the to place a call to B, A can call C instead. In either case, all the
browser is able to do is verify and check authorization for whoever browser is able to do is verify and check authorization for whoever
is controlling where the media goes. The target of the media can of is controlling where the media goes. The target of the media can of
course advertise a security/privacy policy, but this is not something course advertise a security/privacy policy, but this is not something
that the browser can enforce. Even so, there are a variety of that the browser can enforce. Even so, there are a variety of
different consent scenarios that motivate different technical consent different consent scenarios that motivate different technical consent
mechanisms. We discuss these mechanisms in the sections below. mechanisms. We discuss these mechanisms in the sections below.
It's important to understand that consent to access local devices is It's important to understand that consent to access local devices is
largely orthogonal to consent to transmit various kinds of data over largely orthogonal to consent to transmit various kinds of data over
the network (see Section 4.2). Consent for device access is largely the network (see Section 4.2). Consent for device access is largely
skipping to change at page 10, line 7 skipping to change at page 10, line 7
Section 3.1, great care must be taken in the design of this interface Section 3.1, great care must be taken in the design of this interface
to avoid the users just clicking through. Note also that the user to avoid the users just clicking through. Note also that the user
interface chrome, which is the representation through which the user interface chrome, which is the representation through which the user
interacts with the user agent itself, must clearly display elements interacts with the user agent itself, must clearly display elements
showing that the call is continuing in order to avoid attacks where showing that the call is continuing in order to avoid attacks where
the calling site just leaves it up indefinitely but shows a Web UI the calling site just leaves it up indefinitely but shows a Web UI
that implies otherwise. that implies otherwise.
4.1.3. Origin-Based Security 4.1.3. Origin-Based Security
Now that we have seen another use case, we can start to reason about Now that we have described the calling scenarios, we can start to
the security requirements. reason about the security requirements.
As discussed in Section 3.2, the basic unit of Web sandboxing is the As discussed in Section 3.2, the basic unit of Web sandboxing is the
origin, and so it is natural to scope consent to origin. origin, and so it is natural to scope consent to origin.
Specifically, a script from origin A MUST only be allowed to initiate Specifically, a script from origin A MUST only be allowed to initiate
communications (and hence to access camera and microphone) if the communications (and hence to access camera and microphone) if the
user has specifically authorized access for that origin. It is of user has specifically authorized access for that origin. It is of
course technically possible to have coarser-scoped permissions, but course technically possible to have coarser-scoped permissions, but
because the Web model is scoped to origin, this creates a difficult because the Web model is scoped to origin, this creates a difficult
mismatch. mismatch.
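One way to picture origin-scoped consent is as a permission table keyed by origin. The following is a minimal sketch under that assumption; the class name, the method names, and the use of a serialized origin string as the key are all hypothetical and chosen only for illustration.

```python
class OriginConsentStore:
    """Tracks which origins the user has authorized for which devices."""

    def __init__(self):
        # Set of (origin, device) pairs the user has explicitly granted.
        self._granted = set()

    def grant(self, origin, device):
        self._granted.add((origin, device))

    def revoke(self, origin, device):
        self._granted.discard((origin, device))

    def may_access(self, origin, device):
        # Consent for one origin never carries over to another origin.
        return (origin, device) in self._granted
```

A grant for https://calling-service.example.com does not allow a script from any other origin to reach the camera, which is the property the paragraph above asks for.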
skipping to change at page 11, line 48 skipping to change at page 11, line 48
network (e.g., a hotspot or if my own home wireless network is network (e.g., a hotspot or if my own home wireless network is
insecure), and browse any HTTP site, then an attacker can bug my insecure), and browse any HTTP site, then an attacker can bug my
computer. The attack proceeds like this: computer. The attack proceeds like this:
1. I connect to http://anything.example.org/. Note that this site is 1. I connect to http://anything.example.org/. Note that this site is
unaffiliated with the calling service. unaffiliated with the calling service.
2. The attacker modifies my HTTP connection to inject an IFRAME (or 2. The attacker modifies my HTTP connection to inject an IFRAME (or
a redirect) to http://calling-service.example.com a redirect) to http://calling-service.example.com
3. The attacker forges the response apparently http://calling- 3. The attacker forges the response from http://calling-
service.example.com/ to inject JS to initiate a call to himself. service.example.com/ to inject JS to initiate a call to himself.
Note that this attack does not depend on the media being insecure. Note that this attack does not depend on the media being insecure.
Because the call is to the attacker, it is also encrypted to him. Because the call is to the attacker, it is also encrypted to him.
Moreover, it need not be executed immediately; the attacker can Moreover, it need not be executed immediately; the attacker can
"infect" the origin semi-permanently (e.g., with a web worker or a "infect" the origin semi-permanently (e.g., with a web worker or a
popped-up window that is hidden under the main window) and thus be popped-up window that is hidden under the main window) and thus be
able to bug me long after I have left the infected network. This able to bug me long after I have left the infected network. This
risk is created by allowing calls at all from a page fetched over risk is created by allowing calls at all from a page fetched over
HTTP. HTTP.
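The attack above works because a page fetched over HTTP can be rewritten in flight. As a sketch of one possible mitigation (an assumption of this example, not a requirement stated in this section), a browser could refuse to start calls from pages that were not fetched over HTTPS, whatever consent is on record. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def fetched_securely(page_url):
    """True when the page URL uses the https scheme."""
    return urlsplit(page_url).scheme.lower() == "https"


def may_start_call(page_url, user_consented):
    # Stored consent is honored only for pages fetched over HTTPS, since a
    # network attacker can forge the content of any HTTP page.
    return fetched_securely(page_url) and user_consented
```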
Even if calls are only possible from HTTPS [RFC2818] sites, if those Even if calls are only possible from HTTPS [RFC2818] sites, if those
sites include active content (e.g., JavaScript) from an untrusted sites include active content (e.g., JavaScript) from an untrusted
site, that JavaScript is executed in the security context of the page site, that JavaScript is executed in the security context of the page
[finer-grained]. This could lead to compromise of a call even if the [finer-grained]. This could lead to compromise of a call even if the
parent page is safe. Note: this issue is not restricted to PAGES parent page is safe. Note: this issue is not restricted to PAGES
which contain untrusted content. If a page from a given origin ever which contain untrusted content. If any page from a given origin
loads JavaScript from an attacker, then it is possible for that ever loads JavaScript from an attacker, then it is possible for that
attacker to infect the browser's notion of that origin semi- attacker to infect the browser's notion of that origin semi-
permanently. permanently.
4.2. Communications Consent Verification 4.2. Communications Consent Verification
As discussed in Section 3.3, allowing web applications unrestricted As discussed in Section 3.3, allowing web applications unrestricted
network access via the browser introduces the risk of using the network access via the browser introduces the risk of using the
browser as an attack platform against machines which would not browser as an attack platform against machines which would not
otherwise be accessible to the malicious site, for instance because otherwise be accessible to the malicious site, for instance because
they are topologically restricted (e.g., behind a firewall or NAT). they are topologically restricted (e.g., behind a firewall or NAT).
In order to prevent this form of attack as well as cross-protocol In order to prevent this form of attack as well as cross-protocol
attacks it is important to require that the target of traffic attacks it is important to require that the target of traffic
explicitly consent to receiving the traffic in question. Until that explicitly consent to receiving the traffic in question. Until that
consent has been verified for a given endpoint, traffic other than consent has been verified for a given endpoint, traffic other than
the consent handshake MUST NOT be sent to that endpoint. the consent handshake MUST NOT be sent to that endpoint.
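The rule above can be sketched as a gate in front of the sending path: handshake packets always pass, everything else passes only for endpoints whose consent has been verified. The class and method names are hypothetical, and representing an endpoint as an (address, port) tuple is an assumption of this example.

```python
class ConsentGate:
    """Blocks non-handshake traffic to endpoints that have not consented."""

    def __init__(self):
        # Endpoints, as (address, port) tuples, whose consent was verified.
        self._verified = set()

    def mark_verified(self, endpoint):
        self._verified.add(endpoint)

    def may_send(self, endpoint, is_consent_handshake):
        # The consent handshake itself is the only traffic allowed before
        # consent has been verified for this endpoint.
        return is_consent_handshake or endpoint in self._verified
```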
Note that consent verification is not sufficient to prevent overuse Note that consent verification is not sufficient to prevent overuse
of network resources. Because WebRTC allows for a Web site to create of network resources. Because WebRTC allows for a Web site to create
data flows between two browser instances without user consent, it is data flows between two browser instances without user consent, it is
possible for a malicious site to chew up a signficant amount of a possible for a malicious site to chew up a significant amount of a
user's bandwidth without incurring significant costs to himself by user's bandwidth without incurring significant costs to himself by
setting up such a channel to another user. However, as a practical setting up such a channel to another user. However, as a practical
matter there are a large number of Web sites which can act as data matter there are a large number of Web sites which can act as data
sources, so an attacker can at least use downlink bandwidth with sources, so an attacker can at least use downlink bandwidth with
existing Web APIs. However, this potential DoS vector reinforces the existing Web APIs. However, this potential DoS vector reinforces the
need for adequate congestion control for WebRTC protocols to ensure need for adequate congestion control for WebRTC protocols to ensure
that they play fair with other demands on the user's bandwidth. that they play fair with other demands on the user's bandwidth.
4.2.1. ICE 4.2.1. ICE
Verifying receiver consent requires some sort of explicit handshake, Verifying receiver consent requires some sort of explicit handshake,
but conveniently we already need one in order to do NAT hole- but conveniently we already need one in order to do NAT hole-
punching. ICE [RFC8445] includes a handshake designed to verify that punching. Interactive Connectivity Establishment (ICE) [RFC8445]
the receiving element wishes to receive traffic from the sender. It includes a handshake designed to verify that the receiving element
is important to remember here that the site initiating ICE is wishes to receive traffic from the sender. It is important to
presumed malicious; in order for the handshake to be secure the remember here that the site initiating ICE is presumed malicious; in
receiving element MUST demonstrate receipt/knowledge of some value order for the handshake to be secure the receiving element MUST
not available to the site (thus preventing the site from forging demonstrate receipt/knowledge of some value not available to the site
responses). In order to achieve this objective with ICE, the STUN (thus preventing the site from forging responses). In order to
transaction IDs must be generated by the browser and MUST NOT be made achieve this objective with ICE, the STUN transaction IDs must be
available to the initiating script, even via a diagnostic interface. generated by the browser and MUST NOT be made available to the
Verifying receiver consent also requires verifying the receiver wants initiating script, even via a diagnostic interface. Verifying
to receive traffic from a particular sender, and at this time; for receiver consent also requires verifying the receiver wants to
receive traffic from a particular sender, and at this time; for
example a malicious site may simply attempt ICE to known servers that example a malicious site may simply attempt ICE to known servers that
are using ICE for other sessions. ICE provides this verification as are using ICE for other sessions. ICE provides this verification as
well, by using the STUN credentials as a form of per-session shared well, by using the STUN credentials as a form of per-session shared
secret. Those credentials are known to the Web application, but secret. Those credentials are known to the Web application, but
would need to also be known and used by the STUN-receiving element to would need to also be known and used by the STUN-receiving element to
be useful. be useful.
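The transaction ID requirement can be illustrated as follows: the browser draws a fresh 96-bit STUN transaction ID from a cryptographic random source, keeps it out of reach of the page's script, and accepts a response only if it echoes that ID. This is a minimal sketch; the class and method names are hypothetical, and real STUN message parsing and credential checks are omitted.

```python
import hmac
import secrets


def new_transaction_id():
    """Return a fresh 96-bit (12-byte) unpredictable transaction ID."""
    return secrets.token_bytes(12)


class PendingChecks:
    """Browser-internal table of outstanding connectivity checks."""

    def __init__(self):
        self._pending = {}  # endpoint -> transaction ID awaiting a response

    def start(self, endpoint):
        tid = new_transaction_id()
        self._pending[endpoint] = tid
        # The ID is returned for placement on the wire only; in this sketch
        # it is never handed to the initiating script.
        return tid

    def accept_response(self, endpoint, tid_in_response):
        expected = self._pending.get(endpoint)
        if expected is None:
            return False
        if not hmac.compare_digest(expected, tid_in_response):
            return False
        del self._pending[endpoint]  # each ID verifies at most one response
        return True
```

Because the script never sees the ID, a site that cannot observe the network path has no way to forge a matching response.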
There also needs to be some mechanism for the browser to verify that There also needs to be some mechanism for the browser to verify that
the target of the traffic continues to wish to receive it. Because the target of the traffic continues to wish to receive it. Because
ICE keepalives are indications, they will not work here. [RFC7675] ICE keepalives are indications, they will not work here. [RFC7675]
skipping to change at page 15, line 11 skipping to change at page 15, line 11
STUN servers without the risk of confusing them with legacy STUN STUN servers without the risk of confusing them with legacy STUN
servers. If a non-ICE legacy solution is needed, then this is servers. If a non-ICE legacy solution is needed, then this is
probably the best choice. probably the best choice.
Once initial consent is verified, we also need to verify continuing Once initial consent is verified, we also need to verify continuing
consent, in order to avoid attacks where two people briefly share an consent, in order to avoid attacks where two people briefly share an
IP (e.g., behind a NAT in an Internet cafe) and the attacker arranges IP (e.g., behind a NAT in an Internet cafe) and the attacker arranges
for a large, unstoppable, traffic flow to the network and then for a large, unstoppable, traffic flow to the network and then
leaves. The appropriate technologies here are fairly similar to leaves. The appropriate technologies here are fairly similar to
those for initial consent, though are perhaps weaker since the those for initial consent, though are perhaps weaker since the
threats is less severe. threats are less severe.
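Continuing consent can be pictured as a freshness timer: each validated response from the peer renews consent, and sending stops once no such response has arrived within a timeout. The sketch below assumes a 30-second timeout purely for illustration; the class name, the method names, and the caller-supplied clock are likewise assumptions of this example.

```python
class ConsentFreshness:
    """Tracks whether a peer's consent to receive traffic is still fresh."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s  # illustrative value, see lead-in
        self._last_ok = None        # time of the last validated response

    def on_valid_response(self, now):
        self._last_ok = now

    def may_send(self, now):
        # No validated response yet, or the last one is too old: stop.
        if self._last_ok is None:
            return False
        return (now - self._last_ok) < self.timeout_s
```

In the Internet cafe scenario above, the flow stops on its own once the departed attacker's peer is no longer answering on behalf of the shared address.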
4.2.4. IP Location Privacy 4.2.4. IP Location Privacy
Note that as soon as the callee sends their ICE candidates, the Note that as soon as the callee sends their ICE candidates, the
caller learns the callee's IP addresses. The callee's server caller learns the callee's IP addresses. The callee's server
reflexive address reveals a lot of information about the callee's reflexive address reveals a lot of information about the callee's
location. In order to avoid tracking, implementations may wish to location. In order to avoid tracking, implementations may wish to
suppress the start of ICE negotiation until the callee has answered. suppress the start of ICE negotiation until the callee has answered.
In addition, either side may wish to hide their location from the In addition, either side may wish to hide their location from the
other side entirely by forcing all traffic through a TURN server. other side entirely by forcing all traffic through a TURN server.
In ordinary operation, the site learns the browser's IP address, In ordinary operation, the site learns the browser's IP address,
though it may be hidden via mechanisms like Tor though it may be hidden via mechanisms like Tor
[http://www.torproject.org] or a VPN. However, because sites can [http://www.torproject.org] or a VPN. However, because sites can
cause the browser to provide IP addresses, this provides a mechanism cause the browser to provide IP addresses, this provides a mechanism
for sites to learn about the user's network environment even if the for sites to learn about the user's network environment even if the
user is behind a VPN that masks their IP address. Implementations user is behind a VPN that masks their IP address. Implementations
may wish to provide settings which suppress all non-VPN candidates if may wish to provide settings which suppress all non-VPN candidates if
the user is on certain kinds of VPN, especially privacy-oriented the user is on certain kinds of VPN, especially privacy-oriented
systems such as Tor. systems such as Tor. See [I-D.ietf-rtcweb-ip-handling] for
additional information.
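
As a rough illustration of the TURN-only approach described above, the following Python sketch filters a list of ICE candidate lines so that only relayed candidates remain. The helper names (candidate_type, relay_only) and the sample candidates are hypothetical and use documentation address ranges. The sketch assumes only that each candidate line carries a "typ" token followed by one of the ICE candidate types (host, srflx, prflx, relay).

```python
def candidate_type(line):
    """Return the value following the 'typ' token, or None if absent."""
    tokens = line.split()
    if "typ" in tokens:
        idx = tokens.index("typ")
        if idx + 1 < len(tokens):
            return tokens[idx + 1]
    return None


def relay_only(candidates):
    """Keep only candidates whose type is 'relay' (allocated on a TURN server)."""
    return [c for c in candidates if candidate_type(c) == "relay"]


# Hypothetical candidates; addresses come from documentation ranges.
SAMPLE = [
    "candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host",
    "candidate:2 1 udp 1686052607 198.51.100.7 61000 typ srflx "
    "raddr 192.0.2.10 rport 54321",
    "candidate:3 1 udp 41885439 203.0.113.5 3478 typ relay "
    "raddr 198.51.100.7 rport 61000",
]
```

Applied to SAMPLE, relay_only keeps only the third entry, so the peer sees the TURN server's address rather than the host or server reflexive address. Note that the relayed candidate above still carries a related address (raddr), which an implementation aiming to hide location would presumably also need to suppress.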
4.3. Communications Security 4.3. Communications Security
Finally, we consider a problem familiar from the SIP world: Finally, we consider a problem familiar from the SIP world:
communications security. For obvious reasons, it MUST be possible communications security. For obvious reasons, it MUST be possible
for the communicating parties to establish a channel which is secure for the communicating parties to establish a channel which is secure
against both message recovery and message modification. (See against both message recovery and message modification. (See
[RFC5479] for more details.) This service must be provided for both [RFC5479] for more details.) This service must be provided for both
data and voice/video. Ideally the same security mechanisms would be data and voice/video. Ideally the same security mechanisms would be
used for both types of content. Technology for providing this used for both types of content. Technology for providing this
skipping to change at page 16, line 20 skipping to change at page 16, line 21
rendered by the browser but under control of the calling service. rendered by the browser but under control of the calling service.
This likely includes the peer's identity information, which, after This likely includes the peer's identity information, which, after
all, is only meaningful in the context of some calling service. all, is only meaningful in the context of some calling service.
This limitation does not mean that preventing attack by the calling This limitation does not mean that preventing attack by the calling
service is completely hopeless. However, we need to distinguish service is completely hopeless. However, we need to distinguish
between two classes of attack: between two classes of attack:
Retrospective compromise of calling service. Retrospective compromise of calling service.
The calling service is is non-malicious during a call but The calling service is non-malicious during a call but
subsequently is compromised and wishes to attack an older call subsequently is compromised and wishes to attack an older call
(often called a "passive attack"). (often called a "passive attack").
During-call attack by calling service. During-call attack by calling service.
The calling service is compromised during the call it wishes to The calling service is compromised during the call it wishes to
attack (often called an "active attack"). attack (often called an "active attack").
Providing security against the former type of attack is practical Providing security against the former type of attack is practical
using the techniques discussed in Section 4.3.1. However, it is using the techniques discussed in Section 4.3.1. However, it is
skipping to change at page 18, line 20 skipping to change at page 18, line 21
Moreover, it is trivial to bypass even this kind of mechanism. Moreover, it is trivial to bypass even this kind of mechanism.
Recall that unlike the case of SSH, the browser never directly gets Recall that unlike the case of SSH, the browser never directly gets
the peer's identity from the user. Rather, it is provided by the the peer's identity from the user. Rather, it is provided by the
calling service. Even enabling a mechanism of this type would calling service. Even enabling a mechanism of this type would
require an API to allow the calling service to tell the browser "this require an API to allow the calling service to tell the browser "this
is a call to user X". All the calling service needs to do to avoid is a call to user X". All the calling service needs to do to avoid
triggering a key continuity warning is to tell the browser that "this triggering a key continuity warning is to tell the browser that "this
is a call to user Y" where Y is confusable with X. Even if the user is a call to user Y" where Y is confusable with X. Even if the user
actually checks the other side's name (which all available evidence actually checks the other side's name (which all available evidence
indicates is unlikely), this would require (a) the browser to trusted indicates is unlikely), this would require (a) the browser to use the
UI to provide the name and (b) the user to not be fooled by similar trusted UI to provide the name and (b) the user to not be fooled by
appearing names. similar appearing names.
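
The bypass described above can be sketched in a few lines of Python. This is a minimal trust-on-first-use model, not any browser's actual mechanism: the class KeyContinuityStore, its check method, and the names and fingerprints used below are all hypothetical.

```python
class KeyContinuityStore:
    """Remembers the first key fingerprint seen for each claimed name."""

    def __init__(self):
        self._pins = {}

    def check(self, claimed_name, fingerprint):
        """Return 'new', 'match', or 'mismatch' for a claimed identity.

        The name is supplied by the calling service, not by the user,
        which is the weakness the surrounding text points out.
        """
        pinned = self._pins.get(claimed_name)
        if pinned is None:
            self._pins[claimed_name] = fingerprint
            return "new"
        return "match" if pinned == fingerprint else "mismatch"


store = KeyContinuityStore()
first_call = store.check("alice@example.com", "AA:AA")    # 'new'
repeat_call = store.check("alice@example.com", "AA:AA")   # 'match'
honest_mitm = store.check("alice@example.com", "EE:EE")   # 'mismatch'
# A calling service that lies about the name sidesteps the warning:
lookalike = store.check("a1ice@example.com", "EE:EE")     # 'new'
```

The last call is the point: with a confusable name, the attacker's key is treated as a first contact with a new peer and no continuity warning is raised.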
4.3.2.2. Short Authentication Strings 4.3.2.2. Short Authentication Strings
ZRTP [RFC6189] uses a "short authentication string" (SAS) which is ZRTP [RFC6189] uses a "short authentication string" (SAS) which is
derived from the key agreement protocol. This SAS is designed to be derived from the key agreement protocol. This SAS is designed to be
compared by the users (e.g., read aloud over the the voice channel or compared by the users (e.g., read aloud over the voice channel or
transmitted via an out of band channel) and if confirmed by both transmitted via an out of band channel) and if confirmed by both
sides precludes MITM attack. The intention is that the SAS is used sides precludes MITM attack. The intention is that the SAS is used
once and then key continuity (though a different mechanism from that once and then key continuity (though a different mechanism from that
discussed above) is used thereafter. discussed above) is used thereafter.
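
To make the idea concrete, here is a simplified Python sketch of deriving a short string from shared key material. It is not the ZRTP derivation from [RFC6189]; the function name, the label, the use of HMAC-SHA-256, and the four-digit rendering are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def short_auth_string(shared_secret: bytes, digits: int = 4) -> str:
    """Derive a short decimal string from shared key material.

    Both endpoints compute this locally and the users compare the
    results over the voice channel or out of band.
    """
    # Illustrative label only; a real protocol defines its own KDF input.
    mac = hmac.new(shared_secret, b"illustrative-sas", hashlib.sha256).digest()
    value = int.from_bytes(mac[:4], "big")
    return str(value % (10 ** digits)).zfill(digits)
```

Two endpoints holding the same secret obtain the same string. An attacker in the middle runs a separate key agreement with each side, so the two legs hold different secrets and, with high probability, display different strings.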
Unfortunately, the SAS does not offer a practical solution to the Unfortunately, the SAS does not offer a practical solution to the
problem of a compromised calling service. "Voice conversion" problem of a compromised calling service. "Voice conversion"
systems, which modify voice from one speaker to make it sound like systems, which modify voice from one speaker to make it sound like
another, are an active area of research. These systems are already another, are an active area of research. These systems are already
good enough to fool both automatic recognition systems good enough to fool both automatic recognition systems
[farus-conversion] and humans [kain-conversion] in many cases, and [farus-conversion] and humans [kain-conversion] in many cases, and
are of course likely to improve in future, especially in an are of course likely to improve in future, especially in an
environment where the user just wants to get on with the phone call. environment where the user just wants to get on with the phone call.
Thus, even if SAS is effective today, it is likely not to be so for Thus, even if SAS is effective today, it is likely not to be so for
much longer. much longer.
Additionally, it is unclear that users will actually use an SAS. As Additionally, it is unclear that users will actually use an SAS. As
discussed above, the browser UI constraints preclude requiring the discussed above, the browser UI constraints preclude requiring the
SAS exchange prior to completing the call and so it must be SAS exchange prior to completing the call and so it must be
voluntary; at most the browser will provide some UI indicator that voluntary; at most the browser will provide some UI indicator that
the SAS has not yet been checked. However, it it is well-known that the SAS has not yet been checked. However, it is well-known that
when faced with optional security mechanisms, many users simply when faced with optional security mechanisms, many users simply
ignore them [whitten-johnny]. ignore them [whitten-johnny].
Once users have checked the SAS once, key continuity is required to Once users have checked the SAS once, key continuity is required to
avoid them needing to check it on every call. However, this is avoid them needing to check it on every call. However, this is
problematic for reasons indicated in Section 4.3.2.1. In principle problematic for reasons indicated in Section 4.3.2.1. In principle
it is of course possible to render a different UI element to indicate it is of course possible to render a different UI element to indicate
that calls are using an unauthenticated set of keying material that calls are using an unauthenticated set of keying material
(recall that the attacker can just present a slightly different name (recall that the attacker can just present a slightly different name
so that the attack shows the same UI as a call to a new device or to so that the attack shows the same UI as a call to a new device or to
skipping to change at page 20, line 10 skipping to change at page 20, line 13
media flows are rendered into HTML5 MediaStreams which can be media flows are rendered into HTML5 MediaStreams which can be
manipulated by the calling site. Obviously, if the site can modify manipulated by the calling site. Obviously, if the site can modify
or view the media, then the user is not getting the level of or view the media, then the user is not getting the level of
assurance they would expect from being able to authenticate their assurance they would expect from being able to authenticate their
peer. In many cases, this is acceptable because the user values peer. In many cases, this is acceptable because the user values
site-based special effects over complete security from the site. site-based special effects over complete security from the site.
However, there are also cases where users wish to know that the site However, there are also cases where users wish to know that the site
cannot interfere. In order to facilitate that, it will be necessary cannot interfere. In order to facilitate that, it will be necessary
to provide features whereby the site can verifiably give up access to to provide features whereby the site can verifiably give up access to
the media streams. This verification must be possible both from the the media streams. This verification must be possible both from the
local side and the remote side. I.e., I must be able to verify that local side and the remote side. I.e., users must be able to verify
the person I am calling has engaged a secure media mode (see that the person called has engaged a secure media mode (see
Section 4.3.3). In order to achieve this it will be necessary to Section 4.3.3). In order to achieve this it will be necessary to
cryptographically bind an indication of the local media access policy cryptographically bind an indication of the local media access policy
into the cryptographic authentication procedures detailed in the into the cryptographic authentication procedures detailed in the
previous sections. previous sections.
It should be noted that the use of this secure media mode is left to
the discretion of the site. When such a mode is engaged, the browser
will need to provide indicia to the user that the associated media
has been authenticated as coming from the identified user. This
allows WebRTC services that wish to claim end-to-end security to do
so in a way that can be easily verified by the user. This model
requires that the remote party's browser be included in the TCB, as
described in Section 3.
4.3.3. Malicious Peers 4.3.3. Malicious Peers
One class of attack that we do not generally try to prevent is One class of attack that we do not generally try to prevent is
malicious peers. For instance, no matter what confidentiality malicious peers. For instance, no matter what confidentiality
measures you employ the person you are talking to might record the measures you employ the person you are talking to might record the
call and publish it on the Internet. Similarly, we do not attempt to call and publish it on the Internet. Similarly, we do not attempt to
prevent them from using voice or video processing technology for prevent them from using voice or video processing technology for
hiding or changing their appearance. While technologies (DRM, etc.) hiding or changing their appearance. While technologies (DRM, etc.)
do exist to attempt to address these issues, they are generally not do exist to attempt to address these issues, they are generally not
compatible with open systems and WebRTC does not address them. compatible with open systems and WebRTC does not address them.
skipping to change at page 20, line 50 skipping to change at page 21, line 18
threat in settings where the user wishes to be anonymous. WebRTC threat in settings where the user wishes to be anonymous. WebRTC
provides a number of possible persistent identifiers such as DTLS provides a number of possible persistent identifiers such as DTLS
certificates (if they are reused between connections) and RTCP CNAMEs certificates (if they are reused between connections) and RTCP CNAMEs
(if generated according to [RFC6222] rather than the privacy (if generated according to [RFC6222] rather than the privacy
preserving mode of [RFC7022]). In order to prevent this type of preserving mode of [RFC7022]). In order to prevent this type of
correlation, browsers need to provide mechanisms to reset these correlation, browsers need to provide mechanisms to reset these
identifiers (e.g., with the same lifetime as cookies). Moreover, the identifiers (e.g., with the same lifetime as cookies). Moreover, the
API should provide mechanisms to allow sites intended for anonymous API should provide mechanisms to allow sites intended for anonymous
calling to force the minting of fresh identifiers. In addition, IP calling to force the minting of fresh identifiers. In addition, IP
addresses can be a source of call linkage addresses can be a source of call linkage
[I-D.ietf-rtcweb-ip-handling] [I-D.ietf-rtcweb-ip-handling].
4.4.2. Browser Fingerprinting 4.4.2. Browser Fingerprinting
Any new set of API features adds a risk of browser fingerprinting, Any new set of API features adds a risk of browser fingerprinting,
and WebRTC is no exception. Specifically, sites can use the presence and WebRTC is no exception. Specifically, sites can use the presence
or absence of specific devices as a browser fingerprint. In general, or absence of specific devices as a browser fingerprint. In general,
the API needs to be balanced between functionality and the the API needs to be balanced between functionality and the
incremental fingerprint risk. See [Fingerprinting] incremental fingerprint risk. See [Fingerprinting].
5. Security Considerations 5. Security Considerations
This entire document is about security. This entire document is about security.
6. Acknowledgements 6. Acknowledgements
Bernard Aboba, Harald Alvestrand, Dan Druta, Cullen Jennings, Alan Bernard Aboba, Harald Alvestrand, Dan Druta, Cullen Jennings, Alan
Johnston, Hadriel Kaplan (S 4.2.1), Matthew Kaufman, Martin Thomson, Johnston, Hadriel Kaplan (S 4.2.1), Matthew Kaufman, Martin Thomson,
Magnus Westerlund. Magnus Westerlund.
skipping to change at page 22, line 42 skipping to change at page 23, line 10
[farus-conversion] [farus-conversion]
Farrus, M., Erro, D., and J. Hernando, "Speaker Farrus, M., Erro, D., and J. Hernando, "Speaker
Recognition Robustness to Voice Conversion", January 2008. Recognition Robustness to Voice Conversion", January 2008.
[finer-grained] [finer-grained]
Barth, A. and C. Jackson, "Beware of Finer-Grained Barth, A. and C. Jackson, "Beware of Finer-Grained
Origins", W2SP, 2008, July 2008. Origins", W2SP, 2008, July 2008.
[Fingerprinting] [Fingerprinting]
W3C, "Fingerprinting Guidance for Web Specification "Fingerprinting Guidance for Web Specification Authors
Authors (Draft)", November 2013. (Draft)", November 2013.
[huang-w2sp] [huang-w2sp]
Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C. Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C.
Jackson, "Talking to Yourself for Fun and Profit", W2SP, Jackson, "Talking to Yourself for Fun and Profit", W2SP,
2011, May 2011. 2011, May 2011.
[I-D.ietf-rtcweb-ip-handling] [I-D.ietf-rtcweb-ip-handling]
Uberti, J., "WebRTC IP Address Handling Requirements", Uberti, J., "WebRTC IP Address Handling Requirements",
draft-ietf-rtcweb-ip-handling-11 (work in progress), draft-ietf-rtcweb-ip-handling-12 (work in progress), July
November 2018. 2019.
[I-D.ietf-rtcweb-overview] [I-D.ietf-rtcweb-overview]
Alvestrand, H., "Overview: Real Time Protocols for Alvestrand, H., "Overview: Real Time Protocols for
Browser-based Applications", draft-ietf-rtcweb-overview-19 Browser-based Applications", draft-ietf-rtcweb-overview-19
(work in progress), November 2017. (work in progress), November 2017.
[I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security-arch]
Rescorla, E., "WebRTC Security Architecture", draft-ietf- Rescorla, E., "WebRTC Security Architecture", draft-ietf-
rtcweb-security-arch-17 (work in progress), November 2018. rtcweb-security-arch-18 (work in progress), February 2019.
[kain-conversion] [kain-conversion]
Kain, A. and M. Macon, "Design and Evaluation of a Voice Kain, A. and M. Macon, "Design and Evaluation of a Voice
Conversion Algorithm based on Spectral Envelope Mapping Conversion Algorithm based on Spectral Envelope Mapping
and Residual Prediction", Proceedings of ICASSP, May and Residual Prediction", Proceedings of ICASSP, May
2001, May 2001. 2001, May 2001.
[OpenID] Sakimura, N., Bradley, J., Jones, M., de Medeiros, B., and [OpenID] Sakimura, N., Bradley, J., Jones, M., de Medeiros, B., and
C. Mortimore, "Fingerprinting Guidance for Web C. Mortimore, "OpenID Connect Core 1.0", November 2014.
Specification Authors (Draft)", November 2014.
[RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818,
DOI 10.17487/RFC2818, May 2000, DOI 10.17487/RFC2818, May 2000,
<https://www.rfc-editor.org/info/rfc2818>. <https://www.rfc-editor.org/info/rfc2818>.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E. A., Peterson, J., Sparks, R., Handley, M., and E.
Schooler, "SIP: Session Initiation Protocol", RFC 3261, Schooler, "SIP: Session Initiation Protocol", RFC 3261,
DOI 10.17487/RFC3261, June 2002, DOI 10.17487/RFC3261, June 2002,
<https://www.rfc-editor.org/info/rfc3261>. <https://www.rfc-editor.org/info/rfc3261>.
skipping to change at page 25, line 25 skipping to change at page 25, line 37
Thomson, "Session Traversal Utilities for NAT (STUN) Usage Thomson, "Session Traversal Utilities for NAT (STUN) Usage
for Consent Freshness", RFC 7675, DOI 10.17487/RFC7675, for Consent Freshness", RFC 7675, DOI 10.17487/RFC7675,
October 2015, <https://www.rfc-editor.org/info/rfc7675>. October 2015, <https://www.rfc-editor.org/info/rfc7675>.
[RFC8445] Keranen, A., Holmberg, C., and J. Rosenberg, "Interactive [RFC8445] Keranen, A., Holmberg, C., and J. Rosenberg, "Interactive
Connectivity Establishment (ICE): A Protocol for Network Connectivity Establishment (ICE): A Protocol for Network
Address Translator (NAT) Traversal", RFC 8445, Address Translator (NAT) Traversal", RFC 8445,
DOI 10.17487/RFC8445, July 2018, DOI 10.17487/RFC8445, July 2018,
<https://www.rfc-editor.org/info/rfc8445>. <https://www.rfc-editor.org/info/rfc8445>.
[SWF] Adobe, "SWF File Format Specification Version 19", April [SWF] "SWF File Format Specification Version 19", April 2013.
2013.
[whitten-johnny] [whitten-johnny]
Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A
Usability Evaluation of PGP 5.0", Proceedings of the 8th Usability Evaluation of PGP 5.0", Proceedings of the 8th
USENIX Security Symposium, 1999, August 1999. USENIX Security Symposium, 1999, August 1999.
[XmlHttpRequest] [XmlHttpRequest]
van Kesteren, A., "XMLHttpRequest Level 2", January 2012. van Kesteren, A., "XMLHttpRequest Level 2", January 2012.
Author's Address Author's Address
 End of changes. 32 change blocks. 
65 lines changed or deleted 76 lines changed or added
