draft-iab-privsec-confidentiality-threat-00.txt   draft-iab-privsec-confidentiality-threat-01.txt 
Network Working Group R. Barnes Network Working Group R. Barnes
Internet-Draft Internet-Draft
Intended status: Informational B. Schneier Intended status: Informational B. Schneier
Expires: March 15, 2015 Expires: June 15, 2015
C. Jennings C. Jennings
T. Hardie T. Hardie
B. Trammell B. Trammell
C. Huitema C. Huitema
D. Borkmann D. Borkmann
September 11, 2014 December 12, 2014
Confidentiality in the Face of Pervasive Surveillance: A Threat Model Confidentiality in the Face of Pervasive Surveillance: A Threat Model
and Problem Statement and Problem Statement
draft-iab-privsec-confidentiality-threat-00 draft-iab-privsec-confidentiality-threat-01
Abstract Abstract
Documents published in 2013 have revealed several classes of Documents published in 2013 revealed several classes of pervasive
"pervasive" attack on Internet communications. In this document we surveillance attack on Internet communications. In this document we
develop a threat model that describes these pervasive attacks. We develop a threat model that describes these pervasive attacks. We
start by assuming a completely passive adversary with an interest in start by assuming a completely passive attacker with an interest in
indiscriminate eavesdropping that can observe network traffic, then undetected, indiscriminate eavesdropping, then expand the threat
expand the threat model with a set of verified attacks that have been model with a set of verified attacks that have been published. Based
published. Based on this threat model, we discuss the techniques on this threat model, we discuss the techniques that can be employed
that can be employed in Internet protocol design to increase the in Internet protocol design to increase the protocols' robustness to
protocols' robustness to pervasive attacks. pervasive surveillance.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 15, 2015. This Internet-Draft will expire on June 15, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 41 skipping to change at page 2, line 41
To ensure that the Internet can be trusted by users, it is necessary To ensure that the Internet can be trusted by users, it is necessary
for the Internet technical community to address the vulnerabilities for the Internet technical community to address the vulnerabilities
exploited in these attacks [RFC7258]. The goal of this document is exploited in these attacks [RFC7258]. The goal of this document is
to describe more precisely the threats posed by these pervasive to describe more precisely the threats posed by these pervasive
attacks, and based on those threats, lay out the problems that need attacks, and based on those threats, lay out the problems that need
to be solved in order to secure the Internet in the face of those to be solved in order to secure the Internet in the face of those
threats. threats.
The remainder of this document is structured as follows. In The remainder of this document is structured as follows. In
Section 3, we describe an idealized passive adversary, one which Section 3, we describe an idealized passive attacker, one which could
could completely undetectably compromise communications at Internet completely undetectably compromise communications at Internet scale.
scale. In Section 4, we provide a brief summary of some attacks that In Section 4, we provide a brief summary of some attacks that have
have been disclosed, and use these to expand the assumed capabilities been disclosed, and use these to expand the assumed capabilities of
of our idealized adversary. Section 5 describes a threat model based our idealized attacker. Section 5 describes a threat model based on
on these attacks, focusing on classes of attack that have not been a these attacks, focusing on classes of attack that have not been a
focus of Internet engineering to date. Section 6 provides some high- focus of Internet engineering to date.
level guidance on how Internet protocols can defend against the
threats described here.
2. Terminology 2. Terminology
This document makes extensive use of standard security and privacy This document makes extensive use of standard security and privacy
terminology; see [RFC4949] and [RFC6973]. In addition, we use a few terminology; see [RFC4949] and [RFC6973]. Terms used from [RFC6973]
terms that are specific to the attacks discussed here: include Eavesdropper, Observer, Initiator, Intermediary, Recipient,
Attack (in a privacy context), Correlation, Fingerprint, Traffic
Analysis, and Identifiability (and related terms). In addition, we
use a few terms that are specific to the attacks discussed here:
Pervasive Attack: An attack on Internet protocols that makes use of Pervasive Attack: An attack on Internet communications that makes
access at a large number of points in the network, or otherwise use of access at a large number of points in the network, or
provides the attacker with access to a large amount of Internet otherwise provides the attacker with access to a large amount of
traffic. Internet traffic; see [RFC7258].
Observation: Information collected directly from communications by Observation: Information collected directly from communications by
an eavesdropper or observer. For example, the knowledge that an eavesdropper or observer. For example, the knowledge that
<alice@example.com> sent a message to <bob@example.com> via SMTP <alice@example.com> sent a message to <bob@example.com> via SMTP
taken from the headers of an observed SMTP message would be an taken from the headers of an observed SMTP message would be an
observation. observation.
Inference: Information extracted from analysis of information Inference: Information extracted from analysis of information
collected directly from communications by an eavesdropper or collected directly from communications by an eavesdropper or
observer. For example, the knowledge that a given web page was observer. For example, the knowledge that a given web page was
accessed by a given IP address, by comparing the size in octets of accessed by a given IP address, by comparing the size in octets of
measured network flow records to fingerprints derived from known measured network flow records to fingerprints derived from known
sizes of linked resources on the web servers involved, would be an sizes of linked resources on the web servers involved, would be an
inference. inference.
Collaborator: An entity that is a legitimate participant in a Collaborator: An entity that is a legitimate participant in a
protocol, but who provides information about that interaction communication, but who provides information about that interaction
(keys or data) to an attacker. (keys or data) to an attacker.
Unwitting Collaborator: A collaborator that provides information to
the attacker not deliberately, but because the attacker has
exploited some technology used by the collaborator.
Key Exfiltration: The transmission of keying material for an Key Exfiltration: The transmission of keying material for an
encrypted communication from a collaborator to an attacker. encrypted communication from a collaborator to an attacker.
Content Exfiltration: The transmission of the content of a Content Exfiltration: The transmission of the content of a
communication from a collaborator to an attacker. communication from a collaborator to an attacker.
Unwitting Collaborator: A collaborator that provides information to 3. An Idealized Pervasive Passive Attacker
the attacker not deliberately, but because the attacker has
exploited some technology used by the collaborator.
3. An Idealized Pervasive Passive Adversary
We assume a pervasive passive adversary, an indiscriminate
eavesdropper on an Internet-attached computer network that
To build a threat model, we first assume a pervasive passive
attacker, an indiscriminate eavesdropper on an Internet-attached
computer network that
o can observe every packet of all communications at any or every hop o can observe every packet of all communications at any or every hop
in any network path between an initiator and a recipient; and in any network path between an initiator and a recipient; and
o can observe data at rest in intermediate systems between the o can observe data at rest in intermediate systems between the
endpoints controlled by the initiator and recipient; but endpoints controlled by the initiator and recipient; but
o takes no other action with respect to these communications (i.e., o takes no other action with respect to these communications (i.e.,
blocking, modification, injection, etc.). blocking, modification, injection, etc.).
This adversary is less capable than those which we know to have This attacker is less capable than those which we know to have
compromised the Internet from press reports, elaborated in Section 4, compromised the Internet from press reports, elaborated in Section 4,
but represents the threat to communications privacy by a single but represents the threat to communications privacy by a single
entity interested in remaining undetectable. eavesdropper interested in remaining undetectable.
The techniques available to our ideal adversary are direct The techniques available to our ideal attacker are direct observation
observation and inference. Direct observation involves taking and inference. Direct observation involves taking information
information directly from eavesdropped communications - e.g., URLs directly from eavesdropped communications - e.g., URLs identifying
identifying content or email addresses identifying individuals from content or email addresses identifying individuals from application-
application-layer headers. Inference, on the other hand, involves layer headers. Inference, on the other hand, involves analyzing
analyzing eavesdropped information to derive new information from it; eavesdropped information to derive new information from it; e.g.,
e.g., searching for application or behavioral fingerprints in searching for application or behavioral fingerprints in observed
observed traffic to derive information about the observed individual traffic to derive information about the observed individual from
from them, in absence of directly-observed sources of the same them, in absence of directly-observed sources of the same
information. The use of encryption to protect confidentiality is information. The use of encryption to protect confidentiality is
generally enough to prevent direct observation, assuming generally enough to prevent direct observation, assuming
uncompromised encryption implementations and key material, but uncompromised encryption implementations and key material, but
provides less complete protection against inference, especially provides less complete protection against inference, especially
inference based only on unprotected portions of communications (e.g. inference based only on unprotected portions of communications (e.g.
IP and TCP headers for TLS). IP and TCP headers for TLS [RFC5246]).
3.1. Information subject to direct observation 3.1. Information subject to direct observation
Protocols which do not encrypt their payload make the entire content Protocols which do not encrypt their payload make the entire content
of the communication available to a PPA along their path. Following of the communication available to the idealized attacker along their
the advice in [RFC3365], most such protocols have a secure variant path. Following the advice in [RFC3365], most such protocols have a
which encrypts payload for confidentiality, and these secure variants secure variant which encrypts payload for confidentiality, and these
are seeing ever-wider deployment. A noteworthy exception is DNS secure variants are seeing ever-wider deployment. A noteworthy
[RFC1035], as DNSSEC [RFC4033] does not have confidentiality as a exception is DNS [RFC1035], as DNSSEC [RFC4033] does not have
requirement. This implies that all DNS queries and answers generated confidentiality as a requirement. This implies that all DNS queries
by the activities of any protocol are available to the adversary. and answers generated by the activities of any protocol are available
to the attacker.
Protocols which imply the storage of some data at rest in Protocols which imply the storage of some data at rest in
intermediaries leave this data subject to observation by an adversary intermediaries (e.g. SMTP [RFC5321]) leave this data subject to
that has compromised these intermediaries, unless the data is observation by an attacker that has compromised these intermediaries,
encrypted end-to-end by the application layer protocol, or the unless the data is encrypted end-to-end by the application layer
implementation uses an encrypted store for this data. protocol, or the implementation uses an encrypted store for this
data.
3.2. Information useful for inference 3.2. Information useful for inference
Inference is information extracted from later analysis of an observed Inference is information extracted from later analysis of an observed
communication, and/or correlation of observed information with or eavesdropped communication, and/or correlation of observed or
information available from other sources. Indeed, most useful eavesdropped information with information available from other
inference performed by our ideal adversary falls under the rubric sources. Indeed, most useful inference performed by the attacker
of correlation. The simplest example of this is the observation of falls under the rubric of correlation. The simplest example of this
DNS queries and answers from and to a source and correlating those is the observation of DNS queries and answers from and to a source
with IP addresses with which that source communicates. This can give and correlating those with IP addresses with which that source
access to information otherwise not available from encrypted communicates. This can give access to information otherwise not
application payloads (e.g., the Host: HTTP/1.1 request header when available from encrypted application payloads (e.g., the Host:
HTTP is used with TLS). HTTP/1.1 request header when HTTP is used with TLS).
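The DNS correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real monitoring system: the function name label_flows, the tuple layouts, and the sample addresses are all assumptions made for the example. It shows that an observer who sees both the DNS answers delivered to a source and the flows that source later opens can attach a server name to a flow whose payload is encrypted.

```python
from collections import defaultdict


def label_flows(dns_answers, flows):
    """Attach observed DNS names to observed flows.

    dns_answers: iterable of (client_ip, queried_name, answer_ip)
    flows:       iterable of (client_ip, destination_ip)
    Returns a list of (client_ip, destination_ip, [names]).
    Both input layouts are hypothetical, chosen for this sketch.
    """
    names_seen = defaultdict(set)
    for client, name, answer in dns_answers:
        # Remember which names resolved to which address for each client.
        names_seen[(client, answer)].add(name)

    labelled = []
    for client, destination in flows:
        # An empty list means no matching DNS answer was observed.
        names = sorted(names_seen.get((client, destination), ()))
        labelled.append((client, destination, names))
    return labelled


dns_answers = [("198.51.100.7", "www.example.com", "192.0.2.10")]
flows = [("198.51.100.7", "192.0.2.10"), ("198.51.100.7", "192.0.2.99")]
result = label_flows(dns_answers, flows)
```

In this sketch the first flow is labelled with the queried name even though nothing inside the flow was read, while the second flow stays unlabelled because no matching answer was seen.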
Protocols which encrypt their payload using an application- or Protocols which encrypt their payload using an application- or
transport-layer encryption scheme (e.g. TLS [RFC5246]) still expose transport-layer encryption scheme (e.g. TLS) still expose all the
all the information in their network and transport layer headers to a information in their network and transport layer headers to the
PPA, including source and destination addresses and ports. IPsec attacker, including source and destination addresses and ports.
ESP [RFC4303] further encrypts the transport-layer headers, but still IPsec ESP [RFC4303] further encrypts the transport-layer headers, but
leaves IP address information unencrypted; in tunnel mode, these still leaves IP address information unencrypted; in tunnel mode,
addresses correspond to the tunnel endpoints. Features of the these addresses correspond to the tunnel endpoints. Features of the
cryptographic protocols themselves, e.g. the TLS session identifier, cryptographic protocols themselves, e.g. the TLS session identifier,
may leak information that can be used for correlation and inference. may leak information that can be used for correlation and inference.
While this information is much less semantically rich than the While this information is much less semantically rich than the
application payload, it can still be useful for inferring an application payload, it can still be useful for inferring an
individual's activities. individual's activities.
Inference can also leverage information obtained from sources other Inference can also leverage information obtained from sources other
than direct traffic observation. Geolocation databases, for example, than direct traffic observation. Geolocation databases, for example,
have been developed to map IP addresses to a location, in order to have been developed to map IP addresses to a location, in order to
provide location-aware services such as targeted advertising. This provide location-aware services such as targeted advertising. This
skipping to change at page 5, line 50 skipping to change at page 5, line 50
accessible information. This information can be extremely accessible information. This information can be extremely
semantically rich, including information about an individual's semantically rich, including information about an individual's
location, associations with other individuals and groups, and location, associations with other individuals and groups, and
activities. Further, this information is generally contributed and activities. Further, this information is generally contributed and
curated voluntarily by the individuals themselves: it represents curated voluntarily by the individuals themselves: it represents
information which the individuals are not necessarily interested in information which the individuals are not necessarily interested in
protecting for privacy reasons. However, correlation of this social protecting for privacy reasons. However, correlation of this social
networking data with information available from direct observation of networking data with information available from direct observation of
network traffic allows the creation of a much richer picture of an network traffic allows the creation of a much richer picture of an
individual's activities than either alone. We note with some alarm individual's activities than either alone. We note with some alarm
that there is little that can be done from the protocol design side that there is little that can be done at protocol design time to
to limit such correlation by a PPA, and that the existence of such limit such correlation by the attacker, and that the existence of
data sources in many cases greatly complicates the problem of such data sources in many cases greatly complicates the problem of
protecting privacy by hardening protocols alone. protecting privacy by hardening protocols alone.
3.3. An illustration of an ideal passive attack 3.3. An illustration of an ideal passive attack
To illustrate how capable even this limited adversary is, we explore To illustrate how capable the attacker is even given its limitations,
the non-anonymity of even encrypted IP traffic by examining in detail we explore the non-anonymity of even encrypted IP traffic by
some inference techniques for associating a set of addresses with an examining in detail some inference techniques for associating a set
individual, in order to illustrate the difficulty of defending of addresses with an individual, in order to illustrate the
communications against a PPA. Here, the basic problem is that difficulty of defending communications against our idealized
information radiated even from protocols which have no obvious attacker. Here, the basic problem is that information radiated even
connection with personal data can be correlated with other from protocols which have no obvious connection with personal data
information which can paint a very rich behavioral picture, that only can be correlated with other information which can paint a very rich
takes one unprotected link in the chain to associate with an behavioral picture, that only takes one unprotected link in the chain
identity. to associate with an identity.
3.3.1. Analysis of IP headers 3.3.1. Analysis of IP headers
Internet traffic can be monitored by tapping Internet links, or by Internet traffic can be monitored by tapping Internet links, or by
installing monitoring tools in Internet routers. Of course, a single installing monitoring tools in Internet routers. Of course, a single
link or a single router only provides access to a fraction of the link or a single router only provides access to a fraction of the
global Internet traffic. However, monitoring a number of high global Internet traffic. However, monitoring a number of high
capacity links or a set of routers placed at strategic locations capacity links or a set of routers placed at strategic locations
provides access to a good sampling of Internet traffic. provides access to a good sampling of Internet traffic.
skipping to change at page 7, line 4 skipping to change at page 6, line 51
Analysis of traffic variations over time can be used to detect Analysis of traffic variations over time can be used to detect
increased activity by particular users, or in the case of peer-to- increased activity by particular users, or in the case of peer-to-
peer connections increased activity within groups of users. peer connections increased activity within groups of users.
3.3.2. Correlation of IP addresses to user identities 3.3.2. Correlation of IP addresses to user identities
The correlation of IP addresses with specific users can be done in The correlation of IP addresses with specific users can be done in
various ways. For example, tools like reverse DNS lookup can be used various ways. For example, tools like reverse DNS lookup can be used
to retrieve the DNS names of servers. Since the addresses of servers to retrieve the DNS names of servers. Since the addresses of servers
tend to be quite stable and since servers are relatively less tend to be quite stable and since servers are relatively less
numerous than users, a PPA could easily maintain its own copy of the numerous than users, an attacker could easily maintain its own copy
DNS for well-known or popular servers, to accelerate such lookups. of the DNS for well-known or popular servers, to accelerate such
lookups.
On the other hand, the reverse lookup of IP addresses of users is On the other hand, the reverse lookup of IP addresses of users is
generally less informative. For example, a lookup of the address generally less informative. For example, a lookup of the address
currently used by one author's home network returns a name of the currently used by one author's home network returns a name of the
form "c-192-000-002-033.hsd1.wa.comcast.net". This particular type form "c-192-000-002-033.hsd1.wa.comcast.net". This particular type
of reverse DNS lookup generally reveals only coarse-grained location of reverse DNS lookup generally reveals only coarse-grained location
or provider information. or provider information.
In many jurisdictions, Internet Service Providers (ISPs) are required In many jurisdictions, Internet Service Providers (ISPs) are required
to provide identification on a case by case basis of the "owner" of a to provide identification on a case by case basis of the "owner" of a
specific IP address for law enforcement purposes. This is a specific IP address for law enforcement purposes. This is a
reasonably expedient process for targeted investigations, but reasonably expedient process for targeted investigations, but
pervasive surveillance requires something more efficient. This pervasive surveillance requires something more efficient. This
provides an incentive for the adversary to secure the cooperation of provides an incentive for the attacker to secure the cooperation of
the ISP in order to automate this correlation. the ISP in order to automate this correlation.
3.3.3. Monitoring messaging clients for IP address correlation 3.3.3. Monitoring messaging clients for IP address correlation
Even if the ISP does not cooperate, user identity can often be Even if the ISP does not cooperate, user identity can often be
obtained via inference. POP3 [RFC1939] and IMAP [RFC3501] are used obtained via inference. POP3 [RFC1939] and IMAP [RFC3501] are used
to retrieve mail from mail servers, while a variant of SMTP [RFC5321] to retrieve mail from mail servers, while a variant of SMTP is used
is used to submit messages through mail servers. IMAP connections to submit messages through mail servers. IMAP connections originate
originate from the client, and typically start with an authentication from the client, and typically start with an authentication exchange
exchange in which the client proves its identity by answering a in which the client proves its identity by answering a password
password challenge. The same holds for the SIP protocol [RFC3261] challenge. The same holds for the SIP protocol [RFC3261] and many
and many instant messaging services operating over the Internet using instant messaging services operating over the Internet using
proprietary protocols. proprietary protocols.
The username is directly observable if any of these protocols operate The username is directly observable if any of these protocols operate
in cleartext; the username can then be directly associated with the in cleartext; the username can then be directly associated with the
source address. source address.
3.3.4. Retrieving IP addresses from mail headers 3.3.4. Retrieving IP addresses from mail headers
SMTP [RFC5321] requires that each successive SMTP relay adds a SMTP [RFC5321] requires that each successive SMTP relay adds a
"Received" header to the mail headers. The purpose of these headers "Received" header to the mail headers. The purpose of these headers
skipping to change at page 8, line 10 skipping to change at page 8, line 10
with ESMTPSA (DHE-RSA-AES256-SHA encrypted, authenticated); 27 Oct with ESMTPSA (DHE-RSA-AES256-SHA encrypted, authenticated); 27 Oct
2013 21:47:14 +0100 Message-ID: <526D7BD2.7070908@example.org> Date: 2013 21:47:14 +0100 Message-ID: <526D7BD2.7070908@example.org> Date:
Sun, 27 Oct 2013 20:47:14 +0000 From: Some One <some.one@example.org> Sun, 27 Oct 2013 20:47:14 +0000 From: Some One <some.one@example.org>
" "
This is the first "Received" header attached to the message by the This is the first "Received" header attached to the message by the
first SMTP relay; for privacy reasons, the field values have been first SMTP relay; for privacy reasons, the field values have been
anonymized. We learn here that the message was submitted by "Some anonymized. We learn here that the message was submitted by "Some
One" on October 27, from a host behind a NAT (192.168.1.100) One" on October 27, from a host behind a NAT (192.168.1.100)
[RFC1918] that used the IP address 192.0.2.44. The information [RFC1918] that used the IP address 192.0.2.44. The information
remained in the message, and is accessible by all recipients of the remained in the message, and is accessible by all recipients of the
"perpass" mailing list, or indeed by any PPA that sees at least one "perpass" mailing list, or indeed by any attacker that sees at least
copy of the message. one copy of the message.
An idealized adversary that can observe sufficient email traffic can An attacker that can observe sufficient email traffic can regularly
regularly update the mapping between public IP addresses and update the mapping between public IP addresses and individual email
individual email identities. Even if the SMTP traffic was encrypted identities. Even if the SMTP traffic was encrypted on submission and
on submission and relaying, the adversary can still receive a copy of relaying, the attacker can still receive a copy of public mailing
public mailing lists like "perpass". lists like "perpass".
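As a rough sketch of how such a mapping could be extracted, the Python below parses a message with the standard email package and pulls bracketed IPv4 literals out of the earliest "Received" header. The helper name addresses_for_sender and the sample message are assumptions for illustration; in particular, the sample "Received" line is a hypothetical reconstruction that reuses only the anonymized addresses mentioned above, not the exact header from the example. Because each relay prepends its trace header, the header added by the first relay is the last one in the message.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Matches IPv4 literals written in square brackets, e.g. [192.0.2.44].
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def addresses_for_sender(raw_message):
    """Return (sender_address, [ip, ...]) from the first relay's header.

    Returns None when the message carries no Received header.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    first_hop = received[-1]  # relays prepend, so the last entry is earliest
    sender = parseaddr(msg.get("From", ""))[1]
    return sender, IPV4_IN_BRACKETS.findall(first_hop)


# Hypothetical message; header layout is an assumption for this sketch.
raw = (
    "Received: from [192.168.1.100] (unknown [192.0.2.44])\n"
    "\tby mail.example.org with ESMTPSA; Sun, 27 Oct 2013 21:47:14 +0100\n"
    "From: Some One <some.one@example.org>\n"
    "\n"
    "body\n"
)
mapping = addresses_for_sender(raw)
```

Run over an archive of a public mailing list, a loop around this helper would yield a time-stamped table of sender identities and the public addresses they submitted from.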
3.3.5. Tracking address usage with web cookies 3.3.5. Tracking address usage with web cookies
Many web sites only encrypt a small fraction of their transactions. Many web sites only encrypt a small fraction of their transactions.
A popular pattern was to use HTTPS for the login information, and A popular pattern is to use HTTPS for the login information, and then
then use a "cookie" to associate following clear-text transactions use a "cookie" to associate following clear-text transactions with
with the user's identity. Cookies are also used by various the user's identity. Cookies are also used by various advertisement
advertisement services to quickly identify the users and serve them services to quickly identify the users and serve them with
with "personalized" advertisements. Such cookies are particularly "personalized" advertisements. Such cookies are particularly useful
useful if the advertisement services want to keep tracking the user if the advertisement services want to keep tracking the user across
across multiple sessions that may use different IP addresses. multiple sessions that may use different IP addresses.
As cookies are sent in clear text, a PPA can build a database that As cookies are sent in clear text, an attacker can build a database
associates cookies to IP addresses for non-HTTPS traffic. If the IP that associates cookies to IP addresses for non-HTTPS traffic. If
address is already identified, the cookie can be linked to the user the IP address is already identified, the cookie can be linked to the
identity. After that, if the same cookie appears on a new IP user identity. After that, if the same cookie appears on a new IP
address, the new IP address can be immediately associated with the address, the new IP address can be immediately associated with the
pre-determined identity. pre-determined identity.
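The cookie database described above can be sketched as two lookup tables. This is a minimal illustration under stated assumptions: the class name CookieTracker, its methods, and the sample cookie values are hypothetical, and a real observer would also have to handle cookie expiry and shared addresses, which the sketch ignores.

```python
class CookieTracker:
    """Toy model of an observer linking clear-text cookies to identities."""

    def __init__(self):
        self.cookie_to_identity = {}
        self.ip_to_identity = {}

    def identify(self, ip, identity):
        # Record an address whose user is already known by other means.
        self.ip_to_identity[ip] = identity

    def observe(self, ip, cookie):
        """Process one clear-text request; return the identity if known."""
        identity = (self.cookie_to_identity.get(cookie)
                    or self.ip_to_identity.get(ip))
        if identity is not None:
            # Propagate the identity to both the cookie and the address.
            self.cookie_to_identity[cookie] = identity
            self.ip_to_identity[ip] = identity
        return identity


tracker = CookieTracker()
tracker.identify("192.0.2.44", "some.one@example.org")
first = tracker.observe("192.0.2.44", "sid=abc")      # cookie gets linked
second = tracker.observe("198.51.100.9", "sid=abc")   # new address, same cookie
unknown = tracker.observe("203.0.113.5", "sid=zzz")   # nothing known yet
```

The second observation is the step the text describes: the cookie alone is enough to attribute a previously unseen address to the same user.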
3.3.6. Graph-based approaches to address correlation 3.3.6. Graph-based approaches to address correlation
An adversary can track traffic from an IP address not yet associated An attacker can track traffic from an IP address not yet associated
with an individual to various public services (e.g. websites, mail with an individual to various public services (e.g. websites, mail
servers, game servers), and exploit patterns in the observed traffic servers, game servers), and exploit patterns in the observed traffic
to correlate this address with other addresses that show similar to correlate this address with other addresses that show similar
patterns. For example, any two addresses that show connections to patterns. For example, any two addresses that show connections to
the same IMAP or webmail services, the same set of favorite websites, the same IMAP or webmail services, the same set of favorite websites,
and game servers at similar times of day may be associated with the and game servers at similar times of day may be associated with the
same individual. Correlated addresses can then be tied to an same individual. Correlated addresses can then be tied to an
individual through one of the techniques above, walking the "network individual through one of the techniques above, walking the "network
graph" to expand the set of attributable traffic. graph" to expand the set of attributable traffic.
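As a rough sketch of the correlation step above, addresses can be compared by the set of services they contact. The Jaccard similarity measure, the 0.75 threshold, and the names `jaccard` and `correlate` are assumptions made for this example. The text does not prescribe a particular metric, and time-of-day patterns are omitted for brevity.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlate(profiles, threshold=0.75):
    """Return address pairs whose contacted-service sets look alike.

    profiles: dict mapping an IP address to the set of services it was
              observed contacting (hypothetical input format).
    """
    addrs = sorted(profiles)
    pairs = []
    for i, x in enumerate(addrs):
        for y in addrs[i + 1:]:
            if jaccard(profiles[x], profiles[y]) >= threshold:
                pairs.append((x, y))
    return pairs


services = {"imap.example.net", "news.example.org", "game.example.com"}
profiles = {
    "192.0.2.1": set(services),
    "198.51.100.7": set(services),
    "203.0.113.9": {"other.example"},
}
print(correlate(profiles))
```

Pairs returned by such a comparison are only candidates. Tying them to an individual still requires one of the identification techniques from the earlier sections.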
3.3.7. Tracking of MAC Addresses
Moving back down the stack, technologies like Ethernet or Wi-Fi use
MAC Addresses to identify link-level destinations. MAC Addresses
assigned according to IEEE 802 standards are unique to the device.
If the link is publicly accessible, an attacker can track the device. For
example, the attacker can track the wireless traffic at public Wi-Fi
networks. Simple devices can monitor the traffic, and reveal which
MAC Addresses are present. If the network does not use some form of
Wi-Fi encryption, or if the attacker can access the decrypted
traffic, the analysis will also provide the correlation between MAC
Addresses and IP addresses. Additional monitoring using techniques
described in the previous sections will reveal the correlation between
MAC Addresses, IP Addresses, and user identity. The attacker can
conceivably build a database linking MAC Addresses and device or user
identities, and use it to track the movement of devices and of their
owners.
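The database described above could look roughly like the following. The sighting format `(timestamp, location, mac)`, the `mac_owner` mapping, and the function name `movement_log` are hypothetical and chosen only to illustrate the idea.

```python
from collections import defaultdict


def movement_log(sightings, mac_owner):
    """Build a per-owner timeline from MAC address sightings.

    sightings: iterable of (timestamp, location, mac) tuples, e.g. as
               collected by monitors at public Wi-Fi networks.
    mac_owner: dict mapping a MAC address to a device or user identity,
               obtained through separate correlation (assumed given).
    """
    tracks = defaultdict(list)
    for ts, location, mac in sorted(sightings):
        owner = mac_owner.get(mac)
        if owner is not None:
            tracks[owner].append((ts, location))
    return dict(tracks)


sightings = [
    (1700, "airport", "02:00:00:aa:bb:cc"),
    (900, "cafe", "02:00:00:aa:bb:cc"),
    (1200, "cafe", "02:00:00:11:22:33"),
]
owners = {"02:00:00:aa:bb:cc": "device-1"}
print(movement_log(sightings, owners))
```

Sightings of MAC addresses with no known owner are dropped here. An attacker could instead retain them until a later correlation attributes the address.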
4. Reported Instances of Large-Scale Attacks 4. Reported Instances of Large-Scale Attacks
The situation in reality is more bleak than that suggested by an The situation in reality is more bleak than that suggested by an
analysis of our idealized adversary. Through revelations of analysis of our idealized attacker. Through revelations of sensitive
sensitive documents in several media outlets, the Internet community documents in several media outlets, the Internet community has been
has been made aware of several intelligence activities conducted by made aware of several intelligence activities conducted by US and UK
US and UK national intelligence agencies, particularly the US national intelligence agencies, particularly the US National Security
National Security Agency (NSA) and the UK Government Communications Agency (NSA) and the UK Government Communications Headquarters
Headquarters (GCHQ). These documents have revealed methods that (GCHQ). These documents have revealed methods that these agencies
these agencies use to attack Internet applications and obtain use to attack Internet applications and obtain sensitive user
sensitive user information. information.
First, they have confirmed that these agencies have capabilities in First, they have confirmed that these agencies have capabilities in
line with those of our idealized adversary, through the large-scale line with those of our idealized attacker, through the large-scale
passive collection of Internet traffic [pass1][pass2][pass3][pass4]. passive collection of Internet traffic [pass1][pass2][pass3][pass4].
For example: * The NSA XKEYSCORE system accesses data from multiple For example: - The NSA XKEYSCORE system accesses data from multiple
access points and searches for "selectors" such as email addresses, access points and searches for "selectors" such as email addresses,
at the scale of tens of terabytes of data per day. at the scale of tens of terabytes of data per day. - The GCHQ
* The GCHQ Tempora system appears to have access to around 1,500 Tempora system appears to have access to around 1,500 major cables
major cables passing through the UK. * The NSA MUSCULAR program passing through the UK. - The NSA MUSCULAR program tapped cables
tapped cables between data centers belonging to major service between data centers belonging to major service providers. - Several
providers. * Several programs appear to perform wide-scale programs appear to perform wide-scale collection of cookies in web
collection of cookies in web traffic and location data from location- traffic and location data from location-aware portable devices such
aware portable devices such as smartphones. as smartphones.
However, the capabilities described go beyond those available to our However, the capabilities described go beyond those available to our
idealized adversary, including: idealized attacker, including:
o Decryption of TLS-protected Internet sessions [dec1][dec2][dec3]. o Decryption of TLS-protected Internet sessions [dec1][dec2][dec3].
For example, the NSA BULLRUN project appears to have had a budget For example, the NSA BULLRUN project appears to have had a budget
of around $250M per year to undermine encryption through multiple of around $250M per year to undermine encryption through multiple
approaches. approaches.
o Insertion of NSA devices as a man-in-the-middle of Internet o Insertion of NSA devices as a man-in-the-middle of Internet
transactions [TOR1][TOR2]. For example, the NSA QUANTUM system transactions [TOR1][TOR2]. For example, the NSA QUANTUM system
appears to use several different techniques to hijack HTTP appears to use several different techniques to hijack HTTP
connections, ranging from DNS response injection to HTTP 302 connections, ranging from DNS response injection to HTTP 302
skipping to change at page 10, line 21 skipping to change at page 10, line 37
covert modifications to software as one means to undermine covert modifications to software as one means to undermine
encryption. encryption.
* There is also some suspicion that NSA modifications to the * There is also some suspicion that NSA modifications to the
DUAL_EC_DRBG random number generator were made to ensure that DUAL_EC_DRBG random number generator were made to ensure that
keys generated using that generator could be predicted by NSA. keys generated using that generator could be predicted by NSA.
These suspicions have been reinforced by reports that RSA These suspicions have been reinforced by reports that RSA
Security was paid roughly $10M to make DUAL_EC_DRBG the default Security was paid roughly $10M to make DUAL_EC_DRBG the default
in their products. in their products.
We use the term "pervasive attack" to collectively describe these We use the term "pervasive attack" [RFC7258] to collectively describe
operations. The term "pervasive" is used because the attacks are these operations. The term "pervasive" is used because the attacks
designed to indiscriminately gather as much data as possible and to are designed to indiscriminately gather as much data as possible and
apply selective analysis on targets after the fact. This means that to apply selective analysis on targets after the fact. This means
all, or nearly all, Internet communications are targets for these that all, or nearly all, Internet communications are targets for
attacks. To achieve this scale, the attacks are physically these attacks. To achieve this scale, the attacks are physically
pervasive; they affect a large number of Internet communications. pervasive; they affect a large number of Internet communications.
They are pervasive in content, consuming and exploiting any They are pervasive in content, consuming and exploiting any
information revealed by the protocol. And they are pervasive in information revealed by the protocol. And they are pervasive in
technology, exploiting many different vulnerabilities in many technology, exploiting many different vulnerabilities in many
different protocols. different protocols.
It's important to note that although the attacks mentioned above were It's important to note that although the attacks mentioned above were
executed by NSA and GCHQ, there are many other organizations that can executed by NSA and GCHQ, there are many other organizations that can
mount pervasive attacks. Because of the resources required to mount pervasive surveillance attacks. Because of the resources
achieve pervasive scale, pervasive attacks are most commonly required to achieve pervasive scale, these attacks are most commonly
undertaken by nation-state actors. For example, the Chinese Internet undertaken by nation-state actors. For example, the Chinese Internet
filtering system known as the "Great Firewall of China" uses several filtering system known as the "Great Firewall of China" uses several
techniques that are similar to the QUANTUM program, and which have a techniques that are similar to the QUANTUM program, and which have a
high degree of pervasiveness with regard to the Internet in China. high degree of pervasiveness with regard to the Internet in China.
5. Threat Model 5. Threat Model
Given these disclosures, we must consider broader threat model. Given these disclosures, we must consider a broader threat model.
Pervasive surveillance aims to collect information across a large Pervasive surveillance aims to collect information across a large
number of Internet communications, observing the collected number of Internet communications, analyzing the collected
communications to identify information of interest within individual communications to identify information of interest within individual
communications, or inferring information from correlated communications, or inferring information from correlated
communications. This analysis sometimes benefits from decryption of communications. This analysis sometimes benefits from decryption of
encrypted communications and deanonymization of anonymized encrypted communications and deanonymization of anonymized
communications. As a result, these attackers desire both access to communications. As a result, these attackers desire both access to
the bulk of Internet traffic and to the keying material required to the bulk of Internet traffic and to the keying material required to
decrypt any traffic that has been encrypted (though the presence of a decrypt any traffic that has been encrypted. Even if keys are not
communication and the fact that it is encrypted may both be inputs to available, note that the presence of a communication and the fact
an analysis, even if the attacker cannot decrypt the communication). that it is encrypted may both be inputs to an analysis, even if the
attacker cannot decrypt the communication.
The attacks listed above highlight new avenues both for access to The attacks listed above highlight new avenues both for access to
traffic and for access to relevant encryption keys. They further traffic and for access to relevant encryption keys. They further
indicate that the scale of surveillance is sufficient to provide a indicate that the scale of surveillance is sufficient to provide a
general capability to cross-correlate communications, a threat not general capability to cross-correlate communications, a threat not
previously thought to be relevant at the scale of all Internet previously thought to be relevant at the scale of the Internet.
communications.
5.1. Attacker Capabilities 5.1. Attacker Capabilities
+--------------------------+-------------------------------------+ +--------------------------+-------------------------------------+
| Attack Class | Capability | | Attack Class | Capability |
+--------------------------+-------------------------------------+ +--------------------------+-------------------------------------+
| Passive observation | Directly capture data in transit | | Passive observation | Directly capture data in transit |
| | | | | |
| Passive inference | Infer from reduced/encrypted data | | Passive inference | Infer from reduced/encrypted data |
| | | | | |
skipping to change at page 11, line 36 skipping to change at page 12, line 7
| | | | | |
| Static key exfiltration | Obtain key material once / rarely | | Static key exfiltration | Obtain key material once / rarely |
| | | | | |
| Dynamic key exfiltration | Obtain per-session key material | | Dynamic key exfiltration | Obtain per-session key material |
| | | | | |
| Content exfiltration | Access data at rest | | Content exfiltration | Access data at rest |
+--------------------------+-------------------------------------+ +--------------------------+-------------------------------------+
Security analyses of Internet protocols commonly consider two classes Security analyses of Internet protocols commonly consider two classes
of attacker: Passive attackers, who can simply listen in on of attacker: Passive attackers, who can simply listen in on
communications as they transit the network, and "active attackers", communications as they transit the network, and active attackers, who
who can modify or delete packets in addition to simply collecting can modify or delete packets in addition to simply collecting them.
them.
In the context of pervasive attack, these attacks take on an even In the context of pervasive passive surveillance, these attacks take
greater significance. In the past, these attackers were often on an even greater significance. In the past, these attackers were
assumed to operate near the edge of the network, where attacks can be often assumed to operate near the edge of the network, where attacks
simpler. For example, in some LANs, it is simple for any node to can be simpler. For example, in some LANs, it is simple for any node
engage in passive listening to other nodes' traffic or inject packets to engage in passive listening to other nodes' traffic or inject
to accomplish active attacks. In the pervasive attack case, however, packets to accomplish active attacks. However, as we now know, both
both passive and active attacks are undertaken closer to the core of passive and active attacks are undertaken by pervasive attackers
the network, greatly expanding the scope and capability of the closer to the core of the network, greatly expanding the scope and
attacker. capability of the attacker.
A passive attacker with access to a large portion of the Internet can Eavesdropping and observation at a larger scale make passive
analyze collected traffic to create a much more detailed view of user inference attacks easier to carry out: a passive attacker with access
behavior than an attacker that collects at a single point. Even the to a large portion of the Internet can analyze collected traffic to
usual claim that encryption defeats passive attackers is weakened, create a much more detailed view of individual behavior than an
since a pervasive passive attacker can infer relationships from attacker that collects at a single point. Even the usual claim that
correlations over large numbers of sessions, e.g., pairing encrypted encryption defeats passive attackers is weakened, since a pervasive
sessions with unencrypted sessions from the same host, or performing passive attacker can infer relationships from correlations over large
traffic fingerprinting between known and unknown encrypted sessions. numbers of sessions, e.g., pairing encrypted sessions with
The reports on the NSA XKEYSCORE system would make it an example of unencrypted sessions from the same host, or performing traffic
such an attacker. fingerprinting between known and unknown encrypted sessions. Reports
on the NSA XKEYSCORE system would indicate it is an example of such
an attacker.
A pervasive active attacker likewise has capabilities beyond those of A pervasive active attacker likewise has capabilities beyond those of
a localized active attacker. Active attacks are often limited by a localized active attacker. Active attacks are often limited by
network topology, for example by a requirement that the attacker be network topology, for example by a requirement that the attacker be
able to see a targeted session as well as inject packets into it. A able to see a targeted session as well as inject packets into it. A
pervasive active attacker with multiple accesses at core points of pervasive active attacker with access at multiple points within the
the Internet is able to overcome these topological limitations and core of the Internet is able to overcome these topological
apply attacks over a much broader scope. Being positioned in the limitations and perform attacks over a much broader scope. Being
core of the network rather than the edge can also enable a pervasive positioned in the core of the network rather than the edge can also
active attacker to reroute targeted traffic. Pervasive active enable a pervasive active attacker to reroute targeted traffic,
attackers can also benefit from pervasive passive collection to amplifying the ability to perform both eavesdropping and traffic
identify vulnerable hosts. injection. Pervasive active attackers can also benefit from
pervasive passive collection to identify vulnerable hosts.
While not directly related to pervasiveness, attackers that are in a While not directly related to pervasiveness, attackers that are in a
position to mount a pervasive active attack are also often in a position to mount a pervasive active attack are also often in a
position to subvert authentication, the traditional response to position to subvert authentication, a traditional protection against
active attack. Authentication in the Internet is often achieved via such attacks. Authentication in the Internet is often achieved via
trusted third party authorities such as the Certificate Authorities trusted third party authorities such as the Certificate Authorities
(CAs) that provide web sites with authentication credentials. An (CAs) that provide web sites with authentication credentials. An
attacker with sufficient resources for pervasive attack may also be attacker with sufficient resources may also be able to induce an
able to induce an authority to grant credentials for an identity of authority to grant credentials for an identity of the attacker's
the attacker's choosing. If the parties to a communication will choosing. If the parties to a communication will trust multiple
trust multiple authorities to certify a specific identity, this authorities to certify a specific identity, this attack may be
attack may be mounted by suborning any one of the authorities (the mounted by suborning any one of the authorities (the proverbial
proverbial "weakest link"). Subversion of authorities in this way "weakest link"). Subversion of authorities in this way can allow an
can allow an active attack to succeed in spite of an authentication active attack to succeed in spite of an authentication check.
check.
Beyond these three classes (observation, inference, and active), Beyond these three classes (observation, inference, and active),
reports on the BULLRUN effort to defeat encryption and the PRISM reports on the BULLRUN effort to defeat encryption and the PRISM
effort to obtain data from service providers suggest three more effort to obtain data from service providers suggest three more
classes of attack: classes of attack:
o Static key exfiltration o Static key exfiltration
o Dynamic key exfiltration o Dynamic key exfiltration
o Content exfiltration o Content exfiltration
These attacks all rely on a "collaborator" endpoint providing the
attacker with some information, either keys or data. These attacks These attacks all rely on a collaborator providing the attacker with
have not traditionally been considered in security analyses of some information, either keys or data. These attacks have not
protocols, since they happen outside of the protocol. traditionally been considered in security analyses of protocols,
since they happen outside of the protocol.
The term "key exfiltration" refers to the transfer of keying material The term "key exfiltration" refers to the transfer of keying material
for an encrypted communication from the collaborator to the attacker. for an encrypted communication from the collaborator to the attacker.
By "static", we mean that the transfer of keys happens once, or By "static", we mean that the transfer of keys happens once, or
rarely, typically of a long-lived key. For example, this case would rarely, typically of a long-lived key. For example, this case would
cover a web site operator that provides the private key corresponding cover a web site operator that provides the private key corresponding
to its HTTPS certificate to an intelligence agency. to its HTTPS certificate to an intelligence agency.
"Dynamic" key exfiltration, by contrast, refers to attacks in which "Dynamic" key exfiltration, by contrast, refers to attacks in which
the collaborator delivers keying material to the attacker frequently, the collaborator delivers keying material to the attacker frequently,
skipping to change at page 13, line 49 skipping to change at page 14, line 20
are "collaborating" with the attacker (by providing keys/content) are "collaborating" with the attacker (by providing keys/content)
without their owner's knowledge or consent. without their owner's knowledge or consent.
Any party that has access to encryption keys or unencrypted data can Any party that has access to encryption keys or unencrypted data can
be a collaborator. While collaborators are typically the endpoints be a collaborator. While collaborators are typically the endpoints
of a communication (with encryption securing the links), of a communication (with encryption securing the links),
intermediaries in an unencrypted communication can also facilitate intermediaries in an unencrypted communication can also facilitate
content exfiltration attacks as collaborators by providing the content exfiltration attacks as collaborators by providing the
attacker access to those communications. For example, documents attacker access to those communications. For example, documents
describing the NSA PRISM program claim that NSA is able to access describing the NSA PRISM program claim that NSA is able to access
user data directly from servers, where it was stored unencrypted. In user data directly from servers, where it is stored unencrypted. In
these cases, the operator of the server would be a collaborator these cases, the operator of the server would be a collaborator, if
(wittingly or unwittingly). By contrast, in the NSA MUSCULAR an unwitting one. By contrast, in the NSA MUSCULAR program, a set of
program, a set of collaborators enabled attackers to access the collaborators enabled attackers to access the cables connecting data
cables connecting data centers used by service providers such as centers used by service providers such as Google and Yahoo. Because
Google and Yahoo. Because communications among these data centers communications among these data centers were not encrypted, the
were not encrypted, the collaboration by an intermediate entity collaboration by an intermediate entity allowed NSA to collect
allowed NSA to collect unencrypted user data. unencrypted user data.
5.2. Attacker Costs 5.2. Attacker Costs
+--------------------------+-----------------------------------+ +--------------------------+-----------------------------------+
| Attack Class | Cost / Risk to Attacker | | Attack Class | Cost / Risk to Attacker |
+--------------------------+-----------------------------------+ +--------------------------+-----------------------------------+
| Passive observation | Passive data access | | Passive observation | Passive data access |
| | | | | |
| Passive inference | Passive data access + processing | | Passive inference | Passive data access + processing |
| | | | | |
| Active | Active data access + processing | | Active | Active data access + processing |
| | | | | |
| Static key exfiltration | One-time interaction | | Static key exfiltration | One-time interaction |
| | | | | |
| Dynamic key exfiltration | Ongoing interaction / code change | | Dynamic key exfiltration | Ongoing interaction / code change |
| | | | | |
| Content exfiltration | Ongoing, bulk interaction | | Content exfiltration | Ongoing, bulk interaction |
+--------------------------+-----------------------------------+ +--------------------------+-----------------------------------+
In order to realize an attack of each of the types discussed above, Each of the attack types discussed in the previous section entails
the attacker has to incur certain costs and undertake certain risks. certain costs and risks. These costs differ by attack, and can be
These costs differ by attack, and can be helpful in guiding response helpful in guiding response to pervasive attack.
to pervasive attack.
Depending on the attack, the attacker may be exposed to several types Depending on the attack, the attacker may be exposed to several types
of risk, ranging from simply losing access to arrest or prosecution. of risk, ranging from simply losing access to arrest or prosecution.
In order for any of these negative consequences to happen, however,
In order for any of these negative consequences to occur, however,
the attacker must first be discovered and identified. So the primary the attacker must first be discovered and identified. So the primary
risk we focus on here is the risk of discovery and attribution. risk we focus on here is the risk of discovery and attribution.
A passive attack is the simplest attack to mount in some ways. The A passive attack is the simplest to mount in some ways. The base
base requirement is that the attacker obtain physical access to a requirement is that the attacker obtain physical access to a
communications medium and extract communications from it. For communications medium and extract communications from it. For
example, the attacker might tap a fiber-optic cable, acquire a mirror example, the attacker might tap a fiber-optic cable, acquire a mirror
port on a switch, or listen to a wireless signal. The need for these port on a switch, or listen to a wireless signal. The need for these
taps to have physical access or proximity to a link exposes the taps to have physical access or proximity to a link exposes the
attacker to the risk that the taps will be discovered. For example, attacker to the risk that the taps will be discovered. For example,
a fiber tap or mirror port might be discovered by network operators a fiber tap or mirror port might be discovered by network operators
noticing increased attenuation in the fiber or a change in switch noticing increased attenuation in the fiber or a change in switch
configuration. Of course, passive attacks may be accomplished with configuration. Of course, passive attacks may be accomplished with
the cooperation of the network operator, in which case there is a the cooperation of the network operator, in which case there is a
risk that the attacker's interactions with the network operator will risk that the attacker's interactions with the network operator will
skipping to change at page 16, line 17 skipping to change at page 16, line 33
It should also be noted that in these latter three exfiltration It should also be noted that in these latter three exfiltration
cases, the collaborator also undertakes a risk that his collaboration cases, the collaborator also undertakes a risk that his collaboration
with the attacker will be discovered. Thus the attacker may have to with the attacker will be discovered. Thus the attacker may have to
incur additional cost in order to convince the collaborator to incur additional cost in order to convince the collaborator to
participate in the attack. Likewise, the scope of these attacks is participate in the attack. Likewise, the scope of these attacks is
limited to cases where the attacker can convince a collaborator to limited to cases where the attacker can convince a collaborator to
participate. If the attacker is a national government, for example, participate. If the attacker is a national government, for example,
it may be able to compel participation within its borders, but have a it may be able to compel participation within its borders, but have a
much more difficult time recruiting foreign collaborators. much more difficult time recruiting foreign collaborators.
As noted above, the "collaborator" in an exfiltration attack can be As noted above, the collaborator in an exfiltration attack can be
unwitting; the attacker can steal keys or data to enable the attack. unwitting; the attacker can steal keys or data to enable the attack.
In some ways, the risks of this approach are similar to the case of In some ways, the risks of this approach are similar to the case of
an active collaborator. In the static case, the attacker needs to an active collaborator. In the static case, the attacker needs to
steal information from the collaborator once; in the dynamic case, steal information from the collaborator once; in the dynamic case,
the attacker needs a continued presence inside the collaborator's the attacker needs a continued presence inside the collaborator's
systems. The main difference is that the risk in this case is of systems. The main difference is that the risk in this case is of
automated discovery (e.g., by intrusion detection systems) rather automated discovery (e.g., by intrusion detection systems) rather
than discovery by humans. than discovery by humans.
6. Responding to Pervasive Attack 6. Security Considerations
Given this threat model, how should the Internet technical community
respond to pervasive attack?
The cost and risk considerations discussed above can provide a guide
to response. Namely, responses to passive attack should close off
avenues for attack that are safe, scalable, and cheap, forcing the
attacker to mount attacks that expose it to higher cost and risk.
In this section, we discuss a collection of high-level approaches to
mitigating pervasive attacks. These approaches are not meant to be
exhaustive, but rather to provide general guidance to protocol
designers in creating protocols that are resistant to pervasive
attack.
+--------------------------+----------------------------------------+
| Attack Class | High-level mitigations |
+--------------------------+----------------------------------------+
| Passive observation | Encryption for confidentiality |
| | |
| Passive inference | ??? |
| | |
| Active | Authentication, monitoring |
| | |
| Static key exfiltration | Encryption with per-session state |
| | (PFS) |
| | |
| Dynamic key exfiltration | Transparency, validation of end |
| | systems |
| | |
| Content exfiltration | Object encryption, distributed systems |
+--------------------------+----------------------------------------+
The traditional mitigation to passive attack is to render content
unintelligible to the attacker by applying encryption, for example,
by using TLS or IPsec [RFC5246][RFC4301]. Even without
authentication, encryption will prevent a passive attacker from being
able to read the encrypted content. Exploiting unauthenticated
encryption requires an active attack (man in the middle); with
authentication, a key exfiltration attack is required.
The additional capabilities of a pervasive passive attacker, however,
require some changes in how protocol designers evaluate what
information is encrypted. In addition to directly collecting
unencrypted data, a pervasive passive attacker can also make
inferences about the content of encrypted messages based on what is
observable. For example, if a user typically visits a particular set
of web sites, then a pervasive passive attacker observing all of the
user's behavior can track the user based on the hosts the user
communicates with, even if the user changes IP addresses, and even if
all of the connections are encrypted.
Thus, in designing protocols to be resistant to pervasive passive
attacks, protocol designers should consider what information is left
unencrypted in the protocol, and how that information might be
correlated with other traffic. Information that cannot be encrypted
should be anonymized, i.e., it should be dissociated from other
information. For example, the Tor overlay routing network anonymizes
IP addresses by using multi-hop onion routing [TOR].
As with traditional, limited active attacks, the basic mitigation to
pervasive active attack is to enable the endpoints of a communication
to authenticate each other. However, as noted above, attackers that
can mount pervasive active attacks can often subvert the authorities
on which authentication systems rely. Thus, in order to make
authentication systems more resilient to pervasive attack, it is
beneficial to monitor these authorities to detect misbehavior that
could enable active attack. For example, DANE and Certificate
Transparency both provide mechanisms for detecting when a CA has
issued a certificate for a domain name without the authorization of
the holder of that domain name [RFC6962][RFC6698].
While encryption and authentication protect the security of
individual sessions, these sessions may still leak information, such
as IP addresses or server names, that a pervasive attacker can use to
correlate sessions and derive additional information about the
target. Thus, pervasive attack highlights the need for anonymization
technologies, which make correlation more difficult. Typical
approaches to anonymization against traffic analysis include:
o Aggregation: Routing sessions for many endpoints through a common
mid-point (e.g., an HTTP proxy). Since the midpoint appears as
the end of the communication, individual endpoints cannot be
distinguished.
o Onion routing: Routing a session through several mid-points,
rather than directly end-to-end, with encryption that guarantees
that each node can only see the previous and next hops [TOR].
This ensures that the source and destination of a communication
are never revealed simultaneously.
o Multi-path: Routing different sessions via different paths (even
if they originate from the same endpoint). This reduces the
probability that the same attacker will be able to collect many
sessions.
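The layering used in onion routing can be sketched as follows. This is a toy model and not the Tor protocol: the cipher is a deliberately simple XOR keystream built from SHA-256 (not secure), the per-relay keys are assumed to be already shared with the sender, and all names are hypothetical. It shows only the structural property noted above, namely that each relay removes one layer and learns nothing beyond the next hop.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT secure): XOR with SHA-256(key || counter) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(4, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(c ^ k for c, k in zip(chunk, block))
    return bytes(out)


def wrap(route, destination: str, message: bytes) -> bytes:
    """Build the onion. route is a list of (relay_name, key), first hop first."""
    next_hops = [name for name, _ in route[1:]] + [destination]
    packet = message
    # Encrypt from the innermost layer (last relay) outwards.
    for (_, key), next_hop in zip(reversed(route), reversed(next_hops)):
        header = next_hop.encode()
        packet = keystream_xor(key, bytes([len(header)]) + header + packet)
    return packet


def peel(key: bytes, packet: bytes):
    """What one relay does: remove its own layer and read only the next hop."""
    plain = keystream_xor(key, packet)
    n = plain[0]
    return plain[1:1 + n].decode(), plain[1 + n:]


# Hypothetical three-relay route.
route = [("relay-a", b"key-a"), ("relay-b", b"key-b"), ("relay-c", b"key-c")]
onion = wrap(route, "server.example", b"hello")

hop1, rest1 = peel(b"key-a", onion)   # relay-a learns only "relay-b"
hop2, rest2 = peel(b"key-b", rest1)   # relay-b learns only "relay-c"
hop3, payload = peel(b"key-c", rest2) # relay-c learns the destination
print(hop1, hop2, hop3, payload)
```

In this sketch the first relay sees the sender but not the destination, and the last relay sees the destination but not the sender.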
An encrypted, authenticated session is safe from content-monitoring
attacks in which neither end collaborates with the attacker, but can
still be subverted by the endpoints. The most common ciphersuites
used for HTTPS today, for example, are based on using RSA encryption
in such a way that if an attacker has the private key, the attacker
can derive the session keys from passive observation of a session.
These ciphersuites are thus vulnerable to a static key exfiltration
attack - if the attacker obtains the server's private key once, then
they can decrypt all past and future sessions for that server.
Static key exfiltration attacks are prevented by including ephemeral,
per-session secret information in the keys used for a session. Most
IETF security protocols include modes of operation that have this
property. These modes are known in the literature under the heading
"perfect forward secrecy" (PFS) because even if an adversary has all
of the secrets for one session, the next session will use new,
different secrets and the attacker will not be able to decrypt it.
The Internet Key Exchange (IKE) protocol used by IPsec supports PFS
by default [RFC4306], and TLS supports PFS via the use of specific
ciphersuites [RFC5246].
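The effect of ephemeral, per-session secrets can be illustrated with a toy Diffie-Hellman exchange. This sketch is not how IKE or TLS implement PFS: the group parameters (the Mersenne prime 2**127 - 1 and generator 3) are assumptions chosen for brevity and are far too small for real use, authentication of the exchange is omitted entirely, and the function names are hypothetical.

```python
import hashlib
import secrets

# Toy group parameters (assumption; NOT suitable for real use).
P = 2**127 - 1
G = 3


def ephemeral_keypair():
    """Generate a fresh private/public pair for a single session."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)


def session_key(my_private, peer_public):
    """Derive the session key from the Diffie-Hellman shared value."""
    shared = pow(peer_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def run_session():
    """One exchange: both sides derive the same key from ephemeral values."""
    c_priv, c_pub = ephemeral_keypair()
    s_priv, s_pub = ephemeral_keypair()
    client_key = session_key(c_priv, s_pub)
    server_key = session_key(s_priv, c_pub)
    assert client_key == server_key
    # The ephemeral private values are discarded when this function returns.
    return client_key


key_one = run_session()
key_two = run_session()
print(key_one != key_two)
```

No long-term private key appears in the key derivation, so obtaining the server's static key later does not by itself reveal the keys of recorded sessions. In a real protocol the long-term key would be used only to authenticate the exchange.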
Dynamic key exfiltration cannot be prevented by protocol means. By
definition, any secrets that are used in the protocol will be
transmitted to the attacker and used to decrypt what the protocol
encrypts. Likewise, no technical means will stop a willing
collaborator from sharing keys with an attacker. However, this
attack model also covers "unwitting collaborators", whose technical
resources are collaborating with the attacker without their owners'
knowledge. This could happen, for example, if flaws are built into
products or if malware is injected later on.
The best defense against becoming an unwitting collaborator is thus
to ensure that end systems are well-vetted and secure. Transparency
is a major tool in this process [secure]. Open source software can be
evaluated for potential flaws by a wider array of independent analysts
than proprietary software. Products that conform to
standards for cryptography and security protocols are limited in the
ways they can misbehave. And standards processes that are open and
transparent help ensure that the standards themselves do not provide
avenues for attack.
Standards can also define protocols that provide greater or lesser
opportunity for dynamic key exfiltration. Collaborators engaging in
key exfiltration through a standard protocol will need to use covert
channels in the protocol to leak information that can be used by the
attacker to recover the key. Such use of covert channels has been
demonstrated for SSL, TLS, and SSH [key-recovery]. Any protocol bits
that can be freely set by the collaborator can be used as a covert
channel, including, for example, TCP options or unencrypted traffic
sent before a STARTTLS message in SMTP or XMPP. Protocol designers
should consider what covert channels their protocols expose, and how
those channels can be exploited to exfiltrate key information.
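One way a designer might begin such a review is to enumerate the message fields whose values the sender can set freely and add up their sizes. The sketch below is a rough aid under stated assumptions: the field list describes a hypothetical handshake message, not any real protocol, and it treats every freely settable bit as usable covert capacity, which gives only an upper bound per message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    bits: int
    sender_controlled: bool  # True if the receiver accepts any value


def covert_capacity_bits(fields):
    """Upper bound on bits per message that a collaborator could set freely."""
    return sum(f.bits for f in fields if f.sender_controlled)


def messages_to_leak(key_bits, fields):
    """Messages needed to leak a key of key_bits, or None if no capacity."""
    capacity = covert_capacity_bits(fields)
    if capacity == 0:
        return None
    return -(-key_bits // capacity)  # ceiling division


# Hypothetical handshake message layout (illustration only).
hello = [
    Field("version", 16, sender_controlled=False),
    Field("random_nonce", 256, sender_controlled=True),
    Field("padding", 32, sender_controlled=True),
]

print(covert_capacity_bits(hello))
print(messages_to_leak(128, hello))
```

Reducing the number of freely settable bits, or deriving them verifiably from other protocol state, lowers this bound.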
Content exfiltration has some similarity to the dynamic exfiltration
case, in that nothing can prevent a collaborator from revealing what
they know, and the mitigations against becoming an unwitting
collaborator apply. In this case, however, applications can limit
what the collaborator is able to reveal. For example, the S/MIME and
PGP systems for secure email both deny intermediate servers access to
certain parts of the message [RFC5750][RFC2015]. Even if a server
were to provide an attacker with full access, the attacker would
still not be able to read the protected parts of the message.
Mechanisms like S/MIME and PGP are often referred to as "end-to-end"
security mechanisms, as opposed to "hop-by-hop" or "end-to-middle"
mechanisms like the use of SMTP over TLS. These two different
mechanisms address different types of attackers: Hop-by-hop
mechanisms protect from attackers on the wire (passive or active),
while end-to-end mechanisms protect against attackers within
intermediate nodes. Thus, neither of these mechanisms provides
complete protection by itself. For example:
o Two users messaging via Facebook over HTTPS are protected against
passive and active attackers in the network between the users and
Facebook. However, if Facebook is a collaborator in an
exfiltration attack, their communications can still be monitored.
They would need to encrypt their messages end-to-end in order to
protect themselves against this risk.
o Two users exchanging PGP-protected email have protected the
content of their exchange from network attackers and intermediate
servers, but the header information (e.g., To and From addresses)
is unnecessarily exposed to passive and active attackers that can
see communications among the mail agents handling the email
messages. These mail agents need to use hop-by-hop encryption and
traffic analysis mitigation to address this risk.
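The two examples above can be summarized in a small model of which parts of a message each kind of attacker can read. This is a simplification under stated assumptions: messages have only "headers" and "body", the observer labels and function name are hypothetical, and metadata visible below the application layer (such as IP addresses) is ignored.

```python
FIELDS = {"headers", "body"}


def visible_fields(observer, hop_by_hop, end_to_end):
    """Return the message fields that the given observer can read.

    observer:   "network" (on the wire) or "intermediary" (a server
                that handles the message, such as a mail or chat server).
    hop_by_hop: True if each link is encrypted (e.g., SMTP over TLS).
    end_to_end: True if the body is encrypted between users (e.g., PGP).
    """
    if observer not in ("network", "intermediary"):
        raise ValueError("unknown observer: " + observer)
    if observer == "network" and hop_by_hop:
        return set()
    # An intermediary terminates hop-by-hop encryption, so it sees
    # everything that is not protected end-to-end.
    return {"headers"} if end_to_end else set(FIELDS)


# Messaging over HTTPS only: the network sees nothing, the service sees all.
print(visible_fields("network", hop_by_hop=True, end_to_end=False))
print(visible_fields("intermediary", hop_by_hop=True, end_to_end=False))

# PGP email without hop-by-hop encryption: headers are exposed on the wire.
print(visible_fields("network", hop_by_hop=False, end_to_end=True))
```

Only the combination of both mechanisms leaves the network attacker with nothing and the intermediary with headers alone.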
Mechanisms such as S/MIME and PGP are also known as "object-based"
security mechanisms (as opposed to "communications security"
mechanisms), since they operate at the level of objects, rather than
communications sessions. Such secure objects can be safely handled by
intermediaries in order to realize, for example, store and forward
messaging. In the examples above, the encrypted instant messages or
email messages would be the secure objects.
The mitigations to the content exfiltration case are thus to regard
participants in the protocol as potential passive attackers
themselves, and apply the mitigations discussed above with regard to
passive attack. Information that is not necessary for these
participants to fulfill their role in the protocol can be encrypted,
and other information can be anonymized.
In summary, many of the basic tools for mitigating pervasive attack
already exist. As Edward Snowden put it, "properly implemented
strong crypto systems are one of the few things you can rely on"
[snowden]. The task for the Internet community is to ensure that
applications are able to use the strong crypto systems we have
defined - for example, TLS with PFS ciphersuites - and that these
are properly implemented. (And, one might add, turned on!) Some of this
work will require architectural changes to applications, e.g., in
order to limit the information that is exposed to servers. In many
other cases, however, the need is simply to make the best use we can
of the cryptographic tools we have.
7. Acknowledgements
o Thaler for list of attacks and taxonomy
o Security ADs for starting and managing the perpass discussion
o See PPA acks as well
8. TODO
o Ensure all bases are covered WRT threats to confidentiality
o Consider moving mitigations to a separate document per program
description
o Look at better alignment with draft-farrell-perpass-attack
o Better coverage of traffic analysis - PPA helped somewhat here but
the problem is hard
o Terminology alignment (after the program agrees the structure is
good)
This document describes a threat model for pervasive surveillance
attacks. Mitigations are to be given in a future document.
7. IANA Considerations
This document has no actions for IANA.
8. Acknowledgements
Thanks to Dave Thaler for the list of attacks and taxonomy; to
Security Area Directors Stephen Farrell, Sean Turner, and Kathleen
Moriarty for starting and managing the IETF's discussion on pervasive
attack; and to Stephan Neuhaus, Mark Townsley, Chris Inacio, and
Evangelos Halepilidis, and to the membership of the IAB Privacy and
Security Program for their input.
9. References
9.1. Normative References
[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
Morris, J., Hansen, M., and R. Smith, "Privacy
Considerations for Internet Protocols", RFC 6973, July
2013.