draft-ietf-sipping-spam-02.txt   draft-ietf-sipping-spam-03.txt 
SIPPING J. Rosenberg SIPPING J. Rosenberg
Internet-Draft C. Jennings Internet-Draft C. Jennings
Expires: September 7, 2006 Cisco Expires: April 25, 2007 Cisco
J. Peterson October 22, 2006
Neustar
March 6, 2006
The Session Initiation Protocol (SIP) and Spam The Session Initiation Protocol (SIP) and Spam
draft-ietf-sipping-spam-02 draft-ietf-sipping-spam-03
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 36 skipping to change at page 1, line 34
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 7, 2006. This Internet-Draft will expire on April 25, 2007.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2006). Copyright (C) The Internet Society (2006).
Abstract Abstract
Spam, defined as the transmission of bulk unsolicited messages, has Spam, defined as the transmission of bulk unsolicited messages, has
plagued Internet email. Unfortunately, spam is not limited to email. plagued Internet email. Unfortunately, spam is not limited to email.
It can affect any system that enables user to user communications. It can affect any system that enables user to user communications.
The Session Initiation Protocol (SIP) defines a system for user to The Session Initiation Protocol (SIP) defines a system for user to
user multimedia communications. Therefore, it is susceptible to user multimedia communications. Therefore, it is susceptible to
spam, just as email is. In this document, we analyze the problem of spam, just as email is. In this document, we analyze the problem of
spam in SIP. We first identify the ways in which the problem is the spam in SIP. We first identify the ways in which the problem is the
same and the ways in which it is different from email. We then same and the ways in which it is different from email. We then
examine the various possible solutions that have been discussed for examine the various possible solutions that have been discussed for
email and consider their applicability to SIP. Discussions on this email and consider their applicability to SIP.
draft should be directed at sipping@ietf.org.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Problem Definition . . . . . . . . . . . . . . . . . . . . . 3 2. Problem Definition . . . . . . . . . . . . . . . . . . . . . . 3
2.1 Call Spam . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1. Call Spam . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 IM Spam . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2. IM Spam . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Presence Spam . . . . . . . . . . . . . . . . . . . . . . 7 2.3. Presence Spam . . . . . . . . . . . . . . . . . . . . . . 7
3. Solution Space . . . . . . . . . . . . . . . . . . . . . . . 8 3. Solution Space . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1 Content Filtering . . . . . . . . . . . . . . . . . . . . 8 3.1. Content Filtering . . . . . . . . . . . . . . . . . . . . 8
3.2 Black Lists . . . . . . . . . . . . . . . . . . . . . . . 8 3.2. Black Lists . . . . . . . . . . . . . . . . . . . . . . . 8
3.3 White Lists . . . . . . . . . . . . . . . . . . . . . . . 9 3.3. White Lists . . . . . . . . . . . . . . . . . . . . . . . 9
3.4 Consent-Based Communications . . . . . . . . . . . . . . . 10 3.4. Consent-Based Communications . . . . . . . . . . . . . . . 10
3.5 Reputation Systems . . . . . . . . . . . . . . . . . . . . 11 3.5. Reputation Systems . . . . . . . . . . . . . . . . . . . . 12
3.6 Address Obfuscation . . . . . . . . . . . . . . . . . . . 13 3.6. Address Obfuscation . . . . . . . . . . . . . . . . . . . 13
3.7 Limited Use Addresses . . . . . . . . . . . . . . . . . . 14 3.7. Limited Use Addresses . . . . . . . . . . . . . . . . . . 14
3.8 Turing Tests . . . . . . . . . . . . . . . . . . . . . . . 14 3.8. Turing Tests . . . . . . . . . . . . . . . . . . . . . . . 14
3.9 Computational Puzzles . . . . . . . . . . . . . . . . . . 16 3.9. Computational Puzzles . . . . . . . . . . . . . . . . . . 16
3.10 Payments at Risk . . . . . . . . . . . . . . . . . . . . 16 3.10. Payments at Risk . . . . . . . . . . . . . . . . . . . . . 17
3.11 Legal Action . . . . . . . . . . . . . . . . . . . . . . 17 3.11. Legal Action . . . . . . . . . . . . . . . . . . . . . . . 18
3.12 Circles of Trust . . . . . . . . . . . . . . . . . . . . 18 3.12. Circles of Trust . . . . . . . . . . . . . . . . . . . . . 18
3.13 Centralized SIP Providers . . . . . . . . . . . . . . . 18 3.13. Centralized SIP Providers . . . . . . . . . . . . . . . . 19
3.14 Sender Checks . . . . . . . . . . . . . . . . . . . . . 19 4. Authenticated Identity in Email . . . . . . . . . . . . . . . 20
4. Authenticated Identity in SIP . . . . . . . . . . . . . . . 20 4.1. Sender Checks . . . . . . . . . . . . . . . . . . . . . . 20
5. Framework for Anti-Spam in SIP . . . . . . . . . . . . . . . 21 4.2. Signature-Based Techniques . . . . . . . . . . . . . . . . 20
6. Additional Work . . . . . . . . . . . . . . . . . . . . . . 22 5. Authenticated Identity in SIP . . . . . . . . . . . . . . . . 21
7. Security Considerations . . . . . . . . . . . . . . . . . . 22 6. Framework for Anti-Spam in SIP . . . . . . . . . . . . . . . . 22
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 22 7. Additional Work . . . . . . . . . . . . . . . . . . . . . . . 23
9. Informative References . . . . . . . . . . . . . . . . . . . 22 8. Security Considerations . . . . . . . . . . . . . . . . . . . 23
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 24 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23
Intellectual Property and Copyright Statements . . . . . . . 26 10. Informative References . . . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27
Intellectual Property and Copyright Statements . . . . . . . . . . 28
1. Introduction 1. Introduction
Spam, defined as the transmission of bulk unsolicited email, has been Spam, defined as the transmission of bulk unsolicited email, has been
a plague on the Internet email system, rendering it nearly useless. a plague on the Internet email system, rendering it nearly useless.
Many solutions have been documented and deployed to counter the Many solutions have been documented and deployed to counter the
problem. None of these solutions is ideal. However, one thing is problem. None of these solutions is ideal. However, one thing is
clear: the spam problem would be much less significant had solutions clear: the spam problem would be much less significant had solutions
been deployed ubiquitously before the problem became widespread. been deployed ubiquitously before the problem became widespread.
The Session Initiation Protocol (SIP) [2] is used for multimedia The Session Initiation Protocol (SIP) [2] is used for multimedia
communications between users, including voice, video, instant communications between users, including voice, video, instant
messaging and presence. Although it has seen widespread deployment, messaging and presence. Consequently, it can be just as much of a
the deployments today have mostly been in disconnected islands. target for spam as email. To deal with this, solutions need to be
Providers have not yet connected to each other in significant ways, defined and recommendations put into place for dealing with spam as
nor have they yet opened up access so as to allow receipt of SIP soon as possible.
messaging from the open Internet. Possibly as a result of this, SIP
networks have not yet been the target of any significant amount of
spam. However, we believe that it is just a matter of time.
It is important that the SIP community react now, rather than later, This document serves to meet those goals by defining the problem
and define and deploy anti-spam measures before the problem arises. space more concretely, analyzing the applicability of solutions used
This document serves to help frame the problem of spam in SIP and in the email space, identifying protocol mechanisms that have been
analyze the solution space in order to help determine a path forward. defined for SIP which can help the problem, and making
recommendations for implementors.
2. Problem Definition 2. Problem Definition
The spam problem in email is well understood, and we make no attempt The spam problem in email is well understood, and we make no attempt
to further elaborate on it here. The question, however, is what is to further elaborate on it here. The question, however, is what is
the meaning of spam when applied to SIP? Since SIP covers a broad the meaning of spam when applied to SIP? Since SIP covers a broad
range of functionality, there appear to be three related but range of functionality, there appear to be three related but
different manifestations: different manifestations:
Call Spam: This type of spam is defined as a bulk unsolicited set of Call Spam: This type of spam is defined as a bulk unsolicited set of
skipping to change at page 4, line 7 skipping to change at page 4, line 7
the message that the spammer is seeking to convey. IM spam is the message that the spammer is seeking to convey. IM spam is
most naturally sent using the SIP MESSAGE [3] request. However, most naturally sent using the SIP MESSAGE [3] request. However,
any other request which causes content to automatically appear on any other request which causes content to automatically appear on
the user's display will also suffice. That might include INVITE the user's display will also suffice. That might include INVITE
requests with large Subject headers (since the Subject is requests with large Subject headers (since the Subject is
sometimes rendered to the user), or INVITE requests with text or sometimes rendered to the user), or INVITE requests with text or
HTML bodies. HTML bodies.
Presence Spam: This type of spam is similar to IM spam. It is Presence Spam: This type of spam is similar to IM spam. It is
defined as a bulk unsolicited set of presence requests (i.e., defined as a bulk unsolicited set of presence requests (i.e.,
SUBSCRIBE requests [4] for the presence event package [7]), in an SUBSCRIBE requests [4] for the presence event package [6]), in an
attempt to get on the "buddy list" or "white list" of a user in attempt to get on the "buddy list" or "white list" of a user in
order to send them IM or initiate other forms of communications. order to send them IM or initiate other forms of communications.
There are many other SIP messages that a spammer might send. There are many other SIP messages that a spammer might send.
However, most of the other ones do not result in content being However, most of the other ones do not result in content being
delivered to a user, nor do they seek input from a user. Rather, delivered to a user, nor do they seek input from a user. Rather,
they are answered by automata. OPTIONS is a good example of this. they are answered by automata. OPTIONS is a good example of this.
There is little value for a spammer in sending an OPTIONS request, There is little value for a spammer in sending an OPTIONS request,
since it is answered automatically by the UAS. No content is since it is answered automatically by the UAS. No content is
delivered to the user, and they are not consulted. delivered to the user, and they are not consulted.
In the sections below, we consider the likelihood of these various In the sections below, we consider the likelihood of these various
forms of SIP spam. This is done in some cases by a rough cost forms of SIP spam. This is done in some cases by a rough cost
analysis. It should be noted that all of these analyses are analysis. It should be noted that all of these analyses are
approximate, and serve only to give a rough sense of the order of approximate, and serve only to give a rough sense of the order of
magnitude of the problem. magnitude of the problem.
2.1 Call Spam 2.1. Call Spam
Will call spam occur? That is an important question to answer. Will call spam occur? That is an important question to answer.
Clearly, it does occur in the existing telephone network, in the form Clearly, it does occur in the existing telephone network, in the form
of telemarketer calls. Although these calls are annoying, they do of telemarketer calls. Although these calls are annoying, they do
not arrive in the same kind of volume as email spam. The difference not arrive in the same kind of volume as email spam. The difference
is cost; it costs more for the spammer to make a phone call than it is cost; it costs more for the spammer to make a phone call than it
does to send email. This cost manifests itself in terms of the cost does to send email. This cost manifests itself in terms of the cost
for systems which can perform telemarketer calls, and in cost per for systems which can perform telemarketer calls, and in cost per
call. call.
skipping to change at page 5, line 48 skipping to change at page 5, line 48
Even ignoring the zombie issue, this reduction in cost is even more Even ignoring the zombie issue, this reduction in cost is even more
amplified for international calls. Currently, there are very few amplified for international calls. Currently, there are very few
telemarketing calls across international borders, largely due to the telemarketing calls across international borders, largely due to the
large cost of making international calls. This is one of the reasons large cost of making international calls. This is one of the reasons
why the "do not call list", a United States national list of numbers why the "do not call list", a United States national list of numbers
that telemarketers cannot call, has been effective. The law only that telemarketers cannot call, has been effective. The law only
affects U.S. companies, but since most telemarketing calls are affects U.S. companies, but since most telemarketing calls are
domestic, it has been effective. Unfortunately (and fortunately), domestic, it has been effective. Unfortunately (and fortunately),
the IP network provides no boundaries of these sorts, and calls to the IP network provides no boundaries of these sorts, and calls to
any SIP URL are possible from anywhere in the world. This will allow any SIP URI are possible from anywhere in the world. This will allow
for international spam at a significantly reduced cost. for international spam at a significantly reduced cost.
International spam is likely to be even more annoying than national International spam is likely to be even more annoying than national
spam, since it may arrive in languages that the recipient doesn't spam, since it may arrive in languages that the recipient doesn't
even speak. even speak.
These figures assume that the primary limitation is the access These figures assume that the primary limitation is the access
bandwidth and not CPU, disk, or termination costs. Termination costs bandwidth and not CPU, disk, or termination costs. Termination costs
merit further discussion. Currently, most VoIP calls terminate on merit further discussion. Currently, most VoIP calls terminate on
the Public Switched Telephone Network (PSTN), and this termination the Public Switched Telephone Network (PSTN), and this termination
costs the originator of the call money. These costs are similar to costs the originator of the call money. These costs are similar to
skipping to change at page 7, line 5 skipping to change at page 7, line 5
termination provider, a spammer can direct SIP calls at traditional termination provider, a spammer can direct SIP calls at traditional
PSTN devices. It is not clear whether email spammers have also been PSTN devices. It is not clear whether email spammers have also been
collecting phone numbers as they perform their web sweeps, but it is collecting phone numbers as they perform their web sweeps, but it is
probably not hard to do so. Furthermore, unlike email addresses, probably not hard to do so. Furthermore, unlike email addresses,
phone numbers are a finite address space and one that is fairly phone numbers are a finite address space and one that is fairly
densely packed. As a result, going sequentially through phone densely packed. As a result, going sequentially through phone
numbers is likely to produce a fairly high hit rate. Thus, it seems numbers is likely to produce a fairly high hit rate. Thus, it seems
like the cost is relatively low for a spammer to obtain large numbers like the cost is relatively low for a spammer to obtain large numbers
of SIP addresses to which spam can be directed. of SIP addresses to which spam can be directed.
2.2 IM Spam 2.2. IM Spam
IM spam is very much like email, in terms of the costs for deploying IM spam is very much like email, in terms of the costs for deploying
and generating spam. Assuming, for the sake of argument, a 1kB and generating spam. Assuming, for the sake of argument, a 1kB
message to be sent and 500 Kbps of upstream bandwidth, that's 62 message to be sent and 500 Kbps of upstream bandwidth, that's 62
messages per second. At $50/month, the result is 31 microcents per messages per second. At $50/month, the result is 31 microcents per
message. This is less than voice spam, but not substantially less. message. This is less than voice spam, but not substantially less.
The cost is probably on par with email spam. However, IM is much The cost is probably on par with email spam. However, IM is much
more intrusive than email. In today's systems, IMs automatically pop more intrusive than email. In today's systems, IMs automatically pop
up and present themselves to the user. Email, of course, must be up and present themselves to the user. Email, of course, must be
deliberately selected and displayed. However, many IM systems employ deliberately selected and displayed. However, many IM systems employ
skipping to change at page 7, line 32 skipping to change at page 7, line 32
It is important to point out that there are two different types of IM It is important to point out that there are two different types of IM
systems. Page mode IM systems work much like email, with each IM systems. Page mode IM systems work much like email, with each IM
being sent as a separate message. In session mode IM, there is being sent as a separate message. In session mode IM, there is
signaling in advance of communication to establish a session, and signaling in advance of communication to establish a session, and
then IMs are exchanged, perhaps point-to-point, as part of the then IMs are exchanged, perhaps point-to-point, as part of the
session. The modality impacts the types of spam techniques that can session. The modality impacts the types of spam techniques that can
be applied. Techniques for email can be applied identically to page be applied. Techniques for email can be applied identically to page
mode IM, but session mode IM is more like telephony, and many mode IM, but session mode IM is more like telephony, and many
techniques (such as content filtering) are harder to apply. techniques (such as content filtering) are harder to apply.
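
The per-message cost estimate given earlier in this section can be
reproduced with a short calculation. The sketch below is only an
illustration: it assumes that "1kB" means 1000 bytes, that a month has
30 days, and that the link is fully used for sending. The function and
constant names are hypothetical and are not part of the draft.

```python
# Rough per-message cost of IM spam, following the figures in Section 2.2.
MESSAGE_BYTES = 1000                 # "1kB message" (assumed: 1000 bytes)
UPSTREAM_BPS = 500_000               # "500 Kbps of upstream bandwidth"
MONTHLY_COST_USD = 50.0              # "$50/month"
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: 30-day month


def messages_per_second(message_bytes, upstream_bps):
    """Messages that fit into the upstream link each second."""
    return upstream_bps / (message_bytes * 8)


def microcents_per_message(monthly_cost_usd, msgs_per_sec,
                           seconds_per_month=SECONDS_PER_MONTH):
    """Monthly access cost spread over every message sent in a month."""
    messages_per_month = msgs_per_sec * seconds_per_month
    cents_per_message = monthly_cost_usd * 100 / messages_per_month
    return cents_per_message * 1_000_000


rate = messages_per_second(MESSAGE_BYTES, UPSTREAM_BPS)
cost = microcents_per_message(MONTHLY_COST_USD, rate)
print(f"{rate:.1f} messages/s, {cost:.1f} microcents/message")
# prints "62.5 messages/s, 30.9 microcents/message"
```

Under these assumptions the result rounds to the 62 messages per second
and 31 microcents per message quoted in the text.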
2.3 Presence Spam 2.3. Presence Spam
As defined above, presence spam is the generation of bulk unsolicited As defined above, presence spam is the generation of bulk unsolicited
SUBSCRIBE messages. What would be the effect of such spam? Most SUBSCRIBE messages. What would be the effect of such spam? Most
presence systems provide some kind of consent framework. A watcher presence systems provide some kind of consent framework. A watcher
that has not been granted permission to see the user's presence will that has not been granted permission to see the user's presence will
not gain access to their presence. However, the presence request is not gain access to their presence. However, the presence request is
usually noted and conveyed to the user, allowing them to approve or usually noted and conveyed to the user, allowing them to approve or
deny the request. In SIP, this is done using the watcherinfo event deny the request. In SIP, this is done using the watcherinfo event
package [8]. This package allows a user to learn the identity of the package [7]. This package allows a user to learn the identity of the
watcher, in order to make an authorization decision. watcher, in order to make an authorization decision.
Interestingly, this provides a vehicle for conveying information to a Interestingly, this provides a vehicle for conveying information to a
user. By generating SUBSCRIBE requests from identities such as user. By generating SUBSCRIBE requests from identities such as
sip:please-buy-my-product@spam.example.com, brief messages can be sip:please-buy-my-product@spam.example.com, brief messages can be
conveyed to the user, even though the sender does not have, and never conveyed to the user, even though the sender does not have, and never
will receive, permission to access presence. As such, presence spam will receive, permission to access presence. As such, presence spam
can be viewed as a form of IM spam, where the amount of content to be can be viewed as a form of IM spam, where the amount of content to be
conveyed is limited. The limit is equal to the amount of information conveyed is limited. The limit is equal to the amount of information
generated by the watcher that gets conveyed to the user through the generated by the watcher that gets conveyed to the user through the
skipping to change at page 8, line 18 skipping to change at page 8, line 18
3. Solution Space 3. Solution Space
In this section, we consider the various solutions that might be In this section, we consider the various solutions that might be
possible to deal with SIP spam. We primarily consider techniques possible to deal with SIP spam. We primarily consider techniques
that have been employed to deal with email spam. It is important to that have been employed to deal with email spam. It is important to
note that the solutions documented below are not meant to be an note that the solutions documented below are not meant to be an
exhaustive study of the spam solutions used for email but rather just exhaustive study of the spam solutions used for email but rather just
a representative set. We also consider some solutions that appear to a representative set. We also consider some solutions that appear to
be SIP-specific. be SIP-specific.
3.1 Content Filtering 3.1. Content Filtering
The most common form of spam protection used in email is based on The most common form of spam protection used in email is based on
content filtering. These spam filters analyze the content of the content filtering. These spam filters analyze the content of the
email, and look for clues that the email is spam. Bayesian spam email, and look for clues that the email is spam. Bayesian spam
filters are in this category. filters are in this category.
Unfortunately, this type of spam filtering is almost completely Unfortunately, this type of spam filtering is almost completely
useless for call spam. There are two reasons. First, in the case useless for call spam. There are two reasons. First, in the case
where the user answers the call, the call is already established and where the user answers the call, the call is already established and
the user is paying attention before the content is delivered. The the user is paying attention before the content is delivered. The
skipping to change at page 8, line 48 skipping to change at page 8, line 48
recognition is even harder to do and remains primarily an area of recognition is even harder to do and remains primarily an area of
research. research.
Therefore, our conclusion is that the most successful form of anti- Therefore, our conclusion is that the most successful form of anti-
spam measures used in email are almost useless for call spam. spam measures used in email are almost useless for call spam.
IM spam, due to its similarity to email, can be countered with IM spam, due to its similarity to email, can be countered with
content analysis tools. Indeed, the same tools and techniques used content analysis tools. Indeed, the same tools and techniques used
for email will directly work for IM spam. for email will directly work for IM spam.
3.2 Black Lists 3.2. Black Lists
Black listing is an approach whereby the spam filter maintains a list Black listing is an approach whereby the spam filter maintains a list
of addresses that identify spammers. These addresses include both of addresses that identify spammers. These addresses include both
usernames (spammer@domain.com) and entire domains (spammers.com). usernames (spammer@domain.com) and entire domains (spammers.com).
Pure blacklists are not very effective in email for two reasons. Pure blacklists are not very effective in email for two reasons.
First, email addresses are easy to spoof, making it easy for the First, email addresses are easy to spoof, making it easy for the
sender to pretend to be someone else. If the sender varies the sender to pretend to be someone else. If the sender varies the
addresses they send from, the black list becomes almost completely addresses they send from, the black list becomes almost completely
useless. The second problem is that, even if the sender doesn't useless. The second problem is that, even if the sender doesn't
forge the from address, email addresses are in almost limitless forge the from address, email addresses are in almost limitless
supply. Each domain contains an infinite supply of email addresses, supply. Each domain contains an infinite supply of email addresses,
and new domains can be obtained for very low cost. Furthermore, and new domains can be obtained for very low cost. Furthermore,
there will always be public providers that will allow users to obtain there will always be public providers that will allow users to obtain
identities for almost no cost (for example, Yahoo or AOL mail identities for almost no cost (for example, Yahoo or AOL mail
accounts). The entire domain cannot be blacklisted because it accounts). The entire domain cannot be blacklisted because it
contains so many valid users. Blacklisting needs to be for contains so many valid users. Blacklisting needs to be for
individual users. Those identities are easily changed. individual users. Those identities are easily changed.
As a result, as long as identities are easy to manufacture, black As a result, as long as identities are easy to manufacture, black
lists will have limited effectiveness for email. lists will have limited effectiveness for email.
Blacklists are also likely to be ineffective for SIP spam. Blacklists are also likely to be ineffective for SIP spam.
Fortunately, SIP has much stronger mechanisms for inter-domain Fortunately, SIP has much stronger mechanisms for inter-domain
authenticated identity than email has (see Section 4). Assuming authenticated identity than email has (see Section 5). Assuming
these mechanisms are used and enabled in inter-domain communications, these mechanisms are used and enabled in inter-domain communications,
it becomes nearly impossible to forge sender addresses. However, it it becomes nearly impossible to forge sender addresses. However, it
still remains cheap to obtain a nearly infinite supply of addresses. still remains cheap to obtain a nearly infinite supply of addresses.
3.3 White Lists 3.3. White Lists
White lists are the opposite of black lists. A white list is a list of valid White lists are the opposite of black lists. A white list is a list of valid
senders that a user is willing to accept email from. Unlike black senders that a user is willing to accept email from. Unlike black
lists, a spammer cannot change identities to get around the white lists, a spammer cannot change identities to get around the white
list. White lists are susceptible to address spoofing, but a strong list. White lists are susceptible to address spoofing, but a strong
identity authentication mechanism can prevent that problem. As a identity authentication mechanism can prevent that problem. As a
result, the combination of white lists and strong identity is a good result, the combination of white lists and strong identity is a good
form of defense against spam. form of defense against spam.
However, they are not a complete solution, since they would prohibit However, they are not a complete solution, since they would prohibit
skipping to change at page 10, line 21 skipping to change at page 10, line 22
provide a much more secure form of authenticated identity, even for provide a much more secure form of authenticated identity, even for
inter-domain communications. As a result, the problem of forged inter-domain communications. As a result, the problem of forged
senders can be eliminated, making the white list solution feasible. senders can be eliminated, making the white list solution feasible.
The introduction problem remains, however. In email, techniques like The introduction problem remains, however. In email, techniques like
Turing tests have been employed for this purpose. Those are Turing tests have been employed for this purpose. Those are
considered further in the sections below. As with email, a technique considered further in the sections below. As with email, a technique
for solving the introduction problem would need to be applied in for solving the introduction problem would need to be applied in
conjunction with a white list. conjunction with a white list.
3.4 Consent-Based Communications 3.4. Consent-Based Communications
A consent-based solution is used in conjunction with white or black A consent-based solution is used in conjunction with white or black
lists. That is, if user A is not on user B's white or black list, lists. That is, if user A is not on user B's white or black list,
and user A attempts to communicate with user B, user A's attempt is and user A attempts to communicate with user B, user A's attempt is
initially rejected, and they are told that consent is being initially rejected, and they are told that consent is being
requested. Next time user B connects, user B is informed that user A requested. Next time user B connects, user B is informed that user A
had attempted communications. User B can then authorize or reject had attempted communications. User B can then authorize or reject
user A. user A.
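The consent flow described above can be sketched as follows. This is a minimal illustration only; the class name ConsentPolicy, its method names, and the returned status strings are hypothetical and are not taken from any SIP specification.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentPolicy:
    """Hypothetical per-user policy state for user B."""
    white_list: set = field(default_factory=set)
    black_list: set = field(default_factory=set)
    pending: list = field(default_factory=list)

    def handle_attempt(self, sender: str) -> str:
        """Decide what happens when `sender` tries to reach this user."""
        if sender in self.white_list:
            return "accept"
        if sender in self.black_list:
            return "reject"
        # Unknown sender: reject for now and queue a consent request
        # that user B will see the next time they connect.
        if sender not in self.pending:
            self.pending.append(sender)
        return "consent-pending"

    def decide(self, sender: str, allow: bool) -> None:
        """User B authorizes or rejects a queued sender."""
        self.pending.remove(sender)
        (self.white_list if allow else self.black_list).add(sender)
```

Once user B has called decide() for a sender, later attempts from that sender are resolved from the lists alone and no further consent request is generated.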
These kinds of consent-based systems are used widely in presence and These kinds of consent-based systems are used widely in presence and
IM but not in email. This is likely due to the need for a secure IM but not in email. This is likely due to the need for a secure
authenticated identity mechanism, which is a pre-requisite for this authenticated identity mechanism, which is a pre-requisite for this
kind of solution. Since most of today's IM systems are closed, kind of solution. Since most of today's IM systems are closed,
sender identities can be authenticated. sender identities can be authenticated.
This kind of consent-based communications has been standardized in This kind of consent-based communications has been standardized in
SIP for presence, using the watcher information event package [8] and SIP for presence, using the watcher information event package [7] and
data format [9], which allow a user to find out that someone has data format [8], which allow a user to find out that someone has
subscribed. Then, the XML Configuration Access Protocol (XCAP) [11] subscribed. Then, the XML Configuration Access Protocol (XCAP) [10]
is used, along with the XML format for presence authorization [12] to is used, along with the XML format for presence authorization [11] to
provide permission for the user to communicate. However, to date, provide permission for the user to communicate.
these techniques have been applied strictly for presence.
If they were extended to cover IM and calling, would it help? It is A consent framework has also been developed that is applicable to
hard to say. At first glance, it would seem to help a lot. However, other forms of SIP communications [12]. However, this framework
focuses on authorizing the addition of users to "mailing lists",
known as exploders in SIP terminology. Though spammers typically use
such exploder functions, presumably one run by a spammer would not
use this technique. Consequently, this consent framework is not
directly applicable to the spam problem. It is, however, useful as a
tool for managing a white list. Through the PUBLISH mechanism, it
allows a user to upload a permission document [13] which indicates
that they will only accept incoming calls from a particular sender.
Can a consent framework, like the ones used for presence, help solve
call spam? At first glance, it would seem to help a lot. However,
it might just change the nature of the spam. Instead of being it might just change the nature of the spam. Instead of being
bothered with content, in the form of call spam or IM spam, users are bothered with content, in the form of call spam or IM spam, users are
bothered with consent requests. A user's "communications inbox" bothered with consent requests. A user's "communications inbox"
might instead be filled with requests for communications from a might instead be filled with requests for communications from a
multiplicity of users. Those requests for communications don't multiplicity of users. Those requests for communications don't
convey much useful content to the user, but they can convey some. At convey much useful content to the user, but they can convey some. At
the very least, they will convey the identity of the requester. The the very least, they will convey the identity of the requester. The
user part of the SIP URI allows for limited freeform text, and thus user part of the SIP URI allows for limited freeform text, and thus
could be used to convey brief messages. One can imagine receiving could be used to convey brief messages. One can imagine receiving
consent requests with identities like, consent requests with identities like,
skipping to change at page 11, line 44 skipping to change at page 12, line 7
words of text in a single IM, which he can use to identify himself, words of text in a single IM, which he can use to identify himself,
so that the user can determine whether or not more permissions are so that the user can determine whether or not more permissions are
appropriate. However, this 200 words of text may be enough for a appropriate. However, this 200 words of text may be enough for a
spammer to convey their message, in much the same way they might spammer to convey their message, in much the same way they might
convey it in the user part of the SIP URI. convey it in the user part of the SIP URI.
Thus, it seems that a consent-based framework, along with white lists Thus, it seems that a consent-based framework, along with white lists
and black lists, cannot fully solve the problem for SIP, although it and black lists, cannot fully solve the problem for SIP, although it
does appear to help. does appear to help.
3.5 Reputation Systems 3.5. Reputation Systems
A reputation system is also used in conjunction with white or black A reputation system is also used in conjunction with white or black
lists. Assume that user A is not on user B's white list, and they lists. Assume that user A is not on user B's white list, and they
attempt to contact user B. If a consent-based system is used, B is attempt to contact user B. If a consent-based system is used, B is
prompted to consent to communications from A, and a reputation score prompted to consent to communications from A, and a reputation score
might be displayed in order to help B decide whether or not they might be displayed in order to help B decide whether or not they
should accept communications from A. should accept communications from A.
Traditionally, reputation systems are implemented in highly Traditionally, reputation systems are implemented in highly
centralized messaging architectures; the most widespread reputation centralized messaging architectures; the most widespread reputation
skipping to change at page 13, line 23 skipping to change at page 13, line 35
reputation system. Your perception of a particular user's reputation reputation system. Your perception of a particular user's reputation
might be dependent on your relationship to them in the social might be dependent on your relationship to them in the social
network: are they one buddy removed (strong reputation), four buddies network: are they one buddy removed (strong reputation), four buddies
removed (weaker reputation), three buddies removed but connected to removed (weaker reputation), three buddies removed but connected to
you through several of your buddies, etc. This web of trust you through several of your buddies, etc. This web of trust
furthermore would have the very desirable property that circles of furthermore would have the very desirable property that circles of
spammers adding one another to their own buddylists would not affect spammers adding one another to their own buddylists would not affect
your perception of their reputation unless their circle linked to your perception of their reputation unless their circle linked to
your own social network. your own social network.
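One way to picture the "buddies removed" measure is as a shortest-path search over buddy lists. The sketch below assumes buddy lists are available as a simple mapping from user to buddies; the function name buddy_distance and the data layout are illustrative assumptions, not part of any defined mechanism.

```python
from collections import deque


def buddy_distance(buddies, me, other):
    """Return how many buddy hops separate `me` from `other`,
    or None if `other` is unreachable from my social network."""
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        user, hops = queue.popleft()
        if user == other:
            return hops
        for nxt in buddies.get(user, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None
```

A circle of spammers who list only one another is never reached by the search starting from "me", so the function returns None for them, which matches the property described above.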
3.6 Address Obfuscation 3.6. Address Obfuscation
Spammers build up their spam lists by gathering email addresses from Spammers build up their spam lists by gathering email addresses from
web sites and other public sources of information. One way to web sites and other public sources of information. One way to
prevent spam is to make your address difficult or impossible to prevent spam is to make your address difficult or impossible to
gather. Spam bots typically look for text in pages of the form gather. Spam bots typically look for text in pages of the form
"user@domain", and assume that anything of that form is an email "user@domain", and assume that anything of that form is an email
address. To hide from such spam bots, many websites have recently address. To hide from such spam bots, many websites have recently
begun placing email addresses in an obfuscated form, usable to humans begun placing email addresses in an obfuscated form, usable to humans
but difficult for an automaton to read as an email address. Examples but difficult for an automaton to read as an email address. Examples
include forms such as, "user at example dot com" or "j d r o s e n a include forms such as, "user at example dot com" or "j d r o s e n a
t e x a m p l e d o t c o m". t e x a m p l e d o t c o m".
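To illustrate why such forms defeat a simple harvester, the sketch below uses a deliberately naive "user@domain" pattern. The regular expression is an assumption about how a basic spam bot might scan a page; real harvesters may be more sophisticated and may well decode common obfuscations.

```python
import re

# Naive pattern for text of the form user@domain (illustrative only).
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(text):
    """Return every substring that looks like user@domain."""
    return ADDRESS.findall(text)
```

Run against a page containing the plain address, the pattern finds it; run against either obfuscated form quoted above, it finds nothing.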
These techniques are equally applicable to prevention of SIP spam, These techniques are equally applicable to prevention of SIP spam,
and are likely to be equally effective or ineffective in its and are likely to be equally effective or ineffective in its
prevention. prevention.
It is worth mentioning that the source of addresses need not be a web It is worth mentioning that the source of addresses need not be a web
site - any publicly accessible service containing addresses will site - any publicly accessible service containing addresses will
suffice. As a result, ENUM [10] has been cited as a potential gold suffice. As a result, ENUM [9] has been cited as a potential gold
mine for spammers. It would allow a spammer to collect SIP and other mine for spammers. It would allow a spammer to collect SIP and other
URIs by traversing the tree in e164.arpa and mining it for data. URIs by traversing the tree in e164.arpa and mining it for data.
This problem is mitigated in part if only number prefixes, as opposed This problem is mitigated in part if only number prefixes, as opposed
to actual numbers, appear in the DNS. Even in that case, however, it to actual numbers, appear in the DNS. Even in that case, however, it
provides a technique for a spammer to learn which phone numbers are provides a technique for a spammer to learn which phone numbers are
reachable through cheaper direct SIP connectivity. reachable through cheaper direct SIP connectivity.
3.7 Limited Use Addresses 3.7. Limited Use Addresses
A related technique to address obfuscation is limited use addresses. A related technique to address obfuscation is limited use addresses.
In this technique, a user has a large number of email addresses at In this technique, a user has a large number of email addresses at
their disposal. They give out different email addresses to different their disposal. They give out different email addresses to different
people. Once spam begins arriving at an address, the user terminates people. Once spam begins arriving at an address, the user terminates
the address and replaces it with another. the address and replaces it with another.
This technique is equally applicable to SIP. One of the drawbacks of This technique is equally applicable to SIP. One of the drawbacks of
the approach is that it can make it hard for people to reach you; if the approach is that it can make it hard for people to reach you; if
an email address you hand out to a friend becomes spammed, changing an email address you hand out to a friend becomes spammed, changing
it requires you to inform your friend of the new address. SIP can it requires you to inform your friend of the new address. SIP can
help solve this problem in part, by making use of presence [7]. help solve this problem in part, by making use of presence [6].
Instead of handing out your email address to your friends, you would Instead of handing out your email address to your friends, you would
hand out your presence URI. When a friend wants to send you an hand out your presence URI. When a friend wants to send you an
email, they subscribe to your presence (indeed, they are likely email, they subscribe to your presence (indeed, they are likely
continuously subscribed from a buddy list application). The presence continuously subscribed from a buddy list application). The presence
data can include an email address where you can be reached. This data can include an email address where you can be reached. This
email address can be obfuscated and be of single use, different for email address can be obfuscated and be of single use, different for
each buddy who requests your presence. It can also be constantly each buddy who requests your presence. It can also be constantly
changed, as these changes are pushed directly to your buddies. In a changed, as these changes are pushed directly to your buddies. In a
sense, the buddy list represents an automatically updated address sense, the buddy list represents an automatically updated address
book, and would therefore eliminate the problem. book, and would therefore eliminate the problem.
3.8 Turing Tests Another approach is to give a different address to each and every
correspondent, so that it is never necessary to tell a "good" user
that an address needs to be changed. This is an extreme form of
limited use addresses, which can be called a single-use address.
Mechanisms are available in SIP for the generation of an
infinite supply of single use addresses [25]. However, the hard part
remains a useful mechanism for distribution and management of those
addresses.
3.8. Turing Tests
In email, Turing tests are those solutions whereby the sender of the In email, Turing tests are those solutions whereby the sender of the
message is given some kind of puzzle or challenge, which only a human message is given some kind of puzzle or challenge, which only a human
can answer. If the puzzle is answered correctly, the sender is can answer. If the puzzle is answered correctly, the sender is
placed on the user's white list. These puzzles frequently take the placed on the user's white list. These puzzles frequently take the
form of recognizing a word or sequence of numbers in an image with a form of recognizing a word or sequence of numbers in an image with a
lot of background noise. Automata cannot easily perform the image lot of background noise. Automata cannot easily perform the image
recognition needed to extract the word or number sequence, but a recognition needed to extract the word or number sequence, but a
human user usually can. Since Turing tests rely on video or audio human user usually can. Since Turing tests rely on video or audio
puzzles, they sometimes cannot be solved by individuals with puzzles, they sometimes cannot be solved by individuals with
skipping to change at page 15, line 37 skipping to change at page 16, line 8
In the case of voice, there are problems with the Turing test In the case of voice, there are problems with the Turing test
described above. First, it is language specific. The application described above. First, it is language specific. The application
could be made to run in different languages, if the caller indicates could be made to run in different languages, if the caller indicates
their supported languages. This is possible in SIP, using the their supported languages. This is possible in SIP, using the
Accept-Language header field, but this is not widely used at the Accept-Language header field, but this is not widely used at the
moment. moment.
The other problem with this Turing test is the same one that email The other problem with this Turing test is the same one that email
tests have: instead of having an automaton process the test, a spammer tests have: instead of having an automaton process the test, a spammer
can pay cheap workers to take the tests. Assuming cheap labor in a can pay cheap workers to take the tests. Assuming cheap labor in a
poor country can be obtained for about $100 US dollars per year, and poor country can be obtained for about 60 cents per hour, and
assuming a Turing test of 30 second duration, this ends up being assuming a Turing test of 30 second duration, this is about 0.5 cents
about ten thousand messages per dollar, or about 10,000 microcents per test and thus 0.5 cents per message to send an IM spam. Lower
per message. Though much more expensive than the 31 microcents per labor rates would reduce this further; the number quoted here is
message to send an IM spam, it is still relatively inexpensive. based on real online bids in September of 2006 made for actual work
of this type.
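As a rough check of the labor-cost figure, the hourly rate and test duration quoted in the text can be worked through directly. The function name below is hypothetical; the inputs are the 60 cents per hour and 30 seconds given above.

```python
def cost_per_test_cents(hourly_rate_cents, test_seconds):
    """Labor cost of one solved test, in cents."""
    return hourly_rate_cents * test_seconds / 3600


# 60 cents per hour, 30 seconds per test.
per_test = cost_per_test_cents(60, 30)
tests_per_dollar = 100 / per_test
```

Under these assumptions the cost comes to half a cent per test, or 200 solved tests per dollar.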
As an alternative to paying cheap workers to take the tests, the As an alternative to paying cheap workers to take the tests, the
tests can be taken by human users that are tricked into completing tests can be taken by human users that are tricked into completing
the tests in order to gain access to what they believe is a the tests in order to gain access to what they believe is a
legitimate resource. This was done by a spambot that posted the legitimate resource. This was done by a spambot that posted the
tests on a pornography site, and required users to complete the tests tests on a pornography site, and required users to complete the tests
in order to gain access to content. in order to gain access to content.
Due to these limitations, Turing tests may never completely solve the Due to these limitations, Turing tests may never completely solve the
problem. problem.
3.9 Computational Puzzles 3.9. Computational Puzzles
This technique is similar to Turing tests. When user A tries to This technique is similar to Turing tests. When user A tries to
communicate with user B, user B asks user A to perform a computation communicate with user B, user B asks user A to perform a computation
and pass the result back. This computation has to be something a and pass the result back. This computation has to be something a
human user cannot perform and something expensive enough to increase human user cannot perform and something expensive enough to increase
user A's cost to communicate. This cost increase has to be high user A's cost to communicate. This cost increase has to be high
enough to make it prohibitively expensive for spammers but enough to make it prohibitively expensive for spammers but
inconsequential for legitimate users. inconsequential for legitimate users.
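A minimal sketch of such a puzzle follows. It assumes a hashcash-style scheme in which user A must find a nonce whose SHA-256 digest, computed over the challenge plus the nonce, falls below a target; this particular construction is an assumption for illustration and is not specified by this document. The function names solve and verify are hypothetical.

```python
import hashlib
from itertools import count


def _digest_value(challenge: bytes, nonce: int) -> int:
    """SHA-256 of challenge || nonce, interpreted as a big integer."""
    data = challenge + str(nonce).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def solve(challenge: bytes, bits: int) -> int:
    """User A: search for a nonce giving `bits` leading zero bits.
    Expected work is about 2**bits hash evaluations."""
    target = 1 << (256 - bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """User B: check the answer with a single hash evaluation."""
    return _digest_value(challenge, nonce) < (1 << (256 - bits))
```

The asymmetry is the point of the design: solving costs user A many hash evaluations, while user B verifies with one. The difficulty parameter is where the cost trade-off between spammers and legitimate users would be tuned.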
One of the problems with the technique is that there is wide One of the problems with the technique is that there is wide
variation in the computational power of the various clients that variation in the computational power of the various clients that
might legitimately communicate. The CPU speed on a low end cell might legitimately communicate. The CPU speed on a low end cell
phone is around 50 MHz, while a high end PC approaches 5 GHz. This phone is around 50 MHz, while a high end PC approaches 5 GHz. This
represents almost two orders of magnitude difference. Thus, if the represents almost two orders of magnitude difference. Thus, if the
test is designed to be reasonable for a cell phone to perform, it is test is designed to be reasonable for a cell phone to perform, it is
two orders of magnitude cheaper to perform for a spammer on a high two orders of magnitude cheaper to perform for a spammer on a high
end machine. Recent research has focused on defining computational end machine. Recent research has focused on defining computational
puzzles that challenge the CPU/memory bandwidth, as opposed to just puzzles that challenge the CPU/memory bandwidth, as opposed to just
the CPU [19]. It seems that there is less variety in the CPU/memory the CPU [22]. It seems that there is less variety in the CPU/memory
bandwidth across devices, roughly a single order of magnitude. bandwidth across devices, roughly a single order of magnitude.
Recent work [21] suggests that, due to the ability of spammers to use Recent work [24] suggests that, due to the ability of spammers to use
virus-infected machines (also known as zombies) to generate the spam, virus-infected machines (also known as zombies) to generate the spam,
the amount of computational power available to the spammers is the amount of computational power available to the spammers is
substantial, and it may be impossible to have them compute a puzzle substantial, and it may be impossible to have them compute a puzzle
that is sufficiently hard that will not also block normal emails. that is sufficiently hard that will not also block normal emails.
However, if combined with white listing, the computational puzzles However, if combined with white listing, the computational puzzles
only become needed for validating new communication partners. The only become needed for validating new communication partners. The
frequency of communications with new partners is arguably higher for frequency of communications with new partners is arguably higher for
email than for multimedia, and thus the computational puzzle email than for multimedia, and thus the computational puzzle
techniques may be more effective for SIP than for email in dealing techniques may be more effective for SIP than for email in dealing
with the introduction problem. with the introduction problem.
These techniques are an active area of research right now, and any These techniques are an active area of research right now, and any
results for email are likely to be usable for SIP. Of course, it is results for email are likely to be usable for SIP. Of course, it is
likely that these techniques will come with a lot of patents and likely that these techniques will come with a lot of patents and
other intellectual property constraints. other intellectual property constraints.
3.10 Payments at Risk 3.10. Payments at Risk
This approach has been proposed for email [20]. When user A sends to This approach has been proposed for email [23]. When user A sends to
user B, user A deposits a small amount of money (say, one dollar) user B, user A deposits a small amount of money (say, one dollar)
into user B's account. If user B decides that the message is not into user B's account. If user B decides that the message is not
spam, user B refunds this money back to user A. If the message is spam, user B refunds this money back to user A. If the message is
spam, user B keeps the money. This technique requires two spam, user B keeps the money. This technique requires two
transactions to complete: a transfer from A to B, and a transfer from transactions to complete: a transfer from A to B, and a transfer from
B back to A. The first transfer has to occur before the message can B back to A. The first transfer has to occur before the message can
be received in order to avoid reuse of "pending payments" across be received in order to avoid reuse of "pending payments" across
several messages, which would eliminate the utility of the solution. several messages, which would eliminate the utility of the solution.
The second one then needs to occur when the message is found not to The second one then needs to occur when the message is found not to
be spam. be spam.
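The two-transaction sequence can be sketched as below. The Ledger class and deliver function are hypothetical stand-ins for a payment provider, and the sketch ignores any provider fees.

```python
class Ledger:
    """Hypothetical account store, with balances held in cents."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, src, dst, cents):
        if self.balances[src] < cents:
            raise ValueError("insufficient funds")
        self.balances[src] -= cents
        self.balances[dst] += cents


def deliver(ledger, sender, recipient, deposit_cents, is_spam):
    """Apply the payments-at-risk sequence to one message."""
    # First transfer: must complete before the message is received,
    # so a deposit cannot be reused across several messages.
    ledger.transfer(sender, recipient, deposit_cents)
    # Second transfer: refund only if the recipient judges the
    # message not to be spam.
    if not is_spam:
        ledger.transfer(recipient, sender, deposit_cents)
```

A legitimate sender ends with an unchanged balance, while a sender whose message is judged to be spam forfeits the deposit.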
skipping to change at page 17, line 33 skipping to change at page 18, line 5
shouldered per user is equal to the number of messages from unknown shouldered per user is equal to the number of messages from unknown
senders (that is, senders not on the white list) that are received. senders (that is, senders not on the white list) that are received.
For a busy user, assume about 10 new senders per day. If the deposit For a busy user, assume about 10 new senders per day. If the deposit
is 5 cents, the transaction provider would take .75 cents and deliver is 5 cents, the transaction provider would take .75 cents and deliver
4.25 cents. If the sender is allowed, the recipient returns 4.25 4.25 cents. If the sender is allowed, the recipient returns 4.25
cents, the provider takes .64 cents, and returns 3.6 cents. This cents, the provider takes .64 cents, and returns 3.6 cents. This
costs the sender .65 cents on each transaction, if it was legitimate. costs the sender .65 cents on each transaction, if it was legitimate.
If there are ten new recipients per day, that's US $1.95 per month, If there are ten new recipients per day, that's US $1.95 per month,
which is relatively inexpensive. which is relatively inexpensive.
3.11 Legal Action 3.11. Legal Action
In this solution, countries pass laws that prohibit spam. These laws In this solution, countries pass laws that prohibit spam. These laws
could apply to IM or call spam just as easily as they could apply to could apply to IM or call spam just as easily as they could apply to
email spam. email spam.
There is a lot of debate about whether these laws would really be There is a lot of debate about whether these laws would really be
effective in preventing spam. Whether they are or are not effective, effective in preventing spam. Whether they are or are not effective,
they would appear to be equally effective (or ineffective, as the they would appear to be equally effective (or ineffective, as the
case may be) in preventing SIP spam. case may be) in preventing SIP spam.
skipping to change at page 18, line 18 skipping to change at page 18, line 38
There are also schemes that cause laws other than anti-spam laws to There are also schemes that cause laws other than anti-spam laws to
be broken if spam is sent. This does not inherently reduce spam, but be broken if spam is sent. This does not inherently reduce spam, but
it allows more legal options to be brought to bear against the it allows more legal options to be brought to bear against the
spammer. For example, Habeas <http://www.habeas.com> inserts spammer. For example, Habeas <http://www.habeas.com> inserts
material in the header that, if a spammer inserted it without an material in the header that, if a spammer inserted it without an
appropriate license, allegedly causes the spammer to be violating US appropriate license, allegedly causes the spammer to be violating US
copyright and trademark laws, possibly reciprocal laws, and similar copyright and trademark laws, possibly reciprocal laws, and similar
laws in many countries. laws in many countries.
3.12 Circles of Trust 3.12. Circles of Trust
In this model, a group of domains (e.g., a set of enterprises) all In this model, a group of domains (e.g., a set of enterprises) all
get together. They agree to exchange SIP calls amongst each other, get together. They agree to exchange SIP calls amongst each other,
and they also agree to introduce a fine should any one of them be and they also agree to introduce a fine should any one of them be
caught spamming. Each company would then enact measures to terminate caught spamming. Each company would then enact measures to terminate
employees who spam from their accounts. employees who spam from their accounts.
This technique relies on secure inter-domain authentication - that This technique relies on secure inter-domain authentication - that
is, domain B can know that messages are received from domain A. In is, domain B can know that messages are received from domain A. In
SIP, this is readily provided by usage of the mutually authenticated SIP, this is readily provided by usage of the mutually authenticated
skipping to change at page 18, line 40 skipping to change at page 19, line 12
domain identification, although new techniques are being investigated domain identification, although new techniques are being investigated
to add it using reverse DNS checks (see below). to add it using reverse DNS checks (see below).
This kind of technique works well for small domains or small sets of This kind of technique works well for small domains or small sets of
providers, where these policies can be easily enforced. However, it providers, where these policies can be easily enforced. However, it
is unclear how well it scales up. Could a very large domain truly is unclear how well it scales up. Could a very large domain truly
prevent its users from spamming? Would a very large enterprise just prevent its users from spamming? Would a very large enterprise just
pay the fine? How would the pricing be structured to allow both pay the fine? How would the pricing be structured to allow both
small and large domains alike to participate? small and large domains alike to participate?
3.13 Centralized SIP Providers 3.13. Centralized SIP Providers
In this technique, a small number of providers get established as In this technique, a small number of providers get established as
"inter-domain SIP providers". These providers act as a SIP- "inter-domain SIP providers". These providers act as a SIP-
equivalent to the interexchange carriers in the PSTN. Every equivalent to the interexchange carriers in the PSTN. Every
enterprise, consumer SIP provider or other SIP network (call these enterprise, consumer SIP provider or other SIP network (call these
the local SIP providers) connects to one of these inter-domain the local SIP providers) connects to one of these inter-domain
providers. The local SIP providers only accept SIP messages from providers. The local SIP providers only accept SIP messages from
their chosen inter-domain provider. The inter-domain provider their chosen inter-domain provider. The inter-domain provider
charges the local provider, per SIP message, for the delivery of SIP charges the local provider, per SIP message, for the delivery of SIP
messages to other local providers. The local provider can choose to messages to other local providers. The local provider can choose to
skipping to change at page 19, line 21 skipping to change at page 19, line 41
of funds between the inter-domain providers. of funds between the inter-domain providers.
The result of such a system is that a fixed cost can be associated The result of such a system is that a fixed cost can be associated
with sending a SIP message, and that this cost does not require with sending a SIP message, and that this cost does not require
micro-payments to be exchanged between local providers, as it does in micro-payments to be exchanged between local providers, as it does in
Section 3.10. Since all of the relationships are pre-established and Section 3.10. Since all of the relationships are pre-established and
negotiated, cheaper techniques for monetary transactions (such as negotiated, cheaper techniques for monetary transactions (such as
monthly post-paid transactions) can be used. monthly post-paid transactions) can be used.
This technique can be made to work in SIP, whereas it cannot in This technique can be made to work in SIP, whereas it cannot in
email, because inter-domain SIP connectivity has not yet been email, because inter-domain SIP connectivity has not yet been broadly
established. In email, there already exists a no-cost form of inter- established. In email, there already exists a no-cost form of inter-
domain connectivity that cannot be eliminated without destroying the domain connectivity that cannot be eliminated without destroying the
utility of email. If, however, SIP inter-domain communications get utility of email. If, however, SIP inter-domain communications get
established from the start using this structure, there is a path to established from the start using this structure, there is a path to
deployment. deployment.
This structure is more or less the same as the one in place for the This structure is more or less the same as the one in place for the
PSTN today, and since there is relatively little spam on the PSTN PSTN today, and since there is relatively little spam on the PSTN
(compared to email!), there is some proof that this kind of (compared to email!), there is some proof that this kind of
arrangement can work. However, it puts back into SIP much of the arrangement can work. However, it puts back into SIP much of the
complexity and monopolistic structures that SIP promised to complexity and monopolistic structures that SIP promised to
eliminate. As such, it is a solution that the authors find eliminate. As such, it is a solution that the authors find
distasteful and contrary to the SIP design and architecture. distasteful and contrary to the SIP design and architecture.
3.14 Sender Checks 4. Authenticated Identity in Email
In email, there has been a lot of interest in defining new DNS Though not a form of anti-spam in and of itself, authenticated or
resource records that will allow a domain that receives a message to verifiable identities are a key part of making other anti-spam
verify that the sender is a valid MTA for the sending domain [18] mechanisms work. Many of the techniques described above are most
[16]. effective when combined with a white or black list, which itself
requires a strong form of identity.
In email, two types of authenticated identity have been developed -
sender checks and signature-based solutions.
4.1. Sender Checks
In email, DNS resource records have been defined that will allow a
domain that receives a message to verify that the sender is a valid
MTA for the sending domain [18] [19] [20] [21]. They don't prevent
spam by themselves, but may help in preventing spoofed emails. As
has been mentioned several times, a form of strong authenticated
identity is key in making many other anti-spam techniques work.
Are these techniques useful for SIP? They can be used for SIP but Are these techniques useful for SIP? They can be used for SIP but
are not necessary. In email, there are no standards established for are not necessary. In email, there are no standards established for
securely identifying the identity of the sending domain of a message. securely identifying the identity of the sending domain of a message.
In SIP, however, TLS with mutual authentication can be used inter- In SIP, however, TLS with mutual authentication can be used inter-
domain. A provider receiving a message can then reject any message domain. A provider receiving a message can then reject any message
coming from a domain that does not match the asserted identity of the coming from a domain that does not match the asserted identity of the
sender of the message. Such a policy only works in the "trapezoid" sender of the message. Such a policy only works in the "trapezoid"
model of SIP, whereby there are only two domains in any call - the model of SIP, whereby there are only two domains in any call - the
sending domain, which is where the originator resides, and the sending domain, which is where the originator resides, and the
receiving domain. These techniques are discussed in Section 26.3.2.2 receiving domain. These techniques are discussed in Section 26.3.2.2
of RFC 3261 [2]. These techinques, however, are only applicable in of RFC 3261 [2]. In forwarding situations, the assumption no longer
the trapezoid model where there is a sending and a receiving domain holds and these techniques no longer work.
only. In forwarding situations, the assumption no longer holds and
these techniques no longer work.
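
The inter-domain policy described above can be sketched as follows. This
is an illustrative sketch only, not text from the draft. It assumes the
TLS layer has already validated the peer certificate and handed over the
domain names it contains. The function names domain_of and
accept_from_peer are hypothetical, and the URI parsing is deliberately
simplified (no IPv6 literals, no name-addr brackets).

```python
def domain_of(sip_uri: str) -> str:
    """Return the host part of a simple sip: or sips: URI.

    Simplified parsing for illustration; not a full RFC 3261 URI parser.
    """
    rest = sip_uri.split(":", 1)[1]        # drop the scheme
    hostport = rest.split("@", 1)[-1]      # drop the user part, if any
    hostport = hostport.split(";", 1)[0]   # drop URI parameters
    hostport = hostport.split("?", 1)[0]   # drop URI headers
    return hostport.split(":", 1)[0].lower()  # drop the port


def accept_from_peer(cert_domains, from_uri: str) -> bool:
    """Accept a request only if the asserted sender domain matches one of
    the domains the peer authenticated as over mutual TLS.

    cert_domains is assumed to come from an already validated certificate.
    """
    sender_domain = domain_of(from_uri)
    return sender_domain in {d.lower() for d in cert_domains}
```

Such a check is only meaningful in the trapezoid model. Once a request is
forwarded through a third domain, the TLS peer is no longer the
originating domain and the comparison fails for legitimate traffic.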
Thus, instead of creating DNS entries containing the IP address of However, the authenticated identity mechanism for SIP, discussed
each legitimate relay for a domain, the provider can give each below, does work in more complex network configurations and provides
legitimate relay a certificate that allows them to authenticate fairly strong assertion of identity.
themselves as coming from that domain. Such a technique would work
even in the face of IP address spoofing, which the marid techniques
are susceptible to.
4. Authenticated Identity in SIP 4.2. Signature-Based Techniques
Domain Keys Identified Mail (DKIM) [16] (and several non-standard
techniques that preceded it) provide stronger identity assertions by
allowing the sending domain to sign an email, and then providing
mechanisms by which the receiving MTA or MUA can validate the
signature.
Unfortunately, when used with blacklists, this kind of authenticated
identity is only as useful as the fraction of the emails which
utilize it. This is partly true for whitelists as well; if any
unauthenticated email is accepted for an address on a white list, a
spammer can spoof that address. However a white list can be
effective with limited deployment of DKIM if all of the people on the
white list are those whose domains are utilizing the mechanism.
This kind of identity mechanism is also applicable to SIP, and is in
fact exactly what is defined by SIP's authenticated identity
mechanism [17].
5. Authenticated Identity in SIP
One of the key parts of many of the solutions described above is the One of the key parts of many of the solutions described above is the
ability to securely identify the identity of a sender of a SIP ability to securely identify the identity of a sender of a SIP
message. SIP provides a secure solution for this problem, and it is message. SIP provides a secure solution for this problem, and it is
important to discuss it here. important to discuss it here.
The solution starts by having each domain authenticate its own users. The solution starts by having each domain authenticate its own users.
SIP provides HTTP digest authentication as part of the core SIP SIP provides HTTP digest authentication as part of the core SIP
specification, and all clients and servers are required to support specification, and all clients and servers are required to support
it. Indeed, digest is widely deployed for SIP. However, digest it. Indeed, digest is widely deployed for SIP. However, digest
skipping to change at page 20, line 43 skipping to change at page 21, line 44
not widely deployed yet. In the long term, this approach will be not widely deployed yet. In the long term, this approach will be
necessary for the security properties needed to prevent SIP spam. necessary for the security properties needed to prevent SIP spam.
Once a domain has authenticated the identity of a user, when it Once a domain has authenticated the identity of a user, when it
relays a message from that user to another domain, the sending domain relays a message from that user to another domain, the sending domain
can assert the identity of the sender, and include a signature to can assert the identity of the sender, and include a signature to
validate that assertion. This is done using the SIP identity validate that assertion. This is done using the SIP identity
mechanism [17]. mechanism [17].
A weaker form of identity assertion is possible using the P-Asserted- A weaker form of identity assertion is possible using the P-Asserted-
Identity header field [6], but this technique requires mutual trust Identity header field [5], but this technique requires mutual trust
among all domains. Unfortunately, this becomes expontentially harder among all domains. Unfortunately, this becomes exponentially harder
to provide as the number of interconnected domains grows. As that to provide as the number of interconnected domains grows. As that
happens, the value of the identity assertion becomes equal to the happens, the value of the identity assertion becomes equal to the
trustworthiness of the least trustworthy domain. Since spam is a trustworthiness of the least trustworthy domain. Since spam is a
consequence of untrusted domains and users that get connected to the consequence of untrusted domains and users that get connected to the
network, the P-Asserted-Identity technique becomes ineffective at network, the P-Asserted-Identity technique becomes ineffective at
exactly the same levels of interconnectedness that introduce spam. exactly the same levels of interconnectedness that introduce spam.
A further weakness of P-Asserted-ID is that the actual domain which Consider the following example to help illustrate this fact. A
asserted the identity cannot be known. If that domain could be malicious domain, let us call them spammers.com, would like to send
reliably known, then its assertions could be tempered based on user SIP INVITE requests with false P-Asserted-Identity, indicating users
or domain-wide policiies. This weakness is not present in [17], outside of its own domain. Spammers.com finds a regional SIP
which allows the recipient of a message to cryptographically provider in a small country who, due to its small size and
determine the identity of the asserting domain. disinterest in spam, accepts any P-Asserted-Identity from its
customers without verification. This provider, in turn, connects to
a larger, interconnect provider. They do ask each of their customers
to verify P-Asserted-Identity but have no easy way of enforcing it.
This provider, in turn, connects to everyone else. As a consequence,
the spammers.com domain is able to inject calls with a spoofed caller
ID. This request can be directed to any recipient reachable through
the network (presumably everyone due to the large size of the root
provider). There is no way for a recipient to know that this
particular P-Asserted-Identity came from this bad spammers.com
domain. As the example shows, even though the central provider's
policy is good, the overall effectiveness of P-Asserted-Identity is
still only as good as the policies of the weakest link in the chain.
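
The weakest-link property in the example above can be made concrete with
a small sketch. This is an illustration only. The per-domain trust scores
are invented numbers on a 0 to 1 scale, and the function name
assertion_trust is hypothetical. Nothing in P-Asserted-Identity itself
carries such scores, which is precisely why a recipient cannot perform
this calculation in practice.

```python
def assertion_trust(chain):
    """Return the effective trust of a P-Asserted-Identity value that has
    passed through every domain in the chain.

    chain is a list of (domain_name, trust) pairs, where trust reflects how
    reliably that domain verifies the identities it forwards. The assertion
    is only as trustworthy as the least trustworthy domain on the path.
    """
    return min(trust for _, trust in chain)


# Hypothetical path taken by a request injected by spammers.com.
path = [
    ("regional provider (no verification)", 0.1),
    ("interconnect provider (asks, cannot enforce)", 0.9),
    ("recipient's provider", 0.95),
]
effective = assertion_trust(path)
```

However good the policy of the interconnect provider is, the effective
value for this path is that of the regional provider.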
SIP also defines the usage of TLS between domains, using mutual SIP also defines the usage of TLS between domains, using mutual
authentication, as part of the base specification. This technique authentication, as part of the base specification. This technique
provides a way for one domain to securely determine that it is provides a way for one domain to securely determine that it is
talking to a server that is a valid representative of another domain. talking to a server that is a valid representative of another domain.
5. Framework for Anti-Spam in SIP 6. Framework for Anti-Spam in SIP
Unfortunately, there is no magic bullet for preventing SIP spam, just Unfortunately, there is no magic bullet for preventing SIP spam, just
as there is none for email spam. However, the combination of several as there is none for email spam. However, the combination of several
techniques can provide a framework for dealing with spam in SIP. techniques can provide a framework for dealing with spam in SIP.
This section provides recommendations for network designers in order
to help mitigate the risk of spam.
Strong Authenticated Identity is Key: In almost all of the solutions There are three core recommendations that can be made.
discussed above, there is a dependency on the ability to
authenticate the sender of a SIP message inter-domain. As such,
we would argue that any provider that performs inter-domain SIP
messaging must use the techniques described in Section 4, and in
particular, depend on the strong identity techniques in [17].
Whitelists: With a strong identity mechanism in place, whitelists can Firstly, in almost all of the solutions discussed above, there is a
facilitate communications from known callers. That reduces the dependency on the ability to authenticate the sender of a SIP message
scope of the problem to the introduction problem. inter-domain. Consent, reputation systems, computational puzzles,
and payments at risk, amongst others, all work best when applied only
to new requests, and successful completion of an introduction results
in the placement of a user on a white list. However, usage of white
lists depends on strong identity assertions. Consequently, any
network that interconnects with others should make use of strong SIP
identity as described in RFC 4474. P-Asserted-Identity is not strong
enough.
Consent Framework: The SIP consent framework [13] extends the Secondly, with a strong identity system in place, networks are
presence framework for consent to all communications. Consent recommended to make use of white lists. These are ideally built off
plays an important role in helping address the introduction of the existing buddy lists if present. If not, separate white lists
problem. can be managed for spam. Placement on these lists can be manual or
based on the successful completion of one or more introduction
mechanisms.
Leverage What Email has to Offer: With the consent framework in This in turn leads to the final recommendation to be made. Network
place, spammers have only a small window through which they can designers should make use of one or more mechanisms meant to solve
introduce content to recipients. Fortunately, that problem is the introduction problem. Indeed, it is possible to use more than
similar to traditional email spam, and can be addressed using the one and combine the results through some kind of weight. A user that
various email-based anti-spam techniques. Providers of SIP successfully completes the introduction mechanism can be
services should keep tabs on solutions in email as they evolve, automatically added to the white list. Of course, that can only be
and utilize the best of what those techniques have to offer. But done usefully if their identity is verified by RFC 4474. The set of
perhaps most importantly, providers should not ignore the spam mechanisms for solving the introduction problem, as described in this
problem until it happens! That is the pitfall email fell into. document, are based on some (but not all) of the techniques known and
As soon as a provider inter-connects with other providers, or used at the time of writing. Providers of SIP services should keep
allows SIP messages from the open Internet, that provider must tabs on solutions in email as they evolve, and utilize the best of
consider how they will deal with spam. what those techniques have to offer.
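
The combination of white lists with weighted introduction mechanisms
described above can be sketched as follows. This is a minimal
illustration under stated assumptions: each introduction mechanism is
assumed to report a score between 0 and 1, and the weights, the threshold
of 0.7, and the function names are all hypothetical rather than anything
defined by the draft or by RFC 4474.

```python
def introduction_score(results, weights):
    """Combine per-mechanism scores (0..1) into one weighted average.

    results maps a mechanism name to its score; weights maps the same
    names to their relative importance.
    """
    total_weight = sum(weights[name] for name in results)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * score for name, score in results.items())
    return weighted / total_weight


def maybe_whitelist(whitelist, sender, identity_verified, results, weights,
                    threshold=0.7):
    """Add sender to the white list if the introduction succeeded.

    Without a verified identity the entry would be spoofable, so nothing
    is added regardless of the score.
    """
    if not identity_verified:
        return False
    if introduction_score(results, weights) >= threshold:
        whitelist.add(sender)
        return True
    return False
```

For example, with weights of 2.0 for consent and 1.0 for a computational
puzzle, a sender who obtains consent (score 1.0) but only partly
completes the puzzle (score 0.5) still passes the threshold, provided the
identity was verified.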
6. Additional Work But perhaps most importantly, providers should not ignore the spam
problem until it happens! That is the pitfall email fell into. As
soon as a provider inter-connects with other providers, or allows SIP
messages from the open Internet, that provider must consider how they
will deal with spam.
7. Additional Work
Though the above framework serves as a good foundation on which to Though the above framework serves as a good foundation on which to
deal with spam in SIP, there are gaps, some of which can be addressed deal with spam in SIP, there are gaps, some of which can be addressed
by additional work that has yet to be undertaken. by additional work that has yet to be undertaken.
One of the difficulties with the strong identity techniques is that a One of the difficulties with the strong identity techniques is that a
receiver of a SIP request without an authenticated identity cannot receiver of a SIP request without an authenticated identity cannot
know whether the request lacked such an identity because the know whether the request lacked such an identity because the
originating domain didn't support it, or because a man-in-the-middle originating domain didn't support it, or because a man-in-the-middle
removed it. As a result, transition mechanisms should be put in removed it. As a result, transition mechanisms should be put in
place to allow these to be differentiated. Without it, the value of place to allow these to be differentiated. Without it, the value of
the identity mechanism is much reduced. the identity mechanism is much reduced.
The consent framework depends on the ability for users to make a 8. Security Considerations
determination about whether to grant consent for unknown senders. In
order for that framework to be useful, it needs to be coupled with
techniques to ascertain trustworthiness. Reputation systems, for
example, can help with that. At this time, reputation systems have
seen implementation only within single domains, and using proprietary
techniques. A standards-based inter-domain solution would be a
valuable part of this framework.
7. Security Considerations
This memo is entirely devoted to issues relating to secure usage of This memo is entirely devoted to issues relating to secure usage of
SIP services on the Internet. SIP services on the Internet.
8. Acknowledgements 9. Acknowledgements
The authors would like to thank Rohan Mahy for providing information The authors would like to thank Rohan Mahy for providing information
on Habeas, Baruch Sterman for providing costs on VoIP termination on Habeas, Baruch Sterman for providing costs on VoIP termination
services, and Gonzalo Camarillo for his review. Useful comments and services, and Gonzalo Camarillo for his review. Useful comments and
feedback were provided by Nils Ohlmeir, Tony Finch, Randy Gellens and feedback were provided by Nils Ohlmeir, Tony Finch, Randy Gellens and
Yakov Shafranovich. Yakov Shafranovich. Jon Peterson wrote some of the text in this
document and has contributed to the work as it has moved along.
9. Informative References 10. Informative References
[1] Campbell, B., "The Message Session Relay Protocol", [1] Campbell, B., "The Message Session Relay Protocol",
draft-ietf-simple-message-sessions-14 (work in progress), draft-ietf-simple-message-sessions-15 (work in progress),
February 2006. July 2006.
[2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., [2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
Session Initiation Protocol", RFC 3261, June 2002. Session Initiation Protocol", RFC 3261, June 2002.
[3] Campbell, B., Rosenberg, J., Schulzrinne, H., Huitema, C., and [3] Campbell, B., Rosenberg, J., Schulzrinne, H., Huitema, C., and
D. Gurle, "Session Initiation Protocol (SIP) Extension for D. Gurle, "Session Initiation Protocol (SIP) Extension for
Instant Messaging", RFC 3428, December 2002. Instant Messaging", RFC 3428, December 2002.
[4] Roach, A., "Session Initiation Protocol (SIP)-Specific Event [4] Roach, A., "Session Initiation Protocol (SIP)-Specific Event
Notification", RFC 3265, June 2002. Notification", RFC 3265, June 2002.
[5] Peterson, J., "A Privacy Mechanism for the Session Initiation [5] Jennings, C., Peterson, J., and M. Watson, "Private Extensions
Protocol (SIP)", RFC 3323, November 2002.
[6] Jennings, C., Peterson, J., and M. Watson, "Private Extensions
to the Session Initiation Protocol (SIP) for Asserted Identity to the Session Initiation Protocol (SIP) for Asserted Identity
within Trusted Networks", RFC 3325, November 2002. within Trusted Networks", RFC 3325, November 2002.
[7] Rosenberg, J., "A Presence Event Package for the Session [6] Rosenberg, J., "A Presence Event Package for the Session
Initiation Protocol (SIP)", RFC 3856, August 2004. Initiation Protocol (SIP)", RFC 3856, August 2004.
[8] Rosenberg, J., "A Watcher Information Event Template-Package [7] Rosenberg, J., "A Watcher Information Event Template-Package
for the Session Initiation Protocol (SIP)", RFC 3857, for the Session Initiation Protocol (SIP)", RFC 3857,
August 2004. August 2004.
[9] Rosenberg, J., "An Extensible Markup Language (XML) Based [8] Rosenberg, J., "An Extensible Markup Language (XML) Based
Format for Watcher Information", RFC 3858, August 2004. Format for Watcher Information", RFC 3858, August 2004.
[10] Faltstrom, P. and M. Mealling, "The E.164 to Uniform Resource [9] Faltstrom, P. and M. Mealling, "The E.164 to Uniform Resource
Identifiers (URI) Dynamic Delegation Discovery System (DDDS) Identifiers (URI) Dynamic Delegation Discovery System (DDDS)
Application (ENUM)", RFC 3761, April 2004. Application (ENUM)", RFC 3761, April 2004.
[11] Rosenberg, J., "The Extensible Markup Language (XML) [10] Rosenberg, J., "The Extensible Markup Language (XML)
Configuration Access Protocol (XCAP)", Configuration Access Protocol (XCAP)",
draft-ietf-simple-xcap-08 (work in progress), October 2005. draft-ietf-simple-xcap-11 (work in progress), May 2006.
[12] Rosenberg, J., "Presence Authorization Rules", [11] Rosenberg, J., "Presence Authorization Rules",
draft-ietf-simple-presence-rules-04 (work in progress), draft-ietf-simple-presence-rules-07 (work in progress),
October 2005. June 2006.
[13] Rosenberg, J., "A Framework for Consent-Based Communications in [12] Rosenberg, J., "A Framework for Consent-Based Communications in
the Session Initiation Protocol (SIP)", the Session Initiation Protocol (SIP)",
draft-ietf-sipping-consent-framework-04 (work in progress), draft-ietf-sip-consent-framework-00 (work in progress),
March 2006. September 2006.
[13] Camarillo, G., "A Document Format for Requesting Consent",
draft-ietf-sipping-consent-format-00 (work in progress),
September 2006.
[14] Rosenberg, J., "A Framework for Application Interaction in the [14] Rosenberg, J., "A Framework for Application Interaction in the
Session Initiation Protocol (SIP)", Session Initiation Protocol (SIP)",
draft-ietf-sipping-app-interaction-framework-05 (work in draft-ietf-sipping-app-interaction-framework-05 (work in
progress), July 2005. progress), July 2005.
[15] Burger, E., "A Session Initiation Protocol (SIP) Event Package [15] Dolly, M. and E. Burger, "A Session Initiation Protocol (SIP)
for Key Press Stimulus (KPML)", draft-ietf-sipping-kpml-07 Event Package for Key Press Stimulus (KPML)",
(work in progress), December 2004. draft-ietf-sipping-kpml-08 (work in progress), July 2006.
[16] Lyon, J., "Sender ID: Authenticating E-Mail", [16] Hansen, T., "DomainKeys Identified Mail (DKIM) Service
draft-ietf-marid-core-03 (work in progress), August 2004. Overview", draft-ietf-dkim-overview-01 (work in progress),
June 2006.
[17] Peterson, J. and C. Jennings, "Enhancements for Authenticated [17] Peterson, J. and C. Jennings, "Enhancements for Authenticated
Identity Management in the Session Initiation Protocol (SIP)", Identity Management in the Session Initiation Protocol (SIP)",
draft-ietf-sip-identity-06 (work in progress), October 2005. RFC 4474, August 2006.
[18] Danisch, H., "The RMX DNS RR and method for lightweight SMTP [18] Allman, E. and H. Katz, "SMTP Service Extension for Indicating
sender authorization", draft-danisch-dns-rr-smtp-04 (work in the Responsible Submitter of an E-Mail Message", RFC 4405,
progress), May 2004. April 2006.
[19] Abadi, M., Burrows, M., Manasse, M., and T. Wobber, "Moderately [19] Lyon, J. and M. Wong, "Sender ID: Authenticating E-Mail",
RFC 4406, April 2006.
[20] Lyon, J., "Purported Responsible Address in E-Mail Messages",
RFC 4407, April 2006.
[21] Wong, M. and W. Schlitt, "Sender Policy Framework (SPF) for
Authorizing Use of Domains in E-Mail, Version 1", RFC 4408,
April 2006.
[22] Abadi, M., Burrows, M., Manasse, M., and T. Wobber, "Moderately
Hard, Memory Bound Functions, NDSS 2003", February 2003. Hard, Memory Bound Functions, NDSS 2003", February 2003.
[20] Abadi, M., Burrows, M., Birrell, A., Dabek, F., and T. Wobber, [23] Abadi, M., Burrows, M., Birrell, A., Dabek, F., and T. Wobber,
"Bankable Postage for Network Services, Proceedings of the 8th "Bankable Postage for Network Services, Proceedings of the 8th
Asian Computing Science Conference, Mumbai, India", Asian Computing Science Conference, Mumbai, India",
December 2003. December 2003.
[21] Clayton, R. and B. Laurie, "Proof of Work Proves not to Work, [24] Clayton, R. and B. Laurie, "Proof of Work Proves not to Work,
Third Annual Workshop on Economics and Information Security", Third Annual Workshop on Economics and Information Security",
May 2004. May 2004.
[25] Rosenberg, J., "User Agent Loose Routing in the Session
Initiation Protocol (SIP)",
draft-rosenberg-sip-ua-loose-route-00 (work in progress),
October 2006.
Authors' Addresses Authors' Addresses
Jonathan Rosenberg Jonathan Rosenberg
Cisco Cisco
600 Lanidex Plaza 600 Lanidex Plaza
Parsippany, NJ 07054 Parsippany, NJ 07054
US US
Phone: +1 973 952-5000 Phone: +1 973 952-5000
Email: jdrosen@cisco.com Email: jdrosen@cisco.com
URI: http://www.jdrosen.net URI: http://www.jdrosen.net
Cullen Jennings Cullen Jennings
Cisco Cisco
170 West Tasman Dr. 170 West Tasman Dr.
San Jose, CA 95134 San Jose, CA 95134
US US
Phone: +1 408 527-9132 Phone: +1 408 527-9132
Email: fluffy@cisco.com Email: fluffy@cisco.com
Jon Peterson
Neustar
1800 Sutter Street
Suite 570
Concord, CA 94520
US
Phone: +1 925 363-8720
Email: jon.peterson@neustar.biz
URI: http://www.neustar.biz
Intellectual Property Statement Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be on the procedures with respect to rights in RFC documents can be
 End of changes. 75 change blocks. 
192 lines changed or deleted 258 lines changed or added
