[Docs] [txt|pdf] [Tracker] [Email] [Nits]

Versions: 00

Network Working Group                                          A. Newton
Internet-Draft                                            VeriSign, Inc.
Expires: August 10, 2005                                 Y. Shafranovich
                                          SolidMatrix Technologies, Inc.
                                                        February 9, 2005

                     Distributed Black/White Lists

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on August 10, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).  All Rights Reserved.


   Many traditional, centrally-managed blacklists and whitelists
   describe Internet end-points by characteristics such as connectivity
   type or network function, and these characteristics are often used to
   infer behavior from which authorization is derived.  However, it is
   often the case that connectivity type or network function are not
   related to good or bad behavior.  This document describes a means of
   creating blacklists and whitelists representative of Internet
   end-points based on observed behavior by many participants in a

Newton & Shafranovich    Expires August 10, 2005                [Page 1]

Internet-Draft                    DxL                      February 2005

   distributed monitoring network.  The authors hope that distributed
   lists will mitigate some of the problems associated with existing
   centrally managed lists.  While the concept, architecture, and data
   model are general enough to be applied to any type of network
   service, the authors of this document are specifically addressing the
   problem of spam in blogs.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Document Terminology . . . . . . . . . . . . . . . . . . . . .  5
   3.  Motivations  . . . . . . . . . . . . . . . . . . . . . . . . .  6
   4.  Architecture . . . . . . . . . . . . . . . . . . . . . . . . .  8
   5.  Data Model . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   6.  Formal XML Syntax  . . . . . . . . . . . . . . . . . . . . . . 12
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
   7.1   Normative References . . . . . . . . . . . . . . . . . . . . 15
   7.2   Informative References . . . . . . . . . . . . . . . . . . . 15
       Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 15
       Intellectual Property and Copyright Statements . . . . . . . . 17

Newton & Shafranovich    Expires August 10, 2005                [Page 2]

Internet-Draft                    DxL                      February 2005

1.  Introduction

   For years, blacklists have been used as an authorization policy
   mechanism for public network services, mostly email.  These
   centrally-managed blacklists lists can be categorized into two
   o  lists containing Internet end-points based on certain
      characteristics, such as how they are connected to the Internet
      (e.g.  dial-up or residential broadband) or a type of network
      function they may serve (e.g.  proxy or relay)
   o  lists containing Internet end-points that have been observed to
      exhibit certain behavior (e.g.  sending unsolicited email).

   Additionally, recently a smaller but evergrowing number of whitelists
   have been developed and deployed to assist network administrators in
   determining authorization rights for public network services.
   Centrally managed whitelists usually contain positive information
   about Internet end-points that is being vouched for by the party that
   administers the list.  In some cases this information is collected by
   the administrating party independently of the end points listed, but
   in many cases the party administering the list charges a fee for
   inclusion, thus essentially operating an accreditation service.

   Some blacklists and whitelists are do not necessarally list bad or
   good information, but rather seek to provide reputation information
   about Internet end points.  Unfortunatly, as the case with
   blacklists, reputation services tend to suffer from many of the same
   problems stemming from accountability issues.

   The purpose of such lists is to erradicate certain undesirable
   side-effects of a highly successful network, usually unsolicited
   email.  However, these lists have a great tendacy to inhibit
   universal network access, in many cases outweighing their perceived
   benefits.  For example:
   o  While it is true that many senders of unsolicited email (spam) use
      dial-up network connections, it is not reasonable to assume that
      all dial-up network connections are used to send spam: the two are
   o  Constrained by the need for human verification, many lists
      specializing in observed unwanted behavior tend to mark whole
      networks as bad versus specific end-points, though there is no
      evidence that every end-point in a network has exhibited
      undesirable behavior.
   o  There is often little guidance available on the criteria used to
      create these lists and seldom useful information on how to correct
      errors in these lists.
   o  In the case of whitelists, a fee chargable for accreditation and
      inclusion into a whitelist may inhibit certain Internet users from

Newton & Shafranovich    Expires August 10, 2005                [Page 3]

Internet-Draft                    DxL                      February 2005

      obtaining network access.  For example, individuals and
      non-commercial users, especially ones from poorer countries may
      not have the resources to pay an admission fee for inclusion into
      a whitelist.  If multiple whitelists become popular, the financial
      burden will greatly descrease accessibility of Internet services
      to those users.
   For these reasons and more, these centrally-managed lists have failed
   to make an impact on the spam problem and to be universally adopted.
   This is all too evident given that spam continues to be a growing
   problem not only in email, but slowly spreading to other network
   services as well.

   This document describes an architecture and data model for
   Distributed Black/White Lists (DxL).  The intent is to leverage an
   peer to peer web-of-trust as opposed to a centrally managed list,
   hopefully providing greater accuracy and understood accountability.
   It should be  noted, however, that the concept, architecture, and
   data-model for DxLs could be applied to other network services.
   However, the authors chose to target the design of DxLs toward a
   relatively new type of web application called blogging.

Newton & Shafranovich    Expires August 10, 2005                [Page 4]

Internet-Draft                    DxL                      February 2005

2.  Document Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   document are to be interpreted as described in RFC 2119 [1].

Newton & Shafranovich    Expires August 10, 2005                [Page 5]

Internet-Draft                    DxL                      February 2005

3.  Motivations

   Many of the problems arising in the use of blacklists and whitelists
   is the fact that they are centrally managed by a third-party which
   may not be accountable to or trusted by a network administrator who
   wishes to use such lists.  List users may also wish to express their
   opinion on specific list entries or entire lists, but due to the
   central nature of these lists that is not currently possible.
   Additionally, many Internet users and network operators already have
   existing     relationships in place with others which can be utilized to
   pass along blacklist and whitelist information, instead of
   establishing new ones with the parties administering central lists.

   In the real world, existing relationships and social networks are
   often used to pass along reputation information, and the digital
   world should in theory be no different.  Thefore, in order to step
   around the problem of trusting the party administering the central
   list, we choose to distribute DxL information in a peer to peer
   fashion.  This gives users the ability to use their existing
   relationships to establish a web of trust for the purposes of
   authorizing access to public network services (which in this case are
   ability to leave comments and trackbacks on blog posts, and passing
   referer information).  We also chose to allow lists to be combined
   and passed on as new lists, thus allowing trust information to be
   propogates via a social network.

   Aditionally, in order to enforce accountability and transparency, we
   chose to require URLs pointing to the original list from which the
   information originates, URLs pointing to a removal page, and
   creation/update data for all entries.  While these may not be checked
   for validity in all cases, nevertheless their presence indicates to
   the list creators and users that these are mattersnot to be ignored.
   Additionally, we believe that users will take the validity of this
   information into account when trusting or not trusting specific

   In order to allow flexibility for this system, we choose to add
   weights to the list entries indicating the "black" or "white" value.
   Many existing lists provides a binary "yes/no" decision in regards to
   their entries which may not be flexible enough for all cases.
   Additionally, a weight mechanism allows users to adjust weight
   ratings on lists coming from other users based on their trust level.

   Though this document may be the first formulization of a distributed
   black/white list using XML, the concept of a peer-to-peer style
   distribution of these lists has been seen in

Newton & Shafranovich    Expires August 10, 2005                [Page 6]

Internet-Draft                    DxL                      February 2005


Newton & Shafranovich    Expires August 10, 2005                [Page 7]

Internet-Draft                    DxL                      February 2005

4.  Architecture

   Unlike DNS-based blacklists [9] (known as DNSBLs) which operate over
   DNS, a DxL is an XML document and is retrieved over the Internet by
   using a protocol such as HTTP.  This is modelled after RSS, which is
   commonly found in the "blogosphere".  Once retreived, a DxL is cached
   for a period of time and checked for updates upon expiration.  Note,
   that this is not the only possible implementation or exchange
   mechanism available for this data.

   A DxL can be composed of entries derived from a private list based on
   direct observation and other DxLs, known as component DxLs.  Hence, a
   DxL propogates data from many sources.

Newton & Shafranovich    Expires August 10, 2005                [Page 8]

Internet-Draft                    DxL                      February 2005

5.  Data Model

   This section describes the data model of a DxL.  The formal syntax
   for a DxL is described in Section 6.

   Each DxL has the following attributes:
   o  DxL URI - a URI pointing to the DxL
   o  description - a short, textual description describing the DxL
   o  description URI - a URI pointing to a longer description of the
   o  expiration date and time
   o  creation date and time
   o  last updated date and time
   Each of these attributes is optional.

   Each item in a DxL describes an observed instance with the following
   trace data:
   o  either an IPv4 or IPv6 address
   o  a protocol identifier: either a domain name or a URI (a domain
      name is RECOMMENDED given that URIs are free to manufacture)
   o  protocol content: domain names, URIs, or regular expressions
      (regex) describing parts of content (domain names are RECOMMENDED)
      - regular expressions must be typed with one of the following
      *  Perl - denotes a Perl style regular expression
      *  POSIX-enhanced - denotes a POSIX enhanced style regular
      *  POSIX-basic - denotes a POSIX basic style regular expression
   o  proxy - a simple note indicating it was possible to detect that
      the end-point served as a protocol-level proxy
   o  user agent
   o  application: text in the form of XXX.YYY where XXX is an
      application name and YYY is a sub-application name - describes the
      application or network service type specific to the trace data.
      These values are defined as:
      *  web.referrer - web-based referrals
      *  blog.comments
      *  blog.trackbacks

   The following are two examples of trace data from observed incidents:
   1.  A comment is left on a blog.  The blog software records the
       comment as coming from  The "URL" field was submitted
       with the URI "http://example.org/foo" and the "comment" field was
       submitted with the text "Buy all your foos at foo.example.org for
       the lowest prices".  The trace data would consist of the
       *  an IPv4 address of

Newton & Shafranovich    Expires August 10, 2005                [Page 9]

Internet-Draft                    DxL                      February 2005

       *  a protocol URI of http://example.org/foo
       *  a content domain of foo.example.org or example.org
   2.  An entry is left in a referrer log on a web server.  The entry
       shows the request coming from with a referral URI of
       http://example.com/bar.  The trace data would consist of the
       *  an IPv4 address of
       *  a protocol URI of http://example.com/bar or a protocol domain
          name of example.com

   Each item in a DxL as the following meta-data associated with it:
   o  URI of DxL source - taken directly from the Dxl URI of the DxL
      document where the item originated
   o  description
   o  description URI
   o  removal URI - points to a location where instructions may be found
      for removing an item from the source DxL
   o  method - describes what process was used to determine inclusion of
      the item if it originated from a component DxL.  These methods
      *  intersection - the item was found in a component DxL and by
         direct observation of this DxL publisher
      *  union - the item was found in a component DxL and was not
         directly observed by the publisher of this DxL
      *  direct - the item was found only by direct obersvation
   o  hops - a non-negative integer indicating the number of times the
      item has been derived from a component DxL.  Zero indicates the
      item is in the DxL of the publisher who made the observation.
   o  weight - a value between -1.0 and 1.0 indicating a value judgement
      on the item.  Values less than 0 are considered negative (i.e.  a
      blacklisted item) and values greater than 0 are considered
      positive (i.e.  a whitelisted item).  Zero is considered neutral.
      If value judgements are simply to be boolean (either positive or
      negative), the values 1.0 and -1.0 SHOULD be used.
   o  expiration date and time
   o  created date and time
   o  last updated date and time

   The following is an example of a DxL document:

   <?xml version="1.0" encoding="UTF-8"?>
   <dxl xmlns="urn:ietf:params:xml:ns:dxl0.1"
     xsi:schemaLocation="urn:ietf:params:xml:ns:dxl0.1 dxl.xsd"
     expires="2005-01-31T12:00:00Z" description="grumpOps DxL"

Newton & Shafranovich    Expires August 10, 2005               [Page 10]

Internet-Draft                    DxL                      February 2005

       <traceData application="blog.trackback">
       <description>a persistent spammer</description>
       <traceData application="web.referrer">
       <description>a very persistent spammer</description>

Newton & Shafranovich    Expires August 10, 2005               [Page 11]

Internet-Draft                    DxL                      February 2005

6.  Formal XML Syntax

   The following describes the formal XML syntax for DxL instances using
   XML Schema (see [2], [3], [5], and [4]).  Implementors should note
   that this is only a formalization of the syntax for creation of
   interoperable processes and that an XML Schema capable parser is not

   This formal definition uses the XML Schema 'anyType' is places where
   formal syntax definitions already exist:
   o  the syntax for domains is defined in [8]
   o  the syntax for IPv4 addresses is defined in [7]
   o  the syntax for IPv6 addresses is defined in [6]
   In these cases, the formal syntax defers to the appropriate original

   <?xml version="1.0" encoding="UTF-8"?>
   <schema xmlns="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified" >

         A schema for describing
         distributed black/white lists (DxL)

     <element name="dxl">
           <element name="item" type="dxl:item"
             minOccurs="1" maxOccurs="unbounded"/>
           <any namespace="##other" processContents="skip"
             minOccurs="0" maxOccurs="unbounded" />
         <attribute name="expires" type="dateTime"/>
         <attribute name="created" type="dateTime"/>
         <attribute name="lastUpdated" type="dateTime"/>
         <attribute name="dxlUri" type="anyURI"/>
         <attribute name="description" type="string"/>
         <attribute name="descriptionUri" type="anyURI"/>

     <complexType name="item">

Newton & Shafranovich    Expires August 10, 2005               [Page 12]

Internet-Draft                    DxL                      February 2005

         <element name="traceData">
                 <element name="ip4" type="anyType">
                     <documentation>as defined by RFC 0791</documentation>
                 <element name="ip6" type="anyType">
                     <documentation>as defined by RFC 3513</documentation>
               <choice minOccurs="0">
                 <element name="submitterDomain" type="anyType">
                     <documentation>as defined by RFC 1035</documentation>
                 <element name="submitterUri" type="anyURI"/>
               <element name="content" minOccurs="0">
                   <choice minOccurs="0" maxOccurs="unbounded">
                     <element name="domain" type="anyType">
                         <documentation>as defined by RFC 1035</documentation>
                     <element name="uri" type="anyURI"/>
                     <element name="regex">
                           <extension base="string">
                             <attribute name="type" type="NMTOKEN" use="required"/>
               <element name="proxy" type="boolean" minOccurs="0"/>
               <element name="userAgent" type="token" minOccurs="0"/>
               <any namespace="##other" processContents="skip"
                 minOccurs="0" maxOccurs="unbounded"/>

Newton & Shafranovich    Expires August 10, 2005               [Page 13]

Internet-Draft                    DxL                      February 2005

             <attribute name="application" type="dxl:application"/>

         <element name="sourceDxlUri" type="anyURI" minOccurs="0"/>
         <element name="description" type="string" minOccurs="0"/>
         <element name="descriptionUri" type="anyURI" minOccurs="0"/>
         <element name="removalUri" type="anyURI" minOccurs="0"/>
         <element name="method" type="NMTOKEN" minOccurs="0"/>
         <element name="hops" type="nonNegativeInteger" minOccurs="0"/>
         <element name="weight" type="dxl:weight" minOccurs="0"/>
         <element name="expires" type="dateTime" minOccurs="0"/>
         <element name="created" type="dateTime" minOccurs="0"/>
         <element name="lastUpdated" type="dateTime" minOccurs="0"/>
         <any namespace="##other" processContents="skip"
           minOccurs="0" maxOccurs="unbounded" />


     <simpleType name="weight">
       <restriction base="decimal">
         <minInclusive value="-1.0"/>
         <maxInclusive value="1.0"/>
         <fractionDigits value="3"/>

     <simpleType name="application">
       <restriction base="string">
         <pattern value="\w*(\.\w*)?"/>


Newton & Shafranovich    Expires August 10, 2005               [Page 14]

Internet-Draft                    DxL                      February 2005

7.  References

7.1  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", RFC 2119, BCP 14, March 1997.

   [2]  World Wide Web Consortium, "Extensible Markup Language (XML)
        1.0", W3C XML, February 1998,

   [3]  World Wide Web Consortium, "Namespaces in XML", W3C XML
        Namespaces, January 1999,

   [4]  World Wide Web Consortium, "XML Schema Part 2: Datatypes", W3C
        XML Schema, October 2000,

   [5]  World Wide Web Consortium, "XML Schema Part 1: Structures", W3C
        XML Schema, October 2000,

   [6]  Hinden, R. and S. Deering, "Internet Protocol Version 6 (IPv6)
        Addressing Architecture", RFC 3513, April 2003.

   [7]  Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

   [8]  Mockapetris, P., "Domain names - implementation and
        specification", STD 13, RFC 1035, November 1987.

7.2  Informative References

   [9]  Levine, J., "DNS Based Blacklists and Whitelists for E-Mail",
        draft-irtf-asrg-dnsbl-01.txt (work in progress), November 2004.

Authors' Addresses

   Andrew L. Newton
   VeriSign, Inc.
   21345 Ridgetop Circle
   Sterling, VA  20166

   Phone: +1 703 948 3382
   EMail: anewton@verisignlabs.com; andy@hxr.us
   URI:   http://www.verisignlabs.com/

Newton & Shafranovich    Expires August 10, 2005               [Page 15]

Internet-Draft                    DxL                      February 2005

   Yakov Shafranovich
   SolidMatrix Technologies, Inc.

   EMail: YakovS@solidmatrix.com; ietf@shaftek.org
   URI:   http://www.shaftek.org/

Newton & Shafranovich    Expires August 10, 2005               [Page 16]

Internet-Draft                    DxL                      February 2005

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at

Disclaimer of Validity

   This document and the information contained herein are provided on an

Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


   Funding for the RFC Editor function is currently provided by the
   Internet Society.

Newton & Shafranovich    Expires August 10, 2005               [Page 17]

Html markup produced by rfcmarkup 1.129c, available from https://tools.ietf.org/tools/rfcmarkup/