[Docs] [txt|pdf|xml|html] [Tracker] [Email] [Diff1] [Diff2] [Nits]

Versions: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 RFC 6249

Network Working Group                                           A. Bryan
Internet-Draft                                                  N. McNab
Intended status: Standards Track                            T. Tsujikawa
Expires: August 31, 2011
                                                                P. Poeml
                                                             MirrorBrain
                                                            H. Nordstrom
                                                       February 27, 2011


 Metalink/HTTP: Mirrors and Cryptographic Hashes in HTTP Header Fields
                      draft-bryan-metalinkhttp-21

Abstract

   This document specifies Metalink/HTTP: Mirrors and Cryptographic
   Hashes in HTTP header fields, a different way to get information that
   is usually contained in the Metalink XML-based download description
   format.  Metalink/HTTP describes multiple download locations
   (mirrors), Peer-to-Peer, cryptographic hashes, digital signatures,
   and other information using existing standards for HTTP header
   fields.  Clients can use this information to make file transfers more
   robust and reliable.

Editorial Note (To be removed by RFC Editor)

   Discussion of this draft should take place on the HTTPBIS working
   group mailing list (ietf-http-wg@w3.org), although this draft is not
   a WG item.

   The changes in this draft are summarized in Appendix C.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 31, 2011.



Bryan, et al.            Expires August 31, 2011                [Page 1]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.





































Bryan, et al.            Expires August 31, 2011                [Page 2]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Examples Metalink Server Response  . . . . . . . . . . . .  5
     1.2.  Notational Conventions . . . . . . . . . . . . . . . . . .  5
   2.  Requirements . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  Mirrors / Multiple Download Locations  . . . . . . . . . . . .  7
     3.1.  Mirror Priority  . . . . . . . . . . . . . . . . . . . . .  7
     3.2.  Mirror Geographical Location . . . . . . . . . . . . . . .  8
     3.3.  Coordinated Mirror Policies  . . . . . . . . . . . . . . .  8
     3.4.  Mirror Depth . . . . . . . . . . . . . . . . . . . . . . .  8
   4.  Peer-to-Peer / Metainfo  . . . . . . . . . . . . . . . . . . .  9
     4.1.  Metalink/XML Files . . . . . . . . . . . . . . . . . . . .  9
   5.  Signatures . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     5.1.  OpenPGP Signatures . . . . . . . . . . . . . . . . . . . . 10
     5.2.  S/MIME Signatures  . . . . . . . . . . . . . . . . . . . . 10
   6.  Cryptographic Hashes of Whole Documents  . . . . . . . . . . . 10
   7.  Client / Server Multi-source Download Interaction  . . . . . . 11
     7.1.  Error Prevention, Detection, and Correction  . . . . . . . 14
       7.1.1.  Error Prevention (Early File Mismatch Detection) . . . 14
       7.1.2.  Error Correction . . . . . . . . . . . . . . . . . . . 15
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 16
   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 16
     9.1.  URIs and IRIs  . . . . . . . . . . . . . . . . . . . . . . 16
     9.2.  Spoofing . . . . . . . . . . . . . . . . . . . . . . . . . 16
     9.3.  Cryptographic Hashes . . . . . . . . . . . . . . . . . . . 16
     9.4.  Signing  . . . . . . . . . . . . . . . . . . . . . . . . . 17
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17
     10.1. Normative References . . . . . . . . . . . . . . . . . . . 17
     10.2. Informative References . . . . . . . . . . . . . . . . . . 18
   Appendix A.  Acknowledgements and Contributors . . . . . . . . . . 18
   Appendix B.  Comparisons to Similar Options  . . . . . . . . . . . 18
   Appendix C.  Document History  . . . . . . . . . . . . . . . . . . 19
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21

















Bryan, et al.            Expires August 31, 2011                [Page 3]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


1.  Introduction

   Metalink/HTTP is an alternative and complementary representation of
   Metalink information, which is usually presented as an XML-based
   document format [RFC5854].  Metalink/HTTP attempts to provide as much
   functionality as the Metalink/XML format by using existing standards
   such as Web Linking [RFC5988], Instance Digests in HTTP [RFC3230],
   and Entity Tags (also known as ETags) [RFC2616].  Metalink/HTTP is
   used to list information about a file to be downloaded.  This can
   include lists of multiple URIs (mirrors), Peer-to-Peer information,
   cryptographic hashes, and digital signatures.

   Identical copies of a file are frequently accessible in multiple
   locations on the Internet over a variety of protocols (such as FTP,
   HTTP, and Peer-to-Peer).  In some cases, users are shown a list of
   these multiple download locations (mirrors) and must manually select
   a single one on the basis of geographical location, priority, or
   bandwidth.  This distributes the load across multiple servers, and
   should also increase throughput and resilience.  At times, however,
   individual servers can be slow, outdated, or unreachable, but this
   can not be determined until the download has been initiated.  Users
   will rarely have sufficient information to choose the most
   appropriate server, and will often choose the first in a list which
   might not be optimal for their needs, and will lead to a particular
   server getting a disproportionate share of load.  The use of
   suboptimal mirrors can lead to the user canceling and restarting the
   download to try to manually find a better source.  During downloads,
   errors in transmission can corrupt the file.  There are no easy ways
   to repair these files.  For large downloads this can be extremely
   troublesome.  Any of the number of problems that can occur during a
   download lead to frustration on the part of users.

   Some popular sites automate the process of selecting mirrors using
   DNS load balancing, both to approximately balance load between
   servers, and to direct clients to nearby servers with the hope that
   this improves throughput.  Indeed, DNS load balancing can balance
   long-term server load fairly effectively, but it is less effective at
   delivering the best throughput to users when the bottleneck is not
   the server but the network.

   This document describes a mechanism by which the benefit of mirrors
   can be automatically and more effectively realized.  All the
   information about a download, including mirrors, cryptographic
   hashes, digital signatures, and more can be transferred in
   coordinated HTTP header fields hereafter referred to as a Metalink.
   This Metalink transfers the knowledge of the download server (and
   mirror database) to the client.  Clients can fallback to other
   mirrors if the current one has an issue.  With this knowledge, the



Bryan, et al.            Expires August 31, 2011                [Page 4]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   client is enabled to work its way to a successful download even under
   adverse circumstances.  All this can be done without complicated user
   interaction and the download can be much more reliable and efficient.
   In contrast, a traditional HTTP redirect to a mirror conveys only
   extremely minimal information - one link to one server, and there is
   no provision in the HTTP protocol to handle failures.  Furthermore,
   in order to provide better load distribution across servers and
   potentially faster downloads to users, Metalink/HTTP facilitates
   multi-source downloads, where portions of a file are downloaded from
   multiple mirrors (and optionally, Peer-to-Peer) simultaneously.

   Upon connection to a Metalink/HTTP server, a client will receive
   information about other sources of the same resource and a
   cryptographic hash of the whole resource.  The client will then be
   able to request chunks of the file from the various sources,
   scheduling appropriately in order to maximize the download rate.

1.1.  Examples Metalink Server Response

   This example shows a brief Metalink server response with ETag,
   mirrors, .meta4, OpenPGP signature, and a cryptographic hash of the
   whole file:

   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: <http://www2.example.com/example.ext>; rel=duplicate
   Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
   Link: <http://example.com/example.ext.torrent>; rel=describedby;
   type="application/x-bittorrent"
   Link: <http://example.com/example.ext.meta4>; rel=describedby;
   type="application/metalink4+xml"
   Link: <http://example.com/example.ext.asc>; rel=describedby;
   type="application/pgp-signature"
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

1.2.  Notational Conventions

   This specification describes conformance of Metalink/HTTP.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119], as
   scoped to those conformance targets.


2.  Requirements

   In this context, "Metalink" refers to Metalink/HTTP which consists of



Bryan, et al.            Expires August 31, 2011                [Page 5]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   mirrors and cryptographic hashes in HTTP header fields as described
   in this document.  "Metalink/XML" refers to the XML format described
   in [RFC5854].

   Metalink resources include Link header fields [RFC5988] to present a
   list of mirrors in the response to a client request for the resource.
   Metalink servers MUST include the cryptographic hash of a resource
   via Instance Digests in HTTP [RFC3230].  Algorithms used in the
   Instance Digest field are registered in the IANA registry named
   "Hypertext Transfer Protocol (HTTP) Digest Algorithm Values" at
   <http://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml>.
   This document restricts the use of these algorithms.  SHA-256 and
   SHA-512 were added to the registry by [RFC5843].  If a Metalink
   contains whole file hashes as described in Section 6, it SHOULD
   include SHA-256, as specified in [FIPS-180-3], or stronger.  It MAY
   also include other hashes.

   Metalink servers are HTTP servers with one or more Metalink
   resources.  Metalink servers MUST support the Link header fields for
   listing mirrors and MUST support Instance Digests in HTTP [RFC3230].
   Metalink servers MUST return the same Link header fields and Instance
   Digests on HEAD requests.  Metalink servers and their associated
   mirror servers SHOULD all share the same ETag policy.  It is up to
   the administrator of the Metalink server to communicate the details
   of the shared ETag policy to the administrators of the mirror servers
   so that the mirror servers can be configured with the same ETag
   policy.  To have the same ETag policy means that ETags are
   synchronized across servers for resources that are mirrored, i.e.
   byte-for-byte identical files will have the same ETag on mirrors that
   they have on the Metalink server.  For example, it would be better to
   derive an ETag from a cryptographic hash of the file contents than on
   server-unique filesystem metadata.  Metalink servers SHOULD offer
   Metalink/XML documents that contain cryptographic hashes of parts of
   the file (and other information) if error recovery is desirable.

   Mirror servers are typically FTP or HTTP servers that "mirror"
   another server.  That is, they provide identical copies of (at least
   some) files that are also on the mirrored server.  Mirror servers
   SHOULD support serving partial content.  HTTP mirror servers SHOULD
   share the same ETag policy as the originating Metalink server.  HTTP
   Mirror servers SHOULD support Instance Digests in HTTP [RFC3230]
   using the same algorithm as the Metalink server.  Optimally, mirror
   servers will share the same ETag policy and support Instance Digests
   in HTTP.  Mirror servers that share the same ETag policy and/or
   support Instance Digests in HTTP using the same algorithm as a
   Metalink server are known as preferred mirror servers.

   Metalink clients use the mirrors provided by a Metalink server in



Bryan, et al.            Expires August 31, 2011                [Page 6]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   Link header fields [RFC5988] but it is restricted to the initial
   Metalink server they contacted.  If Metalink clients find Link header
   fields [RFC5988] for listing mirrors from mirrors, they MUST discard
   such Link header fields [RFC5988] to prevent a possible infinite
   loop.  Metalink clients MUST support HTTP and SHOULD support FTP
   [RFC0959].  Metalink clients MAY support BitTorrent [BITTORRENT], or
   other download methods.  Metalink clients SHOULD switch downloads
   from one mirror to another if a mirror becomes unreachable.  Metalink
   clients MAY support multi-source, or parallel, downloads, where
   portions of a file can be downloaded from multiple mirrors
   simultaneously (and optionally, from Peer-to-Peer sources).  Metalink
   clients MUST support Instance Digests in HTTP [RFC3230] by requesting
   and verifying cryptographic hashes.  Metalink clients SHOULD support
   error recovery by using the cryptographic hashes of parts of the file
   listed in Metalink/XML files.  Metalink clients SHOULD support
   checking digital signatures.


3.  Mirrors / Multiple Download Locations

   Mirrors are specified with the Link header fields [RFC5988] and a
   relation type of "duplicate" as defined in Section 8.

   The following OPTIONAL attributes are defined:
   o  "depth" : mirror depth in Section 3.4.
   o  "geo" : mirror geographical location in Section 3.2.
   o  "pref" : a preferred mirror server in Section 3.3.
   o  "pri" : mirror priority in Section 3.1.

   This example shows a brief Metalink server response with two mirrors
   only:

   Link: <http://www2.example.com/example.ext>; rel=duplicate;
   pri=1; pref
   Link: <ftp://ftp.example.com/example.ext>; rel=duplicate;
   pri=2; geo=gb; depth=1

   As some organizations can have many mirrors, it is up to the
   organization to configure the amount of Link header fields the
   Metalink server will provide.  Such a decision could be a random
   selection or a hard-coded limit based on network proximity, file
   size, server load, or other factors.

3.1.  Mirror Priority

   Entries for mirror servers are listed in order of priority (from most
   preferred to least) or have a "pri" value, where mirrors with lower
   values are used first.



Bryan, et al.            Expires August 31, 2011                [Page 7]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   This is purely an expression of the server's preferences; it is up to
   the client what it does with this information, particularly with
   reference to how many servers to use at any one time.

3.2.  Mirror Geographical Location

   Entries for a mirror server MAY have a "geo" value, which is a
   [ISO3166-1] alpha-2 two letter country code for the geographical
   location of the physical server the URI is used to access.  A client
   MAY use this information to select a mirror, or set of mirrors, that
   are geographically near (if the client has access to such
   information), with the aim of reducing network load at inter-country
   bottlenecks.

3.3.  Coordinated Mirror Policies

   There are two types of mirror servers: preferred and normal.
   Preferred mirror servers are HTTP mirror servers that MUST share the
   same ETag policy as the originating Metalink server and/or MUST
   provide Instance Digests using the same algorithm as the Metalink
   server.  Preferred mirrors make it possible for Metalink clients to
   detect early on, before data is transferred, if the file requested
   matches the desired file.  This early file mismatch detection is
   described in Section 7.1.1.  Entries for preferred HTTP mirror
   servers have a "pref" value.  By default, if unspecified then mirrors
   are considered "normal" and do not necessarily share the same ETag
   policy or support Instance Digests using the same algorithm as the
   Metalink server.  FTP mirrors are considered "normal", as they do not
   emit ETags or support Instance Digests.

3.4.  Mirror Depth

   Some mirrors can mirror single files, whole directories, or multiple
   directories.

   Entries for mirror servers can have a "depth" value, where "depth=0"
   is the default.  A value of 0 means only that file is mirrored and
   that other URI path segments are not.  A value of 1 means that file
   and all other files and URI path segments contained in the rightmost
   URI path segment are mirrored.  For values of N, N-1 URI path
   segments closer to the Host are mirrored.  A value of 2 means one URI
   path segment closer to the Host is mirrored, and all files and URI
   path segments contained are mirrored.  For each higher value, another
   URI path segment closer to the Host is mirrored.







Bryan, et al.            Expires August 31, 2011                [Page 8]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   This example shows a mirror with a depth value of 4:

   Link: <http://www2.example.com/dir1/dir2/dir3/dir4/dir5/example.ext>;
   rel=duplicate; pri=1; pref; depth=4

   In the above example, 4 URI path segments closer to the Host are
   mirrored, from /dir2/ and all files and directories included.


4.  Peer-to-Peer / Metainfo

   Entries for metainfo files, which describe ways to download a file
   over Peer-to-Peer networks or otherwise, are specified with the Link
   header fields [RFC5988] and a relation type of "describedby" and a
   type parameter that indicates the MIME type of the metadata available
   at the URI.  Since metainfo files can sometimes describe multiple
   files, or the filename MAY not be the same on the Metalink server and
   in the metainfo file but still have the same content, an OPTIONAL
   "name" attribute can be used.

   The following OPTIONAL attribute is defined:
   o  "name" : a file described within the metainfo file.

   This example shows a brief Metalink server response with .torrent and
   .meta4:

   Link: <http://example.com/example.ext.torrent>; rel=describedby;
   type="application/x-bittorrent"; name="differentname.ext"
   Link: <http://example.com/example.ext.meta4>; rel=describedby;
   type="application/metalink4+xml"

   Metalink clients MAY support the use of metainfo files for
   downloading files.

4.1.  Metalink/XML Files

   Metalink/XML files for a given resource MAY be provided in a Link
   header field as shown in the example in Section 4.  Metalink/XML
   files are specified in [RFC5854] and they are particularly useful for
   providing metadata such as cryptographic hashes of parts of a file,
   allowing a client to recover from errors (see Section 7.1.2).
   Metalink servers SHOULD provide Metalink/XML files with partial file
   hashes in Link header fields and Metalink clients SHOULD use them for
   error recovery.







Bryan, et al.            Expires August 31, 2011                [Page 9]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


5.  Signatures

5.1.  OpenPGP Signatures

   OpenPGP signatures [RFC3156] of requested files are specified with
   the Link header fields [RFC5988] and a relation type of "describedby"
   and a type parameter of "application/pgp-signature".

   This example shows a brief Metalink server response with OpenPGP
   signature only:

   Link: <http://example.com/example.ext.asc>; rel=describedby;
   type="application/pgp-signature"

   Metalink clients SHOULD support the use of OpenPGP signatures.

5.2.  S/MIME Signatures

   S/MIME signatures [RFC5751] of requested files are specified with the
   Link header fields [RFC5988] and a relation type of "describedby" and
   a type parameter of "application/pkcs7-mime".

   This example shows a brief Metalink server response with S/MIME
   signature only:

   Link: <http://example.com/example.ext.p7m>; rel=describedby;
   type="application/pkcs7-mime"

   Metalink clients SHOULD support the use of S/MIME signatures.


6.  Cryptographic Hashes of Whole Documents

   If Instance Digests are not provided by the Metalink servers, the
   Link header fields pertaining to this specification MUST be ignored.

   This example shows a brief Metalink server response with ETag,
   mirror, and cryptographic hash:

   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: <http://www2.example.com/example.ext>; rel=duplicate
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==








Bryan, et al.            Expires August 31, 2011               [Page 10]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


7.  Client / Server Multi-source Download Interaction

   Metalink clients begin a download with a standard HTTP [RFC2616] GET
   request to the Metalink server.  Metalink clients MAY use a Range
   limit if desired.


   GET /distribution/example.ext HTTP/1.1
   Host: www.example.com

   The Metalink server responds with the data and these header fields:

   HTTP/1.1 200 OK
   Accept-Ranges: bytes
   Content-Length: 14867603
   Content-Type: application/x-cd-image
   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: <http://www2.example.com/example.ext>; rel=duplicate; pref
   Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
   Link: <http://example.com/example.ext.torrent>; rel=describedby;
   type="application/x-bittorrent"
   Link: <http://example.com/example.ext.meta4>; rel=describedby;
   type="application/metalink4+xml"
   Link: <http://example.com/example.ext.asc>; rel=describedby;
   type="application/pgp-signature"
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

   Alternatively, Metalink clients can begin with a HEAD request to the
   Metalink server to discover mirrors via Link header fields, and then
   skip to making the following decisions on every available mirror
   server found via the Link header fields.

   After that, the client follows with a GET request to the desired
   mirrors.

   From the Metalink server response the client learns some or all of
   the following metadata about the requested object, in addition to
   also starting to receive the object:

   o  Mirror profile link, which can describe the mirror's priority,
      whether it shares the ETag policy of the originating Metalink
      server, geographical location, and mirror depth.
   o  Instance Digest, which is the whole file cryptographic hash.
   o  ETag.
   o  Object size from the Content-Length header field.





Bryan, et al.            Expires August 31, 2011               [Page 11]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   o  Metalink/XML, which can include partial file cryptographic hashes
      to repair a file.
   o  Peer-to-peer information.
   o  Digital signature.

   Next, the Metalink client requests a Range of the object from a
   preferred mirror server, so it can use If-Match conditions:

   GET /example.ext HTTP/1.1
   Host: www2.example.com
   Range: bytes=7433802-
   If-Match: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Referer: http://www.example.com/distribution/example.ext

   Metalink clients SHOULD use preferred mirrors, if possible, as they
   allow early file mismatch detection as described in Section 7.1.1.
   Preferred mirrors have coordinated ETags, as described in
   Section 3.3, and Metalink clients SHOULD use If-Match conditions
   based on the ETag to quickly detect out-of-date mirrors by using the
   ETag from the Metalink server response.  Metalink clients SHOULD use
   partial file cryptographic hashes as described in Section 7.1.2, if
   available, to detect if the mirror server returned the correct data.

   Optimally, the mirror server also will include an Instance Digest in
   the mirror response to the client GET request, which the client can
   also use to detect a mismatch early.  Metalink clients MUST reject
   individual downloads from mirrors that support Instance Digests if
   the Instance Digest from the mirror does not match the Instance
   Digest as reported by the Metalink server and the same algorithm is
   used.  If normal mirrors are used, then a mismatch can not be
   detected until the completed object is verified.  Errors in
   transmission and substitutions of incorrect data on mirrors, whether
   deliberate or accidental, can be detected with error correction as
   described in Section 7.1.2.

   Here, the preferred mirror server has the correct file (the If-Match
   conditions match) and responds with a 206 Partial Content HTTP status
   code and appropriate "Content-Length", "Content Range", ETag, and
   Instance Digest header fields.  In this example, the mirror server
   responds, with data, to the above request:

   HTTP/1.1 206 Partial Content
   Accept-Ranges: bytes
   Content-Length: 7433801
   Content-Range: bytes 7433802-14867602/14867603
   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==



Bryan, et al.            Expires August 31, 2011               [Page 12]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   If the object is large and gets delivered slower than expected, then
   the Metalink client MAY start a number of parallel ranged downloads
   (one per selected mirror server other than the first) using mirrors
   provided by the Link header fields with "duplicate" relation type.
   Metalink clients MUST limit the number of parallel connections to
   mirror servers, ideally based on observing how the aggregate
   throughput changes as connections are opened.  It would be pointless
   to blindly open connections once the path bottleneck is filled.
   Metalink clients SHOULD use the location of the original GET request
   in the "Referer" header field for these ranged requests.

   The Metalink client can determine the size and number of ranges
   requested from each server, based upon the type and number of mirrors
   and performance observed from each mirror.  Note that Range requests
   impose an overhead on servers and clients need to be aware of that
   and not abuse them.  Metalink clients SHOULD NOT make more than one
   concurrent Range request to each mirror server that it downloads
   from.

   Metalink clients SHOULD close all but the fastest connection if any
   Ranged requests generated after the first request end up with a
   complete response, instead of a partial response (as some mirrors
   might not support HTTP ranges), if the goal is the fastest transfer.
   Metalink clients MAY monitor mirror conditions and dynamically switch
   between mirrors to achieve the fastest download possible.  Similarly,
   Metalink clients SHOULD abort extremely slow or stalled range
   requests and finish the request on other mirrors.  If all ranges have
   finished except for the final one, the Metalink client can split the
   final range into multiple range requests to other mirrors so the
   transfer finishes faster.

   If the first request was GET and no Range header field was sent and
   the client determines later that it will issue a Range request, then
   the client SHOULD close the first connection when it catches up with
   the other parallel ranged downloads of the same object.  This means
   the first connection was sacrificed.  Metalink clients can use a HEAD
   request first, if possible, so that the client can find out if there
   are any Link header fields, and then Range-based requests are
   undertaken to the mirror servers without sacrificing a first
   connection.

   Metalink clients MUST reject individual downloads from mirrors where
   the file size does not match the file size as reported by the
   Metalink server.

   If a Metalink client does not support certain download methods (such
   as FTP or BitTorrent) that a file is available from, and there are no
   available download methods that the client supports, then the



Bryan, et al.            Expires August 31, 2011               [Page 13]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   download will have no way to complete.

   Metalink clients MUST verify the cryptographic hash of the file once
   the download has completed.  If the cryptographic hash offered by the
   Metalink server with Instance Digests does not match the
   cryptographic hash of the downloaded file, see Section 7.1.2 for a
   possible way to repair errors.

   If the download can not be repaired, it is considered corrupt.  The
   client can attempt to re-download the file.

   Metalink clients that support verifying digital signatures MUST
   verify digital signatures of requested files if they are included.
   Digital signatures MUST validate back to a trust anchor as described
   in the validation rules in [RFC3156] and [RFC5280].

7.1.  Error Prevention, Detection, and Correction

   Error prevention, or early file mismatch detection, is possible
   before file transfers with the use of file sizes, ETags, and Instance
   Digests provided by Metalink servers.  Error detection requires
   Instance Digests to detect errors in transfer after the transfers
   have completed.  Error correction, or download repair, is possible
   with partial file cryptographic hashes.

   Note that cryptographic hashes obtained from Instance Digests are in
   base64 encoding, while those from Metalink/XML are in hexadecimal.

7.1.1.  Error Prevention (Early File Mismatch Detection)

   In HTTP terms, the merging of ranges from multiple responses SHOULD
   be verified with a strong validator, which in this context is either
   an Instance Digest or a shared ETag from that Metalink server that
   matches with the same provided by a preferred mirror server.  In most
   cases, it is sufficient that the Metalink server provides mirrors and
   Instance Digest information, but operation will be more robust and
   efficient if the mirror servers do implement a shared ETag policy or
   Instance Digests as well.  There is no need to specify how the ETag
   is generated, just that it needs to be shared between the Metalink
   server and the mirror servers.  The benefit of having mirror servers
   return an Instance Digest is that the client then can detect
   mismatches early even if ETags are not used.  Mirrors that support
   both a shared ETag and Instance Digests do provide value, but just
   one is sufficient for early detection of mismatches.  If the mirror
   server provides neither shared ETag nor Instance Digest, then early
   detection of mismatches is not possible unless file length also
   differs.  Finally, errors are still detectable after the download has
   completed, when the cryptographic hash of the merged response is



Bryan, et al.            Expires August 31, 2011               [Page 14]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   verified.

   ETags can not be used for verifying the integrity of the received
   content.  If the ETag given by the mirror server matches the ETag
   given by the Metalink server, then the Metalink client assumes the
   responses are valid for that object.

   This guarantees that a mismatch will be detected by using only the
   shared ETag from a Metalink server and mirror server.  Mirror servers
   will respond with an error if ETags do not match, which will prevent
   accidental merges of ranges from different versions of files with the
   same name.

   A shared ETag or Instance Digest can not strictly protect against
   malicious attacks or server or network errors replacing content.  An
   attacker can make a mirror server seemingly respond with the expected
   Instance Digest or ETags even if the file contents have been
   modified.  The same goes for various system failures which would also
   cause bad data (i.e. corrupted files) to be returned.  The Metalink
   client has to rely on the Instance Digest returned by the Metalink
   server in the first response for the verification of the downloaded
   object as a whole.  To verify the individual ranges, which might have
   been requested from different sources, see Section 7.1.2.

7.1.2.  Error Correction

   Partial file cryptographic hashes can be used to detect errors during
   the download.  Metalink servers SHOULD provide Metalink/XML files
   with partial file hashes in Link header fields as specified in
   Section 4.1, and Metalink clients SHOULD use them for error
   correction.

   An error in transfer or a substitution attack will be detected by a
   cryptographic hash of the object not matching the Instance Digest
   from the Metalink server.  If the cryptographic hash of the object
   does not match the Instance Digest from the Metalink server, then the
   client SHOULD fetch the Metalink/XML (if available).  This may
   contain partial file cryptographic hashes which will allow detection
   of which mirror server returned incorrect data.  Metalink clients
   SHOULD use the Metalink/XML data to figure out what ranges of the
   downloaded data can be recovered and what needs to be fetched again.

   Other methods can be used for error correction.  For example, some
   other metainfo files also include partial file hashes that can be
   used to check for errors.






Bryan, et al.            Expires August 31, 2011               [Page 15]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


8.  IANA Considerations

   Accordingly, IANA will make the following registration to the Link
   Relation Type registry at <http://www.iana.org/assignments/
   link-relations/link-relations.xhtml>.

   o Relation Name: duplicate

   o Description: Refers to a resource whose available representations
   are byte-for-byte identical with the corresponding representations of
   the context IRI.

   o Reference: This specification.

   o Notes: This relation is for static resources.  That is, an HTTP GET
   request on any duplicate will return the same representation.  It
   does not make sense for dynamic or POSTable resources and should not
   be used for them.


9.  Security Considerations

9.1.  URIs and IRIs

   Metalink clients handle URIs and IRIs.  See Section 7 of [RFC3986]
   and Section 8 of [RFC3987] for security considerations related to
   their handling and use.

9.2.  Spoofing

   There is potential for spoofing attacks where the attacker publishes
   Metalinks with false information.  In that case, this could deceive
   unaware downloaders into downloading a malicious or worthless file.
   As with all downloads, users should only download from trusted
   sources.  Also, malicious publishers could attempt a distributed
   denial of service attack by inserting unrelated URIs into Metalinks.
   [RFC4732] contains information on amplification attacks and denial of
   service attacks.

9.3.  Cryptographic Hashes

   Currently, some of the digest values defined in Instance Digests in
   HTTP [RFC3230] are considered insecure.  These include the whole
   Message Digest family of algorithms which are not suitable for
   cryptographically strong verification.  Malicious people could
   provide files that appear to be identical to another file because of
   a collision, i.e. the weak cryptographic hashes of the intended file
   and a substituted malicious file could match.



Bryan, et al.            Expires August 31, 2011               [Page 16]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


9.4.  Signing

   Metalinks SHOULD include digital signatures, as described in
   Section 5.

   Digital signatures provide authentication, message integrity, and
   enable non-repudiation with proof of origin.


10.  References

10.1.  Normative References

   [BITTORRENT]
              Cohen, B., "The BitTorrent Protocol Specification",
              BITTORRENT 11031, February 2008,
              <http://www.bittorrent.org/beps/bep_0003.html>.

   [FIPS-180-3]
              National Institute of Standards and Technology (NIST),
              "Secure Hash Standard (SHS)", FIPS PUB 180-3,
              October 2008.

   [RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol",
              STD 9, RFC 0959, October 1985.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3156]  Elkins, M., Del Torto, D., Levien, R., and T. Roessler,
              "MIME Security with OpenPGP", RFC 3156, August 2001.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, January 2002.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [RFC5280]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
              Housley, R., and W. Polk, "Internet X.509 Public Key



Bryan, et al.            Expires August 31, 2011               [Page 17]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


              Infrastructure Certificate and Certificate Revocation List
              (CRL) Profile", RFC 5280, May 2008.

   [RFC5751]  Ramsdell, B. and S. Turner, "Secure/Multipurpose Internet
              Mail Extensions (S/MIME) Version 3.2 Message
              Specification", RFC 5751, January 2010.

   [RFC5854]  Bryan, A., Tsujikawa, T., McNab, N., and P. Poeml, "The
              Metalink Download Description Format", RFC 5854,
              June 2010.

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988, October 2010.

10.2.  Informative References

   [ISO3166-1]
              International Organization for Standardization, "ISO 3166-
              1:2006.  Codes for the representation of names of
              countries and their subdivisions -- Part 1: Country
              codes", November 2006.

   [RFC4732]  Handley, M., Rescorla, E., and IAB, "Internet Denial-of-
              Service Considerations", RFC 4732, December 2006.

   [RFC5843]  Bryan, A., "Additional Hash Algorithms for HTTP Instance
              Digests", RFC 5843, April 2010.


Appendix A.  Acknowledgements and Contributors

   Thanks to the Metalink community, Alexey Melnikov, Julian Reschke,
   Mark Nottingham, Daniel Stenberg, Matt Domsch, Micah Cowan, David
   Morris, Yves Lafon, Juergen Schoenwaelder, Ben Campbell, Lars Eggert,
   Sean Turner, Robert Sparks, and the HTTPBIS Working Group.

   Thanks to Alan Ford and Mark Handley for spurring us on to publish
   this document.

   This document is dedicated to Zimmy Bryan and Juanita Anthony.


Appendix B.  Comparisons to Similar Options

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   This draft, compared to the Metalink/XML format [RFC5854] :





Bryan, et al.            Expires August 31, 2011               [Page 18]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   o  (+) Reuses existing HTTP standards without much new besides a Link
      Relation Type.  It's more of a collection/coordinated feature set.
   o  (?)  The existing standards don't seem to be widely implemented.
   o  (+) No XML dependency, except for Metalink/XML for partial file
      cryptographic hashes.
   o  (+) Existing Metalink/XML clients can be easily converted to
      support this as well.
   o  (+) Coordination of mirror servers is preferred, but not required.
      Coordination could be difficult or impossible unless one group is
      in control of all servers on the mirror network.
   o  (-) Requires software or configuration changes to originating
      server.
   o  (-?)  Tied to HTTP, not as generic.  FTP/P2P clients won't be
      using it unless they also support HTTP, unlike Metalink/XML.
   o  (-) Requires server-side support.  Metalink/XML can be created by
      user (or server, but server component/changes not required).
   o  (-) Also, Metalink/XML files are easily mirrored on all servers.
      Even if usage in that case is not as transparent, this method
      still gives access to all download information (with no changes
      needed to servers) from all mirrors (FTP included).
   o  (-) Not portable/archivable/emailable.  Metalink/XML is used to
      import/export transfer queues.  Not as easy for search engines to
      index?
   o  (-) Not as rich metadata.
   o  (-) Not able to add multiple files to a download queue or create
      directory structure.


Appendix C.  Document History

   [[ to be removed by the RFC editor before publication as an RFC. ]]

   Known issues concerning this draft:
   o  None.

   -21 : February 27, 2011.
   o  IESG review.

   -20 : February 14, 2011.
   o  Yves Lafon's apps-team review, Juergen Schoenwaelder's secdir
      review, Ben Campbell's Gen-ART review.

   -19 : January 20, 2011.
   o  Julian Reschke's review.

   -18 : January 1, 2010.





Bryan, et al.            Expires August 31, 2011               [Page 19]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   o  AD review by Alexey Melnikov.

   -17 : September 13, 2010.
   o  RFC 5854 Metalink/XML.

   -16 : April 16, 2010.
   o  Add draft-ietf-ftpext2-hash reference and FTP mirror coordination.

   -15 : February 20, 2010.
   o  Update references and terminology.

   -14 : December 31, 2009.
   o  Baseline file hash: SHA-256.

   -13 : November 22, 2009.
   o  Metalink/XML for partial file cryptographic hashes.

   -12 : November 11, 2009.
   o  Clarifications.

   -11 : October 23, 2009.
   o  Mirror changes.

   -10 : October 15, 2009.
   o  Mirror coordination changes.

   -09 : October 13, 2009.
   o  Mirror location, coordination, and depth.
   o  Split HTTP Digest Algorithm Values Registration into
      draft-bryan-http-digest-algorithm-values-update.

   -08 : October 4, 2009.
   o  Clarifications.

   -07 : September 29, 2009.
   o  Preferred mirror servers.

   -06 : September 24, 2009.
   o  Add Mismatch Detection, Error Recovery, and Digest Algorithm
      values.
   o  Remove Content-MD5 and Want-Digest.

   -05 : September 19, 2009.
   o  ETags, preferably matching the Instance Digests.

   -04 : September 17, 2009.





Bryan, et al.            Expires August 31, 2011               [Page 20]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   o  Temporarily remove .torrent.

   -03 : September 16, 2009.
   o  Mention HEAD request, negotiate mirrors if Want-Digest is used.

   -02 : September 7, 2009.
   o  Content-MD5 for partial file cryptographic hashes.

   -01 : September 1, 2009.
   o  Link Relation Type Registration: "duplicate"

   -00 : August 24, 2009.
   o  Initial draft.


Authors' Addresses

   Anthony Bryan
   Pompano Beach, FL
   USA

   Email: anthonybryan@gmail.com
   URI:   http://www.metalinker.org


   Neil McNab

   Email: neil@nabber.org
   URI:   http://www.nabber.org


   Tatsuhiro Tsujikawa
   Shiga
   Japan

   Email: tatsuhiro.t@gmail.com
   URI:   http://aria2.sourceforge.net














Bryan, et al.            Expires August 31, 2011               [Page 21]

Internet-Draft      Metalink/HTTP: Mirrors and Hashes      February 2011


   Dr. med. Peter Poeml
   MirrorBrain
   Venloer Str. 317
   Koeln  50823
   DE

   Phone: +49 221 6778 333 8
   Email: peter@poeml.de
   URI:   http://mirrorbrain.org/~poeml/


   Henrik Nordstrom

   Email: henrik@henriknordstrom.net
   URI:   http://www.henriknordstrom.net/




































Bryan, et al.            Expires August 31, 2011               [Page 22]


Html markup produced by rfcmarkup 1.109, available from https://tools.ietf.org/tools/rfcmarkup/