draft-ietf-mhtml-rev-07.txt   rfc2557.txt 
Network Working Group Jacob Palme
Internet Draft Stockholm University/KTH
draft-ietf-mhtml-rev-07.txt Alexander Hopmann
IETF status to be: Proposed standard Microsoft Corporation
Replaces: RFC 2110 Nick Shelness
Lotus Corporation
Expires: August 1998 February 1998
MIME Encapsulation of Aggregate Documents, such as HTML (MHTML) Network Working Group J. Palme
Request for Comments: 2557 Stockholm University/KTH
Obsoletes: 2110 A. Hopmann
Category: Standards Track Microsoft Corporation
N. Shelness
Lotus Development Corporation
March 1999
Status of this Document MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
This document is an Internet-Draft. Internet-Drafts are working Status of this Memo
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months This document specifies an Internet standards track protocol for the
and may be updated, replaced, or obsoleted by other documents at any Internet community, and requests discussion and suggestions for
time. It is inappropriate to use Internet-Drafts as reference material improvements. Please refer to the current edition of the "Internet
or to cite them other than as ``work in progress.'' Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
To learn the current status of any Internet-Draft, please check the Copyright Notice
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), Copyright (C) The Internet Society (1999). All Rights Reserved.
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Copyright (C) The Internet Society 1998. All Rights Reserved.
Abstract Abstract
HTML [RFC 1866] defines a powerful means of specifying multimedia HTML [RFC 1866] defines a powerful means of specifying multimedia
documents. These multimedia documents consist of a text/html root documents. These multimedia documents consist of a text/html root
resource (object) and other subsidiary resources (image, video clip, resource (object) and other subsidiary resources (image, video clip,
applet, etc. objects) referenced by Uniform Resource Identifiers (URIs) applet, etc. objects) referenced by Uniform Resource Identifiers
within the text/html root resource. When an HTML multimedia document is (URIs) within the text/html root resource. When an HTML multimedia
retrieved by a browser, each of these component resources is document is retrieved by a browser, each of these component resources
individually retrieved in real time from a location, and using a is individually retrieved in real time from a location, and using a
protocol, specified by each URI. protocol, specified by each URI.
In order to transfer a complete HTML multimedia document in a single In order to transfer a complete HTML multimedia document in a single
e-mail message, it is necessary to: a) aggregate a text/html root e-mail message, it is necessary to: a) aggregate a text/html root
resource and all of the subsidiary resources it references into a resource and all of the subsidiary resources it references into a
single composite message structure, and b) define a means by which URIs single composite message structure, and b) define a means by which
in the text/html root can reference subsidiary resources within that URIs in the text/html root can reference subsidiary resources within
composite message structure. that composite message structure.
This document a) defines the use of a MIME multipart/related structure This document a) defines the use of a MIME multipart/related
to aggregate a text/html root resource and the subsidiary resources it structure to aggregate a text/html root resource and the subsidiary
references, and b) specifies one MIME content-header resources it references, and b) specifies a MIME content-header
(Content-Location) that allows URIs in a multipart/related text/html (Content-Location) that allows URIs in a multipart/related text/html
root body part to reference subsidiary resources in other body parts of root body part to reference subsidiary resources in other body parts
the same multipart/related structure. of the same multipart/related structure.
While initially designed to support e-mail transfer of complete While initially designed to support e-mail transfer of complete
multi-resource HTML multimedia documents, these conventions can also be multi-resource HTML multimedia documents, these conventions can also
employed by other transfer protocols such as HTTP and FTP to retrieve a be applied to resources retrieved by other transfer protocols such
complete multi-resource HTML multimedia document in a single transfer as HTTP and FTP to retrieve a complete multi-resource HTML multimedia
or for storage and archiving of complete HTML-documents. document in a single transfer or for storage and archiving of
complete HTML-documents.
Differences between this and a previous version of this standard, which Differences between this and a previous version of this standard,
was published as RFC 2110, are summarized in chapter 12. which was published as RFC 2110, are summarized in chapter 12.
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
2.2 Other terminology 2.2 Other terminology
3. Overview 3. Overview
4. The Content-Location MIME Content Header 4. The Content-Location MIME Content Header
4.1 MIME content headers 4.1 MIME content headers
4.2 The Content-Location Header 4.2 The Content-Location Header
4.3 URIs of MHTML aggregates 4.3 URIs of MHTML aggregates
4.4 Encoding and decoding of URIs in MIME header fields 4.4 Encoding and decoding of URIs in MIME header fields
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
6. Sending documents without linked objects 6. Sending documents without linked objects
7. Use of the Content-Type "multipart/related" 7. Use of the Content-Type "multipart/related"
8. Usage of Links to Other Body Parts 8. Usage of Links to Other Body Parts
8.1 General principle 8.1 General principle
8.2 Resolution of URIs in text/html body parts 8.2 Resolution of URIs in text/html body parts
8.3 Use of the Content-ID header and CID URLs 8.3 Use of the Content-ID header and CID URLs
9. Examples 9. Examples
9.1 Example of a HTML body without included linked objects 9.1 Example of a HTML body without included linked objects
9.2 Example with an absolute URI to an embedded GIF picture 9.2 Example with an absolute URI to an embedded GIF picture
9.3 Example with relative URIs to embedded GIF pictures 9.3 Example with relative URIs to embedded GIF pictures
9.4 Example with a relative URI and no BASE available 9.4 Example with a relative URI and no BASE available
9.5 Example using CID URL and Content-ID header to an embedded GIF 9.5 Example using CID URL and Content-ID header to an embedded
picture GIF picture
9.6 Example showing permitted and forbidden references between 9.6 Example showing permitted and forbidden references between
nested body parts nested body parts
10. Character encoding issues and end-of-line issues 10. Character encoding issues and end-of-line issues
11. Security Considerations 11. Security Considerations
11.1 Security considerations not related to caching 11.1 Security considerations not related to caching
11.2 Security considerations related to caching 11.2 Security considerations related to caching
12. Differences as compared to the previous version of this proposed 12. Differences as compared to the previous version of this
standard in RFC 2110 proposed standard in RFC 2110
13. Copyright 13. Acknowledgments
14. Acknowledgments 14. References
15. References 15. Authors' Addresses
16. Author's Addresses 16. Full Copyright Statement
Differences since version 06 of this draft
Changed the syntax of the start parameter in examples, to show that it
must always be quoted (since it contains the special character "@", and
all Content-Type parameters containing special characters must be
quoted according to MIME).
Also the list of references has been updated.
Differences since version 05 of this draft
The definition of "HTML aggregate objects" has been changed from
HTML objects together with some or all objects, to which the HTML
object contains hyperlinks.
to
HTML objects together with some or all objects, to which the HTML
object contains hyperlinks, directly or indirectly.
Erroneous quotes around "multipart/related" have been removed in the
example in section 4.2.
In section 8.2, the following sentence:
The resolution of URIs in text/html body parts is performed in the
following way:
has been changed to
The resolution of inline, retrieval and other kinds of URIs in
text/html body parts is performed in the following way:
in order to remind the reader that also parts which are not inline can
be sent with MHTML.
In section 8.2, the following text:
(d) For each referencing URI in a text/html body part, compare the
value of the referencing URI after resolution as described in (a)
and (b), with the URI derived from Content-ID and Content-Location
headers for other body parts within the same Multipart/related
structure.
has been changed to:
(d) For each referencing URI in a text/html body part, compare the
value of the referencing URI after resolution as described in (a)
and (b), with the URI derived from Content-ID and Content-Location
headers for other body parts within the same or a surrounding
Multipart/related structure.
In section 9.3, the following text:
; Note - Relative Content-Location is resolved by base
; specified in the Multipart/Related heading
has been changed to:
; Note - Relative Content-Location is resolved by base
; specified in the Multipart/Related Content-Location heading
In section 11.1, the following paragraph has been added:
HTML-formatted messages can be used to investigate user behaviour
for example to break anonymity, in ways which invade the privacy of
individuals. If you send a message with an inline link to an object
which is not itself included in the message, the recipient's mailer
or browser may request that object through HTTP. The HTTP
transaction will then reveal who is reading the message. Example: A
person who wants to find out who is behind an anonymous user
identity, or from which workstation a user is reading his mail, can
do this by sending a message with an inline link and then observe
from where this link is used to request the object.
In all the examples, all indentation which was there to make the text
more legible, but which was not correct according to RFC822, has been
removed. In one case, indentation was missing on a continuation line
and has been added.
Mailing List Information
To write contributions
Further discussion on this document should be done through the
mailing list MHTML@SEGATE.SUNET.SE.
Comments on less important details may also be sent to the editor,
Jacob Palme <jpalme@dsv.su.se>.
To subscribe
To subscribe to this mailing list, send a message to
LISTSERV@SEGATE.SUNET.SE
which contains the text
SUB MHTML <your name (not your email address)>
To unsubscribe
To unsubscribe from this list, send a message to
LISTSERV@SEGATE.SUNET.SE
which contains the text
UNS MHTML
To access mailing list archives
Archives of this list are available for bulk downloading by
anonymous ftp from
FTP://SEGATE.SUNET.SE/lists/mhtml/
The archives are available for browsing from
HTTP://segate.sunet.se/archives/mhtml.html
and may be available in searchable format from
http://www.reference.com/cgi-bin/pn/listarch?list=MHTML@segate.sunet.se
Finally, the archives are available by email. Send a message to
LISTSERV@SEGATE.SUNET.SE with the text "INDEX MHTML" to get a list
of the archive files, and then a new message "GET <file name>" to
retrieve archive files.
More information
Information about the IETF work in developing this standard may also
be available at URL:
http://www.dsv.su.se/~jpalme/ietf/mhtml.html
A collection of test messages is available at 1. Introduction
http://www.dsv.su.se/~jpalme/mimetest/MHTML-test-messages.html
An informational draft [INFO] with advice on how to implement this There are a number of document formats (Hypertext Markup Language
standard is under development. You can find the most recent draft from [HTML2], Extensible Markup Language [XML], Portable Document Format
http://www.dsv.su.se/~jpalme/ietf/mhtml.html#drafts, or, after it has [PDF] and Virtual Reality Markup Language [VRML]) that specify
been published, from documents consisting of a root resource and a number of distinct
http://www.dsv.su.se/~jpalme/ietf/mhtml.html#published. subsidiary resources referenced by URIs within that root resource.
There is an obvious need to be able to send such multi-resource
documents in e-mail [SMTP], [RFC822] messages.
1. Introduction The standard defined in this document specifies how to aggregate such
multi-resource documents in MIME-formatted [MIME1 to MIME5] messages
for precisely this purpose.
There are a number of document formats (Hypertext Markup Language While this specification was developed to satisfy the specific
[HTML2], Extensible Markup Language [XML], Portable Document Format [PDF] aggregation requirements of multi-resource HTML documents, it may
and Virtual Reality Markup Language [VRML]) that specify documents also be applicable to other multi-resource document representations
consisting of a root resource and a number of distinct subsidiary linked by URIs. While this is the case, there is no requirement that
resources referenced by URIs within that root resource. There is an implementations claiming conformance to this standard be able to
obvious need to be able to send such multi-resource documents in e-mail handle any URI linked document representations other than those whose
[SMTP], [RFC822] messages. root is HTML.
The standard defined in this document specifies how to aggregate such This aggregation into a single message of a root resource and the
multi-resource documents in MIME-formatted [MIME1 to MIME5] messages subsidiary resources it references may also be applicable to
for precisely this purpose. resources retrieved by other protocols such as HTTP or FTP, or to the
archiving of complete web pages as they appeared at a particular
point in time.
While this specification was developed to satisfy the specific An informational RFC will be published as a supplement to this
aggregation requirements of multi-resource HTML documents, it may also standard. The informational RFC will discuss implementation methods
be applicable to other multi-resource document representations linked and some implementation problems. Implementers are strongly
by URIs. While this is the case, there is no requirement that recommended to read this informational RFC when developing
implementations claiming conformance to this standard be able to handle implementations of this standard. You can find it through URL
any URI linked document representations other than those whose root is http://www.dsv.su.se/~jpalme/ietf/mhtml.html.
HTML.
This aggregation into a single message of a root resource and the This standard specifies that body parts to be referenced can be
subsidiary resources it references may also be applicable to other identified either by a Content-ID (containing a Message-ID value) or
protocols such as HTTP or FTP, or to the archiving of complete web by a Content-Location (containing an arbitrary URL). The reason why
pages as they appeared at a particular point in time. this standard does not only recommend the use of Content-ID-s is that
it should be possible to forward existing web pages via e-mail
without having to rewrite the source text of the web pages. Such
rewriting has several disadvantages, one of them that security
checksums will probably be invalidated.
An informational RFC will be published as a supplement to this 2. Terminology
standard. The informational RFC will discuss implementation methods and
some implementation problems. Implementers are strongly recommended to
read this informational RFC when developing implementations of this
standard. You can find it through URL
http://www.dsv.su.se/~jpalme/ietf/mhtml.html.
This standard specifies that body parts to be referenced can be 2.1 Conformance requirement terminology
identified either by a Content-ID (containing a Message-ID value) or by
a Content-Location (containing an arbitrary URL). The reason why this
standard does not only recommend the use of Content-ID-s is that it
should be possible to forward existing web pages via e-mail without
having to rewrite the source text of the web pages. Such rewriting has
several disadvantages, one of them that security checksums will
probably be invalidated.
2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [IETF-TERMS].
2.1 Conformance requirement terminology An implementation is not compliant if it fails to satisfy one or more
of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all
the SHOULD requirements for its protocols is said to be
"conditionally compliant."
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 2.2 Other terminology
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [IETF-TERMS].
An implementation is not compliant if it fails to satisfy one or more Most of the terms used in this document are defined in other RFCs.
of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all
the SHOULD requirements for its protocols is said to be "conditionally
compliant."
2.2 Other terminology Absolute URI, See Relative Uniform Resource Locators
AbsoluteURI [RELURL].
Most of the terms used in this document are defined in other RFCs. CID See Message/External Body Content-ID [MIDCID].
Absolute URI, See Relative Uniform Resource Locators [RELURL]. Content-Base This header was specified in RFC 2110, but has
AbsoluteURI been removed in this new version of the MHTML
CID See Message/External Body Content-ID [MIDCID]. standard.
Content-Base This header was specified in RFC 2110, but has been Content-ID See Message/External Body Content-ID [MIDCID].
removed in this new version of the MHTML standard.
Content-ID See Message/External Body Content-ID [MIDCID]. Content-Location MIME message or content part header with one
URI of the MIME message or content part body,
defined in section 4.2 below.
Content-Location MIME message or content part header with one URI of Content-Transfer- Conversion of a text into 7-bit octets as
the MIME message or content part body, defined in Encoding specified in [MIME1] chapter 6.
section 4.2 below.
Content-Transfer- Conversion of a text into 7-bit octets as specified CR See [RFC822].
Encoding in [MIME1] chapter 6.
CR See [RFC822]. CRLF See [RFC822].
CRLF See [RFC822]. Displayed text The text shown to the user reading a document
with a web browser. This may be different from
the HTML markup, see the definition of HTML
markup below.
Displayed text The text shown to the user reading a document with Header Field in a message or content heading
a web browser. This may be different from the HTML specifying the value of one attribute.
markup, see the definition of HTML markup below.
Header Field in a message or content heading specifying Heading Part of a message or content before the first
the value of one attribute. CRLFCRLF, containing formatted fields with
attributes of the message or content.
Heading Part of a message or content before the first HTML See HTML 2 specification [HTML2].
CRLFCRLF, containing formatted fields with
attributes of the message or content.
HTML See HTML 2 specification [HTML2]. HTML Aggregate HTML objects together with some or all objects,
objects to which the HTML object contains hyperlinks,
directly or indirectly.
HTML Aggregate HTML objects together with some or all objects, to HTML markup A file containing HTML encodings as specified
objects which the HTML object contains hyperlinks, directly in [HTML] which may be different from the
or indirectly. displayed text which a person using a web
browser sees. For example, the HTML markup may
contain "&lt;" where the displayed text
contains the character "<".
HTML markup A file containing HTML encodings as specified in LF See [RFC822].
[HTML] which may be different from the displayed
text which a person using a web browser sees. For
example, the HTML markup may contain "&lt;" where
the displayed text contains the character "<".
LF See [RFC822]. MIC Message Integrity Codes, codes used to verify
that a message has not been modified.
MIC Message Integrity Codes, codes used to verify that a MIME See the MIME specifications [MIME1 to MIME5].
message has not been modified.
MIME See the MIME specifications [MIME1 to MIME5]. MUA Messaging User Agent.
MUA Messaging User Agent. PDF Portable Document Format, see [PDF].
PDF Portable Document Format, see [PDF]. Relative URI, See HTML 2 [HTML2] and RFC 1808 [RELURL].
RelativeURI
Relative URI, See HTML 2 [HTML2] and RFC 1808 [RELURL]. URI, absolute and See RFC 1866 [HTML2].
RelativeURI relative
URI, absolute and See RFC 1866 [HTML2].
relative
URL See RFC 1738 [URL].
URL, relative See Relative Uniform Resource Locators [RELURL]. URL See RFC 1738 [URL].
VRML See Virtual Reality Markup Language [VRML]. URL, relative See Relative Uniform Resource Locators [RELURL].
3. Overview VRML See Virtual Reality Markup Language [VRML].
An aggregate document is a MIME-encoded message that contains a root 3. Overview
resource (object) as well as other resources linked to it via URIs.
These other resources may be required to display a multimedia document
based on the root resource (inline pictures, style sheets, applets,
etc.), or be the root resources of other multimedia documents. It is
important to keep in mind that aggregate documents need to satisfy the
differing needs of several audiences.
Mail sending agents might send aggregate documents as an encoding of An aggregate document is a MIME-encoded message that contains a root
normal day-to-day electronic mail. Mail sending agents might also send resource (object) as well as other resources linked to it via URIs.
aggregate documents when a user wishes to mail a particular document These other resources may be required to display a multimedia
from the web to someone else. Finally mail sending agents might send document based on the root resource (inline pictures, style sheets,
aggregate documents as automatic responders, providing access to WWW applets, etc.), or be the root resources of other multimedia
resources for non-IP connected clients. Also with other protocols such documents. It is important to keep in mind that aggregate documents
as HTTP or FTP, there may sometimes be a need to retrieve aggregate need to satisfy the differing needs of several audiences.
documents. Receiving agents also have several differing needs. Some
receiving agents might be able to receive an aggregate document and
display it just as any other text content type would be displayed.
Others might have to pass this aggregate document to a browsing
program, and provisions need to be made to make this possible.
Finally several other constraints on the problem arise. It is important Mail sending agents might send aggregate documents as an encoding of
that it be possible for a document to be signed and for it to be normal day-to-day electronic mail. Mail sending agents might also
transmitted and displayed without breaking the message integrity (MIC) send aggregate documents when a user wishes to mail a particular
checksum that is part of the signature. document from the web to someone else. Finally mail sending agents
might send aggregate documents as automatic responders, providing
access to WWW resources for non-IP connected clients. Also with other
protocols such as HTTP or FTP, there may sometimes be a need to
retrieve aggregate documents. Receiving agents also have several
differing needs. Some receiving agents might be able to receive an
aggregate document and display it just as any other text content type
would be displayed. Others might have to pass this aggregate
document to a browsing program, and provisions need to be made to
make this possible.
4. The Content-Location MIME Content Header Finally several other constraints on the problem arise. It is
important that it be possible for a document to be signed and for it
to be transmitted and displayed without breaking the message
integrity (MIC) checksum that is part of the signature.
4.1 MIME content headers 4. The Content-Location MIME Content Header
In order to resolve URI references to resources in other body parts, 4.1 MIME content headers
one MIME content header is defined, Content-Location. This header can
occur in any message or content heading.
The syntax for this header is, using the syntax definition tools from In order to resolve URI references to resources in other body parts,
[ABNF]: one MIME content header is defined, Content-Location. This header can
occur in any message or content heading.
quoted-pair = ("\" text) The syntax for this header is, using the syntax definition tools from
[ABNF]:
text = %d1-9 / ; Characters excluding CR and LF quoted-pair = ("\" text)
%d11-12 /
%d14-127
WSP = SP / HTAB ; Whitespace characters text = %d1-9 / ; Characters excluding CR and LF
%d11-12 /
%d14-127
FWS = ([*WSP CRLF] 1*WSP) ; Folding white-space WSP = SP / HTAB ; Whitespace characters
FWS = ([*WSP CRLF] 1*WSP) ; Folding white-space
ctext = NO-WS-CTL / ; Non-white-space controls ctext = NO-WS-CTL / ; Non-white-space controls
%d33-39 / ; The rest of the US-ASCII %d33-39 / ; The rest of the US-ASCII
%d42-91 / ; characters not including "(", %d42-91 / ; characters not including "(",
%d93-127 ; ")", or "\" %d93-127 ; ")", or "\"
comment = "(" *([FWS] (ctext / quoted-pair / comment)) comment = "(" *([FWS] (ctext / quoted-pair / comment))
[FWS] ")" [FWS] ")"
CFWS = *([FWS] comment) (([FWS] comment) / FWS) CFWS = *([FWS] comment) (([FWS] comment) / FWS)
content-location = "Content-Location:" [CFWS] URI [CFWS] content-location = "Content-Location:" [CFWS] URI [CFWS]
URI = absoluteURI / relativeURI URI = absoluteURI / relativeURI
where URI is restricted to the syntax for URLs as defined in Uniform where URI is restricted to the syntax for URLs as defined in Uniform
Resource Locators [URL] until IETF specifies other kinds of URIs. Resource Locators [URL] until IETF specifies other kinds of URIs.
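
As an illustration only, and not part of either document being compared, the content-location rule above can be read as: optional comments and white space, one URI, optional comments and white space. The Python sketch below extracts the URI under two assumptions: the URI itself contains no white space, and any comment that follows it is separated from it by white space (URLs may legitimately contain parentheses, so an adjacent comment cannot be told apart from the URI). The function names skip_cfws and parse_content_location are hypothetical.

```python
def skip_cfws(s, i):
    """Return the index of the first character at or after i that is not
    white space or part of a (possibly nested) parenthesised comment."""
    while i < len(s):
        if s[i] in " \t\r\n":
            i += 1
        elif s[i] == "(":
            depth = 0
            while i < len(s):
                if s[i] == "\\":
                    i += 2          # quoted-pair: skip the escaped character
                    continue
                if s[i] == "(":
                    depth += 1
                elif s[i] == ")":
                    depth -= 1
                    if depth == 0:
                        i += 1      # step past the closing parenthesis
                        break
                i += 1
        else:
            break
    return min(i, len(s))


def parse_content_location(line):
    """Extract the URI from a single Content-Location header line.

    Assumption: the URI contains no white space; folded URIs are not handled.
    """
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != "content-location":
        raise ValueError("not a Content-Location header")
    start = skip_cfws(value, 0)
    end = start
    while end < len(value) and value[end] not in " \t\r\n":
        end += 1
    uri = value[start:end]
    if not uri:
        raise ValueError("missing URI")
    if skip_cfws(value, end) != len(value):
        raise ValueError("unexpected text after the URI")
    return uri
```

For instance, parse_content_location("Content-Location: (label) fiction1/fiction2") yields the relative URI fiction1/fiction2; the comment is discarded.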
4.2 The Content-Location Header 4.2 The Content-Location Header
A Content-Location header specifies an URI that labels the content of a A Content-Location header specifies an URI that labels the content of
body part in whose heading it is placed. Its value CAN be an absolute a body part in whose heading it is placed. Its value CAN be an
or a relative URI. Any URI or URL scheme may be used, but use of absolute or a relative URI. Any URI or URL scheme may be used, but
non-standardized URI or URL schemes might entail some risk that use of non-standardized URI or URL schemes might entail some risk
recipients cannot handle them correctly. that recipients cannot handle them correctly.
An URI in a Content-Location header need not refer to a resource which An URI in a Content-Location header need not refer to a resource
is globally available for retrieval using this URI (after resolution of which is globally available for retrieval using this URI (after
relative URIs). However, URI-s in Content-Location headers (if resolution of relative URIs). However, URI-s in Content-Location
absolute, or resolvable to absolute URIs) SHOULD still be globally headers (if absolute, or resolvable to absolute URIs) SHOULD still be
unique. globally unique.
A Content-Location header can thus be used to label a resource which is A Content-Location header can thus be used to label a resource which
not retrievable by some or all recipients of a message. For example a is not retrievable by some or all recipients of a message. For
Content-Location header may label an object which is only retrievable example a Content-Location header may label an object which is only
using this URI in a restricted domain, such as within a retrievable using this URI in a restricted domain, such as within a
company-internal web space. A Content-Location header can even contain company-internal web space. A Content-Location header can even
a fictitious URI. Such an URI need not be globally unique. contain a fictitious URI. Such an URI need not be globally unique.
A single Content-Location header field is allowed in any message or A single Content-Location header field is allowed in any message or
content heading, in addition to a Content-ID header (as specified in content heading, in addition to a Content-ID header (as specified in
[MIME1]) and, in Message headings, a Message-ID (as specified in [MIME1]) and, in Message headings, a Message-ID (as specified in
[RFC822]). All of these constitute different, equally valid body part [RFC822]). All of these constitute different, equally valid body part
labels, and any of them may be used to satisfy a reference to a body labels, and any of them may be used to satisfy a reference to a body
part. Multiple Content-Location header fields in the same message part. Multiple Content-Location header fields in the same message
heading are not allowed. heading are not allowed.
Example of a multipart/related structure containing body parts with Example of a multipart/related structure containing body parts with
both Content-Location and Content-ID labels: both Content-Location and Content-ID labels:
Content-Type: multipart/related; boundary="boundary-example"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example --boundary-example
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset="US-ASCII"
... ... <IMG SRC="fiction1/fiction2"> ... ... ... ... <IMG SRC="fiction1/fiction2"> ... ...
... ... <IMG SRC="cid:97116092811xyz@foo.bar.net"> ... ... ... ... <IMG SRC="cid:97116092811xyz@foo.bar.net"> ... ...
--boundary-example --boundary-example
Content-Type: image/gif Content-Type: image/gif
Content-ID: <97116092511xyz@foo.bar.net> Content-ID: <97116092511xyz@foo.bar.net>
Content-Location: fiction1/fiction2 Content-Location: fiction1/fiction2
--boundary-example --boundary-example
Content-Type: image/gif Content-Type: image/gif
Content-ID: <97116092811xyz@foo.bar.net> Content-ID: <97116092811xyz@foo.bar.net>
Content-Location: fiction1/fiction3 Content-Location: fiction1/fiction3
--boundary-example-- --boundary-example--
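A receiving agent could gather the labels of such a structure roughly as follows. This is a minimal sketch, assuming Python's standard email package; the helper name collect_labels, the trimmed message text in RAW, and the choice to key Content-ID labels by their "cid:" form are illustrative assumptions, not part of this specification.

```python
from email import message_from_string

# Trimmed version of the example above (assumption: one image part only).
RAW = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/related; boundary="boundary-example";\n'
    ' type="text/html"\n'
    '\n'
    '--boundary-example\n'
    'Content-Type: text/html; charset="US-ASCII"\n'
    '\n'
    '<IMG SRC="fiction1/fiction2">\n'
    '--boundary-example\n'
    'Content-Type: image/gif\n'
    'Content-ID: <97116092511xyz@foo.bar.net>\n'
    'Content-Location: fiction1/fiction2\n'
    '\n'
    'R0lGODlh\n'
    '--boundary-example--\n'
)


def collect_labels(msg):
    """Map every body part label (Content-ID or Content-Location) to its part."""
    labels = {}
    for part in msg.walk():
        cid = part.get("Content-ID")
        if cid:
            # Hypothetical convention: store the label in its cid: URL form.
            labels["cid:" + cid.strip().strip("<>")] = part
        location = part.get("Content-Location")
        if location:
            labels[location.strip()] = part
    return labels


labels = collect_labels(message_from_string(RAW))
for label in sorted(labels):
    print(label, "->", labels[label].get_content_type())
```

Both labels point at the same image part, so either form of reference in the HTML text can be satisfied by it.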
4.3 URIs of MHTML aggregates 4.3 URIs of MHTML aggregates
The URI of an MHTML aggregate is not the same as the URI of its root. The URI of an MHTML aggregate is not the same as the URI of its root.
The URI of its root will directly retrieve only the root resource The URI of its root will directly retrieve only the root resource
itself, even if it may cause a web browser to separately retrieve itself, even if it may cause a web browser to separately retrieve
in-line linked resources. If a Content-Location header field is used in in-line linked resources. If a Content-Location header field is used
the heading of a multipart/related, this Content-Location SHOULD apply in the heading of a multipart/related, this Content-Location SHOULD
to the whole aggregate, not to its root part. apply to the whole aggregate, not to its root part.
When an URI referring to an MHTML aggregate is used to retrieve this When an URI referring to an MHTML aggregate is used to retrieve this
aggregate, the set of resources retrieved can be different from the set aggregate, the set of resources retrieved can be different from the
of resources retrieved using the Content-Locations of its parts. For set of resources retrieved using the Content-Locations of its parts.
example, retrieving an MHTML aggregate may return an old version, while For example, retrieving an MHTML aggregate may return an old version,
retrieving the root URI and its in-line linked objects may return a while retrieving the root URI and its in-line linked objects may
newer version. return a newer version.
4.4 Encoding and decoding of URIs in MIME header fields 4.4 Encoding and decoding of URIs in MIME header fields
4.4.1 Encoding of URIs containing inappropriate characters 4.4.1 Encoding of URIs containing inappropriate characters
Some documents may contain URIs with characters that are inappropriate Some documents may contain URIs with characters that are
for an RFC 822 header, either because the URI itself has an incorrect inappropriate for an RFC 822 header, either because the URI itself
syntax according to [URL] or the URI syntax standard has been changed has an incorrect syntax according to [URL] or the URI syntax standard
to allow characters not previously allowed in MIME headers. These URIs has been changed to allow characters not previously allowed in MIME
cannot be sent directly in a message header. If such a URI occurs, all headers. These URIs cannot be sent directly in a message header. If
spaces and other illegal characters in it must be encoded using one of such a URI occurs, all spaces and other illegal characters in it must
the methods described in [MIME3] section 4. This encoding MUST only be be encoded using one of the methods described in [MIME3] section 4.
done in the header, not in the HTML text. Receiving clients MUST decode This encoding MUST only be done in the header, not in the HTML text.
the [MIME3] encoding in the heading before comparing URIs in body text Receiving clients MUST decode the [MIME3] encoding in the heading
to URIs in Content-Location headers. before comparing URIs in body text to URIs in Content-Location
headers.
The charset parameter value "US-ASCII" SHOULD be used if the URI The charset parameter value "US-ASCII" SHOULD be used if the URI
contains no octets outside of the 7-bit range. If such octets are contains no octets outside of the 7-bit range. If such octets are
present, the correct charset parameter value (derived e.g. from present, the correct charset parameter value (derived e.g. from
information about the HTML document the URI was found in) SHOULD be information about the HTML document the URI was found in) SHOULD be
used. If this cannot be safely established, the value "UNKNOWN-8BIT" used. If this cannot be safely established, the value "UNKNOWN-8BIT"
[RFC 1428] MUST be used. [RFC 1428] MUST be used.
Note, that for the matching of URIs in text/html body parts to URIs in Note, that for the matching of URIs in text/html body parts to URIs
Content-Location headers, the value of the charset parameter is in Content-Location headers, the value of the charset parameter is
irrelevant, but that it may be relevant for other purposes, and that irrelevant, but that it may be relevant for other purposes, and that
incorrect labeling MUST, therefore, be avoided. Warning: Irrelevance of incorrect labeling MUST, therefore, be avoided. Warning: Irrelevance
the charset parameter may not be true in the future, if different of the charset parameter may not be true in the future, if different
character encodings of the same non-English filename are used in HTML. character encodings of the same non-English filename are used in
HTML.
4.4.2 Folding of long URIs 4.4.2 Folding of long URIs
Since MIME header fields have a limited length and long URIs can result Since MIME header fields have a limited length and long URIs can
in Content-Location that exceed this length, Content-Location headers result in Content-Location headers that exceed this length, Content-
may have to be folded. Location headers may have to be folded.
Encoding as discussed in clause 4.4.1 MUST be done before such folding. Encoding as discussed in clause 4.4.1 MUST be done before such
After that, the folding can be done, using the algorithm defined in folding. After that, the folding can be done, using the algorithm
[URLBODY] section 3.1. defined in [URLBODY] section 3.1.
4.4.3 Unfolding and decoding of received URLs in MIME header fields 4.4.3 Unfolding and decoding of received URLs in MIME header fields
Upon receipt, folded MIME header fields should be unfolded, and then Upon receipt, folded MIME header fields should be unfolded, and then
any MIME encoding should be removed, to retrieve the original URI. any MIME encoding should be removed, to retrieve the original URI.
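The receiving order (unfold first, then remove the MIME header encoding) might be sketched as below. This is an illustration only: it assumes that unfolding means dropping each line break together with the whitespace that follows it, it uses Python's email.header.decode_header for the [MIME3] encoded-words, and the function names and example.com URLs are hypothetical.

```python
import re
from email.header import decode_header


def _to_text(chunk, charset):
    """Turn one decode_header() chunk into text."""
    if isinstance(chunk, str):
        return chunk
    try:
        return chunk.decode(charset or "ascii", errors="replace")
    except LookupError:
        # Charset label not known to Python (e.g. UNKNOWN-8BIT):
        # fall back to a lossless octet-to-character mapping.
        return chunk.decode("latin-1")


def unfold_and_decode(raw_value):
    """Unfold a Content-Location value, then remove MIME header encoding."""
    # Step 1: unfold (assumption: line break plus following whitespace is dropped).
    unfolded = re.sub(r"\r?\n[ \t]+", "", raw_value).strip()
    # Step 2: decode encoded-words; URL %XX escapes are deliberately left alone.
    return "".join(_to_text(chunk, cs) for chunk, cs in decode_header(unfolded))


print(unfold_and_decode("http://www.example.com/a/very/long/\r\n path/image.gif"))
print(unfold_and_decode("=?US-ASCII?Q?http://example.com/my=20file.gif?="))
```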
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
Relative URIs inside the contents of MIME body parts are resolved Relative URIs inside the contents of MIME body parts are resolved
relative to a base URI using the methods for resolving relative URIs relative to a base URI using the methods for resolving relative URIs
described in [RELURL]. In order to determine this base URI, the described in [RELURL]. In order to determine this base URI, the
first-applicable method in the following list applies. first-applicable method in the following list applies.
(a) There is a base specification inside the MIME body part containing (a) There is a base specification inside the MIME body part
the relative URI which resolves relative URIs into absolute URIs. containing the relative URI which resolves relative URIs into
For example, HTML provides the BASE element for this purpose. absolute URIs. For example, HTML provides the BASE element for
this purpose.
(b) There is a Content-Location header in the immediately surrounding (b) There is a Content-Location header in the immediately surrounding
heading of the body part and it contains an absolute URI. This URI heading of the body part and it contains an absolute URI. This
can serve as a base in the same way as a requested URI can serve as URI can serve as a base in the same way as a requested URI can
a base for relative URIs within a file retrieved via HTTP [HTTP]. serve as a base for relative URIs within a file retrieved via
HTTP [HTTP].
(c) If necessary, step (b) can be repeated recursively to find a (c) If necessary, step (b) can be repeated recursively to find a
suitable Content-Location header in a surrounding multi-part and suitable Content-Location header in a surrounding multi-part or
message heading. message heading.
(d) If the MIME object is returned in a HTTP response, use the (d) If the MIME object is returned in a HTTP response, use the URI
URI used to initiate the request used to initiate the request
(e) When the methods above do not yield an absolute URI, a base URL of (e) When the methods above do not yield an absolute URI, a base URL
"this_message:/" MUST be employed. This base URL has been defined of "thismessage:/" MUST be employed. This base URL has been
for the sole purpose of resolving relative references within a defined for the sole purpose of resolving relative references
multipart/related structure when no other base URI is specified. within a multipart/related structure when no other base URI is
specified.
This is also described in other words in section 8.2 below. This is also described in other words in section 8.2 below.
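The first-applicable rule in steps (a) through (e) can be sketched as a simple chain of checks. This is a hedged illustration, not a normative algorithm: the function and parameter names are hypothetical, the caller is assumed to supply the Content-Location values ordered from the innermost heading outwards, and the fallback uses the "thismessage:/" spelling of RFC 2557 (the draft text spells it "this_message:/").

```python
from urllib.parse import urlsplit

FALLBACK_BASE = "thismessage:/"  # spelling used in RFC 2557


def is_absolute(uri):
    """Treat a URI as absolute when it carries a scheme."""
    return bool(urlsplit(uri).scheme)


def find_base_uri(body_base=None, content_locations=(), request_uri=None):
    """Return the base URI chosen by the first applicable method.

    body_base         -- base given inside the body part, e.g. HTML BASE (a)
    content_locations -- Content-Location values, innermost heading first (b, c)
    request_uri       -- URI used for the HTTP request, if any (d)
    """
    if body_base:
        return body_base
    for location in content_locations:
        if location and is_absolute(location):
            return location
    if request_uri:
        return request_uri
    return FALLBACK_BASE  # (e)


print(find_base_uri(content_locations=["fiction1/", "http://www.ietf.cnri.reston.va.us/"]))
print(find_base_uri(content_locations=["fiction1/fiction2"]))
```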
6. Sending documents without linked objects 6. Sending documents without linked objects
If a text/html resource (object) is sent without subsidiary resources, If a text/html resource (object) is sent without subsidiary
to which it refers, it MAY be sent by itself. In this case, embedding resources, to which it refers, it MAY be sent by itself. In this
it in a multipart/related structure is not necessary. case, embedding it in a multipart/related structure is not necessary.
Such a text/html resource may either contain no URIs, or URIs which the Such a text/html resource may either contain no URIs, or URIs which
recipient is expected to retrieve (if possible) via a URI specified the recipient is expected to retrieve (if possible) via a URI
protocol. A text/html resource may also be sent with unresolvable links specified protocol. A text/html resource may also be sent with
in special cases, such as when two authors exchange drafts of unresolvable links in special cases, such as when two authors
unfinished resources. exchange drafts of unfinished resources.
Inclusion of URIs referencing resources which the recipient has to Inclusion of URIs referencing resources which the recipient has to
retrieve via an URI specified protocol may not work for some retrieve via an URI specified protocol may not work for some
recipients. This is because not all e-mail recipients have full recipients. This is because not all e-mail recipients have full
Internet connectivity, or because URIs which work for a sender will not Internet connectivity, or because URIs which work for a sender will
work for a recipient. This occurs, for example, when an URI refers to a not work for a recipient. This occurs, for example, when an URI
resource within a company-internal network that is not accessible from refers to a resource within a company-internal network that is not
outside the company. accessible from outside the company.
7. Use of the Content-Type "multipart/related" 7. Use of the Content-Type "multipart/related"
If a message contains one or more MIME body parts containing URIs and If a message contains one or more MIME body parts containing URIs and
also contains as separate body parts, resources, to which these URIs also contains as separate body parts, resources, to which these URIs
(as defined, for example, in HTML 2.0 [HTML2]) refer, then this whole (as defined, for example, in HTML 2.0 [HTML2]) refer, then this whole
set of body parts (referring body parts and referred-to body parts) set of body parts (referring body parts and referred-to body parts)
SHOULD be sent within a multipart/related structure as defined in SHOULD be sent within a multipart/related structure as defined in
[REL]. [REL].
Even though headers can occur in a message that lacks an associated a Even though headers can occur in a message that lacks an associated
multipart/related structure, this standard only covers their use for multipart/related structure, this standard only covers their use for
resolution of URIs between body parts inside a multipart/related resolution of URIs between body parts inside a multipart/related
structure. This standard does cover the case where a resource in a structure. This standard does cover the case where a resource in a
nested multipart/related structure contains URIs that reference MIME nested multipart/related structure contains URIs that reference MIME
body parts in another multipart/related structure, in which it is body parts in another multipart/related structure, in which it is
enclosed. This standard does not cover the case where a resource in a enclosed. This standard does not cover the case where a resource in a
multipart/related structure contains URIs that reference MIME body multipart/related structure contains URIs that reference MIME body
parts in another parallel or nested multipart/related structure, or in parts in another parallel or nested multipart/related structure, or
another MIME message, even if methods similar to those described in in another MIME message, even if methods similar to those described
this standard are used. Implementers who employ such URIs are warned in this standard are used. Implementers who employ such URIs are
that receiving agents implementing this standard may not be able to warned that receiving agents implementing this standard may not be
process such references. able to process such references.
When the start body part of a multipart/related structure is an atomic When the start body part of a multipart/related structure is an
object, such as a text/html resource, it SHOULD be employed as the root atomic object, such as a text/html resource, it SHOULD be employed as
resource of that multipart/related structure. When the start body part the root resource of that multipart/related structure. When the start
of a multipart/related structure is a multipart/alternative structure, body part of a multipart/related structure is a multipart/alternative
and that structure contains at least one alternative body part which is structure, and that structure contains at least one alternative body
a suitable atomic object, such as a text/html resource, then that body part which is a suitable atomic object, such as a text/html resource,
part SHOULD be employed as the root resource of the aggregate document. then that body part SHOULD be employed as the root resource of the
Implementers are warned, however, that some receiving agents treat aggregate document. Implementers are warned, however, that some
multipart/alternative as if it had been multipart/mixed (even though receiving agents treat multipart/alternative as if it had been
MIME [MIME1] requires support for multipart/alternative). multipart/mixed (even though MIME [MIME1] requires support for
multipart/alternative).
[REL] specifies that a type parameter is mandatory in a "Content-Type: [REL] specifies that a type parameter is mandatory in a "Content-
multipart/related" header, and requires that it be employed to specify Type: multipart/related" header, and requires that it be employed to
the type of the multipart/related start object. Thus, the type specify the type of the multipart/related start object. Thus, the
parameter value shall be "multipart/alternative", when the start part type parameter value shall be "multipart/alternative", when the start
is of "Content-type multipart/alternative", even if the actual root part is of "Content-type multipart/alternative", even if the actual
resource is of type "text/html". In addition, if the multipart/related root resource is of type "text/html". In addition, if the
start object is not the first body part in a multipart/related multipart/related start object is not the first body part in a
structure, [REL] further requires that its Content-ID MUST be specified multipart/related structure, [REL] further requires that its
as the value of a start parameter in the "Content-Type: Content-ID MUST be specified as the value of a start parameter in the
multipart/related" header. "Content-Type: multipart/related" header.
When rendering a resource in a multipart/related structure, URI When rendering a resource in a multipart/related structure, URI
references within that resource can be satisfied by body parts within references within that resource can be satisfied by body parts within
the same multipart/related structure. This is useful: the same multipart/related structure (see section 8.2 below). This is
useful:
(a) For those recipients who only have email but not full Internet (a) For those recipients who only have email but not full Internet
access. access.
(b) For those recipients who for other reasons, such as firewalls or (b) For those recipients who for other reasons, such as firewalls or
the use of company-internal links, cannot retrieve URI referenced the use of company-internal links, cannot retrieve URI referenced
resources via URI specified protocols. resources via URI specified protocols.
Note, that this means that you can, via e-mail, send text/html Note, that this means that you can, via e-mail, send text/html
objects which include URIs which the recipient cannot resolve via objects which include URIs which the recipient cannot resolve
HTTP or other connectivity-requiring URIs. via HTTP or other connectivity-requiring URIs.
(c) To send a document whose content is preserved even if the (c) To send a document whose content is preserved even if the
resources to which embedded URIs refer are later changed resources to which embedded URIs refer are later changed or
or deleted. deleted.
(d) For resources which are not available for protocol based (d) For resources which are not available for protocol based
retrieval. retrieval.
(e) To speed up access. (e) To speed up access.
When a sending MUA sends objects which were retrieved from the WWW, it When a sending MUA sends objects which were retrieved from the WWW,
SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into it SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs
some other URI form prior to transmitting them. This will allow the into some other URI form prior to transmitting them. This will allow
receiving MUA to both verify MICs included with the message, as well as
verify the documents against their WWW counterparts, if this is
appropriate.
In certain cases this will not work - for example, if a resource the receiving MUA to both verify MICs included with the message, as
contains URIs as parameters to objects and applets. In such a case, it well as verify the documents against their WWW counterparts, if this
might be better to rewrite the document before sending it. This problem is appropriate.
is discussed in more detail in the informational RFC which will be
published as a supplement to this standard.
Within a multipart/related structure, each body part MUST have, if In certain cases this will not work - for example, if a resource
assigned, a different Content-ID header value and a Content-Location contains URIs as parameters to objects and applets. In such a case,
header field values which resolve to a different URI. it might be better to rewrite the document before sending it. This
problem is discussed in more detail in the informational RFC which
will be published as a supplement to this standard.
Two body parts in the same multipart/related structure can have the Within a multipart/related structure, each body part MUST have, if
same relative Content-Location header value, only if when resolved to assigned, a different Content-ID header value and a Content-Location
absolute URIs they become different. header field values which resolve to a different URI.
8. Usage of Links to Other Body Parts Two body parts in the same multipart/related structure can have the
same relative Content-Location header value, only if when resolved to
absolute URIs they become different.
8.1 General principle 8. Usage of Links to Other Body Parts
A body part, such as a text/html body part, may contain URIs that 8.1 General principle
reference resources which are included as body parts in the same
message -- in detail, as body parts within the same multipart/related
structure. Often such URI linked resources are meant to be displayed
inline to the viewer of the referencing body part; for example, objects
referenced with the SRC attribute of the IMG element in HTML 2.0
[HTML2]. New elements and attributes with this property are proposed in
the ongoing development of HTML (examples: applet, frame, profile,
OBJECT, classid, codebase, data, SCRIPT). A sender might also want to
send a set of HTML documents which the reader can traverse, and which
are related with the attribute href of the A element.
If a user retrieves and displays a web page formed from a text/html A body part, such as a text/html body part, may contain URIs that
resource, and the subsidiary resources it references, and merely saves reference resources which are included as body parts in the same
the text/html resource, that user may not at a later time be able to message -- in detail, as body parts within the same multipart/related
retrieve and display the web page as it appeared when saved. The format structure. Often such URI linked resources are meant to be displayed
described in this standard can be used to archive and retrieve all of inline to the viewer of the referencing body part; for example,
the resources required to display the web page, as it originally objects referenced with the SRC attribute of the IMG element in HTML
appeared at a certain moment of time, in one aggregate file. 2.0 [HTML2]. New elements and attributes with this property are
proposed in the ongoing development of HTML (examples: applet, frame,
profile, OBJECT, classid, codebase, data, SCRIPT). A sender might
also want to send a set of HTML documents which the reader can
traverse, and which are related with the attribute href of the A
element.
In order to send or store complete such messages, there is a need to If a user retrieves and displays a web page formed from a text/html
specify how a URI in one body part can reference a resource in another resource, and the subsidiary resources it references, and merely
body part. saves the text/html resource, that user may not at a later time be
able to retrieve and display the web page as it appeared when saved.
The format described in this standard can be used to archive and
retrieve all of the resources required to display the web page, as it
originally appeared at a certain moment of time, in one aggregate
file.
8.2 Resolution of URIs in text/html body parts In order to send or store complete such messages, there is a need to
specify how a URI in one body part can reference a resource in
another body part.
The resolution of inline, retrieval and other kinds of URIs in 8.2 Resolution of URIs in text/html body parts
text/html body parts is performed in the following way:
(a) Unfold multiple line header values according to [URLBODY]. Do NOT The resolution of inline, retrieval and other kinds of URIs in
however translate character encodings of the kind described in text/html body parts is performed in the following way:
[URL]. Example: Do not transform "a%2eb/c%20d" into "a/b/c d".
(b) Remove all MIME encodings, such as content-transfer encoding and (a) Unfold multiple line header values according to [URLBODY]. Do NOT
header encodings as defined in MIME part 3 [MIME3]. Do NOT however however translate character encodings of the kind described in
translate character encodings of the kind described in [URL]. [URL]. Example: Do not transform "a%2eb/c%20d" into "a/b/c d".
Example: Do not transform "a%2eb/c%20d" into "a/b/c d".
(c) Try to resolve all relative URIs in the HTML content and in (b) Remove all MIME encodings, such as content-transfer encoding and
Content-Location headers using the procedure described in chapter header encodings as defined in MIME part 3 [MIME3]. Do NOT however
5 above. The result of this resolution can be an absolute URI, translate character encodings of the kind described in [URL].
or an absolute URI with the base "this_message:/" as specified Example: Do not transform "a%2eb/c%20d" into "a/b/c d".
in chapter 5.
(d) For each referencing URI in a text/html body part, compare the (c) Try to resolve all relative URIs in the HTML content and in
value of the referencing URI after resolution as described in (a) Content-Location headers using the procedure described in chapter
and (b), with the URI derived from Content-ID and Content-Location 5 above. The result of this resolution can be an absolute URI, or
headers for other body parts within the same or a surrounding an absolute URI with the base "thismessage:/" as specified in
Multipart/related structure. If the strings are identical, octet by chapter 5.
octet, then the referencing URI references that body part. This
comparison will only succeed if the two URIs are identical. This
means that if one of the two URIs to be compared was a fictitious
absolute URI with the base"this_message:/", the other must also be
such a fictitious absolute URI, and not resolvable to a real
absolute URI.
(e) If (d) fails, try to retrieve the URI referenced resource (d) For each referencing URI in a text/html body part, compare the
hyperlink through ordinary Internet lookup. Resolution of URIs of value of the referencing URI after resolution as described in (a)
the URL-types "mid" or "cid" to other content-parts, outside the and (b), with the URI derived from Content-ID and Content-
same multipart/related structure, or in other separately sent Location headers for other body parts within the same or a
messages, is not covered by this standard, and is thus neither surrounding Multipart/related structure. If the strings are
encouraged nor forbidden. identical, octet by octet, then the referencing URI references
that body part. This comparison will only succeed if the two URIs
are identical. This means that if one of the two URIs to be
compared was a fictitious absolute URI with the base
"thismessage:/", the other must also be such a fictitious
absolute URI, and not resolvable to a real absolute URI.
8.3 Use of the Content-ID header and CID URLs (e) If (d) fails, try to retrieve the URI referenced resource
hyperlink through ordinary Internet lookup. Resolution of URIs of
the URL-types "mid" or "cid" to other content-parts, outside the
same multipart/related structure, or in other separately sent
messages, is not covered by this standard, and is thus neither
encouraged nor forbidden.
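Steps (c) and (d) of section 8.2, together with the cid rule of section 8.3, could be sketched as follows. This is a simplified illustration under stated assumptions: headers are taken to be already unfolded and MIME-decoded, the base is an ordinary http-style absolute URI (Python's urljoin does not resolve against the "thismessage:/" scheme), cid values are compared without URL unescaping, and the name match_reference and the tuple layout of parts are hypothetical.

```python
from urllib.parse import urljoin


def match_reference(ref, base, parts):
    """Return the index of the body part that satisfies ref, or None.

    parts is a list of (content_id, content_location) tuples; either
    element may be None when the header is absent.
    """
    if ref.lower().startswith("cid:"):
        # cid: URLs are matched against Content-ID values only.
        wanted = ref[4:]
        for index, (content_id, _location) in enumerate(parts):
            if content_id is not None and content_id.strip("<>") == wanted:
                return index
        return None
    # Resolve both sides against the same base, then compare the strings.
    absolute_ref = urljoin(base, ref)
    for index, (_content_id, location) in enumerate(parts):
        if location is not None and urljoin(base, location) == absolute_ref:
            return index
    return None  # caller falls through to step (e), ordinary retrieval


BASE = "http://www.ietf.cnri.reston.va.us/"
PARTS = [
    ("<foo3@foo1@bar.net>", None),
    (None, "images/ietflogo1.gif"),
    (None, "http://www.ietf.cnri.reston.va.us/images/ietflogo2.gif"),
]
print(match_reference("images/ietflogo1.gif", BASE, PARTS))
print(match_reference("/images/ietflogo2.gif", BASE, PARTS))
print(match_reference("cid:foo3@foo1@bar.net", BASE, PARTS))
```

A result of None corresponds to the case where step (d) fails and the agent may attempt ordinary Internet lookup.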
When URIs employing a CID (Content-ID) scheme as defined in [URL] and 8.3 Use of the Content-ID header and CID URLs
[MIDCID] are used to reference other body parts in an MHTML
multipart/related structure, they MUST only be matched against
Content-ID header values, and not against Content-Location header with
CID: values. Thus, even though the following two headers are identical
in meaning, only the Content-ID value will be matched, and the
Content-Location value will be ignored.
Content-ID: <foo@bar.net> When URIs employing a CID (Content-ID) scheme as defined in [URL] and
Content-Location: CID: foo@bar.net [MIDCID] are used to reference other body parts in an MHTML
multipart/related structure, they MUST only be matched against
Content-ID header values, and not against Content-Location header
with CID: values. Thus, even though the following two headers are
identical in meaning, only the Content-ID value will be matched, and
the Content-Location value will be ignored.
Note: Content-IDs MUST be globally unique [MIME1]. It is thus not Content-ID: <foo@bar.net>
permitted to make them unique only within a message or within a single Content-Location: CID: foo@bar.net
multipart/related structure.
9. Examples Note: Content-IDs MUST be globally unique [MIME1]. It is thus not
permitted to make them unique only within a message or within a
single multipart/related structure.
Warning: The examples are provided for illustrative purposes only. If 9. Examples
there is a contradiction between the explanatory text and the examples
in this standard, then the explanatory text is normative.
Notation: The examples contain indentation to show the structure, the Warning: The examples are provided for illustrative purposes only. If
real objects should not be indented in this way. there is a contradiction between the explanatory text and the
examples in this standard, then the explanatory text is normative.
9.1 Example of a HTML body without included linked objects Notation: The examples contain indentation to show the structure, the
real objects should not be indented in this way.
The first example is the simplest form of an HTML email message. This 9.1 Example of a HTML body without included linked objects
message does not contain an aggregate HTML object, but simply a message
with a single HTML body part. This body part contains a URI but the
message does not contain the resource referenced by that URI. To
retrieve the resource referenced by the URI the receiving client would
need either IP access to the Internet, or an electronic mail web
gateway.
From: foo1@bar.net The first example is the simplest form of an HTML email message. This
To: foo2@bar.net message does not contain an aggregate HTML object, but simply a
Subject: A simple example message with a single HTML body part. This body part contains a URI
Mime-Version: 1.0 but the message does not contain the resource referenced by that
Content-Type: text/html; charset=iso-8859-1 URI. To retrieve the resource referenced by the URI the receiving
Content-Transfer-Encoding: 8bit client would need either IP access to the Internet, or an electronic
mail web gateway.
<HTML> From: foo1@bar.net
<head></head> To: foo2@bar.net
<body> Subject: A simple example
<h1>Acute accent</h1> Mime-Version: 1.0
The following two lines have the same screen rendering:<p> Content-Type: text/html; charset="iso-8859-1"
E with acute accent becomes É.<br> Content-Transfer-Encoding: 8bit
E with acute accent becomes &Eacute;.<p>
Try clicking <a href="http://www.ietf.cnri.reston.va.us/">
here.</a><p>
</body></HTML>
9.2 Example with an absolute URI to an embedded GIF picture <HTML>
<head></head>
<body>
<h1>Acute accent</h1>
The following two lines have the same screen rendering:<p>
E with acute accent becomes É.<br>
E with acute accent becomes &Eacute;.<p>
Try clicking <a href="http://www.ietf.cnri.reston.va.us/">
here.</a><p>
</body></HTML>
The second example is an HTML message which includes a single image, 9.2 Example with an absolute URI to an embedded GIF picture
referenced using the Content-Location mechanism.
From: foo1@bar.net The second example is an HTML message which includes a single image,
To: foo2@bar.net referenced using the Content-Location mechanism.
Subject: A simple example
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"; start="<foo3@foo1@bar.net>"
--boundary-example From: foo1@bar.net
Content-Type: text/html;charset=US-ASCII To: foo2@bar.net
Content-ID: <foo3@foo1@bar.net> Subject: A simple example
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"; start="<foo3@foo1@bar.net>"
... text of the HTML document, which might contain a URI --boundary-example
referencing a resource in another body part, for example Content-Type: text/html;charset="US-ASCII"
through a statement such as: Content-ID: <foo3@foo1@bar.net>
<IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif"
ALT="IETF logo">
--boundary-example ... text of the HTML document, which might contain a URI
Content-Location: referencing a resource in another body part, for example
http://www.ietf.cnri.reston.va.us/images/ietflogo.gif through a statement such as:
Content-Type: IMAGE/GIF <IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif"
Content-Transfer-Encoding: BASE64 ALT="IETF logo">
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 --boundary-example
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A Content-Location:
etc... http://www.ietf.cnri.reston.va.us/images/ietflogo.gif
Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64
--boundary-example-- R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc...
9.3 Example with relative URIs to embedded GIF pictures --boundary-example--
In this example, a Content-Location header field in the outermost 9.3 Example with relative URIs to embedded GIF pictures
heading will be a base to all relative URLs, also inside the HTML text
being sent.
From: foo1@bar.net In this example, a Content-Location header field in the outermost
To: foo2@bar.net heading will be a base to all relative URLs, also inside the HTML
Subject: A simple example text being sent.
Mime-Version: 1.0
Content-Location: http://www.ietf.cnri.reston.va.us/
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"
--boundary-example From: foo1@bar.net
Content-Type: text/html; charset=ISO-8859-1 To: foo2@bar.net
Content-Transfer-Encoding: QUOTED-PRINTABLE Subject: A simple example
Mime-Version: 1.0
Content-Location: http://www.ietf.cnri.reston.va.us/
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"
... text of the HTML document, which might contain URIs --boundary-example
referencing resources in other body parts, for example through Content-Type: text/html; charset="ISO-8859-1"
statements such as: Content-Transfer-Encoding: QUOTED-PRINTABLE
<IMG SRC="/images/ietflogo1.gif" ALT="IETF logo1"> ... text of the HTML document, which might contain URIs
<IMG SRC="/images/ietflogo2.gif" ALT="IETF logo2"> referencing resources in other body parts, for example through
<IMG SRC="/images/ietflogo3.gif" ALT="IETF logo3"> statements such as:
Example of a copyright sign encoded with Quoted-Printable: =A9 <IMG SRC="images/ietflogo1.gif" ALT="IETF logo1">
Example of a copyright sign mapped onto HTML markup: &#168; <IMG SRC="images/ietflogo2.gif" ALT="IETF logo2">
<IMG SRC="images/ietflogo3.gif" ALT="IETF logo3">
--boundary-example Example of a copyright sign encoded with Quoted-Printable: =A9
Content-Location: Example of a copyright sign mapped onto HTML markup: &#168;
http://www.ietf.cnri.reston.va.us/images/ietflogo1.gif
; Note - Absolute Content-Location does not require a
; base
Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 --boundary-example
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A Content-Location:
etc... http://www.ietf.cnri.reston.va.us/images/ietflogo1.gif
; Note - Absolute Content-Location does not require a
; base
Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64
--boundary-example R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
Content-Location: ietflogo2.gif NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
; Note - Relative Content-Location is resolved by base etc...
; specified in the Multipart/Related Content-Location heading
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 --boundary-example
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A Content-Location: images/ietflogo2.gif
etc... ; Note - Relative Content-Location is resolved by base
; specified in the Multipart/Related Content-Location heading
Content-Transfer-Encoding: BASE64
--boundary-example R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
Content-Location: NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
http://www.ietf.cnri.reston.va.us/images/ietflogo3.gif etc...
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 --boundary-example
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A Content-Location:
etc... http://www.ietf.cnri.reston.va.us/images/ietflogo3.gif
Content-Transfer-Encoding: BASE64
--boundary-example-- R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc...
9.4 Example with a relative URI and no BASE available --boundary-example--
From: foo1@bar.net 9.4 Example with a relative URI and no BASE available
To: foo2@bar.net
Subject: A simple example
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"
--boundary-example From: foo1@bar.net
Content-Type: text/html; charset=iso-8859-1 To: foo2@bar.net
Content-Transfer-Encoding: QUOTED-PRINTABLE Subject: A simple example
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"
... text of the HTML document, which might contain a URI --boundary-example
referencing a resource in another body part, for example Content-Type: text/html; charset="iso-8859-1"
through a statement such as: Content-Transfer-Encoding: QUOTED-PRINTABLE
<IMG SRC="ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#168;
--boundary-example ... text of the HTML document, which might contain a URI
Content-Location: ietflogo.gif referencing a resource in another body part, for example
Content-Type: IMAGE/GIF through a statement such as:
Content-Transfer-Encoding: BASE64 <IMG SRC="ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#168;
--boundary-example
Content-Location: ietflogo.gif
Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-- --boundary-example--
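The matching in examples 9.3 and 9.4 can be sketched in Python as follows. This is a minimal illustration, not part of either document version: the function names absolutize and same_resource are hypothetical, the fallback base uses the "thismessage:/" spelling from the RFC 2557 column, and the comparison is a plain string match with no URI normalization and no handling of "../" segments in the no-base case.

```python
from urllib.parse import urljoin, urlsplit

# Fallback base applied when no base URI can be found (assumption: the
# RFC 2557 spelling; the draft column spells it "this_message:/").
FALLBACK_BASE = "thismessage:/"


def absolutize(uri, base=None):
    """Return an absolute form of uri (hypothetical helper).

    An already absolute URI is returned unchanged, a relative URI is
    resolved against base when one is known, and otherwise it is placed
    under the fallback base by simple concatenation.
    """
    if urlsplit(uri).scheme:
        return uri
    if base:
        return urljoin(base, uri)
    # Simplification: plain prefixing, no dot-segment processing.
    return FALLBACK_BASE + uri.lstrip("/")


def same_resource(reference, content_location, base=None):
    """Compare an HTML reference with a body part's Content-Location."""
    return absolutize(reference, base) == absolutize(content_location, base)


# Example 9.4: no base is available, so both sides fall back.
print(absolutize("ietflogo.gif"))
print(same_resource("ietflogo.gif", "ietflogo.gif"))

# Example 9.3: the outer Content-Location heading supplies the base.
BASE = "http://www.ietf.cnri.reston.va.us/"
print(absolutize("images/ietflogo1.gif", BASE))
```

A real user agent would also have to respect the precedence of inner over outer Content-Location headings, which this sketch leaves out.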
9.5 Example using CID URL and Content-ID header to an embedded GIF 9.5 Example using CID URL and Content-ID header to an embedded GIF
picture picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example --boundary-example
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset="US-ASCII"
... text of the HTML document, which might contain a URI ... text of the HTML document, which might contain a URI
referencing a resource in another body part, for example referencing a resource in another body part, for example
through a statement such as: through a statement such as:
<IMG SRC="cid:foo4@foo1@bar.net" ALT="IETF logo"> <IMG SRC="cid:foo4@foo1@bar.net" ALT="IETF logo">
--boundary-example --boundary-example
Content-Location: CID:something@else ; this header is disregarded Content-Location: CID:something@else ; this header is disregarded
Content-ID: <foo4@foo1@bar.net> Content-ID: <foo4@foo1@bar.net>
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-- --boundary-example--
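The CID lookup in example 9.5 might be implemented along the following lines with Python's standard email package. This is a sketch under stated assumptions: find_part_by_cid is a hypothetical helper, the message text is a shortened stand-in for the example above with a placeholder payload, and the search walks every body part without applying the nesting restrictions that example 9.6 illustrates.

```python
from email import message_from_string
from urllib.parse import unquote

# Shortened stand-in for the message in example 9.5 (placeholder payload).
RAW = (
    "Mime-Version: 1.0\n"
    'Content-Type: multipart/related; boundary="boundary-example";\n'
    ' type="text/html"\n'
    "\n"
    "--boundary-example\n"
    'Content-Type: text/html; charset="US-ASCII"\n'
    "\n"
    '<IMG SRC="cid:foo4@foo1@bar.net" ALT="IETF logo">\n'
    "--boundary-example\n"
    "Content-ID: <foo4@foo1@bar.net>\n"
    "Content-Type: IMAGE/GIF\n"
    "Content-Transfer-Encoding: BASE64\n"
    "\n"
    "R0lGODlhAQABAAAAACw=\n"
    "--boundary-example--\n"
)


def find_part_by_cid(msg, cid_url):
    """Return the body part whose Content-ID matches a cid: URL, or None."""
    if not cid_url.lower().startswith("cid:"):
        return None
    # A cid: URL carries the Content-ID value without the angle brackets.
    wanted = unquote(cid_url[4:])
    for part in msg.walk():
        content_id = part.get("Content-ID")
        if content_id is not None and content_id.strip().strip("<>") == wanted:
            return part
    return None


msg = message_from_string(RAW)
part = find_part_by_cid(msg, "cid:foo4@foo1@bar.net")
print(part.get_content_type() if part is not None else "not found")
```

The lookup consults only the Content-ID header, so a Content-Location header on the same body part plays no role in resolving the cid: reference.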
9.6 Example showing permitted and forbidden references between nested 9.6 Example showing permitted and forbidden references between nested
body parts body parts
This example shows in which cases references are allowed between This example shows in which cases references are allowed between
multiple multipart/related body parts in a message. multiple multipart/related body parts in a message.
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: multipart/related; boundary="boundary-example-1";
type="text/html" type="text/html"
--boundary-example-1 --boundary-example-1
Content-Type: text/html;charset=US-ASCII Content-Type: text/html;charset="US-ASCII"
Content-ID: <foo3@foo1@bar.net> Content-ID: <foo3@foo1@bar.net>
The image reference below will be resolved with the image The image reference below will be resolved with the image
in the next body part. in the next body part.
<IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif" <IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif"
ALT="IETF logo with white background"> ALT="IETF logo with white background">
The image reference below cannot be resolved within this The image reference below cannot be resolved within this
MIME message, since it contains a reference from an outside MIME message, since it contains a reference from an outside
body part to an inside body part, which is not supported body part to an inside body part, which is not supported
by this standard. by this standard.
<IMG SRC=images/ietflogo2e.gif" <IMG SRC=images/ietflogo2e.gif"
ALT="IETF logo with transparent background"> ALT="IETF logo with transparent background">
The anchor reference immediately below will be resolved with The anchor reference immediately below will be resolved with
the nested text/html body part below: the nested text/html body part below:
<A HREF="http://www.ietf.cnri.reston.va.us/more-info> <A HREF="http://www.ietf.cnri.reston.va.us/more-info>
More info</A> More info</A>
The anchor reference immediately below will be resolved with The anchor reference immediately below will be resolved with
the nested text/html body part below: the nested text/html body part below:
<A HREF="http://www.ietf.cnri.reston.va.us/even-more-info> <A HREF="http://www.ietf.cnri.reston.va.us/even-more-info>
Even more info</A> Even more info</A>
--boundary-example-1 --boundary-example-1
Content-Location: Content-Location:
http://www.ietf.cnri.reston.va.us/images/ietflogo.gif http://www.ietf.cnri.reston.va.us/images/ietflogo.gif
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1 --boundary-example-1
Content-Location: Content-Location:
http://www.ietf.cnri.reston.va.us/more-info http://www.ietf.cnri.reston.va.us/more-info
Content-Type: multipart/related; boundary="boundary-example-2"; Content-Type: multipart/related; boundary="boundary-example-2";
type="text/html" type="text/html"
--boundary-example-2 --boundary-example-2
Content-Type: text/html;charset=US-ASCII Content-Type: text/html;charset="US-ASCII"
Content-ID: <foo4@foo1@bar.net> Content-ID: <foo4@foo1@bar.net>
The image reference below will be resolved with the image The image reference below will be resolved with the image
in the surrounding multipart/related above. in the surrounding multipart/related above.
<IMG SRC=images/ietflogo.gif" <IMG SRC="images/ietflogo.gif"
ALT="IETF logo with white background"> ALT="IETF logo with white background">
The image reference below will be resolved with the image The image reference below will be resolved with the image
inside the current nested multipart/related below. inside the current nested multipart/related below.
<IMG SRC=images/ietflogo2e.gif" <IMG SRC=images/ietflogo2e.gif"
ALT="IETF logo with transparent background"> ALT="IETF logo with transparent background">
--boundary-example-2 --boundary-example-2
Content-Location: http:images/ietflogo2e.gif Content-Location: http:images/ietflogo2.gif
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgANX/ACkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nzc3t7e4 R0lGODlhGAGgANX/ACkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nzc3t7e4
SEhIyMjJSUlJycnKWlpa2trbW1tcDAwM7Ozv/eQnNzjHNzlGtrjGNjhFpae1pa SEhIyMjJSUlJycnKWlpa2trbW1tcDAwM7Ozv/eQnNzjHNzlGtrjGNjhFpae1pa
etc... etc...
--boundary-example-2-- --boundary-example-2--
--boundary-example-1 --boundary-example-1
Content-Location: Content-Location:
http://www.ietf.cnri.reston.va.us/more-info http://www.ietf.cnri.reston.va.us/even-more-info
Content-Type: multipart/related; boundary="boundary-example-3"; Content-Type: multipart/related; boundary="boundary-example-3";
type="text/html" type="text/html"
--boundary-example-3 --boundary-example-3
Content-Type: text/html;charset=US-ASCII Content-Type: text/html;charset="US-ASCII"
Content-ID: <4@foo@bar.net> Content-ID: <4@foo@bar.net>
The image reference below will be resolved with the image The image reference below will be resolved with the image
inside the current nested multipart/related below. inside the current nested multipart/related below.
<IMG SRC=images/ietflogo2d.gif" <IMG SRC=images/ietflogo2d.gif"
ALT="IETF logo with shadows"> ALT="IETF logo with shadows">
The image reference below cannot be resolved according to The image reference below cannot be resolved according to
this standard since references between parallel multipart/ this standard since references between parallel multipart/
related structures are not supported. related structures are not supported.
<IMG SRC=images/ietflogo2e.gif" <IMG SRC=images/ietflogo2e.gif"
ALT="IETF logo with transparent background"> ALT="IETF logo with transparent background">
--boundary-example-3
Content-Location: http:images/ietflogo2d.gif
Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64
--boundary-example-3 R0lGODlhGAGgANX/AMDAwCkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nz
Content-Location: http:images/ietflogo2d.gif c3t7e4SEhIyMjJSUlJycnKWlpa2trbW1tb29vcbGxs7OztbW1t7e3ufn5+/v
Content-Type: IMAGE/GIF etc...
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgANX/AMDAwCkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nz --boundary-example-3--
c3t7e4SEhIyMjJSUlJycnKWlpa2trbW1tb29vcbGxs7OztbW1t7e3ufn5+/v --boundary-example-1--
etc...
--boundary-example-3-- 10. Character encoding issues and end-of-line issues
--boundary-example-1--
10. Character encoding issues and end-of-line issues For the encoding of characters in HTML documents and other text
documents into a MIME-compatible octet stream, the following
mechanisms are relevant:
For the encoding of characters in HTML documents and other text - HTML [HTML2], [HTML-I18N] as an application of SGML [SGML] allows
documents into a MIME-compatible octet stream, the following mechanisms characters to be denoted by character entities as well as by
are relevant: numeric character references (e.g. "Latin small letter a with
acute accent" may be represented by "&aacute;" or "&#225;") in the
HTML markup.
- HTML [HTML2], [HTML-I18N] as an application of SGML [SGML] allows - HTML documents, in common with other documents of the MIME
characters to be denoted by character entities as well as by numeric Content-Type "text", can be represented in MIME using one of
character references (e.g. "Latin small letter a with acute accent" several character encodings. The MIME Content-Type "charset"
may be represented by "&aacute;" or "&#225;") in the HTML markup. parameter value indicates the particular encoding used. For the
exact meaning and use of the "charset" parameter, please see
[MIME2] chapter 4.
- HTML documents, in common with other documents of the MIME Note that the "charset" parameter refers only to the MIME
Content-Type "text", can be represented in MIME using one of several character encoding. For example, the string "&aacute;" can be sent
character encodings. The MIME Content-Type "charset" parameter value in MIME with "charset=US-ASCII", while the raw character "Latin
indicates the particular encoding used. For the exact meaning and small letter a with acute accent" cannot.
use of the "charset" parameter, please see [MIME2] chapter 4.
Note that the "charset" parameter refers only to the MIME character The above mechanisms are well defined and documented, and therefore
encoding. For example, the string "&aacute;" can be sent in MIME not further explained here. In sending a message, all the above
with "charset=US-ASCII", while the raw character "Latin small letter mentioned mechanisms MAY be used, and any mixture of them MAY occur
a with acute accent" cannot. when sending the document in MIME format. Receiving user agents
(together with any Web browser they may use to display the document)
MUST be capable of handling any combinations of these mechanisms.
The above mechanisms are well defined and documented, and therefore not Also note that:
further explained here. In sending a message, all the above mentioned
mechanisms MAY be used, and any mixture of them MAY occur when sending
the document in MIME format. Receiving user agents (together with any
Web browser they may use to display the document) MUST be capable of
handling any combinations of these mechanisms.
Also note that: - Any documents including HTML documents that contain octet values
outside the 7-bit range need a content-transfer-encoding applied
before transmission over certain transport protocols [MIME1,
chapter 5].
- Any documents including HTML documents that contain octet values - The MIME standard [MIME2] requires that e-mailed documents of
outside the 7-bit range need a content-transfer-encoding applied "Content-Type: Text/ MUST be in canonical form before a Content-
before transmission over certain transport protocols [MIME1, Transfer-Encoding is applied, i.e. that line breaks are encoded as
chapter 5]. CRLFs, not as bare CRs or bare LFs or something else. This is in
contrast to [HTTP] where section 3.6.1 allows other
representations of line breaks.
- The MIME standard [MIME2] requires that e-mailed documents of Note that this might cause problems with integrity checks based on
"Content-Type: Text/ MUST be in canonical form before a checksums, which might not be preserved when moving a document from
Content-Transfer-Encoding is applied, i.e. that line breaks are the HTTP to the MIME environment. If a document has to be converted
encoded as CRLFs, not as bare CRs or bare LFs or something else. in such a way that a checksum based message integrity check becomes
This is in contrast to [HTTP] where section 3.6.1 allows other invalid, then this integrity check header SHOULD be removed from the
representations of line breaks. document.
Note that this might cause problems with integrity checks based on Other sources of problems are Content-Encoding used in HTTP but not
checksums, which might not be preserved when moving a document from the allowed in MIME, and character sets that are not able to represent
HTTP to the MIME environment. If a document has to be converted in such line breaks as CRLF. A good overview of the differences between HTTP
a way that a checksum based message integrity check becomes invalid, and MIME with regards to Content-Type: "text" can be found in [HTTP],
then this integrity check header SHOULD be removed from the document. appendix C.
Other sources of problems are Content-Encoding used in HTTP but not Some transport mechanisms may specify a default "charset" parameter
allowed in MIME, and character sets that are not able to represent line if none is supplied [HTTP, MIME1]. Because the default differs for
breaks as CRLF. A good overview of the differences between HTTP and different mechanisms, when HTML is transferred through e-mail, the
MIME with regards to Content-Type: "text" can be found in [HTTP], charset parameter SHOULD be included, rather than relying on the
appendix C. default.
Some transport mechanisms may specify a default "charset" parameter if 11. Security Considerations
none is supplied [HTTP, MIME1]. Because the default differs for
different mechanisms, when HTML is transferred through e-mail, the
charset parameter SHOULD be included, rather than relying on the
default.
11. Security Considerations 11.1 Security considerations not related to caching
11.1 Security considerations not related to caching It is possible for a message sender to misrepresent the source of a
multipart/related body part to a message recipient by labeling it
with a Content-Location URI that references another resource.
Therefore, message recipients should only interpret Content-Location
URIs as labeling a body part for the resolution of references from
body parts in the same multipart/related message structure, and not
as the source of a resource, unless this can be verified by other
means.
It is possible for a message sender to misrepresent the source of a URIs, especially File URIs, if used without change in a message, may
multipart/related body part to a message recipient by labeling it with inadvertently reveal information that was not intended to be revealed
a Content-Location URI that references another resource. Therefore, outside a particular security context. Message senders should take
message recipients should only interpret Content-Location URIs as care when constructing messages containing the new header fields,
labeling a body part for the resolution of references from body parts defined in this standard, that they are not revealing information
in the same multipart/related message structure, and not as the source outside of any security contexts to which they belong.
of a resource, unless this can be verified by other means.
URIs, especially File URIs, if used without change in a message, may Some resource servers hide passwords and tickets (access tokens to
inadvertently reveal information that was not intended to be revealed information which should not be reveled to others) and other
outside a particular security context. Message senders should take care sensitive information in non-visible fields or URIs within a
when constructing messages containing the new header fields, defined in text/html resource. If such a text/html resource is forwarded in an
this standard, that they are not revealing information outside of any email message, this sensitive information may be inadvertently
security contexts to which they belong. revealed to others.
Some resource servers hide passwords and tickets (access tokens to Since HTML documents can either directly contain executable content
information which should not be reveled to others) and other sensitive (i.e., JavaScript) or indirectly reference executable content (The
information in non-visible fields or URIs within a text/html resource. "INSERT" specification, Java). It is exceedingly dangerous for a
If such a text/html resource is forwarded in an email message, this receiving User Agent to execute content received in a mail message
sensitive information may be inadvertently revealed to others. without careful attention to restrictions on the capabilities of that
executable content.
Since HTML documents can either directly contain executable content HTML-formatted messages can be used to investigate user behaviour,
(i.e., JavaScript) or indirectly reference executable content (The for example to break anonymity, in ways which invade the privacy of
"INSERT" specification, Java). It is exceedingly dangerous for a individuals. If you send a message with a inline link to an object
receiving User Agent to execute content received in a mail message which is not itself included in the message, the recipients mailer or
without careful attention to restrictions on the capabilities of that browser may request that object through HTTP. The HTTP transaction
executable content. (Why??? I do not understand this! What will then reveal who is reading the message. Example: A person who
resdtrictions of what capabilities???/jp) wants to find out who is behind an anonymous user identity, or from
which workstation a user is reading his mail, can do this by sending
a message with an inline link and then observe from where this link
is used to request the object.
HTML-formatted messages can be used to investigate user behaviour, for 11.2 Security considerations related to caching
example to break anonymity, in ways which invade the privacy of
individuals. If you send a message with a inline link to an object
which is not itself included in the message, the recipients mailer or
browser may request that object through HTTP. The HTTP transaction will
then reveal who is reading the message. Example: A person who wants to
find out who is behind an anonymous user identity, or from which
workstation a user is reading his mail, can do this by sending a
message with an inline link and then observe from where this link is
used to request the object.
11.2 Security considerations related to caching There is a well-known problem with the caching of directly retrieved
web resources. A resource retrieved from a cache may differ from that
re-retrieved from its source. This problem, also manifests itself
when a copy of a resource is delivered in a multipart/related
structure.
There is a well-known problem with the caching of directly retrieved When processing (rendering) a text/html body part in an MHTML
web resources. A resource retrieved from a cache may differ from that multipart/related structure, all URIs in that text/html body part
re-retrieved from its source. This problem, also manifests itself when which reference subsidiary resources within the same
a copy of a resource is delivered in a multipart/related structure. multipart/related structure SHALL be satisfied by those resources and
not by resources from any another local or remote source.
When processing (rendering) a text/html body part in an MHTML Therefore, if a sender wishes a recipient to always retrieve an URI
multipart/related structure, all URIs in that text/html body part which referenced resource from its source, an URI labeled copy of that
reference subsidiary resources within the same multipart/related resource MUST NOT be included in the same multipart/related
structure SHALL be satisfied by those resources and not by resources structure.
from any another local or remote source.
Therefore, if a sender wishes a recipient to always retrieve an URI In addition, since the source of a resource received in a
referenced resource from its source, an URI labeled copy of that multipart/related structure can be misrepresented (see 11.1 above),
resource MUST NOT be included in the same multipart/related structure. if a resource received in multipart/related structure is stored in a
cache, it MUST NOT be retrieved from that cache other than by a
reference contained in a body part of the same multipart/related
structure. Failure to honor this directive will allow a
multipart/related structure to be employed as a Trojan Horse. For
example, to inject bogus resources (i.e. a misrepresentation of a
competitor's Web site) into a recipient's generally accessible Web
cache.
In addition, since the source of a resource received in a 12. Differences as compared to the previous version of this proposed
multipart/related structure can be misrepresented (see 11.1 above), if standard in RFC 2110
a resource received in multipart/related structure is stored in a
cache, it MUST NOT be retrieved from that cache other than by a
reference contained in a body part of the same multipart/related
structure. Failure to honor this directive will allow a
multipart/related structure to be employed as a Trojan Horse. For
example, to inject bogus resources (i.e. a misrepresentation of a
competitor's Web site) into a recipient's generally accessible Web
cache.
12. Differences as compared to the previous version of this proposed The specification has been changed to show that the formats described
standard in RFC 2110 do not only apply to multipart MIME in email, but also to multipart
MIME transferred through other protocols such as HTTP or FTP.
The specification has been changed to show that the formats described In order to agree with [RELURL], Content-Location headers in
do not only apply to multipart MIME in email, but also to multipart multipart Content-Headings can now be used as a base to resolve
MIME transferred through other protocols such as HTTP or FTP. relative URIs in their component parts, but only if no base URI can
be derived from the component part itself. Base URIs in Content-
Location header fields in inner headings have precedence over base
URIs in outer multipart headings.
In order to agree with [RELURL], Content-Location headers in multipart The Content-Base header, which was present in RFC 2110, has been
Content-Headings can now be used as a base to resolve relative URIs in removed. A conservative implementor may choose to accept this header
their component parts, but only if no base URI can be derived from the in input for compatibility with implementations of RFC 2110, but MUST
component part itself. Base URIs in Content-Location header fields in never send any Content-Base header, since this header is not any more
inner headings have precedence over base URIs in outer multipart a part of this standard.
headings.
The Content-Base header, which was present in RFC 2110, has been A section 4.4.1 has been added, specifying how to handle the case of
removed. A conservative implementor may choose to accept this header in sending a body part whose URI does not agree with the correct URI
input for compatibility with implementations of RFC 2110, but MUST syntax.
never send any Content-Base header, since this header is not any more a
part of this standard.
A section 4.4.1 has been added, specifying how to handle the case of The handling of relative and absolute URIs for matching between body
sending a body part whose URI does not agree with the correct URI parts have been merged into a single description, by specifying that
syntax. relative URIs, which cannot be resolved otherwise, should be handled
as if they had been given the URL "thismessage:/".
The handling of relative and absolute URIs for matching between body 13. Acknowledgments
parts have been merged into a single description, by specifying that
relative URIs, which cannot be resolved otherwise, should be handled as
if they had been given the URL "this_message:/".
13. Copyright Harald T. Alvestrand, Richard Baker, Isaac Chan, Dave Crocker, Martin
J. Duerst, Lewis Geer, Roy Fielding, Ned Freed, Al Gilman, Paul
Hoffman, Andy Jacobs, Richard W. Jesmajian, Mark K. Joseph, Greg
Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed Levinson, Jay Levitt,
Albert Lunde, Larry Masinter, Keith Moore, Gavin Nicol, Martyn W.
Peck, Pete Resnick, Jon Smirl, Einar Stefferud, Jamie Zawinski, Steve
Zilles and several other people have helped us with preparing this
document. We alone take responsibility for any errors which may still
be in the document.
Copyright (C) The Internet Society 1998. All Rights Reserved. 14. References
This document and translations of it may be copied and furnished to [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax
others, and derivative works that comment on or otherwise explain it or Specifications: ABNF", RFC 2234, November 1997.
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing the
copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of developing
Internet standards in which case the procedures for copyrights defined
in the Internet Standards process must be followed, or as required to
translate it into languages other than English.
The limited permissions granted above are perpetual and will not be [CONDISP] Troost, R. and S. Dorner, "Communicating Presentation
revoked by the Internet Society or its successors or assigns. Information in Internet Messages: The Content-
Disposition Header", RFC 2183, August 1997.
This document and the information contained herein is provided on an [HOSTS] Braden, R., Ed., "Requirements for Internet Hosts --
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING Application and Support", STD 3, RFC 1123, October
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT 1989.
NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL
NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.
14. Acknowledgments [HTML-I18N] Yergeau, F., Nicol, G. Adams, G. and M. Duerst:
"Internationalization of the Hypertext Markup
Language", RFC 2070, January 1997.
Harald T. Alvestrand, Richard Baker, Isaac Chan, Dave Crocker, Martin [HTML2] Berners-Lee, T. and D. Connolly: "Hypertext Markup
J. Duerst, Lewis Geer, Roy Fielding, Ned Freed, Al Gilman, Paul Language - 2.0", RFC 1866, November 1995.
Hoffman, Andy Jacobs, Richard W. Jesmajian, Mark K. Joseph, Greg
Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed Levinson, Jay Levitt,
Albert Lunde, Larry Masinter, Keith Moore, Gavin Nicol, Martyn W. Peck,
Pete Resnick, Jon Smirl, Einar Stefferud, Jamie Zawinski, Steve Zilles
and several other people have helped us with preparing this document. I
alone take responsibility for any errors which may still be in the
document.
15. References [HTML3.2] Dave Raggett: HTML 3.2 Reference Specification, W3C
Recommendation, January 1997, at URL
http://www.w3.org/TR/REC-html32.html
Ref. Author, title [HTTP] Berners-Lee, T., Fielding, R. and H. Frystyk,
"Hypertext Transfer Protocol -- HTTP/1.0", RFC 1945,
May 1996.
|ABNF] D. Rocker, P. Overell: Augmented BNF for Syntax [IETF-TERMS] Bradner, S., "Key words for use in RFCs to Indicate
Specifications: ABNF, RFC 2234, November 1997. Requirements Levels", BCP 14, RFC 2119, March 1997.
[CONDISP] R. Troost, S. Dorner: "Communicating Presentation [INFO] J. Palme: Sending HTML in MIME, an informational
Information in Internet Messages: The supplement to the RFC: MIME Encapsulation of
Content-Disposition Header", RFC 2183, August 1997. Aggregate Documents, such as HTML (MHTML), Work in
Progress.
[HOSTS] R. Braden (editor): "Requirements for Internet Hosts -- [MD5] Rivest, R., "The MD5 Message-Digest Algorithm", RFC
Application and Support", STD-3, RFC 1123, October 1989. 1321, April 1992.
[HTML-I18N] F. Yergeau, G. Nicol, G. Adams, & M. Duerst: [MIDCID] Levinson, E., "Content-ID and Message-ID Uniform
"Internationalization of the Hypertext Markup Language". Resource Locators", RFC 2387, August 1998.
RFC 2070, January 1997.
[HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language [MIME1] Freed, N. and N. Borenstein, "Multipurpose Internet
- 2.0", RFC 1866, November 1995. Mail Extensions (MIME) Part One: Format of Internet
Message Bodies", RFC 2045, December 1996.
[HTML3.2] Dave Raggett: HTML 3.2 Reference Specification, W3C [MIME2] Freed, N. and N. Borenstein, "Multipurpose Internet
Recommendation, January 1997, at URL Mail Extensions (MIME) Part Two: Media Types", RFC
http://www.w3.org/TR/REC-html32.html 2046, December 1996.
[HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext [MIME3] Moore, K., "MIME (Multipurpose Internet Mail
Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996. Extensions) Part Three: Message Header Extensions for
Non-ASCII Text", RFC 2047, December 1996.
[IETF-TERMS] S. Bradner: Key words for use in RFCs to Indicate [MIME4] Freed, N., Klensin, J. and J. Postel, "Multipurpose
Requirements Levels. RFC 2119, March 1997. Internet Mail Extensions (MIME) Part Four:
Registration Procedures", RFC 2048, January 1997.
[INFO] J. Palme: Sending HTML in MIME, an informational [MIME5] Freed, N. and N. Borenstein, "Multipurpose Internet
supplement to the RFC: MIME Encapsulation of Aggregate Mail Extensions (MIME) Part Five: Conformance
Documents, such as HTML (MHTML), work in progress within Criteria and Examples", RFC 2049, November 1996.
IETF in April 1998.
[MD5] R. Rivest: "The MD5 Message-Digest Algorithm", RFC 1321, [NEWS] Horton, M. and R. Adams: "Standard for interchange of
April 1992. USENET messages", RFC 1036, December 1987.
[MIDCID] E. Levinson: Content-ID and Message-ID Uniform Resource [PDF] Tim Bienz and Richar Cohn: "Portable Document Format
Locators", draft-ietf-mhtml-cid-v2-00.txt, July 1997. Reference Manual", Addison-Wesley, Reading, MA, USA,
1993, ISBN 0-201-62628-4.
[MIME1] N. Freed, N. Borenstein, "Multipurpose Internet Mail [REL] Levinson, E., "The MIME Multipart/Related Content-
Extensions (MIME) Part One: Format of Internet Message Type", RFC 2389, August 1998.
Bodies", RFC 2045, December 1996.
.
[MIME2] N. Freed, N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
December 1996.
[MIME3] K. Moore, "MIME (Multipurpose Internet Mail Extensions) [RELURL] Fielding, R., "Relative Uniform Resource Locators",
Part Three: Message Header Extensions for Non-ASCII RFC 1808, June 1995.
Text", RFC 2047, December 1996.
[MIME4] N. Freed, J. Klensin, J. Postel, "Multipurpose Internet [RFC822] Crocker, D., "Standard for the format of ARPA
Mail Extensions (MIME) Part Four: Registration Internet text messages." STD 11, RFC 822, August
Procedures", RFC 2048, January 1997. 1982.
[MIME5] "Multipurpose Internet Mail Extensions (MIME) Part Five: [SGML] ISO 8879. Information Processing -- Text and Office -
Conformance Criteria and Examples", RFC 2049, December Standard Generalized Markup Language (SGML), 1986.
1996. <URL:http://www.iso.ch/cate/d16387.html>
[NEWS] M.R. Horton, R. Adams: "Standard for interchange of [SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10,
USENET messages", RFC 1036, December 1987. RFC 821, August 1982.
[PDF] Tim Bienz and Richar Cohn: "Portable Document Format [URL] Berners-Lee, T., Masinter, L. and M. McCahill,
Reference Manual", Addison-Wesley, Reading, MA, USA, "Uniform Resource Locators (URL)", RFC 1738, December
1993, ISBN 0-201-62628-4. 1994.
[REL] Edward Levinson: "The MIME [URLBODY] Freed, N. and K. Moore, "Definition of the URL MIME
Multipart/Related"multipart/related" Content-Type", External-Body Access-Type", RFC 2017, October 1996.
draft-ietf-mhtml-re-v2-00.txt, September 1997.
[RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC [VRML] Gavin Bell, Anthony Parisi, Mark Pesce: "Virtual
1808, June 1995. Reality Modeling Language (VRML) Version 1.0 Language
Specification." May 1995,
http://www.vrml.org/Specifications/.
[RFC822] D. Crocker: "Standard for the format of ARPA Internet [XML] Extensible Markup Language, published by the World
text messages." STD 11, RFC 822, August 1982. Wide Web Consortium, URL http://www.w3.org/XML/
[SGML] ISO 8879. Information Processing -- Text and Office - 15. Authors' Addresses
Standard Generalized Markup Language (SGML), 1986.
<URL:http://www.iso.ch/cate/d16387.html>
[SMTP] J. Postel: "Simple Mail Transfer Protocol", STD 10, RFC For contacting the editors, preferably write to Jacob Palme.
821, August 1982.
[URL] T. Berners-Lee, L. Masinter, M. McCahill: "Uniform Jacob Palme
Resource Locators (URL)", RFC 1738, December 1994. Stockholm University and KTH
Electrum 230
S-164 40 Kista, Sweden
[URLBODY] N. Freed and Keith Moore: "Definition of the URL MIME Phone: +46-8-16 16 67
External-Body Access-Type", RFC 2017, October 1996. Fax: +46-8-783 08 29
EMail: jpalme@dsv.su.se
[VRML] Gavin Bell, Anthony Parisi, Mark Pesce: "Virtual Reality Alex Hopmann
Modeling Language (VRML) Version 1.0 Language Microsoft Corporation
Specification." May 1995, One Microsoft Way
http://www.vrml.org/Specifications/. Redmond WA 98052
[XML] Extensible Markup Language, published by the World Wide Phone: +1-425-703-8238
Web Consortium, URL http://www.w3.org/XML/ EMail: alexhop@microsoft.com
16. Author's Addresses Nick Shelness
Lotus Development Corporation
55 Cambridge Parkway
Cambridge MA 02142-1295
For contacting the editors, preferably write to Jacob Palme. EMail: Shelness@lotus.com
Jacob Palme Phone: +46-8-16 16 67 Working group chairman:
Stockholm University and KTH Fax: +46-8-783 08 29
Electrum 230 Email: jpalme@dsv.su.se
S-164 40 Kista, Sweden
Alex Hopmann Email: alexhop@microsoft.com Einar Stefferud
Microsoft Corporation Phone: +1-425-703-8238 EMail: stef@nma.com
One Microsoft Way
Redmond WA 98052
Nick Shelness Email: Shelness@lotus.com 16. Full Copyright Statement
Lotus Development Corporation
55 Cambridge Parkway
Cambridge MA 02142-1295
Working group chairman: Copyright (C) The Internet Society (1999). All Rights Reserved.
Einar Stefferud Email: stef@nma.com This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
 End of changes. 270 change blocks. 
1105 lines changed or deleted 987 lines changed or added
