draft-ietf-mhtml-rev-03.txt   draft-ietf-mhtml-rev-04.txt 
Network Working Group Jacob Palme Network Working Group Jacob Palme
Internet Draft Stockholm University/KTH Internet Draft Stockholm University/KTH
draft-ietf-mhtml-rev-03.txt Alexander Hopmann draft-ietf-mhtml-rev-04.txt Alexander Hopmann
IETF status to be: Proposed standard Microsoft Corporation IETF status to be: Proposed standard Microsoft Corporation
Revises: RFC 2110 Revises: RFC 2110
Expires: May 1998 November 1997 Expires: May 1998 November 1997
MIME Encapsulation of Aggregate Documents, such as HTML (MHTML) MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
Status of this Document Status of this Document
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and documents of the Internet Engineering Task Force (IETF), its areas, and
skipping to change at line 30 skipping to change at line 30
or to cite them other than as ``work in progress.'' or to cite them other than as ``work in progress.''
To learn the current status of any Internet-Draft, please check the To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast). ftp.isi.edu (US West Coast).
Abstract Abstract
Although HTML [RFC 1866] was designed within the context of MIME, HTML [RFC 1866] defines a powerful means of specifying multimedia
more than the specification of HTML as defined in RFC 1866 is needed documents. These multimedia documents consist of a text/html root
for two electronic mail user agents to be able to interoperate using resource (object) and other subsidiary resources (image, video clip,
HTML as a document format. These issues include the naming of applet, etc. objects) referenced by Uniform Resource Identifiers (URIs)
objects that are normally referred to by URIs, and the means of within the text/html root resource. When an HTML multimedia document is
aggregating objects that go together. This document describes a set retrieved by a browser, each of these component resources is
of guidelines that will allow conforming mail user agents to be able individually retrieved in real time from a location, and using a
to send, deliver and display these objects, such as HTML objects, protocol, specified by each URI.
that can contain links represented by URIs. In order to be able to
handle inter-linked objects, the document uses the MIME type
''multipart/related'' and specifies the MIME content-headers
''Content-Location'' and ''Content-Base''. The guidelines in this
document can also be used when sending aggregate HTML objects in
other forms than e-mail, such as through HTTP or FTP.
Differences compared to the previous version of this proposed In order to transfer a complete HTML multimedia document in a single e-
standard, published in RFC 2110, are summarized in chapter 13. mail message, it is necessary to:- a) aggregate a text/html root
resource and all of the subsidiary resources it references into a
single composite message structure, and b) define a means by which URIs
in the text/html root can reference subsidiary resources within that
composite message structure.
This document does both. It a) defines the use of a MIME
multipart/related structure to aggregate a text/html root resource and
the subsidiary resources it references, and b) specifies two MIME
content-headers (Content-Base and Content-Location) that allow URIs in
a multipart/related text/html root body part to reference subsidiary
resources in other body parts of the same multipart/related structure.
While initially designed to support e-mail transfer of complete multi-
resource HTML multimedia documents, these conventions can also be
employed by other transfer protocols such as HTTP and FTP to retrieve a
complete multi-resource HTML multimedia document in a single transfer
or for storage and archiving of complete HTML-documents.
Differences between this and a previous version of this standard, which
was published as RFC 2110, are summarized in chapter 13.
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
2.2 Other terminology 2.2 Other terminology
3. Overview 3. Overview
4. The Content-Location and Content-Base MIME Content Headers 4. The Content-Location and Content-Base MIME Content Headers
4.1 MIME content headers 4.1 MIME content headers
4.2 The Content-Location Header 4.2 The Content-Location Header
4.3 The Content-Base header 4.3 The Content-Base header
4.4 Encoding of URIs in MIME headers 4.4 Encoding of URIs in MIME headers
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
6. Sending documents without linked objects 6. Sending documents without linked objects
7. Use of the Content-Type "multipart/related" 7. Use of the Content-Type "multipart/related"
8. Usage of Links to Other Body Parts 8. Usage of Links to Other Body Parts
8.1 General principle 8.1 General principle
8.2 Resolution of hyperlinks in text/html body parts 8.2 Resolution of URIs in text/html body parts
8.3 Use of the Content-ID header and CID URLs 8.3 Use of the Content-ID header and CID URLs
8.4 Conformance requirement on receipt 8.4 Conformance requirement on receipt
9. Examples 9. Examples
9.1 Example of a HTML body without included linked objects 9.1 Example of a HTML body without included linked objects
9.2 Example with absolute URIs to an embedded GIF picture 9.2 Example with an absolute URI to an embedded GIF picture
9.3 Example with relative URIs to an embedded GIF picture 9.3 Example with a relative URI to an embedded GIF picture
9.4 Example with relative URIs and no BASE available 9.4 Example with a relative URI and no BASE available
9.5 Example using a BASE on the Multipart 9.5 Example using a BASE on the Multipart
9.6 Example using CID URL and Content-ID header to an embedded GIF 9.6 Example using CID URL and Content-ID header to an embedded GIF
picture picture
10. Content-Disposition header 10. Content-Disposition header
11. Character encoding issues and end-of-line issues 11. Character encoding issues and end-of-line issues
12. Security Considerations 12. Security Considerations
13. Differences as compared to the previous version of this proposed 13. Differences as compared to the previous version of this proposed
standard in RFC 2110 standard in RFC 2110
14. Copyright 14. Copyright
15. Acknowledgments 15. Acknowledgments
skipping to change at line 140 skipping to change at line 154
Information about the IETF work in developing this standard may Information about the IETF work in developing this standard may
also be available at URL: also be available at URL:
http://www.dsv.su.se/~jpalme/ietf/mhtml.html http://www.dsv.su.se/~jpalme/ietf/mhtml.html
A collection of test messages is available at A collection of test messages is available at
http://www.dsv.su.se/~jpalme/mimetest/MHTML-test-messages.html http://www.dsv.su.se/~jpalme/mimetest/MHTML-test-messages.html
1. Introduction 1. Introduction
There are a number of document formats, Hypertext Markup Language There are a number of document formats (Hypertext Markup Language
[HTML2], Portable Document format [PDF] and Virtual Reality Markup [HTML2], Portable Document format [PDF] and Virtual Reality Markup
Language [VRML] for example, which provide links using URIs for their Language [VRML]) that specify documents consisting of a root resource
resolution. There is an obvious need to be able to send documents in and a number of distinct subsidiary resources referenced by URIs within
these formats in email [SMTP], [RFC822]. This document gives additional that root resource. There is an obvious need to be able to send such
specifications on how to send such documents in MIME-formatted [MIME1 multi-resource documents in e-mail [SMTP], [RFC822] messages.
to MIME5] messages. This version of this standard was based on full
consideration only of the needs for objects with links in the text/html
media type (as defined in [HTML2]), but the standard may still be
applicable also to other formats for sets of interlinked objects,
linked by URIs. There is no conformance requirement that
implementations claiming conformance to this standard are able to
handle URI-s in other document formats than HTML.
URIs in documents in HTML and other similar formats reference other The standard defined in this document specifies how to aggregate such
objects and resources, either embedded or directly accessible through multi-resource documents in MIME-formatted [MIME1 to MIME5] messages
hypertext links. When mailing such a document, it is often desirable to for precisely this purpose.
also mail all of the additional resources that are referenced in it;
those elements are necessary for the complete interpretation of the
primary object. Also with other protocols such as HTTP or FTP, it can
sometimes be desirable to send several documents in one aggregate
document.
Since the formats specified in this standard specifies a way of saving While this specification was developed to satisfy the specific
a complete web page with all in-line objects copied into one single aggregation requirements of multi-resource HTML documents, it may also
file, the formats might also be useful for archiving of complete web be applicable to other multi-resource document representations linked
page as they looked at a particular moment of time. by URIs. While this is the case, there is no requirement that
implementations claiming conformance to this standard be able to handle
any URI linked document representations other than those whose root is
HTML.
An alternative way for sending an HTML document or other object This aggregation into a single message of a root resource and the
containing URIs in email is to only send the URI, and let the recipient subsidiary resources it references may also be applicable to other
look up the document using HTTP. That method is described in [URLBODY] protocols such as HTTP or FTP, or to the archiving of complete web
and is not described in this document. pages as they appeared at a particular point in time.
An informational RFC will be published as a supplement to this An informational RFC will be published as a supplement to this
standard. The informational RFC will discuss implementation methods and standard. The informational RFC will discuss implementation methods and
some implementation problems. Implementors are recommended to read this some implementation problems. Implementors are strongly recommended to
informational RFC when developing implementations of the MHTML read this informational RFC when developing implementations of this
standard. This informational RFC is, when this RFC is published, still standard. You can find it through URL
in IETF draft status. http://www.dsv.su.se/~jpalme/ietf/mhtml.html.
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [IETF-TERMS]. document are to be interpreted as described in [IETF-TERMS].
An implementation is not compliant if it fails to satisfy one or more An implementation is not compliant if it fails to satisfy one or more
skipping to change at line 268 skipping to change at line 273
URL See RFC 1738 [URL]. URL See RFC 1738 [URL].
URL, relative See Relative Uniform Resource Locators [RELURL]. URL, relative See Relative Uniform Resource Locators [RELURL].
VRML See Virtual Reality Markup Language [VRML]. VRML See Virtual Reality Markup Language [VRML].
3. Overview 3. Overview
An aggregate document is a MIME-encoded message that contains a root An aggregate document is a MIME-encoded message that contains a root
document as well as other data that is required in order to represent resource (object) as well as other resources that are required to
that document (inline pictures, style sheets, applets, etc.). Aggregate represent that document (inline pictures, style sheets, applets, etc.).
documents can also include additional elements that are linked to the It is important to keep in mind that aggregate documents need to
first object. It is important to keep in mind the differing needs of satisfy the differing needs of several audiences. This standard can
several audiences. Mail sending agents might send aggregate documents also be used to send sets of linked documents which are not shown
as an encoding of normal day-to-day electronic mail. Mail sending simultaneously, and where the user can use links to move between them.
agents might also send aggregate documents when a user wishes to mail a
particular document from the web to someone else. Finally mail sending
agents might send aggregate documents as automatic responders,
providing access to WWW resources for non-IP connected clients. Also
with other protocols such as HTTP or FTP, there may sometimes be a need
to send aggregate documents in MIME multipart format.
Receiving agents also have several differing needs. Some receiving Mail sending agents might send aggregate documents as an encoding of
agents might be able to receive an aggregate document and display it normal day-to-day electronic mail. Mail sending agents might also send
just as any other text content type would be displayed. Others might aggregate documents when a user wishes to mail a particular document
have to pass this aggregate document to a browsing program, and from the web to someone else. Finally mail sending agents might send
provisions need to be made to make this possible. aggregate documents as automatic responders, providing access to WWW
resources for non-IP connected clients. Also with other protocols such
as HTTP or FTP, there may sometimes be a need to retrieve aggregate
documents. Receiving agents also have several differing needs. Some
receiving agents might be able to receive an aggregate document and
display it just as any other text content type would be displayed.
Others might have to pass this aggregate document to a browsing
program, and provisions need to be made to make this possible.
Finally several other constraints on the problem arise. It is important Finally several other constraints on the problem arise. It is important
that it be possible for a document to be signed and for it to be able that it be possible for a document to be signed and for it to be
to be transmitted to a client and displayed with a minimum risk of transmitted and displayed without breaking the message integrity (MIC)
breaking the message integrity (MIC) check that is part of the check that is part of the signature.
signature.
4. The Content-Location and Content-Base MIME Content Headers 4. The Content-Location and Content-Base MIME Content Headers
4.1 MIME content headers 4.1 MIME content headers
In order to resolve URI references to other body parts, two MIME In order to resolve URI references to resources in other body parts,
content headers are defined, Content-Location and Content-Base. Both two MIME content headers are defined, Content-Location and
these headers can occur in any message or content heading, and will Content-Base. Both of these headers can occur in any message or content
then be valid within this heading and for its immediate content. heading, and will then be valid within this heading and over its
immediate content. If they occur in multipart or message headings, they
These two headers are valid for the content heading or message heading apply to its body parts only in that they can be used to derive a base
where they occur and its text. If they occur in multipart headings, for relative URIs within those body parts, and only if no such base is
they apply to its body parts only in that they can be used to derive a provided in the body part itself or in multipart or message headings
base for relative URIs in the body parts, and only if no such base is closer in scope to the body part.
provided in the body part itself or in headings closer to the body.
These two headers may occur on any message or content heading, but These two headers may occur on any message or content heading, but
their usage for handling hyperlinks between body parts in a message their usage for handling hyperlinks between body parts in a message
SHOULD only occur inside the same "multipart/related". SHOULD only occur between body parts within the same multipart/related
structure.
In practice, at present only those URIs which are URLs are used, but it At present only those URIs which are URLs are affected by these
is anticipated that other forms of URIs will in the future be used. headers, but it is anticipated that in future other forms of URIs may be
affected.
The syntax for these headers is, using the syntax definition tools from The syntax for these headers is, using the syntax definition tools from
[RFC822]: [RFC822]:
content-location = "Content-Location:" content-location = "Content-Location:"
( absoluteURI | relativeURI ) ( absoluteURI | relativeURI )
content-base = "Content-Base:" absoluteURI content-base = "Content-Base:" absoluteURI
where URI is restricted to the syntax for URLs as defined in Unform where URI is restricted to the syntax for URLs as defined in Uniform
Resource Locators [URL] until IETF specifies other kinds of URIs. Resource Locators [URL] until IETF specifies other kinds of URIs.
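
The syntax above can be illustrated with a minimal Python sketch. It is not part of either draft; the function names are hypothetical, and the test "has a scheme" is used here only as a simple stand-in for the absoluteURI production.

```python
from urllib.parse import urlsplit


def is_absolute_uri(value: str) -> bool:
    """Treat a value as an absoluteURI if it carries a scheme (assumption)."""
    return bool(urlsplit(value.strip()).scheme)


def check_header(name: str, value: str) -> bool:
    """Check a header value against the grammar sketched above.

    Content-Base must be an absolute URI; Content-Location may be either
    absolute or relative, so any non-empty value is accepted here.
    """
    lowered = name.lower()
    if lowered == "content-base":
        return is_absolute_uri(value)
    if lowered == "content-location":
        return bool(value.strip())
    raise ValueError("not a header covered by this sketch: " + name)
```

A stricter implementation would also validate the URI characters themselves; this sketch only separates the absolute and relative cases.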
4.2 The Content-Location Header 4.2 The Content-Location Header
The Content-Location header specifies the URI that corresponds to the A Content-Location header specifies an URI that labels the content of a
content of the body part in whose heading the header is placed. Its body part in whose heading it is placed. Its value CAN be an absolute
value CAN be an absolute or relative URI. Any URI or URL scheme may be or a relative URI. Any URI or URL scheme may be used, but use of
used, but use of non-standardized URI or URL schemes might entail some non-standardized URI or URL schemes might entail some risk that
risk that recipients cannot handle them correctly. recipients cannot handle them correctly.
The Content-Location header can be used to indicate that the data sent The Content-Location header can be used to indicate that the data
under this heading is also retrievable, in identical format, through sent under this heading is also retrievable, in identical format,
normal use of this URI. If used for this purpose, it must contain an through normal use of this URI. If used for this purpose, it must
absolute URI or be resolvable, through a Content-Base header, into an contain an absolute URI or be resolvable, through a Content-Base
absolute URI. In this case, the information sent in the message can be header, into an absolute URI. In this case, the information sent in the
seen as a cached version of the original data. message can be seen as a cached version of the original data.
The URI in the Content-Location header may, but need not refer to an An URI in a Content-Location header need not refer to a resource which
object which is actually available globally for retrieval using this is globally available for retrieval using this URI (after resolution of
URI (after resolution of relative URIs). However, URI-s in relative URIs). However, URI-s in Content-Location headers (if
Content-Location headers (if absolute, or resolvable to absolute URIs) absolute, or resolvable to absolute URIs) SHOULD still be globally
SHOULD still be globally unique. unique.
The header can also be used for data which is not available to some or A Content-Location header can also be used to label a resource which is
all recipients of the message, for example if the header refers to an not retrievable by some or all recipients of a message. For example a
object which is only retrievable using this URI in a restricted domain, Content-Location header may label an object which is only retrievable
such as within a company-internal web space. The header can even using this URI in a restricted domain, such as within a
contain a fictious URI and need in that case not be globally unique. company-internal web space. A Content-Location header can even contain
a fictitious URI. Such an URI need not be globally unique.
There MUST only be a single Content-Location header in each message or There MUST only be a single Content-Location header in each message or
content-heading, and its value is a single URI. Note however, that both content heading, whose value is a single URI. Note, however, that both
one Content-Location and one Content-ID or Message-ID header are one Content-Location and one Message-ID or Content-ID header are
allowed. In such a case, these will indicate two different, equally allowed in a message or content heading. In such a case, these will
valid references for this body part, and any of them may be used in indicate two different, equally valid references to a body part, and
other body parts within one "multipart/related" to refer to this body either of them may be used to refer to this body part.
part.
Example: Example of a multipart/related structure containing body parts with
both Content-Location and Content-ID labels:
Content-Type: "multipart/related"; boundary="boundary-example-1"; Content-Type: "multipart/related"; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example-1 --boundary-example
Part 1:
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset=US-ASCII
... ... <IMG SRC="fiction1/fiction2"> ... ... ... ... <IMG SRC="fiction1/fiction2"> ... ...
... ... <IMG SRC="cid:97116092811xyz*foo.bar.net"> ... ...
--boundary-example-1 --boundary-example
Part 2: Content-Type: image/gif
Content-Type: text/html; charset=US-ASCII Content-ID: <97116092511xyz*foo.bar.net>
Content-Location: fiction1/fiction2 Content-Location: fiction1/fiction2
--boundary-example-1-- --boundary-example
Content-Type: image/gif
Content-ID: <97116092811xyz*foo.bar.net>
Content-Location: fiction1/fiction3
--boundary-example--
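
As an illustration only, a receiving agent could index the body parts of such a structure by both labels, so that either the Content-Location value or the CID URL finds the part. The sketch below uses Python's standard email package. The message text mirrors the rev-04 example, but the Content-Type value is written without quotes, the body data ("GIF-DATA-1", "GIF-DATA-2") is a placeholder, and the name index_parts is hypothetical. The conversion from Content-ID to CID URL is simplified and ignores any URL encoding of the identifier.

```python
import email

RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/related; boundary="boundary-example";'
    ' type="text/html"\n'
    "\n"
    "--boundary-example\n"
    "Content-Type: text/html; charset=US-ASCII\n"
    "\n"
    '<IMG SRC="fiction1/fiction2">\n'
    '<IMG SRC="cid:97116092811xyz*foo.bar.net">\n'
    "--boundary-example\n"
    "Content-Type: image/gif\n"
    "Content-ID: <97116092511xyz*foo.bar.net>\n"
    "Content-Location: fiction1/fiction2\n"
    "\n"
    "GIF-DATA-1\n"
    "--boundary-example\n"
    "Content-Type: image/gif\n"
    "Content-ID: <97116092811xyz*foo.bar.net>\n"
    "Content-Location: fiction1/fiction3\n"
    "\n"
    "GIF-DATA-2\n"
    "--boundary-example--\n"
)


def index_parts(msg):
    """Map each Content-Location value and each CID URL to its body part."""
    index = {}
    for part in msg.get_payload():
        location = part.get("Content-Location")
        if location:
            index[location.strip()] = part
        content_id = part.get("Content-ID")
        if content_id:
            # Simplified: drop the angle brackets and prefix the cid: scheme.
            index["cid:" + content_id.strip().strip("<>")] = part
    return index


msg = email.message_from_string(RAW)
parts = index_parts(msg)
```

With this index, the first IMG reference is found through its Content-Location label and the second through its Content-ID label.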
4.3 The Content-Base header 4.3 The Content-Base header
The Content-Base gives a base for relative URIs occurring in other A Content-Base header provides a base for resolving relative URIs
fields in the same content heading and in its content, if the text is a occurring in other header fields in the same content heading, relative
HTML document which does not have any BASE element in its HTML code. URIs occurring in other header fields nested within its content that
Its value MUST be an absolute URI. lack their own base, or relative URIs occurring in body parts nested
within its content that do not contain an embedded base specification -
for example, an HTML BASE element. The value of a Content-Base header
MUST be an absolute URI.
Example showing which Content-Base is valid where: Example showing which Content-Base is valid where:
Content-Type: "multipart/related"; boundary="boundary-example-1"; Content-Type: "multipart/related"; boundary="boundary-example";
type="text/html"; start=<foo2*foo3@bar2.net> type="text/html"; start=<foo2*foo3@bar2.net>
; A Content-Base header is allowed here, and can be used ; A Content-Base header is allowed here, and can be used
; for resolution of relative URL-s in Part 1 and Part 2, ; for resolution of relative URL-s in Part 1 and Part 2,
; if these did not have any absolute base of their own. ; if these did not have any absolute base of their own.
; However, both part 1 and part 2 below have an absolute ; However, both part 1 and part 2 below have an absolute
; base, in part 1 through an absolute Content-Location header, ; base, in part 1 through an absolute Content-Location header,
; in part 2 through a Content-Base header, and thus a Content- ; in part 2 through a Content-Base header, and thus a Content-
; base up here would not be used for resoultion of relative ; base up here would not be used for resolution of relative
; URLs within the body parts 1 and 2. ; URLs within the body parts 1 and 2.
--boundary-example-1 --boundary-example
Part 1: Part 1:
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset=US-ASCII
Content-ID: <foo2*foo3@bar2.net> Content-ID: <foo2*foo3@bar2.net>
Content-Location: http://www.ietf.cnri.reston.va.us/foo1.bar1 Content-Location: http://www.ietf.cnri.reston.va.us/foo1.bar1
; Since this Content-Location contains an absolute URL, it ; Since this Content-Location contains an absolute URL, it
; does not need to be resolved using any Content-Base header. ; does not need to be resolved using any Content-Base header.
; A combination of a Content-Location with a relative URL ; A combination of a Content-Location with a relative URL
; and a Content-Base with an absolute URL would also be valid, ; and a Content-Base with an absolute URL would also be valid,
; as well as only a Content-Location with a relative URL ; as well as only a Content-Location with a relative URL
; and resolved through the Content-Base in the surrounding ; and resolved through the Content-Base in the surrounding
; multipart heading. ; multipart heading.
<FRAME NAME=topwindow src="/frames/foo2.bar2"> <FRAME NAME=topwindow src="/frames/foo2.bar2">
--boundary-example-1 --boundary-example
Part 2: Part 2:
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset=US-ASCII
Content-ID: <foo4*foo5@bar2.net> Content-ID: <foo4*foo5@bar2.net>
Content-Location: foo2.bar2 ; The Content-Base below applies to Content-Location: foo2.bar2 ; The Content-Base below applies to
; this relative URI ; this relative URI
Content-Base: http://www.ietf.cnri.reston.va.us/frames/ Content-Base: http://www.ietf.cnri.reston.va.us/frames/
<A HREF="http://www.ietf.cnri.reston.va.us/foo1.bar1"> <A HREF="http://www.ietf.cnri.reston.va.us/foo1.bar1">
To top window </A> To top window </A>
--boundary-example-1-- --boundary-example--
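
The comments in the example above can be followed through in a short Python sketch. It is an illustration under stated assumptions, not the drafts' algorithm: headings are modelled as plain dictionaries, the name effective_base is hypothetical, an embedded HTML BASE element is not considered, and within one heading a Content-Base is assumed to be consulted before an absolute Content-Location.

```python
from urllib.parse import urljoin, urlsplit


def effective_base(part_headers, enclosing_headers=()):
    """Return the base URI for a body part, nearest heading first.

    part_headers is the heading of the body part itself;
    enclosing_headers lists the surrounding multipart or message
    headings from innermost to outermost.
    """
    for headers in (part_headers, *enclosing_headers):
        base = headers.get("Content-Base")
        if base:
            return base
        location = headers.get("Content-Location")
        if location and urlsplit(location).scheme:
            return location  # an absolute Content-Location can act as base
    return None


part1 = {"Content-Location": "http://www.ietf.cnri.reston.va.us/foo1.bar1"}
part2 = {
    "Content-Location": "foo2.bar2",
    "Content-Base": "http://www.ietf.cnri.reston.va.us/frames/",
}

# The FRAME reference in part 1, resolved against part 1's own base.
frame_target = urljoin(effective_base(part1), "/frames/foo2.bar2")
# The label of part 2, resolved against part 2's Content-Base.
part2_label = urljoin(effective_base(part2), part2["Content-Location"])
```

Both values resolve to the same absolute URI, which is how the FRAME in part 1 comes to refer to part 2. A Content-Base on the multipart heading would only be reached for a part that supplies no base of its own.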
4.4 Encoding of URIs in MIME headers 4.4 Encoding of URIs in MIME headers
4.4.1 Handling of URIs containing inappropriate characters 4.4.1 Handling of URIs containing inappropriate characters
Some documents may contain URIs with characters that are inappropriate Some documents may contain URIs with characters that are inappropriate
for an RFC 822 header, either because the URI itself has an incorrect for an RFC 822 header, either because the URI itself has an incorrect
syntax according to [URL] or the URI syntax standard has been changed syntax according to [URL] or the URI syntax standard has been changed
to allow characters not previously allowed in MIME headers. These URIs to allow characters not previously allowed in MIME headers. These URIs
cannot be sent directly in a message header. There are two approaches cannot be sent directly in a message header. There are two approaches
that can be taken when encountering such a URI as the text to be placed that can be taken when encountering such a URI as the text to be placed
in a Content-Location or Content-Base header: in a Content-Location or Content-Base header:
(a) In some situations, an implementation might be able to replace the (a) In some situations, an implementation might be able to replace the
URI with one that can be sent directly. This might be accomplished, URI with one that can be sent directly. This might be accomplished,
for example, by using the encoding method of [URL] to replace for example, by using the encoding method of [URL] to replace
inappropriate characters within the URI with ones encoded using the inappropriate characters within the URI with ones encoded using the
"%nn" encoding. This replacement MUST in that case be done both in "%nn" encoding. This replacement MUST in that case be done both in
the header and in the HTML text which has a hyperlink which is to the header and in the text/html body part that contains the URI
match the header. Since the change is done in both places, a that references the header. Since the change is done in both places, a
receiving agent need not decode it, and MUST NOT decode [URL]- receiving agent need not decode it, and MUST NOT decode the [URL]-
encoding before matching hyperlinks to body parts. encoding before matching URIs to body parts.
(b) The URI might be encoded using the method described in [MIME3]. (b) The URI might be encoded using the method described in [MIME3].
This replacement MUST only be done in the header, not in the HTML This replacement MUST only be done in the header, not in the HTML
text. Receiving clients must decode the [MIME3] encoding in the text. Receiving clients must decode the [MIME3] encoding in the
heading before comparing hyperlinks in body text to URIs in heading before comparing URIs in body text to URIs in
Content-Location headers. Content-Location headers.
With method (b), the charset parameter value "US-ASCII" SHOULD be used With method (b), the charset parameter value "US-ASCII" SHOULD be used
if the URI contains no octets outside of the 7-bit range. If such if the URI contains no octets outside of the 7-bit range. If such
octets are present, the correct charset parameter value (derived e.g. octets are present, the correct charset parameter value (derived e.g.
from information about the HTML document the URI was found in) SHOULD from information about the HTML document the URI was found in) SHOULD
be used. If this cannot be safely established, the value "UNKNOWN-8BIT" be used. If this cannot be safely established, the value "UNKNOWN-8BIT"
[RFC 1428] MUST be used. [RFC 1428] MUST be used.
Note that for the MHTML processing of matching URIs in body text to URI Note, that for the matching of URIs in text/html body parts to URIs in
in Content-Location headers the value of the charset parameter is Content-Location headers, the value of the charset parameter is
irrelevant, but it may be relevant for other purposes, and incorrect irrelevant, but that it may be relevant for other purposes, and that
labeling MUST therefore be avoided. Warning: Irrelevance of the charset incorrect labeling MUST, therefore, be avoided. Warning: Irrelevance of
parameter may not be true in the future, if different character the charset parameter may not be true in the future, if different
encodings of the same non-English filename is used in HTML. character encodings of the same non-English filename are used in HTML.
Caution should be taken in using method (a), since, in general, this Caution should be taken in using method (a), since, in general, this
encoding can not be applied safely to characters that are used for encoding can not be applied safely to characters that are used for
reserved purposes within the URI scheme. In addition, changing the HTML reserved purposes within the URI scheme. In addition, changing the HTML
body which contains the URI might invalidate a message integrity check. body which contains the URI might invalidate a message integrity check.
Because of these problems, this method SHOULD only be used if it is For these reasons, this method SHOULD only be used if it is performed
performed in cooperation with the author/owner of the documents in cooperation with the author/owner of the documents involved.
involved.
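
The two methods can be sketched in Python as follows. This is an illustration only: the function names and the example.org URIs are hypothetical, the set of characters left untouched in method (a) is an assumption chosen to leave reserved characters and existing "%nn" sequences alone, and method (b) is shown with the standard library's encoded-word support.

```python
from email.header import Header, decode_header, make_header
from urllib.parse import quote

# Characters left as they are in method (a): reserved characters and "%".
RESERVED = ";/?:@&=+$,%"


def percent_encode(uri: str) -> str:
    """Method (a): replace inappropriate characters with "%nn" sequences.

    The same replacement must also be made in the HTML text.
    """
    return quote(uri, safe=RESERVED)


def mime_encode(uri: str, charset: str = "iso-8859-1") -> str:
    """Method (b): encode the header value only, as MIME encoded-words."""
    return Header(uri, charset=charset).encode()


def mime_decode(value: str) -> str:
    """Undo method (b) before comparing with URIs found in body text."""
    return str(make_header(decode_header(value)))


encoded_a = percent_encode("http://example.org/my file.gif")
encoded_b = mime_encode("http://example.org/blåbär.gif")
decoded_b = mime_decode(encoded_b)
```

With method (a) the receiver compares the encoded strings directly and does not decode them. With method (b) the receiver decodes the header value first, as mime_decode does here.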
4.4.2 Folding of long URIs 4.4.2 Folding of long URIs
Since MIME header fields have a limited length and URIs can get quite Since MIME header fields have a limited length and long URIs can result
long, these lines may have to be folded. in Content-Location and Content-Base headers that exceed this length,
Content-Location and Content-Base headers may have to be folded.
Encoding as discussed in clause 4.4.1 MUST be done before such folding. Encoding as discussed in clause 4.4.1 MUST be done before such folding.
This MUST include encoding of space characters, if any. After that, the This MUST include encoding of space characters, if any. After that, the
folding can be done, using the algorithm defined in [URLBODY] section folding can be done, using the algorithm defined in [URLBODY] section
3.1. 3.1.
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
Relative URIs inside contents of MIME body parts are resolved relative Relative URIs inside the contents of MIME body parts are resolved
to a base URI using the methods for resolving relative URIs described relative to a base URI using the methods for resolving relative URIs
in [RELURL]. In order to determine this base URI, the first-applicable described in [RELURL]. In order to determine this base URI, the
method in the following list applies. first-applicable method in the following list applies.
(a) There is a base specification inside the MIME body part containing (a) There is a base specification inside the MIME body part containing
the link which resolves relative URIs into absolute URIs. For the relative URI which resolves relative URIs into absolute URIs.
example, HTML provides the BASE element for this. For example, HTML provides the BASE element for this purpose.
(b) There is a Content-Base header (as defined in section 4.2), in the (b) There is a Content-Base header (as defined in section 4.2), in the
immediately surrounding content heading, specifying the base to be immediately surrounding content heading, specifying the base to be
used. used.
(c) There is a Content-Location header in the immediately surrounding (c) There is a Content-Location header in the immediately surrounding
heading of the body part which contains an absolute URI and can heading of the body part which contains an absolute URI. This URI
then serve as the base in the same way as the requested URI can can serve as a base in the same way as a requested URI can
serve as a base for relative URIs within a file retrieved via HTTP serve as a base for relative URIs within a file retrieved via HTTP
[HTTP]. [HTTP].
(d) Step (b) and (c) can be repeated recursively on Content-Base and (d) Step (b) and (c) can be repeated recursively to find a suitable
Content-Location headers in surrounding multi-part headings. Content-Base or Content-Location header in a surrounding multi-part
However, a base from an absolute Content-Location in an inner and message heading. Note, that a base from an absolute
heading takes precedence over a base from a Content-Base or a Content-Location in an inner heading takes precedence over a base
Content-Location in a surrounding heading. from a Content-Base or a Content-Location in a surrounding heading.
When the methods above do not yield an absolute URI, matching of two (e) When the methods above do not yield an absolute URI, a base URL of
relative URIs against each other can still be done for matches within a "this_message:/" MUST be employed. This base URL has been defined
multipart/related. This matching is done as if they had been given as for the sole purpose of resolving relative references within a
base an imaginary URL "this_message:/", which exists for the sole multipart/related structure when no other base URI exists.
purpose of resolving relative references within a multipart/related
entitity.
This is also described in other words in section 8.2 below. This is also described in other words in section 8.2 below.
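The first-applicable rule in the list above can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: the function names (is_absolute, choose_base, resolve) and the representation of headings as dictionaries ordered innermost first are assumptions made for this example. Because "this_message:/" is not a scheme that urllib knows how to join against, the sketch handles it by plain concatenation, which is also an assumption.

```python
from urllib.parse import urljoin, urlsplit

FALLBACK_BASE = "this_message:/"  # base of last resort, method (e)


def is_absolute(uri):
    """Return True when the URI carries its own scheme."""
    return bool(urlsplit(uri).scheme)


def choose_base(body_base, headings):
    """Pick a base URI using the first applicable method.

    body_base: base given inside the body part itself (for example an
               HTML BASE element), or None.            -- method (a)
    headings:  list of dicts with optional "Content-Base" and
               "Content-Location" keys, innermost heading first
               (hypothetical representation).          -- methods (b)-(d)
    """
    if body_base and is_absolute(body_base):
        return body_base
    for heading in headings:  # walking outward implements (d)
        base = heading.get("Content-Base")
        if base and is_absolute(base):
            return base
        location = heading.get("Content-Location")
        if location and is_absolute(location):
            return location
    return FALLBACK_BASE


def resolve(uri, base):
    """Resolve a possibly relative URI against the chosen base."""
    if is_absolute(uri):
        return uri
    if base == FALLBACK_BASE:
        # urljoin does not recognise this scheme, so concatenate.
        return FALLBACK_BASE + uri.lstrip("/")
    return urljoin(base, uri)
```

With a Content-Base only on the enclosing multipart heading, as in example 9.5, choose_base(None, [{}, {"Content-Base": "http://www.ietf.cnri.reston.va.us/"}]) returns that Content-Base. With no base anywhere, as in example 9.4, the fallback "this_message:/" is returned.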
6. Sending documents without linked objects 6. Sending documents without linked objects
If a document, such as an HTML object, is sent without other objects, If a text/html resource (object) is sent without subsidiary resources,
to which it is linked, it MAY be sent as a text/html body part by to which it is linked, it MAY be sent by itself. In this case,
itself. In this case, "multipart/related" need not be used. embedding it in a multipart/related structure is not necessary.
Such a document may either not include any links, or contain links Such a text/html resource may contain no URIs, or URIs which the
which the recipient resolves via ordinary net look up, or contain links recipient is expected to retrieve (if possible) via a URI specified
which the recipient cannot resolve. protocol. Although not normal, a text/html resource may be sent with
unresolvable links, for example when two authors exchange drafts of
unfinished resources.
Inclusion of links which the recipient has to look up through the net Inclusion of URIs referencing resources which the recipient has to
may not work for some recipients, since all email recipients do not retrieve via an URI specified protocol may not work for some
have full internet connectivity. Also, such links may work for the recipients. This is because not all e-mail recipients have full
sender but not for the recipient, for example when the link refers to internet connectivity, or because URIs which work for a sender will not
an URI within a company-internal network not accessible from outside work for a recipient. This occurs, for example, when an URI refers to a
the company. resource within a company-internal network that is not accessible from
outside the company.
Note that documents with links that the recipient cannot resolve MAY be Note that text/html resources containing URIs that reference resources
sent, although this is discouraged. For example, two persons developing that a recipient cannot retrieve MAY be sent, although this is
a new HTML page may exchange incomplete versions. discouraged. For example, two persons developing a new Web page may
exchange incomplete versions of that page.
7. Use of the Content-Type "multipart/related" 7. Use of the Content-Type "multipart/related"
If a message contains one or more MIME body parts containing links and If a message contains one or more MIME body parts containing URIs and
also contains as separate body parts, data, to which these links (as also contains as separate body parts, resources, to which these URIs
defined, for example, in HTML 2.0 [HTML2]) refers, then this whole set (as defined, for example, in HTML 2.0 [HTML2]) refer, then this whole
of body parts (referring body parts and referred-to body parts) SHOULD set of body parts (referring body parts and referred-to body parts)
be sent within a "multipart/related" body part as defined in [REL]. SHOULD be sent within a multipart/related structure as defined in
[REL].
Even though Content-Location and Content-Base can occur without Even though Content-Location and Content-Base headers can occur in a
multipart/related, this standard only covers their use for resolution message that lacks an associated multipart/related structure, this
of links between body parts inside one multipart/related. This standard standard only covers their use for resolution of URIs between body
does not cover links from one multipart/related to another parts inside a single multipart/related structure. This standard does
multipart/related in a message containing multiple multipart/related not cover URIs from one multipart/related structure to another
objects. multipart/related structure in a message containing multiple
multipart/related objects either in parallel or nested one within the
other.
The root body part of the "multipart/related" SHOULD be the start When the start body part of a multipart/related structure is an atomic
object for rendering the object, such as a text/html object, and which object, such as a text/html resource, it SHOULD be employed as the root
contains links to objects in other body parts, or a resource of that multipart/related structure. When the start body part
multipart/alternative of which at least one alternative resolves to of a multipart/related structure is a multipart/alternative structure,
such a start object. Implementors are warned, however, that some and that structure contains at least one alternative body part which is
receiving agents treat multipart/alternative as if it had been a suitable atomic object, such as a text/html resource, then that body
multipart/mixed (even though MIME [MIME1] requires support for part SHOULD be employed as the root resource of the aggregate document.
multipart/alternative). Implementors are warned, however, that some receiving agents treat
multipart/alternative as if it had been multipart/mixed (even though
MIME [MIME1] requires support for multipart/alternative).
[REL] specifies that the type attribute is mandatory in "Content-Type: [REL] specifies that a type parameter is mandatory in a "Content-Type:
multipart/related" headers, and requires that this attribute be the multipart/related" header, and requires that it be employed to specify
type of the root object, and this value shall thus for example be the type of the multipart/related start object. Thus, the type
"multipart/alternative", if the root part is of "Content-type parameter value shall be "multipart/alternative", when the start part
multipart/alternative", even if one of the subparts of the is of "Content-type multipart/alternative", even if the actual root
"multipart/alternative" is of type "text/html". If the root is not the resource is of type "text/html". In addition, if the multipart/related
first body part within the "multipart/related", [REL] further requires start object is not the first body part in a multipart/related
that its Content-ID MUST be given in a start parameter to the structure, [REL] further requires that its Content-ID MUST be specified
"Content-Type: multipart/related" header. as the value of a start parameter in the "Content-Type:
multipart/related" header.
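As an illustration of the type and start parameters, the following sketch builds a multipart/related structure with Python's standard email package. The message content is made up for this example, and the root part is deliberately attached second so that a start parameter is needed. It is one possible way to produce such a header, not a required one.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

ROOT_CID = "<foo3*foo1@bar.net>"  # Content-ID of the start object

# type= names the media type of the start object; start= carries its
# Content-ID because the start object is not the first body part.
related = MIMEMultipart(
    "related",
    boundary="boundary-example",
    type="text/html",
    start=ROOT_CID,
)

# Referred-to body part, placed first (placeholder image bytes).
image = MIMEImage(b"GIF89a", _subtype="gif")
image.add_header(
    "Content-Location",
    "http://www.ietf.cnri.reston.va.us/images/ietflogo.gif",
)
related.attach(image)

# Start (root) body part, identified by its Content-ID.
html = MIMEText(
    '<IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif">',
    "html",
    "us-ascii",
)
html.add_header("Content-ID", ROOT_CID)
related.attach(html)

print(related["Content-Type"])
```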
When presenting the root body part to the user, the additional body When rendering a resource in a multipart/related structure, URI
parts within the "multipart/related" can be used: references within that resource can be satisfied by body parts within
the same multipart/related structure. This is useful:
(a) For those recipients who only have email but not full Internet (a) For those recipients who only have email but not full Internet
access. access.
(b) For those recipients who for other reasons, such as firewalls or (b) For those recipients who for other reasons, such as firewalls or
the use of company-internal links, cannot retrieve the linked body the use of company-internal links, cannot retrieve URI referenced
parts through the net. resources via URI specified protocols.
Note that this means that you can, via email, send HTML which Note, that this means that you can, via e-mail, send text/html
includes URIs which the recipient cannot resolve via HTTPor other objects which include URIs which the recipient cannot resolve via
connectivity-requiring URIs. HTTP or other connectivity-requiring URIs.
(c) To send a document in a format which is preserved even if the (c) To send a document whose content is preserved even if the
object to which the hyperlinks refer through HTTP is later changed resources to which embedded URIs refer are later changed
or deleted. or deleted.
(d) For items which are not available on the web. (d) For resources which are not available for protocol based
retrieval.
(e) For any recipient to speed up access.
The type parameter of the "Content-Type: multipart/related" MUST be the (e) To speed up access.
same as the Content-Type of its root.
When a sending MUA sends objects which were retrieved from the WWW, it When a sending MUA sends objects which were retrieved from the WWW, it
SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into
some other URI form prior to transmitting them. This will allow the some other URI form prior to transmitting them. This will allow the
receiving MUA to both verify MICs included with the message, as well as receiving MUA to both verify MICs included with the message, as well as
verify the documents against their WWW counterparts. verify the documents against their WWW counterparts, if this is
appropriate.
In certain special cases this will not work if the original HTML In certain cases this will not work - for example, if a resource
document contains URIs as parameters to objects and applets. In such a contains URIs as parameters to objects and applets. In such a case, it
case, it might be better to rewrite the document before sending it. might be better to rewrite the document before sending it. This problem
This problem is discussed in more detail in the informational RFC which is discussed in more detail in the informational RFC which will be
will be published as a supplement to this standard. published as a supplement to this standard.
This standard does not cover the case where a "multipart/related" This standard does not cover the case where a resource in a
contains links to MIME body parts outside of the current multipart/related structure contains URIs that reference MIME body
"multipart/related" or in other MIME messages, even if methods similar parts outside of the current multipart/related structure or in other
to those described in this standard are used. Implementors who provide MIME messages, even if methods similar to those described in this
such links are warned that receiving agents implementing this standard standard are used. Implementors who employ such URIs are warned that
may not be able to resolve such links. receiving agents implementing this standard may not be able to process
them.
Within a "multipart/related", ALL different parts MUST have different Within a multipart/related structure, each body part MUST have, if
Content-ID values or Content-Location headers which resolve to assigned, a different Content-ID header value and a Content-Location
different URIs. header value which resolves to a different URI.
Two body parts in the same multipart/related can have the same relative Two body parts in the same multipart/related structure can have the
URI as value of their Content-Location headers only if there are same relative Content-Location header value, only if when resolved to
headers containing a different Content-Base header, so that the absolute URIs in combination with Content-Base header values, they are
absolute URI after resolution against the Content-Base header is then different.
different.
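A sender could check the uniqueness requirement before transmission along the following lines. This is a sketch under stated assumptions: body parts are represented as plain dictionaries of header values, and the function name find_duplicates is invented for this example.

```python
from urllib.parse import urljoin


def find_duplicates(parts):
    """Report body parts whose Content-ID or resolved Content-Location
    repeats one seen earlier in the same multipart/related structure.

    parts: list of dicts with optional "Content-ID", "Content-Location"
           and "Content-Base" keys (hypothetical representation).
    Returns a list of (index, header name, offending value) tuples.
    """
    seen_ids = set()
    seen_locations = set()
    duplicates = []
    for index, part in enumerate(parts):
        cid = part.get("Content-ID")
        if cid is not None:
            if cid in seen_ids:
                duplicates.append((index, "Content-ID", cid))
            seen_ids.add(cid)
        location = part.get("Content-Location")
        if location is not None:
            # Resolve against this part's own Content-Base, if any.
            resolved = urljoin(part.get("Content-Base", ""), location)
            if resolved in seen_locations:
                duplicates.append((index, "Content-Location", resolved))
            seen_locations.add(resolved)
    return duplicates
```

Two parts that both carry "Content-Location: logo.gif" are reported as duplicates unless their Content-Base values differ, in which case the resolved absolute URIs are different and the check passes.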
8. Usage of Links to Other Body Parts 8. Usage of Links to Other Body Parts
8.1 General principle 8.1 General principle
A body part, such as a text/html body part, may contain hyperlinks to A body part, such as a text/html body part, may contain URIs that
objects which are included as other body parts in the same message and reference resources which are included as body parts in the same
within the same "multipart/related" content. Often such linked objects message -- in detail, as body parts within the same multipart/related
are meant to be displayed inline to the reader of the main document; structure. Often such URI linked resources are meant to be displayed
for example, objects referenced with the src attribute of the IMG inline to the viewer of the referencing body part; for example, objects
element in HTML 2.0 [HTML2]. New elements and attributes with this referenced with the SRC attribute of the IMG element in HTML 2.0
property are proposed in the ongoing development of HTML (examples: [HTML2]. New elements and attributes with this property are proposed in
applet, frame, profile, OBJECT, classid, codebase, data, SCRIPT). A the ongoing development of HTML (examples: applet, frame, profile,
sender might also want to send a set of HTML documents which the reader OBJECT, classid, codebase, data, SCRIPT). A sender might also want to
can traverse, and which are related with the attribute href of the A send a set of HTML documents which the reader can traverse, and which
element. are related with the attribute href of the A element.
In order to send such messages, there is a need to indicate which other In order to send such messages, there is a need to specify how a URI in
body parts are referred to by the links in the body parts containing one body part can reference a resource in another body part.
such links. For example, a body part of "Content-Type: text/html" often
has links to other objects, which might be included in other body parts
in the same MIME message.
8.2 Resolution of hyperlinks in text/html body parts 8.2 Resolution of URIs in text/html body parts
The resolution of hyperlinks in text/html body parts is performed in The resolution of URIs in text/html body parts is performed in the
the following way: following way:
(a) Unfold multiple line header values according to [URLBODY]. Do NOT (a) Unfold multiple line header values according to [URLBODY]. Do NOT
however translate character encodings of the kind described in however translate character encodings of the kind described in
[URL]. Example: Do not transform "a%2eb/c%20d" into "a.b/c d". [URL]. Example: Do not transform "a%2eb/c%20d" into "a.b/c d".
(b) Remove all MIME encodings, such as content-transfer encoding and (b) Remove all MIME encodings, such as content-transfer encoding and
header encodings as defined in MIME part 3 [MIME3] Do NOT however header encodings as defined in MIME part 3 [MIME3] Do NOT however
translate character encodings of the kind described in [URL]. translate character encodings of the kind described in [URL].
Example: Do not transform "a%2eb/c%20d" into "a.b/c d". Example: Do not transform "a%2eb/c%20d" into "a.b/c d".
(c) Try to resolve all relative URIs in the HTML content and in (c) Try to resolve all relative URIs in the HTML content and in
Content-Location headers using the procedure described in chapter 5 Content-Location headers using the procedure described in chapter 5
above. The result of this resolution can be an absolute URI, or a above. The result of this resolution can be an absolute URI, or a
fictiuous absolute URI with the base "this_message:/" as specified fictitious absolute URI with the base "this_message:/" as specified
in chapter 5. in chapter 5.
(d) For each hyperlink in any HTML body, compare the value of the (d) For each referencing URI in a text/html body part, compare the
hyperlink after resolution as described in (a) and (b), with the value of the referencing URI after resolution as described in (a)
URI derived from Content-ID and Content-Location headers for other and (b), with the URI derived from Content-ID and Content-Location
body parts within the same Multipart/related. If the strings are headers for other body parts within the same Multipart/related
identical, octet by octet, then this hyperlink is resolved by the structure. If the strings are identical, octet by octet, then the
body part with the same URI. This comparison will only succeed if referencing URI references that body part. This comparison will
the two URIs are identical. This means that if one of the two URIs only succeed if the two URIs are identical. This means that if one
to be compared was a fictituous absolute URI with the base of the two URIs to be compared was a fictitious absolute URI with
"this_message:/", the other must also be such a fictituous absolute the base "this_message:/", the other must also be such a fictitious
URI, and not resolvable to a real absolute URI. absolute URI, and not resolvable to a real absolute URI.
(e) If (c) fails, try to resolve the hyperlink through ordinary (e) If (d) fails, try to retrieve the URI referenced resource
Internet lookup. Resolution of hyperlinks of the URL-types "mid" or through ordinary Internet lookup. Resolution of URIs of
"cid" to other content-parts, outside multipart/related, or in the URL-types "mid" or "cid" to other content-parts, outside the
other separately sent messages, is not covered by this standard, same multipart/related structure, or in other separately sent
and is thus neither encouraged nor forbidden. messages, is not covered by this standard, and is thus neither
encouraged nor forbidden.
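The comparison in step (d) and the fallback in step (e) can be sketched as below. The sketch assumes that unfolding, removal of MIME encodings and resolution of relative URIs have already been carried out, and that body parts are given as (resolved URI, payload) pairs. The function name match_body_part is invented for this illustration.

```python
def match_body_part(reference, parts):
    """Return the payload of the body part whose resolved URI is
    identical, octet by octet, to the resolved referencing URI.

    reference: resolved URI taken from the text/html body part.
    parts:     list of (resolved_uri, payload) pairs for the other
               body parts in the same multipart/related structure
               (hypothetical representation).
    Returns None when nothing matches, in which case the caller may
    try ordinary Internet lookup as in step (e).
    """
    wanted = reference.encode("utf-8")
    for uri, payload in parts:
        # No %-decoding and no case folding: "a%2eb" and "a.b" differ.
        if uri.encode("utf-8") == wanted:
            return payload
    return None
```

Because the comparison is literal, a reference resolved against "this_message:/" can only match a body part whose Content-Location was resolved against the same fictitious base.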
8.3 Use of the Content-ID header and CID URLs 8.3 Use of the Content-ID header and CID URLs
When CID (Content-ID) URLs as defined in [URL] and [MIDCID] are used When CID (Content-ID) URLs as defined in [URL] and [MIDCID] are used to
for links between body parts, the Content-ID header MUST be used reference other body parts, they MUST only be matched against
instead of the Content-Location header. Thus, even though the following Content-ID header values, and not against Content-Location headers with
two headers are identical in meaning, only the Content-ID variant MUST CID: values. Thus, even though the following two headers are
be used, and all "Content-Location: CID:" should be ignored. identical in meaning, only the Content-ID value will be matched, and the
Content-Location value will be ignored.
Content-ID: <foo@bar.net> Content-ID: <foo@bar.net>
Content-Location: CID: foo@bar.net Content-Location: CID: foo@bar.net
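A receiving agent might apply this rule roughly as follows. The sketch is illustrative only: the helper names and the dictionary representation of body parts are assumptions, and the %-decoding of the CID URL before comparison follows the conversion rule understood to be given in [MIDCID] rather than anything stated in this section.

```python
from urllib.parse import unquote


def cid_matches(cid_url, content_id_header):
    """Compare a CID URL with a Content-ID header value."""
    scheme, _, rest = cid_url.partition(":")
    if scheme.lower() != "cid":
        return False
    # Content-ID values are written inside angle brackets.
    content_id = content_id_header.strip().strip("<>")
    return unquote(rest) == content_id


def find_by_cid(cid_url, parts):
    """Return the first body part whose Content-ID matches the CID URL.

    parts: list of dicts of header values (hypothetical representation).
    Content-Location headers are deliberately never consulted here,
    even when their value starts with "CID:".
    """
    for part in parts:
        content_id = part.get("Content-ID")
        if content_id is not None and cid_matches(cid_url, content_id):
            return part
    return None
```

In example 9.6 below, the reference "cid:foo4*foo1@bar.net" is satisfied by the part with "Content-ID: <foo4*foo1@bar.net>", while its "Content-Location: CID:something@else" header plays no role.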
Note: Content-IDs MUST be globally unique [MIME1]. It is thus not Note: Content-IDs MUST be globally unique [MIME1]. It is thus not
permitted to make them unique only within this message or within this permitted to make them unique only within a message or within a single
"multipart/related". multipart/related structure.
8.4 Conformance requirement on receipt 8.4 Conformance requirement on receipt
An email system which claims conformance to this standard MUST support An e-mail system which claims conformance to this standard MUST support
receipt of "multipart/related" (as defined in section 7) with links receipt of multipart/related structures (as defined in section 7) with
between body parts using both the Content-Location (as defined in URIs referencing body parts using both the Content-Location (as defined
section 8.2) and the Content-ID method (as defined in section 8.3). in section 8.2) and the Content-ID method (as defined in section 8.3).
9. Examples 9. Examples
Warning: If there is a contradiction between the explanatory text and Warning: If there is a contradiction between the explanatory text and
the examples in this standard, then the explanatory text, not the the examples in this standard, then the explanatory text, not the
examples, is normative. examples, is normative.
9.1 Example of a HTML body without included linked objects 9.1 Example of a HTML body without included linked objects
The first example is the simplest form of an HTML email message. This The first example is the simplest form of an HTML email message. This
is not an aggregate HTML object, but simply a message with a single message does not contain an aggregate HTML object, but simply a message
HTML body part. This message contains a hyperlink but does not provide with a single HTML body part. This body part contains a URI but the
the ability to resolve the hyperlink. To resolve the hyperlink the message does not contain the resource referenced by that URI. To
receiving client would need either IP access to the Internet, or an retrieve the resource referenced by the URI the receiving client would
electronic mail web gateway. need either IP access to the Internet, or an electronic mail web
gateway.
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset=US-ASCII
<HTML> <HTML>
<head></head> <head></head>
<body> <body>
<h1>Hi there!</h1> <h1>Hi there!</h1>
An example of an HTML message.<p> An example of an HTML message.<p>
Try clicking <a href="http://www.resnova.com/">here.</a><p> Try clicking <a href="http://www.resnova.com/">here.</a><p>
</body></HTML> </body></HTML>
9.2 Example with absolute URIs to an embedded GIF picture 9.2 Example with an absolute URI to an embedded GIF picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html"; start=<foo3*foo1@bar.net> type="text/html"; start=<foo3*foo1@bar.net>
--boundary-example-1 --boundary-example
Content-Type: text/html;charset=US-ASCII Content-Type: text/html;charset=US-ASCII
Content-ID: <foo3*foo1@bar.net> Content-ID: <foo3*foo1@bar.net>
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a URI
to the other body part, for example through a statement such as: referencing a resource in
another body part, for example through a statement such as:
<IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif" <IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif"
ALT="IETF logo"> ALT="IETF logo">
--boundary-example-1 --boundary-example
Content-Location: Content-Location:
http://www.ietf.cnri.reston.va.us/images/ietflogo.gif http://www.ietf.cnri.reston.va.us/images/ietflogo.gif
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example--
9.3 Example with relative URIs to an embedded GIF picture 9.3 Example with a relative URI to an embedded GIF picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example-1 --boundary-example
Content-Base: http://www.ietf.cnri.reston.va.us/ Content-Base: http://www.ietf.cnri.reston.va.us/
Content-Type: text/html; charset=ISO-8859-1 Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a URI
to the other body part, for example through a statement such as: referencing a resource in
another body part, for example through a statement such as:
<IMG SRC="/images/ietflogo.gif" ALT="IETF logo"> <IMG SRC="/images/ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9 Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169; Example of a copyright sign mapped onto HTML markup: &#169;
--boundary-example-1 --boundary-example
Content-Location: ietflogo.gif Content-Location:
Content-Base: http://www.ietf.cnri.reston.va.us/images/ http://www.ietf.cnri.reston.va.us/images/ietflogo.gif
; Note that the fact that the Content-Base comes after the
; Content-Location within the same Content-Heading will not
; influence their interpretation
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example--
9.4 Example with relative URIs and no BASE available 9.4 Example with a relative URI and no BASE available
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example-1 --boundary-example
Content-Type: text/html; charset=ISO-8859-1 Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a URI
to the other body part, for example through a statement such as: referencing a resource in
another body part, for example through a statement such as:
<IMG SRC="ietflogo.gif" ALT="IETF logo"> <IMG SRC="ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9 Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169; Example of a copyright sign mapped onto HTML markup: &#169;
--boundary-example-1 --boundary-example
Content-Location: ietflogo.gif Content-Location: ietflogo.gif
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example--
9.5 Example using a BASE on the Multipart 9.5 Example using a BASE on the Multipart
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
Content-Base: http://www.ietf.cnri.reston.va.us/ Content-Base: http://www.ietf.cnri.reston.va.us/
--boundary-example-1 --boundary-example
Content-Type: text/html; charset=ISO-8859-1 Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a URI
to the other body part, for example through a statement such as: referencing a resource in
another body part, for example through a statement such as:
<IMG SRC="ietflogo.gif" ALT="IETF logo"> <IMG SRC="ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9 Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169; Example of a copyright sign mapped onto HTML markup: &#169;
--boundary-example-1 --boundary-example
Content-Location: http://www.ietf.cnri.reston.va.us/ietflogo.gif Content-Location: ietflogo.gif
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example--
9.6 Example using CID URL and Content-ID header to an embedded GIF 9.6 Example using CID URL and Content-ID header to an embedded GIF
picture picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example-1 --boundary-example
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset=US-ASCII
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a URI
to the other body part, for example through a statement such as: referencing a resource in
another body part, for example through a statement such as:
<IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo"> <IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo">
--boundary-example-1 --boundary-example
Content-Location: CID:something@else ; this header is disregarded Content-Location: CID:something@else ; this header is disregarded
Content-ID: <foo4*foo1@bar.net> Content-ID: <foo4*foo1@bar.net>
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example--
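The lookup in example 9.6 can be sketched as follows. This is an illustration only, assuming a receiver built on Python's standard email package; the function name find_part_by_cid and the shortened message in RAW are hypothetical and not part of either draft:

```python
from email import message_from_string
from urllib.parse import unquote

# Shortened version of example 9.6 (new-revision boundary, truncated GIF data).
RAW = "\r\n".join([
    "From: foo1@bar.net",
    "To: foo2@bar.net",
    "Subject: A simple example",
    "Mime-Version: 1.0",
    'Content-Type: multipart/related; boundary="boundary-example";',
    ' type="text/html"',
    "",
    "--boundary-example",
    "Content-Type: text/html; charset=US-ASCII",
    "",
    '<IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo">',
    "--boundary-example",
    "Content-ID: <foo4*foo1@bar.net>",
    "Content-Type: IMAGE/GIF",
    "Content-Transfer-Encoding: BASE64",
    "",
    "R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5",
    "--boundary-example--",
    "",
])


def find_part_by_cid(msg, cid_url):
    """Return the body part whose Content-ID matches a cid: URL, or None."""
    scheme, _, value = cid_url.partition(":")
    if scheme.lower() != "cid":
        return None
    # A Content-ID header wraps the identifier in angle brackets.
    wanted = "<" + unquote(value) + ">"
    for part in msg.walk():
        if part.get("Content-ID", "").strip() == wanted:
            return part
    return None


msg = message_from_string(RAW)
logo = find_part_by_cid(msg, "cid:foo4*foo1@bar.net")
```

Only the Content-ID header is consulted here, which is consistent with the example's note that the Content-Location header on that part is disregarded.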
10. Content-Disposition header 10. Content-Disposition header
Note the specification in [REL] on the relations between Note the specification in [REL] on the relations between
Content-Disposition and multipart/related. Content-Disposition and multipart/related.
11. Character encoding issues and end-of-line issues 11. Character encoding issues and end-of-line issues
For the encoding of characters in HTML documents and other text For the encoding of characters in HTML documents and other text
documents into a MIME-compatible octet stream, the following mechanisms documents into a MIME-compatible octet stream, the following mechanisms
skipping to change at line 942 skipping to change at line 969
Web browser they may use to display the document) MUST be capable of Web browser they may use to display the document) MUST be capable of
handling any combinations of these mechanisms. handling any combinations of these mechanisms.
Also note that: Also note that:
- Any documents including HTML documents that contain octet values - Any documents including HTML documents that contain octet values
outside the 7-bit range need a content-transfer-encoding applied outside the 7-bit range need a content-transfer-encoding applied
before transmission over certain transport protocols [MIME1, before transmission over certain transport protocols [MIME1,
chapter 5]. chapter 5].
- The MIME standard [MIME2] requires that emailed documents of - The MIME standard [MIME2] requires that e-mailed documents of
"Content-Type: Text MUST be in canonical form before "Content-Type: Text/ MUST be in canonical form before a
Content-Transfer-Encoding, i.e. that line breaks are encoded as Content-Transfer-Encoding is applied, i.e. that line breaks are
CRLFs, not as bare CRs or bare LFs or something else. This is in encoded as CRLFs, not as bare CRs or bare LFs or something else.
contrast to [HTTP] where section 3.6.1 allows other representations This is in contrast to [HTTP] where section 3.6.1 allows other
of line breaks. representations of line breaks.
Note that this might cause problems with integrity checks based on Note that this might cause problems with integrity checks based on
checksums, which might not be preserved when moving a document from the checksums, which might not be preserved when moving a document from the
HTTP to the MIME environment. If a document has to be converted in such HTTP to the MIME environment. If a document has to be converted in such
a way that a checksum integrity check becomes invalid, then this a way that a checksum based message integrity check becomes invalid,
integrity check header SHOULD be removed from the document. then this integrity check header SHOULD be removed from the document.
Other sources of problems are Content-Encoding used in HTTP but not Other sources of problems are Content-Encoding used in HTTP but not
allowed in MIME, and charsets that are not able to represent line allowed in MIME, and charsets that are not able to represent line
breaks as CRLF. A good overview of the differences between HTTP and breaks as CRLF. A good overview of the differences between HTTP and
MIME with regards to Content-Type: "text" can be found in [HTTP], MIME with regards to Content-Type: "text" can be found in [HTTP],
appendix C. appendix C.
If the original document has line breaks in the canonical form (CRLF), If the original document has line breaks in the canonical form (CRLF),
then the document SHOULD remain unconverted so that integrity check then the document SHOULD remain unconverted so that integrity check
sums are not invalidated. sums are not invalidated.
A provider of HTML documents who wants his documents to be transferable A provider of HTML documents who wants his documents to be transferable
via both HTTP and SMTP without invalidating checksum integrity checks, via both HTTP and SMTP without invalidating checksum integrity checks,
should always provide original documents in the canonical form with should always provide original documents in the canonical form with
CRLF for line breaks. CRLF for line breaks.
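The effect of canonicalization on checksums can be sketched as below. This is a hedged illustration, not text from either draft: the function names are hypothetical, and SHA-256 stands in for whichever checksum an integrity check header actually carries.

```python
import hashlib
import re


def to_canonical_crlf(data: bytes) -> bytes:
    """Rewrite CRLF, bare CR and bare LF line breaks as CRLF."""
    # CRLF is listed first so an existing CRLF pair is matched as one unit.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


def digest(data: bytes) -> str:
    """Hex digest used here as a stand-in integrity checksum."""
    return hashlib.sha256(data).hexdigest()


http_form = b"<HTML>\n<BODY>Hello</BODY>\n</HTML>\n"   # bare LF, as HTTP allows
mime_form = to_canonical_crlf(http_form)              # canonical form for MIME

# The conversion changes the octets, so the old checksum no longer matches.
checksum_invalidated = digest(http_form) != digest(mime_form)

# A document that is already canonical passes through unchanged.
checksum_preserved = digest(to_canonical_crlf(mime_form)) == digest(mime_form)
```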
Some transport mechanisms may specify a default "charset" parameter if Some transport mechanisms may specify a default "charset" parameter if
none is supplied [HTTP, MIME1]. Because the default differs for none is supplied [HTTP, MIME1]. Because the default differs for
different mechanisms, when HTML is transferred through mail, the different mechanisms, when HTML is transferred through e-mail, the
charset parameter SHOULD be included, rather than relying on the charset parameter SHOULD be included, rather than relying on the
default. default.
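A minimal sketch of stating the charset explicitly when building a text/html body part, assuming the sender uses Python's standard email package (an assumption of this illustration, not something either draft prescribes):

```python
from email.mime.text import MIMEText

# The third argument sets the charset parameter explicitly, so the
# receiver does not have to fall back on a transport-specific default.
part = MIMEText("<P>Copyright \u00a9 IETF</P>", "html", "iso-8859-1")

content_type = part["Content-Type"]
charset = part.get_content_charset()
transfer_encoding = part["Content-Transfer-Encoding"]
```

With this charset the library chooses Quoted-Printable for the body, so the copyright sign is sent as =A9.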
12. Security Considerations 12. Security Considerations
Some Security Considerations include the potential to send someone an Some Security Considerations include the potential to send someone an
object, and claim that it is represented by a particular URI (by giving object, and claim that it is represented by a particular URI (by giving
it a Content-Location header). There can be no assurance that a WWW it a Content-Location header). There can be no assurance that a WWW
request (like HTTP or FTP) for that same URI would normally result in request (like HTTP or FTP) for that same URI would normally result in
that same object. It might be unsuitable to cache the data in such a that same object. It might be unsuitable to cache the data in such a
way that the cached data can be used for retrieval of this URI from way that the cached data can be used for retrieval of this URI from
other messages or message parts than those included in the same message sources other than body parts included in the same multipart/related
as the Content-Location header. Because of this problem, receiving User structure as the Content-Location header. Because of this problem,
Agents SHOULD not cache this data in the same way that data that was receiving User Agents SHOULD not cache this data in the same way that
retrieved through an HTTP or FTP request might be cached. data that was retrieved through an HTTP or FTP request might be cached.
URIs, especially File URIs, may in their name contain company-internal URIs, especially File URIs, may in their name contain company-internal
information, which may then inadvertently be revealed to recipients of information, which may then inadvertently be revealed to recipients of
documents containing such URIs. documents containing such URIs.
One way of implementing messages with linked body parts is to handle One way of implementing messages with URI linked body parts is to
the linked body parts in a combined mail and WWW proxy server. The mail handle the linked body parts in a combined mail and WWW proxy server.
client is only given the start body part, which it passes to a web The mail client is only given the start body part, which it passes to a
browser. This web browser requests the linked parts from the proxy web browser. This web browser requests the linked parts from the proxy
server. If this method is used, and if the combined server is used by server. If this method is used, and if the combined server is used by
more than one user, then methods must be employed to ensure that body more than one user, then methods must be employed to ensure that body
parts of a message to one person are not retrievable by another person. parts of a message to one person are not retrievable by another person.
Use of passwords (also known as tickets or magic cookies) is one way of Use of passwords (also known as tickets or magic cookies) is one way of
achieving this. Note that some caching WWW proxy servers may not achieving this. Note that some caching WWW proxy servers may not
distinguish between cached objects from email and HTTP, which may be a distinguish between cached objects from email and HTTP, which may be a
security risk. security risk.
In addition, by allowing people to mail aggregate objects, we are In addition, by allowing people to mail aggregate objects, we are
opening the door to other potential security problems that until now opening the door to other potential security problems that until now
skipping to change at line 1052 skipping to change at line 1079
parts have been merged into a single description, by specifying that parts have been merged into a single description, by specifying that
relative URIs which cannot be resolved otherwise should be handled as relative URIs which cannot be resolved otherwise should be handled as
if they had been given imaginary URL "this_message:/". if they had been given imaginary URL "this_message:/".
14. Copyright 14. Copyright
Copyright (C) The Internet Society (date). All Rights Reserved. Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or others, and derivative works that comment on or otherwise explain it or
assist in its implmentation may be prepared, copied, published and assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind, distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing the document itself may not be modified in any way, such as by removing the
copyright notice or references to the Internet Society or other copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of developing Internet organizations, except as needed for the purpose of developing
Internet standards in which case the procedures for copyrights defined Internet standards in which case the procedures for copyrights defined
in the Internet Standards process must be followed, or as required to in the Internet Standards process must be followed, or as required to
translate it into languages other than English. translate it into languages other than English.
skipping to change at line 1098 skipping to change at line 1125
--------- -------------------------------------------------------- --------- --------------------------------------------------------
[CONDISP] R. Troost, S. Dorner: "Communicating Presentation [CONDISP] R. Troost, S. Dorner: "Communicating Presentation
Information in Internet Messages: The Information in Internet Messages: The
Content-Disposition Header", RFC 1806, June 1995. Content-Disposition Header", RFC 1806, June 1995.
[HOSTS] R. Braden (editor): "Requirements for Internet Hosts -- [HOSTS] R. Braden (editor): "Requirements for Internet Hosts --
Application and Support", STD-3, RFC 1123, October 1989. Application and Support", STD-3, RFC 1123, October 1989.
[HTML-I18N] F. Yergeau, G. Nicol, G. Adams, & M. Duerst: [HTML-I18N] F. Yergeau, G. Nicol, G. Adams, & M. Duerst:
"Internationalization of the Hypertext Markup "Internationalization of the Hypertext Markup Language".
Language". RFC 2070, January 1997. RFC 2070, January 1997.
[HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language [HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language
- 2.0", RFC 1866, November 1995. - 2.0", RFC 1866, November 1995.
[HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext [HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext
Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996. Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996.
[MD5] R. Rivest: "The MD5 Message-Digest Algorithm", RFC 1321, [MD5] R. Rivest: "The MD5 Message-Digest Algorithm", RFC 1321,
April 1992. April 1992.
[MIDCID] E. Levinson: Message/External-Body Content-ID [MIDCID] E. Levinson: Message/External-Body Content-ID
Access"Message/External-Body Content-ID and Message-ID Access"Message/External-Body Content-ID and Message-ID
Uniform Resource Locators", draft-ietf-mhtml-cid-v2- Uniform Resource Locators", draft-ietf-mhtml-cid-v2-
00.txt, July 1997. 00.txt, July 1997.
[MIME1] N. Freed, N. Borenstein, "Multipurpose Internet Mail [MIME1] N. Freed, N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, December 1996. Bodies", RFC 2045, December 1996.
. .
[MIME-IMB] N. Freed & N. Borenstein [MIME-IMB] N. Freed & N. Borenstein, "Multipurpose Internet Mail
"Multipurpose Internet Mail Extensions (MIME) Part Extensions (MIME) Part One: Format of Internet Message
One: Format of Internet Message Bodies". RFC 2045, Bodies". RFC 2045, November 1996.
November 1996.
[MIME2] N. Freed, N. Borenstein, "Multipurpose Internet Mail [MIME2] N. Freed, N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046, Extensions (MIME) Part Two: Media Types", RFC 2046,
December 1996. December 1996.
[MIME3] K. Moore, "MIME (Multipurpose Internet Mail Extensions) [MIME3] K. Moore, "MIME (Multipurpose Internet Mail Extensions)
Part Three: Message Header Extensions for Non-ASCII Part Three: Message Header Extensions for Non-ASCII
Text", RFC 2047, December 1996. Text", RFC 2047, December 1996.
[MIME1] N. Borenstein & N. Freed: "MIME (Multipurpose Internet [MIME1] N. Borenstein & N. Freed: "MIME (Multipurpose Internet
skipping to change at line 1162 skipping to change at line 1188
[REL] Edward Levinson: "The MIME [REL] Edward Levinson: "The MIME
Multipart/Related"multipart/related" Content-Type", Multipart/Related"multipart/related" Content-Type",
draft-ietf-mhtml-re-v2-00.txt, September 1997. draft-ietf-mhtml-re-v2-00.txt, September 1997.
[RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC [RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC
1808, June 1995. 1808, June 1995.
[RFC822] D. Crocker: "Standard for the format of ARPA Internet [RFC822] D. Crocker: "Standard for the format of ARPA Internet
text messages." STD 11, RFC 822, August 1982. text messages." STD 11, RFC 822, August 1982.
[SGML] ISO 8879. Information Processing -- Text and Office - [SGML] ISO 8879. Information Processing -- Text and Office -
Standard Generalized Markup Language (SGML), Standard Generalized Markup Language (SGML), 1986.
1986. <URL:http://www.iso.ch/cate/d16387.html> <URL:http://www.iso.ch/cate/d16387.html>
[SMTP] J. Postel: "Simple Mail Transfer Protocol", STD 10, RFC [SMTP] J. Postel: "Simple Mail Transfer Protocol", STD 10, RFC
821, August 1982. 821, August 1982.
[URL] T. Berners-Lee, L. Masinter, M. McCahill: "Uniform [URL] T. Berners-Lee, L. Masinter, M. McCahill: "Uniform
Resource Locators (URL)", RFC 1738, December 1994. Resource Locators (URL)", RFC 1738, December 1994.
[URLBODY] N. Freed and Keith Moore: "Definition of the URL MIME [URLBODY] N. Freed and Keith Moore: "Definition of the URL MIME
External-Body Access-Type", RFC 2017, October 1996. External-Body Access-Type", RFC 2017, October 1996.
 End of changes. 114 change blocks. 
360 lines changed or deleted 386 lines changed or added
