draft-ietf-mhtml-rev-00.txt   draft-ietf-mhtml-rev-01.txt 
Network Working Group Jacob Palme Network Working Group Jacob Palme
Internet Draft Stockholm University/KTH Internet Draft Stockholm University/KTH
draft-ietf-mhtml-rev-00.txt Alexander Hopmann draft-ietf-mhtml-rev-01.txt Alexander Hopmann
IETF status: Standards track Microsoft Corporation IETF status to be: Proposed standard Microsoft Corporation
Revises: RFC 2110
Expires: March 1998 September 1997
MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML) MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)
Status of this Document Status of this Document
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working its working groups. Note that other groups may also distribute working
documents as Internet-Drafts. documents as Internet-Drafts.
skipping to change at line 38 skipping to change at line 40
Although HTML [RFC 1866] was designed within the context of MIME, Although HTML [RFC 1866] was designed within the context of MIME,
more than the specification of HTML as defined in RFC 1866 is needed more than the specification of HTML as defined in RFC 1866 is needed
for two electronic mail user agents to be able to interoperate using for two electronic mail user agents to be able to interoperate using
HTML as a document format. These issues include the naming of HTML as a document format. These issues include the naming of
objects that are normally referred to by URIs, and the means of objects that are normally referred to by URIs, and the means of
aggregating objects that go together. This document describes a set aggregating objects that go together. This document describes a set
of guidelines that will allow conforming mail user agents to be able of guidelines that will allow conforming mail user agents to be able
to send, deliver and display these objects, such as HTML objects, to send, deliver and display these objects, such as HTML objects,
that can contain links represented by URIs. In order to be able to that can contain links represented by URIs. In order to be able to
handle inter-linked objects, the document uses the MIME type handle inter-linked objects, the document uses the MIME type
multipart/related and specifies the MIME content-headers 'multipart/related' and specifies the MIME content-headers
"Content-Location" and "Content-Base". 'Content-Location' and 'Content-Base'.
Temporary note
This is a revision of RFC 2110 to take into account problems which have Differences compared to the previous version of this proposed
cropped up by developers when developing software adhering to RFC 2110. standard, published in RFC 2110, are summarized in chapter 13.
RFC 2110 is an IETF Proposed Standard, and the intention is that this
document, possibly after more revisions, will either be submitted as a
revised Proposed Standard or as a Draft Standard.
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
2.2 Other terminology 2.2 Other terminology
3. Overview 3. Overview
4. The Content-Location and Content-Base MIME Content Headers 4. The Content-Location and Content-Base MIME Content Headers
4.1 MIME content headers 4.1 MIME content headers
4.2 The Content-Base header 4.2 The Content-Location Header
4.3 The Content-Location Header 4.3 The Content-Base header
4.4 Encoding of URIs in e-mail headers 4.4 Encoding of URIs in MIME headers
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
6. Sending documents without linked objects 6. Sending documents without linked objects
7. Use of the Content-Type: Multipart/related 7. Use of the Content-Type: "multipart/related"
8. Format of Links to Other Body Parts 8. Usage of Links to Other Body Parts
8.1 General principle 8.1 General principle
8.2 Use of the Content-Location header 8.2 Resolution of hyperlinks in text/HTML body parts
8.3 Use of the Content-ID header and CID URLs 8.3 Use of the Content-ID header and CID URLs
8.4 Conformance requirement on receipt
9. Examples 9. Examples
9.1 Example of a HTML body without included linked objects 9.1 Example of a HTML body without included linked objects
9.2 Example with absolute URIs to an embedded GIF picture 9.2 Example with absolute URIs to an embedded GIF picture
9.3 Example with relative URIs to an embedded GIF picture 9.3 Example with relative URIs to an embedded GIF picture
9.4 Example using CID URL and Content-ID header to an embedded GIF 9.4 Example with relative URIs and no BASE available
9.5 Example using a BASE on the Multipart
9.6 Example using CID URL and Content-ID header to an embedded GIF
picture picture
10. Content-Disposition header 10. Content-Disposition header
11. Character encoding issues and end-of-line issues 11. Character encoding issues and end-of-line issues
12. Security Considerations 12. Security Considerations
13. Robustness Principle 13. Differences as compared to the previous version of this proposed
13.1 Content of the "type" parameter to Content-Type: standard in RFC 2110
Multipart/related
13.2 Quoting of the "type" parameter to Content-Type:
Multipart/related
13.3 Quoting of the "start" parameter to Content-Type:
Multipart/related and the value of the Message-ID and Content-
ID header
13.4 Content-Base and Content-Location on Multipart Content
headings
14. Acknowledgments 14. Acknowledgments
15. References 15. References
16. Author's Addresses 16. Author's Addresses
Mailing List Information Mailing List Information
To write contributions To write contributions
Further discussion on this document should be done through the Further discussion on this document should be done through the
mailing list MHTML@SEGATE.SUNET.SE. mailing list MHTML@SEGATE.SUNET.SE.
skipping to change at line 129 skipping to change at line 121
FTP://SEGATE.SUNET.SE/lists/mhtml/ FTP://SEGATE.SUNET.SE/lists/mhtml/
The archives are available for browsing from The archives are available for browsing from
HTTP://segate.sunet.se/archives/mhtml.html HTTP://segate.sunet.se/archives/mhtml.html
and in searchable format from and in searchable format from
http://www.reference.com/cgi-bin/pn/ http://www.reference.com/cgi-bin/pn/
listarch?list=MHTML@segate.sunet.se listarch?list=MHTML@segate.sunet.se
Finally, thhe archives are available by e-mail. Send a message to Finally, the archives are available by e-mail. Send a message to
LISTSERV@SEGATE.SUNET.SE with the text "INDEX MHTML" to get a list LISTSERV@SEGATE.SUNET.SE with the text "INDEX MHTML" to get a list
of the archive files, and then a new message "GET <file name>" to of the archive files, and then a new message "GET <file name>" to
retrieve the archive files. retrieve the archive files.
More information More information
Information about the IETF work in developing this standard may Information about the IETF work in developing this standard may
also be available at URL: also be available at URL:
HTTP://www.dsv.su.se/~jpalme/ietf/jp-ietf-home.html#mhtml HTTP://www.dsv.su.se/~jpalme/ietf/mhtml.html
It is the intention to set up a collection of test messages at the
above URL, but no such test collection exists when this is written
(August 1997).
1. Introduction 1. Introduction
There are a number of document formats, Hypertext Markup Language There are a number of document formats, Hypertext Markup Language
[HTML2], Portable Document format [PDF] and Virtual Reality Markup [HTML2], Portable Document format [PDF] and Virtual Reality Markup
Language [VRML] for example, which provide links using URIs for their Language [VRML] for example, which provide links using URIs for their
resolution. There is an obvious need to be able to send documents in resolution. There is an obvious need to be able to send documents in
these formats in e-mail [SMTP], [RFC822]. This document gives these formats in e-mail [SMTP], [RFC822]. This document gives
additional specifications on how to send such documents in MIME [MIME1 additional specifications on how to send such documents in MIME [MIME1
to MIME5] e-mail messages. This version of this standard was based on to MIME5] e-mail messages. This version of this standard was based on
skipping to change at line 168 skipping to change at line 164
hypertext links. When mailing such a document, it is often desirable to hypertext links. When mailing such a document, it is often desirable to
also mail all of the additional resources that are referenced in it; also mail all of the additional resources that are referenced in it;
those elements are necessary for the complete interpretation of the those elements are necessary for the complete interpretation of the
primary object. primary object.
An alternative way for sending an HTML document or other object An alternative way for sending an HTML document or other object
containing URIs in e-mail is to only send the URL, and let the containing URIs in e-mail is to only send the URL, and let the
recipient look up the document using HTTP. That method is described in recipient look up the document using HTTP. That method is described in
[URLBODY] and is not described in this document. [URLBODY] and is not described in this document.
An informational RFC will at a later time be published as a supplement An informational RFC will be published as a supplement to this
to this standard. The informational RFC will discuss implementation standard. The informational RFC will discuss implementation methods and
methods and some implementation problems. Implementors are recommended some implementation problems. Implementors are recommended to read this
to read this informational RFC when developing implementations of the informational RFC when developing implementations of the MHTML
MHTML standard. This informational RFC is, when this RFC is published, standard. This informational RFC is, when this RFC is published, still
still in IETF draft status, and will stay that way for at least six in IETF draft status.
months in order to gain more implementation experience before it is
published.
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
This specification uses the same words as the Requirement for Internet This specification uses the same words as the Requirement for Internet
Hosts [HOSTS] for defining the significance of each particular Hosts [HOSTS] for defining the significance of each particular
requirement. These words are: requirement. These words are:
MUST This word or the adjective "required" means that the item is MUST This word or the adjective "required" means that the item is
skipping to change at line 200 skipping to change at line 194
item, but the full implications should be understood and the item, but the full implications should be understood and the
case carefully weighed before choosing a different course. case carefully weighed before choosing a different course.
MAY This word or the adjective "optional" means that this item is MAY This word or the adjective "optional" means that this item is
truly optional. One vendor may choose to include the item truly optional. One vendor may choose to include the item
because a particular marketplace requires it or because it because a particular marketplace requires it or because it
enhances the product, for example; another vendor may omit the enhances the product, for example; another vendor may omit the
same item. same item.
An implementation is not compliant if it fails to satisfy one or more An implementation is not compliant if it fails to satisfy one or more
of the MUST requirements for the protocols it implements. An of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all compliant"; one that satisfies all the MUST requirements but not all
the SHOULD requirements for its protocols is said to be "conditionally the SHOULD requirements for its protocols is said to be "conditionally
compliant." compliant."
2.2 Other terminology 2.2 Other terminology
Most of the terms used in this document are defined in other RFCs. Most of the terms used in this document are defined in other RFCs.
Absolute URI, See Relative Uniform Resource Locators [RELURL]. Absolute URI, See Relative Uniform Resource Locators [RELURL].
AbsoluteURI AbsoluteURI
CID See Message/External Body Content-ID [MIDCID]. CID See Message/External Body Content-ID [MIDCID].
Content-Base See section 4.2 below. Content-Base See section 4.3 below.
Content-ID See Message/External Body Content-ID [MIDCID]. Content-ID See Message/External Body Content-ID [MIDCID].
Content-Location MIME message or content part header with the URI of Content-Location MIME message or content part header with the URI of
the MIME message or content part body, defined in the MIME message or content part body, defined in
section 4.3 below. section 4.2 below.
Content-Transfer-Enco Conversion of a text into 7-bit octets as specified Content-Transfer- Conversion of a text into 7-bit octets as specified
ding in [MIME1] chapter 6. Encoding in [MIME1] chapter 6.
CR See [RFC822]. CR See [RFC822].
CRLF See [RFC822]. CRLF See [RFC822].
Displayed text The text shown to the user reading a document with Displayed text The text shown to the user reading a document with
a web browser. This may be different from the HTML a web browser. This may be different from the HTML
markup, see the definition of HTML markup below. markup, see the definition of HTML markup below.
Header Field in a message or content heading specifying Header Field in a message or content heading specifying
skipping to change at line 312 skipping to change at line 305
4. The Content-Location and Content-Base MIME Content Headers 4. The Content-Location and Content-Base MIME Content Headers
4.1 MIME content headers 4.1 MIME content headers
In order to resolve URI references to other body parts, two MIME In order to resolve URI references to other body parts, two MIME
content headers are defined, Content-Location and Content-Base. Both content headers are defined, Content-Location and Content-Base. Both
these headers can occur in any message or content heading, and will these headers can occur in any message or content heading, and will
then be valid within this heading and for its immediate content. then be valid within this heading and for its immediate content.
These two headers are valid only for exactly the content heading or These two headers are valid for the content heading or message heading
message heading where they occur and its text. They are thus not valid where they occur and its text. If they occur in multipart headings,
for the parts inside multipart headings. They are allowed, but cannot they apply to its body parts only in that they can be used to derive a
be used for resolution, when they occur in multipart headings. base for relative URIs in the body parts, but only if no such base is
provided in the body part itself.
These two headers may occur both inside and outside of a These two headers may occur on any message or content heading, but
Multipart/related part, but their usage for handling HTML links between their usage for handling hyperlinks between body parts in a message
body parts in a message SHOULD only occur inside Multipart/related. SHOULD only occur inside the same "multipart/related".
In practice, at present only those URIs which are URLs are used, but it In practice, at present only those URIs which are URLs are used, but it
is anticipated that other forms of URIs will in the future be used. is anticipated that other forms of URIs will in the future be used.
The syntax for these headers is, using the syntax definition tools from The syntax for these headers is, using the syntax definition tools from
[RFC822]: [RFC822]:
content-location ::= "Content-Location:" content-location = "Content-Location:"
( absoluteURI | relativeURI ) ( absoluteURI | relativeURI )
content-base ::= "Content-Base:" absoluteURI content-base = "Content-Base:" absoluteURI
where URI is at present (June 1996) restricted to the syntax for URLs where URI is at present (June 1996) restricted to the syntax for URLs
as defined in Uniform Resource Locators [URL]. as defined in Uniform Resource Locators [URL].
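As an illustration of the syntax above, the following Python sketch splits a header line and rejects a Content-Base that does not carry an absolute URI. The helper name check_location_header is hypothetical, and the presence of a URI scheme is used only as a rough stand-in for "absoluteURI"; neither is defined by this document.

```python
from urllib.parse import urlsplit

def check_location_header(line):
    """Return (name, uri) for a Content-Location or Content-Base line.

    Raises ValueError for any other header, or when Content-Base
    carries a relative URI. Hypothetical helper, illustration only.
    """
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError("not a header line")
    name = name.strip().lower()
    uri = value.strip()
    if name not in ("content-location", "content-base"):
        raise ValueError("unexpected header: " + name)
    # Assumption: a non-empty scheme is treated as "absolute".
    is_absolute = bool(urlsplit(uri).scheme)
    if name == "content-base" and not is_absolute:
        raise ValueError("Content-Base requires an absolute URI")
    return name, uri
```

A relative value such as "fiction1/fiction2" is accepted for Content-Location but rejected for Content-Base.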
4.2 The Content-Base header 4.2 The Content-Location Header
The Content-Base gives a base for relative URIs occurring in other
heading fields and in HTML documents which do not have any BASE element
in its HTML code. Its value MUST be an absolute URI.
Example showing which Content-Base is valid where:
Content-Type: Multipart/related; boundary="boundary-example-1";
type="Text/HTML"; start=<foo2*foo3@bar2.net>
; A Content-Base header is allowed here, but is not valid
; for resolution of relative URL-s in Part 1 and Part 2.
; A Content-Base header here would thus be rather meaningless.
--boundary-example-1
Part 1:
Content-Type: Text/HTML; charset=US-ASCII
Content-ID: <foo2*foo3@bar2.net>
Content-Location: http://www.ietf.cnir.reston.va.us/foo1.bar1
; This Content-Location must contain an absolute URI, since no base
; is valid here. A combination of Content-Base with an absolute
; URL and a Content-Location with a relative URL would also be
; allowed here.
<FRAME NAME=topwindow src="/frames/foo2.bar2">
--boundary-example-1
Part 2:
Content-Type: Text/HTML; charset=US-ASCII
Content-ID: <foo4*foo5@bar2.net>
Content-Location: foo2.bar2 ; The Content-Base below applies to
; this relative URI
Content-Base: http://www.ietf.cnri.reston.va.us/frames/
<A HREF="http://www.ietf.cnir.reston.va.us/foo1.bar1">
To top window </A>
--boundary-example-1--
Note: If there is both a Content-ID and a Content-Location header on
the same body parts, then these will indicate two different, equally
valid references for this body part, and any of them may be used in
other body parts within the Multipart/related to refer to such a body
part.
4.3 The Content-Location Header
The Content-Location header specifies the URI that corresponds to the The Content-Location header specifies the URI that corresponds to the
content of the body part in whose heading the header is placed. Its content of the body part in whose heading the header is placed. Its
value CAN be an absolute or relative URI. Any URI or URL scheme may be value CAN be an absolute or relative URI. Any URI or URL scheme may be
used, but use of non-standardized URI or URL schemes might entail some used, but use of non-standardized URI or URL schemes might entail some
risk that recipients cannot handle them correctly. risk that recipients cannot handle them correctly.
The Content-Location header can be used to indicate that the data sent The Content-Location header can be used to indicate that the data sent
under this heading is also retrievable, in identical format, through under this heading is also retrievable, in identical format, through
normal use of this URI. If used for this purpose, it must contain an normal use of this URI. If used for this purpose, it must contain an
absolute URI or be resolvable, through a Content-Base header, into an absolute URI or be resolvable, through a Content-Base header, into an
absolute URI. In this case, the information sent in the message can be absolute URI. In this case, the information sent in the message can be
seen as a cached version of the original data. seen as a cached version of the original data.
The URI in the Content-Location header may, but need not, refer to an
object which is actually available globally for retrieval using this
URI (after resolution of relative URIs). However, URIs in
Content-Location headers (if absolute, or resolvable to absolute URIs)
SHOULD still be globally unique.
The header can also be used for data which is not available to some or The header can also be used for data which is not available to some or
all recipients of the message, for example if the header refers to an all recipients of the message, for example if the header refers to an
object which is only retrievable using this URI in a restricted domain, object which is only retrievable using this URI in a restricted domain,
such as within a company-internal web space. The header can even such as within a company-internal web space. The header can even
contain a fictitious URI and need in that case not be globally unique. contain a fictitious URI and need in that case not be globally unique.
There MUST only be a single Content-Location header in each message or
content-heading, and its value is a single URI. Note however, that both
one Content-Location and one Content-ID or Message-ID header are
allowed. In such a case, these will indicate two different, equally
valid references for this body part, and any of them may be used in
other body parts within one "multipart/related" to refer to this body
part.
Example: Example:
Content-Type: Multipart/related; boundary="boundary-example-1"; Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML" type="Text/HTML"
--boundary-example-1 --boundary-example-1
Part 1: Part 1:
Content-Type: Text/HTML; charset=US-ASCII Content-Type: Text/HTML; charset=US-ASCII
... ... <IMG SRC="fiction1/fiction2"> ... ... ... ... <IMG SRC="fiction1/fiction2"> ... ...
--boundary-example-1 --boundary-example-1
Part 2: Part 2:
Content-Type: Text/HTML; charset=US-ASCII Content-Type: Text/HTML; charset=US-ASCII
Content-Location: fiction1/fiction2 Content-Location: fiction1/fiction2
--boundary-example-1-- --boundary-example-1--
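The matching shown in the example above can be sketched in Python with the standard email package. The message text below is a cleaned-up variant of the example (the "Part 1:" and "Part 2:" labels are dropped so that it parses as MIME), and the helper name index_by_location is an assumption of this sketch, not something defined by this document.

```python
from email import message_from_string

RAW = """\
MIME-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1";
 type="text/html"

--boundary-example-1
Content-Type: text/html; charset=US-ASCII

<IMG SRC="fiction1/fiction2">
--boundary-example-1
Content-Type: text/html; charset=US-ASCII
Content-Location: fiction1/fiction2

<P>linked part</P>
--boundary-example-1--
"""

def index_by_location(msg):
    """Map each Content-Location value to the body part carrying it."""
    table = {}
    for part in msg.walk():
        location = part.get("Content-Location")
        if location is not None:
            table[location.strip()] = part
    return table

msg = message_from_string(RAW)
parts = index_by_location(msg)
# The hyperlink text from the first part is looked up as-is.
target = parts.get("fiction1/fiction2")
```

Here the fictitious relative URI is compared as a plain string; a fuller implementation would first resolve both sides against a base when one is available.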
4.4 Encoding of URIs in e-mail headers 4.3 The Content-Base header
The Content-Base gives a base for relative URIs occurring in other
fields in the same content heading and in the body text covered by this
content heading, if the text is an HTML document which does not have
any BASE element in its HTML code. Its value MUST be an absolute URI.
The full text of the Content-Base header is used as a base, even if it
does not end in a "/". Thus: "Content-Base: http://foo.bar/" and
"Content-Base: http://foo.bar" are identical.
Example showing which Content-Base is valid where:
Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML"; start=<foo2*foo3@bar2.net>
; A Content-Base header is allowed here, and can be used
; for resolution of relative URL-s in Part 1 and Part 2,
; if these did not have any absolute base of their own.
; However, both part 1 and part 2 below have an absolute
; base, in part 1 through an absolute Content-Location header,
; in part 2 through a Content-Base header, and thus a
; Content-Base up here would not be used for resolution of relative
; URLs within the body parts 1 and 2.
--boundary-example-1
Part 1:
Content-Type: Text/HTML; charset=US-ASCII
Content-ID: <foo2*foo3@bar2.net>
Content-Location: http://www.ietf.cnri.reston.va.us/foo1.bar1
; Since this Content-Location contains an absolute URL, it
; does not need to be resolved using any Content-Base header.
; A combination of a Content-Location with a relative URL
; and a Content-Base with an absolute URL would also be valid,
; as well as only a Content-Location with a relative URL
; and resolved through the Content-Base in the surrounding
; multipart heading.
<FRAME NAME=topwindow src="/frames/foo2.bar2">
--boundary-example-1
Part 2:
Content-Type: Text/HTML; charset=US-ASCII
Content-ID: <foo4*foo5@bar2.net>
Content-Location: foo2.bar2 ; The Content-Base below applies to
; this relative URI
Content-Base: http://www.ietf.cnri.reston.va.us/frames/
<A HREF="http://www.ietf.cnri.reston.va.us/foo1.bar1">
To top window </A>
--boundary-example-1--
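The resolution in this section can be checked with a short Python sketch, assuming that urljoin from the standard library is an acceptable stand-in for the [RELURL] procedure. The relative name "images/logo.gif" is invented for the illustration; the other URIs are taken from the example above.

```python
from urllib.parse import urljoin

# Host-only bases from the text, with and without the trailing "/".
with_slash = urljoin("http://foo.bar/", "images/logo.gif")
without_slash = urljoin("http://foo.bar", "images/logo.gif")

# Part 2: relative Content-Location resolved through its Content-Base.
part2_location = urljoin(
    "http://www.ietf.cnri.reston.va.us/frames/", "foo2.bar2")

# Part 1: the FRAME link resolved against the absolute Content-Location.
frame_target = urljoin(
    "http://www.ietf.cnri.reston.va.us/foo1.bar1", "/frames/foo2.bar2")
```

Both host-only bases give the same result, and the FRAME link in Part 1 resolves to the same absolute URI as the Content-Location of Part 2, so the two parts match. The sketch covers only the host-only case named in the text; with a base whose path is non-empty and does not end in "/", urljoin drops the last path segment before joining.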
4.4 Encoding of URIs in MIME headers
4.4.1 Handling of URIs containing inappropriate characters
Some documents may contain URIs with characters that are inappropriate
for an RFC 822 header, either because the URI itself has an incorrect
syntax according to [URL] or the URI syntax standard has been changed
to allow characters not previously allowed in MIME headers. These URIs
cannot be sent directly in a mail header. There are two approaches that
can be taken when encountering such a URI as the text to be placed in a
Content-Location or Content-Base header:
a) In some situations, an implementation might be able to replace the
URL with one that can be sent directly. This might be accomplished, for
example, by using the encoding method of [URL] to replace inappropriate
characters within the URL with ones encoded using the %nn encoding.
This replacement MUST in that case be done both in the header and in
the HTML text which has a hyperlink which is to match the header. Since
the change is done in both places, a receiving mailer need not decode
it, and MUST NOT decode [URL]-encoding before matching hyperlinks to
body parts.
b) The URL might be encoded using the method described in [MIME3]. This
replacement MUST only be done in the header, not in the HTML text.
Receiving clients must decode the [MIME3] encoding in the heading
before comparing hyperlinks in body text to URLs in Content-Location
headers.
With method (b), the charset parameter value "US-ASCII" SHOULD be used
if the URL contains no octets outside of the 7-bit range. If such
octets are present, the correct charset parameter value (derived e.g.
from information about the HTML document the URL was found in) SHOULD
be used. If this cannot be safely established, the value "UNKNOWN-8BIT"
[RFC 1428] MUST be used.
Note that for the MHTML processing of (matching URLs in body text to
URL in) Content-Location headers the value of the charset parameter is
irrelevant, but it may be relevant for other purposes, and incorrect
labeling MUST therefore be avoided.
Caution should be taken in using method (a), since, in general, this
encoding can not be applied safely to characters that are used for
reserved purposes within the URL scheme. In addition, changing the HTML
body which contains the URL might invalidate a message integrity check.
Because of these problems, this method SHOULD only be used if it is
performed in cooperation with the author/owner of the documents
involved.
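The two approaches can be sketched in Python as follows. The sample URLs and the choice of the ISO-8859-1 charset are assumptions made for the illustration; urllib.parse.quote stands in for the %nn encoding of [URL] and email.header for the encoding of [MIME3].

```python
from email.header import Header, decode_header, make_header
from urllib.parse import quote

# Method (a): %nn encoding, applied identically to header and HTML text.
# ":" and "/" are kept as-is here; other reserved characters would need
# the care that the caution in the text describes.
percent_encoded = quote("http://foo.bar/a b.gif", safe=":/")

# Method (b): [MIME3] encoding, applied to the header value only.
original = "http://foo.bar/bl\u00e5.gif"
encoded = Header(original, "iso-8859-1").encode()

# A receiving client decodes the header before comparing it with the
# hyperlink found in the body text.
decoded = str(make_header(decode_header(encoded)))
matches = (decoded == original)
```

With method (a) no decoding step is needed on receipt, since the same encoded string appears in both places; with method (b) the comparison is made only after decoding.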
4.4.2 Folding of long URIs
Since MIME header fields have a limited length and URIs can get quite Since MIME header fields have a limited length and URIs can get quite
long, these lines may have to be folded. If such folding is done, the long, these lines may have to be folded.
algorithm defined in [URLBODY] section 3.1 should be employed.
Encoding as discussed in clause 4.4.1 must be done before such folding.
After that, the folding can be done, using the algorithm defined in
[URLBODY] section 3.1.
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
Relative URIs inside contents of MIME body parts are resolved relative Relative URIs inside contents of MIME body parts are resolved relative
to a base URI. In order to determine this base URI, the to a base URI using the methods for resolving relative URIs described
first-applicable method in the following list applies. in [RELURL]. In order to determine this base URI, the first-applicable
method in the following list applies.
(a) There is a base specification inside the MIME body part (a) There is a base specification inside the MIME body part containing
containing the link which resolves relative URIs into absolute the link which resolves relative URIs into absolute URIs. For
URIs. For example, HTML provides the BASE element for this. example, HTML provides the BASE element for this.
(b) There is a Content-Base header (as defined in section 4.2), in (b) There is a Content-Base header (as defined in section 4.3), in the
the immediately surrounding content heading, specifying the base immediately surrounding content heading, specifying the base to be
to be used. used.
(c) There is a Content-Location header in the immediately (c) There is a Content-Location header in the immediately surrounding
surrounding heading of the body part which can then serve as the heading of the body part which contains an absolute URI and can
base in the same way as the requested URI can serve as a base then serve as the base in the same way as the requested URI can
for relative URIs within a file retrieved via HTTP [HTTP]. serve as a base for relative URIs within a file retrieved via HTTP
[HTTP].
When the methods above do not yield an absolute URI the procedure in (d) Steps (b) and (c) can be repeated recursively on Content-Base and
section 8.2 for matching relative URIs MUST be followed. Content-Location headers in surrounding multi-part headings.
However, a base from an absolute Content-Location in an inner
heading takes precedence over a base from a Content-Base or a
Content-Location in a surrounding heading.
When the methods above do not yield an absolute URI, matching of two
relative URIs against each other can still be done for matches within a
multipart/related. This matching is done as if they had been given as
base an imaginary URL "This_message:/", which exists for the sole
purpose of resolving relative references within a multipart entity.
This is also described in other words in section 8.2 below.
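The first-applicable rule of the revised list (a) to (d) can be sketched in Python as below. The function names and the representation of headings as dictionaries are assumptions of this sketch. Two simplifications are made: a relative Content-Location is never used as a base, and when no absolute base is found the link is returned unchanged rather than resolved against the imaginary "This_message:/" URL.

```python
from urllib.parse import urljoin, urlsplit

def is_absolute(uri):
    """Rough test: a URI with a scheme is treated as absolute."""
    return bool(uri) and bool(urlsplit(uri).scheme)

def find_base(body_base, headings):
    """Pick a base URI using the first applicable method.

    body_base: base given inside the body part (e.g. HTML BASE) or None.
    headings:  list of dicts, innermost content heading first, each with
               optional "Content-Base" and "Content-Location" keys.
    Returns an absolute base URI, or None when no method applies.
    """
    if is_absolute(body_base):                 # method (a)
        return body_base
    for heading in headings:                   # (b), (c), then (d) outward
        base = heading.get("Content-Base")
        if is_absolute(base):
            return base
        location = heading.get("Content-Location")
        if is_absolute(location):
            return location
    return None

def resolve(link, body_base, headings):
    """Resolve a link, or return it unchanged if no base is available."""
    base = find_base(body_base, headings)
    if base is None:
        return link
    return urljoin(base, link)
```

Because the headings are walked from the innermost outward, an absolute Content-Location in an inner heading is found before any Content-Base or Content-Location in a surrounding heading, as step (d) requires.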
6. Sending documents without linked objects 6. Sending documents without linked objects
If a document, such as an HTML object, is sent without other objects, If a document, such as an HTML object, is sent without other objects,
to which it is linked, it MAY be sent as a Text/HTML body part by to which it is linked, it MAY be sent as a Text/HTML body part by
itself. In this case, multipart/related need not be used. itself. In this case, "multipart/related" need not be used.
Such a document may either not include any links, or contain links Such a document may either not include any links, or contain links
which the recipient resolves via ordinary net look up, or contain links which the recipient resolves via ordinary net look up, or contain links
which the recipient cannot resolve. which the recipient cannot resolve.
Inclusion of links which the recipient has to look up through the net Inclusion of links which the recipient has to look up through the net
may not work for some recipients, since not all e-mail recipients may not work for some recipients, since not all e-mail recipients
have full internet connectivity. Also, such links may work for the have full internet connectivity. Also, such links may work for the
sender but not for the recipient, for example when the link refers to sender but not for the recipient, for example when the link refers to
a URI within a company-internal network not accessible from outside a URI within a company-internal network not accessible from outside
the company. the company.
Note that documents with links that the recipient cannot resolve MAY be Note that documents with links that the recipient cannot resolve MAY be
sent, although this is discouraged. For example, two persons developing sent, although this is discouraged. For example, two persons developing
a new HTML page may exchange incomplete versions. a new HTML page may exchange incomplete versions.
7. Use of the Content-Type: Multipart/related 7. Use of the Content-Type: "multipart/related"
If a message contains one or more MIME body parts containing links and If a message contains one or more MIME body parts containing links and
also contains as separate body parts, data, to which these links (as also contains as separate body parts, data, to which these links (as
defined, for example, in HTML 2.0 [HTML2]) refer, then this whole set defined, for example, in HTML 2.0 [HTML2]) refer, then this whole set
of body parts (referring body parts and referred-to body parts) SHOULD of body parts (referring body parts and referred-to body parts) SHOULD
be sent within a multipart/related body part as defined in [REL]. be sent within a "multipart/related" body part as defined in [REL].
The root body part of the multipart/related SHOULD be the start object Even though Content-Location and Content-Base can occur without
for rendering the object, such as a text/html object, and which multipart/related, this standard only covers their use for resolution
of links between body parts inside one multipart/related. This standard
does not cover links from one multipart/related to another
multipart/related in a message containing multiple multipart/related
objects.
The root body part of the "multipart/related" SHOULD be the start
object for rendering the object, such as a text/html object, and which
contains links to objects in other body parts, or a contains links to objects in other body parts, or a
multipart/alternative of which at least one alternative resolves to multipart/alternative of which at least one alternative resolves to
such a start object. Implementors are warned, however, that many mail such a start object. Implementors are warned, however, that some mail
programs treat multipart/alternative as if it had been multipart/mixed programs treat multipart/alternative as if it had been multipart/mixed
(even though MIME [MIME1] requires support for multipart/alternative). (even though MIME [MIME1] requires support for multipart/alternative).
[REL] specifies that the type attribute is mandatory in Content-Type: [REL] specifies that the type attribute is mandatory in Content-Type:
Multipart/related" headers, and requires that the this attribute be the "multipart/related" headers, and requires that this attribute be the
type of the root object, and this value shall thus for example be type of the root object, and this value shall thus for example be
"multipart/alternative", if the root part is of Content-type "multipart/alternative", if the root part is of Content-type
"multipart/alternative", even if one of the subparts of the "multipart/alternative", even if one of the subparts of the
"multipart/alternative" is of type "text/html". If the root is not the "multipart/alternative" is of type "text/html". If the root is not the
first body part within the multipart/related, [REL] further requires first body part within the "multipart/related", [REL] further requires
that its Content-ID MUST be given in a start parameter to the that its Content-ID MUST be given in a start parameter to the
"Content-Type: Multipart/related" header. "Content-Type: "multipart/related" header.
When presenting the root body part to the user, the additional body When presenting the root body part to the user, the additional body
parts within the multipart/related can be used: parts within the "multipart/related" can be used:
(a) For those recipients who only have e-mail but not full (a) For those recipients who only have e-mail but not full Internet
Internet access. access.
(b) For those recipients who for other reasons, such as firewalls (b) For those recipients who for other reasons, such as firewalls or
or the use of company-internal links, cannot retrieve the the use of company-internal links, cannot retrieve the linked body
linked body parts through the net. parts through the net.
Note that this means that you can, via e-mail, send HTML which Note that this means that you can, via e-mail, send HTML which
includes URIs which the recipient cannot resolve via HTTP or includes URIs which the recipient cannot resolve via HTTP or other
other connectivity-requiring URIs. connectivity-requiring URIs.
(c) For items which are not available on the web. (c) To send a document in a format which is preserved even if the
object to which the hyperlinks refer through HTTP is later changed
or deleted.
(d) For any recipient to speed up access. (d) For items which are not available on the web.
The type parameter of the "Content-Type: Multipart/related" MUST be the (e) For any recipient to speed up access.
same as the Content-Type of its root.
The type parameter of the "Content-Type: "multipart/related" MUST be
the same as the Content-Type of its root.
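As an illustration of the requirements in this section, the following is a minimal sketch of how a sending agent might assemble such a message using the Python standard "email" package. The function name build_related and its arguments are hypothetical and are not part of this standard; the sketch assumes the root is the first body part, so no start parameter is needed.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_related(html, gif_bytes, location):
    """Assemble a multipart/related whose root is a text/html part.

    Hypothetical helper for illustration only.
    """
    # Root body part: the start object for rendering.
    root = MIMEText(html, "html", "us-ascii")

    # Referred-to body part, labelled with the URI used in the HTML.
    image = MIMEImage(gif_bytes, "gif")  # base64-encoded by default
    image["Content-Location"] = location

    # The type parameter must equal the Content-Type of the root.
    message = MIMEMultipart("related", type=root.get_content_type())
    message.attach(root)   # root is first, so "start" can be omitted
    message.attach(image)
    return message
```

If the root were not the first body part, the Content-ID of the root would additionally have to be given in a start parameter, which this sketch does not do.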
When a sending MUA sends objects which were retrieved from the WWW, it When a sending MUA sends objects which were retrieved from the WWW, it
SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into
some other URI form prior to transmitting them. This will allow the some other URI form prior to transmitting them. This will allow the
receiving MUA to both verify MICs included with the email message, as receiving MUA to both verify MICs included with the email message, as
well as verify the documents against their WWW counterparts. well as verify the documents against their WWW counterparts.
In certain special cases this will not work if the original HTML In certain special cases this will not work if the original HTML
document contains URIs as parameters to objects and applets. In such a document contains URIs as parameters to objects and applets. In such a
case, it might be better to rewrite the document before sending it. case, it might be better to rewrite the document before sending it.
This problem is discussed in more detail in the informational RFC which This problem is discussed in more detail in the informational RFC which
will be published as a supplement to this standard. will be published as a supplement to this standard.
This standard does not cover the case where a multipart/related This standard does not cover the case where a "multipart/related"
contains links to MIME body parts outside of the current contains links to MIME body parts outside of the current
multipart/related or in other MIME messages, even if methods similar to "multipart/related" or in other MIME messages, even if methods similar
those described in this standard are used. Implementors who provide to those described in this standard are used. Implementors who provide
such links are warned that mailers implementing this standard may not such links are warned that mailers implementing this standard may not
be able to resolve such links. be able to resolve such links.
Within such a multipart/related, ALL different parts MUST have Within a "multipart/related", ALL different parts MUST have different
different Content-ID values or Content-Location headers which resolve Content-ID values or Content-Location headers which resolve to
to different URLs. different URLs.
8. Format of Links to Other Body Parts Two body parts in the same multipart/related can have the same relative
URI as value of their Content-Location headers only if they
have different Content-Base headers, so that the absolute
URIs after resolution against the Content-Base headers are different.
8. Usage of Links to Other Body Parts
8.1 General principle 8.1 General principle
A body part, such as a text/HTML body part, may contain hyperlinks to A body part, such as a text/HTML body part, may contain hyperlinks to
objects which are included as other body parts in the same message and objects which are included as other body parts in the same message and
within the same multipart/related content. Often such linked objects within the same "multipart/related" content. Often such linked objects
are meant to be displayed inline to the reader of the main document; are meant to be displayed inline to the reader of the main document;
for example, objects referenced with the IMG tag in HTML 2.0 [HTML2]. for example, objects referenced with the IMG tag in HTML 2.0 [HTML2].
New tags with this property are proposed in the ongoing development of New tags with this property are proposed in the ongoing development of
HTML (example: applet, frame). HTML (example: applet, frame).
In order to send such messages, there is a need to indicate which other In order to send such messages, there is a need to indicate which other
body parts are referred to by the links in the body parts containing body parts are referred to by the links in the body parts containing
such links. For example, a body part of Content-Type: Text/HTML often such links. For example, a body part of Content-Type: Text/HTML often
has links to other objects, which might be included in other body parts has links to other objects, which might be included in other body parts
in the same MIME message. The referencing of other body parts is done in the same MIME message.
in the following way: For each body part containing links and each
distinct URI within it, which refers to data which is sent in the same
MIME message, there SHOULD be a separate body part within the current
multipart/related part of the message containing this data. Each such
body part SHOULD contain a Content-Location header (see section 8.2) or
a Content-ID header (see section 8.3).
An e-mail system which claims conformance to this standard MUST support
receipt of multipart/related (as defined in section 7) with links
between body parts using both the Content-Location (as defined in
section 8.2) and the Content-ID method (as defined in section 8.3).
8.2 Use of the Content-Location header
8.2.1 Matching of URL-s which can be resolved to absolute URL-s 8.2 Resolution of hyperlinks in text/HTML body parts
If there is a Content-Base header, then the recipient MUST employ The resolution of hyperlinks in text/HTML body parts is performed in
relative to absolute resolution as defined in Relative Uniform Resource the following way:
Locators [RELURL] of relative URIs in both the HTML markup and the
Content-Location header before matching a hyperlink in the HTML markup
to a Content-Location header. The same applies if the Content-Location
contains an absolute URI, or if the HTML markup contains a <BASE>
element so that relative URIs in the HTML markup can be resolved.
<BASE> elements inside HTML markup MUST not be used to resolve URI-s in
the Content-Heading which contains this HTML markup.
8.2.2 Matching of URL-s which cannot be resolved to absolute URL-s (a) Unfold multiple-line header values according to [URLBODY]. Do NOT
however translate character encodings of the kind described in [URL].
Example: Do not transform "a%2eb/c%20d" into "a.b/c d".
If there is NO Content-Base header, and the Content-Location header (b) Remove all MIME encodings, such as content-transfer encoding and
contains a relative URI, then NO relative to absolute resolution SHOULD header encodings as defined in MIME part 3 [MIME3]. Do NOT however
be performed. Matching the relative URI in the Content-Location header translate character encodings of the kind described in [URL]. Example:
to a hyperlink in an HTML markup text is in this case a two step Do not transform "a%2eb/c%20d" into "a.b/c d".
process. First remove any LWSP from the relative URI which may have
been introduced as described in section 4.4. Then perform an exact
textual match against the HTML URIs. For this matching process, ignore
any <BASE> element in the HTML markup. By "exact textual match" means
case sensitive matching and no resolution of encodings like
"file%20name" to "file name". (Note that the string "file name" is an
illegal URL, since unquoted spaces are not allowed in URLs.)
Note: If there are two body parts, one with a base, one with only a (c) Try to resolve all relative URIs in the HTML content and in Content-
relative URL and no base, then one of them cannot refer to the other, Location headers using the procedure described in chapter 5 above. The
since a non-resolved relative URI cannot match an absolute URI. result of this resolution can be an absolute URI, or a fictitious
absolute URI with the base "This_message:/" as specified in chapter 5.
8.2.3 Must the URL refer to an existing WWW object? (d) For each hyperlink in any HTML body, compare the value of the
hyperlink after resolution as described in (a) to (c), with the URI
derived from Content-ID and Content-Location headers for other body
parts within the same Multipart/related. If the strings are identical,
octet by octet, then this hyperlink is resolved by the body part with
the same URI. This comparison will only succeed if the two URIs are
identical. This means that if one of the two URIs to be compared was a
fictitious absolute URI with the base "This_message:/", the other must
also be such a fictitious absolute URI, and not resolvable to a real
absolute URI.
The URI in the Content-Location header may, but need not refer to an (e) If (d) fails, try to resolve the hyperlink through ordinary
object which is actually available globally for retrieval using this Internet lookup. Resolution of hyperlinks of the URL-types "mid" or
URI (after resolution of relative URIs). However, URI-s in "cid" to other content-parts, outside multipart/related, or in other
Content-Location headers (if absolute, or resolvable to absolute URIs) separately sent messages, is not covered by this standard, and is thus
SHOULD still be globally unique. neither encouraged nor forbidden.
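The matching of a hyperlink against the body parts of one multipart/related can be sketched roughly as follows. This is an illustration under stated assumptions, not a normative algorithm: the names to_absolute and find_part and the dictionary keys are hypothetical, body parts are represented as plain dictionaries, and the fictitious base "This_message:/" is simply prefixed to relative URIs for which no base is known.

```python
from urllib.parse import urljoin, urlsplit

FICTITIOUS_BASE = "This_message:/"


def to_absolute(uri, base=None):
    """Resolve a URI to absolute form without decoding %-escapes."""
    # Remove white space that header folding may have introduced.
    uri = "".join(uri.split())
    if urlsplit(uri).scheme:
        return uri                      # already absolute
    if base:
        return urljoin(base, uri)       # relative to a known base
    # Assumption: no base known, so prefix the fictitious base.
    return FICTITIOUS_BASE + uri


def find_part(link, link_base, parts):
    """Return the body part that resolves a hyperlink, or None.

    Each part is a dict with optional keys "content_id",
    "content_location" and "content_base" (hypothetical layout).
    """
    target = to_absolute(link, link_base)
    for part in parts:
        content_id = part.get("content_id")
        if content_id and "cid:" + content_id.strip("<>") == target:
            return part
        location = part.get("content_location")
        if location is not None:
            candidate = to_absolute(location, part.get("content_base"))
            if candidate == target:     # exact string comparison
                return part
    return None                         # fall back to Internet lookup
```

With the headers of example 9.3, the link "/images/ietflogo.gif" with base "http://www.ietf.cnri.reston.va.us" and the Content-Location "ietflogo.gif" with base "http://www.ietf.cnri.reston.va.us/images/" resolve to the same absolute URI and therefore match. A link that only resolves against the fictitious base never matches a part whose URI resolves against a real base.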
8.3 Use of the Content-ID header and CID URLs 8.3 Use of the Content-ID header and CID URLs
When CID (Content-ID) URLs as defined in [URL] and [MIDCID] are used When CID (Content-ID) URLs as defined in [URL] and [MIDCID] are used
for links between body parts, the Content-Location statement will for links between body parts, the Content-ID header MUST be used
normally be replaced by a Content-ID header. Thus, the following two instead of the Content-Location header. Thus, even though the following
headers are identical in meaning: two headers are identical in meaning, only the Content-ID variant MUST
be used, and all "Content-Location: CID:" should be ignored.
Content-ID: <foo@bar.net> Content-ID: <foo@bar.net>
Content-Location: CID: foo@bar.net Content-Location: CID: foo@bar.net
Note: Content-IDs MUST be globally unique [MIME1]. It is thus not Note: Content-IDs MUST be globally unique [MIME1]. It is thus not
permitted to make them unique only within this message or within this permitted to make them unique only within this message or within this
multipart/related. "multipart/related".
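The relation between a Content-ID header and the CID URL that refers to it can be illustrated with the following sketch. The function names are hypothetical, headers are assumed to be available as a plain dictionary, and the URL escaping that [MIDCID] may require for unusual characters in a Content-ID is deliberately left out.

```python
def cid_url(content_id):
    """Map a Content-ID header value to the corresponding CID URL."""
    return "cid:" + content_id.strip().strip("<>")


def part_identifiers(headers):
    """List the URIs under which a body part can be referenced.

    headers is a dict of header name to value (assumed layout).
    """
    identifiers = []
    if "Content-ID" in headers:
        identifiers.append(cid_url(headers["Content-ID"]))
    location = headers.get("Content-Location", "").strip()
    # A "Content-Location: CID:" header is ignored.
    if location and not location.lower().startswith("cid:"):
        identifiers.append(location)
    return identifiers
```

Applied to the image body part of the CID example in section 9, this yields only "cid:foo4*foo1@bar.net", since the Content-Location header carrying a CID URL is disregarded.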
8.4 Conformance requirement on receipt
An e-mail system which claims conformance to this standard MUST support
receipt of "multipart/related" (as defined in section 7) with links
between body parts using both the Content-Location (as defined in
section 8.2) and the Content-ID method (as defined in section 8.3).
9. Examples 9. Examples
Warning: If there is a contradiction between the explanatory text and
the examples in this standard, then the explanatory text, not the
examples, is normative.
9.1 Example of a HTML body without included linked objects 9.1 Example of a HTML body without included linked objects
The first example is the simplest form of an HTML email message. This The first example is the simplest form of an HTML email message. This
is not an aggregate HTML object, but simply a message with a single is not an aggregate HTML object, but simply a message with a single
HTML body part. This message contains a hyperlink but does not provide HTML body part. This message contains a hyperlink but does not provide
the ability to resolve the hyperlink. To resolve the hyperlink the the ability to resolve the hyperlink. To resolve the hyperlink the
receiving client would need either IP access to the Internet, or an receiving client would need either IP access to the Internet, or an
electronic mail web gateway. electronic mail web gateway.
From: foo1@bar.net From: foo1@bar.net
skipping to change at line 656 skipping to change at line 747
An example of an HTML message.<p> An example of an HTML message.<p>
Try clicking <a href="http://www.resnova.com/">here.</a><p> Try clicking <a href="http://www.resnova.com/">here.</a><p>
</body></HTML> </body></HTML>
9.2 Example with absolute URIs to an embedded GIF picture 9.2 Example with absolute URIs to an embedded GIF picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: Multipart/related; boundary="boundary-example-1"; Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML"; start=<foo3*foo1@bar.net> type="Text/HTML"; start=<foo3*foo1@bar.net>
--boundary-example-1 --boundary-example-1
Content-Type: Text/HTML;charset=US-ASCII Content-Type: Text/HTML;charset=US-ASCII
Content-ID: <foo3*foo1@bar.net> Content-ID: <foo3*foo1@bar.net>
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as: to the other body part, for example through a statement such as:
<IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif" <IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif"
ALT="IETF logo"> ALT="IETF logo">
--boundary-example-1 --boundary-example-1
Content-Location: Content-Location:
http://www.ietf.cnri.reston.va.us/images/ietflogo.gif http://www.ietf.cnri.reston.va.us/images/ietflogo.gif
Content-Type: IMAGE/GIF Content-Type: "IMAGE/GIF"
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example-1--
9.3 Example with relative URIs to an embedded GIF picture 9.3 Example with relative URIs to an embedded GIF picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: Multipart/related; boundary="boundary-example-1"; Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML" type="Text/HTML"
--boundary-example-1 --boundary-example-1
Content-Base: http://www.ietf.cnri.reston.va.us Content-Base: http://www.ietf.cnri.reston.va.us
Content-Type: Text/HTML; charset=ISO-8859-1 Content-Type: Text/HTML; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as: to the other body part, for example through a statement such as:
<IMG SRC="/images/ietflogo.gif" ALT="IETF logo"> <IMG SRC="/images/ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9 Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169; Example of a copyright sign mapped onto HTML markup: &#169;
--boundary-example-1 --boundary-example-1
Content-Base: http://www.ietf.cnri.reston.va.us/images/ Content-Base: http://www.ietf.cnri.reston.va.us/images/
Content-Location: ietflogo.gif Content-Location: ietflogo.gif
Content-Type: IMAGE/GIF Content-Type: "IMAGE/GIF"
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example-1--
9.4 Example using CID URL and Content-ID header to an embedded GIF 9.4 Example with relative URIs and no BASE available
From: foo1@bar.net
To: foo2@bar.net
Subject: A simple example
Mime-Version: 1.0
Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML"
--boundary-example-1
Content-Type: Text/HTML; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as:
<IMG SRC="ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169;
--boundary-example-1
Content-Location: ietflogo.gif
Content-Type: "IMAGE/GIF"
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc...
--boundary-example-1--
9.5 Example using a BASE on the Multipart
From: foo1@bar.net
To: foo2@bar.net
Subject: A simple example
Mime-Version: 1.0
Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML"
Content-Base: http://www.ietf.cnri.reston.va.us/
--boundary-example-1
Content-Type: Text/HTML; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as:
<IMG SRC="ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169;
--boundary-example-1
Content-Location: http://www.ietf.cnri.reston.va.us/ietflogo.gif
Content-Type: "IMAGE/GIF"
Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc...
--boundary-example-1--
9.6 Example using CID URL and Content-ID header to an embedded GIF
picture picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: Multipart/related; boundary="boundary-example-1"; Content-Type: "multipart/related"; boundary="boundary-example-1";
type="Text/HTML" type="Text/HTML"
--boundary-example-1 --boundary-example-1
Content-Type: Text/HTML; charset=US-ASCII Content-Type: Text/HTML; charset=US-ASCII
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as: to the other body part, for example through a statement such as:
<IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo"> <IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo">
--boundary-example-1 --boundary-example-1
Content-Location: CID:something@else ; this header is disregarded
Content-ID: <foo4*foo1@bar.net> Content-ID: <foo4*foo1@bar.net>
Content-Type: IMAGE/GIF Content-Type: "IMAGE/GIF"
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example-1--
10. Content-Disposition header 10. Content-Disposition header
Note the specification in [REL] on the relations between Note the specification in [REL] on the relations between
Content-Disposition and multipart/related. Content-Disposition and "multipart/related".
11. Character encoding issues and end-of-line issues 11. Character encoding issues and end-of-line issues
For the encoding of characters in HTML documents and other text For the encoding of characters in HTML documents and other text
documents into a MIME-compatible octet stream, the following mechanisms documents into a MIME-compatible octet stream, the following mechanisms
are relevant: are relevant:
- HTML [HTML2], [HTML-I18N] as an application of SGML [SGML] allows - HTML [HTML2], [HTML-I18N] as an application of SGML [SGML] allows
characters to be denoted by character entities as well as by numeric characters to be denoted by character entities as well as by numeric
character references (e.g. "Latin small letter a with acute accent" character references (e.g. "Latin small letter a with acute accent" may
may be represented by "&aacute;" or "&#225;") in the HTML markup. be represented by "&aacute;" or "&#225;") in the HTML markup.
- HTML documents, in common with other documents of the MIME - HTML documents, in common with other documents of the MIME
"Content-Type text", can be represented in MIME using one of several "Content-Type text", can be represented in MIME using one of several
character encodings. The MIME Content-Type "charset" parameter value character encodings. The MIME Content-Type "charset" parameter value
indicates the particular encoding used. For the exact meaning and indicates the particular encoding used. For the exact meaning and use
use of the "charset" parameter, please see [MIME2] chapter 4. of the "charset" parameter, please see [MIME2] chapter 4.
Note that the "charset" parameter refers only to the MIME character Note that the "charset" parameter refers only to the MIME
encoding. For example, the string "&aacute;" can be sent in MIME character encoding. For example, the string "&aacute;" can be sent in
with "charset=US-ASCII", while the raw character "Latin small letter MIME with "charset=US-ASCII", while the raw character "Latin small
a with acute accent" cannot. letter a with acute accent" cannot.
The above mechanisms are well defined and documented, and therefore not The above mechanisms are well defined and documented, and therefore not
further explained here. In sending a message, all the above mentioned further explained here. In sending a message, all the above mentioned
mechanisms MAY be used, and any mixture of them MAY occur when sending mechanisms MAY be used, and any mixture of them MAY occur when sending
the document via e-mail. Receiving mail user agents (together with any the document via e-mail. Receiving mail user agents (together with any
Web browser they may use to display the document) MUST be capable of Web browser they may use to display the document) MUST be capable of
handling any combinations of these mechanisms. handling any combinations of these mechanisms.
Also note that: Also note that:
- Any documents including HTML documents that contain octet values - Any documents including HTML documents that contain octet values
outside the 7-bit range need a content-transfer-encoding applied outside the 7-bit range need a content-transfer-encoding applied before
before transmission over certain transport protocols [MIME1, chapter transmission over certain transport protocols [MIME1, chapter 5].
5].
- The MIME standard [MIME2] requires that documents of "Content-Type: - The MIME standard [MIME2] requires that documents of
Text" MUST be in canonical form before Content-Transfer-Encoding, "Content-Type: Text" MUST be in canonical form before
i.e. that line breaks are encoded as CRLFs, not as bare CRs or bare Content-Transfer-Encoding, i.e. that line breaks are encoded as CRLFs,
LFs or something else. This is in contrast to [HTTP] where section not as bare CRs or bare LFs or something else. This is in contrast to
3.6.1 allows other representations of line breaks. [HTTP] where section 3.6.1 allows other representations of line breaks.
Note that this might cause problems with integrity checks based on Note that this might cause problems with integrity checks based on
checksums, which might not be preserved when moving a document from the checksums, which might not be preserved when moving a document from the
HTTP to the MIME environment. If a document has to be converted in such HTTP to the MIME environment. If a document has to be converted in such
a way that a checksum integrity check becomes invalid, then this a way that a checksum integrity check becomes invalid, then this
integrity check header SHOULD be removed from the document. integrity check header SHOULD be removed from the document.
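The effect of canonicalization on a checksum can be illustrated with the following sketch. The function name to_canonical_crlf is hypothetical, and SHA-256 is used only as a stand-in for whatever integrity check a document may carry.

```python
import hashlib
import re


def to_canonical_crlf(text):
    """Encode every line break (CRLF, bare CR, bare LF) as CRLF."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


def digest(text):
    """Hex digest of the US-ASCII octets of the text (illustration)."""
    return hashlib.sha256(text.encode("us-ascii")).hexdigest()


# A document as it might be stored on a server, with bare LFs.
original = "<HTML>\n<body>\nAn example.\n</body></HTML>\n"
canonical = to_canonical_crlf(original)

# The octets differ after conversion, so a checksum computed over
# the original no longer matches the canonical form.
checksum_still_valid = digest(original) == digest(canonical)
```

Here checksum_still_valid is False, which is the situation in which the integrity check header should be removed. Text that already uses CRLF throughout is left unchanged by the conversion.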
Other sources of problems are Content-Encoding used in HTTP but not Other sources of problems are Content-Encoding used in HTTP but not
allowed in MIME, and charsets that are not able to represent line allowed in MIME, and charsets that are not able to represent line
breaks as CRLF. A good overview of the differences between HTTP and breaks as CRLF. A good overview of the differences between HTTP and
skipping to change at line 858 skipping to change at line 1010
would be exceedingly dangerous for a receiving User Agent to execute would be exceedingly dangerous for a receiving User Agent to execute
content received through a mail message without careful attention to content received through a mail message without careful attention to
restrictions on the capabilities of that executable content. restrictions on the capabilities of that executable content.
Some WWW applications hide passwords and tickets (access tokens to Some WWW applications hide passwords and tickets (access tokens to
information which may not be available to anyone) and other sensitive information which may not be available to anyone) and other sensitive
information in hidden fields in the web documents or in on-the-fly information in hidden fields in the web documents or in on-the-fly
constructed URLs. If a person gets such a document, and forwards it via constructed URLs. If a person gets such a document, and forwards it via
e-mail, the person may inadvertently disclose sensitive information. e-mail, the person may inadvertently disclose sensitive information.
13. Robustness Principle 13. Differences as compared to the previous version of this proposed
standard in RFC 2110
The Internet Hosts requirements [HOSTS] section 1.2.2 states the very
important Internet Standards Robustness Principle:
"Be liberal in what you accept, and
conservative in what you send"
This principle is of special importance when working with HTML, since
accepted practice is that HTML readers should accept all kinds of
faulty or illegal HTML codes and make the best possible use of them.
Here is a (not complete) list of ways in which this principle SHOULD be
implemented as applied to this standard.
13.1 Content of the "type" parameter to Content-Type:
Multipart/related
What you send: Always include the "type" parameter in the "Content-
type: Multipart/related" header, and always make it identical to the
Content-type of the root as specified in RFC 2112.
What you accept: Regard the "type" parameter only as a hint, whose
value may be wrong. Also accept input where this parameter is omitted.
13.2 Quoting of the "type" parameter to Content-Type:
Multipart/related
What you send: Always quote this parameter if it contains any of the
characters "(" / ")" / "<" / ">" / "@" /, "," / ";" / ":" / "\" / <">
"/" / "[" / "]" / "?" / "=" as required by [MIME1] section 5.1.
What you accept: Accept this parameter, even if it contains these
characters without quoting.
13.3 Quoting of the "start" parameter to Content-Type:
Multipart/related and the value of the Message-ID and Content-ID header
What you send: Always surround the Message-ID in the Message-ID and
Content-ID value and in the start parameter of Content-Type
Multipart/related with "<" and ">" as specified in [REL] and [RFC822].
What you accept: Accept these values without surrounding "<" ">", and In order to agree with [RELURL], Content-Base headers in multipart
treat them as if they had been surrounded by angle brackets. Content-Headings can now be used to resolve relative URLs in their
component parts, but only if no base URL can be derived from the
component part itself. Base URLs in inner headings, both in Content-
Base and Content-Location headers, have precedence over base URLs in
outer multipart headings.
13.4 Content-Base and Content-Location on Multipart Content headings Specification added that a Content-Heading cannot contain more than one
Content-Location header.
What you send: Do not use the Content-Base or the Content-Location A section 4.4.1 has been added, specifying how to handle the case of
header on a Multipart/related if you expect that this Content-Base or sending a body part whose URI does not agree with the correct URI
Content-Location is to be used for any URI resolution. These headers syntax.
are meant to convey information only for this particular body part,
not for its subparts, and thus cannot be used for resolution of URLs
inside the subparts of the multipart.
What you accept: If a message you receive has such a Content-Base or The handling of relative and absolute URIs for matching between body
Content-Location, and lacks this information on a subpart, so that you parts has been merged into a single description, by specifying that
cannot resolve URIs in the subpart, you might try to use the Content- relative URIs which cannot be resolved otherwise should be handled as
Base and Content-Location to resolve URIs in the subpart. if they had been given imaginary URL "This_message:/".
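A receiver that falls back to the headers of an enclosing multipart, as discussed above, might choose its base URL along the following lines. This is only a sketch: the names pick_base and resolve are hypothetical, headings are modelled as plain dictionaries ordered from the innermost body part to the outermost multipart, and preferring Content-Base over Content-Location within one heading is an assumption of the example. When no base is found at all, the sketch simply returns the URI unchanged.

```python
from urllib.parse import urljoin

def pick_base(headings):
    """headings: list of dicts, innermost body part first, outermost multipart last."""
    for heading in headings:
        # Assumption: Content-Base is preferred over Content-Location in one heading.
        for name in ("Content-Base", "Content-Location"):
            base = heading.get(name)
            if base:
                return base
    return None

def resolve(uri, headings):
    """Resolve uri against the nearest base; leave it unchanged if none is found."""
    base = pick_base(headings)
    return urljoin(base, uri) if base else uri

inner = {}
outer = {"Content-Base": "http://www.example.com/docs/"}
resolved = resolve("images/logo.gif", [inner, outer])
```

Because the inner heading is searched first, a base URL given on the body part itself always wins over one given on the enclosing multipart.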
14. Acknowledgments 14. Acknowledgments
Harald T. Alvestrand, Richard Baker, Isaac Chan, Dave Crocker, Harald T. Alvestrand, Richard Baker, Isaac Chan, Dave Crocker, Martin
Martin J. Duerst, Lewis Geer, Roy Fielding, Al Gilman, Paul Hoffman, J. Duerst, Lewis Geer, Roy Fielding, Ned Freed, Al Gilman, Paul
Andy Jacobs, Richard W. Jesmajian, Mark K. Joseph, Greg Herlihy, Hoffman, Andy Jacobs, Richard W. Jesmajian, Mark K. Joseph, Greg
Valdis Kletnieks, Daniel LaLiberte, Ed Levinson, Jay Levitt, Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed Levinson, Jay Levitt,
Albert Lunde, Larry Masinter, Keith Moore, Gavin Nicol, Pete Resnick, Albert Lunde, Larry Masinter, Keith Moore, Gavin Nicol, Martyn W. Peck,
Jon Smirl, Einar Stefferud, Jamie Zawinski, Steve Zilles and several Pete Resnick, Nick Shelness, Jon Smirl, Einar Stefferud, Jamie
other people have helped us with preparing this document. I alone Zawinski, Steve Zilles and several other people have helped us with
take responsibility for any errors which may still be in the document. preparing this document. I alone take responsibility for any errors
which may still be in the document.
15. References 15. References
Ref. Author, title Ref. Author, title
--------- -------------------------------------------------------- --------- --------------------------------------------------------
[CONDISP] R. Troost, S. Dorner: "Communicating Presentation [CONDISP] R. Troost, S. Dorner: "Communicating Presentation
Information in Internet Messages: The Information in Internet Messages: The
Content-Disposition Header", RFC 1806, June 1995. Content-Disposition Header", RFC 1806, June 1995.
skipping to change at line 953 skipping to change at line 1069
[HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language [HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language
- 2.0", RFC 1866, November 1995. - 2.0", RFC 1866, November 1995.
[HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext [HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext
Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996. Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996.
[MD5] R. Rivest: "The MD5 Message-Digest Algorithm", RFC 1321, [MD5] R. Rivest: "The MD5 Message-Digest Algorithm", RFC 1321,
April 1992. April 1992.
[MIDCID] E. Levinson: "Message/External-Body Content-ID and [MIDCID] E. Levinson: "Message/External-Body Content-ID and
Message-ID Uniform Resource Locators", RFC 2111, Message-ID Uniform Resource Locators", RFC 2111,
February 1997. February 1997.
%%% This must be replaced by a reference to the new IETF
draft which replaces RFC 2111 %%%
[MIME1] N. Freed, N. Borenstein, "Multipurpose Internet Mail [MIME1] N. Freed, N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, December 1996. Bodies", RFC 2045, December 1996.
[MIME-IMB] N. Freed & N. Borenstein: "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.
[MIME2] N. Freed, N. Borenstein, "Multipurpose Internet Mail [MIME2] N. Freed, N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046, Extensions (MIME) Part Two: Media Types", RFC 2046,
December 1996. December 1996.
[MIME3] K. Moore, "MIME (Multipurpose Internet Mail Extensions) [MIME3] K. Moore, "MIME (Multipurpose Internet Mail Extensions)
Part Three: Message Header Extensions for Non-ASCII Part Three: Message Header Extensions for Non-ASCII
Text", RFC 2047, December 1996. Text", RFC 2047, December 1996.
[MIME1] N. Borenstein & N. Freed: "MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and
Describing the Format of Internet Message Bodies",
RFC 1521, Sept 1993.
[MIME4] N. Freed, J. Klensin, J. Postel, "Multipurpose Internet [MIME4] N. Freed, J. Klensin, J. Postel, "Multipurpose Internet
Mail Extensions (MIME) Part Four: Registration Mail Extensions (MIME) Part Four: Registration
Procedures", RFC 2048, January 1997. Procedures", RFC 2048, January 1997.
[MIME5] "Multipurpose Internet Mail Extensions (MIME) Part Five: [MIME5] "Multipurpose Internet Mail Extensions (MIME) Part Five:
Conformance Criteria and Examples", RFC 2049, December Conformance Criteria and Examples", RFC 2049, December
1996. 1996.
[NEWS] M.R. Horton, R. Adams: "Standard for interchange of [NEWS] M.R. Horton, R. Adams: "Standard for interchange of
USENET messages", RFC 1036, December 1987. USENET messages", RFC 1036, December 1987.
[PDF] Tim Bienz and Richard Cohn: "Portable Document Format [PDF] Tim Bienz and Richard Cohn: "Portable Document Format
Reference Manual", Addison-Wesley, Reading, MA, USA, Reference Manual", Addison-Wesley, Reading, MA, USA,
1993, ISBN 0-201-62628-4. 1993, ISBN 0-201-62628-4.
[REL] Edward Levinson: "The MIME Multipart/Related Content- [REL] Edward Levinson: "The MIME
Type", RFC 2112, February 1997. Multipart/Related Content-Type", RFC
2112, February 1997.
%%% This must be replaced by a reference to the new IETF
draft which replaces RFC 2112 %%%
[RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC [RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC
1808, June 1995. 1808, June 1995.
[RFC822] D. Crocker: "Standard for the format of ARPA Internet [RFC822] D. Crocker: "Standard for the format of ARPA Internet
text messages." STD 11, RFC 822, August 1982. text messages." STD 11, RFC 822, August 1982.
[SGML] ISO 8879. Information Processing -- Text and Office - [SGML] ISO 8879. Information Processing -- Text and Office -
Standard Generalized Markup Language (SGML), Standard Generalized Markup Language (SGML),
1986. <URL:http://www.iso.ch/cate/d16387.html> 1986. <URL:http://www.iso.ch/cate/d16387.html>
skipping to change at line 1031 skipping to change at line 1165
Alex Hopmann E-mail: alexhop@microsoft.com Alex Hopmann E-mail: alexhop@microsoft.com
Microsoft Corporation Microsoft Corporation
3590 North First Street 3590 North First Street
Suite 300 Suite 300
San Jose San Jose
CA 95134 CA 95134
Working group chairman: Working group chairman:
Einar Stefferud <stef@nma.com> Einar Stefferud <stef@nma.com>
 End of changes. 87 change blocks. 
293 lines changed or deleted 427 lines changed or added
