draft-ietf-mhtml-spec-03.txt   draft-ietf-mhtml-spec-04.txt 
Network Working Group Jacob Palme Network Working Group Jacob Palme
Internet Draft Stockholm University/KTH Internet Draft Stockholm University/KTH
draft-ietf-mhtml-spec-03.txt Alexander Hopmann draft-ietf-mhtml-spec-04.txt Alexander Hopmann
Category-to-be: Proposed standard ResNova Software, Inc. Category-to-be: Proposed standard ResNova Software, Inc.
Expires: February 1997 August 1996 Expires: March 1997 October 1996
MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML) MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)
Status of this Document Status of this Document
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working its working groups. Note that other groups may also distribute working
documents as Internet-Drafts. documents as Internet-Drafts.
skipping to change at page 1, line 39 skipping to change at page 1, line 39
Although HTML [RFC 1866] was designed within the context of MIME, more Although HTML [RFC 1866] was designed within the context of MIME, more
than the specification of HTML as defined in RFC 1866 is needed for two than the specification of HTML as defined in RFC 1866 is needed for two
electronic mail user agents to be able to interoperate using HTML as a electronic mail user agents to be able to interoperate using HTML as a
document format. These issues include the naming of objects that are document format. These issues include the naming of objects that are
normally referred to by URIs, and the means of aggregating objects that normally referred to by URIs, and the means of aggregating objects that
go together. This document describes a set of guidelines that will allow go together. This document describes a set of guidelines that will allow
conforming mail user agents to be able to send, deliver and display conforming mail user agents to be able to send, deliver and display
these objects, such as HTML objects, that can contain links represented these objects, such as HTML objects, that can contain links represented
by URIs. In order to be able to handle inter-linked objects, the by URIs. In order to be able to handle inter-linked objects, the
document proposes to use the MIME type multipart/related and specifies document uses the MIME type multipart/related and specifies the MIME
the MIME content-headers "Content-Location" and "Content-Base". content-headers "Content-Location" and "Content-Base".
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
2.2 Other terminology 2.2 Other terminology
4. The Content-Location and Content-Base MIME Content Headers 4. The Content-Location and Content-Base MIME Content Headers
4.1 MIME content headers 4.1 MIME content headers
4.2 The Content-Base header 4.2 The Content-Base header
skipping to change at page 3, line 9 skipping to change at page 1, line 99
Comments on less important details may also be sent to the editor, Jacob Comments on less important details may also be sent to the editor, Jacob
Palme <jpalme@dsv.su.se>. Palme <jpalme@dsv.su.se>.
More information may also be available at URL: More information may also be available at URL:
HTTP://www.dsv.su.se/~jpalme/ietf/jp-ietf-home.HTML HTTP://www.dsv.su.se/~jpalme/ietf/jp-ietf-home.HTML
1. Introduction 1. Introduction
There are a number of document formats, HTML [HTML2], PDF [PDF] and VRML There are a number of document formats, HTML [HTML2], PDF [PDF] and VRML
for example, which provide links using URIs for their resolution. There for example, which provide links using URIs for their resolution. There
is an obvious need to be able to send documents in these formats in e- is an obvious need to be able to send documents in these formats in
mail [RFC821=SMTP, RFC822]. This document gives additional e-mail [RFC821=SMTP, RFC822]. This document gives additional
specifications on how to send such documents in MIME [RFC 1521=MIME1] e- specifications on how to send such documents in MIME [RFC 1521=MIME1]
mail messages. This version of this standard was based on full e-mail messages. This version of this standard was based on full
consideration only of the needs for objects with links in the Text/HTML consideration only of the needs for objects with links in the Text/HTML
media type (as defined in RFC 1866 [HTML2]), but the standard may still media type (as defined in RFC 1866 [HTML2]), but the standard may still
be applicable also to other formats for sets of interlinked objects, be applicable also to other formats for sets of interlinked objects,
linked by URIs. There is no conformance requirement that implementations linked by URIs. There is no conformance requirement that implementations
claiming conformance to this standard are able to handle URI-s in other claiming conformance to this standard are able to handle URI-s in other
document formats than HTML. document formats than HTML.
URIs in documents in HTML and other similar formats reference other URIs in documents in HTML and other similar formats reference other
objects and resources, either embedded or directly accessible through objects and resources, either embedded or directly accessible through
hypertext links. When mailing such a document, it is often desirable to hypertext links. When mailing such a document, it is often desirable to
also mail all of the additional resources that are referenced in it; also mail all of the additional resources that are referenced in it;
those elements are necessary for the complete interpretation of the those elements are necessary for the complete interpretation of the
primary object. primary object.
An alternative way for sending an HTML document or other object An alternative way for sending an HTML document or other object
containing URIs in e-mail is to only send the URL, and let the recipient containing URIs in e-mail is to only send the URL, and let the recipient
look up the document using HTTP. That method is described in [URLBODY] look up the document using HTTP. That method is described in [URLBODY]
and is not described in this document. and is not described in this document.
An informational RFC [MHTML-INFO] will be published as a supplement to
this standard. The informational RFC will discuss implementation methods
and some implementation problems. Implementors are recommended to read
this informational RFC when developing implementations of the MHTML
standard.
2. Terminology 2. Terminology
2.1 Conformance requirement terminology 2.1 Conformance requirement terminology
This specification uses the same words as RFC 1123 [HOSTS] for defining This specification uses the same words as RFC 1123 [HOSTS] for defining
the significance of each particular requirement. These words are: the significance of each particular requirement. These words are:
MUST This word or the adjective "required" means that the item is MUST This word or the adjective "required" means that the item is
an absolute requirement of the specification. an absolute requirement of the specification.
skipping to change at page 4, line 25 skipping to change at page 1, line 173
CID See [MIDCID]. CID See [MIDCID].
Content-Base See section 4.2 below. Content-Base See section 4.2 below.
Content-ID See [MIDCID]. Content-ID See [MIDCID].
Content-Location MIME message or content part header with the URI of Content-Location MIME message or content part header with the URI of
the MIME message or content part body, defined in the MIME message or content part body, defined in
section 4.3 below. section 4.3 below.
Content-Transfer-Encoding Conversion of a text into 7-bit octets as specified Content-Transfer-Encoding Conversion of a text into 7-bit octets as specified
in [MIME1]. in [MIME1].
CR See [RFC822]. CR See [RFC822].
CRLF See [RFC822]. CRLF See [RFC822].
Displayed text The text shown to the user reading a document with Displayed text The text shown to the user reading a document with
a web browser. This may be different from the HTML a web browser. This may be different from the HTML
markup, see the definition of HTML markup below. markup, see the definition of HTML markup below.
Header Field in a message or content heading specifying Header Field in a message or content heading specifying
skipping to change at page 4, line 57 skipping to change at page 1, line 205
HTML markup A file containing HTML encodings as specified in HTML markup A file containing HTML encodings as specified in
[HTML] which may be different from the displayed [HTML] which may be different from the displayed
text which a person using a web browser sees. For text which a person using a web browser sees. For
example, the HTML markup may contain "&lt;" where example, the HTML markup may contain "&lt;" where
the displayed text contains the character "<". the displayed text contains the character "<".
LF See [RFC822]. LF See [RFC822].
MIC Message Integrity Codes, codes used to verify that a MIC Message Integrity Codes, codes used to verify that a
message has not been illegally modified. message has not been modified.
MIME See RFC 1521 [MIME1], [MIME2]. MIME See RFC 1521 [MIME1], [MIME2].
MUA Messaging User Agent. MUA Messaging User Agent.
PDF Portable Document Format, see [PDF]. PDF Portable Document Format, see [PDF].
Relative URI, See RFC 1866 [HTML2] and RFC 1808 [RELURL]. Relative URI, See RFC 1866 [HTML2] and RFC 1808 [RELURL].
RelativeURI RelativeURI
skipping to change at page 6, line 30 skipping to change at page 1, line 284
message heading where they occur and its text. They are thus not valid message heading where they occur and its text. They are thus not valid
for the parts inside multipart headings, and are thus meaningless in for the parts inside multipart headings, and are thus meaningless in
multipart headings. multipart headings.
These two headers may occur both inside and outside of a These two headers may occur both inside and outside of a
multipart/related part. multipart/related part.
4.2 The Content-Base header 4.2 The Content-Base header
The Content-Base gives a base for relative URIs occurring in other The Content-Base gives a base for relative URIs occurring in other
heading fields and in content which do not have any BASE element in its heading fields and in HTML documents which do not have any BASE element
HTML code. Its value MUST be an absolute URI. in its HTML code. Its value MUST be an absolute URI.
Example showing which Content-Base is valid where: Example showing which Content-Base is valid where:
Content-Type: Multipart/related; boundary="boundary-example-1"; Content-Type: Multipart/related; boundary="boundary-example-1";
type=Text/HTML; start=foo2*foo3@bar2.net type=Text/HTML; start=foo2*foo3@bar2.net
; A Content-Base header cannot be placed here, since this is a ; A Content-Base header cannot be placed here, since this is a
; multipart MIME object. ; multipart MIME object.
--boundary-example-1 --boundary-example-1
skipping to change at page 7, line 55 skipping to change at page 1, line 364
4.4 Encoding of URIs in e-mail headers 4.4 Encoding of URIs in e-mail headers
Since MIME header fields have a limited length and URIs can get quite Since MIME header fields have a limited length and URIs can get quite
long, these lines may have to be folded. If such folding is done, the long, these lines may have to be folded. If such folding is done, the
algorithm defined in [URLBODY] section 3.1 should be employed. algorithm defined in [URLBODY] section 3.1 should be employed.
5. Base URIs for resolution of relative URIs 5. Base URIs for resolution of relative URIs
Relative URIs inside contents of MIME body parts are resolved relative Relative URIs inside contents of MIME body parts are resolved relative
to a base URI. In order to determine this base URI, the first-listed to a base URI. In order to determine this base URI, the first-applicable
method in the following list applies. method in the following list applies.
(a) There is a base specification inside the MIME body part (a) There is a base specification inside the MIME body part
containing the link which resolves relative URIs into absolute containing the link which resolves relative URIs into absolute
URIs. For example, HTML provides the BASE element for this. URIs. For example, HTML provides the BASE element for this.
(b) There is a Content-Base header (as defined in section 4.2), (b) There is a Content-Base header (as defined in section 4.2),
specifying the base to be used. specifying the base to be used.
(c) There is a Content-Location header in the heading of the body (c) There is a Content-Location header in the heading of the body
part which can then serve as the base in the same way as the part which can then serve as the base in the same way as the
request URI can serve as a base for relative URIs within a file requested URI can serve as a base for relative URIs within a
retrieved via HTTP [HTTP]. file retrieved via HTTP [HTTP].
When the methods above do not yield an absolute URI the procedure in When the methods above do not yield an absolute URI the procedure in
section 8.2 for matching relative URIs MUST be followed. section 8.2 for matching relative URIs MUST be followed.
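The first-applicable rule above might be sketched as follows. This is a non-normative illustration in Python. The names base_uri, is_absolute and html_base are hypothetical and not part of this specification; html_base stands for the value of a BASE element that the caller has already extracted from the body part. The sketch assumes that the Content-Base and Content-Location headers are read from the heading of the body part itself, and that a candidate which is not an absolute URI is skipped.

```python
from email import message_from_string
from urllib.parse import urljoin, urlsplit


def is_absolute(uri):
    """Return True if the URI carries a scheme such as http."""
    return bool(urlsplit(uri).scheme)


def base_uri(part, html_base=None):
    """Return the base URI for a body part, or None if no method applies.

    part      -- an email.message.Message for the body part
    html_base -- value of a BASE element found in the body (assumption:
                 extracted by the caller), or None
    """
    candidates = [
        html_base,                     # (a) base specification in the body
        part.get("Content-Base"),      # (b) Content-Base header
        part.get("Content-Location"),  # (c) Content-Location header
    ]
    for value in candidates:
        if value:
            uri = value.strip().strip('"')
            if is_absolute(uri):
                return uri
    # No absolute base found: section 8.2 matching applies instead.
    return None


# Usage with the headers of example 9.3.
part = message_from_string(
    'Content-Base: "http://www.ietf.cnri.reston.va.us"\n'
    "Content-Type: Text/HTML\n"
    "\n"
    '<IMG SRC="/images/ietflogo.gif" ALT="IETF logo">\n'
)
print(urljoin(base_uri(part), "/images/ietflogo.gif"))
```

Returning None rather than raising an error keeps the fallback to the matching procedure of section 8.2 in the hands of the caller.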
6. Sending documents without linked objects 6. Sending documents without linked objects
If a document, such as an HTML object, is sent without other objects, to If a document, such as an HTML object, is sent without other objects, to
which it is linked, it MAY be sent as a Text/HTML body part by itself. which it is linked, it MAY be sent as a Text/HTML body part by itself.
In this case, multipart/related need not be used. In this case, multipart/related need not be used.
skipping to change at page 9, line 6 skipping to change at page 1, line 420
The root body part of the multipart/related SHOULD be the start object The root body part of the multipart/related SHOULD be the start object
for rendering the object, such as a text/html object, and which contains for rendering the object, such as a text/html object, and which contains
links to objects in other body parts, or a multipart/alternative of links to objects in other body parts, or a multipart/alternative of
which at least one alternative resolves to such a start object. which at least one alternative resolves to such a start object.
Implementors are warned, however, that many mail programs treat Implementors are warned, however, that many mail programs treat
multipart/alternative as if it had been multipart/mixed (even though multipart/alternative as if it had been multipart/mixed (even though
MIME [MIME1] requires support for multipart/alternative). MIME [MIME1] requires support for multipart/alternative).
[REL] requires that the type attribute of the "Content-Type: [REL] requires that the type attribute of the "Content-Type:
multipart/related" statement be the type of the root object, and this Multipart/related" statement be the type of the root object, and this
value can thus be "multipart/alternative". If the root is not the first value can thus be "multipart/alternative". If the root is not the first
body part within the multipart/related, [REL] further requires that its body part within the multipart/related, [REL] further requires that its
Content-ID MUST be given in a start parameter to the "Content-Type: Content-ID MUST be given in a start parameter to the "Content-Type:
multipart/related" header. Multipart/related" header.
When presenting the root body part to the user, the additional body When presenting the root body part to the user, the additional body
parts within the multipart/related can be used: parts within the multipart/related can be used:
(a) For those recipients who only have e-mail but not full Internet (a) For those recipients who only have e-mail but not full Internet
access. access.
(b) For those recipients who for other reasons, such as firewalls (b) For those recipients who for other reasons, such as firewalls
or the use of company-internal links, cannot retrieve the or the use of company-internal links, cannot retrieve the
linked body parts through the net. linked body parts through the net.
Note that this means that you can, via e-mail, send HTML which Note that this means that you can, via e-mail, send HTML which
includes URIs which the recipient cannot resolve via HTTP or includes URIs which the recipient cannot resolve via HTTP or
other connectivity-requiring URIs. other connectivity-requiring URIs.
(c) For items which are not available on the web. (c) For items which are not available on the web.
(d) For any recipient to speed up access. (d) For any recipient to speed up access.
The type parameter of the Content-Type: multipart/related MUST be the The type parameter of the "Content-Type: Multipart/related" MUST be the
same as the Content-Type of its root. same as the Content-Type of its root.
When a sending MUA sends objects which were retrieved from the WWW, it When a sending MUA sends objects which were retrieved from the WWW, it
SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs into
some other URI form prior to transmitting them. This will allow the some other URI form prior to transmitting them. This will allow the
receiving MUA to both verify MICs included with the email message, as receiving MUA to both verify MICs included with the email message, as
well as verify the documents against their WWW counterparts. well as verify the documents against their WWW counterparts.
In certain special cases this will not work if the original HTML
document contains URIs as parameters to objects and applets. In such a
case, it might be better to rewrite the document before sending it. This
problem is discussed in more detail in the informational RFC which will
be published as a supplement to this standard.
This standard does not cover the case where a multipart/related contains This standard does not cover the case where a multipart/related contains
links to MIME body parts outside of the current multipart/related or in links to MIME body parts outside of the current multipart/related or in
other MIME messages, even if methods similar to those described in this other MIME messages, even if methods similar to those described in this
standard are used. Implementors who provide such links are warned that standard are used. Implementors who provide such links are warned that
mailers implementing this standard may not be able to resolve such mailers implementing this standard may not be able to resolve such
links. links.
Within such a multipart/related, ALL different parts MUST have different Within such a multipart/related, ALL different parts MUST have different
Content-Location or Content-ID values. Content-Location or Content-ID values.
skipping to change at page 10, line 37 skipping to change at page 1, line 510
If there is a Content-Base header, then the recipient MUST employ If there is a Content-Base header, then the recipient MUST employ
relative to absolute resolution as defined in RFC 1808 [RELURL] of relative to absolute resolution as defined in RFC 1808 [RELURL] of
relative URIs in both the HTML markup and the Content-Location header relative URIs in both the HTML markup and the Content-Location header
before matching a hyperlink in the HTML markup to a Content-Location before matching a hyperlink in the HTML markup to a Content-Location
header. The same applies if the Content-Location contains an absolute header. The same applies if the Content-Location contains an absolute
URI, and the HTML markup contains a BASE element so that relative URIs URI, and the HTML markup contains a BASE element so that relative URIs
in the HTML markup can be resolved. in the HTML markup can be resolved.
If there is NO Content-Base header, and the Content-Location header If there is NO Content-Base header, and the Content-Location header
contains a relative URI, then NO relative to absolute resolution SHOULD contains a relative URI, then NO relative to absolute resolution SHOULD
be performed when matching Content-Location headers (even if there is a
BASE specification, such as the BASE element in HTML, in the body part
containing the URI), and exact textual match of the relative URI-s in
the Content-Location and the HTML markup is performed instead (after
removal of LWSP introduced as described in section 4.4 above). Note that
this only applies for matching Content-Location headers, not for URL-s
in the HTML document which are resolved through network look up at read
time.
If there is NO Content-Base header, and the Content-Location header
contains a relative URI, then NO relative to absolute resolution SHOULD
be performed. Matching the relative URI in the Content-Location header be performed. Matching the relative URI in the Content-Location header
to a hyperlink in an HTML markup text is in this case a two-step to a hyperlink in an HTML markup text is in this case a two-step
process. First remove any LWSP from the relative URI which may have been process. First remove any LWSP from the relative URI which may have been
introduced as described in section 4.4. Then perform an exact textual introduced as described in section 4.4. Then perform an exact textual
match against the HTML URIs. For this matching process, ignore BASE match against the HTML URIs. For this matching process, ignore BASE
specifications, such as the BASE element in HTML. Note that this only specifications, such as the BASE element in HTML. Note that this only
applies for matching Content-Location headers, not for URL-s in the HTML applies for matching Content-Location headers, not for URL-s in the HTML
document which are resolved through network look up at read time. document which are resolved through network look up at read time.
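The two matching cases above can be illustrated by a short, non-normative Python sketch. The function names strip_lwsp and location_matches are hypothetical. The sketch assumes the caller has already determined whether a Content-Base applies and passes it as content_base (None if there is none). With a base, both URIs are resolved to absolute form before comparison; without one, only folding white space is removed and the strings are compared exactly, ignoring any BASE element.

```python
import re
from urllib.parse import urljoin


def strip_lwsp(uri):
    """Remove linear white space that header folding may have introduced."""
    return re.sub(r"[ \t\r\n]+", "", uri)


def location_matches(href, content_location, content_base=None):
    """Return True if a hyperlink in the HTML markup refers to the body
    part carrying the given Content-Location header value."""
    location = strip_lwsp(content_location)
    if content_base is not None:
        # Content-Base present: resolve both sides, then compare.
        return urljoin(content_base, href) == urljoin(content_base, location)
    # No Content-Base: exact textual match, no resolution performed.
    return href == location


# A folded relative Content-Location still matches the same relative href.
print(location_matches("/images/ietflogo.gif", "/images/\r\n ietflogo.gif"))
```

The exact textual match is deliberately strict: "images/ietflogo.gif" and "/images/ietflogo.gif" are treated as different URIs when no base is available.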
The URI in the Content-Location header need not refer to an object which The URI in the Content-Location header need not refer to an object which
is actually available globally for retrieval using this URI (after is actually available globally for retrieval using this URI (after
resolution of relative URIs). However, URI-s in Content-Location headers resolution of relative URIs). However, URI-s in Content-Location headers
(if absolute, or resolvable to absolute URIs) SHOULD still be globally (if absolute, or resolvable to absolute URIs) SHOULD still be globally
unique. unique.
8.3 Use of the Content-ID header and CID URLs 8.3 Use of the Content-ID header and CID URLs
When CID (Content-ID) URLs as defined in RFC 1738 [URL] and RFC 1873 When CID (Content-ID) URLs as defined in RFC 1738 [URL] and RFC 1873
[MIDCID] is used for links between body parts, the Content-Location [MIDCID] are used for links between body parts, the Content-Location
statement will normally be replaced by a Content-ID header. Thus, the statement will normally be replaced by a Content-ID header. Thus, the
following two headers are identical in meaning: following two headers are identical in meaning:
Content-ID: foo@bar.net Content-ID: foo@bar.net
Content-Location: CID: foo@bar.net Content-Location: CID: foo@bar.net
Note: Content-IDs MUST be globally unique [MIME1]. It is thus not Note: Content-IDs MUST be globally unique [MIME1]. It is thus not
permitted to make them unique only within this message or within this permitted to make them unique only within this message or within this
multipart/related. multipart/related.
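As a non-normative illustration, a receiving MUA might relate a CID URL in the HTML markup to a Content-ID header as sketched below in Python. The function names cid_target and content_id_key are hypothetical. The sketch assumes that the scheme name is compared case-insensitively, that optional angle brackets around the Content-ID value are ignored, and that no URL escaping needs to be undone.

```python
def cid_target(url):
    """Return the Content-ID named by a CID URL, or None for other URLs."""
    scheme, sep, rest = url.partition(":")
    if sep and scheme.strip().lower() == "cid":
        return rest.strip()
    return None


def content_id_key(header_value):
    """Normalise a Content-ID header value for comparison (assumption:
    surrounding white space and angle brackets are not significant)."""
    return header_value.strip().strip("<>").strip()


def cid_matches(url, content_id_header):
    """Return True if the CID URL refers to the given Content-ID."""
    target = cid_target(url)
    return target is not None and target == content_id_key(content_id_header)


# The link and header forms used in example 9.4.
print(cid_matches("cid:foo4*foo1@bar.net", "foo4*foo1@bar.net"))
```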
skipping to change at page 11, line 56 skipping to change at page 1, line 570
An example of an HTML message.<p> An example of an HTML message.<p>
Try clicking <a href="http://www.resnova.com/">here.</a><p> Try clicking <a href="http://www.resnova.com/">here.</a><p>
</body></HTML> </body></HTML>
9.2 Example with absolute URIs to an embedded GIF picture: 9.2 Example with absolute URIs to an embedded GIF picture:
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: Multipart/related; boundary="boundary-example-1";
type=Text/HTML; start=foo3*foo1@bar.net type=Text/HTML; start=foo3*foo1@bar.net
--boundary-example-1 --boundary-example-1
Content-Type: Text/HTML;charset=US-ASCII Content-Type: Text/HTML;charset=US-ASCII
Content-ID: foo3*foo1@bar.net Content-ID: foo3*foo1@bar.net
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as: to the other body part, for example through a statement such as:
<IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif" <IMG SRC="http://www.ietf.cnri.reston.va.us/images/ietflogo.gif"
ALT="IETF logo"> ALT="IETF logo">
skipping to change at page 12, line 33 skipping to change at page 1, line 601
--boundary-example-1-- --boundary-example-1--
9.3 Example with relative URIs to an embedded GIF picture 9.3 Example with relative URIs to an embedded GIF picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Base: "http://www.ietf.cnri.reston.va.us" Content-Base: "http://www.ietf.cnri.reston.va.us"
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: Multipart/related; boundary="boundary-example-1";
type=Text/HTML type=Text/HTML
--boundary-example-1 --boundary-example-1
Content-Type: Text/HTML; charset=ISO-8859-1 Content-Type: Text/HTML; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as: to the other body part, for example through a statement such as:
<IMG SRC="/images/ietflogo.gif" ALT="IETF logo"> <IMG SRC="/images/ietflogo.gif" ALT="IETF logo">
Example of a copyright sign encoded with Quoted-Printable: =A9 Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#169; Example of a copyright sign mapped onto HTML markup: &#169;
skipping to change at page 13, line 12 skipping to change at page 1, line 631
--boundary-example-1-- --boundary-example-1--
9.4 Example using CID URL and Content-ID header to an embedded GIF 9.4 Example using CID URL and Content-ID header to an embedded GIF
picture picture
From: foo1@bar.net From: foo1@bar.net
To: foo2@bar.net To: foo2@bar.net
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1"; Content-Type: Multipart/related; boundary="boundary-example-1";
type=Text/HTML type=Text/HTML
--boundary-example-1 --boundary-example-1
Content-Type: Text/HTML; charset=US-ASCII Content-Type: Text/HTML; charset=US-ASCII
... text of the HTML document, which might contain a hyperlink ... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as: to the other body part, for example through a statement such as:
<IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo"> <IMG SRC="cid:foo4*foo1@bar.net" ALT="IETF logo">
--boundary-example-1 --boundary-example-1
skipping to change at page 13, line 35 skipping to change at page 1, line 654
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example-1-- --boundary-example-1--
10. Content-Disposition header 10. Content-Disposition header
Note the specification in [REL] on the relations between Content- Note the specification in [REL] on the relations between
Disposition and multipart/related. Content-Disposition and multipart/related.
11. Character encoding issues and end-of-line issues 11. Character encoding issues and end-of-line issues
For the encoding of characters in HTML documents and other text For the encoding of characters in HTML documents and other text
documents into a MIME-compatible octet stream, the following mechanisms documents into a MIME-compatible octet stream, the following mechanisms
are relevant: are relevant:
- HTML [HTML2, HTML-I18N] as an application of SGML [SGML] allows - HTML [HTML2, HTML-I18N] as an application of SGML [SGML] allows
characters to be denoted by character entities as well as by numeric characters to be denoted by character entities as well as by numeric
character references (e.g. "Latin small letter a with acute accent" character references (e.g. "Latin small letter a with acute accent"
may be represented by "&aacute;" or "&#225;") in the HTML markup. may be represented by "&aacute;" or "&#225;") in the HTML markup.
- HTML documents, in common with other documents of the MIME content-type - HTML documents, in common with other documents of the MIME "Content-Type
"text", can be represented in MIME using one of several character text", can be represented in MIME using one of several character
encodings. The MIME content-type "charset" parameter value indicates encodings. The MIME Content-Type "charset" parameter value indicates
the particular encoding used. For the exact meaning and use of the the particular encoding used. For the exact meaning and use of the
"charset" parameter, please see [MIME-IMB section 4.2]. "charset" parameter, please see [MIME-IMB section 4.2].
Note that the "charset" parameter refers only to the MIME character Note that the "charset" parameter refers only to the MIME character
encoding. For example, the string "&aacute;" can be sent in MIME with encoding. For example, the string "&aacute;" can be sent in MIME with
"charset=US-ASCII", while the raw character "Latin small letter a with "charset=US-ASCII", while the raw character "Latin small letter a with
acute accent" cannot. acute accent" cannot.
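The distinction between the HTML-level and the MIME-level mechanism can be shown with a small, non-normative Python sketch. The function name fits_charset is hypothetical, and the Python codec names "us-ascii" and "iso-8859-1" stand in for the corresponding MIME charset values.

```python
import html


def fits_charset(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


entity = "&aacute;"          # HTML markup, pure ASCII characters
raw = html.unescape(entity)  # the displayed character itself

print(fits_charset(entity, "us-ascii"))
print(fits_charset(raw, "us-ascii"))
print(fits_charset(raw, "iso-8859-1"))
```

The entity form survives any charset that covers US-ASCII, whereas the raw character needs a charset, such as ISO-8859-1, that actually contains it.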
The above mechanisms are well defined and documented, and therefore not The above mechanisms are well defined and documented, and therefore not
further explained here. In sending a message, all the above mentioned further explained here. In sending a message, all the above mentioned
mechanisms MAY be used, and any mixture of them MAY occur when sending mechanisms MAY be used, and any mixture of them MAY occur when sending
the document via e-mail. Receiving mail user agents (together with the the document via e-mail. Receiving mail user agents (together with any
Web browser they may use to display the document) MUST be capable of Web browser they may use to display the document) MUST be capable of
handling any combinations of these mechanisms. handling any combinations of these mechanisms.
Also note that: Also note that:
- Any documents including HTML documents that contain octet values outside - Any documents including HTML documents that contain octet values outside
the 7-bit range need a content-transfer-encoding applied before the 7-bit range need a content-transfer-encoding applied before
transmission over certain transport protocols [MIME1, chapter 5]. transmission over certain transport protocols [MIME1, chapter 5].
- The MIME standard [MIME1] requires that documents of "Content-Type:text" - The MIME standard [MIME1] requires that documents of "Content-Type: Text"
MUST be in canonical form before Content-Transfer-Encoding, i.e. that MUST be in canonical form before Content-Transfer-Encoding, i.e. that
line breaks are encoded as CRLFs, not as bare CRs or bare LFs or line breaks are encoded as CRLFs, not as bare CRs or bare LFs or
something else. This is in contrast to [HTTP] where section 3.6.1 something else. This is in contrast to [HTTP] where section 3.6.1
allows other representations of line breaks. allows other representations of line breaks.
Note that this might cause problems with integrity checks based on Note that this might cause problems with integrity checks based on
checksums, which might not be preserved when moving a document from the checksums, which might not be preserved when moving a document from the
HTTP to the MIME environment. If a document has to be converted in such HTTP to the MIME environment. If a document has to be converted in such
a way that a checksum integrity check becomes invalid, then this a way that a checksum integrity check becomes invalid, then this
integrity check header SHOULD be removed from the document. integrity check header SHOULD be removed from the document.
Other sources of problems are "Content-Encoding" used in HTTP but not Other sources of problems are Content-Encoding used in HTTP but not
allowed in MIME, and "charsets that are not able to represent line allowed in MIME, and charsets that are not able to represent line breaks
breaks as CRLF. A good overview of the differences between HTTP and MIME as CRLF. A good overview of the differences between HTTP and MIME with
with regards to Content-Type:text" can be found in [HTTP], appendix C. regards to "Content-Type: Text" can be found in [HTTP], appendix C.
If the original document has line breaks in the canonical form (CRLF), If the original document has line breaks in the canonical form (CRLF),
then the document SHOULD remain unconverted so that integrity check sums then the document SHOULD remain unconverted so that integrity check sums
are not invalidated. are not invalidated.
A provider of HTML documents who wants his documents to be transferable A provider of HTML documents who wants his documents to be transferable
via both HTTP and SMTP without invalidating checksum integrity checks, via both HTTP and SMTP without invalidating checksum integrity checks,
should always provide original documents in the canonical form with CRLF should always provide original documents in the canonical form with CRLF
for line breaks. for line breaks.
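The effect of line break conversion on checksums can be sketched as follows. This is an informal Python illustration, not a required implementation. The helper names to_canonical_crlf and md5_hex are hypothetical, and MD5 is used only as one example of a checksum; the text above does not prescribe a particular algorithm.

```python
import hashlib
import re

def to_canonical_crlf(data: bytes) -> bytes:
    """Rewrite CRLF, bare CR and bare LF line breaks as CRLF."""
    # "\r\n" is listed first so an existing CRLF is matched as one unit
    # and is not expanded into two line breaks.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)

def md5_hex(data: bytes) -> str:
    """Hex MD5 digest, standing in for any checksum-based integrity check."""
    return hashlib.md5(data).hexdigest()

# A document stored with bare LF line breaks, as HTTP permits.
original = b"<html>\n<body>Hello</body>\n</html>\n"
canonical = to_canonical_crlf(original)

# Conversion changes the octets, so a checksum over the original is invalidated.
checksum_still_valid = md5_hex(original) == md5_hex(canonical)

# A document already in canonical form is left untouched by the conversion.
already_canonical = to_canonical_crlf(canonical) == canonical
```

A document that is supplied in canonical form from the start passes through unchanged, so its checksum stays valid in both the HTTP and the MIME environment.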
Some transport mechanisms may specify a default "charset" parameter if Some transport mechanisms may specify a default "charset" parameter if
none is supplied [HTTP, MIME1]. Because the default differs for none is supplied [HTTP, MIME1]. Because the default differs for
different mechanisms, when HTML is transferred through mail, the charset different mechanisms, when HTML is transferred through mail, the charset
parameter SHOULD be included, rather than relying on the default. parameter SHOULD be included, rather than relying on the default.
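As an informal sketch of this recommendation, the following Python fragment builds a text/html body part with an explicit charset parameter, using the standard library email package. The sample markup and the choice of "iso-8859-1" are assumptions made only for this example.

```python
from email.mime.text import MIMEText

# Sample HTML containing a non-ASCII character (e with acute accent).
html_body = "<p>caf\u00e9</p>"

# Passing the charset explicitly makes it appear in the Content-Type header,
# so the recipient does not have to rely on a transport-specific default.
part = MIMEText(html_body, "html", "iso-8859-1")

media_type = part.get_content_type()
declared_charset = part.get_content_charset()
```

The library also applies a content-transfer-encoding to the part, which is what allows octets outside the 7-bit range to pass over 7-bit transports.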
12. Security Considerations 12. Security Considerations
Some Security Considerations include the potential to mail someone an Some Security Considerations include the potential to mail someone an
object, and claim that it is represented by a particular URI (by giving object, and claim that it is represented by a particular URI (by giving
it a Content-Location: header). There can be no assurance that a WWW it a Content-Location header). There can be no assurance that a WWW
request for that same URI would normally result in that same object. It request for that same URI would normally result in that same object. It
might be unsuitable to cache the data in such a way that the cached data might be unsuitable to cache the data in such a way that the cached data
can be used for retrieval of this URI from other messages or message can be used for retrieval of this URI from other messages or message
parts than those included in the same message as the Content-Location parts than those included in the same message as the Content-Location
header. Because of this problem, receiving User Agents SHOULD NOT cache header. Because of this problem, receiving User Agents SHOULD NOT cache
this data in the same way that data that was retrieved through an HTTP this data in the same way that data that was retrieved through an HTTP
or FTP request might be cached. or FTP request might be cached.
URLs, especially File URLs, may in their name contain company-internal URLs, especially File URLs, may in their name contain company-internal
information, which may then inadvertently be revealed to recipients of information, which may then inadvertently be revealed to recipients of
skipping to change at page 15, line 39 skipping to change at page 1, line 760
Use of passwords (also known as tickets or magic cookies) is one way of Use of passwords (also known as tickets or magic cookies) is one way of
achieving this. Note that some caching WWW proxy servers may not achieving this. Note that some caching WWW proxy servers may not
distinguish between cached objects from e-mail and HTTP, which may be a distinguish between cached objects from e-mail and HTTP, which may be a
security risk. security risk.
In addition, by allowing people to mail aggregate objects, we are In addition, by allowing people to mail aggregate objects, we are
opening the door to other potential security problems that until now opening the door to other potential security problems that until now
were only problems for WWW users. For example, some HTML documents now were only problems for WWW users. For example, some HTML documents now
either themselves contain executable content (JavaScript) or contain either themselves contain executable content (JavaScript) or contain
links to executable content (The "INSERT" specification, Java). It would links to executable content (The "INSERT" specification, Java). It would
be exceedingly dangerous for a receiving User Agent to execute content be exceedingly dangerous for a receiving User Agent to execute content
received through a mail message without careful attention to received through a mail message without careful attention to
restrictions on the capabilities of that executable content. restrictions on the capabilities of that executable content.
Some WWW applications hide passwords and tickets (access tokens to
information which may not be available to anyone) and other sensitive
information in hidden fields in the web documents or in on-the-fly
constructed URLs. If a person gets such a document, and forwards it via
e-mail, the person may inadvertently disclose sensitive information.
13. Acknowledgments 13. Acknowledgments
Harald T. Alvestrand, Richard Baker, Dave Crocker, Martin J. Duerst, Harald T. Alvestrand, Richard Baker, Dave Crocker, Martin J. Duerst,
Lewis Geer, Roy Fielding, Al Gilman, Paul Hoffman, Richard W. Jesmajian, Lewis Geer, Roy Fielding, Al Gilman, Paul Hoffman, Richard W. Jesmajian,
Mark K. Joseph, Greg Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed Mark K. Joseph, Greg Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed
Levinson, Jay Levitt, Albert Lunde, Larry Masinter, Keith Moore, Gavin Levinson, Jay Levitt, Albert Lunde, Larry Masinter, Keith Moore, Gavin
Nicol, Pete Resnick, Jon Smirl, Einar Stefferud, Jamie Zawinski, Steve Nicol, Pete Resnick, Jon Smirl, Einar Stefferud, Jamie Zawinski, Steve
Zilles and several other people have helped us with preparing this Zilles and several other people have helped us with preparing this
document. I alone take responsibility for any errors which may still be document. I alone take responsibility for any errors which may still be
in the document. in the document.
14. References 14. References
Ref. Author, title Ref. Author, title
--------- -------------------------------------------------------- --------- --------------------------------------------------------
[CONDISP] R. Troost, S. Dorner: "Communicating Presentation [CONDISP] R. Troost, S. Dorner: "Communicating Presentation
Information in Internet Messages: The Content- Information in Internet Messages: The
Disposition Header", RFC 1806, June 1995. Content-Disposition Header", RFC 1806, June 1995.
[HOSTS] R. Braden (editor): "Requirements for Internet Hosts -- [HOSTS] R. Braden (editor): "Requirements for Internet Hosts --
Application and Support", STD-3, RFC 1123, October 1989. Application and Support", STD-3, RFC 1123, October 1989.
[HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language
- 2.0", RFC 1866, November 1995.
[HTML-I18N] F. Yergeau, G. Nicol, G. Adams, & M. Duerst: [HTML-I18N] F. Yergeau, G. Nicol, G. Adams, & M. Duerst:
"Internationalization of the Hypertext Markup "Internationalization of the Hypertext Markup
Language". draft-ietf-html-i18n-04.txt, May 1996. Language". draft-ietf-html-i18n-04.txt, May 1996.
[HTML2] T. Berners-Lee, D. Connolly: "Hypertext Markup Language
- 2.0", RFC 1866, November 1995.
[HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext [HTTP] T. Berners-Lee, R. Fielding, H. Frystyk: Hypertext
Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996. Transfer Protocol -- HTTP/1.0. RFC 1945, May 1996.
[MD5] R. Rivest, "The MD5 Message-Digest Algorithm", RFC 1321, [MD5] R. Rivest: "The MD5 Message-Digest Algorithm", RFC 1321,
April 1992. April 1992.
[MIDCID] E. Levinson: "Message/External-Body Content-ID Access [MHTML-INFO] J. Palme: "Sending HTML in E-mail, an informational
Type", draft-ietf-mhtml-cid-00.txt, August 1996. supplement to RFC ???: MIME E-mail Encapsulation of
Aggregate HTML Documents (MHTML)", to be published as an
informational supplement to the MHTML standard.
[MIDCID] E. Levinson: "Content-ID and Message-ID Uniform Resource
Locators", draft-ietf-mhtml-cid-00.txt, August 1996.
[MIME-IMB] N. Freed & N. Borenstein: "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies". draft-ietf-822ext-mime-imb-07.txt, June 1996.
[MIME1] N. Borenstein & N. Freed: "MIME (Multipurpose Internet [MIME1] N. Borenstein & N. Freed: "MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and Mail Extensions) Part One: Mechanisms for Specifying and
Describing the Format of Internet Message Bodies", RFC Describing the Format of Internet Message Bodies", RFC
1521, Sept 1993. 1521, Sept 1993.
[MIME2] N. Borenstein & N. Freed: "Multipurpose Internet Mail [MIME2] N. Borenstein & N. Freed: "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types". draft-ietf- Extensions (MIME) Part Two: Media Types".
822ext-mime-imt-02.txt, December 1995. draft-ietf-822ext-mime-imt-02.txt, December
1995.
[MIME-IMB] N. Freed & N. Borenstein: "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies". draft-ietf-822ext-mime-imb-07.txt, June 1996.
[NEWS] M.R. Horton, R. Adams: "Standard for interchange of [NEWS] M.R. Horton, R. Adams: "Standard for interchange of
USENET messages", RFC 1036, December 1987. USENET messages", RFC 1036, December 1987.
[PDF] Bienz, T., Cohn, R. and Meehan, J.: "Portable Document [PDF] Bienz, T., Cohn, R. and Meehan, J.: "Portable Document
Format Reference Manual, Version 1.1", Adobe Systems Format Reference Manual, Version 1.1", Adobe Systems
Inc. Inc.
[REL] Harald Tveit Alvestrand, Edward Levinson: "The MIME [REL] Edward Levinson: "The MIME Multipart/Related Content-
Multipart/Related Content-type", <draft-ietf-mhtml- Type", <draft-ietf-mhtml-related-00.txt>, May 1995.
related-00.txt>, May 1995.
[RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC [RELURL] R. Fielding: "Relative Uniform Resource Locators", RFC
1808, June 1995. 1808, June 1995.
[RFC822] D. Crocker: "Standard for the format of ARPA Internet [RFC822] D. Crocker: "Standard for the format of ARPA Internet
text messages." STD 11, RFC 822, August 1982. text messages." STD 11, RFC 822, August 1982.
[SGML] ISO 8879. Information Processing -- Text and Office - [SGML] ISO 8879. Information Processing -- Text and Office -
Standard Generalized Markup Language (SGML), Standard Generalized Markup Language (SGML),
1986. <URL:http://www.iso.ch/cate/d16387.html> 1986. <URL:http://www.iso.ch/cate/d16387.html>
[SMTP] J. Postel: "Simple Mail Transfer Protocol", STD 10, RFC [SMTP] J. Postel: "Simple Mail Transfer Protocol", STD 10, RFC
821, August 1982. 821, August 1982.
[URL] T. Berners-Lee, L. Masinter, M. McCahill: "Uniform [URL] T. Berners-Lee, L. Masinter, M. McCahill: "Uniform
Resource Locators (URL)", RFC 1738, December 1994. Resource Locators (URL)", RFC 1738, December 1994.
[URLBODY] N. Freed and Keith Moore: "Definition of the URL MIME [URLBODY] N. Freed and Keith Moore: "Definition of the URL MIME
External-Body Access-Type", draft-ietf-mailext-acc-url- External-Body Access-Type",
01.txt, November 1995. draft-ietf-mailext-acc-url-01.txt, November 1995.
15. Author's Address 15. Author's Address
For contacting the editors, preferably write to Jacob Palme rather than For contacting the editors, preferably write to Jacob Palme rather than
Alex Hopmann. Alex Hopmann.
Jacob Palme Phone: +46-8-16 16 67 Jacob Palme Phone: +46-8-16 16 67
Stockholm University and KTH Fax: +46-8-783 08 29 Stockholm University and KTH Fax: +46-8-783 08 29
Electrum 230 E-mail: jpalme@dsv.su.se Electrum 230 E-mail: jpalme@dsv.su.se
S-164 40 Kista, Sweden S-164 40 Kista, Sweden
 End of changes. 37 change blocks. 
70 lines changed or deleted 83 lines changed or added
