draft-ietf-mhtml-rev-05.txt   draft-ietf-mhtml-rev-06.txt 
Network Working Group Jacob Palme Network Working Group Jacob Palme
Internet Draft Stockholm University/KTH Internet Draft Stockholm University/KTH
draft-ietf-mhtml-rev-05.txt Alexander Hopmann draft-ietf-mhtml-rev-06.txt Alexander Hopmann
IETF status to be: Proposed standard Microsoft Corporation IETF status to be: Proposed standard Microsoft Corporation
Replaces: RFC 2110 Nick Shelness Replaces: RFC 2110 Nick Shelness
Lotus Corporation Lotus Corporation
Expires: August 1998 February 1998 Expires: August 1998 February 1998
MIME Encapsulation of Aggregate Documents, such as HTML (MHTML) MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
Status of this Document Status of this Document
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
skipping to change at line 103 skipping to change at line 105
11. Security Considerations 11. Security Considerations
11.1 Security considerations not related to caching 11.1 Security considerations not related to caching
11.2 Security considerations related to caching 11.2 Security considerations related to caching
12. Differences as compared to the previous version of this proposed 12. Differences as compared to the previous version of this proposed
standard in RFC 2110 standard in RFC 2110
13. Copyright 13. Copyright
14. Acknowledgments 14. Acknowledgments
15. References 15. References
16. Author's Addresses 16. Author's Addresses
Differences since version 05 of this draft
The definition of "HTML aggregate objects" has been changed from
HTML objects together with some or all objects, to which the HTML
object contains hyperlinks.
to
HTML objects together with some or all objects, to which the HTML
object contains hyperlinks, directly or indirectly.
Erroneous quotes around "multipart/related" have been removed in the
example in section 4.2.
In section 8.2, the following sentence:
The resolution of URIs in text/html body parts is performed in the
following way:
has been changed to
The resolution of inline, retrieval and other kinds of URIs in
text/html body parts is performed in the following way:
in order to remind the reader that parts which are not inline can
also be sent with MHTML.
In section 8.2, the following text:
(d) For each referencing URI in a text/html body part, compare the
value of the referencing URI after resolution as described in (a)
and (b), with the URI derived from Content-ID and Content-Location
headers for other body parts within the same Multipart/related
structure.
has been changed to:
(d) For each referencing URI in a text/html body part, compare the
value of the referencing URI after resolution as described in (a)
and (b), with the URI derived from Content-ID and Content-Location
headers for other body parts within the same or a surrounding
Multipart/related structure.
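The widened search in step (d) can be sketched as follows. This is a minimal illustration, not an implementation taken from the draft: the Part class, its field names and find_referenced_part are hypothetical, Content-ID values are assumed to be stored without angle brackets, Content-Location values are assumed to be already resolved as in step (c), and any escaping needed when forming a cid: URL is ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Part:
    """Hypothetical, already-parsed body part (not an API from the draft)."""
    content_type: str
    content_id: Optional[str] = None        # assumed stored without < and >
    content_location: Optional[str] = None  # assumed already resolved, step (c)
    children: List["Part"] = field(default_factory=list)
    parent: Optional["Part"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Part") -> "Part":
        """Attach a child body part and return it."""
        child.parent = self
        self.children.append(child)
        return child


def labels(part: Part) -> List[str]:
    """URIs under which a body part can be referenced."""
    found = []
    if part.content_id:
        found.append("cid:" + part.content_id)
    if part.content_location:
        found.append(part.content_location)
    return found


def find_referenced_part(html_part: Part, resolved_uri: str) -> Optional[Part]:
    """Look in the enclosing multipart/related, then in each surrounding one.

    The comparison is exact string equality, mirroring the
    octet-by-octet comparison that step (d) asks for.
    """
    container = html_part.parent
    while container is not None:
        if container.content_type == "multipart/related":
            for candidate in container.children:
                if candidate is not html_part and resolved_uri in labels(candidate):
                    return candidate
        container = container.parent
    return None
```

With a text/html part nested in an inner multipart/related, a reference that no part in the inner structure satisfies is then tried against the parts of the outer structure. When nothing matches, the function returns None, which corresponds to falling through to step (e).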
In section 9.3, the following text:
; Note - Relative Content-Location is resolved by base
; specified in the Multipart/Related heading
has been changed to:
; Note - Relative Content-Location is resolved by base
; specified in the Multipart/Related Content-Location heading
In section 11.1, the following paragraph has been added:
HTML-formatted messages can be used to investigate user behaviour,
for example to break anonymity, in ways which invade the privacy of
individuals. If you send a message with an inline link to an object
which is not itself included in the message, the recipient's mailer
or browser may request that object through HTTP. The HTTP
transaction will then reveal who is reading the message. Example: A
person who wants to find out who is behind an anonymous user
identity, or from which workstation a user is reading his mail, can
do this by sending a message with an inline link and then observe
from where this link is used to request the object.
In all the examples, all indentation which was there to make the text
more legible, but which was not correct according to RFC822, has been
removed. In one case, indentation was missing on a continuation line
and has been added.
Mailing List Information Mailing List Information
To write contributions To write contributions
Further discussion on this document should be done through the Further discussion on this document should be done through the
mailing list MHTML@SEGATE.SUNET.SE. mailing list MHTML@SEGATE.SUNET.SE.
Comments on less important details may also be sent to the editor, Comments on less important details may also be sent to the editor,
Jacob Palme <jpalme@dsv.su.se>. Jacob Palme <jpalme@dsv.su.se>.
skipping to change at line 252 skipping to change at line 312
Header Field in a message or content heading specifying Header Field in a message or content heading specifying
the value of one attribute. the value of one attribute.
Heading Part of a message or content before the first Heading Part of a message or content before the first
CRLFCRLF, containing formatted fields with CRLFCRLF, containing formatted fields with
attributes of the message or content. attributes of the message or content.
HTML See HTML 2 specification [HTML2]. HTML See HTML 2 specification [HTML2].
HTML Aggregate HTML objects together with some or all objects, to HTML Aggregate HTML objects together with some or all objects, to
objects which the HTML object contains hyperlinks. objects which the HTML object contains hyperlinks, directly
or indirectly.
HTML markup A file containing HTML encodings as specified in HTML markup A file containing HTML encodings as specified in
[HTML] which may be different from the displayed [HTML] which may be different from the displayed
text which a person using a web browser sees. For text which a person using a web browser sees. For
example, the HTML markup may contain "&lt;" where example, the HTML markup may contain "&lt;" where
the displayed text contains the character "<". the displayed text contains the character "<".
LF See [RFC822]. LF See [RFC822].
MIC Message Integrity Codes, codes used to verify that a MIC Message Integrity Codes, codes used to verify that a
skipping to change at line 381 skipping to change at line 442
content heading, in addition to a Content-ID header (as specified in content heading, in addition to a Content-ID header (as specified in
[MIME1]) and, in Message headings, a Message-ID (as specified in [MIME1]) and, in Message headings, a Message-ID (as specified in
[RFC822]). All of these constitute different, equally valid body part [RFC822]). All of these constitute different, equally valid body part
labels, and any of them may be used to satisfy a reference to a body labels, and any of them may be used to satisfy a reference to a body
part. Multiple Content-Location header fields in the same message part. Multiple Content-Location header fields in the same message
heading are not allowed. heading are not allowed.
Example of a multipart/related structure containing body parts with Example of a multipart/related structure containing body parts with
both Content-Location and Content-ID labels: both Content-Location and Content-ID labels:
Content-Type: "multipart/related"; boundary="boundary-example"; Content-Type: multipart/related; boundary="boundary-example";
type="text/html" type="text/html"
--boundary-example --boundary-example
Content-Type: text/html; charset=US-ASCII Content-Type: text/html; charset=US-ASCII
... ... <IMG SRC="fiction1/fiction2"> ... ... ... ... <IMG SRC="fiction1/fiction2"> ... ...
... ... <IMG SRC="cid:97116092811xyz@foo.bar.net"> ... ... ... ... <IMG SRC="cid:97116092811xyz@foo.bar.net"> ... ...
--boundary-example --boundary-example
skipping to change at line 635 skipping to change at line 694
described in this standard can be used to archive and retrieve all of described in this standard can be used to archive and retrieve all of
the resources required to display the web page, as it originally the resources required to display the web page, as it originally
appeared at a certain moment of time, in one aggregate file. appeared at a certain moment of time, in one aggregate file.
In order to send or store such complete messages, there is a need to In order to send or store such complete messages, there is a need to
specify how a URI in one body part can reference a resource in another specify how a URI in one body part can reference a resource in another
body part. body part.
8.2 Resolution of URIs in text/html body parts 8.2 Resolution of URIs in text/html body parts
The resolution of URIs in text/html body parts is performed in the The resolution of inline, retrieval and other kinds of URIs in
following way: text/html body parts is performed in the following way:
(a) Unfold multiple line header values according to [URLBODY]. Do NOT (a) Unfold multiple line header values according to [URLBODY]. Do NOT
however translate character encodings of the kind described in however translate character encodings of the kind described in
[URL]. Example: Do not transform "a%2eb/c%20d" into "a.b/c d". [URL]. Example: Do not transform "a%2eb/c%20d" into "a.b/c d".
(b) Remove all MIME encodings, such as content-transfer encoding and (b) Remove all MIME encodings, such as content-transfer encoding and
header encodings as defined in MIME part 3 [MIME3]. Do NOT however header encodings as defined in MIME part 3 [MIME3]. Do NOT however
translate character encodings of the kind described in [URL]. translate character encodings of the kind described in [URL].
Example: Do not transform "a%2eb/c%20d" into "a.b/c d". Example: Do not transform "a%2eb/c%20d" into "a.b/c d".
(c) Try to resolve all relative URIs in the HTML content and in (c) Try to resolve all relative URIs in the HTML content and in
Content-Location headers using the procedure described in chapter Content-Location headers using the procedure described in chapter
5 above. The result of this resolution can be an absolute URI, 5 above. The result of this resolution can be an absolute URI,
or an absolute URI with the base "this_message:/" as specified or an absolute URI with the base "this_message:/" as specified
in chapter 5. in chapter 5.
(d) For each referencing URI in a text/html body part, compare the (d) For each referencing URI in a text/html body part, compare the
value of the referencing URI after resolution as described in (a) value of the referencing URI after resolution as described in (a)
and (b), with the URI derived from Content-ID and Content-Location and (b), with the URI derived from Content-ID and Content-Location
headers for other body parts within the same Multipart/related headers for other body parts within the same or a surrounding
Multipart/related structure. If the strings are identical, octet by Multipart/related structure. If the strings are identical, octet by
octet, then the referencing URI references that body part. This octet, then the referencing URI references that body part. This
comparison will only succeed if the two URIs are identical. This comparison will only succeed if the two URIs are identical. This
means that if one of the two URIs to be compared was a fictitious means that if one of the two URIs to be compared was a fictitious
absolute URI with the base "this_message:/", the other must also be absolute URI with the base "this_message:/", the other must also be
such a fictitious absolute URI, and not resolvable to a real such a fictitious absolute URI, and not resolvable to a real
absolute URI. absolute URI.
(e) If (d) fails, try to retrieve the URI referenced resource (e) If (d) fails, try to retrieve the URI referenced resource
hyperlink through ordinary Internet lookup. Resolution of URIs of hyperlink through ordinary Internet lookup. Resolution of URIs of
skipping to change at line 721 skipping to change at line 779
Subject: A simple example Subject: A simple example
Mime-Version: 1.0 Mime-Version: 1.0
Content-Type: text/html; charset=iso-8859-1 Content-Type: text/html; charset=iso-8859-1
Content-Transfer-Encoding: 8bit Content-Transfer-Encoding: 8bit
<HTML> <HTML>
<head></head> <head></head>
<body> <body>
<h1>Acute accent</h1> <h1>Acute accent</h1>
The following two lines have the same screen rendering:<p> The following two lines have the same screen rendering:<p>
E with acute accent becomes É.<br> E with acute accent becomes É.<br>
E with acute accent becomes &Eacute;.<p> E with acute accent becomes &Eacute;.<p>
Try clicking <a href="http://www.ietf.cnri.reston.va.us/"> Try clicking <a href="http://www.ietf.cnri.reston.va.us/">
here.</a><p> here.</a><p>
</body></HTML> </body></HTML>
9.2 Example with an absolute URI to an embedded GIF picture 9.2 Example with an absolute URI to an embedded GIF picture
The second example is an HTML message which includes a single image, The second example is an HTML message which includes a single image,
referenced using the Content-Location mechanism. referenced using the Content-Location mechanism.
skipping to change at line 805 skipping to change at line 863
Content-Type: IMAGE/GIF Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example --boundary-example
Content-Location: ietflogo2.gif Content-Location: ietflogo2.gif
; Note - Relative Content-Location is resolved by base ; Note - Relative Content-Location is resolved by base
; specified in the Multipart/Related heading ; specified in the Multipart/Related Content-Location heading
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc... etc...
--boundary-example --boundary-example
Content-Location: Content-Location:
http://www.ietf.cnri.reston.va.us/images/ietflogo3.gif http://www.ietf.cnri.reston.va.us/images/ietflogo3.gif
Content-Transfer-Encoding: BASE64 Content-Transfer-Encoding: BASE64
skipping to change at line 1088 skipping to change at line 1146
sensitive information may be inadvertently revealed to others. sensitive information may be inadvertently revealed to others.
Since HTML documents can either directly contain executable content Since HTML documents can either directly contain executable content
(e.g., JavaScript) or indirectly reference executable content (the (e.g., JavaScript) or indirectly reference executable content (the
"INSERT" specification, Java), it is exceedingly dangerous for a "INSERT" specification, Java), it is exceedingly dangerous for a
receiving User Agent to execute content received in a mail message receiving User Agent to execute content received in a mail message
without careful attention to restrictions on the capabilities of that without careful attention to restrictions on the capabilities of that
executable content. (Why??? I do not understand this! What executable content. (Why??? I do not understand this! What
restrictions of what capabilities???/jp) restrictions of what capabilities???/jp)
HTML-formatted messages can be used to investigate user behaviour, for
example to break anonymity, in ways which invade the privacy of
individuals. If you send a message with an inline link to an object
which is not itself included in the message, the recipient's mailer or
browser may request that object through HTTP. The HTTP transaction will
then reveal who is reading the message. Example: A person who wants to
find out who is behind an anonymous user identity, or from which
workstation a user is reading his mail, can do this by sending a
message with an inline link and then observe from where this link is
used to request the object.
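As an illustration of the exposure described above, the sketch below lists the inline image links in an HTML body that are not satisfied by a body part of the message and would therefore be fetched over HTTP. It is only a rough sketch under stated assumptions: InlineLinkCollector and remote_inline_links are hypothetical names, only IMG SRC is examined although other elements can carry inline links, and the included_uris set is assumed to hold the already-resolved labels of the body parts in the message.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class InlineLinkCollector(HTMLParser):
    """Collects the SRC values of IMG tags (one kind of inline link)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def remote_inline_links(html_text, included_uris):
    """Return IMG SRC URIs that would require an HTTP request to render.

    included_uris is a hypothetical set of URIs (cid: or Content-Location
    values) labelling body parts that travel inside the message itself.
    """
    collector = InlineLinkCollector()
    collector.feed(html_text)
    collector.close()
    return [
        uri
        for uri in collector.sources
        if uri not in included_uris
        and urlsplit(uri).scheme in ("http", "https")
    ]
```

A receiving user agent could, for example, use such a list to ask the user before fetching anything, or to suppress the requests altogether. Which policy is appropriate is a design choice that this text does not prescribe.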
11.2 Security considerations related to caching 11.2 Security considerations related to caching
There is a well-known problem with the caching of directly retrieved There is a well-known problem with the caching of directly retrieved
web resources. A resource retrieved from a cache may differ from that web resources. A resource retrieved from a cache may differ from that
re-retrieved from its source. This problem also manifests itself when re-retrieved from its source. This problem also manifests itself when
a copy of a resource is delivered in a multipart/related structure. a copy of a resource is delivered in a multipart/related structure.
When processing (rendering) a text/html body part in an MHTML When processing (rendering) a text/html body part in an MHTML
multipart/related structure, all URIs in that text/html body part which multipart/related structure, all URIs in that text/html body part which
reference subsidiary resources within the same multipart/related reference subsidiary resources within the same multipart/related
 End of changes. 10 change blocks. 
9 lines changed or deleted 79 lines changed or added
