Network Working Group                                   P. Resnick, Editor
Internet-Draft                                          QUALCOMM Incorporated
<draft-ietf-drums-msg-fmt-04.txt>                       March 13, 1998

Internet Message Format Standard

0. Status of this memo

This document is an Internet-Draft. Internet-Drafts are working documents
of the Internet Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and
may be updated, replaced, or obsoleted by other documents at any time. It
is inappropriate to use Internet-Drafts as reference material or to cite
them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au
(Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West
Coast).

1. Introduction

1.1 Scope

This standard specifies a syntax for text messages that are sent between
computer users, within the framework of "electronic mail" messages. This
standard supersedes the one specified in Request For Comments 822,
"Standard for the Format of ARPA Internet Text Messages" [RFC-822],
updating it to reflect current practice and incorporating incremental
changes that were specified in other RFCs.

This standard only specifies a syntax for text messages. In particular, it
makes no provision for the transmission of images, audio, or other sorts of
structured data in electronic mail messages. There are several extensions
published, such as the MIME document series [RFC-2045, RFC-2046, RFC-2049],
which describe mechanisms for the transmission of such data through
electronic mail, either by extending the syntax provided here or by
structuring such messages to conform to this syntax. These mechanisms are
outside of the scope of this standard.

In the context of electronic mail, messages are viewed as having an
envelope and contents. The envelope contains whatever information is needed
to accomplish transmission and delivery. (See [SMTP] for a discussion of
the envelope.) The contents comprise the object to be delivered to the
recipient. This standard applies only to the format and some of the
semantics of message contents. It contains no specification of the
information in the envelope.

However, some message systems may use information from the contents to
create the envelope. It is intended that this standard facilitate the
acquisition of such information by programs.

Some message systems may store messages in formats that differ from the one
specified in this standard. This specification is intended strictly as a
definition of what message content format is to be passed BETWEEN hosts.

Note: This standard is NOT intended to dictate the internal formats used by
sites, the specific message system features that they are expected to
support, or any of the characteristics of user interface programs that
create or read messages. In addition, this standard does not specify an
encoding of the characters for either transport or storage; that is, it
does not specify the number of bits used or how those bits are specifically
transferred over the wire or stored on disk.

1.2 Notational conventions

1.2.1 Requirements notation

This document occasionally uses terms that appear in capital letters. When
the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY" appear
capitalized, they are being used to indicate particular requirements of
this specification. A discussion of the meanings of these terms appears in
[RFC-2119].

1.2.2 Syntactic notation

This standard uses the Augmented Backus-Naur Form (ABNF) notation specified
in [RFC-2234] for the formal definitions of the syntax of messages.
Characters will be specified either by a decimal value (e.g., the value
%d65 for uppercase A and %d97 for lowercase A) or by a case-insensitive
literal value enclosed in quotation marks (e.g., "A" for either uppercase
or lowercase A). See [RFC-2234] for the full description of the notation.

1.3 Structure of this document

This document is divided into several sections.

This section, section 1, is a short introduction to the document.

Section 2 will lay out the general description of a message and its
constituent parts. This is an overview to help the reader understand some
of the general principles used in the later portions of this document. Any
examples in this section MUST NOT be taken as specification of the formal
syntax of any part of a message.

Section 3 will give the formal syntax and semantics for each of the parts
of a message. That is, it will describe the actual rules for the structure
of each part of a message (the syntax) as well as a description of the
parts and instructions on how they ought to be interpreted (the semantics).
This will include analysis of the syntax and semantics of subparts of
messages which have specific structure. The syntax included in section 3
represents messages as they MUST be created. There are also notes in
section 3 to indicate if any of the options specified in the syntax SHOULD
be used over any of the others.

Both sections 2 and 3 describe messages which are legal to generate for
purposes of this standard.

Section 4 of this document specifies an "obsolete" syntax. There are
references in section 3 to these obsolete syntactic elements. The rules of
the obsolete syntax are elements that have appeared in earlier revisions of
this standard or have previously been widely used in Internet messages. As
such, these elements MUST be interpreted by parsers of messages in order to
be conformant to this standard. However, since items in this syntax have
been determined to be non-interoperable or cause significant problems for
recipients of messages, they MUST NOT be generated by creators of
conformant messages.

Section 5 details security considerations to take into account when
implementing this standard.

Section 6 is a bibliography of references in this document.

Section 7 contains the author's address and instructions on where to send
comments.

Section 8 contains acknowledgements.

Appendix A lists examples of different sorts of messages. These examples
are not exhaustive of the types of messages that appear on the Internet,
but give a broad overview of certain syntactic forms.

Appendix B lists the differences between this standard and earlier
standards for Internet messages.

2. Lexical Analysis of Messages

2.1 General Description

At the most basic level, a message is a series of characters. A message
that is conformant with this standard is composed of characters with
values in the range 1 through 127 and interpreted as US-ASCII characters
[ASCII]. For brevity, this document sometimes refers to this range of
characters as simply "US-ASCII characters". Messages are divided into lines
of characters. A line is a series of characters which is delimited with the
two characters carriage-return and line-feed; that is, the carriage return
(CR) character (ASCII value 13) followed immediately by the line feed (LF)
character (ASCII value 10). (The carriage-return/line-feed pair is usually
written in this document as "CRLF".)

Note: This standard specifies that messages are made up of characters in
the US-ASCII range of 1 through 127. There are other documents,
specifically the MIME document series [RFC-2045, RFC-2046, RFC-2047,
RFC-2048, RFC-2049], which extend this standard to allow for values outside
of that range. Discussion of these mechanisms is not within the scope of
this standard.

A message consists of header fields (collectively called the header of the
message) followed, optionally, by a body. The header is a sequence of lines
of characters with special syntax as defined in this standard. The body is
simply a sequence of characters that follows the header and is separated
from the header by an empty line (i.e., a line with nothing preceding the
CRLF).
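
As an illustration of the division described above, the following Python
sketch splits a raw message into its header lines and its optional body at
the first empty line. The function name split_message is hypothetical and
not part of this standard; the sketch assumes that the message is held as a
string with CRLF line endings and that the header is not empty.

```python
from typing import List, Optional, Tuple

CRLF = "\r\n"

def split_message(raw: str) -> Tuple[List[str], Optional[str]]:
    """Split a raw message into (header lines, body or None).

    Header lines are returned as they appear on the wire, so a folded
    header field still occupies several list entries.
    """
    # The header ends at the first empty line, that is, at the first
    # CRLF that is immediately followed by another CRLF.
    head, separator, body = raw.partition(CRLF + CRLF)
    if not separator:
        # No empty line was found: the message has a header but no body.
        head = raw[:-len(CRLF)] if raw.endswith(CRLF) else raw
        return head.split(CRLF), None
    return head.split(CRLF), body
```

For example, split_message("From: a\r\nTo: b\r\n\r\nHello\r\n") yields the
two header lines "From: a" and "To: b" and the body "Hello" followed by
CRLF.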

2.2 Header Fields

Header fields are lines which have a specific syntax. Header fields are all
composed of a field name, followed by a colon (":"), followed by a field
body, and terminated by CRLF. A field name must be composed of printable
US-ASCII characters (i.e., characters that have values between 33 and 126),
except colon. A field body may be composed of any US-ASCII characters,
except for CR and LF. However, a field body may contain CRLF when used in
header "folding" and "unfolding" as described in section 2.2.3. All field
bodies must conform to the syntax described in sections 3 and 4 of this
standard.
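
The character restrictions above can be sketched as a small check. The
function name parse_field is hypothetical, and the sketch assumes that the
header field has already been unfolded and that its terminating CRLF has
been removed. It checks only the character ranges given in this section,
not the field-specific syntax of sections 3 and 4.

```python
def parse_field(line: str):
    """Split one unfolded header field into (field name, field body)."""
    name, colon, body = line.partition(":")
    if not colon:
        raise ValueError("header field has no colon")
    # Field names: printable US-ASCII (values 33 through 126), no colon.
    if not name or any(not 33 <= ord(c) <= 126 for c in name):
        raise ValueError("invalid field name: %r" % name)
    # Field bodies: US-ASCII (values 1 through 127), no CR or LF once
    # the field has been unfolded.
    if any(c in "\r\n" or not 1 <= ord(c) <= 127 for c in body):
        raise ValueError("invalid character in field body")
    return name, body
```

Note that the leading space of the field body is preserved; whether it is
significant depends on the syntax of the particular field.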

2.2.1 Unstructured Header Field Bodies

Some field bodies in this standard are defined simply as "unstructured"
(which is specified below as any US-ASCII characters, except for CR and LF)
with no further restrictions. These are referred to as unstructured field
bodies. Semantically, unstructured field bodies are simply to be treated as
a single line of characters with no further processing (except for header
"folding" and "unfolding" as described in section 2.2.3).

2.2.2 Structured Header Field Bodies

Some field bodies in this standard have specific lexical structure more
restrictive than the unstructured field bodies described above. These are
referred to as "structured" field bodies. Structured field bodies are lines
of specific lexical tokens as described in sections 3 and 4 of this
standard. Many of these tokens are allowed (according to their syntax) to
be freely surrounded by comments (as described in section 3.2.4) as well as
space (SP, ASCII value 32) and horizontal tab (HTAB, ASCII value 9)
characters, and those surrounding SP and HTAB characters are subject to
header "folding" and "unfolding" as described in section 2.2.3. Semantic
analysis of structured field bodies is given along with their syntax.

2.2.3 Long Header Fields

Each header field is logically a single line of characters comprising the
field name, the colon, and the field body. For convenience, however, the
field body portion of a header field can be split into a multiple-line
representation; this is called "folding". The general rule is that wherever
this standard allows for folding white-space (not simply SP or HTAB), a
CRLF followed by AT LEAST one SP or HTAB may instead be inserted. For
example, the header field:

        Subject: This is a test

can be represented as:

        Subject: This
         is a test

Note: Though structured field bodies are defined in such a way that folding
can take place between many of the lexical tokens (and even within some of
the lexical tokens), folding SHOULD be limited to placing the CRLF at
higher-level syntactic breaks. For instance, if a field body is defined as
comma-separated values, it is recommended that folding occur after the
comma separating the structured items, even if it is allowed elsewhere.

The process of moving from this folded multiple-line representation of a
header field to its single line representation is called "unfolding".
Unfolding is accomplished by simply removing any CRLF that is immediately
followed by SP or HTAB. Each header field should be treated in its unfolded
form for syntactic and semantic evaluation.
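
The following Python sketch illustrates both operations. The function
names fold and unfold and the limit parameter are hypothetical. The fold
function is a simplification: it breaks only before runs of SP, which suits
unstructured field bodies, and it assumes that the field has no trailing
white-space. It makes no attempt to prefer higher-level syntactic breaks in
structured field bodies.

```python
import re

def unfold(field: str) -> str:
    """Remove every CRLF that is immediately followed by SP or HTAB."""
    return re.sub(r"\r\n(?=[ \t])", "", field)

def fold(field: str, limit: int = 78) -> str:
    """Fold an unfolded header field so lines stay within `limit`
    characters wherever a break before a space is available."""
    # Split just before each run of spaces, so that every piece after
    # the first begins with SP and can start a continuation line.
    pieces = re.split(r"(?<! )(?= )", field)
    lines, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) > limit:
            lines.append(current)
            current = piece
        else:
            current += piece
    lines.append(current)
    return "\r\n".join(lines)
```

With a limit of 13, fold("Subject: This is a test") produces the folded
form shown in the example above, and unfold restores the original single
line.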

2.3 Body

The body of a message is simply lines of US-ASCII characters. The only two
limitations on the body are as follows:

- CR and LF MUST only occur together as CRLF; they MUST NOT appear
independently in the body.

- Lines of characters in the body MUST be limited to 998 characters, and
SHOULD be limited to 78 characters, excluding the CRLF.
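
A minimal sketch of a checker for these two limitations follows. The
function name check_body and the shape of its result are hypothetical: it
returns a list of (line number, requirement level) pairs, using line number
0 for a problem that concerns the body as a whole.

```python
def check_body(body: str):
    """Report violations of the two body limitations."""
    problems = []
    # After well-formed CRLF pairs are removed, any CR or LF that is
    # left over occurred independently.
    leftovers = body.replace("\r\n", "")
    if "\r" in leftovers or "\n" in leftovers:
        problems.append((0, "MUST"))
    # Line lengths are counted excluding the CRLF.
    for number, line in enumerate(body.split("\r\n"), start=1):
        if len(line) > 998:
            problems.append((number, "MUST"))
        elif len(line) > 78:
            problems.append((number, "SHOULD"))
    return problems
```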

Note: As was stated earlier, there are other standards documents,
specifically the MIME documents [RFC-2045, RFC-2046, RFC-2048, RFC-2049]
which extend this standard to allow for different sorts of message bodies.
Again, these mechanisms are beyond the scope of this document.

3. Syntax

3.1 Introduction

The syntax as given in this section defines the legal syntax of Internet
messages. Messages which are conformant to this standard MUST conform to
the syntax in this section. If there are options in this section where one
option SHOULD be generated, that is indicated either in the prose or in a
comment next to the syntax.

For the defined tokens, a short description of the syntax and use is given,
followed by the syntax in ABNF, followed by a semantic analysis. Primitive
tokens that are used but otherwise unspecified come from [RFC-2234].

In some of the token definitions, there will be elements whose names start
with "obs-". These "obs-" elements refer to tokens defined in the obsolete
syntax in section 4. In all cases, these tokens are to be ignored for the
purposes of generating legal Internet messages and MUST NOT be used as part
of such a message. However, when interpreting messages, these tokens MUST
be honored as part of the legal syntax. In this sense, section 3 defines a
grammar for generation of messages, with "obs-" elements which must be
ignored, while section 4 adds grammar for interpretation of messages.

3.2 Lexical Tokens

The following rules are used to define an underlying lexical analyzer,
which feeds tokens to the higher level parsers. This section is basically
devoted to defining tokens used in structured header field bodies.

Note: Readers of this standard should pay special attention to how these
lexical tokens are used in both the lower-level and higher-level syntax
later in the document. Particularly, the white-space tokens defined in
section 3.2.3 and the comment tokens defined in section 3.2.4 get used in
the lower-level tokens defined here and those lower-level tokens are in
turn used as parts of the higher-level tokens defined later. Therefore, the
white-space and comments may be allowed in the higher-level tokens even
though they may not explicitly appear in a particular definition.

3.2.1 Primitive Tokens

The following are primitive tokens referred to elsewhere in this standard,
but are not otherwise defined in [RFC-2234]. Some of them will not appear
anywhere else in the syntax, but they are convenient to refer to in other
parts of this document.

Note: The "specials" below are just such an example. Though the specials
token does not appear anywhere else in this standard, it is useful for
implementors who use tools which lexically analyze messages. Each of the
characters in specials can be used to indicate a tokenization point in
lexical analysis.

NO-WS-CTL       =       %d1-8 /         ; US-ASCII control characters
                        %d11 /          ;  which do not include the
                        %d12 /          ;  carriage return, line feed,
                        %d14-31 /       ;  and whitespace characters
                        %d127

text            =       %d1-9 /         ; Characters excluding CR and LF
                        %d11-12 /
                        %d14-127 /
                        obs-text

specials        =       "(" / ")" /     ; Special characters used in other
                        "<" / ">" /     ;  parts of the syntax
                        "[" / "]" /
                        ":" / ";" /
                        "@" / "\" /
                        "," / "." /
                        DQUOTE

No special semantics attaches to these tokens. They are simply single
characters.

3.2.2 Quoted characters

Some characters are reserved for special interpretation, such as delimiting
lexical tokens. To permit use of these characters as uninterpreted data, a
quoting mechanism is provided.

quoted-pair     =       ("\" text) / obs-qp

Where any quoted-pair appears, it should be interpreted as the text
character alone.
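
This interpretation can be sketched in Python as follows. The function
name unquote_pairs is hypothetical. The sketch handles only the ("\" text)
alternative of quoted-pair and does not cover obs-qp.

```python
import re

def unquote_pairs(s: str) -> str:
    """Replace each quoted-pair with its text character alone."""
    # text = %d1-9 / %d11-12 / %d14-127, i.e. anything but CR and LF.
    return re.sub(r"\\([\x01-\x09\x0b\x0c\x0e-\x7f])", r"\1", s)
```

Matching proceeds from left to right without overlap, so a quoted
backslash followed by another character is resolved to a single backslash
and does not start a new quoted-pair.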

3.2.3 Whitespace

The following define the white-space characters used in this standard. See
section 3.2.4 for more information on the use of white-space in the rest of
this standard.

WSP             =       SP / HTAB               ; Whitespace characters
FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white-space
                        obs-FWS

Throughout this standard, where FWS (the folding white-space token)
appears, it indicates a place where header folding, as discussed in section
2.2.3, may take place. Wherever header folding appears in a message (that
is, a header field body containing a CRLF followed by any WSP), header
unfolding (removal of the CRLF) should be performed before any further
lexical analysis is performed on that header field according to this
standard. That is to say, any CRLF that appears in FWS is semantically
"invisible."

Runs of FWS that occur between lexical tokens are semantically interpreted
as identical to a single space character.

3.2.4 Comments

Strings of characters which are treated as comments may be included in
structured field bodies as characters enclosed in parentheses. Strings of
characters enclosed in parentheses are considered comments so long as they
do not appear within a "quoted-string", as defined in section 3.2.6.
Comments may nest.

There are several places in this standard where comments and FWS may be
freely inserted. To accommodate that syntax, an additional token for "CFWS"
is defined for places where comments and/or FWS can occur. However, where
CFWS occurs in this standard, it MUST NOT be inserted in such a way that
any line of a folded header field is made up entirely of WSP characters and
nothing else.

ctext           =       NO-WS-CTL /     ; Non-white-space controls

                        %d33-39 /       ; The rest of the US-ASCII
                        %d42-91 /       ;  characters not including "(",
                        %d93-127        ;  ")", or "\"

comment         =       "(" *([FWS] (ctext / quoted-pair / comment)) [FWS] ")"

CFWS            =       *([FWS] comment) (([FWS] comment) / FWS)

A comment is normally used in a structured field body to provide some human
readable informational text. A comment is semantically interpreted as a
single SP. Since a comment is allowed to contain FWS, folding is permitted.
Also note that since quoted-pair is allowed in a comment, the parentheses
and backslash characters may appear in a comment so long as they appear as
a quoted-pair. Semantically, the enclosing parentheses are not part of the
comment token; the token is what is contained between the two parentheses.

Runs of CFWS are semantically interpreted as a single space character.
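
Because comments may nest, they cannot be recognized with a simple pattern
match; a depth counter is needed. The following Python sketch, under the
assumption that the field body has already been unfolded, replaces each
top-level comment with a single SP while leaving quoted-strings untouched.
The function name strip_comments is hypothetical, and the sketch does not
collapse adjacent runs of white-space.

```python
def strip_comments(body: str) -> str:
    """Replace each (possibly nested) comment with a single SP."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # True while inside a quoted-string
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # A quoted-pair never opens or closes anything; keep it
            # unless it lies inside a comment.
            if depth == 0:
                out.append(body[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == '"' and depth == 0:
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                out.append(" ")  # the whole comment counts as one SP
        elif depth == 0:
            out.append(ch)
        i += 1
    if depth or in_quotes:
        raise ValueError("unterminated comment or quoted-string")
    return "".join(out)
```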

3.2.5 Atom

Several tokens in structured header field bodies are simply strings of
certain basic characters. Such tokens are represented as atoms. Two atoms
must be separated by some other token, since putting two atoms next to each
other would create a single atom.

Some of the structured header field bodies also allow the period character
(".", ASCII value 46) within runs of atext. An additional "dot-atom" token
is defined for those purposes.

atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"

atom            =       [CFWS] 1*atext [CFWS]

dot-atom        =       [CFWS] dot-atom-text [CFWS]

dot-atom-text   =       1*atext *("." 1*atext)
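
The atext and dot-atom-text rules translate directly into regular
expressions. The following Python sketch is one possible rendering; the
names is_atom_text and is_dot_atom_text are hypothetical, and the optional
surrounding CFWS is assumed to have been removed already.

```python
import re

# atext: ALPHA / DIGIT / the punctuation characters listed above.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"

# 1*atext
_ATOM_TEXT = re.compile(ATEXT + r"+")

# dot-atom-text = 1*atext *("." 1*atext)
_DOT_ATOM_TEXT = re.compile(ATEXT + r"+(?:\." + ATEXT + r"+)*")

def is_atom_text(s: str) -> bool:
    """True if s is a non-empty run of atext characters."""
    return _ATOM_TEXT.fullmatch(s) is not None

def is_dot_atom_text(s: str) -> bool:
    """True if s is runs of atext separated by single periods."""
    return _DOT_ATOM_TEXT.fullmatch(s) is not None
```

Under these definitions "john.doe" is dot-atom-text but not an atom, while
"john..doe" and ".john" are neither, since every period must have atext on
both sides.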

Both atom and dot-atom are interpreted as a single unit, comprised of the
string of characters that make it up. Semantically, the optional comments
and FWS surrounding the rest of the characters are not part of the token;
the token is only the run of atext characters in an atom, or the atext and
"." characters in a dot-atom.

3.2.6 Quoted strings

Strings of characters which include characters other than those allowed in
atoms may be represented in a quoted string format, where the characters
are surrounded by quote characters.

qtext           =       NO-WS-CTL /     ; Non-white-space controls

                        %d33 /          ; The rest of the US-ASCII
                        %d35-91 /       ;  characters not including "\"
                        %d93-127        ;  or the quote character

quoted-string   =       [CFWS]
                        DQUOTE *([FWS] (qtext / quoted-pair)) [FWS] DQUOTE
                        [CFWS]

A quoted-string is treated as a single symbol. That is, quoted-string is
identical to atom, semantically. Since a quoted-string is allowed to
contain FWS, folding is permitted. Also note that since quoted-pair is
allowed in a quoted-string, the quote and backslash characters may appear
in a quoted-string so long as they appear as a quoted-pair.

Semantically, neither the optional CFWS outside of the quote characters nor
the quote characters themselves are part of the quoted-string token; the
token is what is contained between the two quote characters.
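
As a sketch of this interpretation, the following Python function
validates a quoted-string and returns the token it represents. The function
name quoted_string_value is hypothetical. The sketch assumes that the
surrounding CFWS has been removed and that the field has been unfolded, so
that any FWS inside the quotes is reduced to SP and HTAB characters.

```python
import re

# Between the two DQUOTEs: qtext, WSP (from unfolded FWS), or a
# quoted-pair. qtext excludes DQUOTE (34) and backslash (92).
_QUOTED_STRING = re.compile(
    r'"((?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f \t]'
    r'|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*)"'
)

def quoted_string_value(token: str) -> str:
    """Return the content of a quoted-string, with the enclosing quote
    characters removed and each quoted-pair resolved."""
    match = _QUOTED_STRING.fullmatch(token)
    if match is None:
        raise ValueError("not a quoted-string: %r" % token)
    # Each quoted-pair stands for its second character alone.
    return re.sub(r"\\(.)", r"\1", match.group(1), flags=re.DOTALL)
```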

3.2.7 Miscellaneous tokens

Three additional tokens are defined: word and phrase, for combinations of
atoms and/or quoted-strings, and unstructured, for use in unstructured
header fields and in some places within structured header fields.

word            =       atom / quoted-string

phrase          =       1*word / obs-phrase

unstructured    =       *([FWS] text)

3.3 Date and Time Specification

Date and time occur in several header fields of a message. This section
specifies the syntax for a full date and time specification. Though folding
whitespace is permitted throughout the date-time specification, it is
recommended that only a single space be used where FWS is required and no
space be used where FWS is optional in the date-time specification; some
older implementations may not interpret other occurrences of folding
whitespace correctly.

date-time       =       [ day-of-week "," ] date FWS time [CFWS]

day-of-week     =       ([FWS] day-name [FWS]) / obs-day-of-week

day-name        =       "Mon" / "Tue" / "Wed" / "Thu" /
                        "Fri" / "Sat" / "Sun"

date            =       day month year

year            =       ([FWS] 4*DIGIT [FWS]) / obs-year

month           =       (FWS month-name FWS) / obs-month

month-name      =       "Jan" / "Feb" / "Mar" / "Apr" /
                        "May" / "Jun" / "Jul" / "Aug" /
                        "Sep" / "Oct" / "Nov" / "Dec"

day             =       ([FWS] 1*2DIGIT [FWS]) / obs-day

time            =       time-of-day FWS zone

time-of-day     =       hour ":" minute [ ":" second ]

hour            =       2DIGIT / obs-hour

minute          =       2DIGIT / obs-minute

second          =       2DIGIT / obs-second

zone            =       (( "+" / "-" ) 4DIGIT) / obs-zone

The day is the numeric day of the month. The year is any numeric year in
the common era.

The time-of-day specifies the number of hours, minutes, and optionally
seconds since midnight of the date indicated.

The date and time-of-day SHOULD express local time.

The zone specifies the offset from Coordinated Universal Time (UTC,
formerly referred to as "Greenwich Mean Time") that the date and
time-of-day represent. The "+" or "-" indicates whether the time-of-day is
ahead of or behind Universal Time. The first two digits indicate the number
of hours difference from Universal Time, and the last two digits indicate
the number of minutes difference from Universal Time. (Hence, +hhmm means
+(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) minutes). The form
"+0000" SHOULD be used to indicate a time zone at Universal Time. Though
"-0000" also indicates Universal Time, it is used to indicate that the time
was generated on a system that may be in a local time zone other than
Universal Time.
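
The arithmetic above can be written out as a short Python sketch. The
function name zone_to_minutes is hypothetical, and only the numeric form of
zone is handled, not obs-zone.

```python
import re

def zone_to_minutes(zone: str) -> int:
    """Convert a zone of the form +hhmm or -hhmm to signed minutes."""
    if re.fullmatch(r"[+-][0-9]{4}", zone) is None:
        raise ValueError("not a numeric zone: %r" % zone)
    hours, minutes = int(zone[1:3]), int(zone[3:5])
    offset = hours * 60 + minutes
    return -offset if zone[0] == "-" else offset
```

For instance, "+0530" yields 330 and "-0800" yields -480. Both "+0000" and
"-0000" yield 0, so a caller that needs to distinguish the two forms has to
inspect the sign character itself.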

A date-time specification MUST be semantically valid. That is, the
day of the week (if included) MUST be the day implied by the date, the
numeric day-of-month MUST be between 1 and the number of days allowed for
the specified month (in the specified year), the time-of-day MUST be in the
range 00:00:00 through 23:59:60 (the number of seconds allowing for a leap
second; see [STD-12]), and the zone MUST be within the range -9959 through
+9959.
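
A minimal sketch of these semantic checks follows, assuming that the
date-time has already been parsed into its components. The function name
is_semantically_valid and its parameters are hypothetical. The zone check
reads the stated range literally as a four-digit value no greater than
9959, and the year restriction is a limitation of the Python library used
here, not a requirement of this standard.

```python
import calendar
import datetime
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
# Same order as the values returned by datetime.date.weekday().
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def is_semantically_valid(day_name, day, month_name, year,
                          hour, minute, second=0, zone="+0000"):
    """Check a parsed date-time; day_name may be None if absent."""
    if month_name.lower() not in MONTHS:
        return False
    month = MONTHS.index(month_name.lower()) + 1
    # Limitation of this sketch: datetime covers years 1 through 9999.
    if not 1 <= year <= 9999:
        return False
    # Day of month must exist in the given month and year.
    if not 1 <= day <= calendar.monthrange(year, month)[1]:
        return False
    # Day of week, if included, must be the one implied by the date.
    if day_name is not None:
        actual = DAYS[datetime.date(year, month, day).weekday()]
        if day_name.lower() != actual:
            return False
    # 00:00:00 through 23:59:60, allowing for a leap second.
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 60):
        return False
    # Zone within -9959 through +9959.
    if re.fullmatch(r"[+-][0-9]{4}", zone) is None:
        return False
    return int(zone[1:]) <= 9959
```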

3.4 Address Specification

Addresses occur in several message header fields to indicate senders and
recipients of messages. An address may either be an individual mailbox, or
a group of mailboxes.

address         =       mailbox / group

mailbox         =       name-addr / addr-spec / obs-mailbox

name-addr       =       [display-name] [CFWS] "<" addr-spec ">" [CFWS]

group           =       group-name ":" [mailbox-list / CFWS] ";" [CFWS]

display-name    =       phrase

group-name      =       phrase

mailbox-list    =       (mailbox *("," mailbox)) / obs-mbox-list

address-list    =       address *("," address) / obs-addr-list

A mailbox receives mail. It is a conceptual entity which does not
necessarily pertain to file storage. For example, some sites may choose to
print mail on a printer and deliver the output to the addressee's desk.
Normally, a mailbox is comprised of two parts: (1) an optional display name
which indicates the name of the recipient (which could be a person or a
system) that could be displayed to the user of a mail application, and (2)
an addr-spec address enclosed in angle brackets ("<" and ">"). There is
also an alternate simple form of a mailbox where the addr-spec address
appears alone, without the recipient's name or the angle brackets. The
Internet addr-spec address is described in section 3.4.1.

Note: Some legacy implementations used the simple form where the addr-spec
appears without the angle brackets, but included the name of the recipient
in parentheses as a comment following the addr-spec. Since the meaning of
the information in a comment is unspecified, implementations SHOULD use the
full name-addr form of the mailbox, instead of the legacy form, if a name
of the recipient is being used. Also, because some legacy implementations
interpret the comment, comments generally SHOULD NOT be used in address
fields, to avoid confusion.

When it is desirable to treat several mailboxes as a single unit (i.e., in
a distribution list), the group construct can be used. The group construct
allows the sender to indicate a named group of recipients. This is done by
giving a group name, followed by a colon, followed by a comma-separated
list of any number of mailboxes (including zero and one), and ending with a
semicolon. Because the list of mailboxes can be empty, using the group
construct is also a simple way to indicate in the message that a set of
recipients was sent the message without actually providing the individual
mailbox address for each of the recipients.

3.4.1 Addr-spec specification

An addr-spec is a specific Internet identifier that contains a locally
interpreted string followed by the at-sign character ("@", ASCII value 64)
followed by an Internet domain. The locally interpreted string is either a
quoted-string or a dot-atom. If the string can be represented as a dot-atom
(that is, it contains no characters other than atext characters or "."
surrounded by atext characters), then the dot-atom form SHOULD be used and
the quoted-string form SHOULD NOT be used. Comments and folding whitespace
SHOULD NOT be used around the "@" in the addr-spec.

addr-spec       =       local-part "@" domain

local-part      =       dot-atom / quoted-string / obs-local-part

domain          =       dot-atom / domain-literal / obs-domain

domain-literal  =       [CFWS]
                        "[" *([FWS] (dtext / quoted-pair)) [FWS] "]"
                        [CFWS]

dtext           =       NO-WS-CTL /     ; Non-white-space controls

                        %d33-90 /       ; The rest of the US-ASCII
                        %d94-127        ;  characters not including "[",
                                        ;  "]", or "\"

The domain portion is a fully qualified identifier for an Internet host.
For example, in a mailbox address, it is the host on which the particular
mailbox resides. In the dot-atom form, this is interpreted as an Internet
domain name (either a host name or a mail exchanger name) as described in
[DNS]. In the domain-literal form, the domain is interpreted as the literal
Internet address of the particular host. In both cases, how addressing is
used and how messages are transported to a particular host is covered in
the mail transport document [SMTP]. These mechanisms are outside of the
scope of this document.

The local-part portion is a domain dependent string. In addresses, it is
simply interpreted on the particular host as a name of a particular mailbox.
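
The preference for the dot-atom form of the local-part can be sketched as
follows. The function names format_local_part and make_addr_spec are
hypothetical. The sketch assumes that the local string contains only
characters permitted in a quoted-string and that the domain is already in
valid form; it performs no validation of either.

```python
import re

_ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
_DOT_ATOM_TEXT = re.compile(_ATEXT + r"+(?:\." + _ATEXT + r"+)*")

def format_local_part(local: str) -> str:
    """Use the dot-atom form when possible, else a quoted-string."""
    if _DOT_ATOM_TEXT.fullmatch(local):
        return local
    # Backslash and DQUOTE must appear as quoted-pairs inside the
    # quoted-string; then the enclosing quote characters are added.
    escaped = re.sub(r'([\\"])', r"\\\1", local)
    return '"' + escaped + '"'

def make_addr_spec(local: str, domain: str) -> str:
    """Join a local string and a domain with no CFWS around the '@'."""
    return format_local_part(local) + "@" + domain
```

Thus a local string of john.doe is emitted unchanged, while one that
contains a space is emitted in the quoted-string form.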

3.5 Overall message syntax

A message consists of header fields, optionally followed by a message body.
In a message body, though all of the characters listed in the text rule MAY
be used, the US-ASCII control characters (values 1 through 8, 11, 12, and 14
through 31) SHOULD NOT be used. Also, though the lines in the body MAY be a
maximum of 998 characters excluding the CRLF, lines SHOULD be limited to 78
characters excluding the CRLF.

message         =       (fields / obs-fields)
                        [CRLF body]

body            =       *(*998text CRLF) *998text

The header fields carry most of the semantic information and are defined in
section 3.6. The body is simply a series of lines of text which are
uninterpreted for the purposes of this standard.

3.6 Field definitions

The header fields of a message are defined here. All header fields have the
same general syntactic structure: A field name, followed by a colon,
followed by the field body. The specific syntax for each header field is
defined in the subsequent sections.

Note: In the ABNF syntax for each field in subsequent sections, each field
name is followed by the required colon. However, for brevity sometimes the
colon is not referred to in the textual description of the syntax. It is,
nonetheless, required.

It is important to note that the header fields are not guaranteed to be in
a particular order. They may appear in any order, and they have been known
to be reordered occasionally when transported over the Internet. However,
for the purposes of this standard, header fields SHOULD NOT be reordered
when a message is transported or transformed. More importantly, the trace
header fields and resent header fields MUST NOT be reordered, and SHOULD be
kept in blocks prepended to the message. See sections 3.6.6 and 3.6.7 for
more information.

The only required header fields are the origination date field and the
originator address field(s). All other header fields are syntactically
optional. More information is contained in the table following this
definition.

fields          =       *(trace
                          *(resent-date /
                           resent-from /
                           resent-sender /
                           resent-to /
                           resent-cc /
                           resent-bcc /
                           resent-id))
                        *(orig-date /
                        from /
                        sender /
                        reply-to /
                        to /
                        cc /
                        bcc /
                        message-id /
                        in-reply-to /
                        references /
                        subject /
                        comments /
                        keywords /
                        optional-field)

The following table indicates limits on the number of times each field may
occur in a message header as well as any special limitations on the use of
those fields. An asterisk next to a value in the minimum or maximum column
indicates that a special restriction appears in the Notes column.

Field           Min number      Max number      Notes

trace           0               infinite        Block prepended - see 3.6.7

resent-date     0*              infinite*       One per block, required if
                                                other resent fields present
                                                - see 3.6.6

resent-from     0               infinite*       One per block - see 3.6.6

resent-sender   0*              infinite*       One per block, MUST occur
                                                with multi-address
                                                resent-from - see 3.6.6

resent-to       0               infinite*       One per block - see 3.6.6

resent-cc       0*              infinite*       One per block, SHOULD only
                                                occur with resent-to - see
                                                3.6.6

resent-bcc      0               infinite*       One per block - see 3.6.6

resent-id       0               infinite*       One per block - see 3.6.6

orig-date       1               1

from            1               1               See sender and 3.6.2

sender          0*              1               MUST occur with multi-address
                                                from - see 3.6.2

reply-to        0               1

to              0               1

cc              0               1*              SHOULD occur only with to -
                                                see 3.6.3

bcc             0               1

message-id      0*              1               SHOULD be present - see 3.6.4

in-reply-to     0*              1               SHOULD occur in some replies
                                                - see 3.6.4

references      0*              1               SHOULD occur in some replies
                                                - see 3.6.4

subject         0               1

comments        0               infinite

keywords        0               infinite

optional-field  0               infinite

The exact interpretation of each field is described in subsequent sections.
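
As an illustration only, the fixed limits in the table above can be checked
mechanically. The following Python sketch is not part of the syntax; the
function name check_field_counts and its input (a plain list of field names)
are assumptions made for the example. It covers only the "required" and
"at most once" rows, not the per-block rules for resent and trace fields.

```python
from collections import Counter

# Fields that the table above limits to at most one occurrence.
# Trace, resent, comments, keywords, and optional fields are not listed
# because the table gives them no fixed upper bound.
MAX_ONCE = {
    "date", "from", "sender", "reply-to", "to", "cc", "bcc",
    "message-id", "in-reply-to", "references", "subject",
}

# Fields with a minimum count of 1 in the table.
REQUIRED = {"date", "from"}


def check_field_counts(field_names):
    """Return a list of problems found; the list is empty if none are found."""
    counts = Counter(name.lower() for name in field_names)
    problems = []
    for name in sorted(REQUIRED):
        if counts[name] == 0:
            problems.append(f"missing required field: {name}")
    for name in sorted(MAX_ONCE):
        if counts[name] > 1:
            problems.append(
                f"field occurs {counts[name]} times, max is 1: {name}"
            )
    return problems
```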

3.6.1 The origination date field

The origination date field consists of the field name "Date" followed by a
date-time specification.

orig-date       =       "Date:" date-time CRLF

The origination date specifies the date and time at which the creator of
the message indicated that the message was complete and ready to enter the
mail delivery system. For instance, this might be the time that a user
pushes the "send" or "submit" button in an application program. In any
case, it is specifically not intended to convey the time that the message
is actually transported, but rather the time at which the human or other
creator of the message has put the message in its final form, ready for
transport. (For example, a laptop user who is not connected to a network
might queue a message for delivery. The origination date should contain the
date and time that the user queued the message, not the time when the user
connected to the network to send the message.)
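
A minimal sketch of the laptop example, in Python. The function name
orig_date_field is hypothetical, and the sketch assumes that the output of
the standard library's email.utils.format_datetime is acceptable as the
date-time defined elsewhere in this document. The point is only that the
time captured when the user queues the message is the one that is
formatted, regardless of when the message is later transported.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def orig_date_field(queued_at):
    """Build a "Date:" field line (without CRLF) from the queueing time.

    queued_at must be a timezone-aware datetime recorded when the user
    finished the message, not when the message is later transmitted.
    """
    if queued_at.tzinfo is None:
        raise ValueError("queued_at must carry a time zone offset")
    return "Date: " + format_datetime(queued_at)
```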

3.6.2 Originator fields

The originator fields of a message consist of the from field, the sender
field (when applicable) and optionally the reply-to field. The from field
consists of the field name "From" and a comma-separated list of one or more
mailbox specifications. If the from field contains more than one mailbox
specification in the mailbox-list, then the sender field, containing the
field name "Sender" and a single mailbox specification, MUST appear in the
message. In either case, an optional reply-to field may also be included,
which contains the field name "Reply-To" and a comma-separated list of one
or more addresses.

from            =       "From:" mailbox-list CRLF

sender          =       "Sender:" mailbox CRLF

reply-to        =       "Reply-To:" address-list CRLF

The originator fields indicate the mailbox(es) of the source of the
message. The "From:" field specifies the author(s) of the message, that is,
the mailbox(es) of the person(s) or system(s) responsible for the writing
of the message. The "Sender:" field specifies the mailbox of the agent
responsible for the actual transmission of the message. For example, if a
secretary were to send a message for another person, the mailbox of the
secretary would go in the "Sender:" field and the mailbox of the actual
author would go in the "From:" field. If the originator of the message can
be indicated by a single mailbox and the author and transmitter are
identical, the "From:" field SHOULD be used and the "Sender:" field SHOULD
NOT be used. Otherwise, both fields SHOULD appear.

The originator fields also provide the information required to reply to a
message. When the "Reply-To:" field is present, it indicates the
mailbox(es) to which the author of the message suggests that replies be
sent. In the absence of the "Reply-To:" field, replies SHOULD be sent to
the mailbox(es) specified in the "From:" field.

In all cases, the "From:" field SHOULD NOT contain any mailbox which does
not belong to the author(s) of the message. See also section 3.6.3 for more
information on forming the destination addresses for a reply.
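
The two rules above (when "Sender:" is required, and where replies go) can
be sketched as follows. This is an illustration in Python under the
assumption that the fields have already been parsed into lists of address
strings; the function names sender_required and reply_targets are
hypothetical.

```python
def sender_required(from_mailboxes):
    """True if the "Sender:" field MUST appear (more than one author)."""
    return len(from_mailboxes) > 1


def reply_targets(from_mailboxes, reply_to=None):
    """Return the addresses a reply is sent to.

    The "Reply-To:" addresses are used when that field is present;
    otherwise the "From:" mailboxes are used.
    """
    if reply_to:
        return list(reply_to)
    return list(from_mailboxes)
```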

3.6.3 Destination address fields

The destination fields of a message consist of three possible fields, each
of the same form: The field name, which is either "To", "Cc", or "Bcc",
followed by a comma-separated list of one or more addresses (either mailbox
or group syntax). Both the "To:" field and the "Bcc:" field MAY occur
alone, but the "Cc:" field SHOULD only be present if the "To:" field is
also present.

to              =       "To:" address-list CRLF

cc              =       "Cc:" address-list CRLF

bcc             =       "Bcc:" (address-list / [CFWS]) CRLF

The destination fields specify the recipients of the message. Each
destination field may have one or more addresses, and each of the addresses
receives a copy of the message. The only difference between the three
fields is how each is used.

The "To:" field contains the address(es) of the primary recipient(s) of the
message.

The "Cc:" field (where the "Cc" means "Carbon Copy" in the sense of making
a copy on a typewriter using carbon paper) contains the addresses of others
who should receive the message, though the content of the message may not
be directed at them.

The "Bcc:" field (where the "Bcc" means "Blind Carbon Copy") contains
addresses of recipients of the message whose addresses should not be
revealed to other recipients of the message. There are three ways in which
the "Bcc:" field is used. In the first case, when a message containing a
"Bcc:" field is prepared to be sent, the "Bcc:" line is removed even though
all of the recipients (including those specified in the "Bcc:" field) are
sent a copy of the message. In the second case, recipients specified in the
"To:" and "Cc:" lines each are sent a copy of the message with the "Bcc:"
line removed as above, but the recipients on the "Bcc:" line get a separate
copy of the message containing a "Bcc:" line. (When there are multiple
recipient addresses in the "Bcc:" field, some implementations actually send
a separate copy of the message to each recipient with a "Bcc:" containing
only the address of that particular recipient.) Finally, since a "Bcc:"
field may contain no addresses, a "Bcc:" field can be sent without any
addresses indicating to the recipients that blind copies were sent to
someone. Which method to use with "Bcc:" fields is implementation
dependent, but refer to the "Security Considerations" section of this
document for a discussion of each.
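
The three ways of handling "Bcc:" described above can be sketched as a
delivery plan. The Python below is an illustration only: the function name
plan_bcc_delivery, the numeric method argument, and the return shape are
all assumptions made for the example. For the second method it uses the
variant in which each blind recipient gets a copy naming only that
recipient.

```python
def plan_bcc_delivery(to, cc, bcc, method):
    """Return a list of (recipients, bcc_body) pairs, one per copy sent.

    bcc_body is None when the copy carries no "Bcc:" field, an empty
    string when it carries an empty "Bcc:" field, or an address string.

    method 1: one copy for all recipients, "Bcc:" removed.
    method 2: visible recipients get a copy without "Bcc:"; each blind
              recipient gets a separate copy with only its own address.
    method 3: one copy for all recipients, with an empty "Bcc:" field.
    """
    visible = list(to) + list(cc)
    if method == 1:
        return [(visible + list(bcc), None)]
    if method == 2:
        plan = [(visible, None)] if visible else []
        plan.extend(([addr], addr) for addr in bcc)
        return plan
    if method == 3:
        return [(visible + list(bcc), "")]
    raise ValueError("method must be 1, 2, or 3")
```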

When a message is a reply to another message, the mailboxes of the authors
of the original message (the mailboxes in the "From:" or "Reply-To:"
fields) MAY appear in the "To:" field of the reply, since that would
normally be the primary recipient. If a reply is sent to a message that has
destination fields, it is often desirable to send a copy of the reply to
all of the recipients of the message in addition to the author. When such a
reply is formed, addresses in the "To:" and "Cc:" fields of the original
message MAY appear in the "Cc:" field of the reply, since these are
normally secondary recipients of the reply. If a "Bcc:" field is present in
the original message, addresses in that field MAY appear in the "Bcc:"
field of the reply, but SHOULD NOT appear in the "To:" or "Cc:" fields.

Note: Some mail applications have automatic reply commands that include the
destination addresses of the original message in the destination addresses
of the reply. How those reply commands behave is implementation dependent
and is beyond the scope of this document. In particular, whether or not to
include the original destination addresses when the original message had a
"Reply-To:" field is not addressed here.
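
Since reply behavior is implementation dependent, the following Python
sketch shows only one possible policy for a "reply to all" command, not a
required one. The function name reply_all_destinations and the optional
"me" argument (the replier's own address, dropped from the copy list) are
assumptions of the example, as is the simplification of comparing whole
addresses case-insensitively.

```python
def reply_all_destinations(orig_from, orig_to, orig_cc,
                           orig_reply_to=(), me=None):
    """Return (to_list, cc_list) for a reply to all recipients.

    The "To:" of the reply holds the original "Reply-To:" addresses if
    any, otherwise the original "From:" mailboxes. The "Cc:" of the
    reply holds the original "To:" and "Cc:" addresses, minus
    duplicates and minus the replier.
    """
    primary = list(orig_reply_to) or list(orig_from)
    seen = {addr.lower() for addr in primary}
    if me is not None:
        seen.add(me.lower())
    secondary = []
    for addr in list(orig_to) + list(orig_cc):
        if addr.lower() not in seen:
            seen.add(addr.lower())
            secondary.append(addr)
    return primary, secondary
```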

3.6.4 Identification fields

Though optional, every message SHOULD have a "Message-ID:" field.
Furthermore, reply messages SHOULD have "In-Reply-To:" and "References:"
fields as appropriate, as described below.

The "Message-ID:" and "In-Reply-To:" fields each contain a single unique
message identifier. The "References:" field contains one or more unique
message identifiers, optionally separated by CFWS.

The message identifier (msg-id) is similar in syntax to an addr-spec
construct enclosed in the angle bracket characters, "<" and ">", without
the internal CFWS.

message-id      =       "Message-ID:" msg-id CRLF

in-reply-to     =       "In-Reply-To:" msg-id CRLF

references      =       "References:" 1*msg-id CRLF

msg-id          =       [CFWS] "<" id-left-side "@" id-right-side ">" [CFWS]

id-left-side    =       dot-atom-text / no-fold-quote / obs-id-left

id-right-side   =       dot-atom-text / no-fold-literal / obs-id-right

no-fold-quote   =       DQUOTE *(qtext / quoted-pair) DQUOTE

no-fold-literal =       "[" *(dtext / quoted-pair) "]"

The "Message-ID:" field provides a unique message identifier which refers
to a particular version of a particular message. The uniqueness of the
message identifier is guaranteed by the host which generates it (see
below). This message identifier is intended to be machine readable and not
necessarily meaningful to humans. A message identifier pertains to exactly
one instantiation of a particular message; subsequent revisions to the
message should each receive new message identifiers.

Note: When messages are introduced into the transport system, they are
often prepended with additional header fields such as trace fields
(described in section 3.6.7) and resent fields (described in section
3.6.6). Even though the addition of these fields "changes" the message in
some sense, such additional fields do not require changing the
"Message-ID:" field of the message. It is, in effect, the same message to
which transport trace information has been prepended.

The "In-Reply-To:" and "References:" fields are used when creating a reply
to a message. They hold the message identifier of the original message and
the message identifiers of other messages (for example, in the case of a
reply to a message which was itself a reply). If the original message
contains a "Message-ID:" field, the contents of that field body should be
copied into the body of an "In-Reply-To:" field and into the body of a
"References:" field in the new message. If the original message contains a
"References:" field and/or an "In-Reply-To:" field already (hence a reply
to a reply), the contents of the old "References:" field should be copied
to the "References:" field in the new message, appending to it the contents
of the old "In-Reply-To:" field (if its message identifier was not already
in the "References:" field) and the contents of the "Message-ID:" field of
the original message. In this way, a "thread" of conversation can be
established.
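
The threading procedure above can be sketched in Python as follows. The
function name reply_threading_fields is hypothetical, and the sketch
assumes the identifiers have already been extracted from the original
message as strings such as "<id@host>".

```python
def reply_threading_fields(orig_message_id, orig_references=(),
                           orig_in_reply_to=None):
    """Return (in_reply_to, references) for a reply.

    references starts with the original "References:" identifiers, then
    the original "In-Reply-To:" identifier if it is not already listed,
    then the original "Message-ID:".
    """
    references = list(orig_references)
    if orig_in_reply_to and orig_in_reply_to not in references:
        references.append(orig_in_reply_to)
    if orig_message_id:
        references.append(orig_message_id)
    return orig_message_id, references
```

The resulting list would be written into the "References:" field of the
reply, and the first element of the pair into its "In-Reply-To:" field.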

The msg-id itself is a domain-dependent unique identifier. The domain
portion of the msg-id SHOULD be the domain name of the host on which it was
created, to guarantee uniqueness. The local-part portion of the msg-id MAY
be any dot-atom or quoted-string. However, the entire msg-id MUST be
globally unique. In order to do this, a common practice is to form the
local-part by using a combination of the current absolute time and some
other currently unique identifier on the host (for example a system process
identifier).

The message identifier itself MUST be a globally unique identifier for a
message. The generator of the message identifier MUST guarantee that the
msg-id is unique. There are several algorithms that can be used to
accomplish this. Since the msg-id has a similar syntax to addr-spec
(identical except that comments and folding whitespace are not allowed), a
good method is to put the domain name or a domain literal IP address of the
host on which the message identifier was created on the right hand side of
the "@", and on the left hand side, put a combination of the current
absolute date and time along with some other currently unique (perhaps
sequential) identifier available on the system (for example, a process id
number). Using a date on the left hand side and a domain name or domain
literal on the right hand side makes it possible to guarantee uniqueness
since no two hosts should be using the same domain name or IP address at
the same time. Though other algorithms will work, it is RECOMMENDED that
the right hand side contain some domain identifier (either of the host
itself or otherwise) such that the generator of the message identifier can
guarantee the uniqueness of the left hand side within the scope of that
domain.
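
One way to follow the recommendation above is sketched below in Python. It
is an illustration, not a required algorithm: the function name
make_msg_id and the exact layout of the left hand side (seconds since the
epoch, process id, and a per-process counter, joined by periods) are
assumptions of the example. Uniqueness still depends on the caller
passing a domain name that no other host is using.

```python
import itertools
import os
import time

# Per-process sequence number, so that two identifiers generated within
# the same second by the same process still differ.
_sequence = itertools.count()


def make_msg_id(domain):
    """Return a msg-id of the form <time.pid.sequence@domain>."""
    left = f"{int(time.time())}.{os.getpid()}.{next(_sequence)}"
    return f"<{left}@{domain}>"
```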

3.6.5 Informational fields

The informational fields are all optional. The "Keywords:" field contains a
comma-separated list of one or more words or quoted-strings. The "Subject:"
and "Comments:" fields are unstructured fields as defined in section 2.2.1,
and therefore may contain text or folding white-space.

subject         =       "Subject:" unstructured CRLF

comments        =       "Comments:" unstructured CRLF

keywords        =       "Keywords:" phrase *("," phrase) CRLF

These three fields are only intended to have human-readable content with
information about the message. The "Subject:" field is the most common and
contains a short string identifying the topic of the message. When used in
a reply, the field body MAY start with the string "Re: " (from the Latin
"res", in the matter of) followed by the contents of the "Subject:" field
body of the original message. The "Comments:" field contains any additional
comments on the text of the body of the message. The "Keywords:" field
contains a comma-separated list of important words and phrases that might
be useful for the recipient.

3.6.6 Resent fields

Resent fields SHOULD be added to any message which is reintroduced by a
user into the transport system. A separate set of resent fields SHOULD be
added if this occurs multiple times. All of the resent fields corresponding
to a particular resending of the message SHOULD be together. Each new set
of resent fields should be prepended to the message; that is, the most
recent set of resent fields should appear earlier in the message. No other
fields in the message should be changed when resent fields are added.

Each of the resent fields corresponds to a particular field elsewhere in
the syntax. For instance, the "Resent-Date:" field corresponds to the
"Date:" field and the "Resent-To:" field corresponds to the "To:" field. In
each case, the syntax for the field body is identical to the syntax given
previously for the corresponding field.

When resent fields are used, the "Resent-From:" and "Resent-Date:" fields
MUST be sent. The "Resent-Cc:" field SHOULD NOT be sent if the "Resent-To:"
field is not present. The "Resent-Message-ID:" field SHOULD be sent.
"Resent-Sender:" SHOULD NOT be used if "Resent-Sender:" would be identical
to "Resent-From:".

resent-date     =       "Resent-Date:" date-time CRLF

resent-from     =       "Resent-From:" mailbox-list CRLF

resent-sender   =       "Resent-Sender:" mailbox CRLF

resent-to       =       "Resent-To:" address-list CRLF

resent-cc       =       "Resent-Cc:" address-list CRLF

resent-bcc      =       "Resent-Bcc:" (address-list / [CFWS]) CRLF

resent-id       =       "Resent-Message-ID:" msg-id CRLF

Resent fields are used to identify a message as having been reintroduced
into the transport system by a user. The purpose of using resent fields is
to have the message appear to the final recipient as if it were sent
directly by the original sender, with all of the original fields remaining
the same. Each set of resent fields corresponds to a particular resending
event. That is, if a message is resent multiple times, each set of resent
fields gives identifying information for each individual time. Resent
fields are strictly informational. They MUST NOT be used in the normal
processing of replies or other such automatic actions on messages.
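
As a sketch of the prepending rule, the following Python treats a header
as a list of (name, body) pairs. The function name prepend_resent_block
and its parameters are hypothetical. The existing fields are left
unchanged and the new block is placed in front of them, so the most
recent resending appears first.

```python
def prepend_resent_block(fields, resent_from, resent_date,
                         resent_to=None, resent_msg_id=None):
    """Return a new field list with one resent block prepended.

    "Resent-From:" and "Resent-Date:" are always written; the other
    resent fields are written only when a value is supplied.
    """
    block = [("Resent-Date", resent_date), ("Resent-From", resent_from)]
    if resent_to:
        block.append(("Resent-To", resent_to))
    if resent_msg_id:
        block.append(("Resent-Message-ID", resent_msg_id))
    return block + list(fields)
```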

Note: Reintroducing a message into the transport system and using resent
fields is a different operation from "forwarding". Forwarding a message is
to make it the body of a new message. A forwarded message does not appear
to have come from the original sender, but is an entirely new message from
the forwarder of the message. Resent header fields are not intended for use
with forwarding.

The resent originator fields indicate the mailbox of the person(s) or
system(s) that resent the message. As with the regular originator fields,
there are two forms: a simple "Resent-From:" form which contains the
mailbox of the individual doing the resending, and the more complex form,
when one individual (identified in the "Resent-Sender:" field) resends a
message on behalf of one or more others (identified in the "Resent-From:"
field).

Note: When replying to a resent message, replies should behave just as they
would with any other message, using the original "From:", "Reply-To:",
"Message-ID:", and other fields. The resent fields are only informational
and MUST NOT be used in the normal processing of replies.

The "Resent-Date:" indicates the date and time at which the resent message
is dispatched by the resender of the message. Like the "Date:" field, it is
not the date and time that the message was actually transported.

The "Resent-To:", "Resent-Cc:", and "Resent-Bcc:" fields function
identically to the "To:", "Cc:", and "Bcc:" fields respectively, except
that they indicate the recipients of the resent message, not the recipients
of the original message.

The "Resent-Message-ID:" field provides a unique identifier for the resent
message.

3.6.7 Trace fields

The trace fields are a group of header fields consisting of an optional
"Return-Path:" field, and one or more "Received:" fields. The
"Return-Path:" header field contains a pair of angle brackets which enclose
an optional addr-spec. The "Received:" field contains a series of
token-value pairs followed by a semicolon and a date-time specification.
The first item of the token-value pair is defined by token-name and the
second item is either an atom or a quoted-string. Further restrictions may
be applied to the syntax of the trace fields by standards which provide for
their use, such as [SMTP].

trace           =       [return]
                        1*received

return          =       "Return-Path:" path CRLF

path            =       ([CFWS] "<" ([CFWS] / addr-spec) ">" [CFWS]) / obs-path

received        =       "Received:" [CFWS] *token-value ";" date-time CRLF

token-value     =       [CFWS] token-name CFWS word

token-name      =       ALPHA *(["-"] (ALPHA / DIGIT))

A full discussion of the Internet mail use of trace fields is contained in
[SMTP]. For the purposes of this standard, the trace fields are strictly
informational, and any formal interpretation of them is outside of the
scope of this document.

3.6.8 Optional fields

Fields may appear in messages that are otherwise unspecified in this
standard. They must conform to the syntax of an optional-field. This is
basically a field name, made up of the printable US-ASCII characters except
SP and colon, followed by a colon, followed by unstructured text.

The field names of any optional-field MUST NOT be identical to any field
name specified elsewhere in this standard.

optional-field  =       field-name ":" unstructured CRLF

field-name      =       1*ftext

ftext           =       %d33-57 /               ; Any character except
                        %d59-126                ;  controls, SP, and ":".

For the purposes of this standard, the meaning of any optional field is
uninterpreted.
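
The ftext rule translates directly into a character class. The Python
sketch below is an illustration only; the function name
is_valid_optional_field_name is hypothetical, and the case-insensitive
comparison against the field names defined in this standard is an
assumption of the example.

```python
import re

# ftext = %d33-57 / %d59-126: printable US-ASCII except SP and ":".
_FIELD_NAME = re.compile(r"[\x21-\x39\x3b-\x7e]+")

# Field names specified elsewhere in this standard.
_RESERVED = {
    "date", "from", "sender", "reply-to", "to", "cc", "bcc",
    "message-id", "in-reply-to", "references", "subject", "comments",
    "keywords", "resent-date", "resent-from", "resent-sender",
    "resent-to", "resent-cc", "resent-bcc", "resent-message-id",
    "return-path", "received",
}


def is_valid_optional_field_name(name):
    """True if name matches field-name and is not a specified field."""
    if _FIELD_NAME.fullmatch(name) is None:
        return False
    return name.lower() not in _RESERVED
```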

4. Obsolete Syntax

Earlier versions of this standard allowed for different (usually more
liberal) syntax than is allowed in this version. Also, there have been
syntactic elements used in messages on the Internet that have never been
documented. Though these syntactic forms MUST NOT be generated according to
the grammar in section 3, they MUST be accepted and parsed by a conformant
receiver. This section documents these syntactic elements. Taking the
grammar in section 3 and adding the definitions presented in this section
will result in the grammar to use for interpretation of messages.

One important difference between the obsolete (interpreting) and the
current (generating) syntax is that in structured header field bodies
(i.e., between the colon and the CRLF of any structured header field),
white-space characters, including folding white-space, and comments could
be freely inserted between any syntactic tokens. This allowed many complex
forms that have proven difficult for some implementations to parse.

Another key difference between the obsolete and the current syntax is that
the rule in section 3.2.4 regarding comments and folding whitespace does
not apply. See the discussion of folding whitespace in section 4.2 below.

Finally, certain characters which were formerly allowed in messages appear
in this section. The NUL character (ASCII value 0) was once allowed, but is
no longer allowed, for compatibility reasons. CR and LF were allowed to
appear in messages other than as CRLF. This use is also shown here.

Other differences in syntax and semantics are noted in the following sections.

4.1 Miscellaneous obsolete tokens

These syntactic elements are used elsewhere in the obsolete syntax or in
the main syntax. The obs-char and obs-qp elements each add ASCII value 0.
Bare CR and bare LF are added to obs-text. The period character is added to
obs-phrase.

obs-qp          =       "\" (%d0-127)

obs-text        =       *(*LF *CR obs-char)

obs-char        =       %d0-9 / %d11 /          ; %d0-127 except CR and LF
                        %d12 / %d14-127

obs-phrase      =       word *(word / "." / CFWS)

4.2 Obsolete folding whitespace

In the obsolete syntax, any amount of folding whitespace MAY be inserted
where the obs-FWS rule is allowed. This creates the possibility of having
two consecutive "folds" in a line, and therefore the possibility that a
line which makes up a folded header field could be composed entirely of
whitespace.

obs-FWS         =       1*WSP *(CRLF 1*WSP)

4.3 Obsolete Date and Time

The syntax for the obsolete date format allows a 2 digit year in the date
field and allows for a list of alphabetic time zone specifications which
were used in earlier versions of this standard. It also permits comments
and folding whitespace between many of the tokens.

obs-day-of-week =       [CFWS] day-name [CFWS]

obs-year        =       [CFWS] 2*DIGIT [CFWS]

obs-month       =       CFWS month-name CFWS

obs-day         =       [CFWS] 1*2DIGIT [CFWS]

obs-hour        =       [CFWS] 2DIGIT [CFWS]

obs-minute      =       [CFWS] 2DIGIT [CFWS]

obs-second      =       [CFWS] 2DIGIT [CFWS]

obs-zone        =       "UT" / "GMT" /          ; Universal Time
                                                ; North American UT offsets
                        "EST" / "EDT" /         ; Eastern:  - 5/ - 4
                        "CST" / "CDT" /         ; Central:  - 6/ - 5
                        "MST" / "MDT" /         ; Mountain: - 7/ - 6
                        "PST" / "PDT" /         ; Pacific:  - 8/ - 7

                        %d65-73 /               ; Military zones - "A"
                        %d75-90 /               ; through "I" and "K" through
                        %d97-105 /              ; "Z", both upper and lower
                        %d107-122               ; case

Where a two or three digit year occurs in a date, the year should be
interpreted as follows: If a two digit year is encountered whose value is
between 00 and 49, the year should be interpreted by adding 2000, ending up
with a value between 2000 and 2049. If a two digit year is encountered with
a value between 50 and 99, or any three digit year is encountered, the year
should be interpreted by adding 1900.
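
The year rule above is small enough to state as code. The following Python
sketch is illustrative; the function name interpret_obs_year is
hypothetical, and it takes the year as the digit string found in the date
so that the number of digits is known.

```python
def interpret_obs_year(digits):
    """Map a year written with two or more digits to a full year."""
    if not digits.isdigit() or len(digits) < 2:
        raise ValueError("year must be two or more decimal digits")
    year = int(digits)
    if len(digits) == 2:
        # 00 through 49 mean 2000 through 2049; 50 through 99 mean
        # 1950 through 1999.
        return year + 2000 if year <= 49 else year + 1900
    if len(digits) == 3:
        # Any three digit year has 1900 added to it.
        return year + 1900
    return year
```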

In the obsolete time zone, "UT" and "GMT" are indications of "Universal
Time" and "Greenwich Mean Time" respectively and are both semantically
identical to "+0000". The remaining three character zones are the US time
zones. The "T" is simply "Time" and the "E", "C", "M", and "P" are
"Eastern", "Central", "Mountain" and "Pacific". When followed by "S" (for
"Standard"), these are equivalent to "-0500", "-0600", "-0700", and
"-0800" respectively. When followed by "D" (for "Daylight" or summer time),
they each add an hour and are therefore "-0400", "-0500", "-0600", and
"-0700" respectively. The 1 character military time zones were defined in a
non-standard way in [RFC-822] and are therefore unpredictable in their
meaning. The original definitions of the military zones "A" through "I" are
equivalent to "+0100" through "+0900" respectively; "K", "L", and "M" are
equivalent to  "+1000", "+1100", and "+1200" respectively; "N" through "Y"
are equivalent to "-0100" through "-1200" respectively; and "Z" is
equivalent to "+0000". However, because of the error in [RFC-822], they
SHOULD all be considered equivalent to "-0000".

Other multi-character (usually between 3 and 5) alphabetic time zones have
been used in Internet messages. Any of these time zones SHOULD be
considered equivalent to "-0000".
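
The zone interpretation described above amounts to a small lookup table
with a default. The Python sketch below is an illustration; the names
OBS_ZONE_OFFSETS and interpret_obs_zone are hypothetical. Military zones
and any other alphabetic zone fall through to "-0000".

```python
# Offsets for the named zones listed in obs-zone.
OBS_ZONE_OFFSETS = {
    "UT": "+0000", "GMT": "+0000",
    "EST": "-0500", "EDT": "-0400",
    "CST": "-0600", "CDT": "-0500",
    "MST": "-0700", "MDT": "-0600",
    "PST": "-0800", "PDT": "-0700",
}


def interpret_obs_zone(zone):
    """Return the numeric offset string to use for an alphabetic zone."""
    # Single-letter military zones and unrecognized names are not in the
    # table, so they are treated as "-0000".
    return OBS_ZONE_OFFSETS.get(zone.upper(), "-0000")
```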

4.4 Obsolete Addressing

There are three primary differences in addressing. First, mailbox addresses
were allowed to have a route portion before the addr-spec when enclosed in
"<" and ">". The route is simply a comma-separated list of domain names,
each preceded by "@", and the list terminated by a colon. Second, CFWS were
allowed between the period-separated elements of local-part and domain
(i.e., dot-atom was not used). Finally, mailbox-list and address-list were
allowed to have "null" members. That is, there could be two or more commas
in such a list with nothing in between them.

obs-mailbox     =       addr-spec / [display-name] obs-route-addr

obs-route-addr  =       [CFWS] "<" [obs-route] addr-spec ">" [CFWS]

obs-route       =       [CFWS] obs-domain-list ":" [CFWS]

obs-domain-list =       "@" domain *(*(CFWS / "," ) [CFWS] "@" domain)

obs-local-part  =       atom *("." atom)

obs-domain      =       atom *("." atom)

obs-mbox-list   =       *([mailbox] [CFWS] "," [CFWS])

obs-addr-list   =       *([address] [CFWS] "," [CFWS])

When interpreting addresses, the route portion SHOULD be ignored.

4.5 Obsolete header fields

Syntactically, the primary difference in the obsolete field syntax is that
it allows multiple occurrences of any of the fields and they may occur in
any order. Also, any amount of whitespace is allowed before the ":" at the
end of the field name.

obs-fields      =       *(obs-return /
                        obs-received /
                        obs-orig-date /
                        obs-from /
                        obs-sender /
                        obs-reply-to /
                        obs-to /
                        obs-cc /
                        obs-bcc /
                        obs-message-id /
                        obs-in-reply-to /
                        obs-references /
                        obs-subject /
                        obs-comments /
                        obs-keywords /
                        obs-resent-from /
                        obs-resent-send /
                        obs-resent-date /
                        obs-resent-rply /
                        obs-resent-to /
                        obs-resent-cc /
                        obs-resent-bcc /
                        obs-resent-mid /
                        obs-optional)

Except for destination address fields (described in section 4.5.3), the
interpretation of multiple occurrences of fields is unspecified. Also, the
interpretation of trace fields and resent fields which do not occur in
blocks prepended to the message is unspecified as well. Unless otherwise
noted in the following sections, interpretation of other fields is
identical to the interpretation of their non-obsolete counterparts in
section 3.

4.5.1 Obsolete origination date field

obs-orig-date   =       "Date" *WSP ":" date-time CRLF

4.5.2 Obsolete originator fields

obs-from        =       "From" *WSP ":" mailbox-list CRLF

obs-sender      =       "Sender" *WSP ":" mailbox CRLF

obs-reply-to    =       "Reply-To" *WSP ":" mailbox-list CRLF

4.5.3 Obsolete destination address fields

obs-to          =       "To" *WSP ":" address-list CRLF

obs-cc          =       "Cc" *WSP ":" address-list CRLF

obs-bcc         =       "Bcc" *WSP ":" (address-list / [CFWS]) CRLF

When multiple occurrences of destination address fields occur in a message,
they SHOULD be treated as if the address-list in the first occurrence of
the field is combined with the address lists of the subsequent occurrences
by adding a comma and concatenating.
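
A sketch of this combining rule in Python follows. It is illustrative
only: the function name combine_destination_fields is hypothetical, and
the input is assumed to be a list of (name, body) pairs in the order the
fields appeared. Empty bodies (as allowed for "Bcc:") contribute nothing
to the combined list.

```python
DESTINATION_FIELDS = ("to", "cc", "bcc")


def combine_destination_fields(fields):
    """Merge repeated "To:", "Cc:", and "Bcc:" fields.

    Returns a dict mapping the lowercased field name to a single body
    in which the bodies of all occurrences are joined with commas, in
    order of appearance. Other fields are ignored.
    """
    parts = {}
    for name, body in fields:
        key = name.strip().lower()
        if key in DESTINATION_FIELDS:
            bodies = parts.setdefault(key, [])
            body = body.strip()
            if body:
                bodies.append(body)
    return {key: ", ".join(bodies) for key, bodies in parts.items()}
```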

4.5.4 Obsolete identification fields

The obsolete "In-Reply-To:" and "References:" fields differ from the
current syntax in that they allow phrase (words or quoted strings) to
appear. The obsolete forms of the left and right sides of msg-id allow
interspersed CFWS, making them syntactically identical to local-part and
domain respectively.

obs-message-id  =       "Message-ID" *WSP ":" msg-id CRLF

obs-in-reply-to =       "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF

obs-references  =       "References" *WSP ":" *(phrase / msg-id) CRLF

obs-id-left     =       local-part

obs-id-right    =       domain

For purposes of interpretation, the phrases in the "In-Reply-To:" and
"References:" fields may be ignored.

4.5.5 Obsolete informational fields

obs-subject     =       "Subject" *WSP ":" unstructured CRLF

obs-comments    =       "Comments" *WSP ":" unstructured CRLF

obs-keywords    =       "Keywords" *WSP ":" *([phrase] ",") CRLF

4.5.6 Obsolete resent fields

The obsolete syntax adds a "Resent-Reply-To:" field, which consists of the
field name, the optional comments and folding whitespace, the colon, and a
comma separated list of addresses.

obs-resent-from =       "Resent-From" *WSP ":" mailbox-list CRLF

obs-resent-send =       "Resent-Sender" *WSP ":" mailbox CRLF

obs-resent-date =       "Resent-Date" *WSP ":" date-time CRLF

obs-resent-to   =       "Resent-To" *WSP ":" address-list CRLF

obs-resent-cc   =       "Resent-Cc" *WSP ":" address-list CRLF

obs-resent-bcc  =       "Resent-Bcc" *WSP ":" (address-list / [CFWS]) CRLF

obs-resent-mid  =       "Resent-Message-ID" *WSP ":" msg-id CRLF

obs-resent-rply =       "Resent-Reply-To" *WSP ":" address-list CRLF

As with other resent fields, the "Resent-Reply-To:" field should be treated
as trace information only.

4.5.7 Obsolete trace fields

The obs-return and obs-received are again given here as template
definitions, just as return and received are in section 3. Their full
syntax is given in [SMTP].

obs-return      =       "Return-Path" *WSP ":" *([CFWS] text) CRLF

obs-received    =       "Received" *WSP ":" *([CFWS] text) CRLF

4.5.8 Obsolete optional fields

obs-optional    =       field-name *WSP ":" unstructured CRLF

5. Security Considerations

Care should be taken when displaying messages on a terminal or terminal
emulator. Powerful terminals may act on escape sequences and other
combinations of ASCII control characters which remap the keyboard or permit
other modifications to the terminal which could lead to denial of service
or even damaged data. Message viewers may wish to strip potentially
dangerous terminal escape sequences from the message prior to display.
However, other escape sequences appear in messages for useful purposes (cf.
[RFC-2045, RFC-2046, RFC-2047, RFC-2048, RFC-2049], [ISO-2022]) and
therefore should not be stripped indiscriminately.
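
One possible middle ground, sketched below in Python, is to display control
characters in a visible form instead of passing them to the terminal. This is
an assumption for illustration, not a recommendation of this standard: the
function name is hypothetical, and a viewer that needs to honor [ISO-2022]
escape sequences would have to treat those separately.

```python
def make_controls_visible(text):
    """Show ASCII control characters in caret notation, keeping TAB, CR, LF."""
    keep = {"\t", "\r", "\n"}
    out = []
    for ch in text:
        code = ord(ch)
        if ch not in keep and (code < 0x20 or code == 0x7F):
            # Caret notation: ESC (0x1B) becomes "^[", DEL (0x7F) becomes "^?".
            out.append("^" + chr(code ^ 0x40))
        else:
            out.append(ch)
    return "".join(out)

# A header line carrying an escape sequence that would clear some terminals.
shown = make_controls_visible("Subject: hi\x1b[2J\r\n")
```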

Transmission of non-text objects in messages raises additional security
issues. These issues are discussed in [RFC-2045, RFC-2046, RFC-2047,
RFC-2048, RFC-2049].

Many implementations use the "Bcc:" (blind carbon copy) field described in
section 3.6.3 to facilitate sending messages to recipients without
revealing the addresses of one or more of the addressees to the other
recipients. Mishandling this use of "Bcc:" has implications for
confidential information that might be revealed, which could eventually
lead to security problems through knowledge of even the existence of a
particular mail address. For example, if using the first method described
in section 3.6.3, where the "Bcc:" line is removed from the message, blind
recipients have no explicit indication that they have been sent a blind
copy, except insofar as their address does not appear in the message
header. Because of this, one of the blind addressees could potentially send
a reply to all of the shown recipients and accidentally reveal that the
message went to the blind recipient. When the second method from section
3.6.3 is used, the blind recipient's address appears in the "Bcc:" field of
a separate copy of the message. If the "Bcc:" field sent contains all of
the blind addressees, all of the "Bcc:" recipients will be seen by each
"Bcc:" recipient. Even if a separate message is sent to each "Bcc:"
recipient with only the individual's address, implementations must still be
careful to process replies to the message as per section 3.6.3 so as not to
accidentally reveal the blind recipient to other recipients.
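
The first method can be sketched as follows, using Python's standard email
package. This is a minimal illustration under stated assumptions: the function
name is hypothetical, and the returned address list stands in for whatever
envelope the transport system actually uses.

```python
import copy
from email.message import EmailMessage
from email.utils import getaddresses

def prepare_without_bcc(msg):
    """Return (envelope recipients, copy of msg with the Bcc field removed)."""
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(str(v) for v in msg.get_all(name, []))
    envelope = [addr for _, addr in getaddresses(fields)]
    outgoing = copy.deepcopy(msg)   # leave the author's copy untouched
    del outgoing["Bcc"]             # no error if the field is absent
    return envelope, outgoing

msg = EmailMessage()
msg["From"] = "John Doe <jdoe@machine.tld>"
msg["To"] = "Mary Smith <mary@harry.nil>"
msg["Bcc"] = "boss@test.nil"
msg.set_content("This is a message just to say hello.")

envelope, outgoing = prepare_without_bcc(msg)
```

The blind recipient still receives the message because the address is in the
envelope list, but the copy that is transmitted carries no "Bcc:" field.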

6. Bibliography

[RFC-822]

[RFC-2045]

[RFC-2046]

[RFC-2047]

[RFC-2048]

[RFC-2049]

[SMTP]

[RFC-2119]

[RFC-2234]

[ASCII]

[STD-12]

[DNS]

[ISO-2022]

7. Author's Address

Peter W. Resnick
QUALCOMM Incorporated
6455 Lusk Boulevard
San Diego, CA 92121-2779
Phone: +1 619 651 4478
FAX: +1 619 651 5334
e-mail: presnick@qualcomm.com

Grammar and syntax comments are welcome. Substantive comments on this
document should be directed to the DRUMS working group. The subscription
address is <drums-request@cs.utk.edu>.

8. Acknowledgements

[TBD]

Appendix A - Example messages

This section presents a selection of messages. These are intended to assist
in the implementation of this standard, but should not be taken as
normative; that is to say, although the examples in this section were
carefully reviewed, if there happens to be a conflict between these
examples and the syntax described in sections 3 and 4 of this document, the
syntax in those sections is to be taken as correct.

Messages are delimited in this section between lines of "----". The "----"
lines are not part of the message itself.

A.1 Addressing examples

The following are examples of messages which might be sent between two
individuals.

A.1.1 A message from one person to another with simple addressing

This could be called a canonical message. It has a single author, John Doe,
a single recipient, Mary Smith, a subject, the date, a message identifier,
and a textual message in the body.

----
From: John Doe <jdoe@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----

If John's secretary Michael actually sent the message, though John was the
author and replies to this message should go back to him, the sender field
would be used:

----
From: John Doe <jdoe@machine.tld>
Sender: Michael Jones <mjones@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----

A.1.2 Different types of mailboxes

This message includes multiple addresses in the destination fields and also
uses several different forms of addresses.

----
From: "Joe Q. Public" <john.q.public@hiccup.tld>
To: Mary Smith <mary@harry.nil>, jdoe@machine.tld, Who? <one@here.nil>
Cc: <boss@test.nil>, "System's \"Big\" Box" <sysservices@hiccup.tld>
Date: Tue, 1 Jul 2003 10:52:37 +0200
Message-ID: <5678.21-Nov-1997@hiccup.tld>

Hi everyone.
----

Note that the display names for Joe Q. Public and System's "Big" Box needed
to be enclosed in double-quotes because the former contains the period and
the latter contains both single-quote and double-quote characters (the
double-quote characters appearing as quoted-pair tokens). Conversely, the
display name for Who? could appear without them because the question mark
is legal in an atom token. Notice also that jdoe@machine.tld and
boss@test.nil have no display names associated with them at all, and
jdoe@machine.tld uses the simpler address form without the angle brackets.
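
For readers who want to experiment, the "To:" field above can be split into
display names and addresses with the getaddresses function from Python's
standard email.utils module. This is only an illustration of the address forms
discussed here; that parser is not part of this standard and may differ from
it in corner cases.

```python
from email.utils import getaddresses

# Body of the "To:" field from the example message above.
to_body = "Mary Smith <mary@harry.nil>, jdoe@machine.tld, Who? <one@here.nil>"

# Each entry is a (display name, address) pair; the display name is empty
# for the bare address that has no angle brackets.
pairs = getaddresses([to_body])
for display_name, addr_spec in pairs:
    print(repr(display_name), addr_spec)
```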

A.1.3 Group addresses

----
From: Pete <pete@silly.nil>
To: A Group:Chris Jones <c@public.tld>,joe@where.nil,John <jdoe@one.nil>;
Cc: Undisclosed recipients:;
Date: Sat, 13 Feb 1869 23:32:54 -0330
Message-ID: <testabcd.1234@silly.nil>

Testing.
----

In this message, the "To:" field has a single group recipient named A Group,
which contains 3 addresses, and the "Cc:" field has an empty group recipient
named Undisclosed recipients.
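
As a rough check of that reading, the two fields can be parsed with the
HeaderRegistry class from Python's standard email.headerregistry module. This
is a sketch for illustration; the variable names are assumptions, and the
library's parser is not a normative implementation of this syntax.

```python
from email.headerregistry import HeaderRegistry

make_header = HeaderRegistry()

to = make_header(
    "To",
    "A Group:Chris Jones <c@public.tld>,joe@where.nil,John <jdoe@one.nil>;",
)
cc = make_header("Cc", "Undisclosed recipients:;")

# Each parsed field exposes its groups; each group has a display name and a
# (possibly empty) sequence of addresses.
to_group = to.groups[0]
cc_group = cc.groups[0]
to_members = [a.addr_spec for a in to_group.addresses]
```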

A.2 Reply messages

The following is a series of three messages which make up a conversation
thread between John and Mary. John first sends a message to Mary, Mary
then replies to John's message, and then John replies to Mary's reply
message.

Note especially the "Message-ID:", "References:", and "In-Reply-To:" fields
in each message.

----
From: John Doe <jdoe@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----

When sending replies, the Subject field is often retained, though prepended
with "Re: " as described in section 3.6.5.

----
From: Mary Smith <mary@harry.nil>
To: John Doe <jdoe@machine.tld>
Reply-To: "Mary Smith's Personal Account" <smith@home.nil>
Subject: Re: Saying Hello
Date: Fri, 21 Nov 1997 10:01:10 -0600
Message-ID: <3456@harry.nil>
In-Reply-To: <1234@local.machine.tld>
References: <1234@local.machine.tld>

This is a reply to your hello.
----

Note the "Reply-To:" field in the above message. When John replies to
Mary's message above, the reply should go to the address in the "Reply-To:"
field instead of the address in the "From:" field.

----
To: "Mary Smith's Personal Account" <smith@home.nil>
From: John Doe <jdoe@machine.tld>
Subject: Re: Saying Hello
Date: Fri, 21 Nov 1997 11:00:00 -0600
Message-ID: <abcd.1234@local.machine.tld>
In-Reply-To: <3456@harry.nil>
References: <1234@local.machine.tld> <3456@harry.nil>

This is a reply to your reply.
----
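
The pattern visible in the three messages above can be sketched in Python as
follows. The function name is hypothetical, and the sketch assumes the parent
message has a "Message-ID:" field; it only mirrors what the examples show.

```python
def reply_threading_fields(parent_message_id, parent_references=""):
    """Build the In-Reply-To and References bodies for a reply.

    In-Reply-To names the parent; References is the parent's References
    followed by the parent's own Message-ID.
    """
    references = (parent_references + " " + parent_message_id).strip()
    return {"In-Reply-To": parent_message_id, "References": references}

# Mary's reply to John's original message.
first_reply = reply_threading_fields("<1234@local.machine.tld>")

# John's reply to Mary's reply.
second_reply = reply_threading_fields(
    "<3456@harry.nil>", first_reply["References"]
)
```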

A.3 Resent messages

Start with the message that has been used as an example several times:

----
From: John Doe <jdoe@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----

Say that Mary, upon receiving this message, wishes to send a copy of the
message to Jane such that (a) the message would appear to have come
straight from John; (b) if Jane replies to the message, the reply should go
back to John; and (c) all of the original information, like the date the
message was originally sent to Mary, the message identifier, and the
original addressee, is preserved. In this case, resent fields are prepended
to the message:

----
Resent-From: Mary Smith <mary@harry.nil>
Resent-To: Jane Brown <j-brown@other.tld>
Resent-Date: Mon, 24 Nov 1997 14:22:01 -0800
Resent-Message-ID: <78910@harry.nil>
From: John Doe <jdoe@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----

If Jane, in turn, wished to resend this message to another person, she
would prepend her own set of resent header fields to the above and send
that.
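
A minimal Python sketch of this prepending step is shown below. The function
and parameter names are assumptions for illustration, and the message is
handled as plain text with CRLF line endings.

```python
def resend(message_text, resent_from, resent_to, resent_date, resent_mid):
    """Prepend a block of resent fields to an existing message."""
    block = (
        f"Resent-From: {resent_from}\r\n"
        f"Resent-To: {resent_to}\r\n"
        f"Resent-Date: {resent_date}\r\n"
        f"Resent-Message-ID: {resent_mid}\r\n"
    )
    # The original header and body are kept exactly as they were.
    return block + message_text

original = (
    "From: John Doe <jdoe@machine.tld>\r\n"
    "To: Mary Smith <mary@harry.nil>\r\n"
    "Subject: Saying Hello\r\n"
    "\r\n"
    "This is a message just to say hello.\r\n"
)
resent = resend(
    original,
    "Mary Smith <mary@harry.nil>",
    "Jane Brown <j-brown@other.tld>",
    "Mon, 24 Nov 1997 14:22:01 -0800",
    "<78910@harry.nil>",
)
```

Calling the function again on its own output would add a second block above
the first, which matches the case where Jane resends the message in turn.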

A.4 Messages with trace fields

As messages are sent through the transport system as described in [SMTP],
trace fields are prepended to the message. The following is an example of
what those trace fields might look like. Note that there is some folding
whitespace in the first one since these lines can be long.

----
Received: from machine.tld
   by harry.nil
   via TCP
   with ESMTP
   id ABC12345
   for <mary@harry.nil>;  21 Nov 1997 10:05:43 -0600
Received: from john.machine.tld by machine.tld; 21 Nov 1997 10:01:22 -0600
From: John Doe <jdoe@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----
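
To see the first "Received:" field as a single logical line, the folding can
be undone by removing each line break that is immediately followed by
whitespace. The Python sketch below illustrates this; the function name is an
assumption, and it accepts bare LF as well as CRLF for convenience.

```python
import re

def unfold(header_text):
    """Remove line breaks that are immediately followed by a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", header_text)

folded = (
    "Received: from machine.tld\r\n"
    "   by harry.nil\r\n"
    "   via TCP\r\n"
)
unfolded = unfold(folded)
```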

A.5 Whitespace and comments

Whitespace, including folding whitespace, and comments can be inserted
between many of the tokens of fields. Taking the example from A.1.3,
whitespace and comments can be inserted into all of the fields.

----
From: Pete(A wonderful \) chap) <pete(his account)@silly.nil(his host)>
To:A Group(Some people)
     :Chris Jones <c@(Chris's host.)public.tld>,
         joe@where.nil,
  John <jdoe@one.nil> (my dear friend); (the end of the group)
Cc:(An empty list)(start here)Undisclosed recipients  :(nobody(that I know))  ;
Date: Sat,
      13
        Feb
          1869
      23:32:54
               -0330 (Newfoundland Time)
Message-ID:              <testabcd.1234@silly.nil>

Testing.
----

The above example is aesthetically displeasing, but perfectly legal. Note
particularly (1) the comments in the "From:" field (including one that has
a ")" character appearing as part of a quoted-pair); (2) the whitespace
absent after the ":" in the "To:" field as well as the comment and folding
whitespace after the group name, the special characters ("." and "'") in
the comment in Chris Jones's address, and the folding whitespace before and
after "joe@where.nil,"; (3) the multiple and nested comments in the "Cc:"
field as well as the comment immediately following the ":" after "Cc"; (4)
the folding whitespace (but no comments except at the end) in the date
field; and (5) the whitespace before (but not within) the identifier in the
"Message-ID:" field.
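
The comment handling described in points (1) and (3) can be illustrated with a
small Python sketch that removes comments from a field body. It is a
simplified scanner written for this example, not a full parser: the function
name is hypothetical, and it tracks only nesting depth, quoted-pairs, and
quoted strings.

```python
def strip_comments(value):
    """Remove (possibly nested) comments from a structured field body."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string, parentheses are literal
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # quoted-pair: the next character never opens or closes anything
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)

from_body = strip_comments(
    "Pete(A wonderful \\) chap) <pete(his account)@silly.nil(his host)>"
)
cc_body = strip_comments(
    "(An empty list)(start here)Undisclosed recipients  :(nobody(that I know))  ;"
)
```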

A.6 Obsoleted forms

The following are examples of obsolete (that is, the "MUST NOT generate")
syntactic elements described in section 4 of this document.

A.6.1 Obsolete addressing

Note in the example below the lack of quotes around Joe Q. Public, the
route that appears in the address for Mary Smith, the two commas that
appear in the "To:" field, and the spaces that appear around the "." in the
jdoe address.

----
From: Joe Q. Public <john.q.public@hiccup.tld>
To: Mary Smith <@machine.tld:mary@harry.nil>, , jdoe@machine   . tld
Date: Tue, 1 Jul 2003 10:52:37 +0200
Message-ID: <5678.21-Nov-1997@hiccup.tld>

Hi everyone.
----

A.6.2 Obsolete dates

The following message uses an obsolete date format, including a non-numeric
time zone and a two digit year. Note that although the day-of-week is
missing, that is not specific to the obsolete syntax; it is optional in the
current syntax as well.

----
From: John Doe <jdoe@machine.tld>
To: Mary Smith <mary@harry.nil>
Subject: Saying Hello
Date: 21 Nov 97 09:55:06 GMT
Message-ID: <1234@local.machine.tld>

This is a message just to say hello.
So, "Hello".
----
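
As an illustration of interpreting such a date, Python's standard
email.utils.parsedate_to_datetime function accepts this obsolete form. The
sketch below is not normative; how a two digit year is expanded is that
library's choice, and the variable names are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Date field body from the example above: no day-of-week, a two digit year,
# and a non-numeric time zone.
parsed = parsedate_to_datetime("21 Nov 97 09:55:06 GMT")

# The same instant written out explicitly, for comparison.
expected = datetime(1997, 11, 21, 9, 55, 6, tzinfo=timezone.utc)
same_instant = parsed == expected
```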

A.6.3 Obsolete whitespace and comments

Whitespace and comments can appear between many more elements than in the
current syntax. Also, folding lines which are made up entirely of
whitespace are legal.

----
From  : John Doe <jdoe@machine(comment).   tld>
To    : Mary Smith

          <mary@harry.nil>
Subject     : Saying Hello
Date  : Fri, 21 Nov 1997 09(comment):   55  :  06 -0600
Message-ID  : <1234   @   local(blah)  .machine .tld>

This is a message just to say hello.
So, "Hello".
----

Note especially the second line of the "To:" field. It starts with two
space characters. Therefore, it is considered part of the folding as
described in section 4.2. Also, the comments and whitespace throughout
addresses, dates, and message identifiers are all part of the obsolete
syntax.

Appendix B - Differences from earlier standards

[Editor's Note: This will be real eventually; for now it lists only the
changes in this draft.

1. CFWS around identifier
2. Define no-fold-literal
3. "too" -> "to" at end of 4.3
4. "identifier" -> "msg-id"
5. CFWS -> FWS in date; obs- date stuff gets CFWS
6. CFWS between : and ; in group
7. Reply-To gets address-list
8. Note about WS and comments in 3.2
9. Any number of Comments: and Keywords:
10. Added "automatic" to Resent-* MUST NOT
11. Resent and trace fields don't require a new message-id.
12. Examples, examples, examples.

To do list:

Specifically talk about X-* headers.
Change reply-to yet again?
Bibliography
Acknowledgements
Differences

]

