draft-ietf-usefor-article-02.txt   draft-ietf-usefor-article-03.txt 
INTERNET-DRAFT Charles H. Lindsey
Usenet Format Working Group University of Manchester
February 2000
News Article Format News Article Format
draft-ietf-usefor-article-02 <draft-ietf-usefor-article-03.txt>
USEFOR Working Group
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance This document is an Internet-Draft and is in full conformance with
with all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
This document is an Internet-Draft. Internet-Drafts are working Internet-Drafts are working documents of the Internet Engineering
documents of the Internet Engineering Task Force (IETF), its Task Force (IETF), its areas, and its working groups. Note that
areas, and its working groups. Note that other groups may also other groups may also distribute working documents as Internet-
distribute working documents as Internet-Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet- documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as Drafts as reference material or to cite them other than as "work
"work in progress." in progress."
To view the entire list of current Internet-Drafts, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
(Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
(Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
(US West Coast).
It is hoped that this document will obsolete RFC 1036 and will
become an Internet standard.
This document is a successor to Henry Spencer's "Son of 1036" The list of current Internet-Drafts can be accessed at
Draft, and has been referred to as "Grandson of 1036". http://www.ietf.org/ietf/1id-abstracts.txt.
Distribution of this memo is unlimited. The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract Abstract
This Draft defines the format of network news articles, and This Draft defines the format of Netnews articles and specifies
defines roles and responsibilities for humans and software. the requirements to be met by software which originates,
distributes, stores and displays them. It is intended as a
Network news articles resemble mail messages but are broadcast standards track document, superseding RFC 1036, which itself dates
to potentially large audiences, using a flooding algorithm from 1987.
that propagates one copy to each interested host (or group
thereof), typically stores only one copy per host, and does
not require any central administration or systematic
registration of interested users. Network news originated
as the medium of communication for Usenet, circa 1980.
The term "Usenet" refers to the protocols established in RFC Since the 1980s, Usenet has grown explosively, and many Internet and
1036 and successors; the software implementing those protocols; non-Internet sites now participate. In addition, this technology is
the network of hosts exchanging traffic using that software; now in widespread use for other purposes.
and also the traffic itself. Cooperating subnets are possible;
these are groups of hosts which agree to hold each other and
themselves to an internally adopted set of standards concerning
protocol details or implementations. When a cooperating subnet
does not exchange traffic with general Usenet hosts, then it
is no longer a part of Usenet, but a separate entity.
Since then Usenet has grown explosively, and most Internet Backward compatibility has been a major goal of this endeavour, but
sites participate in it. In addition, the news technology where this standard and earlier documents or practices conflict, this
is now in widespread use for other purposes, on the Internet standard should be followed. In most such cases, current practice is
and elsewhere. already compatible with these changes.
This document is intended to provide a definitive guide to the [The use of the words "this standard" within this document when
article format and interpretations thereof. Backward referring to itself does not imply that this draft yet has pretensions to
compatibility is a major goal, but where this document and be a standard, but rather indicates what will become the case if and
earlier documents or practices collide, this document should be when it is accepted as an RFC with the status of a proposed or draft
used. standard.]
News Article Format February 2000
Table of Contents [Remarks enclosed in square brackets and aligned with the left margin,
such as this one, are not part of this draft, but are editorial notes to
explain matters amongst ourselves, or to point out alternatives, or to
indicate work yet to be done.]
1. Introduction [Please note that this Draft describes "Work in Progress". Much remains
to be done, though the material included so far is unlikely to change in
any major way.]
1.1 Scope and Objectives Table of Contents
"Netnews" is a set of protocols that enables news "articles" 1. Introduction .................................................. 5
(which resemble mail messages) to be broadcast to 1.1. Basic Concepts ............................................ 5
potentially-large audiences, using a flooding algorithm which 1.2. Objectives ................................................ 6
propagates copies throughout a network of participating hosts, 1.3. Historical Outline ........................................ 6
typically storing only one copy per host and making it 1.4. Transport ................................................. 6
available on demand to readers able to access that host. 2. Definitions, Notations and Conventions ........................ 7
Articles are grouped, for convenience of access, into 2.1. Definitions. ............................................. 7
"newsgroups", and the newsgroups themselves are arranged into 2.2. Textual Notations ......................................... 8
"hierarchies". An important characteristic of Netnews is the 2.3. Relation To Mail and MIME ................................. 9
lack of any requirement for a central administration or for 2.4. Syntax Notation ........................................... 10
the establishment of any controlling host to manage the 2.5. Language .................................................. 12
network. A network which limits participation to some 3. Changes to the existing protocols ............................. 13
restricted set of hosts (within some company, for example) is 3.1. Principal Changes ......................................... 13
a "closed" network; otherwise it is an "open" network. A set 3.2. Transitional Arrangements ................................. 13
of hosts within a network which, by mutual arrangement, 4. Basic Format .................................................. 15
operates some variant (whether more or less restrictive) of 4.1. Syntax of News Articles ................................... 15
the Netnews protocols is a "cooperating subnet". 4.2. Headers ................................................... 16
4.2.1. Names and Contents .................................... 16
4.2.2. Header Properties ..................................... 17
4.2.2.1. Experimental Headers .............................. 17
4.2.2.2. Inheritable Headers ............................... 18
4.2.2.3. Local Headers ..................................... 18
4.2.2.4. Variant Headers ................................... 18
4.2.3. White Space and Continuations ......................... 18
4.2.4. Comments .............................................. 19
4.2.5. Undesirable Headers ................................... 20
4.3. Body ...................................................... 20
4.3.1. Body Format Issues .................................... 20
4.3.2. Body Conventions ...................................... 21
4.4. Characters and Character Sets ............................. 23
4.4.1. Character Sets within Article Headers ................. 23
4.4.2. Character Sets within Article Bodies .................. 24
4.5. Size Limits ............................................... 24
4.6. Example ................................................... 25
5. Mandatory Headers ............................................. 26
5.1. Date ...................................................... 26
5.1.1. Examples .............................................. 27
5.2. From ...................................................... 27
5.2.1. Examples: ............................................ 27
"Usenet" is a particular worldwide open network based upon the 5.3. Message-ID ................................................ 27
Netnews protocols. Anybody can join (it is simply necessary to 5.4. Subject ................................................... 28
negotiate an exchange of articles with one or more other 5.4.1. Examples .............................................. 29
participating hosts). Usenet "belongs" to those who administer 5.5. Newsgroups ................................................ 29
the hosts of which it is comprised. There is no Cabal with 5.5.1. Forbidden newsgroup names ............................. 31
overall authority to direct what is to be allowed. 5.6. Path ...................................................... 32
Nevertheless, there do exist agencies within Usenet that have 5.6.1. Format ................................................ 32
authority to establish policies and to perform administrative 5.6.2. Adding a path-identity to the Path header ............. 32
functions, but such authority derives solely from the consent 5.6.3. The tail-entry ........................................ 34
of those sites which choose to recognise it (and who can 5.6.4. Delimiter Summary ..................................... 34
decline to exchange articles with sites which choose not to 5.6.5. Suggested Verification Methods ........................ 35
recognise it). Usually, the authority of such an agency is 5.6.6. Example ............................................... 36
restricted to a particular hierarchy, or group of hierarchies. 6. Optional Headers .............................................. 37
6.1. Reply-To .................................................. 37
6.1.1. Examples .............................................. 37
6.2. Sender .................................................... 38
6.3. Organization .............................................. 38
6.4. Keywords .................................................. 38
6.5. Summary ................................................... 38
6.6. Distribution .............................................. 38
6.7. Followup-To ............................................... 40
6.8. References ................................................ 40
6.8.1. Examples .............................................. 41
6.9. Expires ................................................... 41
6.10. Archive .................................................. 41
6.11. Control .................................................. 41
6.12. Approved ................................................. 42
6.13. Replaces / Supersedes .................................... 42
6.13.1. Syntax and Semantics ................................. 43
6.13.2. Message-ID version procedure ......................... 44
6.13.2.1. Message version numbers .......................... 44
6.13.2.2. Implementation and Use Note ...................... 46
6.13.2.3. The Message-Version NNTP extension ............... 47
6.13.2.4. Examples ......................................... 48
6.14. Xref ..................................................... 49
6.15. Lines .................................................... 50
6.16. User-Agent ............................................... 50
6.16.1. Examples ............................................. 51
6.17. MIME headers ............................................. 51
6.17.1. Syntax ............................................... 51
6.17.2. Content-Transfer-Encoding ............................ 52
6.17.3. Content-Type ......................................... 52
6.17.3.1. Message/partial .................................. 53
6.17.3.2. Message/rfc822 ................................... 53
6.17.3.3. Message/external-body ............................ 54
6.17.3.4. Multipart types .................................. 54
6.17.4. Character Sets ....................................... 54
6.17.5. Content Disposition .................................. 55
6.17.6. Definition of some new Content-Types ................. 55
6.17.6.1. Application/news-transmission .................... 55
6.17.6.2. Message/news withdrawn ........................... 56
6.18. Obsolete Headers ......................................... 56
7. Control Messages .............................................. 57
7.1. The 'newgroup' Control Message ............................ 57
A "policy" is a rule intended to facilitate the smooth 7.1.1. The Body of the 'newgroup' Control Message ............ 58
operation of a network by establishing parameters which 7.1.2. Application/news-groupinfo ............................ 58
restrict behaviour that, whilst technically unexceptionable, 7.1.3. Initial Articles ...................................... 60
would nevertheless contravene some accepted standard of "Good 7.1.4. Example ............................................... 61
Netkeeping". Since the ultimate beneficiaries of a network are 7.2. The 'rmgroup' Control Message ............................. 62
its human readers, who will be less tolerant of poorly 7.2.1. Example ............................................... 62
designed interfaces than mere computers, articles in breach of 7.3. The 'mvgroup' Control Message ............................. 62
established policy can cause considerable annoyance to their 7.3.1. Single group .......................................... 62
recipients. 7.3.2. Multiple Groups ....................................... 63
7.3.3. Examples .............................................. 64
7.4. The 'checkgroups' Control Message ......................... 65
7.4.1. Application/news-checkgroups .......................... 66
7.5. Cancel .................................................... 66
7.6. Ihave, sendme ............................................. 68
7.7. Obsolete control messages. ............................... 69
8. Duties of Various Agents ...................................... 69
8.1. General principles to be followed ......................... 69
8.2. Duties of an Injecting Agent .............................. 70
8.2.1. Proto-articles ........................................ 70
8.2.2. Procedure to be followed by Injecting Agents .......... 70
8.3. Duties of a Relaying Agent ................................ 72
8.4. Duties of a Serving Agent ................................. 73
8.5. Duties of a Posting Agent ................................. 73
8.6. Duties of a Followup Agent ................................ 74
8.7. Duties of a Gateway ....................................... 74
9. Security Considerations ....................................... 74
9.1. Attacks ................................................... 75
10. References ................................................... 75
11. Acknowledgements ............................................. 77
12. Contact Addresses ............................................ 77
13. Intellectual Property Rights ................................. 78
Appendix A.1 - A-News Article Format .............................. 79
Appendix A.2 - Early B-News Article Format ........................ 79
Appendix B - Collected Syntax ..................................... 79
Policies may well vary from network to network, from hierarchy 1. Introduction
to hierarchy within one network, and even between individual
newsgroups within one hierarchy. It is assumed, for the
purposes of this document, that agencies with the proper
authority to establish such policies will exist. However, for
the benefit of networks and hierarchies without such agencies,
and to provide a basis upon which such agencies can build,
this present document often provides default policy
parameters, usually introducing them by a phrase such as "As a
matter of policy ...".
[If we follow this route, then that phrase (or one like it, perhaps
using the word "default") can be introduced at various places in the
existing text, for example when discussing the lengths of lines in
articles, when discussing the lengths of components of newsgroup names,
and when discussing Mime Content-Types, and also in connection with the
Checkpolicies header, if we decide to have it.]
The purpose of this present document is to define the 1.1. Basic Concepts
protocols to be used for Netnews in general, and for Usenet in
particular, and to set standards to be followed by software
that implements those protocols.
It is NOT the purpose of this document to define how the "Netnews" is a set of protocols for generating, storing and
authority of various agencies to exercise control or oversight retrieving news "articles" (which resemble mail messages) and for
of the various parts of Usenet is established (that is itself exchanging them amongst a readership which is potentially widely
a matter of policy). Nevertheless, it is assumed that such distributed. It is organized around "newsgroups," with the
authorities will exist, and tools are provided within the expectation that each reader will be able to see all articles posted
protocols for their use. to each newsgroup in which he participates. These protocols most
commonly use a flooding algorithm which propagates copies throughout
a network of participating servers. Typically, only one copy is
stored per server, and each server makes it available on demand to
readers able to access that server.
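The flooding behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the function name flood, the links mapping and the server names are hypothetical and do not come from this draft, and real transports (NNTP, UUCP, etc.) negotiate offers differently. Each server offers the article to all of its peers, and a peer that already holds a copy does not store it again, so only one copy ends up on each server.

```python
from collections import deque


def flood(links, origin, message_id):
    """Propagate one article from `origin` to every reachable server.

    `links` (hypothetical) maps a server name to the peers it exchanges
    articles with. Each server keeps at most one copy of the article.
    Returns the per-server stores and the number of offers made.
    """
    stored = {server: set() for server in links}  # one store per server
    stored[origin].add(message_id)
    queue = deque([origin])
    offers = 0
    while queue:
        server = queue.popleft()
        for peer in links[server]:
            offers += 1
            if message_id in stored[peer]:
                continue  # peer already has a copy; nothing more to do
            stored[peer].add(message_id)
            queue.append(peer)  # the peer will offer it onward in turn
    return stored, offers


# A small, fully hypothetical network of four servers.
links = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
stored, offers = flood(links, "a", "<1@a.example>")
```

Note that no server acts as a controller in this sketch: propagation depends only on pairwise exchange arrangements between neighbouring servers.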
1.2 Historical Outline An important characteristic of Netnews is the lack of any requirement
for a central administration or for the establishment of any
controlling host to manage the network. A network which limits
participation to some restricted set of hosts (within some company,
for example) is a "closed" network; otherwise it is an "open"
network. A set of hosts within a network which, by mutual
arrangement, operates some variant (whether more or less restrictive)
of the Netnews protocols is a "cooperating subnet".
Network news originated as the medium of communication for "Usenet" is a particular worldwide open network based upon the
Usenet, circa 1980. Since then Usenet has grown explosively, Netnews protocols, with the newsgroups being organised into
and many Internet sites participate in it. In addition, the recognized "hierarchies". Anybody can join (it is simply necessary
news technology is now in widespread use for other purposes, to negotiate an exchange of articles with one or more other
on the Internet and elsewhere. participating hosts). Usenet "belongs" to those who administer the
hosts of which it is comprised. There is no Cabal with overall
authority to direct what is to be allowed. Nevertheless, there do
exist agencies within Usenet that have authority to establish
policies and to perform administrative functions, but such authority
derives solely from the consent of those sites which choose to
recognise it (and who can decline to exchange articles with sites
which choose not to recognise it). Usually, the authority of such an
agency is restricted to a particular hierarchy, or group of
hierarchies.
The earliest news interchange used the so-called "A News" A "policy" is a rule intended to facilitate the smooth operation of a
article format. Shortly thereafter, an article format vaguely network by establishing parameters which restrict behaviour that,
resembling Internet mail was devised and used briefly. Both whilst technically unexceptionable, would nevertheless contravene
of those formats are completely obsolete; they are documented some accepted standard of "Good Netkeeping". Since the ultimate
in appendix A for historical reasons only. With publication beneficiaries of a network are its human readers, who will be less
of [RFC-850] in 1983, news articles came to closely resemble tolerant of poorly designed interfaces than mere computers, articles
Internet mail messages, with some restrictions and some in breach of established policy can cause considerable annoyance to
additional headers. [RFC-1036] in 1987 updated [RFC-850] their recipients.
without making major changes.
A Draft popularly referred to as "Son of 1036" [RFC-1036BIS] Policies may well vary from network to network, from hierarchy to
was written in 1994 by Henry Spencer. That document formed the hierarchy within one network, and even between individual newsgroups
original basis for this document. Much is taken directly from within one hierarchy. It is assumed, for the purposes of this
Son of 1036, and it is hoped that we have followed its spirit standard, that agencies with varying degrees of authority to
and intentions. establish such policies will exist, and that where they do not,
policy will be established by mutual agreement. For the benefit of
1.3 Transport networks and hierarchies without such established agencies, and to
provide a basis upon which all agencies can build, this present
standard often provides default policy parameters, usually
introducing them by a phrase such as "As a matter of policy ...".
As in this document's predecessors, the exact means used to 1.2. Objectives
transmit articles from one host to another is not specified.
NNTP [RFC-977] is the most common transmission method on the
Internet, but much transmission takes place entirely
independent of the Internet. Other methods in use include the
UUCP protocol [RFC-976] extensively used in the early days of
Usenet, FTP, downloading via satellite, tape archives, and
physically delivered magnetic and optical media.
2. Definitions, Notations and Conventions The purpose of this present standard is to define the protocols to be
used for Netnews in general, and for Usenet in particular, and to set
standards to be followed by software that implements those protocols.
2.1 Definitions. It is NOT the purpose of this standard to define how the authority of
various agencies to exercise control or oversight of the various
parts of Usenet is established (that is itself a matter of policy).
Nevertheless, it is assumed that such authorities will exist, and
tools are provided within the protocols for their use.
An "article" is the unit of news, analogous to a [MAIL] 1.3. Historical Outline
"message".
A "poster" is the person or software that composes and submits Network news originated as the medium of communication for Usenet,
a possibly compliant article to an injecting agent. The poster circa 1980. Since then, Usenet has grown explosively, and many
is analogous to [MAIL]'s author(s). Internet and non-Internet sites participate in it. In addition, the
news technology is now in widespread use for other purposes, on the
Internet and elsewhere.
A "posting agent" is software that assists posters to prepare The earliest news interchange used the so-called "A News" article
articles, including adding required headers and determining format. Shortly thereafter, an article format vaguely resembling
whether the final article is compliant to this standard. If Internet mail was devised and used briefly. Both of those formats
the article is compliant it passes the article on to an are completely obsolete; they are documented in A.1 for historical
injecting agent for final checking and injection into the news reasons only. With publication of [RFC 850] in 1983, news articles
stream. If the article is not compliant or rejected by the came to closely resemble Internet mail messages, with some
injecting agent then the posting agent informs the poster with restrictions and some additional headers. [RFC 1036] in 1987 updated
an explanation of the error. [RFC 850] without making major changes.
[There should also be some mention of B News and its Appendix.
Alternatively, these appendices may go into some separate informational
RFC.]
An "injecting agent" takes the finished article from the A Draft popularly referred to as "Son of 1036" [Son-of-1036] was
posting agent (often via the NNTP "post" command), performs written in 1994 by Henry Spencer. That document formed the original
some final checks and passes it on to a relaying agent for basis for this standard. Much is taken directly from Son of 1036, and
general distribution. it is hoped that we have followed its spirit and intentions.
A "relaying agent" is software which receives allegedly 1.4. Transport
compliant articles from injecting agents and/or other
relaying agents, and possibly passes copies on to other
relaying agents and serving agents.
A "serving agent" takes an article from a relaying agent and As in this standard's predecessors, the exact means used to transmit
files it in a "news database" . It also provides an interface articles from one host to another is not specified. NNTP [NNTP] is
for reading agents to access the news database. the most common transmission method on the Internet, but much
transmission takes place entirely independent of the Internet. Other
methods in use include the UUCP protocol [RFC 976] extensively used
in the early days of Usenet, FTP, downloading via satellite, tape
archives, and physically delivered magnetic and optical media.
A "reader" is the person or software reading news articles.
A "reading agent" is software which presents articles to a 2. Definitions, Notations and Conventions
reader.
A "newsgroup" is a single news forum, a logical bulletin 2.1. Definitions.
board, having a name and nominally intended for articles on a
specific topic. An article is "posted to" a single newsgroup
or several newsgroups. When an article is posted to more than
one newsgroup, it is said to be "crossposted"; note that
this differs from posting the same text as part of each of
several articles, one per newsgroup. A "hierarchy" is the
set of all newsgroups whose names share a first component.
A newsgroup may be "moderated", in which case submissions An "article" is the unit of news, analogous to a [MESSFOR] "message".
are not posted directly, but mailed to a "moderator" for A "proto-article" is one that has not yet been injected into the news
consideration and possible posting. Moderators are typically system.
human but may be implemented partially or entirely in
software.
A "followup" is an article containing a response to the A "message identifier" is a unique identifier for an article,
contents of an earlier article (the followup's "precursor"). usually supplied by the "posting agent" which posted it or, failing
that, by the "injecting agent". It distinguishes the article from
every other article ever posted anywhere. Articles with the same
message identifier are treated as if they are the same article
regardless of any differences in the body or headers.
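The rule that articles with the same message identifier are treated as the same article might be sketched as below. The class name NewsStore and its offer method are hypothetical illustrations, not part of this draft. An article is accepted only if its message identifier has not been seen before, whatever its headers or body contain.

```python
class NewsStore:
    """Hypothetical article store keyed solely by message identifier."""

    def __init__(self):
        self._articles = {}

    def offer(self, message_id, headers, body):
        """Store the article unless its message identifier is already known.

        Returns True if the article was stored, False if it was treated as
        a copy of an article already held (even if the content differs).
        """
        if message_id in self._articles:
            return False
        self._articles[message_id] = (headers, body)
        return True

    def body(self, message_id):
        """Return the body of the stored article."""
        return self._articles[message_id][1]


store = NewsStore()
first = store.offer("<42@host.example>", {"Subject": "Hello"}, "original text")
second = store.offer("<42@host.example>", {"Subject": "Hello"}, "altered text")
```

In this sketch the second offer is ignored, so the store still holds the original body. This is why the uniqueness of message identifiers matters.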
A "followup agent" is a combination of reading agent and A "newsgroup" is a single news forum, a logical bulletin board,
posting agent that aids in the preparation and posting of a having a name and nominally intended for articles on a specific
followup. topic. An article is "posted to" a single newsgroup or several
newsgroups. When an article is posted to more than one newsgroup, it
is said to be "crossposted"; note that this differs from posting the
same text as part of each of several articles, one per newsgroup.
A "reply agent" is a combination of reading agent and mailer A newsgroup may be "moderated", in which case submissions are not
that aids in the preparation and posting of an email response posted directly, but mailed to a "moderator" for consideration and
to an article. possible posting. Moderators are typically human but may be
implemented partially or entirely in software.
A "message ID" is a unique identifier for an article, usually A "hierarchy" is the set of all newsgroups whose names share a first
supplied by the posting agent which posted it. It component (as defined in 5.5). The term "sub-hierarchy" is also used
distinguishes the article from every other article ever posted where several initial components are shared.
anywhere. Articles with the same message ID are treated as
identical copies of the same article even if they are not in
fact identical.
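The definition of a "hierarchy" as the set of newsgroups sharing a first component can be illustrated as follows. The sketch assumes that components of a newsgroup name are separated by "." (the precise syntax is the subject of section 5.5, which is not reproduced here), and the function names are hypothetical.

```python
from collections import defaultdict


def hierarchies(newsgroups):
    """Group newsgroup names by their first component.

    Assumes components are separated by "." (an assumption for this sketch).
    """
    groups = defaultdict(set)
    for name in newsgroups:
        first = name.split(".", 1)[0]
        groups[first].add(name)
    return dict(groups)


def in_sub_hierarchy(name, prefix):
    """True if `name` shares all the initial components of `prefix`."""
    wanted = prefix.split(".")
    return name.split(".")[:len(wanted)] == wanted


names = ["comp.lang.c", "comp.os.linux", "rec.arts.books"]
by_hierarchy = hierarchies(names)
```

Comparing whole components rather than raw string prefixes avoids treating, say, a group under "comp.language" as part of the "comp.lang" sub-hierarchy.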
A "gateway" is software which receives news articles and A "poster" is the person or software that composes and submits a
converts them to messages of some other kind (e.g. mail to a possibly compliant article to a "posting agent". The poster is
mailing list), or vice versa; in essence it is a translating analogous to [MESSFOR]'s author(s).
relaying agent that straddles boundaries between different
methods of message exchange. The most common type of gateway
connects newsgroup(s) to mailing list(s), either
unidirectionally or bidirectionally, but there are also
gateways between news networks using this document's news
format and those using other formats.
A "control message" is an article which is marked as A "posting agent" is the software that assists posters to prepare
containing control information; a relaying or serving agent proto-articles, in compliance with this standard. The proto-article
receiving such an article may (subject to permissions etc.) is then passed on to an "injecting agent" for final checking and
take actions beyond just filing and passing on the article. injection into the news stream. If the article is not compliant, or
is rejected by the injecting agent, then the posting agent informs
the poster with an explanation of the error.
An article's "reply address" is the address to which mailed A "reader" is the person or software reading news articles.
replies should be sent. This is the address specified in the
article's From header (see section 5.2), unless it also has a
Reply-To header (see section 6.3).
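The choice of reply address can be sketched as below. This is a minimal illustration that assumes the headers have already been parsed into a dictionary keyed by canonical header name. The function name and the sample addresses are hypothetical.

```python
def reply_address(headers):
    """Return the address to which mailed replies should be sent.

    The Reply-To header takes precedence when present; otherwise the
    address in the From header is used.
    """
    reply_to = headers.get("Reply-To")
    if reply_to:
        return reply_to
    return headers["From"]


plain = {"From": "poster@site.example"}
redirected = {"From": "poster@site.example", "Reply-To": "replies@site.example"}
```

A reply agent would apply this rule before handing the response to the mailer.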
2.2 Textual Notations A "reading agent" is software which presents articles to a reader.
Throughout this document, [MAIL] is short for "the current A "followup" is an article containing a response to the contents of
RFCs governing electronic mail formats, beginning with the an earlier article (the followup's "precursor").
historical [RFC-822] and continuing to its modern successors".
"ASCII" is short for "the ANSI X3.4 character set" [ANSI-
X3.4]. While "ASCII" is often misused to refer to various
character sets somewhat similar to X3.4, in this document
"ASCII" means X3.4 and only X3.4. ASCII is a 7 bit character
set. Please note that this document requires that all agents
be 8 bit clean; that is, they must accept and transmit data
without changing or omitting the 8th bit.
Certain words used to define the significance of individual A "followup agent" is a combination of reading agent and posting
requirements are capitalized. "MUST", "SHOULD", "MAY" and the agent that aids in the preparation and posting of a followup.
same words followed by "NOT" should be read as having the same
meaning as in [RFC-2119]. In particular, to be fully compliant
with this document, software must satisfy every relevant
"MUST" requirement. Software that satisfies every relevant
"SHOULD" requirement but not every "MUST" requirement is
partially compliant.
[However, we could step back from this by requiring less rigour in
observing "SHOULD" in the case of "matters of policy". Or perhaps we
could introduce an "OUGHT" category.]
This document contains explanatory notes using the following An article's "reply address" is the address to which mailed replies
format. These may be skipped by persons interested solely in should be sent. This is the address specified in the article's From
the content of the specification. The purpose of the notes is header (5.2), unless it also has a Reply-To header (6.1).
to explain why choices were made, to place them in context, or
to suggest possible implementation techniques.
NOTE: While such explanatory notes may seem superfluous in A "reply agent" is a combination of reading agent and mailer that
principle, they often help the less-than-omniscient reader aids in the preparation and posting of an email response to an
grasp the purpose of the specification and the constraints article.
involved. Given the limitations of natural language for
descriptive purposes, this improves the probability that
implementors and users will understand the true intent of the
specification in cases where the wording is not entirely
clear.
[Remarks enclosed in square brackets, such as this one, are not part of A "sender" is the person or software (usually, but not always, the
this document, but are editorial notes to explain matters amongst same as the poster) responsible for the operation of the posting
ourselves, or to point out alternatives, or to indicate work yet to be agent or, which amounts to the same thing, for passing the article to
done.] the injecting agent. The sender is analogous to [MESSFOR]'s sender.
All numeric values are given in decimal unless otherwise An "injecting agent" takes the finished article from the posting
indicated. Octets are assumed to be unsigned values for this agent (often via the NNTP "post" command), performs some final checks
purpose. and passes it on to a relaying agent for general distribution.
Throughout this document we will give examples of various A "relaying agent" is software which receives allegedly compliant
definitions, headers and other specifications. It MUST be articles from injecting agents and/or other relaying agents, and
remembered that these samples are for the aid of the reader possibly passes copies on to other relaying agents and serving
only and do NOT define any specification themselves. In order agents.
to prevent possible conflict with "Real World" entities and
people the top level domain of ".example" is used in all
sample domains and addresses. The hierarchy of example.* is
also used as a sample hierarchy. Information on the ".example"
top level domain is in [TEST-TLDS].
2.3 Relation To Mail and MIME A "news database" is the set of articles and related structural
information stored by a serving agent and made available for access
by reading agents.
The primary intent of this document is to describe the news A "serving agent" receives an article from a relaying agent and files
article format. Insofar as news articles are a subset of it in a news database. It also provides an interface for reading
[MAIL]'s message format augmented by some new headers, this agents to access the news database.
document incorporates many (though not all) of the provisions
of [MESSFOR], with the aim of enabling news articles to pass
through mail systems and vice versa, provided only that they
contain the minimum headers required for the mode of transport
being used. Unfortunately, the match is not perfect, but it is
the intention of this document that gateways between [MAIL]
and news should be able to operate with the minimum of
tinkering.
[This document has been designed to fit on top of the drafts currently
in preparation for Mail [MESSFOR]. It is expected that those drafts will
have progressed to the RFC stage by the time the present document is
complete, at which time all references to [MESSFOR] in the present text
will be replaced by references to that RFC.]
Likewise, this document incorporates many (though not all) of A "control message" is an article which is marked as containing
the provisions of the MIME standards [RFC-2045 et seq] which, control information; a relaying or serving agent receiving such an
though designed with [MAIL] in mind, are mostly applicable to article may (subject to the policies observed at that site) take
news. actions beyond just filing and passing on the article.
2.4 Syntax Notation A "gateway" is software which receives news articles and converts
them to messages of some other kind (e.g. mail to a mailing list), or
vice versa; in essence it is a translating relaying agent that
straddles boundaries between different methods of message exchange.
The most common type of gateway connects newsgroup(s) to mailing
list(s), either unidirectionally or bidirectionally, but there are
also gateways between news networks using this standard's news format
and those using other formats.
This document uses the Augmented Backus Naur Form described in 2.2. Textual Notations
[RFC-2234]. A discussion of this is outside the bounds of
this document, but it is expected that implementors will be
able to quickly understand it with reference to the defining
document.
Much of the syntax in this document is incorporated directly This standard contains explanatory NOTEs using the following format.
from that given in [MESSFOR] or in the MIME specifications These may be skipped by persons interested solely in the content of
[RFC-2045 et seq], but with appropriate modifications to the specification. The purpose of the notes is to explain why
permit the use of full 8bit characters, and to remove those choices were made, to place them in context, or to suggest possible
parts of the syntax given in [MESSFOR] that are regarded as implementation techniques.
"obsolete". Full details of this are explained in section 4.1.
[Alternatively, we could move some parts of 4.1 forward to here.]
NOTE: News parsers historically have been much less
permissive than [MAIL] parsers, and this is reflected in the
modifications referred to, and in some further specific rules.
NOTE: Following [RFC-2234], literal text included in the NOTE: While such explanatory notes may seem superfluous in
syntax is to be regarded as case-insensitive. However, in principle, they often help the less-than-omniscient reader grasp
contradistinction to [MAIL], the NetNews protocols are the purpose of the specification and the constraints involved.
sensitive to case in some instances (as in newsgroup names, Given the limitations of natural language for descriptive
some header parameters, etc.). Care has been taken to indicate purposes, this improves the probability that implementors and
this explicitly where required. users will understand the true intent of the specification in
cases where the wording is not entirely clear.
2.5 Language "ASCII" is short for "the ANSI X3.4 character set" [ANSI X3.4].
While "ASCII" is often misused to refer to various character sets
somewhat similar to X3.4, in this standard "ASCII" means X3.4 and
only X3.4. ASCII is a 7 bit character set. Please note that this
standard requires that all agents be 8 bit clean; that is, they must
accept and transmit data without changing or omitting the 8th bit.
Various constant strings in this document, such as header names Certain words, when capitalized, are used to define the significance
and month names, are derived from English words. Despite of individual requirements. The key words "MUST", "SHOULD", "MAY" and
their derivation, these words do NOT change when the poster the same words followed by "NOT" are to be interpreted as described
or reader employing them is interacting in a language other in [RFC 2119].
than English. Posting and reading agents MAY translate
as appropriate in their interaction with the poster or
reader, but the forms that actually appear in articles
MUST be the English-derived ones defined in this document.
3. Changes to the existing protocols NOTE: The use of "MUST" always implies a requirement that would
lead to interoperability problems if not followed, but the word
"SHOULD", especially when it is applied to actions of posting
and similar agents which the individual poster may easily
override, is often used where a violation would do no more than
breach established policy, or accepted standards of "Good
Netkeeping". Moreover, even a "MUST" requirement imposed on a
relaying or serving agent applies only to articles actually
processed by that agent (since such an agent may always reject
any article entirely for reasons of site policy).
This document prescribes many changes, clarifications and new All numeric values are given in decimal unless otherwise indicated.
features since the protocols described in [RFC-1036] and Octets are assumed to be unsigned values for this purpose.
[RFC-1036BIS]. It is the intention that they can be
assimilated into Usenet as it presently operates without major
interruption to the service, though some of the new features
may not begin to show benefit until they become widely
implemented. This section summarizes the main changes, and
comments on some features of the transition.
3.1 Principal Changes Throughout this standard we will give examples of various
definitions, headers and other specifications. It needs to be
remembered that these samples are for the aid of the reader only and
do NOT define any specification themselves. In order to prevent
possible conflict with "Real World" entities and people the top level
domain of ".example" is used in all sample domains and addresses.
The hierarchy of example.* is also used as a sample hierarchy.
Information on the ".example" top level domain is in [RFC 2606].
o The [MAIL] conventions for parenthesis-enclosed comments 2.3. Relation To Mail and MIME
in headers are supported.
o Whitespace is permitted in Newsgroups headers, permitting
folding of such headers. Indeed, all news headers can now
be folded.
o An enhanced syntax for the Path header enables the
injection point of and the route taken by an article to be
determined with certainty.
o Netnews is firmly established as an 8bit medium.
o Large parts of MIME are recognized as an integral part of
Netnews.
o The charset for headers is always UTF-8. This will, inter
alia, permit newsgroup-names with non-ASCII characters.
o There is a new Control command 'mvgroup' to facilitate
group renaming.
o There are several new headers defined, such as Replaces
and Author-Ids, leading to increased functionality.
o There are numerous other small changes, clarifications and
enhancements.
[Doubtless many other changes should be listed, but there is little
point in doing so until our text is nearing completion. The above gives
the flavour of what should be said.]
3.2 Transitional Arrangements The primary intent of this standard is to describe the news article
format. Insofar as news articles are a subset of the Mail message
format augmented by some new headers, this standard incorporates many
(though not all) of the provisions of [MESSFOR], with the aim of
enabling news articles to pass through mail systems and vice versa,
provided only that they contain the minimum headers required for the
mode of transport being used. Unfortunately, the match is not
perfect, but it is the intention of this standard that gateways
between Mail and News should be able to operate with the minimum of
An important distinction must be made between serving and tinkering.
relaying agents which are responsible for the distribution and [This standard has been designed to fit on top of the drafts currently
storage of news articles, and user agents which are in preparation for Mail [MESSFOR]. It is expected that those drafts
responsible for interactions with users. It is important that will have progressed to the RFC stage by the time the present standard
the former should be upgraded to conform to this document as is complete, at which time all references to [MESSFOR] in the present
soon as possible to provide the benefit of the enhanced text will be replaced by references to that RFC.]
facilities. Fortunately, the number of distinct
implementations of such agents is rather small, at least so
far as the main "backbone" of Usenet is concerned, and many of
the new features are already supported. Contrariwise, there
are a great number of implementations of user agents,
installed on a vastly greater number of small sites.
Therefore, the new functionality has been designed so that
existing agents may continue to be used, although the full
benefits may not be realised until a substantial proportion of
them have been upgraded.
In the list which follows, care has been taken to distinguish Likewise, this standard incorporates many (though not all) of the
the implications for both kinds of agent. provisions of the MIME standards [RFC 2045] et seq which, though
designed with Mail in mind, are mostly applicable to News.
o [MAIL] style comments in headers do not affect serving and 2.4. Syntax Notation
relaying agents (note that the Newsgroups and Path headers
do not contain them). They are unlikely to hinder their
proper display in existing user agents except in the case
of the References header in agents which thread articles.
Therefore, it is provided that they SHOULD NOT be
generated except where permitted by the previous
standards.
o Because of its importance to all serving agents, the
extension permitting whitespace and folding in Newsgroups
headers SHOULD NOT be used unless the user is willing to
take the risk of misprocessed articles. It is believed most
existing implementations handle such headers correctly, but this is not
certain. User agents are unaffected.
o The new style of Path header is already consistent with
the previous standards. However, the intention is that
relaying agents should henceforth reject articles in the
old style, and so this should be offered as a configurable
option for relaying agents. User agents are unaffected.
o The vast majority of serving, relaying and transport
agents are believed to be already 8bit clean (in the
slightly restricted sense in which that term is used in
the MIME standards). User agents that do not implement
MIME may be disadvantaged, but no more so than at present
when faced with 8bit characters (which currently abound in
spite of the previous standards).
o The introduction of MIME reflects a practice that is
already widespread. Articles in strict compliance with
the previous standards (using strict ASCII) will be
unaffected. Many user agents already support it, at least
to the extent of widely used charsets such as ISO8859-1.
Users expecting to read articles using the more exotic
charsets will need to acquire suitable reading agents. It
is not intended, in general, that any single user agent
will be able to display every charset known to IANA, but
all such agents MUST support ASCII. Serving and relaying
agents are not affected.
o The use of the UTF-8 charset for headers will not affect
any existing usage, since ASCII is a strict subset of
UTF-8. Insofar as newsgroup names containing non-ASCII
characters can now be expected to arise, support from
serving and relaying agents will be necessary. It is
believed that the customary storage structure used by
serving agents can already cope (perhaps not ideally) with
such names. Note that it is not necessary for serving and
relaying agents to understand all the characters available
in UTF-8, though it is desirable for them to be
displayable for diagnostic purposes via some escape
mechanism using, for example, the visible subset of ASCII.
For users expecting to use the more exotic charsets
available under UTF-8, the remarks already made in
connection with MIME will apply.
o The new Control: mvgroup command will need to be
implemented in serving agents. It SHOULD be used in
conjunction with pairs of matching rmgroup and newgroup
commands (injected shortly after the mvgroup) until such
time as mvgroup is widely implemented. The new Replaces
header is also effectively a Control command, and
transitional arrangements are provided which should be
used in the meantime. User agents are unaffected.
o The headers newly introduced by this document can safely
be ignored by existing software, albeit with loss of the
new functionality.
4. Basic Format This standard uses the Augmented Backus Naur Form described in [RFC
2234]. A discussion of this is outside the bounds of this standard,
but it is expected that implementors will be able to quickly
understand it with reference to the defining document.
4.1 Overall Syntax Much of the syntax of News Articles is based on the corresponding
syntax defined in [MESSFOR] or in the MIME specifications [RFC 2045]
et seq, which is deemed to have been incorporated into this standard
as required. However, there are some important differences arising
from the fact that [MESSFOR] does not recognise anything other than
US-ASCII characters, that it does not recognise the MIME headers [RFC
2045], and that it includes much syntax described as "obsolete".
Much of the syntax of News Articles is based on the NOTE: News parsers historically have been much less permissive
corresponding syntax defined by [MESSFOR], which is deemed to than Mail parsers, and this is reflected in the modifications
have been incorporated into this standard as required. referred to, and in some further specific rules.
However, there are some important differences arising from the
fact that [MESSFOR] does not recognise anything other than
US-ASCII characters, that it does not recognise the MIME
headers [RFC2045], and that it includes much syntax described
as "obsolete".
The following syntactic forms supersede the corresponding The following syntactic forms therefore supersede the corresponding
rules given in [MESSFOR] and [RFC2045]: rules given in [MESSFOR] and [RFC 2045], thus allowing UTF-8
characters [RFC 2044] to appear in certain contexts (the four rules
beginning with "strict-" reflect the corresponding original rules from
[MESSFOR]).
text = %d1-9 / ; all octets except UTF8-xtra-head = %d192-253
UTF8-xtra-tail = %d128-191
UTF8-xtra-char = UTF8-xtra-head 1*UTF8-xtra-tail
text = %d1-9 / ; all UTF-8 characters except
%d11-12 / ; US-ASCII NUL, CR and LF %d11-12 / ; US-ASCII NUL, CR and LF
%d14-255 %d14-127 /
UTF8-xtra-char
ctext = NO-WS-CTL / ; all of <text> except ctext = NO-WS-CTL / ; all of <text> except
%d33-39 / ; SP, HTAB, "(", ")" %d33-39 / ; SP, HTAB, "(", ")"
%d42-91 / ; and "\" %d42-91 / ; and "\"
%d93-255 %d93-126 /
UTF8-xtra-char
qtext = NO-WS-CTL / ; all of <text> except qtext = NO-WS-CTL / ; all of <text> except
%d33 / ; SP, HTAB, "\" and <"> %d33 / ; SP, HTAB, "\" and DQUOTE
%d35-91 / %d35-91 /
%d93-255 %d93-126 /
ftext = %d33-57 / ; all octets except UTF8-xtra-char
%d59-126 / ; CTL, SP and ":"
%d128-255
token = 1*<any ftext except tspecials>
tspecials = "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / <"> /
"/" / "[" / "]" / "?" / "="
Wherever in this standard the syntax is stated to be taken utext = NO-WS-CTL / ; Non white space controls
from [MESSFOR], it is to be understood as the syntax defined %d33-126 / ; The rest of US-ASCII
by [MESSFOR] after making the above changes, but NOT including UTF8-xtra-char
any syntax defined in section 4 ("Obsolete syntax") of strict-text = %d1-9 / ; text restricted to
[MESSFOR]. Software compliant with this standard MUST NOT %d11-12 / ; US-ASCII
generate any of the syntactic forms defined in that Obsolete %d14-127
Syntax, although it MAY accept such syntactic forms. Certain strict-qtext = NO-WS-CTL / ; qtext restricted to
syntax from the MIME specifications [RFC2045 et seq] is also %d33 / ; US-ASCII
considered a part of this Standard (see ...). %d35-91 /
%d93-127
strict-quoted-pair
= "\" strict-text
strict-quoted-string
= [CFWS] DQUOTE
*([FWS] (strict-qtext / strict-quoted-pair))
[FWS] DQUOTE [CFWS]
NOTE: There are sequences of octets which cannot legitimately
occur in UTF-8, even a few permitted by the above syntax. These
SHOULD NOT be generated by posting agents but, where they occur
inadvertently, they SHOULD be passed on untouched by other
agents.
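The gap between the UTF8-xtra-char syntax and genuine UTF-8, mentioned in the NOTE above, can be sketched in Python. The names `matches_xtra_char` and `is_valid_utf8` are hypothetical helpers. The sketch also assumes that Python's built-in decoder is an acceptable judge of validity; it follows the current UTF-8 definition, which may be stricter than the one referenced by this draft.

```python
import re

# UTF8-xtra-head = %d192-253 (0xC0-0xFD)
# UTF8-xtra-tail = %d128-191 (0x80-0xBF)
# UTF8-xtra-char = UTF8-xtra-head 1*UTF8-xtra-tail
UTF8_XTRA_CHAR = re.compile(rb"[\xC0-\xFD][\x80-\xBF]+")


def matches_xtra_char(octets):
    """True if the octets form exactly one UTF8-xtra-char per the syntax."""
    return UTF8_XTRA_CHAR.fullmatch(octets) is not None


def is_valid_utf8(octets):
    """True if Python's strict decoder accepts the octets as UTF-8."""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Under these assumptions, the two octets 0xC3 0xA9 (e-acute) satisfy both checks, whereas 0xC0 0x80 satisfies the syntax but is rejected by the decoder. A posting agent would avoid generating the latter, while other agents would pass it on untouched.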
Wherever in this standard the syntax is stated to be taken from
[MESSFOR], it is to be understood as the syntax defined by [MESSFOR]
after making the above changes, but NOT including any syntax defined
in section 4 ("Obsolete syntax") of [MESSFOR]. Software compliant
with this standard MUST NOT generate any of the syntactic forms
defined in that Obsolete Syntax, although it MAY accept such
syntactic forms. Certain syntax from the MIME specifications [RFC
2045] et seq is also considered a part of this standard (see 6.17).
The following syntactic forms, taken from [RFC2234] or from The following syntactic forms, taken from [RFC2234] or from
[MESSFOR], are repeated here for convenience only: [MESSFOR], are repeated here for convenience only:
ALPHA = %x41-5A / ; A-Z ALPHA = %x41-5A / ; A-Z
%x61-7A ; a-z %x61-7A ; a-z
CR = %x0D ; carriage return CR = %x0D ; carriage return
CRLF = CR LF CRLF = CR LF
DIGIT = %x30-39 ; 0-9 DIGIT = %x30-39 ; 0-9
HTAB = %x09 ; horizontal tab HTAB = %x09 ; horizontal tab
LF = %x0A ; line feed LF = %x0A ; line feed
SP = %x20 ; space SP = %x20 ; space
NO-WS-CTL = %d1-8 / ; US-ASCII control characters NO-WS-CTL = %d1-8 / ; US-ASCII control characters
%d11 / ; which do not include the %d11 / ; which do not include the
%d12 / ; carriage return, line feed, %d12 / ; carriage return, line feed,
%d14-41 / ; and whitespace characters %d14-31 / ; and whitespace characters
%d127 %d127
WSP = SP / HTAB ; Whitespace characters WSP = SP / HTAB ; Whitespace characters
FWS = ([*WSP CRLF] 1*WSP) ; Folding whitespace FWS = ([*WSP CRLF] 1*WSP) ; Folding whitespace
comment = "(" *([FWS] (ctext / quoted-pair / comment))
[FWS] ")"
atext = ALPHA / DIGIT /
"!" / "#" / ; Any character except
"$" / "%" / ; controls, SP, and specials.
"&" / "'" / ; Used for atoms
"*" / "+" /
"-" / "/" /
"=" / "?" /
"^" / "_" /
"`" / "{" /
"|" / "}" /
"~"
atom = [CFWS] 1*atext [CFWS]
dot-atom = [CFWS] dot-atom-text [CFWS]
dot-atom-text = 1*atext *( "." 1*atext )
comment = "(" *([FWS]
(ctext / quoted-pair / comment)) [FWS] ")"
CFWS = *([FWS] comment) (([FWS] comment) / FWS ) CFWS = *([FWS] comment) (([FWS] comment) / FWS )
<"> = %d34 ; quote mark DQUOTE = %d34 ; quote mark
quoted-pair = "\" text quoted-pair = "\" text
quoted-string = *CFWS <"> *(FWS (qtext / quoted-pair)) <"> *CFWS quoted-string = [CFWS] DQUOTE
unstructured = *( [FWS] text ) *([FWS] (qtext / quoted-pair))
[FWS] DQUOTE [CFWS]
4.2. Syntax of News Articles unstructured = *( [FWS] utext ) [FWS]
The overall syntax of a news article is:
article = 1*header separator body
header = header-name ":" SP header-content CRLF
header-name = 1*name-character *( "-" 1*name-character )
name-character = ALPHA / DIGIT
header-content = usenet-header-content / unstructured
usenet-header-content
= <a header-content specifically defined in this standard>
separator = CRLF
body = *( *998text CRLF )
nonblank-text = 1*( [FWS] nbtext )
nbtext = qtext / ; all of <text> except
"\" / <"> ; SP and HTAB
An article consists of some headers followed by a body. An
empty line separates the two. The headers contain structured
information about the article and its transmission. A header
begins with a header-name identifying it, and can be continued
onto subsequent lines as described in section 4.3.2. The body
is largely unstructured text significant only to the poster
and the readers.
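The overall article syntax given above can be sketched as a small Python parser. This is an illustration under stated assumptions, not a complete implementation: the function name `split_article` is hypothetical, the article is taken to be a string already in canonical CRLF form, continuation lines are assumed to begin with SP or HTAB as in mail folding, and header-content is not validated further.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
HEADER_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def split_article(article):
    """Split a canonical article into a list of (name, content) and a body."""
    # The separator is a truly empty line: a CRLF directly following the
    # CRLF that ends the last header.
    head, separator, body = article.partition("\r\n\r\n")
    if not separator:
        raise ValueError("no empty separator line found")
    headers = []
    for line in head.split("\r\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line of a folded header (assumed convention).
            name, content = headers[-1]
            headers[-1] = (name, content + "\r\n" + line)
            continue
        # header = header-name ":" SP header-content CRLF
        name, colon_sp, content = line.partition(": ")
        if not colon_sp or not HEADER_NAME.fullmatch(name):
            raise ValueError("malformed header line: %r" % line)
        headers.append((name, content))
    return headers, body
```

Because only the first empty line is treated as the separator, any further empty lines remain part of the body.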
NOTE: Terminology here follows the current custom in the news NOTE: CFWS occurs at many places in the syntax in order to allow
community, rather than the [MESSFOR] convention of referring comments and extra whitespace to be inserted almost anywhere.
to what is here called a "header" as a "header-field" or The syntax is in fact ambiguous insofar as it may be impossible
"field". to tell in which of several possible ways a given comment or WS
was produced. However, this does not lead to semantic ambiguity
because, unless specifically stated otherwise, the presence or
absence of a comment or additional WS has no semantic meaning
and, in particular, it is a matter of indifference whether it
forms a part of the syntactic construct preceding it or the one
following it.
Note that the separator line must be truly empty, not just a NOTE: Following [RFC 2234], literal text included in the syntax
line containing white space. Further empty lines following it is to be regarded as case-insensitive. However, in
are part of the body, as are empty lines at the end of the contradistinction to [MESSFOR], the Netnews protocols are
article. sensitive to case in some instances (as in newsgroup names, some
header parameters, etc.). Care has been taken to indicate this
explicitly where required.
NOTE: The syntax above defines the canonical form of a news article as a 2.5. Language
sequence of lines each terminated by CRLF. This does not prevent serving
agents or transport agents from storing or handling the article in other
formats (e.g. using a single LF in place of CRLF) so long as the overall
effects achieved are as defined by this document when operating on the
canonical form.
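As an illustration of the NOTE above, an agent that stores articles with a single LF per line could convert to and from the canonical form roughly as follows. The function names are hypothetical, and the sketch assumes that stored articles contain no bare CR octets.

```python
def to_canonical(stored):
    """Convert an article stored with LF line endings to canonical CRLF form."""
    # Normalise any CRLF already present first, so that it is not doubled.
    return stored.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def to_local(canonical):
    """Convert a canonical CRLF article to the local LF-only storage form."""
    return canonical.replace(b"\r\n", b"\n")
```

Working on octets rather than decoded text keeps the conversion 8 bit clean.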
4.3. Headers Various constant strings in this standard, such as header names and
month names, are derived from English words. Despite their
derivation, these words do NOT change when the poster or reader
employing them is interacting in a language other than English.
Posting and reading agents MAY translate as appropriate in their
interaction with the poster or reader, but the forms that actually
appear in articles MUST be the English-derived ones defined in this
standard.
4.3.1. Names and Contents
Despite the restrictions on header-name syntax imposed by the 3. Changes to the existing protocols
grammar, relayers and reading agents SHOULD tolerate header
names containing any ASCII printable character other than
colon (":", ASCII 58). [That brings it into line with
<optional-field> as given in [MESSFOR].]
Header-names SHOULD be either those defined in this standard, This standard prescribes many changes, clarifications and new
or those defined in [MESSFOR], or those defined in any features since the protocols described in [RFC 1036] and [Son-of-
extension to either of these standards, or other names 1036]. It is the intention that they can be assimilated into Usenet
beginning with "X-". Software SHOULD NOT attempt to interpret as it presently operates without major interruption to the service,
headers not described in this standard or in its extensions. though some of the new features may not begin to show benefit until
Relaying agents MUST pass them on unaltered and reading agents they become widely implemented. This section summarizes the main
MUST enable them to be displayed, at least optionally. changes, and comments on some features of the transition.
Posters wishing to convey non-standard information in headers 3.1. Principal Changes
SHOULD use header-names beginning with "X-". No standard
header name will ever be of this form. Reading agents SHOULD
ignore "X-" headers, or at least treat them with great care.
The order of headers in an article is not significant. o The [MESSFOR] conventions for parenthesis-enclosed comments in
However, posting agents are encouraged to put mandatory headers are supported.
headers (see section 5) first, followed by optional headers o Whitespace is permitted in Newsgroups headers, permitting folding
(see section 6), followed by "X-" headers and headers not of such headers. Indeed, all news headers can now be folded.
defined in this standard or its extensions. Relaying agents o An enhanced syntax for the Path header enables the injection
MUST NOT change the order of the headers in an article. point of and the route taken by an article to be determined with
certainty.
o Netnews is firmly established as an 8bit medium.
o Large parts of MIME are recognised as an integral part of
Netnews.
o The charset for headers is always UTF-8. This will, inter alia,
permit newsgroup-names with non-ASCII characters.
o There is a new Control command 'mvgroup' to facilitate moving a
group to a different place (name) in a hierarchy.
o There are several new headers defined, such as Replaces and
Author-Ids, leading to increased functionality.
o There are numerous other small changes, clarifications and
enhancements.
[Doubtless many other changes should be listed, but there is little
point in doing so until our text is nearing completion. The above gives
the flavour of what should be said.]
Header-names are case-insensitive. There is a preferred case 3.2. Transitional Arrangements
convention, which posters and posting agents SHOULD use:
each hyphen-separated "word" has its initial letter (if any)
in uppercase and the rest in lowercase, except that some
abbreviations have all letters uppercase (e.g. "Message-ID"
and "MIME-Version"). The forms used in this standard are the
preferred forms for the headers described herein. Relaying and
reading agents MUST, however, tolerate articles not obeying
this convention.
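The preferred case convention above can be sketched in Python. The helper names are hypothetical, and the table of uppercase abbreviations contains only the two examples named in the text ("ID" and "MIME"); a real agent might need a longer list.

```python
# Words kept entirely in uppercase, taken from the examples in the text.
UPPERCASE_WORDS = {"id": "ID", "mime": "MIME"}


def preferred_case(header_name):
    """Return header_name in the preferred case convention."""
    words = []
    for word in header_name.split("-"):
        if word.lower() in UPPERCASE_WORDS:
            words.append(UPPERCASE_WORDS[word.lower()])
        else:
            # Initial letter (if any) in uppercase, the rest in lowercase.
            words.append(word[:1].upper() + word[1:].lower())
    return "-".join(words)


def same_header(name_a, name_b):
    """Header-names are case-insensitive, so compare them folded to lowercase."""
    return name_a.lower() == name_b.lower()
```

A posting agent might apply `preferred_case` when generating headers, while relaying and reading agents would rely only on a comparison such as `same_header`, since they must tolerate any case.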
[I thought we were doing away with header classes, except to An important distinction must be made between serving and relaying
discuss eXperimental. Consensus, please?] agents which are responsible for the distribution and storage of news
articles, and user agents which are responsible for interactions with
users. It is important that the former should be upgraded to conform
to this standard as soon as possible to provide the benefit of the
enhanced facilities. Fortunately, the number of distinct
implementations of such agents is rather small, at least so far as
the main "backbone" of Usenet is concerned, and many of the new
features are already supported. Contrariwise, there are a great
number of implementations of user agents, installed on a vastly
greater number of small sites. Therefore, the new functionality has
been designed so that existing agents may continue to be used,
although the full benefits may not be realised until a substantial
proportion of them have been upgraded.
4.3.2 Header Classes In the list which follows, care has been taken to distinguish the
implications for both kinds of agent.
There are four special classes of headers that may be present
in an article: Experimental, Persistent, Comment, and
Variant. All other headers are ephemeral. These classes are
significant in how newsreaders and servers should treat them
when encountered.
4.3.3 Experimental Headers o [MESSFOR] style comments in headers do not affect serving and
relaying agents (note that the Newsgroups and Path headers do not
contain them). They are unlikely to hinder their proper display
in existing user agents except in the case of the References
header in agents which thread articles. Therefore, it is provided
that they SHOULD NOT be generated except where permitted by the
previous standards.
o Because of its importance to all serving agents, the extension
permitting whitespace and folding in Newsgroups headers SHOULD NOT
be used until it has been widely deployed amongst relaying
agents. User agents are unaffected.
o The new style of Path header is already consistent with the
previous standards. However, the intention is that relaying
agents should henceforth reject articles in the old style, and so
this should be offered as a configurable option for relaying
agents. User agents are unaffected.
[Should that "should" be a "SHOULD" or a "MAY".]
Experimental headers are headers which begin with "X-". They o The vast majority of serving, relaying and transport agents are
are to be used by newsreaders proposing new headers for some believed to be already 8bit clean (in the slightly restricted
utility or for comments to be propagated with the article. sense in which that term is used in the MIME standards). User
There are no established headers that are considered agents that do not implement MIME may be disadvantaged, but no
experimental headers; an established header cannot be more so than at present when faced with 8bit characters (which
experimental. currently abound in spite of the previous standards).
o The introduction of MIME reflects a practice that is already
widespread. Articles in strict compliance with the previous
standards (using strict US-ASCII) will be unaffected. Many user
agents already support it, at least to the extent of widely used
charsets such as ISO-8859-1. Users expecting to read articles
using the more exotic charsets will need to acquire suitable
reading agents. It is not intended, in general, that any single
user agent will be able to display every charset known to IANA,
but all such agents MUST support US-ASCII. Serving and relaying
agents are not affected.
o The use of the UTF-8 charset for headers will not affect any
existing usage, since US-ASCII is a strict subset of UTF-8.
Insofar as newsgroup names containing non-ASCII characters can
now be expected to arise, support from serving and relaying
agents will be necessary. It is believed that the customary
storage structure used by serving agents can already cope
(perhaps not ideally) with such names. Note that it is not
necessary for serving and relaying agents to understand all the
characters available in UTF-8, though it is desirable for them to
be displayable for diagnostic purposes via some escape mechanism
using, for example, the visible subset of US-ASCII. For users
expecting to use the more exotic charsets available under UTF-8,
the remarks already made in connection with MIME will apply.
o The new Control: mvgroup command will need to be implemented in
serving agents. It SHOULD be used in conjunction with pairs of
matching rmgroup and newgroup commands (injected shortly after
the mvgroup) until such time as mvgroup is widely implemented.
The new Replaces header is also effectively a Control command,
and transitional arrangements are provided which should be used
in the meantime. User agents are unaffected.
Attempts to create new headers that are to be adopted as
standard headers MUST begin their lives as experimental
headers.
4.3.4 Persistent Headers o The headers newly introduced by this standard can safely be
ignored by existing software, albeit with loss of the new
functionality.
Persistent headers are headers which begin with "P-" (or 4. Basic Format
"X-P-", hereafter referred to simply as "P- headers") which
persist across followups either identically or by simple
modification. Headers with this behavior include:
Newsgroups 4.1. Syntax of News Articles
Content is carried over into all followups. Modified by
content of Followup-To header.
Subject The overall syntax of a news article is:
Content is carried over into all followups. Modified by
prefixing with "Re: " if not already present. Also modified by
user, often with a "(was: )" phrase preserving the previous
content.
References article = 1*header separator body
Content is carried over into all followups. Modified by header = header-name ":" 1*SP header-content CRLF
appending content of Message-ID header. header-name = 1*name-character *( "-" 1*name-character )
name-character = ALPHA / DIGIT
header-content = USENET-header-content
*( ";" header-parameter ) /
other-header-content
USENET-header-content
= <the header-content defined in this standard
(or an extension of it) for a specific
USENET header>
other-header-content
= <a header-content defined (explicitly or
implicitly) by some other standard>
header-parameter = USENET-header-parameter /
other-header-parameter
USENET-header-parameter
= <an other-header-parameter defined in
this standard for use in conjunction with
a specific USENET-header-content>
other-header-parameter
= attribute "=" value
attribute = USENET-token / iana-token / x-token
value = token / quoted-string
USENET-token = <A token defined in this standard for
use in conjunction with a specific
USENET-header-parameter>
iana-token = <A token defined in an experimental
or standards-track RFC and registered with
IANA>
x-token = [CFWS] <the two characters "X-" or "x-"
followed, with no intervening white space,
by any token>
token = [CFWS] 1*<any (US-ASCII) CHAR except SP,
CTLs or tspecials> [CFWS]
tspecials = "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / DQUOTE /
"/" / "[" / "]" / "?" / "="
separator = CRLF
body = *( *998text CRLF )
NOTE: Though traditionally old newsreaders would treat An article consists of some headers followed by a body. An empty line
Keywords as a persistent header, it is not a persistent separates the two. The headers contain structured information about
header. More modern newsreaders do not treat it as such. the article and its transmission. A header begins with a header-name
4.3.5. Variant Headers identifying it, and can be continued onto subsequent lines as
described in section 4.2.3. The body is largely unstructured text
significant only to the poster and the readers.
Variant Headers are headers that are modified on articles when NOTE: Terminology here follows the current custom in the news
they are propagated. Variant headers have a "V-" prefix. community, rather than the [MESSFOR] convention of referring to
Variant headers may be experimental ("X-V-"), persistent what is here called a "header" as a "header-field" or "field".
("P-V-"), or both ("X-P-V-").
4.3.6. Header Classes Note that the separator line must be truly empty, not just a line
containing white space. Further empty lines following it are part of
the body, as are empty lines at the end of the article.
There are four special classes of headers that may be present NOTE: The syntax above defines the canonical form of a news
in an article: Experimental, Persistent, Comment, and article as a sequence of lines each terminated by CRLF. This
Variant. All other headers are ephemeral. These classes are does not prevent serving agents or transport agents from storing
significant in how newsreaders and servers should treat them or handling the article in other formats (e.g. using a single LF
when encountered. in place of CRLF) so long as the overall effects achieved are as
defined by this standard when operating on the canonical form.
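As a rough illustration of the canonical form described above, the following Python sketch splits an article at the first truly empty line and unfolds continuation lines into logical headers. The function name split_article is hypothetical and not part of this standard, and the sketch assumes the input has already been converted to canonical CRLF form.

```python
def split_article(raw):
    """Split a canonical (CRLF-terminated) article into headers and body.

    Returns (headers, body): headers is a list of unfolded logical
    header lines (bytes, without CRLF); body is the remaining bytes.
    """
    # The separator must be a truly empty line, so look for CRLF CRLF.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no empty separator line found")
    headers = []
    for line in head.split(b"\r\n"):
        if line[:1] in (b" ", b"\t") and headers:
            # Continuation line: its leading white space stays part
            # of the logical line; only the CRLF is dropped.
            headers[-1] += line
        else:
            headers.append(line)
    return headers, body
```

A line containing only white space does not count as the separator here, and any further empty lines after the separator are returned as part of the body.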
4.3.6.1 Experimental Headers 4.2. Headers
Experimental headers are headers which begin with "X-". They 4.2.1. Names and Contents
are to be used by newsreaders proposing new headers for some
utility or for comments to be propagated with the article.
There are no established headers that are considered
experimental headers; an established header cannot be
experimental.
Attempts to create new headers that are to be adopted as Despite the restrictions on header-name syntax imposed by the
standard headers MUST begin their lives as experimental grammar, relayers and reading agents SHOULD tolerate header names
headers. containing any US-ASCII printable character other than colon (":",
ASCII 58).
[To bring it into line with <optional-field> as given in [MESSFOR].]
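The difference between the strict header-name grammar and the more tolerant behaviour asked of relayers and reading agents might be sketched as follows. This is only an illustration: parse_header and its strict flag are hypothetical names, and the tolerant pattern assumes "printable" means the octets 0x21 to 0x7E.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
STRICT_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")
# Any US-ASCII printable character other than colon (assumed 0x21-0x7E).
TOLERANT_NAME = re.compile(r"[\x21-\x39\x3B-\x7E]+")


def parse_header(line, strict=False):
    """Split one unfolded header line into (header-name, header-content)."""
    name, colon, content = line.partition(":")
    pattern = STRICT_NAME if strict else TOLERANT_NAME
    if not colon or not pattern.fullmatch(name):
        raise ValueError("malformed header-name")
    # Drop the white space that follows the colon.
    return name, content.lstrip(" \t")
```

A posting agent would presumably use the strict form when generating headers, whereas a relayer would use the tolerant form when accepting them.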
4.3.6.2 Persistent Headers Header-names SHOULD be either those for which a USENET-header-content
is defined in this standard, or those defined in [MESSFOR], or those
defined in any extension to either of these standards including, in
particular, the Mime standards [RFC 2045] et seq., or experimental
headers beginning with "X-" (as defined in 4.2.2.1). Software SHOULD
NOT attempt to interpret headers not described in this standard or in
its extensions, but relaying agents MUST pass them on unaltered and
reading agents MUST enable them to be displayed, at least optionally.
Persistent headers are headers which begin with "P-" (or The possibility of allowing header-parameters to appear in all
"X-P-", hereafter referred to simply as "P- headers") which headers is provided mainly for the purpose of allowing future
persist across followups either identically or by simple extensions to existing headers, since only a very few USENET-header-
modification. Headers with this behavior include: parameters are actually defined in this standard. Observe that such
header-parameters do not, in general, occur in headers defined in
other standards, except for the Mime standards [RFC 2045] et seq. and
their extensions. Nevertheless, compliant software MUST accept all
such header-parameters in headers defined by this standard and its
extensions (ignoring them if their meaning is unknown) and SHOULD
accept (and ignore) them in all headers.
[but what about
address = mailbox / group
group = phrase ":" [mailbox-list] ";"
Does the following NOTE cover the situation?]
Newsgroups NOTE: The presence of a ";" in a header-content does not
indicate the presence of a header-parameter in the few
situations where it can be parsed as part of some USENET-
header-content or other-header-content.
Content is carried over into all followups. Modified by On the other hand, posting agents SHOULD NOT generate them (even
content of Followup-To header. those using x-tokens) except in those headers for which a USENET-
header-parameter has been defined, or where that usage is permitted
by some other standard (notably one of the Mime standards). This
restriction is likely to be removed in a future version of this
standard.
Subject NOTE: The given syntax is ambiguous insofar as a USENET-header-
content that is defined to be <unstructured> could contain,
within that <unstructured>, text of the form <*(";" header-
parameter)>. The intention is therefore that any such apparent
header-parameters are to be regarded as part of the
<unstructured>. This standard therefore does not (and extensions
to it SHOULD NOT) define any USENET-header-parameter to be
associated with such an unstructured USENET-header-content.
Content is carried over into all followups. Modified by The order of headers in an article is not significant. However,
prefixing with "Re: " if not already present. Also modified by posting agents are encouraged to put mandatory headers (section 5)
user, often with a "(was: )" phrase preserving the previous first, followed by optional headers (section 6), followed by
content. experimental headers and headers not defined in this standard or its
extensions. Relaying agents MUST NOT change the order of the headers
in an article, though they MAY add additional headers, preferably
either before or after all the existing ones.
References Header-names are case-insensitive. There is a preferred case
convention, which posters and posting agents SHOULD use: each
hyphen-separated "word" has its initial letter (if any) in uppercase
and the rest in lowercase, except that some abbreviations have all
letters uppercase (e.g. "Message-ID" and "MIME-Version"). The forms
used in this standard are the preferred forms for the headers
described herein. Relaying and reading agents MUST, however, tolerate
articles not obeying this convention.
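The case convention above could be applied roughly as in the following Python sketch. The names preferred_case, same_header and ABBREVIATIONS are hypothetical, and the abbreviation table is an assumption that covers only the two examples quoted in the text ("Message-ID" and "MIME-Version"); a real agent would need a fuller list.

```python
# Assumption: only the abbreviations shown in the text are listed here.
ABBREVIATIONS = {"id": "ID", "mime": "MIME"}


def preferred_case(name):
    """Return a header-name in the preferred case convention."""
    words = []
    for word in name.split("-"):
        # Known abbreviations are all uppercase; other words get an
        # initial capital and the rest in lowercase.
        words.append(ABBREVIATIONS.get(word.lower(), word.capitalize()))
    return "-".join(words)


def same_header(a, b):
    """Header-names are case-insensitive, so compare them folded."""
    return a.lower() == b.lower()
```

Relaying and reading agents would rely on a comparison such as same_header rather than on the preferred form, since they must tolerate articles that do not follow the convention.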
Content is carried over into all followups. Modified by 4.2.2. Header Properties
appending content of Message-ID header.
NOTE: Though traditionally old newsreaders would treat There are four special properties that may apply to particular
Keywords as a persistent header, it is not a persistent headers, namely: "experimental", "inheritable", "local", and
header. More modern newsreaders do not treat it as such. "variant". When a header is defined, in this (or any future)
standard, as having one (or possibly more) of these properties, it is
subject to special treatment, as indicated below.
4.3.6.3 Examples 4.2.2.1. Experimental Headers
Newsgroups: alt.test Experimental headers are those whose header-names begin with "X-".
Subject: Persistent Header Example They are to be used for experimental Netnews features, or for
Message-ID: <001@news.site.example> enabling additional material to be propagated with an article. There
P-Author-IDs: <johnsmith-site.example-unique> are no established headers that are considered experimental headers;
User-Agent: experimental/0.1g (P-Author-ID Compliant) an established header cannot be experimental.
From: jane@site.invalid (Jane Smith)
Newsgroups: alt.test
Followup-To: misc.test
Subject: Re: Persistent Header Example
Message-ID: <002@news.site.example>
References: <001@news.site.example>
P-Author-IDs: <johnsmith-site.example-unique>
User-Agent: modern/1.2 (Author-ID non-Compliant; P- header compliant)
Keywords: persistance, good ideas
From: andrew@isp.invalid NOTE: Some such headers may eventually be adopted as standard by
Newsgroups: misc.test some extension to this standard, at which point they will lose
Subject: Further example (was: Re: Persistent Header Example) their "X-" prefix.
Message-ID: <001@news.isp.example>
References: <001@news.site.example> <002@news.site.example>
P-Author-IDs: <johnsmith-site.example-unique> <andrew@isp.example>
User-Agent: codeveloper/2.0b (Author-ID Compliant)
4.3.6.4 Comment Headers 4.2.2.2. Inheritable Headers
Comment headers are headers that are strictly local and MUST Subject only to the overriding ability of the poster to determine the
NOT be propagated outside of a restricted subnet for local contents of the headers in a proto-article, headers with the
testing purposes. Comment headers have a prefix of "C-". Due inheritable property MUST be copied by followup agents (perhaps with
to their limited scope, they MUST NOT be combined with any some modification) into the followup article, and headers without
other prefix, such as "X-C-" headers. Headers with this that property MUST NOT be so copied. Examples include:
behavior include: o Newsgroups (5.5) - copied from the precursor, subject to any
Followup-To header.
o Subject (5.4) - modified by prefixing with "Re: ", but otherwise
copied from the precursor.
o References (6.8) - copied from the precursor, with the addition
of the precursor's Message-ID.
o Distribution (6.6) - copied from the precursor.
Xref NOTE: The Keywords header is not inheritable, though some older
newsreaders treated it as such.
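A followup agent's handling of the inheritable headers listed above might look roughly like the following Python sketch. The function followup_headers and the use of a plain dict are assumptions made for illustration. The check that avoids a second "Re: " prefix follows the earlier wording of this section, and any special Followup-To values are ignored.

```python
def followup_headers(precursor):
    """Derive the inherited headers of a followup from its precursor.

    precursor maps header-names to header-contents (both str).
    """
    out = {}
    # Newsgroups: copied, subject to any Followup-To header.
    out["Newsgroups"] = precursor.get("Followup-To", precursor["Newsgroups"])
    # Subject: prefixed with "Re: " (assumed: only if not already there).
    subject = precursor["Subject"]
    out["Subject"] = subject if subject.startswith("Re: ") else "Re: " + subject
    # References: the precursor's list plus the precursor's Message-ID.
    refs = precursor.get("References", "").split()
    refs.append(precursor["Message-ID"])
    out["References"] = " ".join(refs)
    # Distribution: copied unchanged when present.
    if "Distribution" in precursor:
        out["Distribution"] = precursor["Distribution"]
    # Headers without the inheritable property (e.g. Keywords) are
    # deliberately not copied.
    return out
```

The poster can still override any of the resulting headers before the proto-article is posted.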
Used by servers to keep track of crossposted articles' article 4.2.2.3. Local Headers
numbers in the crossposted-to news groups in the local news
spool as an aid to newsreaders marking such articles as read.
4.3.6.5. Variant Headers Headers with the local property are significant only to a particular
serving agent (or perhaps a cooperating group of such agents). They
MAY be removed by relaying agents before propagation, and MUST be
removed (and replaced as necessary) by serving agents when received.
The replaced header MAY be placed anywhere within the headers (though
placing it first is recommended). The principal example is:
o Xref (6.14) - used to keep track of the article locators of
crossposted articles so that newsreaders can mark such articles
as read.
Variant Headers are headers that are modified on articles when 4.2.2.4. Variant Headers
they are propagated. Variant headers have a "V-" prefix.
Variant headers may be experimental ("X-V-"), persistent
("P-V-"), or both ("X-P-V-").
4.3.7. White Space and Continuations Headers with the variant property are modified as articles are
propagated. The modified header MAY be placed anywhere within the
headers (though placing it first is recommended). The principal
example is:
o Path (5.6) - augmented at each relaying agent that an article
passes through.
[The following text is taken from [MESSFOR], adapted to the 4.2.3. White Space and Continuations
different terminology used for this standard.]
Each header is logically a single line of characters [The following text is taken from [MESSFOR], adapted to the different
comprising the header-name, the colon with its following terminology used for this standard.]
SP, and the header-content. For convenience, however, the
header-content can be split into a multiple line
representation; this is called "folding". The general rule is
that wherever this standard allows for FWS (which includes
CFWS, but not simply SP or HTAB) a CRLF followed by AT
LEAST one SP or HTAB may instead be inserted. For example,
the header:
Approved: modname@modsite.com(Acting Moderator of Each header is logically a single line of characters comprising the
comp.foo.bar) header-name, the colon with its following SP, and the header-content.
For convenience, however, the header-content can be split into a
multiple line representation; this is called "folding". The general
rule is that wherever this standard allows for FWS or CFWS (but not
simply SP or HTAB) a CRLF may be inserted before any WSP. For
example, the header:
Approved: modname@modsite.example (Moderator of comp.foo.bar)
can be represented as: can be represented as:
Approved: modname@modsite.example
Approved: modname@modsite.com (Moderator of comp.foo.bar)
(Acting Moderator of comp.foo.bar)
NOTE: Though header-contents are defined in such a way that NOTE: Though header-contents are defined in such a way that
folding can take place between many of the lexical tokens, folding can take place between many of the lexical tokens (and
folding SHOULD be limited to placing the CRLF at higher-level even within some of them), folding SHOULD be limited to placing
syntactic breaks. For instance, if a header-content is defined the CRLF at higher-level syntactic breaks, and SHOULD also avoid
as comma-separated values, it is recommended that folding leaving trailing WSP on the preceding line. For instance, if a
occur after the comma separating the structured items, even if header-content is defined as comma-separated values, it is
it is allowed elsewhere. recommended that folding occur after the comma separating the
structured items, even if it is allowed elsewhere.
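The recommendation to fold after the separating comma can be sketched in Python as below. This is a minimal sketch under stated assumptions: fold_after_commas is a hypothetical helper, the header is assumed to be one whose syntax permits FWS after each comma, and the default width of 79 is an arbitrary choice rather than a limit taken from this standard.

```python
def fold_after_commas(name, items, width=79):
    """Build a folded header whose content is a comma-separated list.

    Each fold is placed after a comma, no line is left with trailing
    white space, and the first line always carries some content.
    """
    if not items:
        raise ValueError("at least one item is required")
    lines = [name + ": " + items[0]]
    for item in items[1:]:
        lines[-1] += ","
        # Allow for the separating SP and a possible trailing comma.
        if len(lines[-1]) + 1 + len(item) + 1 > width:
            lines.append(" " + item)  # continuation starts with one SP
        else:
            lines[-1] += " " + item
    return "".join(line + "\r\n" for line in lines)
```

Removing each CRLF that precedes a SP gives back the same logical line whether or not the header was folded.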
Folding MUST NOT be carried out in such a way that any line of Folding MUST NOT be carried out in such a way that any line of a
a header is made up entirely of WSP characters and nothing header is made up entirely of WSP characters and nothing else.
else. [That is taken from a rather unsatisfactory line in
section 3.2.4 of [MESSFOR] (which seems to allow WSP-only
lines to arise from FWS but not from CFWS). The situation
could arise where two FWS or CFWS could be adjacent, according
to the syntax (I believe this is possible in [MESSFOR], which
goes to show how sloppy their syntax is), or where FWS or CFWS
is allowed at the end of a line.]
The colon following the header name on the start-line MUST be The colon following the header name on the first line MUST be
followed by white space, even if the header is empty. If the followed by a WSP, even if the header is empty. If the header is not
header is not empty, at least some of the content MUST appear empty, at least some of the content MUST appear on the first line
on the start-line. Posting agents MUST enforce these (this is to avoid the possibility of harm by any non-compliant agent
restrictions, but relaying agents SHOULD accept even articles that might eliminate a trailing SP). Posting agents MUST enforce
these restrictions, but relaying agents SHOULD accept even articles
that violate them. that violate them.
Posters and posting agents SHOULD use SP, not HTAB, where NOTE: This standard differs from [MESSFOR] in requiring WSP
white space is desired in headers (some existing software following the colon (it was also an [RFC 1036] requirement).
expects this), and MUST use SP immediately following the
colon after a header-name (this was an RFC 1036 requirement).
Relaying agents SHOULD accept HTAB in all such cases, however.
Since the white space beginning a continuation line remains a
part of the logical line, headers can be "broken" into
multiple lines only at FWS or CFWS. Posting agents SHOULD not
break headers unnecessarily (but see section 4.6).
4.3.8 Comments Posters and posting agents SHOULD use SP, not HTAB, where white space
is desired in headers (some existing software expects this), and MUST
use SP immediately following the colon after a header-name. Relaying
agents SHOULD accept HTAB in all such cases, however.
Strings of characters which are treated as comments may be Since the white space beginning a continuation line remains a part of
included in header contents wherever the syntactic element the logical line, headers can be "broken" into multiple lines only at
CFWS occurs. They consist of characters enclosed in FWS or CFWS. Posting agents SHOULD NOT break headers unnecessarily
parentheses. Such strings are considered comments so long as (but see 4.5).
they do not appear within a quoted-string. Comments may be
nested.
A comment is normally used to provide some human readable 4.2.4. Comments
informational text, except at the end of an <address> which
contains no <phrase>, as in
fred@foo.bar.com (Fred Bloggs) Strings of characters which are treated as comments may be included
in header-contents wherever the syntactic element CFWS occurs. They
consist of characters enclosed in parentheses. Such strings are
considered comments so long as they do not appear within a quoted-
string. Comments may be nested.
A comment is normally used to provide some human readable
informational text, except at the end of an address which contains no
phrase, as in
fred@foo.bar.example (Fred Bloggs)
as opposed to as opposed to
"Fred Bloggs" <fred@foo.bar.example> .
"Fred Bloggs" <fred@foo.bar.com>
The former is a deprecated, but commonly encountered, usage The former is a deprecated, but commonly encountered, usage and
and reading agents SHOULD take special note of such comments reading agents SHOULD take special note of such comments as
as indicating the name of the person whose <address> it is. In indicating the name of the person whose address it is. In all other
all other situations a comment is semantically interpreted as situations a comment is semantically interpreted as a single SP.
a single SP. Since a comment is allowed to contain FWS, Since a comment is allowed to contain FWS, folding is permitted
folding is permitted within it as well as immediately within it as well as immediately preceding and immediately following
preceding and immediately following it. Also note that, since it. Also note that, since quoted-pair is allowed in a comment, the
quoted-pair is allowed in a comment, the parenthesis and parenthesis and backslash characters may appear in a comment so long
backslash characters may appear in a comment so long as they as they appear as a quoted-pair. Semantically, the enclosing
appear as a quoted-pair. Semantically, the enclosing parentheses are not part of the comment content; the content is what
parentheses are not part of the comment token; the token is is contained between the two parentheses.
what is contained between the two parentheses.
Since comments have not hitherto been permitted in news Since comments have not hitherto been permitted in news articles,
articles, except in a few specified places, posters and except in a few specified places, posters and posting-agents SHOULD
posting-agents SHOULD NOT insert them except in those places. NOT insert them except in those places, namely following addresses in
However, compliant software MUST accept them in all places From and similar headers, and to indicate the name of the timezone in
where they are syntactically allowed. Date headers. However, compliant software MUST accept them in all
places where they are syntactically allowed.
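Software that has to accept comments wherever they are syntactically allowed needs to recognise nesting, quoted-pairs and quoted-strings. The following Python sketch replaces each top-level comment with a single SP, in line with the semantic rule above. The name strip_comments is hypothetical, and the sketch does not handle the special case of a comment that carries the poster's name after an address.

```python
def strip_comments(content):
    """Replace each (possibly nested) comment in a header-content by SP."""
    out = []
    depth = 0          # current comment nesting depth
    in_quotes = False  # inside a quoted-string?
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\" and i + 1 < len(content):
            # quoted-pair: keep it outside comments, skip it inside.
            if depth == 0:
                out.append(content[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth > 0:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    out.append(" ")  # the whole comment becomes one SP
        elif ch == '"':
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

Parentheses inside a quoted-string are left alone, since such strings are not comments.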
4.3.9. Undesirable Headers 4.2.5. Undesirable Headers
A header whose content is empty is said to be an empty header. A header whose content is empty is said to be an empty header.
Relaying and reading agents SHOULD NOT consider presence or Relaying and reading agents SHOULD NOT consider presence or absence
absence of an empty header to alter the semantics of an of an empty header to alter the semantics of an article (although
article (although syntactic rules, such as requirements that syntactic rules, such as requirements that certain header names
certain header names appear at most once in an article, MUST appear at most once in an article, MUST still be satisfied). Posting
still be satisfied). Posting and injecting agents SHOULD and injecting agents SHOULD delete empty headers from articles before
delete empty headers from articles before posting them; posting them; relaying agents MUST pass them untouched.
relaying agents MUST pass them untouched.
Headers that merely state defaults explicitly (e.g., a Headers that merely state defaults explicitly (e.g., a Followup-To
Followup-To header with the same content as the Newsgroups header with the same content as the Newsgroups header, or a Mime
header, or a MIME Content-Type header with contents Content-Type header with contents "text/plain; charset=us-ascii") or
"text/plain; charset=us-ascii") or state information that state information that reading agents can typically determine easily
reading agents can typically determine easily themselves (e.g. themselves (e.g. the length of the body in octets) are redundant and
the length of the body in octets) are redundant and posters posters and posting agents SHOULD NOT include them.
and posting agents SHOULD NOT include them.
4.4. Body 4.3. Body
4.4.1. Body Format Issues 4.3.1. Body Format Issues
The body of an article MAY be empty, although posting agents The body of an article MAY be empty, although posting agents SHOULD
SHOULD consider this an error condition (meriting returning consider this an error condition (meriting returning the article to
the article to the poster for revision). A posting or the poster for revision). A posting or injecting agent which does not
injecting agent which does not reject such an article SHOULD reject such an article SHOULD issue a warning message to the poster
issue a warning message to the poster and supply a non-empty and supply a non-empty body. Note that the separator line MUST be
body. Note that the separator line MUST be present even if the present even if the body is empty.
body is empty.
NOTE: Some existing news software is known to react badly to NOTE: Some existing news software is known to react badly to
body-less articles, hence the request for posting and body-less articles, hence the request for posting and injecting
injecting agents to insert a body in such cases. The sentence agents to insert a body in such cases. The sentence "This
"This article was probably generated by a buggy news reader" article was probably generated by a buggy news reader" has
has traditionally been used in this situation. traditionally been used in this situation.
Note that an article body is a sequence of lines terminated by
CRLFs, not arbitrary binary data, and in particular it MUST
end with a CRLF. However, relaying agents SHOULD treat the
body of an article as an uninterpreted sequence of octets
(except as mandated by changes of CRLF representation and by
control-message processing) and SHOULD avoid imposing
constraints on it. See also section 4.6.
4.4.2. Body Conventions Note that an article body is a sequence of lines terminated by CRLFs,
not arbitrary binary data, and in particular it MUST end with a CRLF.
However, relaying agents SHOULD treat the body of an article as an
uninterpreted sequence of octets (except as mandated by changes of
CRLF representation and by control-message processing) and SHOULD
avoid imposing constraints on it. See also section 4.5.
A body is by default an uninterpreted sequence of octets for Posters SHOULD avoid using control characters in US-ASCII (or other
most of the purposes of this standard. However, a MIME CCSs) except for tab (ASCII 9), formfeed (ASCII 12), and backspace
Content-Type header may impose some structure or intended (ASCII 8). Tab signifies sufficient horizontal white space to reach
interpretation upon it, and may also specify the character set the next of a set of fixed positions; posters are warned that there
in accordance with which the octets are to be interpreted. is no standard set of positions, so tabs should be avoided if precise
spacing is essential. Formfeed (which is sometimes referred to as the
"spoiler character") signifies a point at which a reading agent
SHOULD pause and await reader interaction before displaying further
text. Backspace SHOULD be used only for underlining, done by a
sequence of underscores (ASCII 95) followed by an equal number of
backspaces, signifying that the same number of text characters
following are to be underlined. Posters are warned that underlining
is not available on all output devices and is best not relied on for
essential meaning. Reading agents SHOULD recognize underlining and
translate it to the appropriate commands for devices that support it.
Reading agents MUST NOT pass other control characters or escape
sequences unaltered to the output device.
NOTE: The syntax does not permit the NUL octet to appear in a 4.3.2. Body Conventions
body, and the octets CR and LF MUST ONLY occur together as
CRLF. See also section 4.6 for limits on the length of a A body is by default an uninterpreted sequence of octets for most of
line. the purposes of this standard. However, a Mime Content-Type header
may impose some structure or intended interpretation upon it, and may
also specify the character set in accordance with which the octets
are to be interpreted.
It is a common practice for followup agents to enable the It is a common practice for followup agents to enable the
incorporation of the followed-up article (the "precursor") incorporation of the followed-up article (the "precursor") as a
as a quotation. This SHOULD be done by prefacing each line quotation. This SHOULD be done by prefacing each line of the quoted
of the quoted text (even if it is empty) with the character text (even if it is empty) with the character ">" (or perhaps with
">" (or preferably with "> "). This will result in multiple "> " in the case of a previously unquoted line). This will result in
levels of ">" when quoted content itself contains quoted multiple levels of ">" when quoted content itself contains quoted
content. The followup agent SHOULD also precede the quoted content, and it will also facilitate the automatic analysis of
content by an "attribution line" incorporating at least the articles.
name of the precursor's poster.
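The quoting convention above might be implemented along the following lines. This is a sketch only: quote_body is a hypothetical name, the body is assumed to be in canonical CRLF form, and the choice of a bare ">" for empty lines (to avoid trailing white space) is a design assumption rather than something the text requires.

```python
def quote_body(body):
    """Prefix every line of a precursor body with the quote character."""
    lines = body.split("\r\n")
    if lines and lines[-1] == "":
        lines.pop()  # the final CRLF produces one empty trailing element
    quoted = []
    for line in lines:
        if line == "" or line.startswith(">"):
            # Already-quoted (or empty) lines get a bare ">" so that
            # nested quotations appear as ">>", ">>>" and so on.
            quoted.append(">" + line)
        else:
            quoted.append("> " + line)
    return "".join(q + "\r\n" for q in quoted)
```

Applying the function again to its own output adds one more level of ">" to every line.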
The following convention for attribution lines, whilst not NOTE: Posters should edit quoted context to trim it down to the
mandated by this Standard, is intended to facilitate their minimum necessary. However, followup agents SHOULD NOT attempt
automatic recognition and processing by sophisticated reading to enforce this beyond issuing a warning (past attempts to do so
agents. The following fields describing the precursor SHOULD, have been found to be notably counter-productive).
if present, be in the given order.
A single Newsgroup name (the one from which the followup is The followup agent SHOULD also precede the quoted content by an
being made) enclosed within <...> or <news:...> "attribution line" (however, readers are warned not to assume that
they are accurate, especially within multiply nested quotations). The
following convention for such lines, whilst not mandated by this
standard, is intended to facilitate their automatic recognition and
processing by sophisticated reading agents. The attribution SHOULD
contain the name or the email address of the precursor's poster, as
The precursor's Message-ID enclosed within <...> or <news:...> in
Joe D. Bloggs <jdbloggs@foo.example> wrote:
or
Helmut Schmidt <helmut@bar.example> schrieb:
The precursor's poster's Name enclosed within "..." The attribution MAY contain also a single Newsgroup name (the one
from which the followup is being made), the precursor's Message-ID
and/or the precursor's Date and Time. Any of these that are present
SHOULD precede the name and/or email address. However, the inclusion
or not of such fields SHOULD always be under the control of the
poster.
The precursor's poster's Email address enclosed within <...> or To enable this line, and the Message-ID and the Email address within
<mailto:...> it, to be recognised (for example to enable suitable reading agents
to retrieve the precursor or email its poster by clicking on them),
the following conventions SHOULD be observed:
o The precursor's Message-ID SHOULD be enclosed within <...> or
<news:...>
o The precursor's poster's Email address SHOULD be enclosed within
<...>
o The various fields may be separated by arbitrary text and they
may be folded in the same way as headers, but attributions SHOULD
always be terminated by a ":" followed by CRLF.
The fields may be separated by arbitrary text, they may be Further examples:
folded in the same way as headers, and they should be
terminated by a ":" followed by two CRLFs. Example:
On <comp.foo> in <12345678@foo.com> on 24 Dec 1997 16:40:20 +0000 On comp.foo in <1234@bar.example> on 24 Dec 1997 16:40:20 +0000,
"Joe D. Bloggs" <jdbloggs@foo.bar> wrote: Joe D. Bloggs <jdbloggs@bar.example> wrote:
NOTE: The use of the standard character ">" facilitates Am 24. Dez 1997 schrieb Helmut Schmidt <helmut@bar.example>:
automatic analysis of articles. The inclusion of the
Message-ID in the attribution would enable reading agents to
retrieve the precursor by clicking on it. However, readers are
warned not to assume that attributions are accurate,
especially within multiply nested quotations.
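The attribution convention described above (in its later wording) can be illustrated with a minimal sketch. The helper name build_attribution and its parameters are hypothetical and appear nowhere in the draft; the sketch assumes English wording ("On", "in", "on", "wrote") and simply places the optional newsgroup, Message-ID and date fields before the poster's name and address, ending the line with ":".

```python
def build_attribution(name, email, newsgroup=None, message_id=None, date=None):
    """Build a one-line attribution (hypothetical helper, not from the draft).

    Optional fields, when given, precede the name and email address.
    The Message-ID and the email address are enclosed within <...>.
    """
    parts = []
    if newsgroup:
        parts.append(f"On {newsgroup}")
    if message_id:
        parts.append(f"in <{message_id}>")
    if date:
        parts.append(f"on {date}")
    prefix = " ".join(parts)
    who = f"{name} <{email}> wrote:"  # attribution ends with ":"
    return f"{prefix}, {who}" if prefix else who
```

Whether any of the optional fields is included is left to the caller, in keeping with the statement that their inclusion should be under the control of the poster.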
NOTE: Posters SHOULD edit quoted context to trim it down to A "personal signature" is a short closing text automatically added to
the minimum necessary. However, followup agents SHOULD NOT the end of articles by posting agents, identifying the poster and
attempt to enforce this beyond issuing a warning (past giving his network addresses, etc. If a poster or posting agent does
attempts to do so have been found to be notably append such a signature to an article, it MUST be preceded with a
counter-productive). delimiter line containing (only) two hyphens (ASCII 45) followed by
one SP (ASCII 32). The signature is considered to extend from the
last occurrence of that delimiter up to the end of the article (or up
to the end of the part in the case of a multipart Mime body).
Followup agents, when incorporating quoted text from a precursor,
SHOULD NOT include the signature in the quotation. Posting agents
SHOULD discourage (at least with a warning) signatures of excessive
length (4 lines is a commonly accepted limit).
A "personal signature" is a short closing text automatically NOTE: It is undesirable to have more than one personal signature
added to the end of articles by posting agents, identifying in an article body (even though the rule above admits the
the poster and giving his network addresses, etc. If a poster possibility by recognising only the last one). If, for some
or posting agent does append such a signature to an article, reason, a second signature is considered necessary, it MAY be
it MUST be preceded with a delimiter line containing (only) preceded by a different delimiter (e.g. "--- ").
two hyphens (ASCII 45) followed by one SP (ASCII 32). The [That is Clive's suggestion. Not to be included without further
signature is considered to extend from the last occurrence of support.]
that delimiter up to the end
of the part in the case of a multipart MIME body). Followup
agents, when incorporating quoted text from a precursor,
SHOULD NOT include the signature in the quotation. Posting
agents SHOULD discourage (at least with a warning) signatures
of excessive length (4 lines is a commonly accepted limit).
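As an illustration of the signature rule, a followup agent might separate the signature from the text to be quoted roughly as follows. This is a sketch only: the function name split_signature is an assumption, and the body is assumed to have been split into lines already, with the CRLF terminators removed.

```python
SIG_DELIMITER = "-- "  # two hyphens (ASCII 45) followed by one SP (ASCII 32)


def split_signature(body_lines):
    """Return (text_lines, signature_lines) for a list of body lines.

    The signature extends from the LAST delimiter line to the end,
    so the search runs backwards from the final line.
    """
    for i in range(len(body_lines) - 1, -1, -1):
        if body_lines[i] == SIG_DELIMITER:
            return body_lines[:i], body_lines[i + 1:]
    return list(body_lines), []  # no delimiter: no signature
```

Note that a line containing only "--" (without the trailing space) is not treated as a delimiter by this sketch, since the rule requires the SP.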
4.5. Characters And Character Sets 4.4. Characters and Character Sets
Transmission paths for news articles MUST treat news articles Transmission paths for news articles MUST treat news articles as
as uninterpreted sequences of octets, excluding the values 0 uninterpreted sequences of octets, excluding the values 0 (ASCII NUL)
(ASCII NUL) and 13 and 10 (ASCII CR and LF, which MUST only and 13 and 10 (ASCII CR and LF, which MUST ONLY appear in the
appear in the combination <CRLF> which denotes a line combination CRLF which denotes a line separator).
separator).
NOTE: this corresponds to the range of octets permitted for NOTE: this corresponds to the range of octets permitted for
MIME "8bit data" [RFC-2045]. Mime "8bit data" [RFC 2045]. Thus raw binary data cannot be
transmitted in an article body except by the use of a Content-
Transfer-Encoding such as base64.
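The octet restriction stated above can be checked mechanically. The following is a minimal sketch (the function name is an assumption) which rejects any NUL octet and any CR or LF that is not part of a CRLF pair.

```python
def octets_are_transportable(data: bytes) -> bool:
    """Check that data has no NUL and that CR/LF occur only as CRLF."""
    if b"\x00" in data:
        return False
    # Remove every CRLF pair; any CR or LF still present is a bare one.
    remainder = data.replace(b"\r\n", b"")
    return b"\r" not in remainder and b"\n" not in remainder
```

Raw binary data will generally fail this check, which is why a Content-Transfer-Encoding such as base64 is needed for it.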
An octet, or a sequence of octets, may represent a character An octet, or a sequence of octets, may represent a character in some
in some Coded Character Set (CCS) [RFC-2130] as determined by Coded Character Set (CCS) as determined by some Character Encoding
some Character Encoding Scheme (CES) [RFC-2130]. Scheme (CES) [RFC 2130].
If it comes to a relaying agent's attention that it is being If it comes to a relaying agent's attention that it is being asked to
asked to pass an article using the Content-Transfer-Encoding pass an article using the Content-Transfer-Encoding "8bit" to a
"8bit" to a relaying agent that does not support it, it SHOULD relaying agent that does not support it, it SHOULD report this error
report this error to its administrator. It MUST refuse to pass to its administrator. It MUST refuse to pass the article and MUST NOT
the article and MUST NOT re-encode it with different MIME re-encode it with different Mime encodings.
encodings.
NOTE: This strategy will do little harm. The target relaying NOTE: This strategy will do little harm. The target relaying
agent is unlikely to be able to make use of the article on its agent is unlikely to be able to make use of the article on its
own servers, and the usual flooding algorithm will likely find own servers, and the usual flooding algorithm will likely find
some alternative route to get the article to destinations some alternative route to get the article to destinations where
where it is needed. it is needed.
4.5.1. Character Sets within Article Headers 4.4.1. Character Sets within Article Headers
Within article headers, the CES is UTF-8 [ISO-10646 or Within article headers, the CES is UTF-8 [ISO 10646] or [RFC 2279]
RFC-2279] and hence the CCS is the Universal Multiple-Octet and hence the CCS is the Universal Multiple-Octet Coded Character Set
Coded Character Set (UCS) [ISO-10646] (which is essentially a (UCS) [ISO 10646] (which is essentially a superset of Unicode
superset of Unicode [UNICODE] and expected to remain so). [UNICODE] and expected to remain so). However, interpreting the
However, interpreting the octets directly as ASCII characters octets directly as US-ASCII characters should ensure correct
should ensure correct behaviour in most situations. behaviour in most situations.
NOTE: UTF-8 is an encoding for 16bit (and even 32bit) NOTE: UTF-8 is an encoding for 16bit (and even 32bit) character
character sets with the property that any octet less than 128 sets with the property that any octet less than 128 immediately
immediately represents the corresponding ASCII character, thus represents the corresponding US-ASCII character, thus ensuring
ensuring upwards compatibility with previous practice. upwards compatibility with previous practice. Non-ASCII
Non-ASCII characters from UCS are represented by sequences of characters from UCS are represented by sequences of octets
octets greater than 127. Only those octet sequences explicitly satisfying the syntax of a UTF8-xtra-char (2.4). Only those
permitted by [RFC 2279] shall be used. UCS includes all octet sequences explicitly permitted by [RFC 2044] shall be
characters from the ISO-8859 series of character sets used. UCS includes all characters from the ISO-8859 series of
[ISO-8859] (which includes all Greek and Arabic characters) as character sets [ISO 8859] (which includes all Greek and Arabic
well as the more elaborate characters used in Japan and China. characters) as well as the more elaborate characters used in
See the following section for the appropriate treatment of UCS Japan and China. See the following section for the appropriate
characters by reading agents. treatment of UCS characters by reading agents.
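A reading agent might apply the rules of this section to a single unfolded header line roughly as sketched below. The function name is hypothetical. The sketch relies on Python's strict UTF-8 decoder, which is an assumption about what counts as a permitted octet sequence: it rejects overlong forms and sequences longer than four octets, which may be stricter than the RFC cited in the text.

```python
def check_header_line(raw: bytes):
    """Split an unfolded header line into (name, content).

    The header-name must be pure ASCII; the content must be valid UTF-8.
    Raises ValueError (or its subclass UnicodeDecodeError) otherwise.
    """
    name, sep, content = raw.partition(b":")
    if not sep or not name:
        raise ValueError("not a header line")
    if not name.isascii():
        raise ValueError("header-name must be in ASCII")
    # Strict decoding: malformed UTF-8 raises UnicodeDecodeError.
    return name.decode("ascii"), content.strip().decode("utf-8")
```

Because every octet below 128 decodes to the corresponding ASCII character, a header consisting only of ASCII passes unchanged.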
Notwithstanding the great flexibility permitted by UTF-8, Notwithstanding the great flexibility permitted by UTF-8, there is
there is need for restraint in its use in order that the need for restraint in its use in order that the essential components
essential components of headers may be discerned using of headers may be discerned using reading agents that cannot present
reading agents that cannot present the full UCS range. In the full UCS range. In particular, header-names and tokens MUST be in
particular, header-names MUST be in ASCII, and certain other
components of headers, as defined elsewhere in this standard -
notably <identifier>s (as in <message-id>s), <date-time>s,
<domain>s <addr-spec>s and <path-item>s - MUST be in ASCII.
<Comment>s, <phrase>s (as in <address>es) and <unstructured>s
(as in <subject>s) MAY use other character sets. For
<newsgroup-name>s see below.
Where the use of non-ASCII characters, encoded in UTF-8, is US-ASCII, and certain other components of headers, as defined
permitted as above, they MAY also be encoded using the MIME elsewhere in this standard - notably msg-ids, date-times, dot-atoms,
mechanism defined in RFC-2047 [RFC-2047], but this usage is domains and path-identities - MUST be in US-ASCII. Comments, phrases
deprecated within news articles (even though it is required in (as in addresses) and unstructureds (as in Subject headers) MAY use
mail messages) since it is less legible in older reading the full range of UTF-8 characters. For newsgroup-names see 5.5.
agents which support neither it nor UTF-8. Nevertheless,
reading agents SHOULD support this usage, but only in those
contexts explicitly mentioned in [RFC-2047].
4.5.2 Character Sets within Article Bodies Where the use of non-ASCII characters, encoded in UTF-8, is permitted
as above, they MAY also be encoded using the Mime mechanism defined
in [RFC 2047], but this usage is deprecated within news articles
(even though it is required in mail messages) since it is less
legible in older reading agents which support neither it nor UTF-8.
Nevertheless, reading agents SHOULD support this usage, but only in
those contexts explicitly mentioned in [RFC 2047].
Within article bodies, the CES and CCS implied by any 4.4.2. Character Sets within Article Bodies
Content-Transfer-Encoding and Content-Type headers [RFC-2045]
SHOULD be applied by reading agents. In the absence of such Within article bodies, the CES and CCS implied by any Content-
headers, reading agents cannot be relied upon to display Transfer-Encoding and Content-Type headers [RFC 2045] SHOULD be
correctly more than the ASCII characters. [Observe that applied by reading agents. In the absence of such headers, reading
reading agents are not forbidden to "guess", or to interpret agents cannot be relied upon to display correctly more than the US-
as UTF-8 regardless, which would be the simplest course for ASCII characters.
[Observe that reading agents are not forbidden to "guess", or to
interpret as UTF-8 regardless, which would be the simplest course for
them to take.] them to take.]
NOTE: It is not expected that reading agents will necessarily NOTE: It is not expected that reading agents will necessarily be
be able to present characters in all possible character sets, able to present characters in all possible character sets,
although they MUST be able to present all ASCII characters. although they MUST be able to present all US-ASCII characters.
For example, a reading agent might be able to present only the For example, a reading agent might be able to present only the
ISO-8859-1 (Latin 1) characters [ISO-8859], in which case it ISO-8859-1 (Latin 1) characters [ISO 8859], in which case it
SHOULD present undisplayable characters using some distinctive SHOULD present undisplayable characters using some distinctive
glyph, or by exhibiting a suitable warning. Older reading glyph, or by exhibiting a suitable warning. Older reading agents
agents that do not understand MIME headers or UTF-8 should be that do not understand Mime headers or UTF-8 should be able to
able to display bodies in ASCII (with some loss of human display bodies in US-ASCII (with some loss of human
comprehensibility) except possibly when the comprehensibility) except possibly when the Content-Transfer-
Content-Transfer-Encoding is "8bit". Encoding is "8bit".
NOTE: Be warned that it will never be safe to send raw binary Followup agents MUST be careful to apply appropriate encodings to the
data in the body of news articles, because the presence of outbound followup. A followup to an article containing non-ASCII
ASCII NUL and changes of <CRLF> representation will inevitably material is very likely to contain non-ASCII material itself.
corrupt it. Such data MUST be encoded (e.g. by using
Content-Transfer-Encoding: base64).
Posters SHOULD avoid using control characters in ASCII (or 4.5. Size Limits
other CCSs) except for tab (ASCII 9), formfeed (ASCII 12), and
backspace (ASCII 8). Tab signifies sufficient horizontal white
space to reach the next of a set of fixed positions; posters
are warned that there is no standard set of positions, so tabs
should be avoided if precise spacing is essential. Formfeed
signifies a point at which a reading agent SHOULD pause and
await reader interaction before displaying further text.
Backspace SHOULD be used only for underlining, done by a
sequence of underscores (ASCII 95) followed by an equal number
of backspaces, signifying that the same number of text
characters following are to be underlined. Posters are warned
that underlining is not available on all output devices and is
best not relied on for essential meaning. Reading agents
SHOULD recognize underlining and translate it to the
appropriate commands for devices that support it. Reading
agents MUST NOT pass other control characters or escape
sequences unaltered to the output device.
Followup agents MUST be careful to apply appropriate encodings Posting agents SHOULD endeavour to keep all header lines, so far as
to the outbound followup. A followup to an article containing is possible, within 79 characters by folding them at suitable places
non-ASCII material is very likely to contain non-ASCII (see 4.2.3). However, posting agents MUST permit the poster to
material itself. include longer headers if he so insists, and compliant software MUST
support headers of at least 998 octets. Likewise, injecting agents
SHOULD fold any headers generated automatically by themselves.
Relaying agents MUST NOT fold headers (i.e. they must pass on the
folding as received).
4.6. Size Limits
The syntax provides for the lines of a body to be up to 998 NOTE: There is NO restriction on the number of lines into which
octets in length, not including the CRLF. All software a header may be split, and hence there is NO restriction on the
compliant with this standard MUST support lines of at least total length of a header (in particular it may, by suitable
that length, both in headers and in bodies, and all such folding, be made to exceed the 998 octets restriction pertaining
software SHOULD support lines of arbitrary length. In to a single header line).
particular, relaying agents MUST transmit lines of arbitrary
length without truncation or any other modification. The syntax provides for the lines of a body to be up to 998 octets in
length, not including the CRLF. All software compliant with this
standard MUST support lines of at least that length, both in headers
and in bodies, and all such software SHOULD support lines of
arbitrary length. In particular, relaying agents MUST transmit lines
of arbitrary length without truncation or any other modification.
NOTE: The limit of 998 octets is consistent with the NOTE: The limit of 998 octets is consistent with the
corresponding limit in [MESSFOR]. corresponding limit in [MESSFOR].
In plain-text messages (those with no MIME headers, or those In plain-text messages (those with no Mime headers, or those with a
with a MIME Content-Type of text/plain) posting agents SHOULD Mime Content-Type of text/plain) posting agents SHOULD endeavour to
endeavour to keep the length of body lines within some keep the length of body lines within some reasonable limit. The size
reasonable limit. The size of this limit is a matter of of this limit is a matter of policy, the default being to keep within
policy, the default being to keep within 79 characters at 79 characters at most, and preferably within 72 characters (to allow
most, and preferably within 72 characters (to allow room for room for quoting in followups). Exceptionally, posting agents SHOULD
quoting in followups). However, posting agents MUST permit NOT adjust the length of quoted lines in followups unless they are
the poster to include longer lines if he so insists. able to reformat them in a consistent manner. Moreover, posting
agents MUST permit the poster to include longer lines if he so
insists.
NOTE: Plain-text messages are intended to be displayed "as-is" NOTE: Plain-text messages are intended to be displayed "as-is"
without any special action (such as automatic line splitting) without any special action (such as automatic line splitting) on
on the part of the recipient. The policy limit (e.g. 72 or 79) the part of the recipient. The policy limit (e.g. 72 or 79)
should be expressed as a number of characters (as they will be should be expressed as a number of characters (as they will be
displayed by a reading agent) rather than as the number of displayed by a reading agent) rather than as the number of
octets used to encode them. octets used to encode them.
Posting agents SHOULD fold headers by inserting CRLF followed NOTE: This standard provides no upper bound on the overall size
by 1*WSP at positions (preferably higher-level ones - see of a single article, but neither does it forbid relaying agents
4.3.2) where this is syntactically allowed so as to keep, so from dropping articles of excessive length. It is, however,
far as is possible, all header lines within 79 characters. suggested that any limits thought appropriate by particular
Likewise, injecting agents SHOULD fold any headers generated agents would be more appropriately expressed in megabytes than
automatically by themselves. Relaying agents MUST NOT fold in kilobytes.
header lines (i.e. they must pass on the folding as received).
NOTE: There is NO restriction on the number of lines into
which a header may be split, and hence there is NO restriction
on the total length of a header (in particular it may, by
suitable folding, be made to exceed the 998 octets
restriction pertaining to a single header line).
NOTE: This standard provides no upper bound on the overall
size of a single article, but neither does it forbid relaying
agents from dropping articles of excessive length. It is,
however, suggested that any limits thought appropriate by
particular agents would be more appropriately expressed in
megabytes than in kilobytes.
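The header folding recommended in this section might be sketched as follows for a posting agent. This is a simplified illustration under stated assumptions: the function name fold_header is hypothetical, words are assumed to be separated by single spaces, every space is treated as a permissible folding point (the syntax of a particular header may allow fewer), and no preference is given to higher-level positions.

```python
def fold_header(line: str, limit: int = 79) -> str:
    """Greedily fold one header line so each physical line fits in limit.

    A fold is CRLF inserted before a space, so every continuation line
    begins with WSP and removing the CRLFs restores the original line.
    """
    words = line.split(" ")
    lines, current = [], words[0]
    for word in words[1:]:
        if len(current) + 1 + len(word) > limit:
            lines.append(current)
            current = " " + word  # continuation line starts with WSP
        else:
            current += " " + word
    lines.append(current)
    return "\r\n".join(lines)
```

A single word longer than the limit is left unbroken, consistent with the requirement that longer headers be permitted. A relaying agent would not call such a function at all, since it must pass on the folding as received.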
4.7. Example 4.6. Example
Here is a sample article: Here is a sample article:
Path: server.example,unknown.site2.example@site2.example, Path: server.example/unknown.site2.example@site2.example/
relay.site.example,site.example,injector.site.example%jsmith relay.site.example/site.example/injector.site.example%jsmith
Newsgroups: example.announce,example.chat Newsgroups: example.announce,example.chat
Message-ID: <9urrt98y53@site.example> Message-ID: <9urrt98y53@site.example>
From: Ann Example <a.example@site1.invalid> From: Ann Example <a.example@site1.example>
Subject: Announcing a new sample article. Subject: Announcing a new sample article.
Date: Fri, 27 Mar 1998 12:12:50 +1300 Date: Fri, 27 Mar 1998 12:12:50 +1300
Approved: example.announce moderator <jsmith@site.invalid> Approved: example.announce moderator <jsmith@site.example>
Followup-To: example.chat Followup-To: example.chat
Reply-To: Ann Example <a.example+replies@site1.example> Reply-To: Ann Example <a.example+replies@site1.example>
Expires: Wed, 22 Apr 1998 12:12:50 -0700 Expires: Wed, 22 Apr 1998 12:12:50 -0700
Organization: Site1, The Number one site for examples. Organization: Site1, The Number one site for examples.
User-Agent: ExampleNews/3.14 (Unix) User-Agent: ExampleNews/3.14 (Unix)
Keywords: example, announcement, standards, RFC 1036, Usefor Keywords: example, announcement, standards, RFC 1036, Usefor
Summary: The URL for the next standard. Summary: The URL for the next standard.
Just a quick announcement that a new standard example article has been Just a quick announcement that a new standard example article has
released; it is in the new USEFOR draft obtainable from ftp.ietf.org. been released; it is in the new USEFOR draft obtainable from
ftp.ietf.org.
Ann. Ann.
-- --
Ann Example <a.example@site1.invalid> Sample Poster to the Stars Ann Example <a.example@site1.example> Sample Poster to the Stars
"The opinions in this article are bloody good ones" - from J Clarke. "The opinions in this article are bloody good ones" - J. Clarke.
5. Mandatory Headers 5. Mandatory Headers
An article MUST have one, and only one, of each of the An article MUST have one, and only one, of each of the following
following headers: Date, From, Message-ID, Subject, headers: Date, From, Message-ID, Subject, Newsgroups, Path.
Newsgroups, Path.
NOTE: [MAIL] specifies (if read most carefully) that there
must be exactly one Date header and exactly one From header,
but otherwise does not restrict multiple appearances of
headers. (Notably, it permits multiple Message-ID
headers!) This appears singularly useless, or even
harmful, in the context of news, and much current news
software will not tolerate multiple appearances of mandatory
headers.
Note also that there are situations, discussed in the Note also that there are situations, discussed in the relevant parts
relevant parts of section 6, where References, Sender, of section 6, where References, Sender, or Approved headers are
or Approved headers are mandatory. In control articles, mandatory. In control messages, specific values are required for
specific values are required for certain headers. certain headers.
In the discussions of the individual headers, the content of For the overall syntax of headers, see section 4.1. In the
each is specified using the syntax notation. The convention discussions of the individual headers, the content of each is
used is that the content of, for example, the Subject header specified using the syntax notation. The convention used is that the
is defined as <Subject-content>. content of, for example, the Subject header is defined as <Subject-
content>.
NOTE: see also Section 7.1.1 A proto-article (see 8.2.1) may lack some of these mandatory headers,
but they MUST then be supplied by the injecting agent.
5.1. Date 5.1. Date
The Date header contains the date and time that the article The Date header contains the date and time that the article was
was submitted for transmission. The content syntax is prepared by the poster ready for transmission and SHOULD express the
defined in the Message Format Standard [MESSFOR]. poster's local time. The content syntax makes use of syntax defined
in [MESSFOR].
Date-content = date-time Date-content = date-time
5.2. From NOTE: It is a useful convention to follow the date-time with a
comment containing the time zone in human-readable form. The use
of folding in a date-time is deprecated, even though permitted
by [MESSFOR].
The From header contains the electronic address(es), and In order to prevent the reinjection of expired articles into the news
possibly the full name, of the article's author(s). The stream, relaying and serving agents MUST refuse articles whose Date
format of the From header is defined in the Message Format header predates the earliest articles of which they normally keep
Standard [MESSFOR]. record, or which is more than 24 hours into the future (though they
All mailboxes in the From-content field MUST either belong to the MAY use a margin less than that 24 hours). Relaying agents MUST NOT
poster(s) of the article (or the poster(s) are authorized by modify the Date header in transit.
the owners to use the mailboxes) or end in the top level
domain of ".invalid". 5.1.1. Examples
Date: Fri, 2 Apr 1999 20:20:51 -0500 (EST)
Date: 26 May 1999 16:13 +0000
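The Date check required of relaying and serving agents in the later revision might look roughly like the sketch below. The function and parameter names are assumptions: oldest_kept stands for the date of the earliest articles of which the agent normally keeps record, and a Date without usable zone information is treated as UTC purely for the purpose of comparison.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def date_is_acceptable(date_content, now, oldest_kept,
                       future_margin=timedelta(hours=24)):
    """Return True if the article's Date is neither stale nor too far ahead.

    now and oldest_kept must be timezone-aware datetimes.
    """
    try:
        when = parsedate_to_datetime(date_content)
    except (TypeError, ValueError):
        return False  # unparseable Date: refuse
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
    return oldest_kept <= when <= now + future_margin
```

An agent may pass a smaller future_margin, since a margin of less than 24 hours is permitted.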
5.2. From
The From header contains the electronic address(es), and possibly the
full name, of the article's author(s). The content syntax makes use
of syntax defined in [MESSFOR], subject to the following revised
definition of local-part.
From-content = mailbox-list From-content = mailbox-list
addr-spec = local-part "@" domain
local-part = dot-atom / strict-quoted-string
5.2.1 Examples: NOTE: This syntax ensures that the local-part of an addr-spec is
restricted to pure US-ASCII (and is thus in strict compliance
with [MESSFOR]), whilst allowing any UTF-8 character to be used
in a preceding quoted-string containing the author's full name.
If some future extension to the Mail protocols should relax this
restriction, one would expect the Netnews protocols to follow.
Any mailbox in the From-content MUST belong to one of the poster(s)
of the article, or be a mailbox which he is authorized by its owner
to use, or be an address which ends in the top level domain of
".invalid" [RFC 2606].
5.2.1. Examples:
From: John Smith <jsmith@site.example> From: John Smith <jsmith@site.example>
From: John Smith <jsmith@site.example>, dave@isp.example From: "John Smith" <jsmith@site.example>, dave@isp.example
From: John Smith <jsmith@site.example>, andrew@isp.example, From: "John D. Smith" <jsmith@site.example>, andrew@isp.example,
fred@site2.example fred@site2.example
From: Jan Jones <jan@please_setup_your_software_correctly.invalid> From: Jan Jones <jan@please_setup_your_system_correctly.invalid>
From: Jan Jones <joe@anonymous.invalid> From: Jan Jones <joe@guess-where.invalid>
From: dave@isp.example (Dave Smith) From: dave@isp.example (Dave Smith)
NOTE: the last example is in an obsolete syntax. NOTE: the last example shows a now deprecated convention of
putting an author's full name in a comment following the
mailbox, rather than in a phrase at the start of that mailbox.
Observe that the quotes around the "John D. Smith" example were
required, on account of the '.' character, and they would also
have been required had any UTF8-xtra-char been present.
5.3. Message-ID 5.3. Message-ID
The Message-ID header contains the article's message ID, a The Message-ID header contains the article's message identifier, a
unique identifier distinguishing the article from every unique identifier distinguishing the article from every other
other article. The format of the Message-ID header is defined article. The content syntax makes use of syntax defined in [MESSFOR],
in the Message Format Standard [MESSFOR]. An article's subject to the following revised definition of no-fold-quote.
message ID MUST be unique and MUST NEVER be reused.
Message-ID-content = msg-id Message-ID-content = msg-id
id-left = dot-atom-text / no-fold-quote
no-fold-quote = DQUOTE *( strict-qtext / strict-quoted-pair )
NOTE: This syntax ensures that a msg-id is restricted to pure
US-ASCII (and is thus in strict compliance with [MESSFOR]).
Following the provisions of [MESSFOR], an agent generating an
article's message identifier MUST ensure that it is unique and that
it is NEVER reused. Moreover, even though commonly derived from the
domain name of the originating site (and domain names are case-
insensitive), a message identifier MUST NOT be altered in any way
during transport, or when copied (as into a References header), and
thus a simple (case-sensitive) comparison of octets will always
suffice to recognise that same message identifier wherever it
subsequently reappears.
NOTE: some old software may treat message identifiers that
differ only in case within their id-right part as equivalent,
and implementors of agents that generate message identifiers
should be aware of this.
5.4. Subject 5.4. Subject
The Subject field contains a short string identifying the The Subject header contains a short string identifying the topic of
topic of the message. When used in a followup, the field body the message. This is an inheritable header (4.2.2.2) to be copied
SHOULD start with the string "Re: " (a "back reference") into the Subject header of any followup, in which case the new
followed by the contents of the pure-subject of the precursor. header-content SHOULD then default to the string "Re: " (a "back
reference") followed by the contents of the pure-subject of the
precursor. Any leading "Re: " in the pure-subject MUST be stripped.
subject-content = [ back-reference ] pure-subject CRLF Subject-content = [ back-reference ] pure-subject
pure-subject = nonblank-text pure-subject = 1*( [FWS] utext )
back-reference = %x52.65.3A.20 ; which is a case-sensitive back-reference = %x52.65.3A.20
"Re: " ; which is a case-sensitive "Re: "
The pure-subject MUST NOT begin with "Re: ". The default The pure-subject MUST NOT begin with "Re: ".
subject-content of a followup is the string "Re: " followed by
the contents of the pure-subject of the precursor. Any leading
"Re: " in the pure-subject MUST be stripped.
Followup agents MAY remove instances of non-standard NOTE: The given syntax differs from that prescribed in [MESSFOR]
back-reference (such as "Re(2): ", "Re:", "RE: ", or "Sv: ") insofar as it does not permit a header content to be completely
from the subject-content when composing the subject of a empty, or to consist of WSP only (see remarks in 4.2.5
followup and add a correct back-reference in front of the concerning undesirable headers).
result.
Followup agents MAY remove instances of non-standard back-reference
(such as "Re(2): ", "Re:", "RE: ", or "Sv: ") from the Subject-
content when composing the subject of a followup and add a correct
back-reference in front of the result.
NOTE: that would be "SHOULD remove instances" except that we NOTE: that would be "SHOULD remove instances" except that we
cannot find a sufficiently robust and simple algorithm to do cannot find a sufficiently robust and simple algorithm to do the
the necessary natural language processing. necessary natural language processing.
Followup agents MUST NOT use any other string except "Re: " as
a back reference. Specifically, a translation of "Re: " into a
local language or usage MUST NOT be used.
Agents SHOULD NOT depend on nor enforce the use of back Followup agents MUST NOT use any other string except "Re: " as a back
references by followup agents. For compatibility with legacy reference. Specifically, a translation of "Re: " into a local
news software the subject-content of a control message MAY language or usage MUST NOT be used.
start with the string "cmsg ", non-control messages MUST NOT
start with the string "cmsg ".
5.4.1 Examples: NOTE: "Re" is an abbreviation for the Latin "In re", meaning "in
the matter of", and not an abbreviation of "Reference" as is
sometimes erroneously supposed.
In the following examples, please note that only "Re: " is Agents SHOULD NOT depend on nor enforce the use of back references by
mandated by this DRAFT. "was: " is a convention used by many followup agents. For compatibility with legacy news software the
English-speaking posters to signal a change in subject matter. Subject-content of a control message (i.e. an article that also
Software should be able to deduce this information from contains a Control header) MAY start with the string "cmsg ", and
References. non-control messages MUST NOT start with the string "cmsg ". See also
section 6.11.
Subject: Film at 11. 5.4.1. Examples
In the following examples, please note that only "Re: " is mandated
by this standard. "was: " is a convention used by many English-
speaking posters to signal a change in subject matter. Software
should be able to deduce this information from References.
Subject: Film at 11
Subject: Re: Film at 11 Subject: Re: Film at 11
Subject: Use of Godwin's law considered harmful (was: Film at 11) Subject: Godwin's law considered harmful (was: Film at 11)
Subject: Godwin's law (was: Film at 11) Subject: Godwin's law (was: Film at 11)
Subject: Re: Godwin's law (was: Film at 11) Subject: Re: Godwin's law (was: Film at 11)
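The back-reference rule can be illustrated with a minimal sketch. It is not part of either draft. The function name is hypothetical, and it assumes that a subject-content already beginning with "Re: " is left unchanged and that the match is case sensitive.

```python
BACK_REFERENCE = "Re: "


def followup_subject(subject_content: str) -> str:
    """Return a subject-content for a followup (hypothetical helper)."""
    # Assumption: only the exact string "Re: " counts as a back reference.
    # Translated or variant forms are neither recognised nor stripped here.
    if subject_content.startswith(BACK_REFERENCE):
        return subject_content
    return BACK_REFERENCE + subject_content
```

No attempt is made to strip other prefixes, in line with the NOTE on the lack of a robust algorithm for doing so.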
5.5. Newsgroups 5.5. Newsgroups
The Newsgroups header's content specifies which newsgroup(s) The Newsgroups header's content specifies which newsgroup(s) the
the article is posted to: article is posted to. It is an inheritable header (4.2.2.2) which
SHOULD then become the default Newsgroups header of any followup,
unless a Followup-To header is present to prescribe otherwise.
Newsgroups-content = newsgroup-name Newsgroups-content = newsgroup-name
*( ng-delim *FWS newsgroup-name ) *FWS *( *FWS ng-delim *FWS newsgroup-name )
newsgroup-name = component *FWS
*( "." component ) newsgroup-name = component *( "." component )
component = component-start component = component-start
*( component-start / component-other ) *( component-start / component-other )
component-start = Un-lowercase / Un-digit component-start = Un-lowercase / Un-digit
Un-lowercase = <Unicode Letter, Lowercase> / Un-lowercase = <Unicode Letter, Lowercase> /
<Unicode Letter, Other> <Unicode Letter, Other>
Un-uppercase = <Unicode Letter, Uppercase> /
<Unicode Letter, Titlecase>
Un-digit = <Unicode Number, Decimal Digit> / Un-digit = <Unicode Number, Decimal Digit> /
<Unicode Number, Other> <Unicode Number, Other>
component-other = "+" / "-" / "_" component-other = "+" / "-" / "_"
ng-delim = "," ng-delim = ","
where the <Unicode ...> items are as described in [UNICODE]. where the <Unicode ...> items are as described in [UNICODE].
An article's Newsgroups header may not contain a duplicated The inclusion of folding white space within a Newsgroups-content is a
newsgroup-name component. newly introduced feature in this standard. It MUST be accepted by all
conforming implementations (relaying agents, serving agents and
reading agents). Posting agents should be aware that such postings
The inclusion of folding white space within a newsgroup-name may be rejected by overly-critical old-style relaying agents. When a
is a newly introduced feature in this standard. It MUST be sufficient number of relaying agents are in conformance, posting
accepted by all conforming implementations (relaying agents, agents SHOULD generate such whitespace in the form of <CRLF WS> so as
serving agents and reading agents). Posting agents should be
aware that except for experimental posting to 'test' newsgroups
or within cooperating subnets, such postings may be rejected by
overly-critical old-style relaying agents. When a sufficient
number of relay agents are in conformance, posting agents
SHOULD generate such whitespace in the form of <CRLF WS> so as
to keep the length of lines in the relevant headers (notably to keep the length of lines in the relevant headers (notably
Newsgroups and Followup-To) to no more than 79 characters Newsgroups and Followup-To) to no more than 79 characters (or
(or other agreed policy limit - see 4.6). Before such critical other agreed policy limit - see 4.5). Before such critical mass
mass occurs, injecting agents MAY reformat such headers by removing occurs, injecting agents MAY reformat such headers by removing
whitespace inserted by the posting agent, but relaying agents whitespace inserted by the posting agent, but relaying agents MUST
MUST NOT do so. NOT do so.
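As an illustration only, a posting agent might fold a long Newsgroups header as sketched below. The function name and parameters are hypothetical. The sketch assumes that a fold is placed after an ng-delim, that the continuation line begins with a single space, and that the 79-character limit excludes the CRLF.

```python
def fold_newsgroups(names, header="Newsgroups", limit=79):
    """Build a folded header line from a list of newsgroup-names."""
    lines = []
    current = header + ": "
    has_name = False  # whether the current line already holds a name
    for i, name in enumerate(names):
        piece = name + ("," if i < len(names) - 1 else "")
        # Fold (CRLF followed by one space) before a piece that would
        # push the current line past the limit.
        if has_name and len(current) + len(piece) > limit:
            lines.append(current)
            current = " "
        current += piece
        has_name = True
    lines.append(current)
    return "\r\n".join(lines)
```

A single newsgroup-name longer than the limit is emitted on a line of its own and still exceeds it; the sketch does not try to split names.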
A newsgroup name consists of one or more components. A newsgroup-name consists of one or more components. Components MAY
Components MAY contain non-ASCII letters, but these MUST be contain non-ASCII letters, but these MUST be encoded in UTF-8 and not
encoded in UTF-8 and not according to RFC-2047. A component according to [RFC 2047]. A component MUST contain at least one
MUST contain at least one letter (and must, according to the letter (and MUST, according to the syntax, begin with a letter or
syntax, begin and end with a letter or digit). Components digit). Components SHOULD begin with a letter. Composite characters
SHOULD begin with a letter. Composite characters (made by (made by overlaying one character with another) and format
overlaying one character with another) and format characters, characters, as allowed in certain parts of Unicode and needed by
as allowed in certain parts of Unicode and needed by certain certain languages, must use whatever canonical conventions apply to
languages, must use whatever canonical conventions apply to those parts of Unicode (such conventions are not defined in this
those parts of Unicode (such conventions are not Standard). The use of "_" in a component is deprecated. Serving
defined in this Standard). The use of "_" in a component is agents MAY refuse to accept newsgroups using such a component.
deprecated. Serving agents MAY refuse to accept newsgroups
using that component.
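The component rules can be checked mechanically. The following sketch is an assumption about how the grammar maps onto Unicode general categories: Un-lowercase is taken as Ll or Lo, and Un-digit as Nd or No. The function names are hypothetical. It also applies the rule that a component contains at least one letter, but not the deprecation of "_".

```python
import unicodedata

LETTER_CATEGORIES = {"Ll", "Lo"}   # assumed mapping for Un-lowercase
DIGIT_CATEGORIES = {"Nd", "No"}    # assumed mapping for Un-digit
COMPONENT_OTHER = set("+-_")


def is_component_start(ch):
    return unicodedata.category(ch) in LETTER_CATEGORIES | DIGIT_CATEGORIES


def is_valid_component(component):
    # Must be non-empty and begin with a component-start character.
    if not component or not is_component_start(component[0]):
        return False
    # Remaining characters are component-start or component-other.
    if not all(is_component_start(c) or c in COMPONENT_OTHER
               for c in component[1:]):
        return False
    # At least one letter is required, so all-digit components fail.
    return any(unicodedata.category(c) in LETTER_CATEGORIES
               for c in component)


def is_valid_newsgroup_name(name):
    return all(is_valid_component(c) for c in name.split("."))
```

Uppercase letters fall outside the assumed categories and are therefore rejected, although the text does not oblige software to make this check.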
NOTE: Components composed entirely of digits would cause NOTE: Components composed entirely of digits would cause
problems for the commonly used implementation technique of problems for the commonly used implementation technique of using
using the component as the name of a directory, whilst also the component as the name of a directory, whilst also using
using sequential numbers to distinguish the articles within a sequential numbers to distinguish the articles within a group.
group. Components containing other non-permitted characters could cause
problems when newsgroup-names appear in URLs [RFC 1738] (for
example an '@' character would prevent distinguishing between
newsgroup-names and message identifiers).
NOTE: Uppercase letters MUST NOT be used. Although converting NOTE: According to the syntax, uppercase letters cannot occur in
ASCII uppercase letters to their lowercase counterparts is newsgroup-names, but this standard imposes no requirement on
straightforward enough, it would be unreasonable to expect software to check this condition, since it would be unreasonable
software to do the same in parts of Unicode for which it was to expect it to do so in parts of Unicode for which it was not
not configured (in general, a table lookup would be required). configured (in general, a table lookup is required). Rather, it
Thus software MAY attempt to convert Un-uppercase letters is the responsibility of those creating new newsgroups (7.1) not
according to the mappings defined by [UNICODE], but this to violate it. It is, moreover, to be expected that a newsgroup
behaviour is not required. created in violation of this condition will not be propagated
particularly well.
Whilst there is no longer any technical reason to limit the Whilst there is no longer any technical reason to limit the length of
length of a component (formerly, it was limited to 14 a component (formerly, it was limited to 14 characters) nor to limit
characters) nor to limit the total length of a newsgroup-name, the total length of a newsgroup-name, it should be noted that these
it should be noted that these names are also used in the names are also used in the newsgroups line (7.1.2) where an overall
newsgroups line (6.6.1.2) where an overall limit applies, and policy limit applies, and moreover excessively long names can be
moreover excessively long names can be exceedingly exceedingly inconvenient in practical use. Agencies responsible for
inconvenient in practical use. Agencies responsible for individual hierarchies SHOULD therefore, as a matter of policy, set
individual hierarchies SHOULD therefore, as a matter of reasonable limits for the length of a component and of a newsgroup-
policy, set reasonable limits for the length of a component name. In the absence of such explicit policies, the default figures
and of a newsgroup name. In the absence of such explicit are 30 characters and 71 characters respectively.
policies, the default figures are 30 characters and 72 [If the checkpolicies proposal is included in the Standard, there should
characters respectively. be a reference to it here.]
NOTE: The newsgroup-name as encoded in UTF-8 should be NOTE: The newsgroup-name as encoded in UTF-8 should be regarded
regarded as the canonical form. Reading agents may convert it as the canonical form. Reading agents may convert it to whatever
to whatever character set they are able to display (see 4.5.2) character set they are able to display (see 4.4.1) and serving
and serving agents may possibly need to convert it to some agents may possibly need to convert it to some form more
form more suitable as a filename. Simple algorithms for both suitable as a filename. Simple algorithms for both kinds of
kinds of conversion are readily available. conversion are readily available. Observe that the syntax does
not allow comments within the Newsgroups header; this is to
simplify processing by relaying and serving agents which have a
requirement to process this header extremely rapidly.
Posters SHOULD use only the names of existing newsgroups in Posters SHOULD use only the names of existing newsgroups in the
the Newsgroups header, because newsgroups are not created Newsgroups header. However, it is legitimate to cross-post to
simply by being posted to. However, it is legitimate to newsgroup(s) which do not exist on the posting agent's host, provided
cross-post to newsgroup(s) which do not exist on the posting that at least one of the newsgroups DOES exist there, and followup
agent's host, provided that at least one of the newsgroups agents SHOULD accept this (posting agents MAY accept it, but SHOULD
DOES exist there, and followup agents MUST accept this at least alert the poster to the situation and request confirmation).
(posting agents MAY accept it, but SHOULD at least alert the Relaying agents MUST NOT rewrite Newsgroups headers in any way, even
poster to the situation and request confirmation). Relaying if some or all of the newsgroups do not exist on the relaying agent's
agents MUST NOT rewrite Newsgroups headers in any way, even if host. Serving agents MUST NOT create new newsgroups simply because an
some or all of the newsgroups do not exist on the relaying unrecognised newsgroup-name occurs in a Newsgroups header (see 7.1
agent's host. for the correct method of newsgroup creation).
5.5.1 Forbidden newsgroup names The Newsgroups header is intended for use in Netnews articles rather
than in mail messages. It MAY be used in a mail message to indicate
that it is a copy also posted to the listed newsgroups, but it SHOULD
NOT be used in a mail-only reply to a Netnews article (thus the
"inheritable" property of this header applies only to followups to a
newsgroup, and not to followups to the poster). Moreover, if a
newsgroup-name contains any non-ASCII character, it MAY be encoded
using the mechanism defined in [RFC 2047] when sent by mail but, if
it is subsequently returned to the Netnews environment, it MUST then
be re-encoded into UTF-8.
The following newsgroup-names MUST NOT be used: 5.5.1. Forbidden newsgroup names
Newsgroup-names having only one component (reserved for The following forms of newsgroup-name MUST NOT be used except for the
newsgroups whose propagation is restricted to a single host, specific purposes indicated:
or local network, and for pseudo-newsgroups such as "poster"
(because it has special meaning in the Followup-To header (see
section 6.1)), "newsgroups" (likewise), "junk" (frequently
used for pseudo-newsgroups internal to serving agents)
and "control" (likewise).
Any newsgroup-name beginning with "control." (Used as a o Newsgroup-names having only one component. These are reserved for
pseudo-newsgroup by many serving agents.) newsgroups whose propagation is restricted to a single host or
local network, and for pseudo-newsgroups such as "poster" (which
has special meaning in the Followup-To header - see section 6.7),
"junk" (often used by serving agents), "control" (likewise),
"revise" and "repost" (which have special meanings in the Xref
header - see 6.14)
Any newsgroup-name containing the component "ctl" (likewise) o Any newsgroup-name beginning with "control." (used as pseudo-
newsgroups by many serving agents)
o Any newsgroup-name containing the component "ctl" (likewise)
o "to" or any newsgroup-name beginning with "to." (reserved for the
ihave/sendme protocol described in section 7.6, and for test
messages sent on an essentially point-to-point basis)
o Any newsgroup-name containing the component "all" (because this
is used as a wildcard in some implementations)
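The forbidden forms lend themselves to a simple check. The sketch below is illustrative only. The function name and the wording of the returned reasons are hypothetical, and it does not model the specific purposes for which the reserved forms are permitted.

```python
def forbidden_reason(newsgroup_name):
    """Return a short reason if the name has a reserved form, else None."""
    components = newsgroup_name.split(".")
    if len(components) == 1:
        # Covers "poster", "junk", "control", "to" and the like.
        return "single component (reserved for local or pseudo-newsgroups)"
    if components[0] == "control":
        return 'begins with "control."'
    if components[0] == "to":
        return 'begins with "to."'
    if "ctl" in components:
        return 'contains the component "ctl"'
    if "all" in components:
        return 'contains the component "all"'
    return None
```

Comparison is made on whole components, so a name such as "tools.misc" is not caught by the "to." rule.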
"to" or any newsgroup-name beginning with "to." (reserved for A newsgroup-name SHOULD NOT appear more than once in the Newsgroups
test messages sent on an essentially point-to-point basis (see header. The order of newsgroup names in the Newsgroups header is not
also the ihave/sendme protocol described in section 7.2)) significant, except for determining which moderator to send the
article to if one of the groups is moderated (see 8.2).
Any newsgroup-name containing the component "all" (because 5.6. Path
this is used as a wildcard in some implementations)
A newsgroup MUST NOT appear more than once in the Newsgroups The Path header shows the route taken by a message since its entry
header. The order of newsgroup names in the Newsgroups into the Netnews system. It is a variant header (4.2.2.4), each agent
header is not significant. that processes an article being required to add one (or more) entries
to it. This is primarily to enable relaying agents to avoid sending
articles to sites already known to have them, in particular the site
they came from, and additionally to permit tracing the route articles
take in moving over the network, and for gathering Usenet statistics.
Finally the presence of a '%' delimiter in the Path header can be
used to identify an article injected in conformance with this
standard.
5.6 Path 5.6.1. Format
The Path header shows the route a message took from its entry Path-content = *( path-identity [FWS] delimiter [FWS] )
into the USENET system to the current system. It is a list of tail-entry *FWS
site identifiers with the origin on the right. Each relaying, path-identity = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
injecting or serving agent that processes the article adds one delimiter = "/" / "?" / "%" / "," / "!"
or more entries to this header. Aside from tracing the route tail-entry = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
articles take in moving over the network, Path is used
primarily to allow relaying systems to not send articles to
sites known to already have them, in particular the site they
came from. This improves the efficiency of links. Path is
also used for USENET statistics gathering and flow tracking.
Finally the presence of a "%" delimiter in the Path header can
be used to identify an article injected in conformance with
this standard.
5.6.1 Format NOTE: A Path-content will inevitably contain at least one path-
identity, except possibly in the case of a proto-article that
has not yet been injected onto the network.
path-content = old-path / new-path NOTE: Observe that the syntax does not allow comments within the
Path header; this is to simplify processing by relaying and
injecting agents which have a requirement to process this header
extremely rapidly.
old-id = 1*( ALPHA / digit / "-" / "." / "_") A relaying agent SHOULD NOT pass an article to another relaying agent
old-path = old-id *(punctuation old-id) whose path-identity (or some known alias thereof) already appears in
punctuation = LWSP / %x21-2f / %x3a-40 / %x5b-60 / %x7b-7f the Path-content. Since the comparison may be either case sensitive
; These are ! " # $ % & ' ( ) * or case insensitive, relaying agents SHOULD NOT generate a name which
; + , - . / : ; < = > ? @ [ \ differs from that of another site only in terms of case.
; ] ^ _ ` { | } ~ DEL
new-delims = [FWS] ("@" / "/" / "," ) [FWS]
new-path = post-injection "%" pre-injection
delim-plus-id = [FWS] "!" [FWS] old-id
/ new-delims site-id
post-injection = *(site-id 1*new-delims) site-id
pre-injection = site-id *delim-plus-id
site-id = ALPHA word ; UUCP name
/ ALPHA ; for "x" tail entry
/ "." word ; other registered name
/ <FQDN> ; as per RFC 1034
/ <dotted-quad> ; numeric IP address rep
; specified in rfc820 etc.
/ "[" dotted-quad "]"
/ "[" <ipv6-numeric> "]" ; per RFC1884
word = 1*(ALPHA / digit / "-" / "_")
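For illustration, the newer Path-content grammar (path-identity, delimiter, tail-entry) can be parsed along the following lines, and the result used to decide whether a peer already appears in the path. This is a sketch under stated assumptions: the names are hypothetical, FWS is approximated by the regular-expression class \s, and identities are compared case-insensitively.

```python
import re

# Character classes transcribed from the Path-content grammar.
_ENTRY = r"[A-Za-z0-9\-.:_]+"   # path-identity and tail-entry
_DELIM = r"[/?%,!]"             # delimiter

_PATH_RE = re.compile(rf"(?:{_ENTRY}\s*{_DELIM}\s*)*{_ENTRY}\s*")
_TOKEN_RE = re.compile(rf"({_ENTRY})\s*({_DELIM}?)")


def parse_path(path_content):
    """Return ([(path_identity, delimiter), ...], tail_entry)."""
    if not _PATH_RE.fullmatch(path_content):
        raise ValueError("not a valid Path-content")
    tokens = _TOKEN_RE.findall(path_content)
    # The last token has no delimiter after it: it is the tail-entry.
    return tokens[:-1], tokens[-1][0]


def already_in_path(path_content, peer_identities):
    """True if any of the peer's identities (or aliases) is in the path."""
    entries, _tail = parse_path(path_content)
    seen = {identity.lower() for identity, _delim in entries}
    return any(peer.lower() in seen for peer in peer_identities)
```

The tail-entry is deliberately left out of the comparison, since it is not to be interpreted as a site.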
5.6.2 Adding an entry to the Path header. A relaying agent MAY decline to accept an article if its own path-
identity is already present in the Path-content or if the Path-
content contains some path-identity whose articles the relaying agent
does not want, as a matter of local policy.
When a system receives a message from another system, it MUST NOTE: This last facility is sometimes used to detect and decline
add its own unique name (path-identity or site-id) and a control messages (notably cancel messages) which have been
delimiter to the beginning of the Path string. In addition, if deliberately seeded with a path-identity to be "aliased out" by
needed, folding-whitespace MAY be added. sites not wishing to act upon them.
The path-identity added MUST be unique. To this end it should 5.6.2. Adding a path-identity to the Path header
be one of:
1. A name registered previously in the UUCP maps database When an injecting, relaying or serving agent receives an article, it
(found in the newsgroup comp.mail.maps), containing no dot MUST prepend its own path-identity followed by a delimiter to the
character. beginning of the Path-content. In addition, it SHOULD then add CRLF
and WSP if it would otherwise result in a line longer than 79
characters.
3. The fully qualified domain name or MX record, retrievable The path-identity added MUST be unique to that agent. To this end it
via the Internet DNS service. SHOULD be one of:
4. An encoding of an IP address -- dotted quad or for IPv6 as 1. A fully qualified domain name (FQDN) associated (by the Internet
per RFC1884. These encodings SHOULD NOT be used prior to DNS service [RFC 1034]) with an A record, which SHOULD identify
draft-implementation-date. the actual machine prepending this path-identity. Ideally, this
FQDN should also be "mailable" in the sense that it enables the
construction of a valid E-mail address of the form "usenet@<FQDN>"
or "news@<FQDN>" [RFC 2142] whereby the administrators of that
agent may be reached.
Whichever form is chosen, a site SHOULD use a form which can be 2. A fully qualified domain name (FQDN) associated (by the Internet
verified using one of the schemes described below by all sites DNS service) with an MX record which MUST then enable the
to which it will forward news articles. If all forwarding is by construction of a valid E-mail address of the form "usenet@<FQDN>"
NNTP or other internet based protocols, then the FQDN or IP or "news@<FQDN>" whereby the administrators of that agent may be
address encodings are advised. For the purposes of comparison, reached.
FQDN entries should be put in an all-lower-case canonical form.
Because RFC1036 specified any punctuation or whitespace could 3. A name registered previously in the UUCP maps database (found in
act as delimiter, programs SHOULD accept this, with the the newsgroup comp.mail.maps), containing no '.' character.
exception that IPv6 addresses containing colons MUST be treated
as a single unit. Modern programs MUST generate only the set
"!,%@" plus optional additional whitespace.
When a site receives an article from another site, it SHOULD 4. An encoding of an IP address - <dotted-quad> [RFC 820] or <ipv6-
(and eventually MUST) verify the identity of numeric> [RFC 2373] (the requirement to be able to use an <ipv6-
the source site. When processing an article from a source, the numeric> is the reason for including ':' as an allowed character
leftmost entry of the Path line should be extracted, converted within a path-identity).
to a canonical form, and tested to see if it matches the
canonical form of the verified identity of the source. If it
does, a "," should be used as the delimiter, and thus the
comma, and then the receiving site's path-identity MUST be
prepended to the Path line.
The method of verification is up to the site. Any method of 5. A '.' followed by an arbitrary name not in the UUCP maps database,
suitable authenticity may be chosen, with the consideration but believed to be unique and registered at least with all sites
that in the event of problems at the source site, the relaying immediately downstream from the given site.
site may be called upon to reliably identify it.
If the leftmost entry does not match the verified identity of Of the above options, nos. 1 to 3 are much to be preferred, unless
the source, then the receiving site should prepend an "@" there are strong technical reasons dictating otherwise. In
delimiter, then a simple form of the verified identity of the particular, the injecting agent's path-identity MUST, as a special
source, then a "," delimiter and then the receiving site's own case, be an FQDN mailable address in the sense defined under option
path-identity. This adding of two identities to the line 1, or with an associated MX record as in option 2.
MUST NOT be done if the provided and verified identities
match. For articles received from an internet source, the
unique IPv4 (or IPv6) address or properly verified FQDN, whichever
is shorter, is encouraged for the generated ID.
5.6.3 The tail Entry The injecting agent's path-identity MUST be followed by the special
delimiter '%' which serves to separate the pre-injection and post-
injection regions of the Path-content (see 5.6.3).
For historical reasons, the rightmost entry in the Path string In the case of a relaying or serving agent, the delimiter is chosen
generated by most systems is not a site name, but a "user as follows. When such an agent receives an article, it MUST
name". However, the Path string is not an E-mail address and establish the identity of the source and compare it with the leftmost
MUST NOT be used to contact the user. Injecting agents MAY path-identity of the Path-content. If it matches, a '/' should be
place any string here that is not a path-identity. If no used as the delimiter when prepending the agent's own path-identity.
meaning is anticipated the string "x" SHOULD be used. If it does not match then the agent should prepend two entries to the
Path-content; firstly the true established path-identity of the
source followed by a '?' delimiter, and then, to the left of that,
the agent's own path-identity followed by a '/' delimiter as usual.
RFC1036 suggested that the last entry could be a site name,
requiring software to check it when feeding, but said it also
should have a user-id for very old systems. As of this
specification, a system MUST NOT treat the tail entry as a
path-identity.
Typically this field will be the only entry on the Path string This prepending of two entries SHOULD NOT be done if the provided and
generated by a poster, or if not generated by the established identities match.
posting-agent, by the injecting agent, which will prepend a "%"
and then its own verifiable path-identity. The percent divides
the verified part of the Path line from any entries provided
prior to injection into the news network. There may be more
than one entry to the left of the percent, and all but the last
are to be treated as sites.
Injecting Agents SHOULD use the tail entry for local Any method of establishing the identity of the source may be used
authentication information on the source of an article. For (but see 5.6.5 below), with the consideration that, in the event of
example, if they wish to store an encoding of the IP address of problems, the agent concerned may be called upon to justify it.
a source machine connecting to do the injection, and/or the UID
of an invoking user or any other such information, they may
encode it in the tail entry, provided they do so in a manner
that will not match any site identifier (e.g. ending with a
dot).
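The choice between the '/' and '?' delimiters when a relaying or serving agent prepends its path-identity can be sketched as follows. This is an illustration, not a definitive implementation. The function and parameter names are hypothetical, the comparison is assumed to be case-insensitive, and folding of long lines is not handled.

```python
import re


def prepend_path_identity(path_content, own_identity, source_identity,
                          aliases=()):
    """Prepend own_identity, adding the true source if it was misstated."""
    # The leftmost path-identity is what the source claims to be.
    leftmost = re.match(r"[A-Za-z0-9\-.:_]+", path_content)
    claimed = leftmost.group(0).lower() if leftmost else ""
    known = {source_identity.lower()} | {a.lower() for a in aliases}
    if claimed in known:
        # Claimed and established identities agree: one entry suffices.
        return f"{own_identity}/{path_content}"
    # Otherwise record the established source followed by '?', and our
    # own identity followed by '/' to the left of it.
    return f"{own_identity}/{source_identity}?{path_content}"
```

An injecting agent would instead follow its own path-identity with '%', which this sketch does not cover.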
5.6.4 The Injecting Agent Entry NOTE: The use of the '%' delimiter marks the position of the
injecting agent in the chain. In normal circumstances there
should therefore be only one '%' delimiter present, and
injecting agents MAY choose to reject proto-articles with a '%'
already in them. If, for whatever reason, more than one '%' is
found, then the path-identity in front of the leftmost '%' is to
be regarded as the true injecting agent.
The injecting agent's path identity is a special case. This 5.6.3. The tail-entry
identity MUST be a FQDN which can be used as a domain for
E-mail connections (ie. it should have either an A or MX
record). See the Duties of an Injecting Agent section 7.1
and RFC 2142.
5.6.5 Delimiter Summary For historical reasons, the tail-entry (i.e. the rightmost entry in
the Path-content) is regarded as a "user name", and therefore MUST
NOT be interpreted as a site through which the article has already
passed. Moreover, the Path-content is not an E-mail address and MUST
NOT be used to contact the poster. Posting and/or injecting agents
MAY place any string here. When it is not an actual user name, the
string "not-for-mail" is often used, but in fact a simple "x" would
be sufficient.
A summary of delimiters and the meaning they imply for the Often this field will be the only entry in the region (known as the
name on the right, or in addition, the name to the left. pre-injection region) after the '%', although there may be entries
corresponding to machines traversed between the posting agent and the
injecting agent proper. In particular, injecting agents that receive
articles from many sources SHOULD include the identity of the source
machine connecting to do the injection, and possibly other
information enabling them to establish the circumstances of the
injection (provided it does not conflict with any genuine site
identifier). The '!' delimiter may be used freely within the pre-
injection region, although '/' and '?' are also appropriate if used
correctly.
[If/when we invent some form of Injector-Info header, we may want to
revisit that paragraph.]
, Verified or generated identity. 5.6.4. Delimiter Summary
@ Name failed verification test. Name on left is identity A summary of the various delimiters. The name immediately to the left
generated by site further to the left. of the delimiter is always that of the machine which added the
delimiter.
% Optional pre-injection entries followed by tail entry. '/' The name immediately to the right is known to be the identity of
Commonly just the tail entry, either "x" or an encoding the machine from which the article was received (either because
of login identity. Name on left is FQDN of site that the entry was made by that machine and we have verified it, or
handles mail for Injecting Agent. The presence of two "%" because we have added it ourselves).
in a path indicates a double-injected error.
! Entry is unverified. Identity on left is an old-style '?' The name immediately to the right is the claimed identity of the
system not conformant with this specification. machine from which the article was received, but we were unable
to verify it (and have prepended our own view of where it came
Folding Whitespace MUST NOT be used as the sole delimiter. from, and then a '/').
Other Treat as "!" as per RFC1036 '%' Everything to the right is the pre-injection region followed by
the tail-entry. The name on the left is the FQDN of the
injecting agent. The presence of two '%'s in a path indicates a
double-injection (see 8.2.2).
"/" Reserved for future use, treat as "," '!' The name immediately to the right is unverified. The presence of
a '!' to the left of the '%' indicates that the identity to the
left is that of an old-style system not conformant with this
standard.
; Semicolon is reserved for the generation of extensible headers. ',' Reserved for future use, treat as '/'.
: The colon is a valid delimiter for legacy systems, however, Other
inside an IPv6 numeric address, surrounded in square brackets, Old software may possibly use other delimiters, which should be
it is a part of the path-identifier. treated as '!'. But note in particular that ':', '-' and '_' are
components of names, not delimiters, and FWS on its own MUST NOT
be used as the sole delimiter.
_ This should not be treated as punctuation (a delimiter), NOTE: Old Netnews relaying and injecting programs almost all
contrary to RFC1036. Treat as part of identifiers. delimit Path entries with the '!' delimiter, and these entries
are not verified. As such, the presence of '%' as a delimiter
will indicate that the article was injected by software
conforming to this standard, and the presence of '!' as a
delimiter to the left of a '%' will indicate that the message
passed through systems developed prior to this standard. It is
anticipated that relaying agents will reject articles in the old
style once this new standard has been widely adopted.
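By way of illustration, the meaning attached to the '%' and '!' delimiters allows a reader of a Path-content to locate the injecting agent and to notice passage through old-style systems. The helper names below are hypothetical. The sketch relies on the rule that the path-identity in front of the leftmost '%' is regarded as the injecting agent.

```python
import re


def injecting_agent(path_content):
    """Return the path-identity in front of the leftmost '%', or None."""
    head, sep, _rest = path_content.partition("%")
    if not sep:
        return None  # no '%': not injected in the new style
    # The injecting agent is the last entry of the post-injection region.
    return re.split(r"[/?,!]", head)[-1].strip()


def passed_through_old_style(path_content):
    """True if a '!' delimiter occurs to the left of the leftmost '%'."""
    head, sep, _rest = path_content.partition("%")
    return bool(sep) and "!" in head


def double_injected(path_content):
    """True if more than one '%' delimiter is present."""
    return path_content.count("%") > 1
```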
5.6.6 Other formatting Issues 5.6.5. Suggested Verification Methods
The Path header MUST NOT be truncated. The following approaches for common transports are suggested in order
to meet a site's verification obligations. They are not required, but
following them should avoid the necessity for wasteful double-entry
Path additions.
Whitespace MAY be present in the Path to make it easier to If the incoming article arrives through some TCP/IP protocol such as
represent. However, there is no requirement to do so. NNTP, the IP address of the source will be known, and will likely
Whitespace MUST NOT be used as a delimiter. already have been checked against a list of known FQDNs or IP
addresses that the receiving site has agreed to peer with (this will
have involved a DNS lookup of a known FQDN, following CNAME chains as
required, to find an A record containing that source IP).
5.6.6.1 Use of "!" 1. Where the path-identity is an FQDN (or even an arbitrary name
starting with a '.') it is now a simple matter to check that it is
the proper FQDN for the source, or some known registered alias
thereof. Alternatively, where the FQDN in the path-identity has an
associated A record, an immediate DNS lookup as above can be used
to verify it.
Old USENET relaying and injecting programs almost all delimit 2. Where the path-identity is an encoding of an IP address which does
Path: entries with the "!" delimiter, and these entries are not immediately match the known IP address of the source, a
not verified. As such, the presence of "%" as a delimiter reverse-DNS (in-addr.arpa PTR record) lookup may be done on the
will indicate the article was injected by software conforming
to this standard, and the presence of "!" as a delimiter will
indicate the message passed through systems developed prior
to this standard. Prior to the draft-implementation-date,
messages with mixed sets of delimiters will be common. After
that date, all messages SHOULD NOT have "!" delimiters prior
to the "%" delimiter.
5.6.7 Suggested Verification Methods provided address, followed by a regular DNS "A" record lookup on
the returned name. There may be A records for several IP
addresses, of which one should match the path-identity and another
should match the source.
Sites attempting to verify an incoming entry SHOULD take the 3. If the path-identity fails to match any known alias for the source
following approaches for common transports. They are not (requiring the insertion of an extra path-identity for the true
required, but not following them may lead to wasteful source followed by a '?'), simply doing a reverse DNS (PTR) lookup
double-entry Path additions. on the source IP address is not sufficient to generate the true
FQDN. The returned name must be mapped back to A records to assure
it matches the source's IP address.
If the incoming article arrives through some protocol local to If the incoming article arrives through some other protocol, such as
the site, such as UUCP, that protocol MUST include a means of UUCP, that protocol MUST include a means of verifying the source
verifying the article source site, and this should match. In site. In UUCP implementations, commonly each incoming connection has
UUCP implementations, commonly each incoming connection has a a unique login name and password, and that login name (or some alias
unique login name and password; that login name could be used registered for it) would be expected as the path-identity.
to build a suitable verified identifier. [The above description may still contain more detail than we would wish.
My aim so far was to retain everything in Brad's original, but expressed
in a more palatable manner. We can now decide how much of it we want to
keep.]
Here is an example of a suitable verification method for an 5.6.6. Example
article arriving via a TCP/IP protocol such as via NNTP:
1. If it is an encoding of an IP address, it should be decoded Path: foo.isp.example/
into a canonical form. If that address does not match the .foo-server/bar.isp.example?10.123.12.2/old.site.example!
source's IP, a reverse-DNS (in-addr.arpa PTR record) lookup barbaz/baz.isp.example%dialup123.baz.isp.example!x
should be done on the provided address, followed by a regular
DNS "A" record lookup on the returned name. That A record may
contain several IP addresses. So long as one matches the IP
address from the path, and another matches the source IP
address, this is considered a match.
2. If it is an Internet DNS style FQDN, then the name should be NOTE: That article was injected into the news stream by
looked up with DNS. The A records MUST contain an IP address baz.isp.example (complaints may be addressed to
that is the verified address of the source. usenet@baz.isp.example). The injector has taken care to record
that it got it from dialup123.baz.isp.example. "x" is the
default tail entry, though sometimes a real userid is put there.
3. (It should be noted that when generating a name after a The article was relayed, perhaps by UUCP, to the machine known
non-match, if an FQDN is desired, simply doing a reverse DNS in the UUCP maps database as "barbaz".
(PTR) lookup on the IP address is not sufficient to generate
the FQDN. The returned name must be mapped back to A records
to assure it matches the source's IP address.)
5.6.8 Issues Barbaz relayed it to old.site.example, which does not yet
conform to this standard (hence the '!' delimiter). So one
cannot be sure that it really came from barbaz.
There is no firm way to tell a path entry generated by new Old.site.example relayed it to a site claiming to have the IP
software, and one generated by old software assuming that any address [10.123.12.2], and claiming (by using the '/' delimiter)
delimiter is valid. However, use of "!" by old software has to have verified that it came from old.site.example.
become effectively universal.
Sites are not strictly required to use a standard form for [10.123.12.2] relayed it to ".foo-server" which, not being
their path entry, but if they don't, path lines out of that convinced that it truly came from [10.123.12.2], did a reverse
site get longer due to the adding of the identity. However, lookup on the actual source and concluded it was known as
groups of associated sites wanting a common identity may decide bar.isp.example (that is not to say that [10.123.12.2] was not a
to use that and let the receiver add the specific site. correct IP address for bar.isp.example, but simply that that
connection could not be substantiated by .foo-server). Observe
that .foo-server has now added two entries to the Path.
".foo-server" is a locally significant name (observe the
presence of the '.') within the complex site of many machines
run by foo.isp.example, so the latter should have no problem
recognizing .foo-server and using a '/' delimiter. Presumably
foo.isp.example then delivered the article to its direct
clients.
It appears that foo.isp.example and old.site.example decided to
fold the line, on the grounds that it seemed to be getting a
little too long.
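
The reading of the example Path given above can be reproduced
mechanically. The following Python sketch is illustrative only: the
function name split_path and the wording of the delimiter labels are
assumptions made for this illustration, not part of this standard, and
it assumes that no path-identity itself contains one of the four
delimiter characters.

```python
import re

# How the example above reads each delimiter. The wording of these
# labels is illustrative, not normative.
DELIMITER_MEANING = {
    "/": "the following identity was verified",
    "?": "the following identity did not match the true source",
    "%": "the article was injected here",
    "!": "old-style delimiter, nothing verified",
}


def split_path(path_content):
    """Split a Path content into (path-identity, delimiter) pairs.

    Folding whitespace is removed first. The last entry has no
    delimiter after it, so its delimiter is reported as None.
    """
    unfolded = re.sub(r"\s+", "", path_content)
    # The capturing group makes re.split keep the delimiters.
    parts = re.split(r"([/?%!])", unfolded)
    identities = parts[0::2]
    delimiters = parts[1::2] + [None]
    return list(zip(identities, delimiters))


EXAMPLE_PATH = """foo.isp.example/
    .foo-server/bar.isp.example?10.123.12.2/old.site.example!
    barbaz/baz.isp.example%dialup123.baz.isp.example!x"""

if __name__ == "__main__":
    for identity, delimiter in split_path(EXAMPLE_PATH):
        meaning = DELIMITER_MEANING.get(delimiter, "tail entry")
        print(f"{identity:30} {delimiter or ' '}  {meaning}")
```

Applied to the example, the sketch yields nine entries, with
bar.isp.example followed by '?' and baz.isp.example followed by '%'.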
6. Optional Headers 6. Optional Headers
The headers appearing in this section have established The headers appearing in this section have established meanings and
meanings. They MUST be interpreted according to the MUST be interpreted according to the definitions given here. None of
definitions made in this document. None of them are required to them is required to appear in every article but some of them are
appear in every article. All of the headers appearing in this required in certain types of article, such as followups. Any header
document MUST NOT appear more than once in an article. Headers defined in this (or any other) standard MUST NOT appear more than
not appearing in this document (i.e. X-headers, headers defined once in an article unless specifically stated otherwise.
by cooperating subnets) are exempt from this requirement. See Experimental headers (4.2.2.1) and headers defined by cooperating
"Responsibilities of Agents" for a clear picture. subnets are exempt from this requirement. See section 8 "Duties of
Various Agents" for the full picture.
6.1 Followup-To 6.1. Reply-To
The Followup-To header contents specify which newsgroup(s) The Reply-To header specifies a reply address(es) to be used for
followups should be posted to: personal replies for the author(s) of the article when this is
different from the author's address(es) given in the From header. The
content syntax makes use of syntax defined in [MESSFOR], but subject
to the revised definition of local-part given in section 5.2.
Followup-To-content = Newsgroups-content / "poster" Reply-To-content = From-content ; see 5.2
The syntax is the same as that of the Newsgroups content, with In the absence of Reply-To, the reply address(es) is the address(es)
the exception that the magic word "poster" means that in the From header. For this reason a Reply-To SHOULD NOT be included
followups should be mailed to the article's reply address if it just duplicates the From header.
rather than posted. In the absence of Followup-To, the default
newsgroup(s) for a followup are those in the Newsgroups header
and for this reason the Followup-To header should not be
included if it just duplicates the Newsgroups header.
6.2 Sender NOTE: Use of a Reply-To header is preferable to including a
similar request in the article body, because reply agents can
take account of Reply-To automatically.
The Sender header specifies the email address of the entity An address of "<>" in the Reply-To header MAY be used to indicate
which actually sent this article, if that entity is different that the poster does not wish to receive email replies.
from the From header. This header SHOULD NOT appear in an
article unless the sender is different from the author. This
header is appropriate for use by automatic article posters.
See [DRUMS] for
Sender-content = mailbox-list 6.1.1. Examples
6.3 Expires Reply-To: John Smith <jsmith@site.example>
Reply-To: John Smith <jsmith@site.example>, dave@isp.example
Reply-To: John Smith <jsmith@site.example>,andrew@isp.example,
fred@site2.example
Reply-To: Please do not reply <>
The Expires header content specifies a date and time when 6.2. Sender
the article is deemed to be no longer useful and should be
removed ("expired"). The content syntax is the same as that of
the Date content which is defined in the Message Format
Standard [MESSFOR] .
expires-content = date-time The Sender header specifies the mailbox of the entity which actually
sent this article, if that entity is different from that given in the
From header or if more than one address appears in the From header.
This header SHOULD NOT appear in an article unless the sender is
different from the author. This header is appropriate for use by
automatic article posters. The content syntax makes use of syntax
defined in [MESSFOR].
An Expires header SHOULD only be used in an article if the Sender-content = mailbox
requested expiry time is earlier or later than the default
would normally be for that article. Local policy for each
serving agent will dictate when this header is obeyed and
authors SHOULD NOT depend on it being completely followed.
6.3. Reply-To 6.3. Organization
The Reply-To header content specifies a reply address(es) to The Organization header is a short phrase identifying the author's
be used for personal replies for the author(s) of the article organization.
when this is different from the author's address(es) given in
the From header. The format of the Reply-To header is defined
in the Message Format Standard [MESSFOR] .
In the absence of Reply-To, the reply address(es) is the Organization-content= 1*( [FWS] utext )
address(es) in the From header. For this reason a Reply-To
SHOULD NOT be included if it just duplicates the From header.
Use of a Reply-To header is preferable to including a similar NOTE: Posting and injecting agents are discouraged from
request in the article body, because reply agents can take providing a default value for this header unless it is
account of Reply-To automatically. acceptable to all posters using those agents. Unless this header
contains useful information (including some indication of the
author's physical location) posters are discouraged from
including it.
"Reply-To: <> " MAY be used to indicate that the poster does 6.4. Keywords
not wish to receive email replies.
Reply-To-content = From-content The Keywords field contains a comma separated list of important words
and phrases intended to describe some aspect of the content of the
article. The content syntax makes use of syntax defined in [MESSFOR].
6.3.1 Examples: Keywords-content = phrase *( "," phrase )
Reply-To: John Smith <jsmith@site.example> NOTE: The list is comma separated NOT space separated.
Reply-To: John Smith <jsmith@site.example>, dave@isp.example
Reply-To: John Smith <jsmith@site.example>, andrew@isp.example,
fred@site2.example
Reply-To: Please do not reply <>
6.4. References 6.5. Summary
The References header content lists optionally CFWS-separated The Summary header is a short phrase summarizing the article's
message ids of precursors. The format of the References header content.
is defined in the Message Format Standard [MESSFOR].
A followup MUST have a References header, and an article that Summary-content = 1*( [FWS] utext )
is not a followup MUST NOT have a References header. In a
followup, if the precursor did not have a References header,
the followup's References content MUST be formed by the
message ID of the precursor. A followup to an article which
had a References header MUST have a References header
containing the precursor's References content, plus the
precursor's message ID appended to the end of the list
(separated from it by optional CFWS).
Followup Agents SHOULD NOT trim message ids out of the The summary SHOULD be terse. Authors SHOULD avoid trying to cram
References content unless the number of message ids exceeds 31 their entire article into the headers; even the simplest query
in which case message ids SHOULD be trimmed until there are usually benefits from a sentence or two of elaboration and context,
only 31. and not all reading agents display all headers. On the other hand the
summary should give more detail than the Subject.
Trimming SHOULD be done by removing the sixth (6th) message-id 6.6. Distribution
and any incomplete or otherwise broken message-ids. If
Followup Agents trim any message-ids out of the References
content, then they MUST leave the first five and the last nine
message ids and they SHOULD also leave any message ids
mentioned in the body of the article intact.
NOTE: Software writers should be aware that the number of The Distribution header is an inheritable header (see 4.2.2.2) which
messages ids in this header may exceed 31 and software must be specifies geographical or organizational limits to an article's
able to handle this without problem. propagation.
References-content = msg-id [msg-id...]
6.4.1 Examples: Distribution-content= distribution *( dist-delim distribution )
dist-delim = ","
distribution = positive-distribution /
negative-distribution
positive-distribution
= *FWS distribution-name *FWS
negative-distribution
= *FWS "!" distribution-name *FWS
distribution-name = letter 1*distribution-rest
distribution-rest = letter / "+" / "-" / "_"
Articles MUST NOT be passed between relaying agents or to serving
agents unless the sending agent has been configured to supply and the
receiving agent to receive BOTH of
(a) at least one of the newsgroups in the article's Newsgroups
header, and
(b) at least one of the positive-distributions (if any) in the
article's Distribution header and none of the negative-
distributions.
Additionally, reading agents MAY be configured so that unwanted
distributions do not get displayed.
NOTE: Although it would seem redundant to filter out unwanted
distributions at both ends of a relaying link (and it is clearly
more efficient to do so at the sending end), many sending sites
have been reluctant, historically speaking, to apply such
filters (except to ensure that distributions local to their own
site or cooperating subnet did not escape); moreover they tended
to configure their filters on an "all but those listed" basis,
so that new and hitherto unheard of distributions would not be
caught. Indeed many "hub" sites actually wanted to receive all
possible distributions so that they could feed on to their
clients in all possible geographical (or organizational)
regions.
Therefore, it is desirable to provide facilities for rejecting
unwanted distributions at the receiving end. Indeed, it may be
simpler to do so locally than to inform each sending site of
what is required, especially in the case of specialized
distributions (for example for control messages, such as cancels
from certain issuers) which might need to be added at short
notice. The possibility for reading agents to filter
distributions has been provided for the same reason.
Exceptionally, ALL relaying agents are deemed willing to supply or
accept the distribution "world", and NO relaying agent should supply
or accept the distribution "local". However, "world" SHOULD NEVER be
mentioned explicitly since it is the default when the Distribution
header is absent entirely. "All" MUST NOT be used as a
distribution-name. Distribution-names SHOULD contain at least three
characters, except when they are two-letter country names as in [ISO
3166]. Distribution-names are case-insensitive (i.e. "US", "Us" and
"us" all specify the same distribution).
NOTE: "Distribution: !us" can be used to cause an article to go
to the whole of "world" except for "us".
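
Condition (b) above, together with the special treatment of "world"
and "local", can be sketched in Python as follows. This is a minimal
illustration and not a normative algorithm: the function names and the
idea of representing an agent's configuration as a set of
distribution-names are assumptions. The check covers one agent only; it
would have to succeed for both the sending and the receiving agent, and
condition (a) on newsgroups is not shown.

```python
def parse_distribution(content):
    """Split a Distribution content into (positives, negatives).

    Names are lowercased because distribution-names are
    case-insensitive.
    """
    positives, negatives = set(), set()
    for item in content.split(","):
        name = item.strip().lower()
        if not name:
            continue
        if name.startswith("!"):
            negatives.add(name[1:])
        else:
            positives.add(name)
    return positives, negatives


def distribution_permits(content, configured):
    """Check condition (b) for one agent.

    `content` is the Distribution content (empty or None if the header
    is absent); `configured` is the set of distribution-names the agent
    has been configured to supply or receive (an assumed
    representation).
    """
    willing = {name.lower() for name in configured}
    willing.add("world")      # every relaying agent accepts "world"
    willing.discard("local")  # no relaying agent passes "local"
    positives, negatives = parse_distribution(content or "")
    if negatives & willing:
        return False
    # With no positive-distributions the default "world" applies.
    if positives and not (positives & willing):
        return False
    return True
```

For example, under these assumptions an agent configured for "us"
would not pass an article carrying "Distribution: !us", whereas an
agent configured only for "uk" would.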
Posting agents SHOULD NOT provide a default Distribution header
without giving the poster an opportunity to override it. Followup
agents SHOULD initially supply the same Distribution header as found
in the precursor.
6.7. Followup-To
The Followup-To header specifies which newsgroup(s) followups should
be posted to.
Followup-To-content = Newsgroups-content / "poster"
The syntax is the same as that of the Newsgroups-content, with the
exception that the magic word "poster" is allowed. In the absence of
a Followup-To header, the default newsgroup(s) for a followup are
those in the Newsgroups header, and for this reason the Followup-To
header SHOULD NOT be included if it just duplicates the Newsgroups
header.
A Followup-To header consisting of the magic word "poster" indicates
that the author requests no followups to be sent in response to this
article, only personal replies to the article's reply address.
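
A followup agent's choice of destination, as described above, might be
sketched in Python as follows. The function name, the use of a plain
dictionary for the precursor's headers, and the case-insensitive
comparison of "poster" are assumptions made for this illustration only.

```python
def followup_destination(headers):
    """Decide where a followup to an article should go.

    `headers` maps header names to their contents for the precursor
    article (an assumed representation). Returns ("mail", []) when the
    author asked for personal replies only, otherwise ("post", groups).
    """
    followup_to = headers.get("Followup-To", "").strip()
    # Assumption: the magic word is matched without regard to case.
    if followup_to.lower() == "poster":
        return ("mail", [])
    # Without Followup-To, fall back on the Newsgroups header.
    source = followup_to or headers.get("Newsgroups", "")
    groups = [name.strip() for name in source.split(",") if name.strip()]
    return ("post", groups)
```

In the "mail" case the reply would be addressed using the article's
reply address, that is the Reply-To header if present and the From
header otherwise.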
6.8. References
The References header lists optionally CFWS-separated message
identifiers of precursors. The content syntax makes use of syntax
defined in [MESSFOR].
References-content = msg-id *( CFWS msg-id )
NOTE: This differs from the syntax of [MESSFOR] by requiring at
least one CFWS between the msg-ids (this was an [RFC 1036]
requirement).
A followup MUST have a References header, and an article that is not
a followup MUST NOT have a References header. In a followup, if the
precursor did not have a References header, the followup's
References-content MUST be formed by the message identifier of the
precursor. A followup to an article which had a References header
MUST have a References header containing the precursor's References-
content (subject to trimming as described below) plus the precursor's
message identifier appended to the end of the list (separated from it
by CFWS).
Followup agents SHOULD NOT trim message identifiers out of a
References header unless the number of message identifiers exceeds
21, at which time trimming SHOULD be done by removing sufficient
identifiers starting with the second so as to bring the total down to
21. However, it would be wrong to assume that References headers
containing more than 21 message identifiers will not occur.
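
The construction and trimming of a followup's References-content can
be sketched in Python as follows. This is an illustration under stated
assumptions, not a definitive implementation: the function name is
hypothetical, the regular expression recognizes only simple msg-ids
with no embedded whitespace or angle brackets, and the limit of 21 is
applied to the list after the precursor's message identifier has been
appended, which is one possible reading of the text above.

```python
import re

MAX_IDS = 21  # the limit given in the text above


def followup_references(precursor_references, precursor_msg_id):
    """Build the References-content for a followup.

    `precursor_references` is the precursor's References-content, or
    None or "" if it had none; `precursor_msg_id` is the precursor's
    message identifier, angle brackets included.
    """
    # Simplified msg-id pattern (an assumption; see the lead-in).
    ids = re.findall(r"<[^<>\s]+>", precursor_references or "")
    ids.append(precursor_msg_id)
    if len(ids) > MAX_IDS:
        # Keep the first identifier and drop from the second onwards
        # until only MAX_IDS remain.
        ids = ids[:1] + ids[len(ids) - (MAX_IDS - 1):]
    return " ".join(ids)
```

A reading agent must still cope with longer lists, since other
software may not trim at all.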
6.8.1. Examples
References: <i4g587y@site1.example> References: <i4g587y@site1.example>
References: <i4g587y@site1.example> <kgb2231+ee@site2.example> References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
References: <i4g587y@site1.example><kgb2231+ee@site2.example> References: <i4g587y@site1.example><kgb2231+ee@site2.example>
<222@site1.example><87tfbyv@site7.example><67jimf@site666.example> <222@site1.example> <87tfbyv@site7.example>
<67jimf@site666.example>
References: <i4g587y@site1.example> <kgb2231+ee@site2.example> References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
<tisjits@smeghead.example> <tisjits@smeghead.example>
6.5. Control 6.9. Expires
The Control header content marks the article as a control The Expires header specifies a date and time when the article is
message, and specifies the desired actions (other than the deemed to be no longer relevant and could usefully be removed
usual ones of filing and passing on the article): ("expired"). The content syntax makes use of syntax defined in
[MESSFOR].
Control-content = verb *( FWS argument ) verb = 1*( ALPHA / Expires-content = date-time
DIGIT ) argument = 1* ftext
The verb indicates what action should be taken, and the An Expires header should only be used in an article if the requested
argument(s) (if any) supply details. In some cases, the body expiry time is earlier or later than the time typically to be
of the article may also contain details. The next section expected for such articles. Local policy for each serving agent will
describes the standard verbs. dictate whether and when this header is obeyed and authors SHOULD NOT
depend on it being completely followed.
6.6. Control Messages 6.10. Archive
The following sections document the group control messages. This optional header is a signal to automatic archival agents on
"Message" is used herein as a synonym for "article" unless whether this article is available for long-term storage.
context indicates otherwise. Group control messages are a
special class of control messages, that request the group
configuration on a server be updated.
All of the group control messages MUST have an Approved header Archive-content = [CFWS] ("no" / "yes" ) [CFWS]
(section 6.10). They SHOULD use one of the authentication Archive-header-parameter
mechanisms defined in section TBD. = Filename-token "=" value
; for USENET-header-parameters see 4.1
Filename-token = [CFWS] "filename" [CFWS]
The execution of the actions requested by control messages is Agents which see "Archive: no" MUST NOT keep the article past the
subject to local administrative restrictions, which MAY deny Expires date. "Archive: yes" merely confirms what is already the
requests or refer them to an administrator for approval. The default state. The optional Filename parameter MAY then be used to
descriptions below are generally phrased in terms suggesting suggest a filename under which the article should be archived.
mandatory actions, but any or all of these MAY be subject to Further extensions to this standard may provide additional parameters
local administrative approval (either as a class or for administration of the archiving process.
case-by-case). Analogously, where the description below
specifies that a message or portion thereof is to be ignored,
this action MAY include reporting it to an administrator.
Relaying Agents MUST propagate even control messages they do 6.11. Control
not understand.
In the following sections, each type of control message is The Control header marks the article as a control message, and
defined syntactically by defining its arguments and its body. specifies the desired actions (other than the usual ones of storing
For example, "cancel" is defined by defining cancel-arguments and/or relaying the article).
and cancel-body.
6.6.1 The "newgroup" Control Message Control-content = CONTROL-verb CONTROL-argument
CONTROL-verb = <the verb defined in this standard
(or an extension of it) for a specific
CONTROL message>
verb = token
newgroup-ctrl = "newgroup" FWS groupname [ FWS flags ] CONTROL-arguments = <the argument defined in this standard
flags = "moderated" (or an extension of it) for a specific
groupname ; defined in [NEWS] CONTROL message>
arguments = *( CFWS value ) ; see 4.1
[Observe that <value> requires the use of a quoted-string if any
tspecials or NON-ASCII characters are involved. This is a restriction on
present usage, but follows MIME practice.]
The "newgroup" control message requests the specified group be The verb indicates what action should be taken, and the argument(s)
created or changed. The text "moderated" is appended to mark (if any) supply details. In some cases, the body of the article may
the group as moderated. The message contains a also contain details. Section 7 describes all of the standard verbs.
"multipart/news-groupinfo" (section 6.6.1 body) part containing
machine- and human-readable information about the group.
The newgroup command is also used to update the description An article with a Control header MUST NOT also have a Replaces or
line or moderation status of a group. Supersedes header.
NOTE: It is also possible to send newgroups for existing NOTE: The presence of a Subject header starting with the string
groups that don't change anything to ensure the group exist on "cmsg " and followed by a Control-content MUST NOT be construed,
all systems ("booster" newgroups). Implementations might want in the absence of a proper Control header, as a request to
to test for this condition before attempting to update their perform that control action (as may have occurred in some legacy
configuration. software). See also section 5.4.
6.6.1.1 multipart/news-groupinfo 6.12. Approved
The "multipart/news-groupinfo" body structure contains The Approved header indicates the mailing addresses (and possibly the
information about a (new) newsgroup. full names) of the persons or entities approving the article for
posting.
The MIME content type definition of "multipart/news-groupinfo" Approved-content = From-content ; see 5.2
is:
MIME type name: multipart Each mailbox contained in the Approved-content MUST be that of the
MIME subtype name: news-groupinfo person or entity in question, and one of those mailboxes MUST be that
Required parameters: boundary (see [MIME2]) of the actual injector of the article.
Optional parameters: none [This is the start of an attempt to strengthen this header. It should be
Encoding considerations: "7bit" or "8bit" is sufficient and a TOSSable offence to put a dummy or invalid address in here. Later,
MUST be used to maintain compatibility. when we have some form of authentication, I would hope to be able to say
Security considerations: to be added more.]
A "multipart/news-groupinfo" body part contains the following An Approved header is required in all postings to moderated
subparts: newsgroups. If this header is not present in such postings, then
relaying and serving agents MUST reject the article. Please see
section 8.2.2 for how injecting agents should treat postings to
moderated groups that do not contain this header.
1. An "application/news-groupinfo" part (section 6.6.1.2) An Approved header is also required in certain control messages, to
containing the name and description line of the group(s). This reduce the risk of accidental posting of same; see the relevant parts
part is mandatory. of section 7.
2. Other parts containing useful information about the 6.13. Replaces / Supersedes
backgrounds of newsgroup message.
3. Parts containing initial named articles for the These two headers contain one or more message identifiers that the
newsgroup. See section 6.6.1.3 for details. current article is expected to replace or supersede. All listed
articles MUST be treated as though a "cancel" control message had
arrived for the article (but observe that a site MAY choose not to
6.6.1.2 application/news-groupinfo honour a "cancel" message, especially if its authenticity is in
doubt).
The "application/news-groupinfo" body part contains a short 6.13.1. Syntax and Semantics
information on a newsgroup, i.e. the group's name, its
description and the moderation flag.
NOTE: This part has a format that makes the whole The Replaces and Supersedes headers specify articles to be cancelled
"multipart/news-groupinfo" structure compatible with [1036BIS]. on arrival of this one. The content syntax makes use of syntax
defined in [MESSFOR].
The MIME content type definition of "application/news-groupinfo" Replaces-content = msg-id *( CFWS msg-id )
is: Replaces-header-parameter
= Usage-token "=" Usage-value
; for USENET-header-parameters see 4.1
Usage-token = [CFWS] "usage" [CFWS]
Usage-value = [CFWS] ("replace" / "revise" / "repost" )
[CFWS]
Supersedes-content = msg-id
MIME type name: application NOTE: There is no "c" in "Supersedes".
MIME subtype name: news-groupinfo [I could be persuaded of a better token than "usage". I did wonder about
Optional parameters: none "disposition". Observe that "usage" is now also used in
Encoding considerations: "7bit" or "8bit" is sufficient and message/news-transmission.]
MUST be used to maintain compatibility.
Note that the descriptions may use [MIME3].
Security considerations: to be added
The content of the "application/news-groupinfo" body part is If an article contains a Replaces header, then the old articles
defined as: mentioned SHOULD simply be deleted by the serving agent, as in a
cancel message (7.5), and the new article inserted into the system as
any other new article would be.
groupinfo-body = descriptor-tag CRLF 1*( description-line CRLF ) A Replaces-header-parameter is only meaningful when it occurs within
descriptor-tag = %x46.6F.72 SP %x79.6F.75.72 SP a Replaces-content. If its Usage-value is "revise" or "repost" (or if
%x6E.65.77.73.67.72.6F.75.70.73 SP the Replaces-header-parameter is absent, then by default) reading
%x66.69.6C.6E.3A agents SHOULD NOT show the article as an "unread" article unless the
; case sensitive "For your newsgroups file:" replaced article(s) were themselves all unread, except when the
description-line = newsgroup-name [ 1*WSP description ] reader has configured his reading agent otherwise.
description = nonblank-text
moderation-flags = [ moderated-literal ]
moderated-literal = %x28.4D.6F.64.65.72.61.74.65.64.29
; case sensitive "(Moderated)"
The "application/news-groupinfo" is used in conjunction with the Moreover, if a Usage-value is "revise" or "repost", serving agents
"newgroup" (section 6.6.1) and "mvgroup" control messages (section that generate a local Xref header MUST then include additional
6.6.3) as part of a "multipart/news-groupinfo" (section 6.6.1) MIME "revise" or "repost" information as set out in section 6.14.
structure.
Moderated newsgroups MUST be marked by appending the case NOTE: A replacement with "usage=replace" is intended to be used
sensitive text " (Moderated)" at the end. It is NOT recommended in the case of an article that is sufficiently different from
that the moderator's email address be included in the description. its predecessors that it is advisable for readers to see it
again. A replacement with "usage=revise" is intended to be used
in the case of a minor change, unworthy of being brought to the
attention of a reader who has already read one of its
predecessors. A replacement with "usage=repost" is intended to
be used in the case of an article identical to the one replaced
(but possibly being reposted because the earlier one had likely
expired).
Although, in accordance with [NNTP], [MESSFOR] and 4.6 of this NOTE: A reader who elects to ignore all the articles available
document, a description line could have a maximum length of 998 in a newsgroup (perhaps on the occasion of accessing that
octets, as a matter of policy a far lower limit, expressed in newsgroup for the first time) will likely have them all marked
characters, SHOULD be set. By default, in the absence of as "already read", unless the reading agent provides a distinct
explicit policies, the description length SHOULD be limited in
such a way that the newsgroup name, the tab (interpreted as an
8-character tab that takes one at least to column 24) and the
description (excluding flags) fit into the first 79 characters.
NOTE: Servers that use a "newsgroups" file will store the mark such as "never offered". This could lead to a later
group descriptions there as is, i.e. without any conversion of replacement with "revise" or "repost" for one of those articles
charsets or encoding. being missed.
NOTE: The descriptions will also be used with the [NNTP] LIST The Supersedes header is obsolescent, is provided only for
NEWSGROUPS command. The descriptions will be sent as is, i.e. compatibility with existing software, and may be removed entirely in
without any conversion of charsets or encoding. some future version of this standard. Its meaning is the same as that
of a corresponding Replaces header with its Replaces-header-parameter
set to "usage=replace", and whenever a Supersedes header is provided
a matching Replaces header SHOULD be provided as well. Observe that
the Supersedes header makes provision for only a single msg-id.
6.6.1.3 Initial Named Articles Until the Replaces header has become widely implemented, software
SHOULD generate Replaces headers with only one msg-id, and cancel
control messages SHOULD be issued if needed for further identifiers.
Moreover, until that time, any article containing a Replaces header
SHOULD contain also a Supersedes header (or alternatively be
accompanied by a Control cancel message) for that same msg-id, to
ensure that older systems still at least remove the predecessor.
Some parts of a multipart/news-groupinfo structure MAY contain When a message contains both a Replaces and a Supersedes header they
an initial set of named articles. These parts are identified by MUST be for the same msg-id. Furthermore, to resolve any doubt, the
the Article-Name header just like normal named article Replaces header shall be deemed to take priority.
postings. The named articles are filed separately as single
postings, where the headers of the enclosing control message
are copied to every part that contains a named article except
that:
Content-* and Article-* headers MUST be taken from the body part. Whatever security or authentication mechanisms are required for a
Control cancel message MUST also be required for an article with a
Replaces or Supersedes header. In the absence or failure of such
checks, the article SHOULD be discarded, or at most stored as an
ordinary article.
[We can write something more constructive in here as soon as the
situation with regard to cancel-locks and signed headers has been
clarified.]
The message id MUST be changed by inserting /partX before the @ 6.13.2. Message-ID version procedure
sign, where X is the number of the body part, starting with 0.
The Control header of the enclosing message header MUST be
stripped. It MAY be replaced by a "Control: named" header.
Signatures (Auth, X-Auth...) of the enclosing message SHOULD be
stripped. They MAY be replaced by a signature of the own site.
The resulting articles are for internal use of the server and its Whilst this procedure is not essential for the operation of Netnews,
users only, they MUST NOT, repeat MUST NOT be forwarded to other it SHOULD be supported by all serving agents. However, for the
sites. procedure to work, all the msg-ids in the Replaces-content MUST be
those of successive replacements of the same original article, and
all be generated as described below.
[Whilst the procedure about to be described will undoubtedly work, it
must be pointed out that life would be much simpler if there was only a
single msg-id allowed in a Replaces-content.]
Nested multipart/* structures are allowed, they are not 6.13.2.1. Message version numbers
recursively expanded to separate articles.
6.6.2 The "rmgroup" Control Message According to [MESSFOR], and omitting the obsolete forms, the syntax
of the left hand side of a msg-id (the part before the "@") is given
by:
rmgroup-ctrl = "rmgroup" FWS groupname id-left-side = dot-atom-text / no-fold-quote
The "rmgroup" control message requests the specified group be Consider this to be replaced by:
removed from the list of valid groups. The Content-Type of the
body is unspecified; it MAY contain anything, usually an
explaining text.
NOTE: It is also possible to send rmgroups for nonexisting, id-left-side = ( atom-text / no-fold-quote )
bogus groups to ensure the group is removed on all systems *( dollars-sequence )
("booster" rmgroups). Implementations might want to test for dollars-sequence = version-number / random-dollars-sequence
this before attempting to update their configuration. version-number = "$" %d118 "=" 1*DIGIT ; $v=digits
random-dollars-sequence
= "$" 1*atom-text
6.6.3 The "mvgroup" Control Message Whilst this is admittedly ambiguous ("$" is already a possible value
of atom-text) and does not in fact change what is allowable as an
id-left-side, it does serve to allow dollars-sequences such as
version-number (and any others that may be added by extensions to
this standard) to be distinguished within a message identifier and
utilized by agents which can understand them. Observe that no-fold-
quotes cannot occur within a dollars-sequence.
mvgroup-ctrl = "mvgroup" FWS ( mvgrp-groups / mvgrp-hrchy) Posters and/or posting agents when replacing (or superseding)
mvgrp-groups = groupname [ FWS groupname ] articles SHOULD arrange that the message identifier of the
mvgrp-hrchy = groupnamepart ".*" FWS groupnamepart replacement follows the following convention, generating what are
groupnamepart = groupname ; syntactically known as "version-number" message identifiers. This is to enable the
new version of the article to be retrieved by its original message
identifier, notably when it occurs in a URL of the form
<news:message-identifier> [RFC 1738].
6.6.3.1 Single group 1. If the id-left-side of the most recent predecessor's message
identifier contains a leftmost version-number "$v=<n>", where <n>
is an integer version number, possibly followed by one or more
random-dollars-sequences, the replacement message identifier
should be obtained by replacing the <n> with the integer <n+1> and
providing a different random-dollars-sequence(s). For example
<foo$v=3$XYZ@faq-site.example> becomes <foo$v=4$PQR@faq-
site.example>.
The "mvgroup" control message requests the first specified 2. If the id-left-side of the predecessor's message identifier does
group to be moved to the second group. The message contains a not contain a version-number, the replacement message identifier
"multipart/news-groupinfo" (section 6.6.1.2) body part containing should be obtained by appending the string "$v=1", preferably
machine- and human-readable information about the new group. followed by a random-dollars-sequence(s), to that id-left-side.
For example <foo@faq-site.example> becomes <foo$v=1$ABC@faq-
site.example>.
When this message is received, the new group SHOULD be created Any random-dollars-sequence so added MUST NOT start with "$<l>=" for
and all articles, including named articles, SHOULD be copied or any letter <l>.
moved to the new group, then the old, now empty group SHOULD be
deleted.
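The two numbered rules of the version-number convention can be sketched in Python as follows. This is an illustrative sketch only, not part of the draft: the function names, the six-character default, and the choice of random alphanumerics are assumptions. Obsolete or quoted forms of id-left-side are not treated specially.

```python
import re
import secrets
import string

_VERSION = re.compile(r"\$v=(\d+)")      # search() finds the leftmost "$v=<n>"
_RESERVED = re.compile(r"\$[A-Za-z]=")   # "$<l>=" must not start a random part
_ALPHABET = string.ascii_letters + string.digits


def random_dollars_sequence(length=6):
    """Return "$" followed by random alphanumerics (never contains "=")."""
    return "$" + "".join(secrets.choice(_ALPHABET) for _ in range(length))


def next_message_id(predecessor, rand=None):
    """Derive a replacement msg-id from the most recent predecessor's msg-id."""
    if not (predecessor.startswith("<") and predecessor.endswith(">")):
        raise ValueError("msg-id must be enclosed in angle brackets")
    left, at, right = predecessor[1:-1].rpartition("@")
    if not at or not left:
        raise ValueError("msg-id has no id-left-side")
    if rand is None:
        rand = random_dollars_sequence()
    if not rand.startswith("$") or len(rand) < 2 or _RESERVED.match(rand):
        raise ValueError("unsuitable random-dollars-sequence")
    found = _VERSION.search(left)
    if found:
        # Rule 1: keep the root, bump <n>, discard the old random part.
        root, version = left[:found.start()], int(found.group(1)) + 1
    else:
        # Rule 2: no version-number yet, so start at "$v=1".
        root, version = left, 1
    return f"<{root}$v={version}{rand}@{right}>"
```

Called with the predecessor identifiers from the two rules and an explicit random part, the sketch reproduces the examples given there. When rand is omitted a fresh unpredictable sequence is generated, which is what the NOTE on malicious posters calls for.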
NOTE: For servers that use a file system directory structure to NOTE: The presence of a random-dollars-sequence following the
organize message storage, this operation is quite efficiently version-number is intended to prevent a malicious poster from
implemented as a single directory rename operation. preempting the posting of a replacement article by guessing its
likely message identifier.
If the old group does not exist, the message is ignored unless Attempts to fetch a replaced (or superseded) article by its message
the new group does not exist either, in which case the new identifier SHOULD retrieve instead its most recent successor which
group is created just as for a "newgroup" message. has used the version-number convention. Some indication that a newer
version than was asked for has been delivered SHOULD be provided.
This is intended to ensure that "news:" URLs [RFC-1738] will continue
An indication that the old group was replaced by the new group to work even when an article has been replaced, but agents SHOULD
MAY be left back in the server's configuration and be made then draw attention to the fact that the message identifier retrieved
available to clients. differed from that requested.
NOTE: For servers that use an "active" file this means an entry 6.13.2.2. Implementation and Use Note
in the form "oldgroup xxx yyy =newgroup" is created.
NOTE: If the old group did not exist, this is considered a [Here is the implementation technique that we discussed, based on the
local configuration error. Therefore it is best to correct use of a conventional History file. This is a sanity check for our own
this error when a mvgroup is received. use, not intended to go in the final text.
If the old group does not exist, the message is ignored unless 1. Ensure that the implementation of DBZ is not upset if the same key is
the new group does not exist either, in which case the new attempted to be stored a second time, and that such a key always
group is created just as for a "newgroup" message. retrieves the latest record indexed by that key.
If both groups exist, the groups MAY be "merged". If this is 2. Additions to the History file are always made at the end. Removals or
done, it MUST be done correctly, i.e. implementations MUST take changes to existing entries are only made by the expire program. An
care that the messages in the group being deleted are entry for a Replaced (or otherwise cancelled) article will remain until,
renumbered accordingly to avoid overwriting articles in one first, the expire program removes the links to the articles that are no
group with those of the other and that crossposted articles longer stored, and later on removes the entire entry according to its
don't appear twice. Otherwise, the old group is just deleted. expiry date. For every entry containing a '$v=n' followed by random-
dollars-sequences there will be an immediately following entry identical
but for the omission of that '$v=n' and of the random-dollars-sequences.
Thus there may be several entries with identical message-ids but,
because of the change to DBZ just described, only the most recent will
ever be seen except by programs that access the History file directly,
rather than by its index.
In all cases, information transported in the 3. When an article is Replaced, at the same time as the successor
"multipart/news-groupinfo" body part is applied to the new group. article is entered into the History file, with '$v=7' say, a duplicate
entry (same article list) is entered under the same key, modified by
removing any leftmost '$v=n' and the following random-dollars-sequences
from it.
Named articles are taken from the mvgroup message, the new 4. Provide a call to a routine which, if asked to retrieve any message
group (if already existent) and the old group in this identifier with '$v=n' and finding it missing (or rather linked to no
precedence. stored groups), immediately tries again without the '$v=n' and its
random-dollars-sequences. NOTE. We don't want this behaviour when
checking whether we already have an article offered to us by IHAVE, only
in response to an ARTICLE command. So this needs to be an extra call in
DBZ, in addition to the 'fetch' or 'dbzfetch' calls, to be used in the
proposed extension to the NNTP ARTICLE command. Observe that if the
requested '$v=n' is present and linked to stored articles (for whatever
reason) then you will be given exactly that version, even if later ones
are stored as well.
As a special case, the second name, i.e. the one of the new 5. NOTE that I have dropped the idea of having '$v=0', because you can
group MAY be omitted. In this case, only the information of the never be sure that the very first issue of the FAQ used it, so you have
group is updated according to the contained to provide the versionless root as well. If someone asks for '$v=0' (or
"multipart/news-groupinfo". any '$v=n') the algorithm I gave will still find it via the root. So we
don't care what people put in URLs.
Until most relay agents conform to this document, whenever a mvgroup 6. You are supposed to cancel the replaced/superseded article. If you
control message for a single group is issued, a corresponding pair of REALLY want to keep the old ones around a little longer, then this
rmgroup and newgroup control messages SHOULD be issued a few days later. implementation will not work if you want the latest to be retrieved
6.6.3.2 Multiple Groups automatically - you will have to invent something much more complicated.
If the first name ends with the character sequence ".*", the 7. Having said all that, here follows a brief account of the same thing,
mvgroup message requests a whole (sub)hierarchy to be moved. but short enough to be included in our document (the convention being
The same procedure as for single groups (section 6.6.3.1) applies that implementation issues are hinted at, rather than being described in
to every matched group; however, some systems might be able to full detail).]
optimize the process.
NOTE: For servers that use a file system directory structure to Typically, a news database will index a Replacement article both by
organize message storage, this process can be optimized by its "version-number" message identifier (containing a "$v=" tag
renaming the parent directory instead of every group's followed by a random-dollars-sequence) and by its "root" version
directory. (without the "$v=" tag or any following random-dollars-sequence).
Thus when a request for an article comes in that is not present under
the version-number requested, any article that is present and indexed
by the corresponding root version can be retrieved instead. The
indexing mechanism needs to be such that, although the root version
may have at times referred to many different articles, it is always
the latest that is retrieved.
To avoid recursion, the new groups' names MUST NEVER match the NOTE: The presence of a version-number in the message identifier
old groups name pattern; i.e. moving a whole (sub)hierarchy to of an article without a Replaces or Supersedes header causes no
a subhierarchy of the original hierarchy is explicitly extra action (it is just an ordinary article). Observe also that
disallowed. if an article with the exact message identifier (even though it
contains a version-number) is, for whatever reason, already
present on the serving agent, that article will always be
retrieved in preference to the one indexed by any root version.
Until a critical mass of relay agents are in compliance, whenever 6.13.2.3. The Message-Version NNTP extension
a mvgroup control message for multiple groups is issued, a
corresponding set of rmgroup and newgroup control messages for all
the affected groups SHOULD be issued a few days later.
6.6.4 The "checkgroups" Control Message The following Service Extension to the NNTP protocol is defined in
accordance with the framework set out in [NNTP], and is to be
registered with IANA.
The "checkgroups" control message contains a list of all valid Name of the extension: Message-Version
groups in a complete hierarchy. The "Control:" header has the Extension Label (for the LIST EXTENSIONS command): MESSAGE-VERSION
following format: Additional keywords, syntax and parameters: None
checkgroup-ctrl = "checkgroups" [ FWS chkscope ] [ FWS chksernr ] In a server supporting this extension, the behaviour of the ARTICLE,
chkscope = 1*( ["!"] newsgroup-name-part ) HEAD, BODY and STAT commands when the parameter is a <message-id> is
chksernr = "#" 1*DIGIT modified as follows.
The chkscope parameter(s) specifies the (sub)hierarchy(s) for If the specified article is available on the server then it (or its
which this "checkgroups" message applies. Head, Body or Status as appropriate) is returned in the normal
manner. Otherwise, if the id-left-side of the <message-id> (the part
before the '@') contains a leftmost "$v=<n>", where <n> is an integer
version number, that "$v=<n>" and everything following it is stripped
from that id-left-side and the article (Head, Body or Status) with
the stripped <message-id> is returned instead. Otherwise (no article
is available under the original, or any stripped, <message-id>), a
430 response is given as usual.
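The lookup order described for the Message-Version extension might be sketched as below. The index here is a hypothetical in-memory mapping from an index key (an exact or root message identifier) to the message identifier of the article actually stored. A real server would consult its history database instead, and the function names are assumptions made for illustration.

```python
import re

_VERSION_TAG = re.compile(r"\$v=\d+")


def strip_version(message_id):
    """Return the root form of a msg-id, or None if it has no "$v=<n>"."""
    left, at, right = message_id[1:-1].rpartition("@")
    found = _VERSION_TAG.search(left)   # leftmost tag in the id-left-side
    if not at or found is None:
        return None
    return f"<{left[:found.start()]}@{right}>"


def lookup(index, message_id):
    """Return the msg-id of the article to serve, or None (a 430 response)."""
    if message_id in index:             # exact match always wins
        return index[message_id]
    root = strip_version(message_id)
    if root is not None and root in index:
        return index[root]              # latest article filed under the root
    return None
```

A client that cares which version it received only has to compare the value returned with the identifier it asked for, as the following NOTE explains. The IHAVE path would use the exact-match test alone.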
6.6.4.1 Example: NOTE: If the client is concerned to know whether the article
Control: checkgroups de !de.alt #248 found was exactly the one requested or a replacement article
corresponding to a stripped <message-id>, then it has only to
compare the <message-id> requested with that returned in the 220
NOTE: "Old" software is known to ignore the "chkscope" (221, 222, or 223) response. The intent of this extension is to
parameter. Thus a "checkgroups" message SHOULD also contain enable the retrieval of the current version of an article (such
the groups of other subhierarchies the sender is not as a regularly posted FAQ) referenced by a "news:" URL [RFC-
responsible for. "New" software MUST ignore groups which don't 1738] which quotes the <message-id> of an earlier version.
fall into the scope of the "checkgroups" message.
If no scope for the checkgroups message is given, it applies to NOTE: This extension has no effect on the IHAVE command.
all hierarchies for which group statements appear in the
message.
"Checkgroups" messages MAY also contain a serial number, which 6.13.2.4. Examples
can be any positive integer (i.e. just numbered or the date in
YYYYMMDD). It SHOULD increase by an arbitrary value with every
change to the group list and MUST NOT ever decrease.
NOTE: This was added to circumvent security problems in Example 1. The first edition of a FAQ is posted with a message
situations where the Date header can not be signed. identifier of the form: <examplegroup-faq@faq-site.example>. The
next (but identical) version, a month later, has:
The body of the message is an "application/news-checkgroups" part Message-ID: <examplegroup-faq$v=1$A1b@faq-site.example>
containing the list of ALL valid groups (and MAYBE deletion Replaces: <examplegroup-faq@faq-site.example> ; usage=repost
confirmations) for the specified hierarchies. Supersedes: <examplegroup-faq@faq-site.example>
Observe the inclusion of a Supersedes header as well, it being
presumed that the Replaces header was not yet widely implemented at
that time.
6.6.5 application/news-checkgroups The next one, another month later (and with some significant changes
justifying the use of "replace" rather than "repost") has:
The "application/news-checkgroups" body part contains a complete Message-ID: <examplegroup-faq$v=2$B2b@faq-site.example>
list of all newsgroups in a top level hierarchy, their Replaces: <examplegroup-faq$v=1$A1b@faq-site.example>
description lines and moderation status. <examplegroup-faq@faq-site.example> ; usage=replace
Supersedes: <examplegroup-faq$v=1$A1b@faq-site.example>
The MIME content type definition of The next one, another month later, has:
"application/news-checkgroups:" is:
MIME type name: application Message-ID: <examplegroup-faq$v=3$C3c@faq-site.example>
MIME subtype name: news-checkgroups Replaces: <examplegroup-faq$v=2$B2b@faq-site.example>
Optional parameters: none <examplegroup-faq$v=1$A1b@faq-site.example> ; usage=repost
Encoding considerations: "7bit" or "8bit" is sufficient and Supersedes: <examplegroup-faq$v=2$B2b@faq-site.example>
MUST be used to maintain compatibility.
Note that the descriptions may use [MIME3].
Security considerations: to be added
The content of the "application/news-checkgroups" body part is Note that the only reason to include more than one message identifier
defined as: in the Replaces is in case a site had missed the previous
Replacement. It is hardly necessary with such a long interval between
the postings.
checkgroups-body = *( invalidation CRLF ) 1*( valid-group CRLF ) Under the above, on systems using the version-number system (which is
invalidation = "!" groupname *( "," *WSP groupname ) optional) requests for any message identifier in the chain will
valid-group = description-line always return the most recent. As such the URL "news:examplegroup-
description-line ; see section 6.6.1.2 faq@faq-site.example" will always work, making it suitable to appear
in HTML documents.
The "application/news-checkgroups" content type is used in Example 2. A user posts a message <myuniquepart@mysite.example> to
conjunction with the "checkgroups" control message (section the net. She notices a typo and, 2 minutes later, posts with:
6.6.1.3.1).
6.6.5.1 Examples Message-ID: <myuniquepart$v=1$xxx@mysite.example>
Replaces: <myuniquepart@mysite.example> ; usage=revise
A "newgroup" with bilingual charter and policy information: 3 minutes later she sees another typo, and posts:
From: admin@example.invalid (example.all Administrator) Message-ID: <myuniquepart$v=2$yyy@mysite.example>
Newsgroups: example.admin.groups,example.admin.announce Replaces: <myuniquepart$v=1$xxx@mysite.example>
Date: 27 Feb 1997 12:50:22 +14:00 (EST) <myuniquepart@mysite.example> ; usage=revise
Subject: Group example.admin.info created.
Approved: admin@example.invalid
Control: newgroup example.admin.info moderated
Message-ID: <newgroup-example.admin.info-19970227@example.invalid>
Content-Type: multipart/news-groupinfo; boundary="nxtprt"
Content-Transfer-Encoding: 8bit
This is a MIME control message. The two bad versions will be replaced with the 3rd, even if a site
Content-Type: application/news-groupinfo never sees the 2nd due to batching or feed problems (thus the use of
two message identifiers is quite useful in this case, in
contradistinction to the first example). Requests for the original
will return the 3rd.
For your newsgroups file: 6.14. Xref
example.admin.info Information on the example.* hierarchy <info@news.org>
(Moderated)
Content-Type: multipart/alternative ; The Xref header is a local header (4.2.2.3) which indicates where an
differences = content-language ; article was filed by the last server to process it, and whether it is
boundary = nxtlang a Replacement (6.13) for an earlier article.
Article-Name: example.admin.info: charter
Content-Type: text/plain; charset=us-ascii Xref-content = [CFWS] server-name 1*( CFWS location )
Content-Transfer-Encoding: 7bit server-name = path-identity ; see 5.6.1
Content-Language: en location = newsgroup-name ":" article-locator
[ CFWS ( "revise" / "repost" )
":" article-locator ]
article-locator = 1*( %x21-7E ) ; US-ASCII printable characters
The group example.admin.info contains regularly posted information on The server-name is included so that software can determine which
the example.* hierarchy. serving agent generated the header. The locations specify what
Content-Type: text/plain; charset=us-ascii newsgroups the article was filed under (which may differ from those
Content-Transfer-Encoding: 8bit in the Newsgroups header) and where it was filed under them. The
Content-Language: de exact form of an article-locator is implementation-specific.
Die Gruppe example.admin.info enthält regelmäßig versandte NOTE: The traditional form of an article-locator is a decimal
Informationen über die example.*-Hierarchie. number, with articles in each newsgroup numbered consecutively
starting from 1. NNTP demands that such a model be provided, and
much other software expects it, but it seems desirable to permit
flexibility for unorthodox implementations.
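As an illustration of the Xref-content syntax, the following Python sketch splits a header value into its server-name and locations. It assumes that CFWS consists of plain whitespace only (comments are not handled) and that no newsgroup is itself named "revise" or "repost". The function name and the dictionary layout are invented for this example.

```python
def parse_xref(content):
    """Split an Xref-content into (server_name, list of location dicts)."""
    tokens = content.split()
    if len(tokens) < 2:
        raise ValueError("need a server-name and at least one location")
    server, locations = tokens[0], []
    for token in tokens[1:]:
        # Split on the first ":" only; an article-locator may contain ":".
        name, colon, locator = token.partition(":")
        if not (name and colon and locator):
            raise ValueError("malformed location: " + token)
        if name.lower() in ("revise", "repost"):   # case-insensitive keywords
            if not locations or locations[-1]["old"] is not None:
                raise ValueError("misplaced " + name)
            # Attach the old article-locator to the preceding location.
            locations[-1]["old"] = (name.lower(), locator)
        else:
            locations.append({"group": name, "locator": locator, "old": None})
    if not locations:
        raise ValueError("no location present")
    return server, locations
```

For instance, a value such as "news.example.com comp.lang.c:1234 revise:1230 comp.misc:77" would yield two locations, the first carrying the old locator 1230 under the "revise" keyword.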
plain "rmgroup": Whenever an Xref header is created by an agent for an article which
includes a Replaces header with "usage=revise" or "usage=repost"
(6.13), it SHOULD include, within the location field of each
newsgroup in the Newsgroups header of whichever of the old articles
referenced in that Replaces header is still current, a corresponding
"revise:<old-article-locator>" or "repost:<old-article-locator>" for
the oldest article known to be being replaced, where <old-article-
locator> is the article-locator under which that oldest article was
filed. If the Replaces header has a "usage=replace" (explicit or
implicit) the Xref header MUST NOT include any such reference to an
<old-article-locator>.
From: admin@example.invalid (example.all Administrator) NOTE: This is to enable reading agents to avoid showing that
Newsgroups: example.admin.groups, example.admin.announce article to users who have already read any of those older
Date: 4 Jul 1997 22:04 +02:00 (PST) articles (see 6.13). Because several replacements for a given
Subject: Deletion of example.admin.obsolete article may arrive in the period between attempts by a reader to
Message-ID: <rmgroup-example.admin.obsolete-19970730@example.invalid> read a given newsgroup, it is useful to include the oldest one
Approved: admin@example.invalid
Control: rmgroup example.admin.obsolete
The group example.admin.obsolete is obsolete. Please remove it from in the Xref header. The information necessary to determine this
your system. article can be obtained from the Xref header of the current
version of the article just before it is deleted. Observe that a
server that never received one of the replaced articles can
still generate suitable information from whichever earlier
version it actually has. This is why it is useful for a Replaces
header to mention more than one earlier article, especially when
replacements are being issued in quick succession.
plain "mvgroup": NOTE: "revise" and "repost" are case-insensitive.
From: admin@example.invalid (example.all Administrator) An agent inserting an Xref header into an article MUST delete any
Newsgroups: example.admin.groups, example.admin.announce previous Xref header(s). A relaying agent MAY delete it before
Date: 30 Jul 1997 22:04 +02:00 (CEST) relaying, but otherwise it SHOULD be ignored (and usually replaced)
Subject: Moving example.oldgroup to example.newgroup by any relaying or serving agent receiving it.
Message-ID: <mvgroup-example.oldgroup-19970730@example.invalid>
Approved: admin@example.invalid
Control: mvgroup example.oldgroup example.newgroup
Content-Type: multipart/news-groupinfo; boundary=nxt
Content-Type: application/news-groupinfo An agent MUST use the same server-name in Xref headers as the
path-identity it uses in Path headers.
For your newsgroups file: 6.15. Lines
example.newgroup The new replacement group.
The group example.oldgroup is replaced by example.newgroup. The Lines header indicates the number of lines in the body of the
Please update your configuration. article.
more complex "mvgroup" for a whole hierarchy: Lines-content = [CFWS] 1*digit
The charter of the group example.talk.jokes contained a reference to The line count includes all body lines, including the signature if
example.talk.jokes.d, which is also being moved. So the charter is any, including empty lines (if any) at the beginning or end of the
updated. body, and including the whole of all MIME message and multipart parts
contained in the body (the single empty separator line between the
headers and the body is not part of the body). The "body" here is the
body as found in the posted article as transmitted by the posting
agent.
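A minimal Python sketch of the line count described above, assuming the article is held as raw bytes with CRLF line endings exactly as transmitted by the posting agent. The function name is an assumption made for this example.

```python
def count_body_lines(article: bytes) -> int:
    """Count the body lines of an article given as raw CRLF-delimited bytes."""
    # The first empty line separates headers from body and is not counted.
    _headers, separator, body = article.partition(b"\r\n\r\n")
    if not separator or not body:
        return 0
    lines = body.count(b"\r\n")
    if not body.endswith(b"\r\n"):
        lines += 1          # final line without a terminator still counts
    return lines
```

Empty lines at the start or end of the body, the signature, and every line of any MIME parts are all included, because the sketch counts line terminators without interpreting the content.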
From: admin@example.invalid (example.all Administrator) This header is to be regarded as obsolete, and it will likely be
Newsgroups: example.admin.groups, example.admin.announce removed entirely in a future version of this standard. In the
Date: 30 Jul 1997 22:04 +02:00 (PST) meantime, its use is deprecated.
Subject: Deletion of example.admin.obsolete
Message-ID: <mvgroup-example.talk-19970730@example.invalid>
Approved: admin@example.invalid
Control: mvgroup example.talk.* example.conversation
Content-Type: multipart/news-groupinfo; boundary=nxt; chartas=1
Content-Type: application/news-groupinfo 6.16. User-Agent
For your newsgroups file: The User-Agent header contains information about the user agent
example.conversation.boring Boring conversations. (typically a newsreader) generating the article, for statistical
example.conversation.interesting Interesting conversations. purposes and tracing of standards violations to specific software
example.conversation.jokes Jokes and funny stuff. needing correction. Although optional, posting agents SHOULD normally
example.conversation.jokes.d Discussion about example.conversation.jokes. include this header.
Article-Name: example.conversation.jokes: charter User-Agent-content = product-token *( CFWS product-token )
product-token = value ["/" product-version] ; see 4.1
product-version = value
This group is to publish jokes and other funny stuff. This header MAY contain multiple product-tokens identifying the agent
Discussions about the articles posted here should be redirected and any subproducts which form a significant part of the posting
to example.conversation.jokes.d; adding a Followup-to: header agent, listed in order of their significance for identifying the
is recommended. application. Product-tokens should be short and to the point - they
6.6.6 Cancel MUST NOT be used for information beyond the canonical name of the
product and its version. Injecting agents MAY include product
information for servers (such as INN/1.7.2), but serving and relaying
agents MUST NOT generate or modify this header to list themselves.
The cancel message requests that one or more target articles be NOTE: Variations from [RFC 2616] which describes a similar
"canceled", i.e. withdrawn from circulation or access. This facility for the HTTP protocol:
message MAY be issued by entities which processed the target
article(s) while it was still a proto-article (i.e. posters,
posting agents, moderators and injecting agents. See also
Gateways[2.1] ). Other entities MUST NOT use this method to
remove articles.
NOTE: A separate method for other entities to cancel articles 1. use of arbitrary text or octets from character sets other
will be defined in a later draft. than US-ASCII in a product-token may require the use of a
quoted-string,
cancel-arguments = 1*( message-id CFWS ) 2. "{" and "}" are allowed in a value (product-token and
cancel-body = body product-version) in Netnews,
The argument(s) identify the article(s) to be cancelled, by 3. UTF-8 replaces ISO-8859-1 as charset assumption.
message-id. The body SHOULD contain an indication of why the
cancellation was requested. The cancel message SHOULD be posted
to the same newsgroup(s), with the same distribution(s), as the
article(s) it is attempting to cancel.
In order for a cancel message to remove an article either: NOTE: Comments should be restricted to information regarding the
product named to their left such as platform information and
should be concise. Use as an advertising medium (in the mundane
sense) is discouraged.
1. The mailing addresses from the From line of the cancel 6.16.1. Examples
message and the target article match and the target article is
otherwise unauthenticated.
2. At least one authentication method of the target article User-Agent: tin/1.2-PL2
MUST be matched by the cancel message plus the mailing addresses User-Agent: tin/1.3-950621beta-PL0 (Unix)
from the From line of the cancel message and the target article User-Agent: tin/unoff-1.3-BETA-970813 (UNIX) (Linux/2.0.30 (i486))
MAY match. User-Agent: tin/pre-1.4-971106 (UNIX) (Linux/2.0.30 (i486))
User-Agent: Mozilla/4.02b7 (X11; I; en; HP-UX B.10.20 9000/712)
User-Agent: Microsoft-Internet-News/4.70.1161
User-Agent: Gnus/5.4.64 XEmacs/20.3beta17 ("Bucharest")
User-Agent: Pluto/1.05h (RISC-OS/3.1) NewsHound/1.30
User-Agent: inn/1.7.2
User-Agent: telnet
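To illustrate how a reading agent or statistics tool might take such headers apart, here is a rough Python sketch. It assumes that product-tokens contain no whitespace or parentheses (quoted-string values are not handled) and it attaches each parenthesised comment, which may be nested, to the product on its left. The function name and the tuple layout are choices made for this example.

```python
def parse_user_agent(content):
    """Return a list of (product, version or None, [comments]) tuples."""
    products, i, n = [], 0, len(content)
    while i < n:
        ch = content[i]
        if ch.isspace():
            i += 1
        elif ch == "(":
            start, depth = i, 0
            while i < n:
                if content[i] == "\\":      # skip a quoted-pair
                    i += 2
                    continue
                if content[i] == "(":
                    depth += 1
                elif content[i] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                i += 1
            if i >= n:
                raise ValueError("unbalanced comment")
            i += 1                          # step past the closing ")"
            if not products:
                raise ValueError("comment before any product-token")
            products[-1][2].append(content[start:i])
        else:
            start = i
            while i < n and not content[i].isspace() and content[i] != "(":
                i += 1
            # Only the first "/" separates the product from its version.
            name, _slash, version = content[start:i].partition("/")
            products.append((name, version or None, []))
    return products
```

Applied to the Gnus example above, the sketch yields two products, with the comment ("Bucharest") attached to XEmacs.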
NOTE: The Sender, From or Approved headers MUST NOT be used as NOTE: This header supersedes the role performed redundantly by
an "authentication method" within the meaning of the previous experimental headers such as X-Newsreader, X-Mailer, X-Posting-
paragraph. If the above conditions are satisfied then the Agent, X-Http-User-Agent, and other headers previously used on
relaying or serving agent SHOULD delete the target article Usenet for this purpose. Use of these experimental headers
completely and immediately (or at the minimum make the article SHOULD be discontinued in favor of the single, standard User-
unavailable for relaying or serving) and also SHOULD reject any Agent header which can be used freely both in Netnews and mail.
copies of this article that appear. See also section 7 on
duties of Serving and Relaying agents.
6.6.7 ihave, sendme 6.17. MIME headers
The ihave and sendme control messages implement a crude 6.17.1. Syntax
batched predecessor of the NNTP [rrr] protocol. They are
largely obsolete in the Internet, but still see use in the UUCP
environment, especially for backup feeds that normally are
active only when a primary feed path has failed.
NOTE: The ihave and sendme messages defined here have The following headers, as defined within [RFC 2045] and its
ABSOLUTELY NOTHING TO DO WITH NNTP, despite similarities of extensions, may be used within articles conforming to this standard.
terminology.
The two messages share the same syntax: MIME-Version:
Content-Type:
Content-Transfer-Encoding:
Content-ID:
Content-Description:
ihave-arguments = *( message-id space ) relayer-name
sendme-arguments = ihave-arguments
ihave-body = *( message-id CRLF )
sendme-body = ihave-body
Message IDs MUST appear in either the arguments or the body, but Content-Disposition:
NOT both. Relayers SHOULD generate the form putting message Content-MD5:
IDs in the body, but the other form MUST be supported for
backward compatibility.
The ihave message states that the named relaying agent has Insofar as the syntax for these headers, as given in [RFC 2045], does
received articles with the specified message IDs, which may be not specify precisely where whitespace and comments may occur
of interest to the relaying agents receiving the ihave message. (whether in the form of WSP, FWS or CFWS), the usage defined in this
The sendme message requests that the agent receiving it send standard, and failing that in [MESSFOR], and failing that in [RFC
the articles having the specified message IDs to the named 822] MUST be followed. In particular, there MUST NOT be any WSP
relaying agent. between a header-name and the following colon and there MUST be a SP
following that colon.
These control messages are normally sent essentially as The meaning of the various MIME headers is as defined in [RFC 2045]
point-to-point messages, by using "to." newsgroups (see section and [RFC 2046], and in extensions registered in accordance with [RFC
5.5.1) that are sent only to the relaying agent the messages are 2048]. However, their usage is curtailed as described in the
intended for. The two relaying agents MUST be neighbors, following sections.
exchanging news directly with each other. Each relaying agent
advertises its new arrivals to the other using ihave messages,
and each uses sendme messages to request the articles it lacks.
To reduce overhead, ihave and sendme messages SHOULD be sent 6.17.2. Content-Transfer-Encoding
relatively infrequently and SHOULD contain reasonable numbers
of message IDs. If ihave and sendme are being used to implement
a backup feed, it may be desirable to insert a delay between
reception of an ihave and generation of a sendme, so that a
slightly slow primary feed will not cause large numbers of
articles to be requested unnecessarily via sendme.
6.6.8 Obsolete control messages. Posting agents SHOULD specify "Content-Transfer-Encoding: 8bit" for
all articles not written in pure US-ASCII and not requiring full
binary. They MAY use "8bit" encoding even when "7bit" encoding would
have sufficed. They SHOULD specify "base64" when the content type
implies binary (i.e. content intended for machine, rather than human,
consumption).
The following forms of control messages are declared obsolete NOTE: If a future extension to the MIME standards were to
by this document: provide a more compact encoding of binary suited to transport
over an 8bit channel, it could be considered as an alternative
to base64 once it had gained widespread acceptance.
sendsys Posting agents SHOULD NOT specify encoding "quoted-printable", but
version reading agents MUST interpret that encoding correctly. Encoding
whogets "binary" MUST NOT be used (except in cooperating subnets with
senduuname alternative transport arrangements) because this standard does not
mandate a transport mechanism that could support it.
6.7. Distribution Injecting and relaying agents MUST NOT change the encoding of
articles passed to them. Gateways SHOULD ONLY change the encoding if
absolutely necessary.
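The posting-agent recommendations for Content-Transfer-Encoding can be summarised in a minimal Python sketch. The function name and the is_binary flag are assumptions made for illustration (a real agent would derive "binary" from the content type), and the sketch ignores the line-length limits that also apply to "7bit" and "8bit".

```python
def choose_transfer_encoding(body: bytes, is_binary: bool) -> str:
    """Pick a Content-Transfer-Encoding for a posting agent (illustrative)."""
    if is_binary:
        # Content intended for machine consumption.
        return "base64"
    if all(octet < 0x80 for octet in body):
        # Pure US-ASCII; "8bit" would also be permitted here.
        return "7bit"
    # Text that is not pure US-ASCII.
    return "8bit"
```

The sketch never returns "quoted-printable" or "binary", in line with the SHOULD NOT and MUST NOT above.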
6.7.1 Historical Note 6.17.3. Content-Type
The original Distribution header provided a means to limit The Content-Type: "text/plain" is the default type for any news
the distribution of articles to a subset of the sites which article, but the recommendations and limits on line lengths set out
received the newsgroups it was posted to. It was designed in section 4.5 SHOULD be observed. The acceptability of other
to control a feed. Each site feeding other sites would, for subtypes of Content-Type: "text" (such as "text/html") is a matter of
each feed, configure the list of distributions appropriate policy (see 1.1), and posters SHOULD NOT use them unless established
to send to that site. If an article had a Distribution policy or custom in the particular hierarchies or groups involved so
header, a check would be made to see if any of the allows. Moreover, even in those cases, the material SHOULD, for the
distributions in the header matched the distribution list benefit of readers who see it only in its transmitted form, be
for the feed. "pretty-printed" so as to keep it within the line lengths recommended
in section 4.5, and to keep any sequences which control its layout or
style separate from the meaningful text.
Sadly, this list was often configured in the form "all
distributions except the following" where the local
distributions would be listed.
This meant an unknown distribution, leaked from an external In the same way, Content-Types requiring special processing for their
site, would match the "all distributions" and get fed out. display, such as "application", "image", "audio", "video" and
This meant that once an article leaked out from a "multipart/related" are discouraged except in groups specifically
distribution's subnet, it flooded the entire net, or at intended (by policy or custom) to include them. Exceptionally, those
least the very large subset that used "all but these" style application types defined in [RFC 1847] and [RFC 2015] for use within
of configuring the feed. "multipart/signed" articles, and the type "application/pgp-keys" (or
other similar types containing digital certificates) may be used
freely but, contrary to [RFC 2015] and unless the article is intended
to be sent by mail also, the Content-Transfer-Encoding SHOULD be left
as "8bit" (or "7bit" as appropriate).
Indeed, many sites deliberately wanted this flood. Hub Reading agents SHOULD NOT, unless explicitly configured otherwise,
sites at national and multinational ISPs wanted to receive act automatically on Application types which could change the state
all the local distributions, for the use of their users in of that agent (e.g. by writing or modifying files), except in the
the individual geographic regions. This assured netwide case of those prescribed for use in control messages (7.1.2 and
propagation of all distributions, defeating the purpose of ).
the header. It became close to valueless.
6.7.1.1 New Semantics 6.17.3.1. Message/partial
While distributions SHOULD still control feeds as they do, The Content-Type "message/partial" MAY be used to split a long news
they SHOULD also be associated with the site. Each site article into several smaller ones, but this usage is discouraged on
SHOULD maintain a list of the distributions to which it is a the grounds that modern transport agents should have no difficulty in
"member." Newsreaders SHOULD also allow the user to handling articles of arbitrary length.
maintain a list of distributions to which the user is a
member.
Newsreaders MAY also keep track of distributions the user However, IF this feature is used, then the "id" parameter SHOULD be
wishes to belong to. In this event, they should examine the in the form of a unique message identifier (but different from that
Distribution headers of articles to be presented to the in the Message-ID header of any of the parts). Contrary to the
user, and SHOULD NOT display them if the user does not requirements specified in [RFC 2046], the Transfer-Encoding SHOULD be
belong to any of the distributions named. set to "8bit" at least in each part that requires it. The second and
subsequent parts SHOULD contain References headers referring to all
the previous parts, thus enabling reading agents with threading
capabilities to present them in the correct order. Reading agents MAY
then provide a facility to recombine the parts into a single article
(but this standard does not require them to do so).
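The header layout for a split article can be sketched in Python as follows. This is an illustration only: partial_headers is a hypothetical helper, the caller is assumed to supply the Message-IDs of the parts and the shared "id" value, and the "number" and "total" parameters are those defined for message/partial in [RFC 2046].

```python
def partial_headers(part_ids, shared_id):
    """Build the distinguishing headers for each part of a split article.

    part_ids:  Message-IDs of the parts, in order (assumed unique).
    shared_id: value of the "id" parameter common to all parts.
    """
    if shared_id in part_ids:
        raise ValueError("id parameter must differ from every Message-ID")
    total = len(part_ids)
    headers = []
    for number, message_id in enumerate(part_ids, start=1):
        part = {
            "Message-ID": message_id,
            "Content-Type": 'message/partial; id="%s"; number=%d; total=%d'
                            % (shared_id, number, total),
        }
        if number > 1:
            # Refer to all previous parts so threading agents can order them.
            part["References"] = " ".join(part_ids[:number - 1])
        headers.append(part)
    return headers
```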
6.7.1.2 Planned Uses 6.17.3.2. Message/rfc822
Distributions can now be used to define rigid subsets of the The Content-Type "message/rfc822" should be used for the
net that sites can "subscribe" to. For example, say a party encapsulation (whether as part of another news article or, more
wishes to issue 3rd party cancel messages that delete spam usually, as part of a mail message) of complete news articles which
or net abuse at sites which wish to listen to that have already been posted to Netnews and which are for the information
canceller. These messages would now be posted to a specific of the recipient, and do not constitute a request to repost them.
distribution. They might still reach the entire net, and
would make it to hubs, but they would only have effect at
sites which explicitly took membership in the distribution,
even without authentication.
However, as these might be very high volume messages -- In the case where the encapsulated article has Content-Transfer-
especially if there are many such 3rd party cancel services Encoding "8bit", it will be necessary to change that encoding if it
-- it remains possible for sites to ask their feeders to not is to be forwarded over some mail transport that only supports
even feed articles in this distribution, thus making the "7bit". However, this should not be necessary for any mail transport
system efficient. that supports the 8BITMIME feature [SMTP]. Moreover, where the
headers of the encapsulated article contain any UTF8-xtra-chars
(2.4), it may not be possible to transport them over mail transports
even where 8BITMIME is supported. In such cases, it will be necessary
to encode those headers as provided in [RFC 2047] (notwithstanding
that such usage is deprecated for news headers by this standard, and
actually forbidden in the case of the Newsgroups header).
6.7.2 Definition
The Distribution header specifies geographical or In the event that the encapsulated article has to be encoded for
organizational limits to an article's propagation: either of these reasons, it may be necessary to reverse that encoding
if certain forms of digital signatures have been employed, or if the
article is to be reintroduced into some Netnews system (however, in
the latter case, the Content-Type "application/news-transmission"
should have been used instead).
Distribution-content = distribution *( dist-delim distribution) NOTE: It is likely, though not guaranteed, that headers
dist-delim = "," containing UTF8-xtra-chars will pass safely through mail
distribution = positive-distribution / negative-distribution transports supporting 8BITMIME if the "message/rfc822" object is
positive-distribution = *FWS distribution-name *FWS sent as an attachment (i.e. as a part of a multipart) rather
negative-distribution = *FWS "!" distribution-name *FWS than as the top-level body of the mail message. Moreover, it is
distribution-name = 1*letter anticipated that future extensions to the mail standards will
permit headers containing UTF8-xtra-chars to be carried without
further ado over conforming transports.
[In fact, of current transports supporting 8BITMIME, only sendmail will
have problems with UTF-8 in top-level headers.]
[That is more restrictive than Henry, omitting '+', '-' and 6.17.3.3. Message/external-body
'_', but more liberal in allowing uppercase letters, which in
fact are commonly used, and in not specifying any 14 character
limit.]
A distribution is case-insensitive (i.e. "US", "Us" and "us" The Content-Type "message/external-body" could be appropriate for
all specify the same distribution). In the absence of a texts which it would be uneconomic (in view of the likely readership)
Distribution header, the default Distribution-content is to distribute to the entire network.
"world". However, "world" SHOULD NOT be explicitly mentioned
unless a negative-distribution is also present, as in
Distribution: world, !us "All" MUST NOT be used as a
distribution-name.
Articles MUST NOT be passed between relaying agents unless the 6.17.3.4. Multipart types
sending agent has been configured to supply and the receiving
agent has requested to receive BOTH of (a) at least one of the
newsgroups in the article's Newsgroups header, and (b) at
least one of the positive-distributions in the article's
Distribution header and none of the negative-distributions.
Exceptionally, ALL relaying agents are deemed willing to
supply or accept the distribution "world", and NO relaying
agent should supply or accept the distribution "local".
Posting agents SHOULD NOT provide a default Distribution The Content-Types "multipart/mixed", "multipart/parallel" and
header without giving the poster an opportunity to override "multipart/signed" may be used freely in news articles. However,
it. Followup agents SHOULD initially supply the same except where policy or custom so allows, the Content-Type:
Distribution header as found in the precursor. "multipart/alternative" SHOULD NOT be used, on account of the extra
bandwidth consumed and the difficulty of quoting in followups, but
reading agents MUST accept it.
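The distribution half of the relaying rule in the earlier draft text (condition (b), together with the special cases "world" and "local") can be sketched in Python for a single agent. The function name and the representation of an agent's configuration as a list of names are assumptions; a header consisting only of negative distributions is treated here as matching nothing, which the text does not specify.

```python
def distribution_allows(header_value, site_distributions):
    """Check one agent's distribution list against a Distribution header.

    header_value:       Distribution-content, or None if the header is absent.
    site_distributions: names this agent is configured to supply or accept.
    """
    # Every agent is deemed willing to handle "world"; none handles "local".
    site = {name.lower() for name in site_distributions} | {"world"}
    site.discard("local")

    positive, negative = set(), set()
    # An absent header defaults to "world".
    for item in (header_value or "world").split(","):
        name = item.strip().lower()      # distributions are case-insensitive
        if not name:
            continue
        if name.startswith("!"):
            negative.add(name[1:].strip())
        else:
            positive.add(name)
    return bool(positive & site) and not (negative & site)
```

Under the rule as stated, the check has to succeed for both the sending and the receiving agent, and the newsgroup condition (a) has to hold as well.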
All the two-letter country names (e.g. "us") commonly used as The Content-Type: "multipart/digest" is commended for any article
top-level domain names may be used as distributions, but the composed of multiple messages more conveniently viewed as separate
common non-country top-level domain names (such as "edu" and entities. The "boundary" should be composed of 28 hyphens (US-ASCII
"com") are NOT distributions, moreover top-level 45) (which makes each boundary delimiter 30 hyphens, or 32 for the
newsgroup-names (such as "comp" and "soc") are NOT final one) so as to accord with current practice for digests [RFC
distributions. Apart from the above, distribution-names are a 1153].
matter for negotiation between the relaying agents or [Actually, this conflicts with some present digest usage (such as the
cooperating subnets involved. news.answers rules), but should still be the right way to go. I suggest
this is left in for now (just to stake a claim), while we discuss the
matter with the news.answers moderators and the faq-maintainers.]
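The hyphen counts given for the "multipart/digest" boundary follow from the MIME rule that each delimiter is the boundary prefixed by two hyphens, with two more appended on the final one. A minimal Python check of that arithmetic (the names BOUNDARY and delimiter are illustrative only):

```python
# The recommended boundary parameter: 28 hyphens.
BOUNDARY = "-" * 28

def delimiter(final: bool = False) -> str:
    """Return the delimiter line (without CRLF) for the digest boundary."""
    return "--" + BOUNDARY + ("--" if final else "")
```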
6.8. Keywords 6.17.4. Character Sets
The Keywords field contains a comma separated list of In principle, any character set may be specified in the "charset="
important words and phrases intended to describe some aspect parameter of a content type. However, character sets other than "us-
of the content of the article. The format of the Keywords ascii", "iso-8859-1" (and the corresponding parts of UTF-8) ought
header is defined in the Message Format Standard [MESSFOR] . only to be used in hierarchies where the language customarily used so
requires (and whose readers could be expected to possess agents
capable of displaying them).
NOTE: The list is comma separated, NOT space separated.
6.9. Summary 6.17.5. Content Disposition
The Summary header content is a short phrase summarizing the Reading agents SHOULD honour any Content-Disposition header that is
article's content. provided (in particular, they SHOULD display any part of a multipart
for which the disposition is "inline", possibly distinguished from
adjacent parts by some suitable separator). In the absence of such a
header, the body of an article or any part of a multipart with
Content-Type "text" SHOULD be displayed inline. Followup agents which
quote parts of a precursor (see 4.3.2) SHOULD initially include all
parts of the precursor that were displayed inline, as if they were a
single part.
summary-content = non-blank-text CRLF 6.17.6. Definition of some new Content-Types
non-blank-text = 1*(FWS text)
The summary SHOULD be terse. Authors SHOULD avoid trying to This standard defines (or redefines) several new Content-Types, which
cram their entire article into the headers; even the require to be registered with IANA as provided for in [RFC 2048].
simplest query usually benefits from a sentence or two of For "application/news-groupinfo" see 7.1.2, for "application/news-
elaboration and context, and not all reading agents display checkgroups" see 7.4.1, and for "application/news-transmission" see
all headers. On the other hand the summary should give more the following section.
detail than the Subject.
6.10. Approved 6.17.6.1. Application/news-transmission
The Approved header content indicates the mailing addresses The Content-Type "application/news-transmission" is intended for the
(and possibly the full names) of the persons or entities encapsulation of complete news articles where the intention is that
approving the article for posting: the recipient should then inject them into Netnews. This Application
type SHOULD be used when mailing articles to moderators and to mail-
to-news gateways (see 8.2.2).
Approved-content = From-content NOTE: The benefit of such encapsulation is that it removes
possible conflict between news and email headers and it provides
a convenient way of "tunnelling" a news article through a
transport medium that does not support 8bit characters.
An Approved header is required in all postings to moderated The MIME content type definition of "application/news-transmission"
newsgroups. If this header is not present then relaying and is:
serving agents MUST reject the article.
An Approved header is also required in certain control MIME type name: application
messages, to reduce the probability of accidental posting of MIME subtype name: news-transmission
same; see the relevant parts of section 6.6. Required parameters: none
Optional parameters: usage=moderate
usage=inject
usage=relay
Encoding considerations: A transfer-encoding (such as Quoted-
Printable or Base64) different from that of
the article transmitted MAY be supplied
(perhaps en route) to ensure correct
transmission over some 7bit transport
medium.
Security considerations: A news article may be a "control message",
which could have effects on the recipient
host's system beyond just storage of the
article. However, such control messages
also occur in normal news flow, so most
hosts will already be suitably defended
Please see section 7.1 on how injecting agents should treat against undesired effects.
posts to moderated groups that do not contain this header. Published specification: [USEFOR]
Body part: A complete article or proto-article, ready
for injection into Netnews, or a batch of
such articles.
6.11 Lines NOTE: It is likely that the recipient of an "application/news-
transmission" will be a specialised gateway (e.g. a moderator's
submission address) able to accept articles with only one of the
three usage parameters "moderate", "inject" and "relay", hence
the reason why they are optional, being redundant in most
situations. Nevertheless, they MAY be used to signify the
originator's intention with regard to the transmission, so
removing any possible doubt.
The Lines header content indicates the number of lines in the When the parameter "relay" is used, or implied, the body part MAY be
body of the article: a batch of articles to be transmitted together, in which case the
following syntax MUST be used.
Lines-content = 1*digit batch = 1*( batch-header article )
batch-header = "#!" SP "rnews" SP article-size CRLF
article-size = 1*digit
The line count includes all body lines, including the where the "rnews" is case-sensitive. Thus a batch is a sequence of
signature if any, including empty lines (if any) at beginning articles, each prefixed by a header line that includes its size. The
or end of the body. (The single empty separator line between article-size is a decimal count of the octets in the article,
the headers and the body is not part of the body) . The "body" counting each CRLF as one octet regardless of how it is actually
here is the body as found in the posted article as transmitted represented.
by the posting agent.
Software SHOULD NOT use the value of Lines for any purpose other NOTE: Despite the similarity of this format to an executable
than to display an estimate to humans. This header will be UNIX script, it is EXTREMELY unwise to feed such a batch into a
deprecated in a future RFC. command interpreter in anticipation of it running a command
named "rnews"; the security implications of so doing would be
disastrous.
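The batch syntax above can be read with a short Python sketch. It is illustrative only: split_batch is a hypothetical name, and the sketch first normalises CRLF to a single LF so that the article-size, which counts each line ending as one octet, can be applied directly as a byte count. The batch is treated purely as data and is never passed to a command interpreter.

```python
def split_batch(data: bytes):
    """Split an rnews batch into its articles (returned with LF line endings)."""
    # article-size counts each CRLF as one octet, so normalise first.
    data = data.replace(b"\r\n", b"\n")
    articles = []
    pos = 0
    while pos < len(data):
        eol = data.find(b"\n", pos)
        if eol == -1:
            raise ValueError("truncated batch header")
        fields = data[pos:eol].split(b" ")
        # batch-header = "#!" SP "rnews" SP article-size; "rnews" is case-sensitive.
        if (len(fields) != 3 or fields[0] != b"#!"
                or fields[1] != b"rnews" or not fields[2].isdigit()):
            raise ValueError("bad batch header")
        start = eol + 1
        end = start + int(fields[2])
        if end > len(data):
            raise ValueError("truncated article")
        articles.append(data[start:end])
        pos = end
    return articles
```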
6.12 Xref 6.17.6.2. Message/news withdrawn
The Xref header content indicates where an article was filed The Content-Type "message/news", as previously registered with IANA,
by the last server to process it: is hereby obsoleted and should be withdrawn. It was never widely
implemented, and its default treatment as "application/octet-stream"
by agents that did not recognise it was counterproductive. The
Content-Type "message/rfc822" SHOULD be used in its place, as already
described above.
Xref-content = server 1*( CFWS location ) 6.18. Obsolete Headers
server = server-name
location = newsgroup-name ":" article-locator
article-locator = 1*
The serving agent's name is included so that software can Persons writing new agents SHOULD ignore any former meanings of the
determine which serving agent generated the header. The following headers:
locations specify what newsgroups the article was filed under
(which may differ from those in the Newsgroups header) and
where it was filed under them. The exact form of an article
locator is implementation-specific.
NOTE: The traditional form of an article locator is a decimal Also-Control
number, with articles in each newsgroup numbered consecutively See-Also
starting from 1. NNTP demands that such a model be Article-Names
provided, and there may be other software which expects it, Article-Updates
but it seems desirable to permit flexibility for unorthodox
implementations.
An agent inserting an Xref header into an article MUST delete 7. Control Messages
any previous Xref header(s). A relaying agent MUST only create
and/or relay an Xref header if it is correct on all the receiving
agents the article is forwarded to. Serving agents SHOULD
insert this header unless the information in it (apart from
the serving name) is correct in which case it should be left
unchanged.
An agent MUST use the same name in Xref headers as it uses in The following sections document the control messages. "Message" is
Path headers. used herein as a synonym for "article" unless context indicates
otherwise. Group control messages are the sub-class of control
messages that request some update to the configuration of the groups
known to a serving agent, namely "newgroup", "rmgroup", "mvgroup"
and "checkgroups", plus any others created by extensions to this
standard.
6.13 Organization All of the group control messages MUST have an Approved header
(6.12). Moreover, in those hierarchies where appropriate
administrative agencies exist (see 1.1), group control messages
SHOULD NOT be issued except as authorized by those agencies.
[They SHOULD also use one of the authentication mechanisms which we
shall define when we get a Round Tuit.]
The Organization header content is a short phrase identifying The Newsgroups header of each control message MUST include the
the author's organization: newsgroup-name(s) for the group(s) affected (i.e. groups to be
created, modified or removed, or containing articles to be canceled).
This is to ensure that the message propagates to all sites which
receive (or would receive) that group(s). It MAY include other
newsgroup-names so as to improve propagation (but this practice
should be regarded as exceptional rather than normal).
organization-content = nonblank-text CRLF The descriptions below are generally phrased in terms suggesting
mandatory actions, but any or all of these MAY be subject to local
administrative restrictions, and MAY be denied or referred to an
administrator for approval (either as a class or on a case-by-case
basis). Analogously, where the description below specifies that a
message or portion thereof is to be ignored, this action MAY include
reporting it to an administrator.
NOTE: Posting and injection agents are discouraged from Relaying Agents MUST propagate even control messages that they do not
providing a default value for this header unless it is understand.
acceptable to all posters using these agents. Unless this
header contains useful information ( including some indication
of the author's physical location) posters are discouraged from
including it.
6.14 User-Agent In the following sections, each type of control message is defined
syntactically by defining its verb, its arguments, and possibly its
body.
The User-Agent header contains information about the user 7.1. The 'newgroup' Control Message
agent (typically a newsreader) generating the article. This is
for statistical purposes and tracing of standards violations
to specific software needing correction. Although OPTIONAL,
user agents SHOULD include this header with the articles they
generate.
The field MAY contain multiple product tokens and comments newgroup-verb = "newgroup"
identifying the agent and any subproducts which form a newgroup-arguments = CFWS newsgroup-name [ CFWS newgroup-flag ]
significant part of the user agent such as external agents newgroup-flag = "moderated"
used for message composition, separated injecting agents (such
as those used by offline newsreaders), and significant
libraries that are part of such agents. The products are
listed in order of their significance for identifying the
application, not necessarily in chronological order of
handling prior to injection. Injecting agents MAY include
product information for servers (such as INN/1.7.2), but
servers MUST NOT generate or modify this header to list
themselves.
User-Agent MUST NOT be modified after injection, but MAY be The "newgroup" control message requests that the specified group be
stripped or have its contents replaced prior to re-injection created or changed. The newgroup-flag "moderated" is appended to mark
by another user agent such as an anonymizing gateway. the group as moderated. The absence of this flag marks the group as
unmoderated. "Moderated" is the only such flag defined by this
standard; other flags MAY be defined for use in cooperating subnets,
but newgroup messages containing them MUST NOT be acted on outside of
those subnets.
User-Agent = "User-Agent:" SP User-Agent-content
User-Agent-content = product *(CFWS product) [CFWS]
At least one product MUST be present. The first token MUST NOT NOTE: Specifically, some alternative flags such as "y" and "m",
be a comment. Comments relate to the previously named product, which are sent and recognised by some current software, are NOT
not the product following it. part of this standard. Moreover, some existing implementations
treat any flag other than "moderated" as indicating an
unmoderated newsgroup. Both of these usages are contrary to this
standard.
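The newgroup syntax and the treatment of flags can be illustrated with a small Python sketch. The function name is hypothetical, the verb is assumed to match exactly as written, and comments permitted by CFWS are not handled; only simple whitespace separation is assumed. Flags other than "moderated" are rejected rather than treated as "unmoderated".

```python
def parse_newgroup(control: str):
    """Parse a newgroup Control header value into (newsgroup_name, moderated)."""
    fields = control.split()
    if len(fields) not in (2, 3) or fields[0] != "newgroup":
        raise ValueError("not a well-formed newgroup control message")
    name = fields[1]
    if len(fields) == 3:
        if fields[2] != "moderated":
            # "y", "m" and other flags are not part of the standard.
            raise ValueError("unknown newgroup-flag: %s" % fields[2])
        return name, True
    # Absence of the flag marks the group as unmoderated.
    return name, False
```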
product = token ["/" product-version] product-version = token The message body comprises or includes an "application/news-groupinfo"
(7.1.2) part containing machine- and human-readable information about
the group.
Product tokens should be short and to the point -- they MUST The newsgroup-name MUST conform to all requirements set out in
NOT be used for information beyond the canonical name of the section 5.5, and it is the responsibility of the newgroup message
product and its version. Although any token character MAY issuer to ensure this (since some of those requirements are hard to
appear in a product-version, this token SHOULD be used only enforce mechanically). Moreover, the newsgroup-name SHOULD conform to
for a version identifier (i.e., successive versions of the whatever policies have been established by the administrative agency,
same product SHOULD differ only in the product-version portion if any, for that hierarchy.
of the product value). Product tokens MUST identify products.
NOTE: Variations from RFC 1945: The newgroup command is also used to update the newsgroups-line or
the moderation status of a group.
1. product token is required and MUST be first, 7.1.1. The Body of the 'newgroup' Control Message
2. use of other text in the syntactic usage of the product The body of the newgroup message contains the following subparts,
token which is not a token is forbidden, preferably in the order shown:
3. comment allows quoted-pair, 1. An "application/news-groupinfo" part (7.1.2) containing the name
and newsgroups-line of the group(s). This part MUST be present.
4. "{" and "}" are allowed in token (product and 2. Other parts containing useful information about the background of
product-version) in news, the newgroup message (typically of type "text/plain").
5. octets from character sets other than ASCII ar