draft-ietf-usefor-article-06.txt   draft-ietf-usefor-article-07.txt 
INTERNET-DRAFT Charles H. Lindsey INTERNET-DRAFT Charles H. Lindsey
Usenet Format Working Group University of Manchester Usenet Format Working Group University of Manchester
November 2001 May 2002
News Article Format News Article Format
<draft-ietf-usefor-article-06.txt> <draft-ietf-usefor-article-07.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026. all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 1, line 34 skipping to change at page 1, line 34
progress." progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Abstract Abstract
This Draft defines the format of Netnews articles and specifies This Draft is intended as a standards track document, obsoleting
the requirements to be met by software which originates, RFC 1036, which itself dates from 1987.
distributes, stores and displays them. It is intended as a
standards track document, superseding RFC 1036, which itself dates This Standard defines the format of Netnews articles and specifies
from 1987. the requirements to be met by software which originates, distributes,
stores and displays them.
Since the 1980s, Usenet has grown explosively, and many Internet and Since the 1980s, Usenet has grown explosively, and many Internet and
non-Internet sites now participate. In addition, this technology is non-Internet sites now participate. In addition, the Netnews
now in widespread use for other purposes. technology is now in widespread use for other purposes.
Backward compatibility has been a major goal of this endeavour, but Backward compatibility has been a major goal of this endeavour, but
where this standard and earlier documents or practices conflict, this where this standard and earlier documents or practices conflict, this
standard should be followed. In most such cases, current practice is standard should be followed. In most such cases, current practice is
already compatible with these changes. already compatible with these changes.
[The use of the words "this standard" within this document when [The use of the words "this standard" within this document when
referring to itself does not imply that this draft yet has pretensions referring to itself does not imply that this draft yet has pretensions
to be a standard, but rather indicates what will become the case if and to be a standard, but rather indicates what will become the case if and
when it is accepted as an RFC with the status of a proposed or draft when it is accepted as an RFC with the status of a proposed or draft
standard.] standard.]
News Article Format November 2001 News Article Format May 2002
[Remarks enclosed in square brackets and aligned with the left margin, [Remarks enclosed in square brackets and aligned with the left margin,
such as this one, are not part of this draft, but are editorial notes to such as this one, are not part of this draft, but are editorial notes to
explain matters amongst ourselves, or to point out alternatives, or to explain matters amongst ourselves, or to point out alternatives, or to
indicate work yet to be done.] assist the RFC Editor.]
[Please note that this Draft describes "Work in Progress". Much remains [In this draft, references to [NNTP] are to be replaced by [RFC 977], or
to be done, though the material included so far is unlikely to change in else by references to the RFC arising from the series of drafts draft-
any major way.] ietf-nntpext-base-*.txt, in the event that such RFC has been accepted at
the time this document is published.]
[Please note that this Draft is now close to Last Call, and the material
included here is unlikely to change in any major way.]
Table of Contents Table of Contents
1. Introduction .................................................. 5 1. Introduction .................................................. 6
1.1. Basic Concepts ............................................ 5 1.1. Basic Concepts ............................................ 6
1.2. Objectives ................................................ 6 1.2. Objectives ................................................ 7
1.3. Historical Outline ........................................ 6 1.3. Historical Outline ........................................ 7
1.4. Transport ................................................. 6 1.4. Transport ................................................. 7
2. Definitions, Notations and Conventions ........................ 6 2. Definitions, Notations and Conventions ........................ 8
2.1. Definitions ............................................... 7 2.1. Definitions ............................................... 8
2.2. Textual Notations ......................................... 8 2.2. Textual Notations ......................................... 9
2.3. Relation To Mail and MIME ................................. 10 2.3. Relation To Email and MIME ................................ 10
2.4. Syntax Notation ........................................... 10 2.4. Syntax .................................................... 11
2.5. Language .................................................. 13 2.4.1. Syntax Notation ....................................... 11
3. Changes to the existing protocols ............................. 13 2.4.2. Syntax adapted from Email and MIME .................... 11
3.1. Principal Changes ......................................... 13 2.4.3. Syntax copied from other standards .................... 13
3.2. Transitional Arrangements ................................. 14 2.5. Language .................................................. 14
4. Basic Format .................................................. 15 3. Changes to the existing protocols ............................. 15
4.1. Syntax of News Articles ................................... 15 3.1. Principal Changes ......................................... 15
4.2. Headers ................................................... 16 3.2. Transitional Arrangements ................................. 15
4.2.1. Names and Contents .................................... 16 4. Basic Format .................................................. 17
4.2.2. Header Properties ..................................... 18 4.1. Syntax of News Articles ................................... 17
4.2.2.1. Experimental Headers .............................. 18 4.2. Headers ................................................... 18
4.2.2.2. Inheritable Headers ............................... 18 4.2.1. Naming of Headers ..................................... 18
4.2.2.3. Variant Headers ................................... 18 4.2.2. MIME-style Parameters ................................. 19
4.2.3. White Space and Continuations ......................... 19 4.2.3. White Space and Continuations ......................... 20
4.2.4. Comments .............................................. 20 4.2.4. Comments .............................................. 21
4.2.5. Undesirable Headers ................................... 20 4.2.5. Header Properties ..................................... 22
4.3. Body ...................................................... 21 4.2.5.1. Experimental Headers .............................. 22
4.3.1. Body Format Issues .................................... 21 4.2.5.2. Inheritable Headers ............................... 22
4.3.2. Body Conventions ...................................... 21 4.2.5.3. Variant Headers ................................... 23
4.4. Characters and Character Sets ............................. 23 4.2.6. Undesirable Headers ................................... 23
4.4.1. Character Sets within Article Headers ................. 23 4.3. Body ...................................................... 23
4.4.2. Character Sets within Article Bodies .................. 25 4.3.1. Body Format Issues .................................... 23
4.5. Size Limits ............................................... 25 4.3.2. Body Conventions ...................................... 24
4.6. Example ................................................... 26 4.4. Characters and Character Sets ............................. 26
5. Mandatory Headers ............................................. 27 4.4.1. Character Sets within Article Headers ................. 26
5.1. Date ...................................................... 27 4.4.2. Character Sets within Article Bodies .................. 27
5.1.1. Examples .............................................. 28 4.5. Size Limits ............................................... 28
5.2. From ...................................................... 28 4.6. Example ................................................... 29
5.2.1. Examples: ............................................ 28 5. Mandatory Headers ............................................. 29
5.3. Message-ID ................................................ 29 News Article Format May 2002
5.4. Subject ................................................... 29
5.4.1. Examples .............................................. 30
5.5. Newsgroups ................................................ 31
5.5.1. Forbidden newsgroup names ............................. 35 5.1. Date ...................................................... 30
5.6. Path ...................................................... 36 5.1.1. Examples .............................................. 30
5.6.1. Format ................................................ 36 5.2. From ...................................................... 30
5.6.2. Adding a path-identity to the Path header ............. 37 5.2.1. Examples: ............................................ 31
5.6.3. The tail-entry ........................................ 38 5.3. Message-ID ................................................ 32
5.6.4. Delimiter Summary ..................................... 38 5.4. Subject ................................................... 33
5.6.5. Suggested Verification Methods ........................ 39 5.4.1. Examples .............................................. 34
5.6.6. Example ............................................... 40 5.5. Newsgroups ................................................ 34
6. Optional Headers .............................................. 41 5.5.1. Forbidden newsgroup names ............................. 39
6.1. Reply-To .................................................. 41 5.6. Path ...................................................... 39
6.1.1. Examples .............................................. 42 5.6.1. Format ................................................ 39
6.2. Sender .................................................... 42 5.6.2. Adding a path-identity to the Path-header ............. 40
6.3. Organization .............................................. 42 5.6.3. The tail-entry ........................................ 41
6.4. Keywords .................................................. 42 5.6.4. Path-Delimiter Summary ................................ 42
6.5. Summary ................................................... 42 5.6.5. Suggested Verification Methods ........................ 43
6.6. Distribution .............................................. 43 5.6.6. Example ............................................... 43
6.7. Followup-To ............................................... 44 6. Optional Headers .............................................. 44
6.8. Mail-Copies-To ............................................ 44 6.1. Reply-To .................................................. 44
6.9. Posted-And-Mailed ......................................... 46 6.1.1. Examples .............................................. 44
6.10. References ............................................... 46 6.2. Sender .................................................... 45
6.10.1. Examples ............................................. 47 6.3. Organization .............................................. 45
6.11. Expires .................................................. 47 6.4. Keywords .................................................. 45
6.12. Archive .................................................. 47 6.5. Summary ................................................... 45
6.13. Control .................................................. 48 6.6. Distribution .............................................. 46
6.14. Approved ................................................. 48 6.7. Followup-To ............................................... 47
6.15. Supersedes ............................................... 49 6.8. Mail-Copies-To ............................................ 47
6.16. Xref ..................................................... 49 6.9. Posted-And-Mailed ......................................... 49
6.17. Lines .................................................... 50 6.10. References ............................................... 49
6.18. User-Agent ............................................... 50 6.10.1. Examples ............................................. 50
6.18.1. Examples ............................................. 51 6.11. Expires .................................................. 50
6.19. Injector-Info ............................................ 51 6.12. Archive .................................................. 50
6.19.1. Usage of Injector-Info-header-parameters ............. 53 6.13. Control .................................................. 51
6.19.1.1. The posting-host-parameter ....................... 54 6.14. Approved ................................................. 52
6.19.1.2. The posting-account-parameter .................... 54 6.15. Supersedes ............................................... 52
6.19.1.3. The posting-sender-parameter ..................... 54 6.16. Xref ..................................................... 53
6.19.1.4. The posting-logging-parameter .................... 54 6.17. Lines .................................................... 54
6.19.1.5. The posting-date-parameter ....................... 54 6.18. User-Agent ............................................... 54
6.19.2. Example .............................................. 55 6.18.1. Examples ............................................. 55
6.20. Complaints-To ............................................ 55 6.19. Injector-Info ............................................ 55
6.21. MIME headers ............................................. 55 6.19.1. Usage of Injector-Info-parameters .................... 57
6.21.1. Syntax ............................................... 55 6.19.1.1. The posting-host-parameter ....................... 58
6.21.2. Content-Type ......................................... 56 6.19.1.2. The posting-account-parameter .................... 58
6.21.2.1. Message/partial .................................. 56 6.19.1.3. The posting-sender-parameter ..................... 58
6.21.2.2. Message/rfc822 ................................... 57 6.19.1.4. The posting-logging-parameter .................... 58
6.21.2.3. Message/external-body ............................ 58 6.19.1.5. The posting-date-parameter ....................... 58
6.21.2.4. Multipart types .................................. 58 6.19.2. Example .............................................. 59
6.21.3. Content-Transfer-Encoding ............................ 58 6.20. Complaints-To ............................................ 59
6.21.4. Character Sets ....................................... 60 6.21. MIME headers ............................................. 59
6.21.5. Content Disposition .................................. 60 6.21.1. Syntax ............................................... 59
6.21.6. Definition of some new Content-Types ................. 60 6.21.2. Content-Type ......................................... 60
6.21.6.1. Application/news-transmission .................... 60 6.21.2.1. Message/partial .................................. 60
6.21.6.2. Message/news withdrawn ........................... 62 6.21.2.2. Message/rfc822 ................................... 61
6.22. Obsolete Headers ......................................... 62 6.21.2.3. Message/external-body ............................ 62
7. Control Messages .............................................. 62 6.21.2.4. Multipart types .................................. 62
7.1. Digital Signature of Headers .............................. 63 6.21.3. Content-Transfer-Encoding ............................ 62
7.2. Group Control Messages .................................... 63 6.21.4. Character Sets ....................................... 64
7.2.1. The 'newgroup' Control Message ........................ 63 6.21.5. Content Disposition .................................. 64
7.2.1.1. The Body of the 'newgroup' Control Message ........ 64 6.21.6. Definition of some new Content-Types ................. 64
7.2.1.2. Application/news-groupinfo ........................ 64 6.21.6.1. Application/news-transmission .................... 64
7.2.1.3. Initial Articles .................................. 66 6.21.6.2. Message/news obsoleted ........................... 66
7.2.1.4. Example ........................................... 67 6.22. Obsolete Headers ......................................... 66
7.2.2. The 'rmgroup' Control Message ......................... 68 7. Control Messages .............................................. 66
7.2.2.1. Example ........................................... 68 7.1. Digital Signature of Headers .............................. 67
7.2.3. The 'mvgroup' Control Message ......................... 68 7.2. Group Control Messages .................................... 67
7.2.3.1. Example ........................................... 70 7.2.1. The 'newgroup' Control Message ........................ 67
7.2.4. The 'checkgroups' Control Message ..................... 70 7.2.1.1. The Body of the 'newgroup' Control Message ........ 68
7.2.4.1. Application/news-checkgroups ...................... 71 7.2.1.2. Application/news-groupinfo ........................ 68
7.3. Cancel .................................................... 72 7.2.1.3. Initial Articles .................................. 70
7.4. Ihave, sendme ............................................. 73 7.2.1.4. Example ........................................... 71
7.5. Obsolete control messages. ............................... 74 7.2.2. The 'rmgroup' Control Message ......................... 71
8. Duties of Various Agents ...................................... 74 7.2.2.1. Example ........................................... 72
8.1. General principles to be followed ......................... 74 7.2.3. The 'mvgroup' Control Message ......................... 72
8.2. Duties of an Injecting Agent .............................. 75 7.2.3.1. Example ........................................... 73
8.2.1. Proto-articles ........................................ 75 7.2.4. The 'checkgroups' Control Message ..................... 74
8.2.2. Procedure to be followed by Injecting Agents .......... 75 7.2.4.1. Application/news-checkgroups ...................... 75
8.3. Duties of a Relaying Agent ................................ 77 7.3. Cancel .................................................... 76
8.4. Duties of a Serving Agent ................................. 78 7.4. Ihave, sendme ............................................. 77
8.5. Duties of a Posting Agent ................................. 79 7.5. Obsolete control messages. ............................... 78
8.6. Duties of a Followup Agent ................................ 79 8. Duties of Various Agents ...................................... 78
8.7. Duties of a Moderator ..................................... 80 8.1. General principles to be followed ......................... 78
8.8. Duties of a Gateway ....................................... 81 8.2. Duties of an Injecting Agent .............................. 79
8.8.1. Duties of an Outgoing Gateway ......................... 82 8.2.1. Proto-articles ........................................ 79
8.8.2. Duties of an Incoming Gateway ......................... 83 8.2.2. Procedure to be followed by Injecting Agents .......... 79
8.8.3. Example ............................................... 85 8.3. Duties of a Relaying Agent ................................ 81
9. Security and Related Considerations ........................... 86 8.4. Duties of a Serving Agent ................................. 82
9.1. Leakage ................................................... 86 8.5. Duties of a Posting Agent ................................. 83
9.2. Attacks ................................................... 86 8.6. Duties of a Followup Agent ................................ 83
9.2.1. Denial of Service ..................................... 86 8.7. Duties of a Moderator ..................................... 84
9.2.2. Compromise of System Integrity ........................ 87 8.8. Duties of a Gateway ....................................... 85
9.3. Liability ................................................. 88 8.8.1. Duties of an Outgoing Gateway ......................... 86
10. IANA Considerations .......................................... 89 8.8.2. Duties of an Incoming Gateway ......................... 88
11. References ................................................... 89 8.8.3. Example ............................................... 90
12. Acknowledgements ............................................. 91 9. Security and Related Considerations ........................... 90
13. Contact Addresses ............................................ 91 9.1. Leakage ................................................... 91
14. Intellectual Property Rights ................................. 92 9.2. Attacks ................................................... 91
Appendix A.1 - A-News Article Format .............................. 93 9.2.1. Denial of Service ..................................... 91
Appendix A.2 - Early B-News Article Format ........................ 93 9.2.2. Compromise of System Integrity ........................ 92
Appendix A.3 - Obsolete Headers ................................... 94 9.3. Liability ................................................. 93
Appendix A.4 - Obsolete Control Messages .......................... 94 10. IANA Considerations .......................................... 94
Appendix B - Collected Syntax ..................................... 95 11. References ................................................... 94
12. Acknowledgements ............................................. 97
13. Contact Address .............................................. 97
Appendix A.1 - A-News Article Format .............................. 98
Appendix A.2 - Early B-News Article Format ........................ 99
Appendix A.3 - Obsolete Headers ................................... 99
Appendix A.4 - Obsolete Control Messages .......................... 100
Appendix B - Collected Syntax ..................................... 101
Appendix B.1 - Characters, Atoms and Folding ...................... 101
Appendix B.2 - Basic Forms ........................................ 103
Appendix B.3 - Headers ............................................ 104
Appendix B.3.1 - Header outlines .................................. 104
Appendix B.3.2 - Control-message outlines ......................... 106
Appendix B.3.3 - Other header rules ............................... 107
Appendix C - Notices .............................................. 109
1. Introduction 1. Introduction
1.1. Basic Concepts 1.1. Basic Concepts
"Netnews" is a set of protocols for generating, storing and "Netnews" is a set of protocols for generating, storing and
retrieving news "articles" (which resemble mail messages) and for retrieving news "articles" (which resemble email messages) and for
exchanging them amongst a readership which is potentially widely exchanging them amongst a readership which is potentially widely
distributed. It is organized around "newsgroups", with the distributed. It is organized around "newsgroups", with the
expectation that each reader will be able to see all articles posted expectation that each reader will be able to see all articles posted
to each newsgroup in which he participates. These protocols most to each newsgroup in which he participates. These protocols most
commonly use a flooding algorithm which propagates copies throughout commonly use a flooding algorithm which propagates copies throughout
a network of participating servers. Typically, only one copy is a network of participating servers. Typically, only one copy is
stored per server, and each server makes it available on demand to stored per server, and each server makes it available on demand to
readers able to access that server. readers able to access that server.
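As an illustration only, the flooding behaviour described above can be
sketched in Python. The function name flood, the peers table and the
example Message-ID are assumptions made for this sketch and appear in
neither draft; real propagation also relies on the Path header and on a
transport such as NNTP, both of which are ignored here.

```python
from collections import deque


def flood(peers, origin, message_id):
    """Propagate one article from `origin` through a network of servers.

    `peers` is a hypothetical adjacency table: server name -> list of the
    servers it exchanges articles with. Returns a dict mapping each server
    to the set of Message-IDs it ends up storing.
    """
    stored = {server: set() for server in peers}
    stored[origin].add(message_id)
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        for neighbour in peers[server]:
            # A server that already holds the article declines the
            # duplicate, so it keeps one copy and the flood terminates.
            if message_id not in stored[neighbour]:
                stored[neighbour].add(message_id)
                queue.append(neighbour)
    return stored


# Invented four-server topology, used only to exercise the sketch.
peers = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
}
result = flood(peers, "a", "<1@example.invalid>")
```

Each server is offered the article by several neighbours but stores it
once, which is the "one copy per server" property the paragraph refers to.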
An important characteristic of Netnews is the lack of any requirement An important characteristic of Netnews is the lack of any requirement
for a central administration or for the establishment of any for a central administration or for the establishment of any
controlling host to manage the network. A network which limits controlling host to manage the network. A network which limits
participation to some restricted set of hosts (within some company, participation to some restricted set of hosts (within some company,
for example) is a "closed" network; otherwise it is an "open" for example) is a "closed" network; otherwise it is an "open"
network. A set of hosts within a network which, by mutual network. A set of hosts within a network which, by mutual
arrangement, operates some variant (whether more or less restrictive) arrangement, operates some variant (whether more or less restrictive)
of the Netnews protocols is a "cooperating subnet". of the Netnews protocols is a "cooperating subnet".
"Usenet" is a particular worldwide open network based upon the "Usenet" is a particular worldwide open network based upon the
Netnews protocols, with the newsgroups being organised into Netnews protocols, with the newsgroups being organized into
recognized "hierarchies". Anybody can join (it is simply necessary recognized "hierarchies". Anybody can join (it is simply necessary
to negotiate an exchange of articles with one or more other to negotiate an exchange of articles with one or more other
participating hosts). Usenet "belongs" to those who administer the participating hosts). Usenet "belongs" to those who administer the
hosts of which it is comprised. There is no Cabal with overall hosts of which it is comprised. There is no Cabal with overall
authority to direct what is to be allowed. Nevertheless, there do authority to direct what is to be allowed. Nevertheless, there do
exist agencies within Usenet that have authority to establish exist agencies within Usenet that have authority to establish
policies and to perform administrative functions, but such authority policies and to perform administrative functions, but such authority
derives solely from the consent of those sites which choose to derives solely from the consent of those sites which choose to
recognise it (and who can decline to exchange articles with sites recognize it (and who can decline to exchange articles with sites
which choose not to recognise it). Usually, the authority of such an which choose not to recognize it). Usually, the authority of such an
agency is restricted to a particular hierarchy, or group of agency is restricted to a particular hierarchy, or group of
hierarchies. hierarchies.
A "policy" is a rule intended to facilitate the smooth operation of a A "policy" is a rule intended to facilitate the smooth operation of a
network by establishing parameters which restrict behaviour that, network by establishing parameters which restrict behaviour that,
whilst technically unexceptionable, would nevertheless contravene whilst technically unexceptionable, would nevertheless contravene
some accepted standard of "Good Netkeeping". Since the ultimate some accepted standard of "Good Netkeeping". Since the ultimate
beneficiaries of a network are its human readers, who will be less beneficiaries of a network are its human readers, who will be less
tolerant of poorly designed interfaces than mere computers, articles tolerant of poorly designed interfaces than mere computers, articles
in breach of established policy can cause considerable annoyance to in breach of established policy can cause considerable annoyance to
their recipients. their recipients.
Policies may well vary from network to network, from hierarchy to Policies may well vary from network to network, from hierarchy to
hierarchy within one network, and even between individual newsgroups hierarchy within one network, and even between individual newsgroups
within one hierarchy. It is assumed, for the purposes of this within one hierarchy. It is assumed, for the purposes of this
standard, that agencies with varying degrees of authority to standard, that agencies with varying degrees of authority to
establish such policies will exist, and that where they do not, establish such policies will exist, and that where they do not,
policy will be established by mutual agreement. For the benefit of policy will be established by mutual agreement. For the benefit of
networks and hierarchies without such established agencies, and to networks and hierarchies without such established agencies, and to
provide a basis upon which all agencies can build, this present provide a basis upon which all agencies can build, this present
standard often provides default policy parameters, usually standard often provides default policy parameters, usually
introducing them by a phrase such as "As a matter of policy ...". introducing them by a phrase such as "As a matter of policy ...".
1.2. Objectives 1.2. Objectives
The purpose of this present standard is to define the protocols to be The purpose of this present standard is to define the format of
used for Netnews in general, and for Usenet in particular, and to set articles and the protocols to be used for Netnews in general, and for
standards to be followed by software that implements those protocols. Usenet in particular, and to set standards to be followed by software
that implements those protocols.
It is NOT the purpose of this standard to define how the authority of It is NOT the purpose of this standard to define how the authority of
various agencies to exercise control or oversight of the various various agencies to exercise control or oversight of the various
parts of Usenet is established (that is itself a matter of policy). parts of Usenet is established (that is itself a matter of policy).
Nevertheless, it is assumed that such authorities will exist, and Nevertheless, it is assumed that such authorities will exist, and
tools are provided within the protocols for their use. tools are provided within the protocols for their use.
1.3. Historical Outline 1.3. Historical Outline
Network news originated as the medium of communication for Usenet, Network news originated as the medium of communication for Usenet,
circa 1980. Since then, Usenet has grown explosively, and many circa 1980. Since then, Usenet has grown explosively, and many
Internet and non-Internet sites participate in it. In addition, the Internet and non-Internet sites participate in it. In addition, the
news technology is now in widespread use for other purposes, on the news technology is now in widespread use for other purposes, on the
Internet and elsewhere. Internet and elsewhere.
The earliest news interchange used the so-called "A News" article The earliest news interchange used the so-called "A News" article
format. Shortly thereafter, an article format vaguely resembling format. Shortly thereafter, an article format vaguely resembling
Internet mail was devised and used briefly. Both of those formats Internet Mail was devised and used briefly. Both of those formats
are completely obsolete; they are documented in Appendix A.1 and are completely obsolete; they are documented in Appendix A.1 and
Appendix A.2 for historical reasons only. With publication of [RFC Appendix A.2 for historical reasons only. With publication of [RFC
850] in 1983, news articles came to closely resemble Internet mail 850] in 1983, news articles came to closely resemble Internet Mail
messages, with some restrictions and some additional headers. [RFC messages, with some restrictions and some additional headers. [RFC
1036] in 1987 updated [RFC 850] without making major changes. 1036] in 1987 updated [RFC 850] without making major changes.
A Draft popularly referred to as "Son of 1036" [Son-of-1036] was A Draft popularly referred to as "Son of 1036" [Son-of-1036] was
written in 1994 by Henry Spencer. That document formed the original written in 1994 by Henry Spencer. That document formed the original
basis for this standard. Much is taken directly from Son of 1036, and basis for this standard. Much is taken directly from Son of 1036, and
it is hoped that we have followed its spirit and intentions. it is hoped that we have followed its spirit and intentions.
1.4. Transport 1.4. Transport
As in this standard's predecessors, the exact means used to transmit As in this standard's predecessors, the exact means used to transmit
articles from one host to another is not specified. NNTP [NNTP] is articles from one host to another is not specified. NNTP [NNTP] is
the most common transmission method on the Internet, but much the most common transmission method on the Internet, but much
transmission takes place entirely independent of the Internet. Other transmission takes place entirely independent of the Internet. Other
methods in use include the UUCP protocol [RFC 976] extensively used methods in use include the UUCP protocol [RFC 976] extensively used
in the early days of Usenet, FTP, downloading via satellite, tape in the early days of Usenet, FTP, downloading via satellite, tape
archives, and physically delivered magnetic and optical media. archives, and physically delivered magnetic and optical media.
2. Definitions, Notations and Conventions 2. Definitions, Notations and Conventions
2.1. Definitions 2.1. Definitions
An "article" is the unit of news, analogous to an [RFC 2822] An "article" is the unit of news, analogous to an [RFC 2822]
"message". A "proto-article" is one that has not yet been injected "message". A "proto-article" is one that has not yet been injected
into the news system. into the news system.
A "message identifier" (5.3) is a unique identifier for an article, A "message identifier" (5.3) is a unique identifier for an article,
usually supplied by the "posting agent" which posted it or, failing usually supplied by the "posting agent" which posted it or, failing
that, by the "injecting agent". It distinguishes the article from that, by the "injecting agent". It distinguishes the article from
skipping to change at page 7, line 56 skipping to change at page 9, line 5
A "reader" is the person or software reading news articles. A "reader" is the person or software reading news articles.
A "reading agent" is software which presents articles to a reader. A "reading agent" is software which presents articles to a reader.
A "followup" is an article containing a response to the contents of A "followup" is an article containing a response to the contents of
an earlier article (the followup's "precursor"). an earlier article (the followup's "precursor").
A "followup agent" is a combination of reading agent and posting A "followup agent" is a combination of reading agent and posting
agent that aids in the preparation and posting of a followup. agent that aids in the preparation and posting of a followup.
An article's "reply address" is the address to which mailed replies
should be sent. This is the address specified in the article's From
header (5.2), unless it also has a Reply-To header (6.1).
A "reply agent" is a combination of reading agent and mailer that An article's "reply address" is the address to which mailed replies
aids in the preparation and posting of an email response to an should be sent. This is the address specified in the article's From-
article. header (5.2), unless it also has a Reply-To-header (6.1).
A "sender" is the person or software (usually, but not always, the A "sender" is the person or software (usually, but not always, the
same as the poster) responsible for the operation of the posting same as the poster) responsible for the operation of the posting
agent or, which amounts to the same thing, for passing the article to agent or, which amounts to the same thing, for passing the article to
the injecting agent. The sender is analogous to [RFC 2822]'s sender. the injecting agent. The sender is analogous to [RFC 2822]'s sender.
An "injecting agent" takes the finished article from the posting An "injecting agent" takes the finished article from the posting
agent (often via the NNTP "post" command), performs some final checks agent (often via the NNTP "post" command), performs some final checks
and passes it on to a relaying agent for general distribution. and passes it on to a relaying agent for general distribution.
skipping to change at page 9, line 4 skipping to change at page 10, line 4
the specification. The purpose of the notes is to explain why choices the specification. The purpose of the notes is to explain why choices
were made, to place them in context, or to suggest possible were made, to place them in context, or to suggest possible
implementation techniques. implementation techniques.
NOTE: While such explanatory notes may seem superfluous in NOTE: While such explanatory notes may seem superfluous in
principle, they often help the less-than-omniscient reader grasp principle, they often help the less-than-omniscient reader grasp
the purpose of the specification and the constraints involved. the purpose of the specification and the constraints involved.
Given the limitations of natural language for descriptive Given the limitations of natural language for descriptive
purposes, this improves the probability that implementors and purposes, this improves the probability that implementors and
users will understand the true intent of the specification in users will understand the true intent of the specification in
cases where the wording is not entirely clear. cases where the wording is not entirely clear.
"US-ASCII" is short for "the ANSI X3.4 character set" [ANSI X3.4]. "US-ASCII" is short for "the ANSI X3.4 character set" [ANSI X3.4].
While "ASCII" is often misused to refer to various character sets While "ASCII" is often misused to refer to various character sets
somewhat similar to X3.4, in this standard "US-ASCII" is used to mean somewhat similar to X3.4, in this standard "US-ASCII" is used to mean
X3.4 and only X3.4. US-ASCII is a 7 bit character set. Please note X3.4 and only X3.4. US-ASCII is a 7 bit character set. Please note
that this standard requires that all agents be 8 bit clean; that is, that this standard requires that all agents be 8 bit clean; that is,
they must accept and transmit data without changing or omitting the they must accept and transmit data without changing or omitting the
8th bit. 8th bit.
skipping to change at page 9, line 40 skipping to change at page 10, line 40
do no worse than cause extreme irritation to other readers, do no worse than cause extreme irritation to other readers,
particularly in the case of the publicly distributed Usenet, particularly in the case of the publicly distributed Usenet,
that is no reason not to take it seriously. The essential that is no reason not to take it seriously. The essential
distinction is that enforcement of a "MUST" or "SHOULD" is a distinction is that enforcement of a "MUST" or "SHOULD" is a
matter of ensuring correct implementation, whereas enforcement matter of ensuring correct implementation, whereas enforcement
of an "Ought" is more a matter of sensible design or of social of an "Ought" is more a matter of sensible design or of social
pressure (whose effectiveness should not be underestimated, even pressure (whose effectiveness should not be underestimated, even
though it cannot be prescribed by this standard). though it cannot be prescribed by this standard).
NOTE: A requirement imposed on a relaying or serving agent NOTE: A requirement imposed on a relaying or serving agent
should be understood as applying only to articles actually regarding some particular article should be understood as
accepted for processing by that agent (since any agent may applying only if that article is actually accepted for
always reject any article entirely, for reasons of site policy). processing (since any agent may always reject any article
entirely, for reasons of site policy).
All numeric values are given in decimal unless otherwise indicated.
Octets are assumed to be unsigned values for this purpose.
Throughout this standard we will give examples of various Throughout this standard we will give examples of various
definitions, headers and other specifications. It needs to be definitions, headers and other specifications. It needs to be
remembered that these samples are for the aid of the reader only and remembered that these samples are for the aid of the reader only and
do NOT define any specification themselves. In order to prevent do NOT define any specification themselves. In order to prevent
possible conflict with "Real World" entities and people the top level possible conflict with "Real World" entities and people the top level
domain of ".example" is used in all sample domains and addresses. The domain ".example" is used in all sample domains and addresses. The
hierarchy of example.* is also used as a sample hierarchy. hierarchy "example.*" is also used as a sample hierarchy.
Information on the ".example" top level domain is in [RFC 2606]. Information on the ".example" top level domain is in [RFC 2606].
2.3. Relation To Email and MIME
2.3. Relation To Mail and MIME
The primary intent of this standard is to describe the news article The primary intent of this standard is to describe the news article
format. Insofar as news articles are a subset of the Mail message format. Insofar as news articles are a subset of the email message
format augmented by some new headers, this standard incorporates many format augmented by some new headers, this standard incorporates many
(though not all) of the provisions of [RFC 2822], with the aim of (though not all) of the provisions of [RFC 2822], with the aim of
enabling news articles to pass through mail systems and vice versa,
enabling news articles to pass through email systems and vice versa,
provided only that they contain the minimum headers required for the provided only that they contain the minimum headers required for the
mode of transport being used. Unfortunately, the match is not mode of transport being used. Unfortunately, the match is not
perfect, but it is the intention of this standard that gateways perfect, but it is the intention of this standard that gateways
between Mail and News should be able to operate with the minimum of between Email and Netnews should be able to operate with the minimum
tinkering. of tinkering.
Likewise, this standard incorporates many (though not all) of the Likewise, this standard incorporates many (though not all) of the
provisions of the MIME standards [RFC 2045] et seq which, though provisions of the MIME standards [RFC 2045] et seq which, though
designed with Mail in mind, are mostly applicable to News. designed with Email in mind, are mostly applicable to Netnews.
2.4. Syntax Notation 2.4. Syntax
The complete syntax defined in this standard is repeated, for
convenience, in Appendix B.
2.4.1. Syntax Notation
This standard uses the Augmented Backus Naur Form described in [RFC This standard uses the Augmented Backus Naur Form described in [RFC
2234]. A discussion of this is outside the bounds of this standard, 2234].
but it is expected that implementors will be able quickly to
understand it with reference to that defining document.
Much of the syntax of News Articles is based on the corresponding In particular, it makes significant use of the "incremental
alternative" feature of that notation. For example, the two rules
header = other-header
header =/ Date-header
are equivalent to the single rule
header = other-header / Date-header
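The equivalence stated above can be illustrated with a small Python sketch. This is only a toy, not an ABNF parser: the function name merge_incremental is hypothetical, and it assumes each rule fits on one line in the form "name = elements" or "name =/ elements".

```python
import re

# One rule per line: a rule name, then "=" or "=/", then the elements.
_RULE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9-]*)\s*(=/|=)\s*(.*?)\s*$")


def merge_incremental(rules):
    """Fold '=/' (incremental alternative) rules into single rules."""
    merged = {}
    for line in rules:
        match = _RULE.match(line)
        if match is None:
            raise ValueError(f"not a rule: {line!r}")
        name, operator, elements = match.groups()
        if operator == "=/":
            if name not in merged:
                raise ValueError(f"'=/' used before {name!r} was defined")
            # An incremental alternative extends the existing rule with "/".
            merged[name] = f"{merged[name]} / {elements}"
        else:
            merged[name] = elements
    return merged


print(merge_incremental(["header = other-header", "header =/ Date-header"]))
```

The sketch keeps rule names exactly as written; a fuller treatment would compare them case-insensitively, as [RFC 2234] rule names are.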
2.4.2. Syntax adapted from Email and MIME
Much of the syntax of Netnews Articles is based on the corresponding
syntax defined in [RFC 2822] or in the MIME specifications [RFC 2045] syntax defined in [RFC 2822] or in the MIME specifications [RFC 2045]
et seq, which is deemed to have been incorporated into this standard et seq, which are deemed to have been incorporated into this standard
as required. However, there are some important differences arising as required. However, there are some important differences arising
from the fact that [RFC 2822] does not recognise anything other than from the fact that [RFC 2822] does not recognize anything other than
US-ASCII characters, that it does not recognise the MIME headers [RFC US-ASCII characters, that it does not recognize the MIME headers [RFC
2045], and that it includes much syntax described as "obsolete". 2045], and that it includes much syntax described as "obsolete"
(which is excluded from this standard, as detailed below).
NOTE: News parsers historically have been much less permissive NOTE: Netnews parsers historically have been much less
than Mail parsers, and this is reflected in the modifications permissive than Email parsers, and this is reflected in the
referred to, and in some further specific rules. modifications referred to, and in some further specific rules.
The following syntactic forms therefore supersede the corresponding The following syntactic rules therefore supersede the corresponding
rules given in [RFC 2822] and [RFC 2045], thus allowing UTF-8 rules given in [RFC 2822] and [RFC 2045], thus allowing UTF-8
characters [RFC 2279] to appear in certain contexts (the five rules characters [RFC 2279] to appear in certain contexts (the five rules
beginning with "strict-" reflect the corresponding original rules from beginning with "strict-" reflect the corresponding original rules from
[RFC 2822]). [RFC 2822]).
UTF8-xtra-2-head= %xC2-DF UTF8-xtra-2-head= %xC2-DF
UTF8-xtra-3-head= %xE0 %xA0-BF / %xE1-EC %x80-BF / UTF8-xtra-3-head= %xE0 %xA0-BF / %xE1-EC %x80-BF /
%xED %x80-9F / %xEE-EF %x80-BF %xED %x80-9F / %xEE-EF %x80-BF
UTF8-xtra-4-head= %xF0 %x90-BF / %xF1-F7 %x80-BF UTF8-xtra-4-head= %xF0 %x90-BF / %xF1-F7 %x80-BF
UTF8-xtra-5-head= %xF8 %x88-BF / %xF9-FB %x80-BF UTF8-xtra-5-head= %xF8 %x88-BF / %xF9-FB %x80-BF
UTF8-xtra-6-head= %xFC %x84-BF / %xFD %x80-BF UTF8-xtra-6-head= %xFC %x84-BF / %xFD %x80-BF
UTF8-xtra-tail = %x80-BF UTF8-xtra-tail = %x80-BF
UTF8-xtra-char = UTF8-xtra-2-head 1( UTF8-xtra-tail ) / UTF8-xtra-char = UTF8-xtra-2-head 1( UTF8-xtra-tail ) /
UTF8-xtra-3-head 1( UTF8-xtra-tail ) / UTF8-xtra-3-head 1( UTF8-xtra-tail ) /
UTF8-xtra-4-head 2( UTF8-xtra-tail ) / UTF8-xtra-4-head 2( UTF8-xtra-tail ) /
UTF8-xtra-5-head 3( UTF8-xtra-tail ) / UTF8-xtra-5-head 3( UTF8-xtra-tail ) /
UTF8-xtra-6-head 4( UTF8-xtra-tail ) UTF8-xtra-6-head 4( UTF8-xtra-tail )
text = %d1-9 / ; all UTF-8 characters except text = %d1-9 / ; all UTF-8 characters except
%d11-12 / ; US-ASCII NUL, CR and LF %d11-12 / ; US-ASCII NUL, CR and LF
%d14-127 / %d14-127 /
UTF8-xtra-char UTF8-xtra-char
ctext = NO-WS-CTL / ; all of <text> except ctext = NO-WS-CTL / ; all of <text> except
%d33-39 / ; SP, HTAB, "(", ")" %d33-39 / ; SP, HTAB, "(", ")"
%d42-91 / ; and "\" %d42-91 / ; "\" and DEL
%d93-126 / %d93-126 /
UTF8-xtra-char UTF8-xtra-char
qtext = NO-WS-CTL / ; all of <text> except qtext = NO-WS-CTL / ; all of <text> except
%d33 / ; SP, HTAB, "\" and DQUOTE %d33 / ; SP, HTAB, "\" DQUOTE
%d35-91 / %d35-91 / ; and DEL
%d93-126 / %d93-126 /
UTF8-xtra-char UTF8-xtra-char
utext = NO-WS-CTL / ; Non white space controls utext = NO-WS-CTL / ; Non white space controls
%d33-126 / ; The rest of US-ASCII %d33-126 / ; The rest of UTF-8
UTF8-xtra-char UTF8-xtra-char
strict-text = %d1-9 / ; text restricted to strict-text = %d1-9 / ; text restricted to
%d11-12 / ; US-ASCII %d11-12 / ; US-ASCII
%d14-127 %d14-127
strict-qtext = NO-WS-CTL / ; qtext restricted to strict-qtext = NO-WS-CTL / ; qtext restricted to
%d33 / ; US-ASCII %d33 / ; US-ASCII
%d35-91 / %d35-91 /
%d93-127 %d93-126
strict-quoted-pair strict-quoted-pair
= "\" strict-text = "\" strict-text
strict-qcontent = strict-qtext / strict-quoted-pair strict-qcontent = strict-qtext / strict-quoted-pair
strict-quoted-string strict-quoted-string
= [CFWS] = [CFWS] DQUOTE
DQUOTE *([FWS] strict-qcontent) [FWS] DQUOTE *( [FWS] strict-qcontent ) [FWS]
[CFWS] DQUOTE [CFWS]
unstructured = 1*( [FWS] utext ) [FWS]
The syntax for UTF8-xtra-char excludes those redundant sequences of The syntax for UTF8-xtra-char excludes those redundant sequences of
octets which cannot occur in UTF-8, as defined by [RFC 2279], either octets which cannot occur in UTF-8, as defined by [RFC 2279], either
because they would not be the shortest possible encodings of some UCS because they would not be the shortest possible encodings of some UCS
character, or they would represent one of the characters D800 through character [ISO/IEC 10646], or they would represent one of the
DFFF, disallowed in UCS because of their surrogate use in the UTF-16 characters D800 through DFFF, disallowed in UCS because of their
encoding. These sequences MUST NOT be generated by posting agents. surrogate use in the UTF-16 encoding. These sequences MUST NOT be
Where they occur inadvertently, they MAY be passed on untouched by generated by posting agents. Where they occur inadvertently, they
other agents, but they MUST NOT ever be interpreted as valid MAY be passed on untouched by other agents, but they MUST NOT ever be
characters. interpreted as valid characters.
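As an illustration of the UTF8-xtra-char rules, the following Python sketch transcribes them into a lookup table. It is a sketch only: the names _XTRA_FORMS and match_utf8_xtra_char are hypothetical, and the table follows the syntax exactly as given here, so it also admits the 5 and 6 octet forms that those rules allow.

```python
# Each row: (first octet low, first octet high,
#            second octet low, second octet high, total length in octets).
# Rows are transcribed from the UTF8-xtra-N-head rules; every octet after
# the second must be a UTF8-xtra-tail (0x80-0xBF).
_XTRA_FORMS = [
    (0xC2, 0xDF, 0x80, 0xBF, 2),
    (0xE0, 0xE0, 0xA0, 0xBF, 3),
    (0xE1, 0xEC, 0x80, 0xBF, 3),
    (0xED, 0xED, 0x80, 0x9F, 3),  # excludes D800 through DFFF
    (0xEE, 0xEF, 0x80, 0xBF, 3),
    (0xF0, 0xF0, 0x90, 0xBF, 4),
    (0xF1, 0xF7, 0x80, 0xBF, 4),
    (0xF8, 0xF8, 0x88, 0xBF, 5),
    (0xF9, 0xFB, 0x80, 0xBF, 5),
    (0xFC, 0xFC, 0x84, 0xBF, 6),
    (0xFD, 0xFD, 0x80, 0xBF, 6),
]


def match_utf8_xtra_char(data: bytes, pos: int = 0) -> int:
    """Return the length of the UTF8-xtra-char at data[pos:], or 0 if none."""
    if pos >= len(data):
        return 0
    first = data[pos]
    for lo1, hi1, lo2, hi2, length in _XTRA_FORMS:
        if lo1 <= first <= hi1:
            chunk = data[pos:pos + length]
            if len(chunk) < length:
                return 0  # truncated sequence
            if not lo2 <= chunk[1] <= hi2:
                return 0  # overlong or surrogate encoding
            if all(0x80 <= octet <= 0xBF for octet in chunk[2:]):
                return length
            return 0
    return 0  # US-ASCII octet, stray tail, or a first octet never used


print(match_utf8_xtra_char("é".encode("utf-8")))
print(match_utf8_xtra_char(b"\xc0\x80"))
```

A result of 0 marks a sequence that, per the text above, must never be interpreted as a valid character; whether to pass it on untouched remains a separate decision for the agent.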
Observe, in contradistinction to [RFC 2822], that an unstructured
MUST contain at least one non-whitespace character (see also remarks
about empty headers in 4.2.6).
Wherever in this standard the syntax is stated to be taken from [RFC Wherever in this standard the syntax is stated to be taken from [RFC
2822], it is to be understood as the syntax defined by [RFC 2822] 2822], it is to be understood as the syntax defined by [RFC 2822]
after making the above changes, but NOT including any syntax defined after making the above changes, but NOT including any syntax defined
in section 4 ("Obsolete syntax") of [RFC 2822]. Software compliant in section 4 ("Obsolete syntax") of [RFC 2822]. Software compliant
with this standard MUST NOT generate any of the syntactic forms with this standard MUST NOT generate any of the syntactic forms
defined in that Obsolete Syntax, although it MAY accept such defined in that Obsolete Syntax, although it MAY accept such
syntactic forms. Certain syntax from the MIME specifications [RFC syntactic forms. Certain syntax from the MIME specifications [RFC
2045] et seq is also considered a part of this standard (see 6.21). 2045] et seq is also considered a part of this standard (see 6.21).
2.4.3. Syntax copied from other standards
The following syntactic forms, taken from [RFC 2234] or from [RFC The following syntactic forms, taken from [RFC 2234] or from [RFC
2822], are repeated here for convenience only: 2822], are repeated here for convenience only:
ALPHA = %x41-5A / ; A-Z ALPHA = %x41-5A / ; A-Z
%x61-7A ; a-z %x61-7A ; a-z
CR = %x0D ; carriage return CR = %x0D ; carriage return
CRLF = CR LF CRLF = CR LF
DIGIT = %x30-39 ; 0-9 DIGIT = %x30-39 ; 0-9
HTAB = %x09 ; horizontal tab HTAB = %x09 ; horizontal tab
LF = %x0A ; line feed LF = %x0A ; line feed
SP = %x20 ; space SP = %x20 ; space
NO-WS-CTL = %d1-8 / ; US-ASCII control characters NO-WS-CTL = %d1-8 / ; US-ASCII control characters
%d11 / ; which do not include the %d11 / ; which do not include the
%d12 / ; carriage return, line feed, %d12 / ; carriage return, line feed,
%d14-31 / ; and whitespace characters %d14-31 / ; and whitespace characters
%d127 %d127
specials = "(" / ")" / ; Special characters used in specials = "(" / ")" / ; Special characters used in
"<" / ">" / ; other parts of the syntax "<" / ">" / ; other parts of the syntax
"[" / "]" / "[" / "]" /
":" / ";" / ":" / ";" /
"@" / " "@" / "\" /
"," / "." / "," / "." /
DQUOTE DQUOTE
WSP = SP / HTAB ; Whitespace characters WSP = SP / HTAB ; Whitespace characters
FWS = ([*WSP CRLF] 1*WSP); Folding whitespace FWS = ([*WSP CRLF] 1*WSP); Folding whitespace
ccontent = ctext / quoted-pair / comment ccontent = ctext / quoted-pair / comment
comment = "(" *([FWS] ccontent) [FWS] ")" comment = "(" *([FWS] ccontent) [FWS] ")"
CFWS = *([FWS] comment) (([FWS] comment) / FWS ) CFWS = *([FWS] comment) (([FWS] comment) / FWS )
DQUOTE = %d34 ; quote mark DQUOTE = %d34 ; quote mark
quoted-pair = "\" text quoted-pair = "\" text
atext = ALPHA / DIGIT / atext = ALPHA / DIGIT /
"!" / "#" / ; Any character except "!" / "#" / ; Any US-ASCII character except
"$" / "%" / ; controls, SP, and specials. "$" / "%" / ; controls, SP, and specials.
"&" / "'" / ; Used for atoms "&" / "'" / ; Used for atoms
"*" / "+" / "*" / "+" /
"-" / "/" / "-" / "/" /
"=" / "?" / "=" / "?" /
"^" / "_" / "^" / "_" /
"`" / "}" / "`" / "{" /
"|" / "}" / "|" / "}" /
"~" "~"
atom = [CFWS] 1*atext [CFWS] atom = [CFWS] 1*atext [CFWS]
dot-atom = [CFWS] dot-atom-text [CFWS] dot-atom = [CFWS] dot-atom-text [CFWS]
dot-atom-text = 1*atext *( "." 1*atext ) dot-atom-text = 1*atext *( "." 1*atext )
qcontent = qtext / quoted-pair qcontent = qtext / quoted-pair
quoted-string = [CFWS] quoted-string = [CFWS] DQUOTE
DQUOTE *([FWS] qcontent) [FWS] DQUOTE *( [FWS] qcontent ) [FWS]
[CFWS] DQUOTE [CFWS]
word = atom / quoted-string word = atom / quoted-string
phrase = 1*word phrase = 1*word
unstructured = *( [FWS] utext ) [FWS]
NOTE: CFWS occurs at many places in the syntax in order to allow
comments and extra whitespace to be inserted almost anywhere.
The syntax is in fact ambiguous insofar as it may be impossible
to tell in which of several possible ways a given comment or WS
was produced. However, this does not lead to semantic ambiguity
because, unless specifically stated otherwise, the presence or
absence of a comment or additional WS has no semantic meaning
and, in particular, it is a matter of indifference whether it
forms a part of the syntactic construct preceding it or the one
following it.
NOTE: Following [RFC 2234], literal text included in the syntax NOTE: Following [RFC 2234], literal text included in the syntax
is to be regarded as case-insensitive. However, in is to be regarded as case-insensitive. However, in
contradistinction to [RFC 2822], the Netnews protocols are contradistinction to [RFC 2822], the Netnews protocols are
sensitive to case in some instances (as in newsgroup names, some sensitive to case in some instances (as in newsgroup names, some
header parameters, etc.). Care has been taken to indicate this header parameters, etc.). Care has been taken to indicate this
explicitly where required. explicitly where required.
The complete syntax defined in this standard is repeated, for As in [RFC 2822], where any quoted-pair appears it is to be
convenience, in Appendix B. interpreted as its text character alone. That is to say, the "\"
character that appears as part of a quoted-pair is semantically
"invisible".
Again, as in [RFC 2822], strings of characters that include
characters not syntactically allowed in some particular context may
be incorporated into a quoted-string by "encapsulating" them between
quote (DQUOTE, US-ASCII 34) characters, prefixing every quote and
backslash character (and possibly other characters too) with a "\" so
as to form a quoted-pair, and possibly introducing folding by
prefixing some WSP with CRLF.
The semantic value of a quoted-string (i.e. the result of reversing
the encapsulation) is a string of characters which includes neither
the optional CFWS outside of the quote characters, nor the quote
characters themselves, nor any CRLF contained within any FWS between
the two quote characters, nor the "\" which introduces any quoted-
pair.
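The encapsulation described above, and its reversal, might be sketched in Python as follows. The function names encapsulate and semantic_value are hypothetical. The sketch makes simplifying assumptions: it escapes only quote and backslash characters, introduces no folding, and expects any comments in the surrounding CFWS to have been removed already (only plain whitespace outside the quotes is stripped).

```python
def encapsulate(value: str) -> str:
    """Form a quoted-string: prefix every backslash and quote with a
    backslash, then wrap the result in DQUOTE characters."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def semantic_value(quoted: str) -> str:
    """Reverse the encapsulation of a quoted-string."""
    s = quoted.strip(" \t\r\n")  # whitespace outside the quotes is dropped
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i, end = 1, len(s) - 1
    while i < end:
        if s[i] == "\\" and i + 1 < end:
            out.append(s[i + 1])  # quoted-pair: keep the character alone
            i += 2
        elif s.startswith("\r\n", i):
            i += 2  # CRLF inside FWS is not part of the value
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


original = 'say "hi" \\ bye'
print(encapsulate(original))
print(semantic_value(encapsulate(original)) == original)
```

Note that the whitespace following a folded CRLF is kept in the value; only the CRLF itself is discarded, as the text above specifies.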
2.5. Language 2.5. Language
Various constant strings in this standard, such as header names and Various constant strings in this standard, such as header names and
month names, are derived from English words. Despite their month names, are derived from English words. Despite their
derivation, these words do NOT change when the poster or reader derivation, these words do NOT change when the poster or reader
employing them is interacting in a language other than English. employing them is interacting in a language other than English.
Posting and reading agents MAY translate as appropriate in their Posting and reading agents MAY translate as appropriate in their
interaction with the poster or reader, but the forms that actually interaction with the poster or reader, but the forms that actually
appear in articles MUST be the English-derived ones defined in this appear in articles as transmitted MUST be the English-derived ones
standard. defined in this standard.
3. Changes to the existing protocols 3. Changes to the existing protocols
This standard prescribes many changes, clarifications and new This standard prescribes many changes, clarifications and new
features since the protocols described in [RFC 1036] and [Son-of- features since the protocols described in [RFC 1036] and [Son-of-
1036]. It is the intention that they can be assimilated into Usenet 1036]. It is the intention that they can be assimilated into Usenet
as it presently operates without major interruption to the service, as it presently operates without major interruption to the service,
though some of the new features may not begin to show benefit until though some of the new features may not begin to show benefit until
they become widely implemented. This section summarizes the main they become widely implemented. This section summarizes the main
changes, and comments on some features of the transition. changes, and comments on some features of the transition.
3.1. Principal Changes 3.1. Principal Changes
o The [RFC 2822] conventions for parenthesis-enclosed comments in o The [RFC 2822] conventions for parenthesis-enclosed comments in
headers are supported. headers are supported.
o Whitespace is permitted in Newsgroups headers, permitting folding o Whitespace is permitted in Newsgroups-headers, permitting folding
of such headers. Indeed, all news headers can now be folded. of such headers. Indeed, all headers can now be folded.
o An enhanced syntax for the Path header enables the injection o An enhanced syntax for the Path-header enables the injection
point of and the route taken by an article to be determined with point of and the route taken by an article to be determined with
certainty. certainty.
o Netnews is firmly established as an 8bit medium. o Netnews is firmly established as an 8bit medium and all headers
are deemed to be in the UTF-8 character set (thus permitting, in
particular, the use of non-ASCII newsgroup-names).
o Large parts of MIME are recognised as an integral part of o Large parts of MIME are recognised as an integral part of
Netnews. Netnews.
o The charset for headers is always UTF-8. This will, inter alia, o There is a new Control message 'mvgroup' to facilitate moving a
permit newsgroup-names with non-ASCII characters.
o There is a new Control command 'mvgroup' to facilitate moving a
group to a different place (name) in a hierarchy. group to a different place (name) in a hierarchy.
o There are several new headers defined, such as Injector-Info and o There are several new headers defined, notably Archive,
Complaints-To, Injector-Info, Mail-Copies-To, Posted-And-Mailed
and User-Agent, leading to increased functionality.
Author-Ids, leading to increased functionality. o Provision has been made for almost all headers to have MIME-style
parameters (to be ignored if not recognized), thus facilitating
extension of those headers in future standards.
o Certain headers and Control messages (Appendix A.3 and Appendix
A.4) have been made obsolete.
o Distributions are expected to be checked at the receiving end, as
well as the sending end, of a relaying link.
o There are numerous other small changes, clarifications and o There are numerous other small changes, clarifications and
enhancements. enhancements.
[Doubtless many other changes should be listed, but there is little
point in doing so until our text is nearing completion. The above gives
the flavour of what should be said. There should also be references to
Appendix A.3 and Appendix A.4 ]
3.2. Transitional Arrangements 3.2. Transitional Arrangements
An important distinction must be made between serving and relaying An important distinction must be made between serving and relaying
agents which are responsible for the distribution and storage of news agents, which are responsible for the distribution and storage of
articles, and user agents which are responsible for interactions with news articles, and user agents, which are responsible for
users. It is important that the former should be upgraded to conform interactions with users. It is important that the former should be
to this standard as soon as possible to provide the benefit of the upgraded to conform to this standard as soon as possible to provide
enhanced facilities. Fortunately, the number of distinct the benefit of the enhanced facilities. Fortunately, the number of
implementations of such agents is rather small, at least so far as distinct implementations of such agents is rather small, at least so
the main "backbone" of Usenet is concerned, and many of the new far as the main "backbone" of Usenet is concerned, and many of the
features are already supported. Contrariwise, there are a great
new features are already supported. Contrariwise, there are a great
number of implementations of user agents, installed on a vastly number of implementations of user agents, installed on a vastly
greater number of small sites. Therefore, the new functionality has greater number of small sites. Therefore, the new functionality has
been designed so that existing agents may continue to be used, been designed so that existing agents may continue to be used,
although the full benefits may not be realised until a substantial although the full benefits may not be realised until a substantial
proportion of them have been upgraded. proportion of them have been upgraded.
In the list which follows, care has been taken to distinguish the In the list which follows, care has been taken to distinguish the
implications for both kinds of agent. implications for both kinds of agent.
o [RFC 2822] style comments in headers do not affect serving and o [RFC 2822] style comments in headers do not affect serving and
relaying agents (note that the Newsgroups and Path headers do not relaying agents (note that the Newsgroups-, Distribution- and
contain them). They are unlikely to hinder their proper display Path-headers do not contain them). They are unlikely to hinder
in existing user agents except in the case of the References their proper display in existing reading agents except in the
header in agents which thread articles. Therefore, it is provided case of the References-header in agents which thread articles.
that they SHOULD NOT be generated except where permitted by the Therefore, it is provided that they SHOULD NOT be generated
previous standards. except where permitted by the previous standards.
o Because of its importance to all serving agents, the extension o Because of its importance to all serving agents, the extension
permitting whitespace and folding in Newsgroups headers SHOULD NOT permitting whitespace and folding in Newsgroups-headers SHOULD
be used until it has been widely deployed amongst relaying NOT be used until it has been widely deployed amongst relaying
agents. User agents are unaffected. agents. User agents are unaffected.
o The new style of Path header is already consistent with the o The new style of Path-header is already consistent with the
previous standards. However, the intention is that relaying previous standards. However, the intention is that relaying
agents should eventually reject articles in the old style, and so agents should eventually reject articles in the old style, and so
this should be offered as a configurable option for relaying this possibility should be offered as a configurable option in
agents. User agents are unaffected. relaying agents. User agents are unaffected.
o The vast majority of serving, relaying and transport agents are o The vast majority of serving, relaying and transport agents are
believed to be already 8bit clean (in the slightly restricted believed to be already 8bit clean (in the slightly restricted
sense in which that term is used in the MIME standards). User sense in which that term is used in the MIME standards). User
agents that do not implement MIME may be disadvantaged, but no agents that do not implement MIME may be disadvantaged, but no
more so than at present when faced with 8bit characters (which more so than at present when faced with 8bit characters (which
currently abound in spite of the previous standards). currently abound in spite of the previous standards).
o The introduction of MIME reflects a practice that is already o The introduction of MIME reflects a practice that is already
widespread. Articles in strict compliance with the previous widespread. Articles in strict compliance with the previous
standards (using strict US-ASCII) will be unaffected. Many user standards (using strict US-ASCII) will be unaffected. Many user
agents already support it, at least to the extent of widely used agents already support it, at least to the extent of widely used
charsets such as ISO-8859-1. Users expecting to read articles charsets such as ISO-8859-1. Users expecting to read articles
using other charsets will need to acquire suitable reading using other charsets will need to acquire suitable reading
agents. It is not intended, in general, that any single user agents. It is not intended, in general, that any single user
agent will be able to display every charset known to IANA, but agent will be able to display every charset known to IANA, but
all such agents MUST support US-ASCII. Serving and relaying all such agents MUST support US-ASCII. Serving and relaying
agents are not affected. agents are not affected.
o The use of the UTF-8 charset for headers will not affect any o The use of the UTF-8 charset for headers will not affect any
existing usage, since US-ASCII is a strict subset of UTF-8. existing usage that complies with the previous standards, since
Insofar as newsgroup names containing non-ASCII characters can US-ASCII is a strict subset of UTF-8. Insofar as newsgroup names
now be expected to arise, support from serving and relaying containing non-ASCII characters can now be expected to arise,
agents will be necessary. It is believed that the customary some support from serving and relaying agents will be desirable,
storage structure used by serving agents can already cope although it has been established that most current serving agents
(perhaps not ideally) with such names. Note that it is not can already cope with such names without modification (although
necessary for serving and relaying agents to understand all the perhaps not in an ideal manner). Note that it is not necessary
characters available in UTF-8, though it is desirable for them to for serving and relaying agents to understand all the characters
be displayable for diagnostic purposes via some escape mechanism available in UTF-8, though it is desirable for them to be
displayable for diagnostic purposes via some escape mechanism
using, for example, the visible subset of US-ASCII. For users using, for example, the visible subset of US-ASCII. For users
expecting to use the more exotic possibilities available under expecting to use the more exotic possibilities available under
UTF-8, the remarks already made in connection with MIME will UTF-8, the remarks already made in connection with MIME will
apply. apply.
o The new Control: mvgroup command will need to be implemented in o The new Control: mvgroup command will need to be implemented in
serving agents. It SHOULD be used in conjunction with pairs of serving agents. For the benefit of older serving agents it is
matching rmgroup and newgroup commands (injected shortly after therefore RECOMMENDED that it be followed shortly by a
the mvgroup) until such time as mvgroup is widely implemented. corresponding newgroup command and it MUST always be followed by
User agents are unaffected. a rmgroup command for the old group after a reasonable overlap
o The headers newly introduced by this standard can safely be period. An implementation of the mvgroup command as an alias for
the newgroup command would thus be minimally conforming. User
agents are unaffected.
o All the headers newly introduced by this standard can safely be
ignored by existing software, albeit with loss of the new ignored by existing software, albeit with loss of the new
functionality. functionality.
4. Basic Format 4. Basic Format
4.1. Syntax of News Articles 4.1. Syntax of News Articles
The overall syntax of a news article is: The overall syntax of a news article is:
article = 1*header separator body article = 1*( header CRLF ) separator body
header = header-name ":" 1*SP header-content CRLF header = other-header
other-header = header-name ":" 1*SP other-content
header-name = 1*name-character *( "-" 1*name-character ) header-name = 1*name-character *( "-" 1*name-character )
name-character = ALPHA / DIGIT name-character = ALPHA / DIGIT
header-content = USENET-header-content other-content = <the content of a header defined by some
*( [CFWS] ";" header-parameter ) / other standard>
other-header-content separator = CRLF
USENET-header-content body = *( *998text CRLF )
= <the header-content defined in this standard
(or an extension of it) for a specific
USENET header>
other-header-content
= <a header-content defined (explicitly or
implicitly) by some other standard>
header-parameter = USENET-header-parameter /
other-header-parameter
USENET-header-parameter However, the rule given above for header is incomplete. Further
= <an other-header-parameter defined in alternatives will be added incrementally as the various Netnews
this standard for use in conjunction with headers are introduced in this standard (or in future extensions),
a specific USENET-header-content> using the "=/" notation defined in [RFC 2234]. For example, a
other-header-parameter typical USENET-header would be defined as follows:
= attribute "=" value
attribute = USENET-token / iana-token / x-token header =/ USENET-header
value = token / quoted-string USENET-header = "USENET" ":" SP USENET-content
USENET-token = <A token defined in this standard for *( ";" ( USENET-parameter /
use in conjunction with a specific other-parameter ) )
USENET-header-parameter> USENET-content = <syntax specific to that USENET-header>
iana-token = <A token defined in an experimental USENET-parameter = <a parameter specific to that USENET-header>
or standards-track RFC and registered with
IANA> where the USENET-parameter, which MUST always be of the same
x-token = [CFWS] "x-" token-core [CFWS] syntactic form as an other-parameter (see below), is not provided in
token = [CFWS] token-core [CFWS] all headers, and even the other-parameter is omitted in some cases
token-core = 1*<any (US-ASCII) CHAR except SP, CTLs, (see 4.2.2). Observe that "USENET" is (and MUST be) of the
syntactic form of a header-name.
other-parameter = <a parameter not defined by this standard>
parameter = attribute "=" value
attribute = [CFWS] token [CFWS]
x-token = "x-" token
token = 1*<any (US-ASCII) CHAR except SP, CTLs,
or tspecials> or tspecials>
tspecials = "(" / ")" / "<" / ">" / "@" / tspecials = "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / DQUOTE / "," / ";" / ":" / "\" / DQUOTE /
"/" / "[" / "]" / "?" / "=" "/" / "[" / "]" / "?" / "="
separator = CRLF value = [CFWS] token [CFWS] / quoted-string
body = *( *998text CRLF )
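The token rule above can be checked mechanically. The following is a minimal sketch in Python, not part of either draft; the names TSPECIALS and is_token are hypothetical, and the code assumes that "CHAR except SP, CTLs" means the printable US-ASCII range 33 to 126.

```python
# Characters excluded from a token, copied from the tspecials rule.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(text: str) -> bool:
    """Return True if text matches 1*<CHAR except SP, CTLs, or tspecials>."""
    if not text:
        return False  # the rule requires at least one character
    # Printable US-ASCII excluding SP is 33..126 (assumed reading of the rule).
    return all(33 <= ord(ch) <= 126 and ch not in TSPECIALS for ch in text)
```

Under these assumptions an attribute such as "sender" is accepted, whereas a string containing "=" or a space is rejected and would have to be carried in a quoted-string instead.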
An article consists of some headers followed by a body. An empty line An article consists of some headers followed by a body. An empty line
separates the two. The headers contain structured information about separates the two. The headers contain structured information about
the article and its transmission. A header begins with a header-name the article and its transmission. A header begins with a header-name
identifying it, and can be continued onto subsequent lines as identifying it, and can be continued onto subsequent lines as
described in section 4.2.3. The body is largely unstructured text described in section 4.2.3. The body is largely unstructured text
significant only to the poster and the readers. significant only to the poster and the readers.
NOTE: Terminology here follows the current custom in the news NOTE: Terminology here follows the current custom in the news
community, rather than the [RFC 2822] convention of referring to community, rather than the [RFC 2822] convention of referring to
what is here called a "header" as a "header-field" or "field". what is here called a "header" as a "header-field" or "field".
Note that the separator line must be truly empty, not just a line Note that the separator line MUST be truly empty, not just a line
containing white space. Further empty lines following it are part of containing white space. Further empty lines following it are part of
the body, as are empty lines at the end of the article. the body, as are empty lines at the end of the article.
NOTE: The syntax above defines the canonical form of a news NOTE: The syntax above defines the canonical form of a news
article as a sequence of lines each terminated by CRLF. This article as a sequence of lines each terminated by CRLF. This
does not prevent serving agents or transport agents from storing does not prevent serving agents or transport agents from storing
or handling the article in other formats (e.g. using a single LF or handling the article in other formats (e.g. using a single LF
in place of CRLF) so long as the overall effects achieved are as in place of CRLF) so long as the overall effects achieved are as
defined by this standard when operating on the canonical form. defined by this standard when operating on the canonical form.
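As an illustration of the separator rule, the sketch below splits an article held in canonical form (CRLF line endings) at the first truly empty line. It is written in Python and is not taken from either draft; the function name split_article is hypothetical.

```python
def split_article(raw: bytes):
    """Split a canonical article into (headers, body).

    The headers end at the first truly empty line. A line holding only
    white space does not match, and further empty lines stay in the body.
    """
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no empty separator line found")
    # Restore the CRLF that terminates the last header line.
    return head + b"\r\n", body
```

An agent that stores articles with a bare LF would need to convert to the canonical form first, or search for the equivalent empty line in its own format.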
4.2. Headers 4.2. Headers
4.2.1. Names and Contents
Despite the restrictions on header-name syntax imposed by the
grammar, relayers and reading agents SHOULD tolerate header names
containing any US-ASCII printable character other than colon (":",
US-ASCII 58).
Header-names SHOULD be either those for which a USENET-header-content
is established by this standard, or by [RFC 2822], or by any
extension to either of these standards including, in particular, the
MIME standards [RFC 2045] et seq., or else experimental headers
beginning with "X-" (as defined in 4.2.2.1). Software SHOULD NOT
attempt to interpret headers not described in this standard or in its
extensions, but relaying agents MUST pass them on unaltered and
reading agents MUST enable them to be displayed, at least optionally.
The possibility of allowing header-parameters to appear in all
headers is provided mainly for the purpose of allowing future
extensions to existing headers, since only a very few USENET-header-
parameters are actually defined in this standard. Observe that such
header-parameters do not, in general, occur in headers defined in
other standards, except for the MIME standards [RFC 2045] et seq. and
their extensions. Nevertheless, compliant software MUST accept all
such header-parameters in headers defined by this standard and its
extensions (ignoring them if their meaning is unknown) and SHOULD
accept (and ignore) them in all headers.
[but what about
address = mailbox / group
group = phrase ":" [mailbox-list] ";"
Does the following NOTE cover the situation?]
NOTE: The presence of a ";" in a header-content does not
indicate the presence of a header-parameter in the few
situations where it can be parsed as part of some USENET-
header-content or other-header-content.
On the other hand, posting agents SHOULD NOT generate header-
parameters (even those using x-tokens) except in those headers for
which a USENET-header-parameter has been defined, or where that usage
is permitted by some other standard (notably one of the MIME
standards). This restriction is likely to be removed in a future version
of this standard.
NOTE: The given syntax is ambiguous insofar as a USENET-header-
content that is defined to be <unstructured> could contain,
within that <unstructured>, text of the form <*(";" header-
parameter)>. The intention is therefore that any such apparent
header-parameters are to be regarded as part of the
<unstructured>. This standard therefore does not (and extensions
to it SHOULD NOT) define any USENET-header-parameter to be
associated with such an unstructured USENET-header-content.
The order of headers in an article is not significant. However, The order of headers in an article is not significant. However,
posting agents are encouraged to put mandatory headers (section 5) posting agents are encouraged to put mandatory headers (section 5)
first, followed by optional headers (section 6), followed by first, followed by optional headers (section 6), followed by
experimental headers and headers not defined in this standard or its experimental headers and headers not defined in this standard or its
extensions. Relaying agents MUST NOT change the order of the headers extensions. Relaying agents MUST NOT change the order of the headers
in an article. in an article.
4.2.1. Naming of Headers
Despite the restrictions on header-name syntax imposed by the
grammar, relayers and reading agents SHOULD tolerate header-names
containing any US-ASCII printable character other than colon (":",
US-ASCII 58).
Whilst relaying agents MUST accept, and pass on unaltered, any non-
variant header whose header-name is syntactically correct, and
reading agents MUST enable them to be displayed, at least optionally,
posting and injecting agents SHOULD NOT generate headers other than
o headers established by this standard or any extension to it;
o those recognized by other IETF-established standards, notably the
Email standard [RFC 2822] and its extensions, excluding any
explicitly deprecated for Netnews (e.g. see section 9.2.1 for the
deprecated Disposition-Notification-To-header); or,
alternatively, those listed in some future IANA registry of
recognized headers;
o experimental headers beginning with "X-" (as defined in 4.2.5.1);
o on a provisional basis only, headers related to new protocols
under development which are the subject of (or intended to be the
subject of) some IETF-approved RFC (whether Informational,
Experimental or Standards-Track).
However, software SHOULD NOT attempt to interpret headers not
specifically intended to be meaningful in the Netnews environment.
Header-names are case-insensitive. There is a preferred case Header-names are case-insensitive. There is a preferred case
convention, which posters and posting agents Ought to use: each convention, which posters and posting agents Ought to use: each
hyphen-separated "word" has its initial letter (if any) in uppercase hyphen-separated "word" has its initial letter (if any) in uppercase
and the rest in lowercase, except that some abbreviations have all and the rest in lowercase, except that some abbreviations have all
letters uppercase (e.g. "Message-ID" and "MIME-Version"). The forms letters uppercase (e.g. "Message-ID" and "MIME-Version"). The forms
used in this standard are the preferred forms for the headers given in the various rules defining headers in this standard are the
described herein. Relaying and reading agents MUST, however, tolerate preferred forms for them. Relaying and reading agents MUST, however,
articles not obeying this convention. tolerate articles not obeying this convention.
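The case convention can be sketched as follows. This Python fragment is illustrative only: the names ABBREVIATIONS, preferred_case and same_header are hypothetical, and the abbreviation table is an assumption that holds just the two examples named in the text ("ID" and "MIME"); a real agent would need the full set of preferred forms.

```python
# Assumed table of all-uppercase abbreviations (only those named in the text).
ABBREVIATIONS = {"id": "ID", "mime": "MIME"}


def preferred_case(name: str) -> str:
    """Return a header-name in the preferred case convention."""
    words = name.split("-")
    return "-".join(
        ABBREVIATIONS.get(word.lower(), word.capitalize()) for word in words
    )


def same_header(first: str, second: str) -> bool:
    """Compare header-names case-insensitively, as receiving agents must."""
    return first.lower() == second.lower()
```

A posting agent might apply preferred_case when generating headers, while relaying and reading agents would rely only on a comparison such as same_header.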
4.2.2. Header Properties
There are three special properties that may apply to particular
headers, namely: "experimental", "inheritable", and "variant". When a
header is defined, in this (or any future) standard, as having one
(or possibly more) of these properties, it is subject to special
treatment, as indicated below.
4.2.2.1. Experimental Headers
Experimental headers are those whose header-names begin with "X-". 4.2.2. MIME-style Parameters
They are to be used for experimental Netnews features, or for
enabling additional material to be propagated with an article. There
are no established headers (see 4.2.1) that are considered
experimental headers; an established header cannot be experimental.
NOTE: Some such headers may eventually be adopted as standard by The possibility of allowing MIME-style parameters (whether header-
some extension to this standard, at which point they will lose specific ones or generic other-parameters) to appear in virtually all
their "X-" prefix. headers is provided mainly for the purpose of allowing future
extensions to existing headers, since only a very few specific
parameters are defined in this standard. Observe that such parameters
do not, in general, occur in headers defined in other standards,
except for the MIME standards [RFC 2045] et seq. and their
extensions.
4.2.2.2. Inheritable Headers Other-parameters (whether those defined elsewhere or experimental
parameters whose attribute is an x-token) MAY be used, where the
syntax so allows, in any of the headers defined in this standard or
its extensions except that, at present, they SHOULD NOT be used in
headers in widespread use prior to the introduction of this standard
(this restriction is likely to be removed in a future version of this
standard). Nevertheless, compliant software MUST accept such
parameters where required by this standard (ignoring them if their
meaning is unknown) and SHOULD accept (and ignore) them in all
structured headers wherever defined.
Subject only to the overriding ability of the poster to determine the NOTE: The syntax does not permit other-parameters in
contents of the headers in a proto-article, headers with the unstructured headers (where they are unnecessary) or in certain
inheritable property MUST be copied by followup agents (perhaps with headers (notably the From-, Reply-To-, Mail-Copies-To- and
some modification) into the followup article, and headers without Complaints-To-headers) containing address-lists or mailbox-lists
that property MUST NOT be so copied. Examples include: (so that agents can simply replace the header-name by "To" or
o Newsgroups (5.5) - copied from the precursor, subject to any "Cc" to obtain a header immediately suitable for sending Email,
Followup-To header. and also so as to avoid some minor parsing problems with
o Subject (5.4) - modified by prefixing with "Re: ", but otherwise addresses).
copied from the precursor.
o References (6.10) - copied from the precursor, with the addition
of the precursor's Message-ID.
o Distribution (6.6) - copied from the precursor.
NOTE: The Keywords header is not inheritable, though some older Each header-specific parameter introduced in this standard is
newsreaders treated it as such. described by specifying
(a) the token to be used in its attribute, and
(b) the syntax rule(s) defining the object(s) permitted in its
4.2.2.3. Variant Headers value.
If a value object is not of the syntactic form of a token, it MUST
(and otherwise MAY) be encapsulated in a quoted-string (see 2.4.3).
Observe that the syntax of a parameter also allows additional WSP,
folding and comments.
Headers with the variant property may differ between (or even be The semantics of a parameter is always to associate the token in its
completely absent from) copies of the same article as stored or attribute with the object represented by the token, or the semantic
relayed throughout a Netnews system. The manner of the difference (or value (2.4.3) of the quoted-string, contained in its value.
absence) MUST be as specified in this (or any future) standard.
Typically, these headers are modified as articles are propagated, or
they reflect the status of the article on a particular serving agent, For example, the posting-sender-parameter (6.19) is defined to be
or cooperating group of such agents. The variant header MAY be placed <a parameter with attribute "sender" and value some sender-value>
anywhere within the headers (though placing it first is recommended). where
The principal examples are: sender-value = mailbox / "verified"
o Path (5.6) - augmented at each relaying agent that an article A valid posting-sender-parameter would be
passes through. sender = "\"Joe D. Bloggs\" <jdbloggs@example.com>" (authinfo)
o Xref (6.16) - used to keep track of the article locators of The comment (syntactically part of the quoted-string) is irrelevant.
crossposted articles so that newsreaders serviced by a particular The actual mailbox (to be used, for example, if email is to be sent
serving agent can mark such articles as read. to the sender) is
"Joe D. Bloggs" <jdbloggs@example.com>
4.2.3. White Space and Continuations 4.2.3. White Space and Continuations
Each header is logically a single line of characters comprising the Each header is logically a single line of characters comprising the
header-name, the colon with its following SP, and the header-content. header-name, the colon with its following SP, the content, and any
For convenience, however, the header-content can be split into a parameters. For convenience, however, the content and parameters can
multiple line representation; this is called "folding". The general be "folded" into a multiple line representation by inserting a CRLF
rule is that wherever this standard allows for FWS or CFWS (but not before any WSP contained within any FWS or CFWS (but not any other SP
simply SP or HTAB) a CRLF may be inserted before any WSP. For or HTAB) allowed by this standard. For example, the header:
example, the header: Approved: modname@modsite.example (Moderator of example.foo.bar)
Approved: modname@modsite.example (Moderator of comp.foo.bar)
can be represented as: can be represented as:
Approved: modname@modsite.example Approved: modname@modsite.example
(Moderator of comp.foo.bar) (Moderator of example.foo.bar)
NOTE: Though header-contents are defined in such a way that FWS occurs at many places in the syntax (usually within a CFWS) in
folding can take place between many of the lexical tokens (and order to allow the inclusion of comments, whitespace and folding. The
even within some of them), folding SHOULD be limited to placing syntax is in fact ambiguous insofar as it sometimes allows two
the CRLF at higher-level syntactic breaks, and SHOULD also avoid consecutive instantiations of FWS (at least one of which is always
leaving trailing WSP on the preceding line. For instance, if a optional), or of FWS followed by an explicit CRLF. However, all such
header-content is defined as comma-separated values, it is cases MUST be treated as if the optional instantiation (or one of
recommended that folding occur after the comma separating the them) had not been present. It is thus precluded that any line of a
structured items, even if it is allowed elsewhere. header should be made up of whitespace characters and nothing else
(for such a line might otherwise have been interpreted by a non-
compliant agent as the separator between the headers and the body of
the article).
Folding MUST NOT be carried out in such a way that any line of a NOTE: This does not lead to semantic ambiguity because, unless
header is made up entirely of WSP characters and nothing else. specifically stated otherwise, the presence or absence of
folding, a comment or additional WSP has no semantic meaning
and, in particular, it is a matter of indifference whether it
forms a part of the syntactic construct preceding it or the one
following it.
The colon following the header name on the first line MUST be
followed by a WSP, even if the header is empty. If the header is not
empty, at least some of the content MUST appear on the first line
(this is to avoid the possibility of harm by any non-compliant agent
that might eliminate a trailing SP). Posting agents MUST enforce
these restrictions, but relaying agents SHOULD accept even articles
that violate them.
NOTE: This standard differs from [RFC 2822] in requiring that NOTE: It may be observed that the content part of every header
WSP following the colon (it was also an [RFC 1036] requirement). begins and ends with an optional CFWS (or FWS in the case of
certain headers). Moreover, every parameter also begins and ends
with an optional CFWS.
Posters and posting agents SHOULD use SP, not HTAB, where white space NOTE: Though contents are defined in such a way that folding can
is desired in headers (some existing software expects this), and MUST take place between many of the lexical tokens (and even within
use SP immediately following the colon after a header-name. Relaying some of them), folding should be limited to placing the CRLF at
agents SHOULD accept HTAB in all such cases, however. higher-level syntactic breaks, and should also avoid leaving
trailing WSP on the preceding line. For instance, if a header-
content is defined as comma-separated values, it is recommended
that folding occur after the comma separating the structured
items, even if it is allowed elsewhere.
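The folding rules can be illustrated with a short Python sketch, which is not part of either draft. The function names unfold and has_no_blank_line are hypothetical, and the input is assumed to be a single header without its terminating CRLF.

```python
import re


def unfold(header: str) -> str:
    """Recover the logical line by removing each CRLF that precedes WSP."""
    return re.sub(r"\r\n(?=[ \t])", "", header)


def has_no_blank_line(header: str) -> bool:
    """Check that no line of a folded header is empty or consists only of WSP."""
    return all(line.strip(" \t") for line in header.split("\r\n"))


folded = "Approved: modname@modsite.example\r\n (Moderator of example.foo.bar)"
logical = unfold(folded)
```

Here logical holds the single-line form of the Approved example given earlier. Note that the WSP which begins the continuation line is kept, since it remains part of the logical line.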
In accordance with the syntax, the header-name on the first line MUST
be followed by a SP (even if the rest of the header is empty, but see
4.2.6). Even though the syntax allows otherwise, at least some of
the content MUST appear on that first line (to avoid the possibility
of harm by any non-compliant agent that might eliminate a trailing
WSP). Although posting agents are REQUIRED to enforce these
restrictions, relaying and serving agents SHOULD accept articles that
violate them.
Since the white space beginning a continuation line remains a part of NOTE: This standard differs from [RFC 2822] in requiring that SP
the logical line, headers can be "broken" into multiple lines only at following the colon (it was also an [RFC 1036] requirement).
FWS or CFWS. Posting agents Ought Not to break headers unnecessarily
(but see 4.5). Posters and posting agents SHOULD use SP, not HTAB, where white space
is desired in headers (some existing software expects this). Relaying
and serving agents SHOULD accept HTAB in all such cases, however.
4.2.4. Comments 4.2.4. Comments
Strings of characters which are treated as comments may be included Strings of characters which are treated as comments may be included
in header-contents wherever the syntactic element CFWS occurs. They in headers wherever the syntactic element CFWS occurs. They consist
consist of characters enclosed in parentheses. Such strings are of characters enclosed in parentheses. Comments may be nested.
considered comments so long as they do not appear within a quoted-
string. Comments may be nested. NOTE: Although CFWS occurs wherever whitespace is allowed in
almost all headers, there are exceptions where only FWS is
permitted (hence folding but no comments). Notably, this happens
in the case of the Newsgroups-, Distribution-, Path- and
Followup-To-headers, and within the Date-header except right at
the end.
A comment is normally used to provide some human readable A comment is normally used to provide some human readable
informational text, except at the end of an address which contains no informational text, except at the end of an address which contains no
phrase, as in phrase, as in
fred@foo.bar.example (Fred Bloggs) fred@foo.bar.example (Fred Bloggs)
as opposed to as opposed to
"Fred Bloggs" <fred@foo.bar.example> . "Fred Bloggs" <fred@foo.bar.example> .
The former is a deprecated, but commonly encountered, usage and The former is a deprecated, but commonly encountered, usage and
reading agents SHOULD take special note of such comments as reading agents SHOULD take special note of such comments as
indicating the name of the person whose address it is. In all other indicating the name of the person whose address it is. In all other
situations a comment is semantically interpreted as a single SP. situations a comment is semantically interpreted as a single SP.
Since a comment is allowed to contain FWS, folding is permitted Since a comment is allowed to contain FWS, folding is permitted
within it as well as immediately preceding and immediately following within it as well as immediately preceding and immediately following
it. Also note that, since quoted-pair is allowed in a comment, the it. Also note that, since quoted-pair is allowed in a comment, the
parenthesis and backslash characters may appear in a comment so long parenthesis and backslash characters may appear in a comment so long
as they appear as a quoted-pair. Semantically, the enclosing as they appear as a quoted-pair. Semantically, the enclosing
parentheses are not part of the comment content; the content is what parentheses are not part of the content of the comment; the content
is contained between the two parentheses. is what is contained between the two parentheses.
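The treatment of nested comments and quoted-pairs can be sketched as below. This Python fragment is illustrative only; the function name strip_comments is hypothetical, the input is assumed to be an already unfolded header content, and each top-level comment is replaced by a single SP in line with the general interpretation given above (the special case of a name following an address is not handled).

```python
def strip_comments(text: str) -> str:
    """Replace each top-level comment by one SP, honouring nesting,
    quoted-pairs and quoted-strings."""
    out = []
    depth = 0          # current nesting depth of parentheses
    in_quotes = False  # True while inside a quoted-string
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A quoted-pair never opens or closes a comment or quoted-string.
            if depth == 0:
                out.append(text[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth > 0:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    out.append(" ")  # the whole comment counts as one SP
        elif ch == "(":
            depth = 1
        else:
            if ch == '"':
                in_quotes = True
            out.append(ch)
        i += 1
    return "".join(out)
```

Parentheses inside a quoted-string are left alone, which reflects the requirement that such strings are not comments.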
Since comments have not hitherto been permitted in news articles, Since comments have not hitherto been permitted in news articles,
except in a few specified places, posters and posting-agents SHOULD except in a few specified places, posters and posting-agents SHOULD
NOT insert them except in those places, namely following addresses in NOT insert them except in those places, namely following addresses in
From and similar headers, and to indicate the name of the timezone in From and similar headers, and to indicate the name of the timezone in
Date headers. However, compliant software MUST accept them in all Date-headers. However, compliant software MUST accept them in all
places where they are syntactically allowed. places where they are syntactically allowed.
4.2.5. Undesirable Headers 4.2.5. Header Properties
A header whose content is empty is said to be an empty header. There are three special properties that may apply to particular
Relaying and reading agents SHOULD NOT consider presence or absence headers, namely: "experimental", "inheritable", and "variant". When a
of an empty header to alter the semantics of an article (although header is defined, in this (or any future) standard, as having one
syntactic rules, such as requirements that certain header names (or possibly more) of these properties, it is subject to special
appear at most once in an article, MUST still be satisfied). Posting treatment, as indicated below.
and injecting agents SHOULD delete empty headers from articles before
posting them; relaying agents MUST pass them untouched.
Headers that merely state defaults explicitly (e.g., a Followup-To 4.2.5.1. Experimental Headers
header with the same content as the Newsgroups header, or a MIME
Content-Type header with contents "text/plain; charset=us-ascii") or Experimental headers are those whose header-names begin with "X-".
They are to be used for experimental Netnews features, or for
enabling additional material to be propagated with an article. They
are not (and will not be) defined by this, or any, standard.
NOTE: Experimental headers are suitable for situations where
they need only to be human readable. They are not intended to be
recognized by widely deployed Netnews software and, should such
a requirement be envisaged, it is preferable to use a normal
header on the provisional basis set out in section 4.2.1.
4.2.5.2. Inheritable Headers
Subject only to the overriding ability of the poster to determine the
contents of the headers in a proto-article, headers with the
inheritable property MUST be copied by followup agents (perhaps with
some modification) into the followup article, and headers without
that property MUST NOT be so copied. Examples include:
o Newsgroups (5.5) - copied from the precursor, subject to any
Followup-To-header.
o Subject (5.4) - modified by prefixing with "Re: ", but otherwise
copied from the precursor.
o References (6.10) - copied from the precursor, with the addition
of the precursor's Message-ID.
o Distribution (6.6) - copied from the precursor.
NOTE: The Keywords-header is not inheritable, though some older
newsreaders treated it as such.
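As a rough illustration of the inheritable property, the following Python sketch derives the inherited headers of a followup from its precursor. The function name `inherit_headers`, the dict-based header representation, and the rule of not adding a second "Re: " are assumptions made for this example and are not defined by the draft; special Followup-To values are not handled, and the poster remains free to override the result.

```python
def inherit_headers(precursor):
    """Return the inheritable headers for a followup (illustrative only).

    `precursor` is a plain dict mapping header-names to unfolded contents.
    """
    followup = {}
    # Newsgroups: copied from the precursor, subject to any Followup-To.
    followup["Newsgroups"] = precursor.get("Followup-To", precursor["Newsgroups"])
    # Subject: prefixed with "Re: " (assumption: not repeated if already there).
    subject = precursor["Subject"]
    followup["Subject"] = subject if subject.startswith("Re: ") else "Re: " + subject
    # References: the precursor's References plus the precursor's Message-ID.
    refs = precursor.get("References", "").split()
    followup["References"] = " ".join(refs + [precursor["Message-ID"]])
    # Distribution: copied unchanged when present.
    if "Distribution" in precursor:
        followup["Distribution"] = precursor["Distribution"]
    # Headers without the inheritable property (e.g. Keywords) are not copied.
    return followup
```

Keeping the inheritable headers in one function makes it easy to see that nothing else is carried over from the precursor.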
4.2.5.3. Variant Headers
Headers with the variant property may differ between (or even be
completely absent from) copies of the same article as stored or
relayed throughout a Netnews system. The manner of the difference (or
absence) MUST be as specified in this (or any future) standard.
Typically, these headers are modified as articles are propagated, or
they reflect the status of the article on a particular serving agent,
or cooperating group of such agents. The variant header MAY be placed
anywhere within the headers (though placing it first is recommended).
The principal examples are:
o Path (5.6) - augmented at each relaying agent that an article
passes through.
o Xref (6.16) - used to keep track of the article locators of
crossposted articles so that newsreaders serviced by a particular
serving agent can mark such articles as read.
4.2.6. Undesirable Headers
A header whose content is empty is said to be an empty header (in
fact, no such headers are defined by this standard). Relaying and
reading agents SHOULD NOT consider presence or absence of an empty
header to alter the semantics of an article (although syntactic
rules, such as requirements that certain header-names appear at most
once, MUST still be satisfied). Posting and injecting agents SHOULD
delete empty headers from articles before posting them; relaying
agents MUST pass them untouched.
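A minimal sketch of the deletion a posting or injecting agent might perform is given below. The name `drop_empty_headers` and the list-of-pairs representation are hypothetical; a relaying agent would not call such a function, since it is required to pass empty headers untouched.

```python
def drop_empty_headers(headers):
    """Remove headers whose content is empty or only whitespace.

    `headers` is a list of (header-name, content) pairs; order is preserved.
    """
    return [(name, content) for name, content in headers if content.strip()]
```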
Headers that merely state defaults explicitly (e.g., a Followup-To-
header with the same content as the Newsgroups-header, or a MIME
Content-Type-header with contents "text/plain; charset=us-ascii") or
state information that reading agents can typically determine easily state information that reading agents can typically determine easily
themselves (e.g. the length of the body in octets) are redundant and themselves (e.g. the length of the body in octets) are redundant and
posters and posting agents Ought Not to include them. posters and posting agents Ought Not to include them.
4.3. Body 4.3. Body
4.3.1. Body Format Issues 4.3.1. Body Format Issues
The body of an article SHOULD NOT be empty. A posting or injecting The body of an article SHOULD NOT be empty. A posting or injecting
agent which does not reject such an article entirely SHOULD at least agent which does not reject such an article entirely SHOULD at least
issue a warning message to the poster and supply a non-empty body. issue a warning message to the poster and supply a non-empty body.
Note that the separator line MUST be present even if the body is Note that the separator line MUST be present even if the body is
empty. empty.
NOTE: Some existing news software is known to react badly to NOTE: Some existing news software is known to react badly to
body-less articles, hence the request for posting and injecting body-less articles, hence the request for posting and injecting
agents to insert a body in such cases. The sentence "This agents to insert a body in such cases. The sentence "This
article was probably generated by a buggy news reader" has article was probably generated by a buggy news reader" has
traditionally been used in this situation. traditionally been used in this situation.
Note that an article body is a sequence of lines terminated by CRLFs, Note that an article body is a sequence of lines terminated by CRLFs,
not arbitrary binary data, and in particular it MUST end with a CRLF. not arbitrary binary data, and in particular it MUST end with a CRLF.
However, relaying agents SHOULD treat the body of an article as an However, relaying and serving agents SHOULD treat the body of an
uninterpreted sequence of octets (except as mandated by changes of
CRLF representation and by control-message processing) and SHOULD
avoid imposing constraints on it. See also section 4.5. article as an uninterpreted sequence of octets (except as mandated by
changes of CRLF representation and by control message processing, as
in 7.2.4) and SHOULD avoid imposing constraints on it. See also
section 4.5.
Posters SHOULD avoid using control characters and escape sequences Posters SHOULD avoid using control characters and escape sequences
except for tab (US-ASCII 9), formfeed (US-ASCII 12) and, possibly, except for tab (US-ASCII 9), formfeed (US-ASCII 12) and, possibly,
backspace (US-ASCII 8). Tab signifies sufficient horizontal white backspace (US-ASCII 8). Tab signifies sufficient horizontal white
space to reach the next of a set of fixed positions; posters are space to reach the next of a set of fixed positions; posters are
warned that there is no standard set of positions, so tabs should be warned that there is no standard set of positions, so tabs should be
avoided if precise spacing is essential. Formfeed (which is sometimes avoided if precise spacing is essential. Formfeed (which is sometimes
referred to as the "spoiler character") signifies a point at which a referred to as the "spoiler character") signifies a point at which a
reading agent Ought to pause and await reader interaction before reading agent Ought to pause and await reader interaction before
displaying further text. displaying further text.
skipping to change at page 21, line 56 skipping to change at page 24, line 36
NOTE: Backspace was historically used for underlining, done by NOTE: Backspace was historically used for underlining, done by
an underscore (US-ASCII 95), a backspace, and a character, an underscore (US-ASCII 95), a backspace, and a character,
repeated for each character that should be underlined. Posters repeated for each character that should be underlined. Posters
are warned that underlining is not available on all output are warned that underlining is not available on all output
devices or supported by all reading agents and is best not devices or supported by all reading agents and is best not
relied on for essential meaning. relied on for essential meaning.
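The historical overstrike sequence described in the NOTE can be sketched as follows. The helper name `underline` is an assumption for illustration only, and the sketch does not suggest that the technique is recommended.

```python
def underline(text: str) -> str:
    """Encode text using underscore, backspace, character for each character."""
    # "_" is US-ASCII 95 and "\b" is backspace (US-ASCII 8).
    return "".join("_\b" + ch for ch in text)
```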
4.3.2. Body Conventions 4.3.2. Body Conventions
A body is by default an uninterpreted sequence of octets for most of A body is by default an uninterpreted sequence of octets for most of
the purposes of this standard. However, a MIME Content-Type header the purposes of this standard. However, a MIME Content-Type-header
may impose some structure or intended interpretation upon it, and may may impose some structure or intended interpretation upon it, and may
also specify the character set in accordance with which the octets also specify the character set in accordance with which the octets
are to be interpreted. are to be interpreted.
The following conventions for quotations, attributions and
signatures, although not mandated by this standard, describe widely
used practices. They are documented here in order to establish their
correct usage, and the use of the words "MUST", "SHOULD", etc. is to
be understood accordingly.
It is a common practice for followup agents to enable the It is conventional for followup agents to enable the incorporation of
incorporation of the followed-up article (the "precursor") as a the followed-up article (the "precursor") as a quotation. This SHOULD
quotation. This SHOULD be done by prefacing each line of the quoted be done by prefacing each line of the quoted text (even if it is
text (even if it is empty) with the character ">" (or perhaps with empty) with the character ">" (or perhaps with "> " in the case of a
"> " in the case of a previously unquoted line). This will result in previously unquoted line). This will result in multiple levels of ">"
multiple levels of ">" when quoted content itself contains quoted when quoted content itself contains quoted content, and it will also
content, and it will also facilitate the automatic analysis of facilitate the automatic analysis of articles.
articles.
NOTE: Posters should edit quoted context to trim it down to the NOTE: Posters should edit quoted context to trim it down to the
minimum necessary. However, followup agents Ought Not to attempt minimum necessary. However, followup agents Ought Not to attempt
to enforce this beyond issuing a warning (past attempts to do so to enforce this beyond issuing a warning (past attempts to do so
have been found to be notably counter-productive). have been found to be notably counter-productive).
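The quoting convention can be sketched in Python as below. The function name `quote_body` is hypothetical, and the choice of "> " for previously unquoted lines follows the "or perhaps" option in the text rather than any requirement.

```python
def quote_body(lines):
    """Prefix every line, including empty ones, with a quote marker."""
    quoted = []
    for line in lines:
        # Already-quoted lines get a bare ">", so levels nest as ">>", ">>>".
        # Previously unquoted lines (even empty ones) get "> ".
        prefix = ">" if line.startswith(">") else "> "
        quoted.append(prefix + line)
    return quoted
```

Because every line is prefixed, the quoting depth of any line can be recovered by counting its leading ">" characters.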
The followup agent SHOULD also precede the quoted content by an The followup agent SHOULD also precede the quoted content by an
"attribution line" (however, readers are warned not to assume that "attribution line" (however, readers are warned not to assume that
they are accurate, especially within multiply nested quotations). The they are accurate, especially within multiply nested quotations). The
following convention for such lines, whilst not mandated by this following convention for such lines is intended to facilitate their
standard, is intended to facilitate their automatic recognition and automatic recognition and processing by sophisticated reading agents.
processing by sophisticated reading agents. The attribution SHOULD The attribution SHOULD contain the name or the email address of the
contain the name or the email address of the precursor's poster, as precursor's poster, as in
in
Joe D. Bloggs <jdbloggs@foo.example> wrote: Joe D. Bloggs <jdbloggs@foo.example> wrote:
or or
Helmut Schmidt <helmut@bar.example> schrieb: Helmut Schmidt <helmut@bar.example> schrieb:
The attribution MAY contain also a single Newsgroup name (the one The attribution MAY contain also a single newsgroup-name (the one
from which the followup is being made), the precursor's Message-ID from which the followup is being made), the precursor's Message-ID
and/or the precursor's Date and Time. Any of these that are present and/or the precursor's Date and Time. Any of these that are present
SHOULD precede the name and/or email address. However, the inclusion SHOULD precede the name and/or email address. However, the inclusion
or not of such fields Ought always to be under the control of the or not of such fields Ought always to be under the control of the
poster. poster.
To enable this line, and the Message-ID and the Email address within To enable this line, and the Message-ID and the email address within
it, to be recognised (for example to enable suitable reading agents it, to be recognized (for example to enable suitable reading agents
to retrieve the precursor or email its poster by clicking on them), to retrieve the precursor or email its poster by clicking on them),
the following conventions SHOULD be observed: the following conventions SHOULD be observed:
o The precursor's Message-ID SHOULD be enclosed within <...> or o The precursor's Message-ID SHOULD be enclosed within <...> or
<news:...> <news:...>
o The precursor's poster's Email address SHOULD be enclosed within o The precursor's poster's email address SHOULD be enclosed within
<...> <...>
o The various fields may be separated by arbitrary text and they o The various fields may be separated by arbitrary text and they
may be folded in the same way as headers, but attributions SHOULD may be folded in the same way as headers, but attributions SHOULD
always be terminated by a ":" followed by CRLF. always be terminated by a ":" followed by CRLF.
Further examples: Further examples:
On comp.foo in <1234@bar.example> on 24 Dec 1997 16:40:20 +0000, On comp.foo in <1234@bar.example> on 24 Dec 2001 16:40:20 +0000,
Joe D. Bloggs <jdbloggs@bar.example> wrote: Joe D. Bloggs <jdbloggs@bar.example> wrote:
Am 24. Dez 1997 schrieb Helmut Schmidt <helmut@bar.example>: Am 24. Dez 2001 schrieb Helmut Schmidt <helmut@bar.example>:
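One way a reading agent might recognize such lines is sketched below. It is a heuristic under stated assumptions: the name `parse_attribution` is hypothetical, the last <...> item is taken to be the email address and the first of several to be the Message-ID (relying on the convention that the Message-ID precedes the address), and a line carrying only a Message-ID would be misread.

```python
import re

# Anything enclosed in <...> that contains no whitespace or nested brackets.
ANGLE = re.compile(r"<([^<>\s]+)>")

def parse_attribution(line):
    """Return (message_id, email) from an attribution line, or None."""
    text = " ".join(line.split())  # undo folding and collapse whitespace
    if not text.endswith(":"):
        return None  # attributions are terminated by ":" followed by CRLF
    items = ANGLE.findall(text)
    if not items:
        return None
    email = items[-1]
    message_id = None
    if len(items) > 1:
        message_id = items[0]
        if message_id.startswith("news:"):
            message_id = message_id[len("news:"):]
    return message_id, email
```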
A "personal signature" is a short closing text automatically added to A "personal signature" is a short closing text automatically added to
the end of articles by posting agents, identifying the poster and the end of articles by posting agents, identifying the poster and
giving his network addresses, etc. If a poster or posting agent does giving his network addresses, etc. Whenever a poster or posting agent
append such a signature to an article, it MUST be preceded with a appends such a signature to an article, it MUST be preceded with a
delimiter line containing (only) two hyphens (US-ASCII 45) followed delimiter line containing (only) two hyphens (US-ASCII 45) followed
by one SP (US-ASCII 32). The signature is considered to extend from by one SP (US-ASCII 32). The signature is considered to extend from
the last occurrence of that delimiter up to the end of the article the last occurrence of that delimiter up to the end of the article
(or up to the end of the part in the case of a multipart MIME body). (or up to the end of the part in the case of a multipart MIME body).
Followup agents, when incorporating quoted text from a precursor, Followup agents, when incorporating quoted text from a precursor,
Ought Not to include the signature in the quotation. Posting agents Ought Not to include the signature in the quotation. Posting agents
Ought to discourage (at least with a warning) signatures of excessive Ought to discourage (at least with a warning) signatures of excessive
length (4 lines is a commonly accepted limit). length (4 lines is a commonly accepted limit).
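The delimiter rule can be illustrated with a short Python sketch that a followup agent might use to leave the signature out of a quotation. The function name `split_signature` is an assumption, and the sketch handles only a single-part body with CRLF line endings.

```python
def split_signature(body: str):
    """Split a body into (text, signature) at the last "-- " delimiter line."""
    lines = body.split("\r\n")
    # Search backwards so that the last occurrence of the delimiter wins.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i] == "-- ":  # exactly two hyphens followed by one SP
            return "\r\n".join(lines[:i]), "\r\n".join(lines[i + 1:])
    return body, ""
```

Note that a line containing only "--", without the trailing SP, is not treated as a delimiter.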
4.4. Characters and Character Sets 4.4. Characters and Character Sets
Transmission paths for news articles MUST treat news articles as Transmission paths for news articles MUST treat news articles as
uninterpreted sequences of octets, excluding the values 0 (US-ASCII uninterpreted sequences of octets, excluding the values 0 (US-ASCII
NUL) and 13 and 10 (US-ASCII CR and LF, which MUST ONLY appear in the NUL) and 13 and 10 (US-ASCII CR and LF, which MUST ONLY appear in the
combination CRLF which denotes a line separator). combination CRLF which denotes a line separator).
NOTE: this correspponds to the range of octets permitted for NOTE: this corresponds to the range of octets permitted for MIME
MIME "8bit data" [RFC 2045]. Thus raw binary data cannot be "8bit data" [RFC 2045]. Thus raw binary data cannot be
transmitted in an article body except by the use of a Content- transmitted in an article body except by the use of a Content-
Transfer-Encoding such as base64. Transfer-Encoding such as base64.
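The octet restriction above can be checked mechanically. The following is a minimal sketch, with the hypothetical name `valid_article_octets`, that tests only the NUL, CR and LF rule and nothing else about article syntax.

```python
def valid_article_octets(data: bytes) -> bool:
    """Check that NUL is absent and that CR and LF occur only as CRLF."""
    if b"\x00" in data:
        return False
    # After removing every CRLF pair, any remaining CR or LF is a stray one.
    rest = data.replace(b"\r\n", b"")
    return b"\r" not in rest and b"\n" not in rest
```

Octets with the high bit set pass the check, which matches the "8bit data" range mentioned in the NOTE.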
Character data is represented by octets in accordance with some Character data is represented by octets in accordance with some
encoding scheme (UTF-8 for headers, and determined by the Content- encoding scheme (UTF-8 for headers, and determined by the Content-
Type and Content-Transfer-Encoding headers for bodies). Type- and Content-Transfer-Encoding-headers for bodies).
If it comes to a relaying agent's attention that it is being asked to If it comes to a relaying agent's attention that it is being asked to
pass an article using the Content-Transfer-Encoding "8bit" to a pass an article using the Content-Transfer-Encoding "8bit" to a
relaying agent that does not support it, it SHOULD report this error relaying agent that does not support it, it SHOULD report this error
to its administrator. It MUST refuse to pass the article and MUST NOT to its administrator. It MUST refuse to pass the article and MUST NOT
re-encode it with different MIME encodings. re-encode it with different MIME encodings.
NOTE: This strategy will do little harm. The target relaying NOTE: This strategy will do little harm. The target relaying
agent is unlikely to be able to make use of the article on its agent is unlikely to be able to make use of the article on its
own servers, and the usual flooding algorithm will likely find own servers, and the usual flooding algorithm will likely find
skipping to change at page 24, line 5 skipping to change at page 26, line 46
Within article headers, characters are represented as octets Within article headers, characters are represented as octets
according to the UTF-8 encoding scheme [RFC 2279] or [ISO/IEC 10646], according to the UTF-8 encoding scheme [RFC 2279] or [ISO/IEC 10646],
and hence all the characters in Unicode [UNICODE 3.1] or in the and hence all the characters in Unicode [UNICODE 3.1] or in the
Universal Multiple-Octet Coded Character Set (UCS) [ISO/IEC 10646] Universal Multiple-Octet Coded Character Set (UCS) [ISO/IEC 10646]
(which is essentially a superset of Unicode and expected to remain (which is essentially a superset of Unicode and expected to remain
so) are potentially available. However, processing all octets in the so) are potentially available. However, processing all octets in the
same manner as US-ASCII characters should ensure correct behaviour in same manner as US-ASCII characters should ensure correct behaviour in
most situations. most situations.
NOTE: UTF-8 is an encoding for 16bit (and even 32bit) character NOTE: UTF-8 is an encoding for 16bit (and even 32bit) character
sets with the property that any octet less than 128 immediately sets with the property that any octet less than 128 immediately
represents the corresponding US-ASCII character, thus ensuring represents the corresponding US-ASCII character, thus ensuring
upwards compatibility with previous practice. Non-ASCII upwards compatibility with previous practice. Non-ASCII
characters from Unicode are represented by sequences of octets characters from Unicode are represented by sequences of octets
satisfying the syntax of a UTF8-xtra-char (2.4), which excludes satisfying the syntax of a UTF8-xtra-char (2.4.2), which
certain octet sequences not explicitly permitted by [RFC 2279]. excludes certain octet sequences not explicitly permitted by
Unicode includes all characters from the ISO-8859 series of [RFC 2279]. Unicode includes all characters from the ISO-8859
character sets [ISO 8859] (which includes all Cyrillic, Greek series of character sets [ISO 8859] (which includes all
and Arabic characters) together with the more elaborate Cyrillic, Greek and Arabic characters) together with the more
characters used in Asian countries. See the following section elaborate characters used in Asian countries. See the following
for the appropriate treatment of Unicode characters by reading section for the appropriate treatment of Unicode characters by
agents. reading agents.
Notwithstanding the great flexibility permitted by UTF-8, there is Notwithstanding the great flexibility permitted by UTF-8, there is
need for restraint in its use in order that the essential components need for restraint in its use in order that the essential components
of headers may be discerned using reading agents that cannot present of headers may be discerned using reading agents that cannot present
the full Unicode range. In particular, header-names and tokens MUST the full Unicode range. In particular, header-names and tokens MUST
be in US-ASCII, and certain other components of headers, as defined be in US-ASCII, and certain other components of headers, as defined
elsewhere in this standard - notably msg-ids, date-times, dot-atoms, elsewhere in this standard - notably msg-ids, date-times, dot-atoms,
domains and path-identities - MUST be in US-ASCII. Comments, phrases domains and path-identities - MUST be in US-ASCII. Comments, phrases
(as in addresses) and unstructureds (as in Subject headers) MAY use (as in addresses) and unstructured headers (such as the Subject-,
the full range of UTF-8 characters, but SHOULD nevertheless be Organization- and Summary-headers) MAY use the full range of UTF-8
invariant under Unicode normalization NFC [UNICODE 3.1]. characters, but SHOULD nevertheless be invariant under Unicode
normalization NFC [UNICODE 3.1].
NOTE: Unicode allows for composite characters made up of a NOTE: Unicode allows for composite characters made up of a
starter character - which can be a letter, number, punctuation starter character - which can be a letter, number, punctuation
mark, or symbol - plus zero or more combining marks (such as mark, or symbol - plus zero or more combining marks (such as
accents, diacritics, and similar). The requirement that a accents, diacritics, and similar). The requirement that a
composite be invariant under normalization NFC means that, where composite be invariant under normalization NFC means that, where
it could be written in more than one way, only one particular it could be written in more than one way, only one particular
one is allowed (for example, the single character E-acute is one is allowed (for example, the single character E-acute is
preferred over E followed by a non-spacing acute accent, and A- preferred over E followed by a non-spacing acute accent, and A-
ring is preferred over the Angstrom symbol). At least for the ring is preferred over the Angstrom symbol). At least for the
skipping to change at page 24, line 52 skipping to change at page 27, line 39
already available as single characters, it is unlikely that already available as single characters, it is unlikely that
posting agents will need to take any special steps to ensure posting agents will need to take any special steps to ensure
normalization. normalization.
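Invariance under normalization NFC can be tested with Python's standard `unicodedata` module, as in the sketch below. The function name `is_nfc_invariant` is an assumption; the two failing cases in the usage note are the ones mentioned in the NOTE above.

```python
import unicodedata

def is_nfc_invariant(text: str) -> bool:
    """True if applying normalization NFC would leave the text unchanged."""
    return unicodedata.normalize("NFC", text) == text
```

For instance, the single character E-acute (U+00C9) is invariant, whereas "E" followed by a combining acute accent (U+0301) is not, and neither is the Angstrom symbol (U+212B), which normalizes to A-ring (U+00C5).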
In the particular case of newsgroup-names (see 5.5) there are more In the particular case of newsgroup-names (see 5.5) there are more
stringent requirements regarding the use of UTF-8 and Unicode. stringent requirements regarding the use of UTF-8 and Unicode.
Where the use of non-ASCII characters, encoded in UTF-8, is permitted Where the use of non-ASCII characters, encoded in UTF-8, is permitted
as above, they MAY also be encoded using the MIME mechanism defined as above, they MAY also be encoded using the MIME mechanism defined
in [RFC 2047], but this usage is deprecated within news articles in [RFC 2047], but this usage is deprecated within news articles
(even though it is required in mail messages) since it is less (even though it is required in email messages) since it is less
legible in older reading agents which support neither it nor UTF-8. legible in older reading agents which support neither it nor UTF-8.
Nevertheless, reading agents SHOULD support this usage, but only in Nevertheless, reading agents SHOULD support this usage, but only in
those contexts explicitly mentioned in [RFC 2047]. those contexts explicitly mentioned in [RFC 2047].
Similar considerations apply to non-ASCII characters within the
values of parameters (which, according to the syntax, MUST be in the
form of quoted-strings in order for UTF8-xtra-chars to be
accommodated). Such values MAY be encoded using the MIME mechanism
defined in [RFC 2231], but this usage is deprecated within news
articles (even though it is required in email messages) since it is
less legible in older reading agents which support neither it nor
UTF-8. Nevertheless, reading agents SHOULD support this usage.
4.4.2. Character Sets within Article Bodies 4.4.2. Character Sets within Article Bodies
Within article bodies, characters are represented as octets according Within article bodies, characters are represented as octets according
to the encoding scheme implied by any Content-Transfer-Encoding and to the encoding scheme implied by any Content-Transfer-Encoding- and
Content-Type headers [RFC 2045]. In the absence of such headers, Content-Type-headers [RFC 2045]. In the absence of such headers,
reading agents cannot be relied upon to display correctly more than reading agents cannot be relied upon to display correctly more than
the US-ASCII characters. the US-ASCII characters, though they MUST display at least those.
NOTE: Observe that reading agents are not forbidden to "guess", NOTE: Observe that reading agents are not forbidden to "guess",
or to interpret as UTF-8 regardless, which would be the simplest or to interpret as UTF-8 regardless, which would be the simplest
course for them to take. course for them to take.
NOTE: It is not expected that reading agents will necessarily be NOTE: It is not expected that reading agents will necessarily be
able to present characters in all possible character sets, able to present characters in all possible character sets. For
although they MUST be able to present all US-ASCII characters. example, a reading agent might be able to present only the ISO-
For example, a reading agent might be able to present only the 8859-1 (Latin 1) characters [ISO 8859], in which case it Ought
ISO-8859-1 (Latin 1) characters [ISO 8859], in which case it to present undisplayable characters using some distinctive
Ought to present undisplayable characters using some distinctive glyph, or by exhibiting a suitable warning.
glyph, or by exhibiting a suitable warning. Older reading agents
that do not understand MIME headers or UTF-8 should be able to
display bodies in US-ASCII (with some loss of human
comprehensibility) except possibly when the Content-Transfer-
Encoding is "8bit".
Followup agents MUST be careful to apply appropriate encodings to the Followup agents MUST be careful to apply appropriate encodings to the
outbound followup. A followup to an article containing non-ASCII outbound followup. A followup to an article containing non-ASCII
material is very likely to contain non-ASCII material itself. material is very likely to contain non-ASCII material itself.
4.5. Size Limits 4.5. Size Limits
Posting agents SHOULD endeavour to keep all header lines, so far as Posting agents SHOULD endeavour to keep all header lines, so far as
is possible, within 79 characters by folding them at suitable places is possible, within 79 characters by folding them at suitable places
(see 4.2.3). However, posting agents MUST permit the poster to (see 4.2.3). However, posting agents MUST permit the poster to
skipping to change at page 26, line 5 skipping to change at page 28, line 46
folding, be made to exceed the 998 octets restriction pertaining folding, be made to exceed the 998 octets restriction pertaining
to a single header line). to a single header line).
The syntax provides for the lines of a body to be up to 998 octets in The syntax provides for the lines of a body to be up to 998 octets in
length, not including the CRLF. All software compliant with this length, not including the CRLF. All software compliant with this
standard MUST support lines of at least that length, both in headers standard MUST support lines of at least that length, both in headers
and in bodies, and all such software SHOULD support lines of and in bodies, and all such software SHOULD support lines of
arbitrary length. In particular, relaying agents MUST transmit lines arbitrary length. In particular, relaying agents MUST transmit lines
of arbitrary length without truncation or any other modification. of arbitrary length without truncation or any other modification.
NOTE: The limit of 998 octets is consistent with the NOTE: The limit of 998 octets is consistent with the
corresponding limit in [RFC 2822]. corresponding limit in [RFC 2822].
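A posting agent wishing to warn about lines beyond the 998 octet figure might use something like the following sketch. The names `MAX_LINE_OCTETS` and `overlong_lines` are hypothetical, and the check is advisory only: relaying agents are required to transmit lines of any length unmodified.

```python
MAX_LINE_OCTETS = 998  # excludes the terminating CRLF

def overlong_lines(article: bytes):
    """Return the 1-based numbers of lines longer than 998 octets."""
    return [
        number
        for number, line in enumerate(article.split(b"\r\n"), start=1)
        if len(line) > MAX_LINE_OCTETS
    ]
```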
In plain-text messages (those with no MIME headers, or those with a In plain-text messages (those with no MIME headers, or those with a
MIME Content-Type of text/plain) posting agents Ought to endeavour to MIME Content-Type of text/plain) posting agents Ought to endeavour to
keep the length of body lines within some reasonable limit. The size keep the length of body lines within some reasonable limit. The size
of this limit is a matter of policy, the default being to keep within of this limit is a matter of policy, the default being to keep within
79 characters at most, and preferably within 72 characters (to allow 79 characters at most, and preferably within 72 characters (to allow
room for quoting in followups). Exceptionally, posting agents Ought room for quoting in followups). Exceptionally, posting agents Ought
Not to adjust the length of quoted lines in followups unless they are Not to adjust the length of quoted lines in followups unless they are
able to reformat them in a consistent manner. Moreover, posting able to reformat them in a consistent manner. Moreover, posting
agents MUST permit the poster to include longer lines if he so agents MUST permit the poster to include longer lines if he so
insists. insists.
NOTE: Plain-text messages are intended to be displayed "as-is" NOTE: Plain-text messages are intended to be displayed "as-is"
without any special action (such as automatic line splitting) on without any special action (such as automatic line splitting) on
the part of the recipient. The policy limit (e.g. 72 or 79) the part of the recipient. The policy limit (e.g. 72 or 79)
should be expressed as a number of characters (as they will be should be expressed as a number of characters (as they will be
displayed by a reading agent) rather than as the number of displayed by a reading agent) rather than as the number of
octets used to encode them. octets used to encode them.
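The distinction between characters and octets can be seen in the sketch below. The name `exceeds_policy` and the default of 72 are assumptions for illustration; counting Python code points only approximates displayed characters, since combining marks and wide characters are not accounted for.

```python
def exceeds_policy(line: str, limit: int = 72) -> bool:
    """Compare the number of characters (not octets) against a policy limit."""
    return len(line) > limit
```

A line of 72 e-acute characters stays within the limit even though its UTF-8 encoding occupies 144 octets.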
NOTE: This standard provides no upper bound on the overall size NOTE: This standard provides no upper bound on the overall size
of a single article, but neither does it forbid relaying agents of a single article, but neither does it forbid relaying agents
from dropping articles of excessive length. It is, however, from dropping articles of excessive length. It is, however,
skipping to change at page 26, line 42 skipping to change at page 29, line 28
agents would be more appropriately expressed in megabytes than agents would be more appropriately expressed in megabytes than
in kilobytes. in kilobytes.
4.6. Example 4.6. Example
Here is a sample article: Here is a sample article:
Path: server.example/unknown.site2.example@site2.example/ Path: server.example/unknown.site2.example@site2.example/
relay.site.example/site.example/injector.site.example%jsmith relay.site.example/site.example/injector.site.example%jsmith
Newsgroups: example.announce,example.chat Newsgroups: example.announce,example.chat
Message-ID: <9urrt98y53@site.example> Message-ID: <9urrt98y53@site1.example>
From: Ann Example <a.example@site1.example> From: Ann Example <a.example@site1.example>
Subject: Announcing a new sample article. Subject: Announcing a new sample article.
Date: Fri, 27 Mar 1998 12:12:50 +1300 Date: Wed, 27 Mar 2002 12:12:50 +0300
Approved: example.announce moderator <jsmith@site.example> Approved: example.announce moderator <jsmith@site.example>
Followup-To: example.chat Followup-To: example.chat
Reply-To: Ann Example <a.example+replies@site1.example> Reply-To: Ann Example <a.example+replies@site1.example>
Expires: Wed, 22 Apr 1998 12:12:50 -0700 Expires: Mon, 22 Apr 2002 12:12:50 +0300
Organization: Site1, The Number one site for examples. Organization: Site1, The Number one site for examples.
User-Agent: ExampleNews/3.14 (Unix) User-Agent: ExampleNews/3.14 (Unix)
Keywords: example, announcement, standards, RFC 1036, Usefor Keywords: example, announcement, standards, RFC 1036, Usefor
Summary: The URL for the next standard. Summary: The URL for the next standard.
Injector-Info: injector.site.example; posting-host=du003.site.example
Complaints-To: abuse@site.example
Just a quick announcemnt that a new standard example article has Just a quick announcement that a new standard example article has
been released; it is in the new USEFOR draft obtainable from been released; it is in the new USEFOR standard obtainable from
ftp.ietf.org. ftp.ietf.org.
Ann. Ann.
-- --
Ann Example <a.example@site1.example> Sample Poster to the Stars Ann Example <a.example@site1.example> Sample Poster to the Stars
"The opinions in this article are bloody good ones" - J. Clarke. "The opinions in this article are bloody good ones" - J. Clarke.
[The RFC Editor is invited to change the above Date and Expires headers
to match the actual publication dates and to insert its correct URL.]
5. Mandatory Headers 5. Mandatory Headers
An article MUST have one, and only one, of each of the following An article MUST have one, and only one, of each of the following
headers: Date, From, Message-ID, Subject, Newsgroups, Path. headers: Date, From, Message-ID, Subject, Newsgroups, Path.
Note also that there are situations, discussed in the relevant parts Note also that there are situations, discussed in the relevant parts
of section 6, where References, Sender, or Approved headers are of section 6, where References-, Sender-, or Approved-headers are
mandatory. In control messages, specific values are required for mandatory. In control messages, specific values are required for
certain headers. certain headers.
For the overall syntax of headers, see section 4.1. In the
discussions of the individual headers, the content of each is
specified using the syntax notation. The convention used is that the
content of, for example, the Subject header is defined as <Subject-
content>.
A proto-article (see 8.2.1) may lack some of these mandatory headers, A proto-article (see 8.2.1) may lack some of these mandatory headers,
but they MUST then be supplied by the injecting agent. but they MUST then be supplied by the injecting agent.
5.1. Date 5.1. Date
The Date header contains the date and time that the article was The Date-header contains the date and time that the article was
prepared by the poster ready for transmission and SHOULD express the prepared by the poster ready for transmission and SHOULD express the
poster's local time. The content syntax makes use of syntax defined poster's local time. The content syntax makes use of syntax defined
in [RFC 2822], subject to the following revised definition of zone. in [RFC 2822], subject to the following revised definition of zone.
header =/ Date-header
Date-header = "Date" ":" SP Date-content
*( ";" other-parameter )
Date-content = date-time Date-content = date-time
zone = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT" zone = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
The forms "UT" and "GMT" (indicating universal time) are to be The forms "UT" and "GMT" (indicating universal time) are to be
regarded as obsolete synonyms for "+0000". They MUST be accepted, regarded as obsolete synonyms for "+0000". They MUST be accepted,
and passed on unchanged, by all agents, but they MUST NOT be and passed on unchanged, by all agents, but they MUST NOT be
generated as part of new articles by posting and injecting agents. generated as part of new articles by posting and injecting agents.
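As a hedged illustration of the zone rule above, the following Python sketch recognises the three permitted forms and shows how a posting or injecting agent might avoid generating the obsolete ones. The names ZONE_RE, is_valid_zone and zone_for_new_article are hypothetical, and the sketch checks syntax only (it does not verify that the minutes part is below 60).

```python
import re

# zone = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
ZONE_RE = re.compile(r"(?:[+-][0-9]{4}|UT|GMT)\Z")


def is_valid_zone(zone: str) -> bool:
    """True if `zone` matches the revised zone syntax."""
    return ZONE_RE.match(zone) is not None


def zone_for_new_article(zone: str) -> str:
    """Return the zone to write into a newly generated Date header.

    "UT" and "GMT" are accepted on input but replaced by "+0000",
    since they are not to be generated in new articles.
    """
    if not is_valid_zone(zone):
        raise ValueError(f"not a valid zone: {zone!r}")
    return "+0000" if zone in ("UT", "GMT") else zone
```

Note that this substitution applies only when a new article is generated; an agent handling an existing article passes "UT" or "GMT" on unchanged.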
The date-time MUST be semantically valid as required by [RFC 2822].
Although folding white space is permitted throughout the date-time Although folding white space is permitted throughout the date-time
syntax, it is RECOMMENDED that a single space be used in each place syntax, it is RECOMMENDED that a single space be used in each place
that FWS appears (whether it is required or optional). that FWS appears (whether it is required or optional).
NOTE: A convention that is sometimes followed is to add a NOTE: A convention that is sometimes followed is to add a
comment, after the date-time, containing the time zone in comment, after the date-time, containing the time zone in
human-readable form, but many of the abbreviations commonly used human-readable form, but many of the abbreviations commonly used
for this purpose are ambiguous. The value given by the <zone> is for this purpose are ambiguous. The value given by the <zone> is
the only definitive form. the only definitive form.
In order to prevent the reinjection of expired articles into the news In order to prevent the reinjection of expired articles into the news
stream, relaying and serving agents MUST refuse articles whose Date stream, relaying and serving agents MUST refuse "stale" articles
header predates the earliest articles of which they normally keep whose Date-header predates the earliest articles of which they
record, or which is more than 24 hours into the future (though they normally keep record, or which is more than 24 hours into the future
MAY use a margin less than that 24 hours). Relaying agents MUST NOT (though they MAY use a margin less than that 24 hours). Relaying
agents MUST NOT modify the Date-header in transit.
modify the Date header in transit.
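The acceptance test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Date header is taken to have been parsed already into a timezone-aware datetime, and the names is_acceptable_date, oldest_kept and future_margin are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_acceptable_date(
    article_date: datetime,
    now: datetime,
    oldest_kept: datetime,
    future_margin: timedelta = timedelta(hours=24),
) -> bool:
    """Decide whether a relaying or serving agent may accept an article.

    All datetimes must be timezone-aware. `oldest_kept` is the date of
    the earliest articles of which the agent normally keeps record;
    `future_margin` may be set to less than 24 hours, but not more.
    """
    if article_date < oldest_kept:
        return False  # stale: could be a reinjected expired article
    if article_date > now + future_margin:
        return False  # dated too far into the future
    return True


now = datetime(2002, 5, 1, 12, 0, tzinfo=timezone.utc)
oldest_kept = now - timedelta(days=14)  # assumed retention period
posted = datetime(2002, 4, 30, 9, 0, tzinfo=timezone(timedelta(hours=3)))
assert is_acceptable_date(posted, now, oldest_kept)
```

Because the comparison is made on aware datetimes, the poster's local zone does not affect the outcome.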
5.1.1. Examples 5.1.1. Examples
Date: Fri, 2 Apr 1999 20:20:51 -0500 (EST) Date: Sat, 26 May 2001 11:13:00 -0500 (EST)
Date: 26 May 1999 16:13 +0000 Date: 26 May 2001 16:13 +0000
Date: 26 May 1999 16:13 GMT (Obsolete) Date: 26 May 2001 16:13 GMT (Obsolete)
5.2. From 5.2. From
The From header contains the electronic address(es), and possibly the The From-header contains the electronic address(es), and possibly the
full name, of the article's poster(s). The content syntax makes use full name, of the article's poster(s). The content syntax makes use
of syntax defined in [RFC 2822], subject to the following revised of syntax defined in [RFC 2822], subject to the following revised
definition of local-part. definition of local-part.
header =/ From-header
From-header = "From" ":" SP From-content
From-content = mailbox-list From-content = mailbox-list
addr-spec = local-part "@" domain addr-spec = local-part "@" domain
local-part = dot-atom / strict-quoted-string local-part = dot-atom / strict-quoted-string
NOTE: This syntax ensures that the local-part of an addr-spec is NOTE: This syntax ensures that the local-part of an addr-spec is
restricted to pure US-ASCII (and is thus in strict compliance restricted to pure US-ASCII (and is thus in strict compliance
with [RFC 2822]), whilst allowing any UTF-8 character to be used with [RFC 2822]), whilst allowing any UTF-8 character to be used
in a preceding quoted-string containing the poster's full name. in a preceding quoted-string containing the poster's full name.
If some future extension to the Mail protocols should relax this If some future extension to the Email protocols should relax
restriction, one would expect the Netnews protocols to follow. this restriction, one would expect the Netnews protocols to
follow.
The mailbox in the From-content SHOULD be a valid address, belonging Each mailbox in the From-content SHOULD be a valid address, belonging
to the poster(s) of the article, or person or agent on whose behalf to the poster(s) of the article, or person or agent on whose behalf
the post is being sent (see the Sender header, 6.2). When, for the post is being sent (see the Sender-header, 6.2). When, for
whatever reason, the poster does not wish to include such an whatever reason, the poster does not wish to include such an address,
adddress, the From-content SHOULD then be an address which ends in the From-content SHOULD then be an address which ends in the top
the top level domain of ".invalid" [RFC 2606]. level domain of ".invalid" [RFC 2606].
NOTE: Since such addresses ending in ".invalid" are NOTE: Since such addresses ending in ".invalid" are
undeliverable, user agents Ought to warn any user attempting to undeliverable, user agents Ought to warn any user attempting to
reply to them and Ought Not, in any case, to attempt to deliver reply to them and Ought Not, in any case, to attempt to deliver
to them (since that would be pointless anyway). Whether or not to them (since that would be pointless anyway). Whether or not
a valid address can subsequently be extracted from such an a valid address can subsequently be extracted from such an
address falls outside the scope of this standard (though it address falls outside the scope of this standard (though it
would be pointless to use a disguise so easily penetrable). would be pointless to use a disguise so easily penetrable).
Be warned also that some injecting agents that have Be warned, however, that some injecting agents which are unable
authentication information may choose to replace the From- to detect that the address belongs to the poster may choose to
content based upon the authenticated identity. insert a Sender-header (6.2) or some entry in an Injector-Info-
header (6.19) which discloses some valid address for the poster.
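A user agent wishing to warn before replying to an undeliverable address might detect the ".invalid" top level domain along the following lines. This is a sketch only: the function name undeliverable_mailboxes is hypothetical, and Python's email.utils.getaddresses implements the Email syntax rather than the exact From-content syntax given above.

```python
from email.utils import getaddresses


def undeliverable_mailboxes(from_content: str) -> list[str]:
    """Return the addresses in a From-content whose domain ends in ".invalid"."""
    found = []
    for _name, address in getaddresses([from_content]):
        domain = address.rpartition("@")[2]
        # Domain names are case-insensitive; compare the last label only.
        if domain.lower().split(".")[-1] == "invalid":
            found.append(address)
    return found
```

An empty result means that none of the listed mailboxes is marked as undeliverable; it does not show that the addresses are actually valid.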
5.2.1. Examples: 5.2.1. Examples:
From: John Smith <jsmith@site.example> From: John Smith <jsmith@site.example>
From: "John Smith" <jsmith@site.example>, dave@isp.example From: "John Smith" <jsmith@site.example>, dave@isp.example
From: "John D. Smith" <jsmith@site.example>, andrew@isp.example, From: "John D. Smith" <jsmith@site.example>, andrew@isp.example,
fred@site2.example fred@site2.example
From: Jan Jones <jan@please_setup_your_system_correctly.invalid> From: Jan Jones <jan@please_setup_your_system_correctly.invalid>
From: Jan Jones <joe@guess-where.invalid> From: Jan Jones <joe@guess-where.invalid>
From: dave@isp.example (Dave Smith) From: dave@isp.example (Dave Smith)
NOTE: the last example shows a now deprecated convention of NOTE: the last example shows a now deprecated convention of
putting a poster's full name in a comment following the mailbox, putting a poster's full name in a comment following the mailbox,
rather than in a phrase at the start of that mailbox. Observe rather than in a phrase at the start of it. Observe also the use
that the quotes around the "John D. Smith" example were of the quoted-string "John D. Smith" which is required on
required, on account of the '.' character, and they would also account of the presence of the '.' character, and which would also
have been required had any UTF8-xtra-char been present. have been required had any UTF8-xtra-char been present.
5.3. Message-ID 5.3. Message-ID
The Message-ID header contains the article's message identifier, a The Message-ID-header contains the article's message identifier, a
unique identifier distinguishing the article from every other unique identifier distinguishing the article from every other
article. The content syntax makes use of syntax defined in [RFC article. The content syntax makes use of syntax defined in [RFC
2822], subject to the following revised definition of no-fold-quote 2822], subject to the following revised definition of no-fold-quote
and no-fold-literal. and no-fold-literal.
header =/ Message-ID-header
Message-ID-header = "Message-ID" ":" SP Message-ID-content
*( ";" other-parameter )
Message-ID-content = msg-id Message-ID-content = msg-id
id-left = dot-atom-text / no-fold-quote id-left = dot-atom-text / no-fold-quote
id-right = dot-atom-text / no-fold-literal id-right = dot-atom-text / no-fold-literal
no-fold-quote = DQUOTE *( strict-qtext / strict-quoted-pair ) no-fold-quote = DQUOTE
*( strict-qtext / "\\" / "\" DQUOTE )
qspecial
*( strict-qtext / "\\" / "\" DQUOTE )
DQUOTE DQUOTE
no-fold-literal = DQUOTE *( dtext / strict-quoted-pair ) DQUOTE qspecial = "(" / ")" / ; same as specials except
"<" / ">" / ; "\" and DQUOTE quoted
"[" / "]" /
":" / ";" /
"@" / "\\" /
"," / "." /
"\" DQUOTE
no-fold-literal = "[" *( dtext / "\[" / "\]" / "\\" ) "]"
A msg-id MUST NOT contain any SP within any strict-quoted-pair. The The msg-id MUST NOT be more than 250 octets in length.
msg-id MUST NOT be more than 250 octets in length.
NOTE: The syntax ensures that a msg-id is restricted to pure NOTE: The restriction to strict-qtext ensures that no UTF8-
US-ASCII, and is thus a strict subset of that defined by [RFC xtra-char can appear. Msg-ids as defined here are a "normalized"
2822]. The exclusion of SP is to ensure compatibility with subset of those defined by [RFC 2822], ensuring that no string
existing software. The length restriction ensures that systems of characters is quoted unless strictly necessary (it must
which accept message identifiers as a parameter when retrieving contain at least one qspecial) and no single character is
an article (e.g. [NNTP]) can rely on a bounded length. Observe prefixed by a "\" in the form of a quoted-pair unless strictly
that msg-id includes the '<' and '>'. necessary, and moreover there is no possibility for WSP to
occur, whether quoted or not. The length restriction ensures
that systems which accept message identifiers as a parameter
when retrieving an article (e.g. [NNTP]) can rely on a bounded
length. Observe that msg-id includes the '<' and '>'.
Following the provisions of [RFC 2822], an agent generating an An agent generating an article's message identifier MUST ensure that
article's message identifier MUST ensure that it is unique and that it is unique (as also required in [RFC 2822]) and that it is NEVER
it is NEVER reused (either in Netnews or email). Moreover, even reused (either in Netnews or Email). Moreover, even though commonly
though commonly derived from the domain name of the originating site derived from the domain name of the originating site (and domain
(and domain names are case-insensitive), a message identifier MUST names are case-insensitive), a message identifier MUST NOT be altered
NOT be altered in any way during transport, or when copied (as into a in any way during transport, or when copied (as into a References-
References header), and thus a simple (case-sensitive) comparison of header), and thus a simple (case-sensitive) comparison of octets will
octets will always suffice to recognise that same message identifier always suffice to recognize that same message identifier wherever it
wherever it subsequently reappears. subsequently reappears.
NOTE: some old software may treat message identifiers that
NOTE: These requirements are to be contrasted with those of the
un-normalized msg-ids defined by [RFC 2822], which may perfectly
legitimately become normalized (or vice versa) during transport
or copying in email systems.
NOTE: Some old software may treat message identifiers that
differ only in case within their id-right part as equivalent, differ only in case within their id-right part as equivalent,
and implementors of agents that generate message identifiers and implementors of agents that generate message identifiers
should be aware of this. should be aware of this.
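The length limit and the octet-wise comparison described in this section can be illustrated by the following sketch. It is deliberately coarse and is not a parser for the msg-id syntax: plausible_msg_id and same_msg_id are hypothetical names, and quoted forms of id-left (which may themselves contain "@") are not analysed.

```python
MAX_MSG_ID_OCTETS = 250  # upper bound stated in this section


def plausible_msg_id(msg_id: str) -> bool:
    """Coarse sanity check of a msg-id, including the '<' and '>'."""
    try:
        octets = msg_id.encode("ascii")  # no UTF8-xtra-char may appear
    except UnicodeEncodeError:
        return False
    if len(octets) > MAX_MSG_ID_OCTETS:
        return False
    if not (msg_id.startswith("<") and msg_id.endswith(">")):
        return False
    if any(ch in " \t\r\n" for ch in msg_id):
        return False  # WSP cannot occur, whether quoted or not
    left, at, right = msg_id[1:-1].partition("@")
    return bool(left and at and right)


def same_msg_id(a: bytes, b: bytes) -> bool:
    """Message identifiers are compared octet by octet, case-sensitively."""
    return a == b
```

Since message identifiers are never altered in transit, same_msg_id needs no case folding or normalization step.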
5.4. Subject 5.4. Subject
The Subject header contains a short string identifying the topic of The Subject-header contains a short string identifying the topic of
the message. This is an inheritable header (4.2.2.2) to be copied the message. This is an inheritable header (4.2.5.2) to be copied
into the Subject header of any followup, in which case the new into the Subject-header of any followup, in which case the new
header-content SHOULD then default to the string "Re: " (a "back Subject-content SHOULD then default to the string "Re: " (a "back
reference") followed by the contents of the pure-subject of the reference") followed by the contents of the pure-subject of the
precursor. Any leading "Re: " in the pure-subject MUST be stripped. precursor. Any leading "Re: " in the pure-subject MUST be stripped.
Subject-content = [ back-reference ] pure-subject header =/ Subject-header
pure-subject = 1*( [FWS] utext ) Subject-header = "Subject" ":" SP Subject-content
Subject-content = [ [FWS] back-reference ] pure-subject
pure-subject = unstructured
back-reference = %x52.65.3A.20 back-reference = %x52.65.3A.20
; which is a case-sensitive "Re: " ; which is a case-sensitive "Re: "
The pure-subject MUST NOT begin with "Re: ". The pure-subject MUST NOT begin with "Re: ".
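The default Subject-content of a followup might be derived as in the sketch below. The name followup_subject is hypothetical, and the choice to strip every leading "Re: " (rather than only the first) is an assumption made here so that the resulting pure-subject cannot itself begin with "Re: ".

```python
BACK_REFERENCE = "Re: "  # case-sensitive, with exactly one trailing space


def followup_subject(precursor_subject: str) -> str:
    """Build the default Subject-content for a followup."""
    pure_subject = precursor_subject
    # Strip any leading back-reference so that it is never duplicated.
    while pure_subject.startswith(BACK_REFERENCE):
        pure_subject = pure_subject[len(BACK_REFERENCE):]
    return BACK_REFERENCE + pure_subject
```

Variants such as "RE: " or "Re:" are not matched by this sketch; their removal is optional and is left to the followup agent.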
NOTE: The given syntax differs from that prescribed in [RFC NOTE: The given syntax differs from that prescribed in [RFC
2822] insofar as it does not permit a header content to be 2822] insofar as it does not permit a header content to be
completely empty, or to consist of WSP only (see remarks in completely empty, or to consist of WSP only (see remarks in
4.2.5 concerning undesirable headers). 4.2.6 concerning undesirable headers).
Followup agents MAY remove strings that are known to be used Followup agents MAY remove strings that are known to be used
erroneously as back-reference (such as "Re(2): ", "Re:", "RE: ", or erroneously as back-reference (such as "Re(2): ", "Re:", "RE: ", or
"Sv: ") from the Subject-content when composing the subject of a "Sv: ") from the Subject-content when composing the subject of a
followup and add a correct back-reference in front of the result. followup and add a correct back-reference in front of the result.
NOTE: that would be "SHOULD remove instances" except that we NOTE: that would be "SHOULD remove instances" except that we
cannot find a sufficiently robust and simple algorithm to do the cannot find a sufficiently robust and simple algorithm to do the
necessary natural language processing. necessary natural language processing.
skipping to change at page 30, line 41 skipping to change at page 33, line 60
reference. Specifically, a translation of "Re: " into a local reference. Specifically, a translation of "Re: " into a local
language or usage MUST NOT be used. language or usage MUST NOT be used.
NOTE: "Re" is an abbreviation for the Latin "In re", meaning "in NOTE: "Re" is an abbreviation for the Latin "In re", meaning "in
the matter of", and not an abbreviation of "Reference" as is the matter of", and not an abbreviation of "Reference" as is
sometimes erroneously supposed. sometimes erroneously supposed.
Agents SHOULD NOT depend on nor enforce the use of back references by Agents SHOULD NOT depend on nor enforce the use of back references by
followup agents. For compatibility with legacy news software the followup agents. For compatibility with legacy news software the
Subject-content of a control message (i.e. an article that also Subject-content of a control message (i.e. an article that also
contains a Control header) MAY start with the string "cmsg ", and contains a Control-header) MAY start with the string "cmsg ", and
non-control messages MUST NOT start with the string "cmsg ". See also non-control messages MUST NOT start with the string "cmsg ". See also
section 6.13. section 6.13.
5.4.1. Examples 5.4.1. Examples
In the following examples, please note that only "Re: " is mandated In the following examples, please note that only "Re: " is mandated
by this standard. "was: " is a convention used by many English- by this standard. "was: " is a convention used by many English-
speaking posters to signal a change in subject matter. Software speaking posters to signal a change in subject matter. Software
should be able to deduce this information from References. should be able to deduce this information from the References-header.
Subject: Film at 11 Subject: Film at 11
Subject: Re: Film at 11 Subject: Re: Film at 11
Subject: Godwin's law considered harmful (was: Film at 11) Subject: Godwin's law considered harmful (was: Film at 11)
Subject: Godwin's law (was: Film at 11) Subject: Godwin's law (was: Film at 11)
Subject: Re: Godwin's law (was: Film at 11) Subject: Re: Godwin's law (was: Film at 11)
5.5. Newsgroups 5.5. Newsgroups
The Newsgroups header's content specifies the newsgroup(s) in which The Newsgroups-header's content specifies the newsgroup(s) in which
the article is intended to appear. It is an inheritable header the article is intended to appear. It is an inheritable header
(4.2.2.2) which then becomes the default Newsgroups header of any (4.2.5.2) which then becomes the default Newsgroups-header of any
followup, unless a Followup-To header is present to prescribe followup, unless a Followup-To-header is present to prescribe
otherwise. otherwise. Articles MUST NOT be passed between relaying agents or to
serving agents unless the sending agent has been configured to supply
and the receiving agent to receive at least one of the newsgroup-
names in the Newsgroups-header.
References to "Unicode" or "the latest version of the Unicode References to "Unicode" or "the latest version of the Unicode
Standard" mean [UNICODE 3.1] or any standard that supersedes it. That Standard" mean [UNICODE 3.1] or any standard that supersedes it. That
document contains guarantees of strict future upwards compatibility document contains guarantees of strict future upwards compatibility
(e.g. no character will be removed or change classification). (e.g. no character will be removed or change classification).
Implementors should be aware that currently unassigned code points Implementors should be aware that currently unassigned code points
(Unicode category Cn) may become valid characters in future versions (Unicode category Cn) may become valid characters in future versions
of Unicode. Since the poster of an article might have access to a of Unicode. Since the poster of an article might have access to a
newer version of that standard, relaying and serving agents MUST newer version of that standard, relaying and serving agents MUST
accept such characters, but posting agents (and indeed all agents) accept such characters, but posting agents (and indeed all agents)
MUST NOT generate them (though they might well follow up to MUST NOT generate them (though they might well follow up to
newsgroup-names containing them). newsgroup-names containing them).
Newsgroups-content = newsgroup-name header =/ Newsgroups-header
*( *FWS ng-delim *FWS newsgroup-name ) Newsgroups-header = "Newsgroups" ":" SP Newsgroups-content
*FWS *( ";" other-parameter )
Newsgroups-content = [FWS] newsgroup-name
*( [FWS] ng-delim [FWS] newsgroup-name )
[FWS]
newsgroup-name = component *( "." component ) newsgroup-name = component *( "." component )
component = 1*component-glyph component = 1*component-glyph
ng-delim = "," ng-delim = ","
component-glyph = combiner-base *combiner-mark component-glyph = combiner-base *combiner-mark
combiner-base = combiner-ASCII / combiner-extended combiner-base = combiner-ASCII / combiner-extended
combiner-ASCII = "0"-"9" / %x41-5A / %x61-7A / "+" / "-" / "_" combiner-ASCII = DIGIT / ALPHA / "+" / "-" / "_"
combiner-extended = <any character with a Unicode code value of combiner-extended = <any character with a Unicode code value of
0080 or greater and a combining class of 0, 0080 or greater and a combining class of 0,
but excluding any character in Unicode but excluding any character in Unicode
categories Cc, Cf, Cs, Zs, Zl, and Zp> categories Cc, Cf, Cs, Zs, Zl, and Zp>
combiner-mark = <any character with a Unicode code value of combiner-mark = <any character with a Unicode code value of
0080 or greater and a combining class other 0080 or greater and a combining class other
than 0> than 0>
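The outer structure of the Newsgroups-content (names separated by ng-delim, with optional folding white space around them) can be split as in the following sketch. It does not check the Unicode categories required of each component-glyph; parse_newsgroups is a hypothetical name, and the content is assumed to have been decoded to a Python str already.

```python
def parse_newsgroups(content: str) -> list[str]:
    """Split a Newsgroups-content into its newsgroup-names.

    Raises ValueError for an empty name, an empty component, or white
    space inside a name. Character classes are not checked.
    """
    names = [name.strip(" \t\r\n") for name in content.split(",")]
    for name in names:
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"malformed newsgroup-name: {name!r}")
        if any(not component for component in name.split(".")):
            raise ValueError(f"empty component in: {name!r}")
    return names


assert parse_newsgroups("example.announce,example.chat") == [
    "example.announce",
    "example.chat",
]
```

The resulting list is also what a relaying agent would compare against its configuration when deciding whether at least one of the named newsgroups is wanted.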
NOTE: the excluded characters are control characters (Cc), NOTE: the excluded characters are control characters (Cc),
format control characters (Cf), surrogates (Cs), and separators format control characters (Cf), surrogates (Cs), and separators
skipping to change at page 32, line 5 skipping to change at page 35, line 29
his screen, though it might be transmitted as several actual his screen, though it might be transmitted as several actual
characters (e.g. q-circumflex is two characters). Note also characters (e.g. q-circumflex is two characters). Note also
that, in some writing schemes, several component-glyphs will that, in some writing schemes, several component-glyphs will
merge into one visible object of variable size. merge into one visible object of variable size.
Each component MUST be invariant under Unicode normalization NFKC Each component MUST be invariant under Unicode normalization NFKC
(cf. the weaker normalization requirement for other headers in (cf. the weaker normalization requirement for other headers in
section 4.4.1 which specified no more than normalization NFC, and see section 4.4.1 which specified no more than normalization NFC, and see
also the explanatory NOTE in that section). also the explanatory NOTE in that section).
NOTE: As a result of this restriction, a name has only one NOTE: As a result of this restriction, a name has only one
valid form. Implementations can assume that a straight valid form. Implementations can assume that a straight
comparison of characters or octets is sufficient to compare two comparison of characters or octets is sufficient to compare two
newsgroup-names. newsgroup-names.
The requirement that names be invariant under NFKC, rather than The requirement that names be invariant under NFKC, rather than
NFC, means that all characters with a "compatibility NFC, means that all characters with a "compatibility
decomposition" are forbidden (Unicode provides the property decomposition" are forbidden (Unicode provides the property
"NFKC_NO" to make this test easier). The effect is to exclude "NFKC_NO" to make this test easier). The effect is to exclude
variant forms of characters, such as superscripts and variant forms of characters, such as superscripts and
subscripts, wide and narrow forms, font variants, encircled subscripts, wide and narrow forms, font variants, encircled
forms, ligatures, and so on, as their use could cause confusion. forms, ligatures, and so on, as their use could cause confusion.
There is insufficient experience in this area to determine There is insufficient experience in this area to determine
whether this is the right long-term solution. Implementers whether this is the right long-term solution. Implementors
should therefore be aware that a future version of this standard should therefore be aware that a future version of this standard
might reduce the requirement in the direction of NFC as opposed might reduce the requirement in the direction of NFC as opposed
to NFKC. to NFKC.
NOTE: An implementation is not required to apply NFKC, or any NOTE: An implementation is not required to apply NFKC, or any
other normalization, to newsgroup names. Only agencies that other normalization, to newsgroup names. Only agencies that
create new groups need to be careful to obey this restriction create new groups need to be careful to obey this restriction
(7.2.1). However, if a posting agent neglects to normalize a (7.2.1). However, if a posting agent neglects to normalize a
newsgroup-name entered manually, this may lead to the user newsgroup-name entered manually, this may lead to the user
posting to a non-existent group without understanding why. posting to a non-existent group without understanding why.
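Invariance under NFKC can be tested by normalizing a component and comparing the result with the original, as in this sketch. It uses Python's standard unicodedata module, whose tables correspond to whatever Unicode version the interpreter ships with (not necessarily the version referenced by this standard); the name is_nfkc_invariant is hypothetical.

```python
import unicodedata


def is_nfkc_invariant(component: str) -> bool:
    """True if applying NFKC leaves `component` unchanged."""
    return unicodedata.normalize("NFKC", component) == component


# A plain ASCII component is unaffected by normalization.
assert is_nfkc_invariant("example")
# SUPERSCRIPT TWO has a compatibility decomposition to "2".
assert not is_nfkc_invariant("x\u00b2")
# LATIN SMALL LIGATURE FI decomposes to "fi".
assert not is_nfkc_invariant("\ufb01le")
```

A posting agent could apply the same normalize call to a manually entered newsgroup-name before use, which addresses the situation described in the NOTE above.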
Newsgroup-names containing non-ASCII characters MUST be encoded in Newsgroup-names containing non-ASCII characters MUST be encoded in
UTF-8 and not according to [RFC 2047]. UTF-8 and not according to [RFC 2047].
Components beginning with underline ("_") are reserved for use by Components beginning with underline ("_") are reserved for use by
future versions of this standard and MUST NOT occur in newsgroup future versions of this standard and MUST NOT occur in newsgroup
names (whether in Newsgroup headers or in newgroup control messages names (whether in Newsgroups-headers or in newgroup control messages
(7.2.1)). However, such names MUST be accepted. (7.2.1)). However, such names MUST be accepted.
Components beginning with "+" or "-" are reserved for use by Components beginning with "+" or "-" are reserved for use by
implementations and MUST NOT occur in newsgroup names (whether in implementations and MUST NOT occur in newsgroup names (whether in
Newsgroup headers or in newgroup control messages). Implementors may Newsgroups-headers or in newgroup control messages). Implementors may
assume that this rule will not change in any future version of this assume that this rule will not change in any future version of this
standard. standard.
NOTE: For example, implementors may safely use leading "+" and NOTE: For example, implementors may safely use leading "+" and
"-" to "escape" other entities within something that looks like "-" to "escape" other entities within something that looks like
a newsgroup-name. a newsgroup-name.
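An agency creating newsgroups might screen a proposed name for reserved components with a check such as the one sketched here. The name reserved_components is hypothetical. The check is meant for use when names are created; agents receiving articles still have to accept names whose components begin with "_".

```python
RESERVED_INITIALS = ("_", "+", "-")


def reserved_components(newsgroup_name: str) -> list[str]:
    """Return the components of a name that begin with "_", "+" or "-"."""
    return [
        component
        for component in newsgroup_name.split(".")
        if component[:1] in RESERVED_INITIALS
    ]


assert reserved_components("example.chat") == []
assert reserved_components("example._future.+local") == ["_future", "+local"]
```

Only the first character of each component is examined, so a name such as "example.c++" is not reported.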
Agencies responsible for the administration of particular hierarchies Agencies responsible for the administration of particular hierarchies
Ought to place additional restrictions on the characters they allow Ought to place additional restrictions on the characters they allow
in newsgroup-names within those hierarchies (such as to accord with in newsgroup-names within those hierarchies (such as to accord with
the languages commonly used within those hierarchies, or to avoid the languages commonly used within those hierarchies, or to avoid
perceived ambiguities pertinent to those languages). Where there is perceived ambiguities pertinent to those languages). Where there is
no such specific policy, the following restrictions SHOULD be applied no such specific policy, the following restrictions SHOULD be applied
to newsgroup names. to newsgroup names.
NOTE: These restrictions are intended to reflect existing NOTE: These restrictions are intended to reflect existing
practice, with some additions to accomodate foreseeable practice, with some additions to accommodate foreseeable
enhancements, and are intended both to avoid certain technical enhancements, and are intended both to avoid certain technical
difficulties and to avoid unnecessary confusion. It may well be difficulties and to avoid unnecessary confusion. It may well be
that experience will allow future extensions to this standard to that experience will allow future extensions to this standard to
relax some or all of these restrictions. relax some or all of these restrictions.
The specific restrictions (to be applied in the absence of The specific restrictions (to be applied in the absence of
established policies to the contrary) are: established policies to the contrary) are:
1. The following characters are forbidden, subject to the comments 1. The following characters are forbidden, subject to the comments
and notes at the end of the list: and notes at the end of the list:
skipping to change at page 33, line 35 skipping to change at page 37, line 4
characters in category Pd (Punctuation, Dash) [4][5] characters in category Pd (Punctuation, Dash) [4][5]
characters in category Pe (Punctuation, Close) [4] characters in category Pe (Punctuation, Close) [4]
characters in category Pf (Punctuation, Final quote) [4] characters in category Pf (Punctuation, Final quote) [4]
characters in category Pi (Punctuation, Initial quote) [4] characters in category Pi (Punctuation, Initial quote) [4]
characters in category Po (Punctuation, Other) [4] characters in category Po (Punctuation, Other) [4]
characters in category Ps (Punctuation, Open) [4] characters in category Ps (Punctuation, Open) [4]
characters in category Sc (Symbol, Currency) [4] characters in category Sc (Symbol, Currency) [4]
characters in category Sk (Symbol, Modifier) [4] characters in category Sk (Symbol, Modifier) [4]
characters in category Sm (Symbol, Math) [4][5] characters in category Sm (Symbol, Math) [4][5]
characters in category So (Symbol, Other) [4] characters in category So (Symbol, Other) [4]
[1] As new characters are added to Unicode, the code point moves [1] As new characters are added to Unicode, the code point moves
from category Cn to some other category. As stated above, from category Cn to some other category. As stated above,
implementors should be prepared for this. implementors should be prepared for this.
[2] Specific private use characters can be used within a hierarchy [2] Specific private use characters can be used within a hierarchy
or co-operating subnet that has agreed meanings for them. or co-operating subnet that has agreed meanings for them.
[3] Traditionally, newsgroup-names have been written in lowercase. [3] Traditionally, newsgroup-names have been written in lowercase.
Posting agents Ought Not to convert uppercase or titlecase Posting agents Ought Not to convert uppercase or titlecase
skipping to change at page 34, line 4 skipping to change at page 37, line 31
[5] Although the characters "+" and "-" are within categories Pd [5] Although the characters "+" and "-" are within categories Pd
and Sm, they are not forbidden. and Sm, they are not forbidden.
2. A component name is forbidden to consist entirely of digits. 2. A component name is forbidden to consist entirely of digits.
NOTE: This requirement was in [RFC 1036] but nevertheless NOTE: This requirement was in [RFC 1036] but nevertheless
several such groups have appeared in practice and implementors several such groups have appeared in practice and implementors
should be prepared for them. A common implementation technique should be prepared for them. A common implementation technique
uses each component as the name of a directory and uses numeric uses each component as the name of a directory and uses numeric
filenames for each article within a group. Such an filenames for each article within a group. Such an
implementation needs to be careful when this could cause a clash implementation needs to be careful when this could cause a clash
(e.g. between article 123 of group xxx.yyy and the directory for (e.g. between article 123 of group xxx.yyy and the directory for
group xxx.yyy.123). group xxx.yyy.123).
[Open issue a number of people think this should not be a default
requirement but simply be a NOTE; wording for such is further down.]
3. A component is limited to 30 component-glyphs and a newsgroup-name 3. A component is limited to 30 component-glyphs and a newsgroup-name
to 71 component-glyphs. Whilst there is no longer any technical to 71 component-glyphs. Whilst there is no longer any technical
reason to limit the length of a component (formerly, it was reason to limit the length of a component (formerly, it was
limited to 14 octets) nor of a newsgroup-name, it should be noted limited to 14 octets) nor of a newsgroup-name, it should be noted
that these names are also used in the newsgroups line (7.2.1.2) that these names are also used in the newsgroups line (7.2.1.2)
where an overall policy limit applies and, moreover, excessively where an overall policy limit applies and, moreover, excessively
long names can be exceedingly inconvenient in practical use. long names can be exceedingly inconvenient in practical use.
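A minimal Python sketch of default restrictions 2 and 3 follows. It is an illustration only and rests on three assumptions that the text above does not settle: a component-glyph is approximated by a single Unicode code point, "digits" means the ASCII digits 0-9, and the "." separators are not counted towards the 71 limit. The function name is hypothetical.

```python
def default_limit_problems(newsgroup_name: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    components = newsgroup_name.split(".")
    for component in components:
        # Restriction 2: a component must not consist entirely of digits.
        # Assumption: only ASCII digits are meant.
        if component.isascii() and component.isdigit():
            problems.append("all-digit component: " + component)
        # Restriction 3: at most 30 component-glyphs per component.
        # Assumption: one code point stands in for one component-glyph.
        if len(component) > 30:
            problems.append("component too long: " + component)
    # Restriction 3: at most 71 component-glyphs in the whole name.
    # Assumption: the "." separators are not counted.
    if sum(len(component) for component in components) > 71:
        problems.append("newsgroup-name too long")
    return problems
```

Because these are default restrictions that a hierarchy may relax, and because the text below requires a means to disable such checks, a caller would treat the returned list as warnings rather than as grounds for rejection.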
Serving and relaying agents MUST accept any newsgroup-name that meets Serving and relaying agents MUST accept any newsgroup-name that meets
skipping to change at page 34, line 35 skipping to change at page 38, line 5
posting agents MAY attempt to correct them (but only with the posting agents MAY attempt to correct them (but only with the
explicit agreement of the poster for anything more than NFC or NFKC explicit agreement of the poster for anything more than NFC or NFKC
normalization). However, because of the large and changing tables normalization). However, because of the large and changing tables
required to do these checks and corrections throughout the whole of required to do these checks and corrections throughout the whole of
Unicode, this standard does not require them to do so. Rather, the Unicode, this standard does not require them to do so. Rather, the
onus is placed on those who create new newsgroups (7.2.1) to check onus is placed on those who create new newsgroups (7.2.1) to check
the mandatory requirements, to consider the effects of relaxing the the mandatory requirements, to consider the effects of relaxing the
other restrictions, and to consider how all this may affect other restrictions, and to consider how all this may affect
propagation of the group. propagation of the group.
Since future extensions to this standard and the Unicode standard, Since future extensions to this standard and the Unicode standard,
including a possible relaxation of the NFKC normalization, plus any including a possible relaxation of the NFKC normalization, plus any
relaxations of the default restrictions introduced by specific relaxations of the default restrictions introduced by specific
hierarchies might invalidate some such checks, warnings, and hierarchies might invalidate some such checks, warnings, and
adjustments, implementations MUST incorporate means to disable them. adjustments, implementations MUST incorporate means to disable them.
[Alternative text for Open issue] NOTE: The newsgroup-name as encoded in UTF-8 should be regarded as
the canonical form. Reading agents may convert it to whatever
NOTE: Components composed entirely of digits were forbidden by character set they are able to display and serving agents may
[RFC 1036] but have nevertheless been used in practice, and are possibly need to convert it to some form more suitable as a
therefore permitted by this specification. A common filename. Simple algorithms for both kinds of conversion are
implementation technique uses each component as the name of a readily available. Observe that the syntax does not allow
directory and uses numeric filenames for each article within a comments within the Newsgroups-header; this is to simplify
group. Such an implementation needs to be careful when this processing by relaying and serving agents which have a requirement
could cause a clash (e.g. between article 123 of group xxx.yyy to process this header extremely rapidly.
and the directory for group xxx.yyy.123).
[Open issue: delete the above text if we retain the default requirement
above.]
NOTE: The newsgroup-name as encoded in UTF-8 should be regarded
as the canonical form. Reading agents may convert it to whatever
character set they are able to display (see 4.4.1) and serving
agents may possibly need to convert it to some form more
suitable as a filename. Simple algorithms for both kinds of
conversion are readily available. Observe that the syntax does
not allow comments within the Newsgroups header; this is to
simplify processing by relaying and serving agents which have a
requirement to process this header extremely rapidly.
The inclusion of folding white space within a Newsgroups-content is a The inclusion of folding white space within a Newsgroups-content is a
newly introduced feature in this standard. It MUST be accepted by all newly introduced feature in this standard. It MUST be accepted by all
conforming implementations (relaying agents, serving agents and conforming implementations (relaying agents, serving agents and
reading agents). Posting agents should be aware that such postings reading agents). Posting agents should be aware that such postings
may be rejected by overly-critical old-style relaying agents. When a may be rejected by overly-critical old-style relaying agents. When a
sufficient number of relaying agents are in conformance, posting sufficient number of relaying agents are in conformance, posting
agents SHOULD generate such whitespace in the form of <CRLF WS> so as agents SHOULD generate such whitespace in the form of <CRLF WSP> so
to keep the length of lines in the relevant headers (notably as to keep the length of lines in the relevant headers (notably
Newsgroups and Followup-To) to no more than 79 characters (or Newsgroups and Followup-To) to no more than 79 characters (or
other agreed policy limit - see 4.5). Before such critical mass other agreed policy limit - see 4.5). Before such critical mass
occurs, injecting agents MAY reformat such headers by removing occurs, injecting agents MAY reformat such headers by removing
whitespace inserted by the posting agent, but relaying agents MUST whitespace inserted by the posting agent, but relaying agents MUST
NOT do so. NOT do so.
Posters SHOULD use only the names of existing newsgroups in the Posters SHOULD use only the names of existing newsgroups in the
Newsgroups header. However, it is legitimate to cross-post to Newsgroups-header. However, it is legitimate to cross-post to a
newsgroup(s) which do not exist on the posting agent's host, provided newsgroup(s) which do not exist on the posting agent's host, provided
that at least one of the newsgroups DOES exist there, and followup that at least one of the newsgroups DOES exist there, and followup
agents SHOULD accept this (posting agents MAY accept it, but Ought at agents SHOULD accept this (posting agents MAY accept it, but Ought at
least to alert the poster to the situation and request confirmation). least to alert the poster to the situation and request confirmation).
Relaying agents MUST NOT rewrite Newsgroups headers in any way, even Relaying agents MUST NOT rewrite Newsgroups-headers in any way, even
if some or all of the newsgroups do not exist on the relaying agent's if some or all of the newsgroups do not exist on the relaying agent's
host. Serving agents MUST NOT create new newsgroups simply because an host. Serving agents MUST NOT create new newsgroups simply because an
unrecognised newsgroup-name occurs in a Newsgroups header (see 7.2.1 unrecognized newsgroup-name occurs in a Newsgroups-header (see 7.2.1
for the correct method of newsgroup creation). for the correct method of newsgroup creation).
The Newsgroups header is intended for use in Netnews articles rather The Newsgroups-header is intended for use in Netnews articles rather
than in mail messages. It MAY be used in a mail message to indicate than in email messages. It MAY be used in an email message to
that it is a copy also posted to the listed newsgroups, but it SHOULD indicate that it is a copy also posted to the listed newsgroups, in
NOT be used in a mail-only reply to a Netnews article (thus the which case the inclusion of a Posted-And-Mailed header (6.9) would
"inheritable" property of this header applies only to followups to a also be appropriate. However, it SHOULD NOT be used in an email-only
newsgroup, and not to followups to the poster). Moreover, if a reply to a Netnews article (thus the "inheritable" property of this
newsgroup-name contains any non-ASCII character, it MAY be encoded header applies only to followups to a newsgroup, and not to followups
using the mechanism defined in [RFC 2047] when sent by mail but, if to the poster). Moreover, if a newsgroup-name contains any non-ASCII
it is subsequently returned to the Netnews environment, it MUST then character, it MAY be encoded using the mechanism defined in [RFC
be re-encoded into UTF-8. 2047] when sent by email (for which purpose the newsgroup-name SHOULD
be treated as an encoded-word) but, if it is subsequently returned to
the Netnews environment, it MUST then be re-encoded into UTF-8. See
also the further discussion in section 8.8.1.
5.5.1. Forbidden newsgroup names 5.5.1. Forbidden newsgroup names
The following forms of newsgroup-name MUST NOT be used except for the The following forms of newsgroup-name MUST NOT be used except for the
specific purposes indicated: specific purposes indicated:
o Newsgroup-names having only one component. These are reserved for o Newsgroup-names having only one component. These are reserved for
newsgroups whose propagation is restricted to a single host or newsgroups whose propagation is restricted to a single host or
local network, and for pseudo-newsgroups such as "poster" (which local network, and for pseudo-newsgroups such as "poster" (which
has special meaning in the Followup-To header - see section 6.7), has special meaning in the Followup-To-header - see section 6.7),
"junk" (often used by serving agents), "control" (likewise) "junk" (often used by serving agents), and "control" (likewise);
o Any newsgroup-name beginning with "control." (used as pseudo- o Any newsgroup-name beginning with "control." (used as pseudo-
newsgroups by many serving agents) newsgroups by many serving agents);
o Any newsgroup-name containing the component "ctl" (likewise) o Any newsgroup-name containing the component "ctl" (likewise);
o "to" or any newsgroup-name beginning with "to." (reserved for the o "to" or any newsgroup-name beginning with "to." (reserved for the
ihave/sendme protocol described in section 7.4, and for test ihave/sendme protocol described in section 7.4, and for test
messages sent on an essentially point-to-point basis) messages sent on an essentially point-to-point basis);
o Any newsgroup-name beginning with "example." (reserved for o Any newsgroup-name beginning with "example." (reserved for
examples in this and other standards) examples in this and other standards);
o Any newsgroup-name containing the component "all" (because this o Any newsgroup-name containing the component "all" (because this
is used as a wildcard in some implementations) is used as a wildcard in some implementations).
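The reserved forms listed above can be checked mechanically. The following Python sketch shows one possible way to do so; the function name and the wording of the returned reasons are hypothetical, and the comparison is assumed to be exact (case sensitive), which the list above does not specify.

```python
from typing import Optional


def reserved_form(newsgroup_name: str) -> Optional[str]:
    """Return the reason a name is reserved, or None if no rule applies."""
    components = newsgroup_name.split(".")
    # Single-component names (this also covers "to", "poster", "junk"
    # and "control").
    if len(components) == 1:
        return "only one component"
    # Names beginning with a reserved first component.
    if components[0] == "control":
        return 'begins with "control."'
    if components[0] == "to":
        return 'begins with "to."'
    if components[0] == "example":
        return 'begins with "example."'
    # Names containing a reserved component anywhere.
    if "ctl" in components:
        return 'contains the component "ctl"'
    if "all" in components:
        return 'contains the component "all"'
    return None
```

A non-None result does not by itself make the name an error, since each form remains usable for the specific purpose indicated in the list.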
A newsgroup-name SHOULD NOT appear more than once in the Newsgroups A newsgroup-name SHOULD NOT appear more than once in the Newsgroups-
header. The order of newsgroup names in the Newsgroups header is not header. The order of newsgroup names in the Newsgroups-header is not
significant, except for determining which moderator to send the significant, except for determining which moderator to send the
article to if more than one of the groups is moderated (see 8.2). article to if more than one of the groups is moderated (see 8.2).
5.6. Path 5.6. Path
The Path header shows the route taken by a message since its entry The Path-header shows the route taken by a message since its entry
into the Netnews system. It is a variant header (4.2.2.3), each agent into the Netnews system. It is a variant header (4.2.5.3), each agent
that processes an article being required to add one (or more) entries that processes an article being required to add one (or more) entries
to it. This is primarily to enable relaying agents to avoid sending to it. This is primarily to enable relaying agents to avoid sending
articles to sites already known to have them, in particular the site articles to sites already known to have them, in particular the site
they came from, and additionally to permit tracing the route articles they came from, and additionally to permit tracing the route articles
take in moving over the network, and for gathering Usenet statistics. take in moving over the network, and for gathering Usenet statistics.
Finally the presence of a '%' delimiter in the Path header can be Finally the presence of a '%' path-delimiter in the Path-header can
used to identify an article injected in conformance with this be used to identify an article injected in conformance with this
standard. standard.
5.6.1. Format 5.6.1. Format
Path-content = *( path-identity [FWS] delimiter [FWS] ) header =/ Path-header
tail-entry *FWS Path-header = "Path" ":" SP Path-content
path-identity = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" ) *( ";" other-parameter )
delimiter = "/" / "?" / "%" / "," / "!" Path-content = [FWS]
*( path-identity [FWS] path-delimiter [FWS] )
tail-entry [FWS]
path-identity = ( ALPHA / DIGIT )
*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
path-delimiter = "/" / "?" / "%" / "," / "!"
tail-entry = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" ) tail-entry = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
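As an illustration, the Path-content syntax of the newer revision (right-hand column) can be approximated with regular expressions. The Python sketch below is not a definitive parser: the function name is hypothetical, and FWS is loosely approximated by any run of whitespace, which is more lenient than the real folding rule.

```python
import re

# Character classes taken from the grammar above.
_ID = r"[A-Za-z0-9][A-Za-z0-9\-.:_]*"    # path-identity
_DELIM = r"[/?%,!]"                      # path-delimiter
_TAIL = r"[A-Za-z0-9\-.:_]+"             # tail-entry

# Assumption: FWS is approximated by \s*, which also accepts whitespace
# that is not a valid fold.
_ENTRY = re.compile(rf"({_ID})\s*({_DELIM})\s*")
_PATH = re.compile(rf"\s*(?:{_ID}\s*{_DELIM}\s*)*({_TAIL})\s*")


def parse_path_content(text):
    """Return ([(path-identity, path-delimiter), ...], tail-entry), or None."""
    match = _PATH.fullmatch(text)
    if match is None:
        return None
    # Everything before the tail-entry is a sequence of identity/delimiter pairs.
    body = text[:match.start(1)]
    entries = [(m.group(1), m.group(2)) for m in _ENTRY.finditer(body)]
    return entries, match.group(1)
```

For example, parse_path_content("relay.example/inject.example%not-for-mail") yields two entries, delimited by "/" and "%" respectively, and the tail-entry "not-for-mail".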
NOTE: A Path-content will inevitably contain at least one path- NOTE: A Path-content will inevitably contain at least one path-
identity, except possibly in the case of a proto-article that identity, except possibly in the case of a proto-article that
has not yet been injected onto the network. has not yet been injected onto the network.
NOTE: Observe that the syntax does not allow comments within the NOTE: Observe that the syntax does not allow comments within the
Path header; this is to simplify processing by relaying and Path-header; this is to simplify processing by relaying and
injecting agents which have a requirement to process this header injecting agents which have a requirement to process this header
extremely rapidly. extremely rapidly.
A relaying agent SHOULD NOT pass an article to another relaying agent A relaying agent SHOULD NOT pass an article to another relaying agent
whose path-identity (or some known alias thereof) already appears in whose path-identity (or some known alias thereof) already appears in
the Path-content. Since the comparison may be either case sensitive the Path-content. Since the comparison may be either case sensitive
or case insensitive, relaying agents SHOULD NOT generate a name which or case insensitive, relaying agents SHOULD NOT generate a name which
differs from that of another site only in terms of case. differs from that of another site only in terms of case.
A relaying agent MAY decline to accept an article if its own path- A relaying agent MAY decline to accept an article if its own path-
identity is already present in the Path-content or if the Path- identity is already present in the Path-content or if the Path-
content contains some path-identity whose articles the relaying agent content contains some path-identity whose articles the relaying agent
does not want, as a matter of local policy. does not want, as a matter of local policy.
NOTE: This last facility is sometimes used to detect and decline NOTE: This last facility is sometimes used to detect and decline
control messages (notably cancel messages) which have been control messages (notably cancel messages) which have been
deliberately seeded with a path-identity to be "aliased out" by deliberately seeded with a path-identity to be "aliased out" by
sites not wishing to act upon them. sites not wishing to act upon them.
5.6.2. Adding a path-identity to the Path header 5.6.2. Adding a path-identity to the Path-header
When an injecting, relaying or serving agent receives an article, it When an injecting, relaying or serving agent receives an article, it
MUST prepend its own path-identity followed by a delimiter to the MUST prepend its own path-identity followed by a path-delimiter to
beginning of the Path-content. In addition, it SHOULD then add CRLF the beginning of the Path-content. In addition, it SHOULD then add
and WSP if it would otherwise result in a line longer than 79 CRLF and WSP if it would otherwise result in a line longer than 79
characters. characters.
The path-identity added MUST be unique to that agent. To this end it The path-identity added MUST be unique to that agent. To this end it
SHOULD be one of: SHOULD be one of:
1. A fully qualified domain name (FQDN) associated (by the Internet 1. A fully qualified domain name (FQDN) associated (by the Internet
DNS service [RFC 1034]) with an A record, which SHOULD identify DNS service [RFC 1034]) with an A record, which SHOULD identify
the actual machine prepending this path-identity. Ideally, this the actual machine prepending this path-identity. Ideally, this
FQDN should also be "mailable" in the sense that it enables the FQDN should also be "mailable" (see below).
construction of a valid E-mail address of the form "usenet@<FQDN>"
or "news@<FQDN>" [RFC 2142] whereby the administrators of that
agent may be reached.
2. A fully qualified domain name (FQDN) associated (by the Internet 2. A fully qualified domain name (FQDN) associated (by the Internet
DNS service) with an MX record which MUST then enable the DNS service) with an MX record, which MUST be "mailable".
construction of a valid E-mail address of the form "usenet@<FQDN>"
or "news@<FQDN>" whereby the administrators of that agent may be
reached.
3. An arbitrary name believed to be unique and registered at least 3. An arbitrary name believed to be unique and registered at least
with all sites immediately downstream from the given site. with all sites immediately downstream from the given site.
4. An encoding of an IP address - <dotted-quad> [RFC 820] or <ipv6- 4. An encoding of an IP address - <IPv4address> or <IPv6address> [RFC
numeric> [RFC 2373] (the requirement to be able to use an <ipv6- 2373] (the requirement to be able to use an <IPv6address> is the
numeric> is the reason for including ':' as an allowed character reason for including ':' as an allowed character within a path-
within a path-identity). identity).
The FQDN of an agent is "mailable" if the administrators of that
agent can be reached by email using both of the forms "usenet@<FQDN>"
and "news@<FQDN>", in conformity with [RFC 2142].
Of the above options, nos. 1 to 3 are much to be preferred, unless Of the above options, nos. 1 to 3 are much to be preferred, unless
there are strong technical reasons dictating otherwise. In there are strong technical reasons dictating otherwise. In
particular, the injecting agent's path-identity MUST, as a special particular, the injecting agent's path-identity MUST, as a special
case, be an FQDN mailable address in the sense defined under option case, be an FQDN as in option 1 or option 2, and MUST be mailable.
1, or with an associated MX record as in option 2. Additionally, in the case of an injecting agent offering its services
to the general public, its administrators MUST also be reachable
using the form "abuse@<FQDN>" UNLESS a more specific complaints
address has been specified in a Complaints-To-header (6.20).
The injecting agent's path-identity MUST be followed by the special The injecting agent's path-identity MUST be followed by the special
delimiter '%' which serves to separate the pre-injection and post- path-delimiter '%' which serves to separate the pre-injection and
injection regions of the Path-content (see 5.6.3). post-injection regions of the Path-content (see 5.6.3).
In the case of a relaying or serving agent, the delimiter is chosen In the case of a relaying or serving agent, the path-delimiter is
as follows. When such an agent receives an article, it MUST chosen as follows. When such an agent receives an article, it MUST
establish the identity of the source and compare it with the leftmost establish the identity of the source and compare it with the leftmost
path-identity of the Path-content. If it matches, a '/' should be path-identity of the Path-content. If it matches, a '/' should be
used as the delimiter when prepending the agent's own path-identity. used as the path-delimiter when prepending the agent's own path-
If it does not match then the agent should prepend two entries to the identity. If it does not match then the agent should prepend two
Path-content; firstly the true established path-identity of the entries to the Path-content; firstly the true established path-
source followed by a '?' delimiter, and then, to the left of that, identity of the source followed by a '?' path-delimiter, and then,
the agent's own path-identity followed by a '/' delimiter as usual. to the left of that, the agent's own path-identity followed by a '/'
This prepending of two entries SHOULD NOT be done if the provided and path-delimiter as usual. This prepending of two entries SHOULD NOT
established identities match. be done if the provided and established identities match.
Any method of establishing the identity of the source may be used Any method of establishing the identity of the source may be used
(but see 5.6.5 below), with the consideration that, in the event of (but see 5.6.5 below), with the consideration that, in the event of
problems, the agent concerned may be called upon to justify it. problems, the agent concerned may be called upon to justify it.
NOTE: The use of the '%' delimiter marks the position of the NOTE: The use of the '%' path-delimiter marks the position of
injecting agent in the chain. In normal circumstances there the injecting agent in the chain. In normal circumstances there
should therefore be only one '%' delimiter present, and should therefore be only one '%' path-delimiter present, and
injecting agents MAY choose to reject proto-articles with a '%' injecting agents MAY choose to reject proto-articles with a '%'
already in them. If, for whatever reason, more than one '%' is already in them. If, for whatever reason, more than one '%' is
found, then the path-identity in front of the leftmost '%' is to found, then the path-identity in front of the leftmost '%' is to
be regarded as the true injecting agent. be regarded as the true injecting agent.
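The rule in the NOTE above (the path-identity in front of the leftmost '%' is the injecting agent) can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch assumes that the Path-content is otherwise well formed.

```python
import re


def injecting_agent(path_content):
    """Return the path-identity in front of the leftmost '%', or None."""
    head, percent, _rest = path_content.partition("%")
    if not percent:
        # No '%' present: the article was not injected in conformance
        # with this standard, or has not yet been injected.
        return None
    # The last name before the leftmost '%' is the injecting agent.
    names = re.findall(r"[A-Za-z0-9\-.:_]+", head)
    return names[-1] if names else None
```

Since str.partition splits at the first occurrence, any further '%' path-delimiters to the right are ignored, as the NOTE requires.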
5.6.3. The tail-entry 5.6.3. The tail-entry
For historical reasons, the tail-entry (i.e. the rightmost entry in For historical reasons, the tail-entry (i.e. the rightmost entry in
the Path-content) is regarded as a "user name", and therefore MUST the Path-content) is regarded as a "user name", and therefore MUST
NOT be interpreted as a site through which the article has already NOT be interpreted as a site through which the article has already
passed. Moreover, the Path-content is not an E-mail address and MUST passed. Moreover, the Path-content as a whole is not an email address
NOT be used to contact the poster. Posting and/or injecting agents and MUST NOT be used to contact the poster. Posting and/or injecting
MAY place any string here. When it is not an actual user name, the agents MAY place any string here. When it is not an actual user
string "not-for-mail" is often used, but in fact a simple "x" would name, the string "not-for-mail" is often used, but in fact a simple
be sufficient. "x" would be sufficient.
Often this field will be the only entry in the region (known as the Often this field will be the only entry in the region (known as the
pre-injection region) after the '%', although there may be entries pre-injection region) after the '%', although there may be entries
corresponding to machines traversed between the posting agent and the corresponding to machines traversed between the posting agent and the
injecting agent proper. In particular, injecting agents that receive injecting agent proper. In particular, injecting agents that receive
articles from many sources MAY include information to establish the articles from many sources MAY include information to establish the
circumstances of the injection such as the identity of the source circumstances of the injection such as the identity of the source
machine (especially if the Injector-Info header (6.19) is absent). machine (especially if the Injector-Info-header (6.19) is absent).
Any such inclusion SHOULD NOT conflict with any genuine site Any such inclusion SHOULD NOT conflict with any genuine site
identifier. The '!' delimiter may be used freely within the pre- identifier. The '!' path-delimiter may be used freely within the
injection region, although '/' and '?' are also appropriate if used pre-injection region, although '/' and '?' are also appropriate if
correctly. used correctly.
5.6.4. Delimiter Summary
A summary of the various delimiters. The name immediately to the left 5.6.4. Path-Delimiter Summary
of the delimiter is always that of the machine which added the
delimiter.
News Article Format November 2001 A summary of the various path-delimiters. The name immediately to the
left of the path-delimiter is always that of the machine which added
the path-delimiter.
'/' The name immediately to the right is known to be the identity of '/' The name immediately to the right is known to be the identity of
the machine from which the article was received (either because the machine from which the article was received (either because
the entry was made by that machine and we have verified it, or the entry was made by that machine and we have verified it, or
because we have added it ourselves). because we have added it ourselves).
'?' The name immediately to the right is the claimed identity of the '?' The name immediately to the right is the claimed identity of the
machine from which the article was received, but we were unable machine from which the article was received, but we were unable
to verify it (and have prepended our own view of where it came to verify it (and have prepended our own view of where it came
from, and then a '/'). from, and then a '/').
skipping to change at page 39, line 30 skipping to change at page 42, line 48
double-injection (see 8.2.2). double-injection (see 8.2.2).
'!' The name immediately to the right is unverified. The presence of '!' The name immediately to the right is unverified. The presence of
a '!' to the left of the '%' indicates that the identity to the a '!' to the left of the '%' indicates that the identity to the
left is that of an old-style system not conformant with this left is that of an old-style system not conformant with this
standard. standard.
',' Reserved for future use, treat as '/'. ',' Reserved for future use, treat as '/'.
Other Other
Old software may possibly use other delimiters, which should be Old software may possibly use other path-delimiters, which should
treated as '!'. But note in particular that ':', '-' and '_' are be treated as '!'. But note in particular that ':', '-' and '_'
components of names, not delimiters, and FWS on its own MUST NOT are components of names, not path-delimiters, and FWS on its own
be used as the sole delimiter. MUST NOT be used as the sole path-delimiter.
NOTE: Old Netnews relaying and injecting programs almost all NOTE: Old Netnews relaying and injecting agents almost all
delimit Path entries with the '!' delimiter, and these entries delimit Path entries with a '!', and these entries are not
are not verified. As such, the presence of '%' as a delimiter verified. The presence of '%' indicates that the article was
will indicate that the article was injected by software injected by software conforming to this standard, and the
conforming to this standard, and the presence of '!' as a presence of '!' to the left of a '%' indicates that the message
delimiter to the left of a '%' will indicate that the message
passed through systems developed prior to this standard. It is passed through systems developed prior to this standard. It is
anticipated that relaying agents will reject articles in the old anticipated that relaying agents will reject articles in the old
style once this new standard has been widely adopted. style once this new standard has been widely adopted.
News Article Format May 2002
5.6.5. Suggested Verification Methods 5.6.5. Suggested Verification Methods
The following approaches for common transports are suggested in order It is preferable to verify the claimed path-identity against the
to meet a site's verification obligations. They are not required, but source than to make routine use of the '?' path-delimiter, with
following them should avoid the necessity for wasteful double-entry consequential wasteful double-entry Path additions.
Path additions.
If the incoming article arrives through some TCP/IP protocol such as If the incoming article arrives through some TCP/IP protocol such as
NNTP, the IP address of the source will be known, and will likely NNTP, the IP address of the source will be known, and will likely
already have been checked against a list of known FQDNs or IP already have been checked against a list of known FQDNs, IP
addresses that the receiving site has agreed to peer with (this will addresses, or other registered aliases that the receiving site has
have involved a DNS lookup of a known FQDN, following CNAME chains as agreed to peer with.
required, to find an A record containing that source IP).
News Article Format November 2001
1. Where the path-identity is an FQDN (or even an arbitrary name
starting with a '.') it is now a simple matter to check that it is
the proper FQDN for the source, or some known registered alias
thereof. Alternatively, where the FQDN in the path-identity has an
associated A record, an immediate DNS lookup as above can be used
to verify it.
2. Where the path-identity is an encoding of an IP address which does
not immediately match the known IP address of the source, a
reverse-DNS (in-addr.arpa PTR record) lookup may be done on the
provided address, followed by a regular DNS "A" record lookup on
the returned name. There may be A records for several IP
addresses, of which one should match the path-identity and another
should match the source.
3. If the path-identity fails to match any known alias for the source Since the source host may have several IP addresses, checking the
(requiring the insertion of an extra path-identity for the true claimed FQDN or IP address against the source IP, or finding a
source followed by a '?'), simply doing a reverse DNS (PTR) lookup suitable FQDN to report with a '?' path-delimiter, may involve
on the source IP address is not sufficient to generate the true several DNS lookups, following CNAME chains as required. Note that
FQDN. The returned name must be mapped back to A records to assure any reverse DNS lookup that is involved needs to be confirmed by a
it matches the source's IP address. forward one.
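
The requirement that a reverse DNS lookup be confirmed by a forward one can be sketched as follows. This is a minimal illustration in Python, not a method prescribed by either draft: the function name forward_confirmed_name and its two resolver parameters are hypothetical, and the default resolvers assume IPv4 and the standard socket module.

```python
import socket


def forward_confirmed_name(source_ip, reverse=None, forward=None):
    """Return a host name for source_ip, or None if it cannot be confirmed.

    reverse(ip) -> name and forward(name) -> list of IP strings are
    injectable (hypothetical parameters) so the check can be exercised
    without network access; by default the system resolver is used.
    """
    if reverse is None:
        # PTR-style lookup: (hostname, aliases, addresses)[0]
        reverse = lambda ip: socket.gethostbyaddr(ip)[0]
    if forward is None:
        # Forward lookup: (hostname, aliases, addresses)[2], IPv4 only
        forward = lambda name: socket.gethostbyname_ex(name)[2]
    try:
        name = reverse(source_ip)
        addresses = forward(name)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None
    # The reverse result is trusted only if the forward lookup maps back
    return name if source_ip in addresses else None
```

A relaying agent might use the returned name as the path-identity it prepends with a '?' path-delimiter, and fall back to an encoding of the IP address when None is returned.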
If the incoming article arrives through some other protocol, such as If the incoming article arrives through some other protocol, such as
UUCP, that protocol MUST include a means of verifying the source UUCP, that protocol MUST include a means of verifying the source
site. In UUCP implementations, commonly each incoming connection has site. In UUCP implementations, commonly each incoming connection has
a unique login name and password, and that login name (or some alias a unique login name and password, and that login name (or some alias
registered for it) would be expected as the path-identity. registered for it) would be expected as the path-identity.
[The above description may still contain more detail than we would wish.
My aim so far was to retain everything in Brad's original, but expressed
in a more palatable manner. We can now decide how much of it we want to
keep.]
5.6.6. Example 5.6.6. Example
Path: foo.isp.example/ Path: foo.isp.example/
foo-server/bar.isp.example?10.123.12.2/old.site.example! foo-server/bar.isp.example?10.123.12.2/old.site.example!
barbaz/baz.isp.example%dialup123.baz.isp.example!x barbaz/baz.isp.example%dialup123.baz.isp.example!x
NOTE: That article was injected into the news stream by NOTE: That article was injected into the news stream by
baz.isp.example (complaints may be addressed to baz.isp.example (complaints may be addressed to
usenet@baz.isp.example). The injector has taken care to record abuse@baz.isp.example). The injector has taken care to record
that it got it from dialup123.baz.isp.example. "x" is the that it got it from dialup123.baz.isp.example. "x" is a dummy
default tail entry, though sometimes a real userid is put there. tail-entry, though sometimes a real userid is put there.
The article was relayed, perhaps by UUCP, to the machine known, The article was relayed, perhaps by UUCP, to the machine known,
at least to its downstream, as "barbaz". at least to its downstream, as "barbaz".
Barbaz relayed it to old.site.example, which does not yet Barbaz relayed it to old.site.example, which does not yet
conform to this standard (hence the '!' delimiter). So one conform to this standard (hence the '!' path-delimiter). So one
cannot be sure that it really came from barbaz. cannot be sure that it really came from barbaz.
Old.site.example relayed it to a site claiming to have the IP Old.site.example relayed it to a site claiming to have the IP
address [10.123.12.2], and claiming (by using the '/' delimiter) address [10.123.12.2], and claiming (by using the '/' path-
to have verified that it came from old.site.example. delimiter) to have verified that it came from old.site.example.
[10.123.12.2] relayed it to "foo-server" which, not being [10.123.12.2] relayed it to "foo-server" which, not being
convinced that it truly came from [10.123.12.2], did a reverse convinced that it truly came from [10.123.12.2], did a reverse
lookup on the actual source and concluded it was known as lookup on the actual source and concluded it was known as
bar.isp.example (that is not to say that [10.123.12.2] was not a bar.isp.example (that is not to say that [10.123.12.2] was not a
correct IP address for bar.isp.example, but simply that that correct IP address for bar.isp.example, but simply that that
connection could not be substantiated by foo-server). Observe connection could not be substantiated by foo-server). Observe
that foo-server has now added two entries to the Path. that foo-server has now added two entries to the Path.
"foo-server" is a locally significant name within the complex "foo-server" is a locally significant name within the complex
site of many machines run by foo.isp.example, so the latter site of many machines run by foo.isp.example, so the latter
should have no problem recognizing foo-server and using a '/' should have no problem recognizing foo-server and using a '/'
delimiter. Presumably foo.isp.example then delivered the path-delimiter. Presumably foo.isp.example then delivered the
article to its direct clients. article to its direct clients.
It appears that foo.isp.example and old.site.example decided to It appears that foo.isp.example and old.site.example decided to
fold the line, on the grounds that it seemed to be getting a fold the line, on the grounds that it seemed to be getting a
little too long. little too long.
6. Optional Headers 6. Optional Headers
The headers appearing in this section have established meanings and None of the headers appearing in this section is required to appear
MUST be interpreted according to the definitions given here. None of in every article but some of them are required in certain types of
them is required to appear in every article but some of them are article, such as followups. Any header defined in this (or any other)
required in certain types of article, such as followups. Any header standard MUST NOT appear more than once in an article unless
defined in this (or any other) standard MUST NOT appear more than specifically stated otherwise. Experimental headers (4.2.5.1) and
once in an article unless specifically stated otherwise. headers defined by cooperating subnets are exempt from this
Experimental headers (4.2.2.1) and headers defined by cooperating requirement. See section 8 "Duties of Various Agents" for the full
subnets are exempt from this requirement. See section 8 "Duties of picture.
Various Agents" for the full picture.
6.1. Reply-To 6.1. Reply-To
The Reply-To header specifies a reply address(es) to be used for The Reply-To-header specifies a reply address(es) to be used for
personal replies for the poster(s) of the article when this is personal replies for the poster(s) of the article when this is
different from the poster's address(es) given in the From header. The different from the poster's address(es) given in the From-header. The
content syntax makes use of syntax defined in [RFC 2822], but subject content syntax makes use of syntax defined in [RFC 2822], but subject
to the revised definition of local-part given in section 5.2. to the revised definition of local-part given in section 5.2.
Reply-To-content = From-content ; see 5.2 header =/ Reply-To-header
Reply-To-header = "Reply-To" ":" SP Reply-To-content
Reply-To-content = address-list
In the absence of Reply-To, the reply address(es) is the address(es) In the absence of Reply-To, the reply address(es) is the address(es)
in the From header. For this reason a Reply-To SHOULD NOT be included in the From-header. For this reason a Reply-To SHOULD NOT be included
if it just duplicates the From header. if it just duplicates the From-header.
NOTE: Use of a Reply-To header is preferable to including a NOTE: Use of a Reply-To-header is preferable to including a
similar request in the article body, because reply agents can similar request in the article body, because replying agents can
take account of Reply-To automatically. take account of Reply-To automatically.
An address of "<>" in the Reply-To header MAY be used to indicate
that the poster does not wish to receive email replies.
6.1.1. Examples 6.1.1. Examples
Reply-To: John Smith <jsmith@site.example> Reply-To: John Smith <jsmith@site.example>
Reply-To: John Smith <jsmith@site.example>, dave@isp.example Reply-To: John Smith <jsmith@site.example>, dave@isp.example
Reply-To: John Smith <jsmith@site.example>,andrew@isp.example, Reply-To: John Smith <jsmith@site.example>,andrew@isp.example,
fred@site2.example fred@site2.example
Reply-To: Please do not reply <>
6.2. Sender 6.2. Sender
The Sender header specifies the mailbox of the entity which actually The Sender-header specifies the mailbox of the entity which caused
sent this article, if that entity is different from that given in the this article to be posted (and hence injected), if that entity is
From header or if more than one address appears in the From header. different from that given in the From-header or if more than one
This header SHOULD NOT appear in an article unless the sender is address appears in the From-header. This header SHOULD NOT appear in
different from the poster. This header is appropriate for use by an article unless the sender is different from the poster. This
automatic article posters. The content syntax makes use of syntax header is appropriate for use by automatic article posters. The
defined in [RFC 2822]. content syntax makes use of syntax defined in [RFC 2822].
header =/ Sender-header
Sender-header = "Sender" ":" SP Sender-content
*( ";" other-parameter )
Sender-content = mailbox Sender-content = mailbox
6.3. Organization 6.3. Organization
The Organization header is a short phrase identifying the poster's The Organization-header is a short phrase identifying the poster's
organization. organization.
Organization-content= 1*( [FWS] utext ) header =/ Organization-header
Organization-header = "Organization" ":" SP Organization-content
Organization-content= unstructured
NOTE: Posting and injecting agents are discouraged from NOTE: Posting and injecting agents are discouraged from
providing a default value for this header unless it is providing a default value for this header unless it is
acceptable to all posters using those agents. Unless this header acceptable to all posters using those agents. Unless this header
contains useful information (including some indication of the contains useful information (including some indication of the
poster's physical location) posters are discouraged from poster's physical location) posters are discouraged from
including it. including it.
6.4. Keywords 6.4. Keywords
The Keywords field contains a comma separated list of important words The Keywords field contains a comma separated list of important words
and phrases intended to describe some aspect of the content of the and phrases intended to describe some aspect of the content of the
article. The content syntax makes use of syntax defined in [RFC article. The content syntax makes use of syntax defined in [RFC
2822]. 2822].
header =/ Keywords-header
Keywords-header = "Keywords" ":" SP Keywords-content
*( ";" other-parameter )
Keywords-content = phrase *( "," phrase ) Keywords-content = phrase *( "," phrase )
NOTE: The list is comma separated NOT space separated. NOTE: The list is comma separated, NOT space separated.
NOTE: Contrary to the usage defined in [RFC 2822], this standard
does not permit multiple occurrences of this header.
6.5. Summary 6.5. Summary
The Summary header is a short phrase summarizing the article's The Summary-header is a short phrase summarizing the article's
content. content.
Summary-content = 1*( [FWS] utext ) header =/ Summary-header
Summary-header = "Summary" ":" SP Summary-content
Summary-content = unstructured
The summary should be terse. Authors Ought to avoid trying to cram The summary should be terse. Authors Ought to avoid trying to cram
their entire article into the headers; even the simplest query their entire article into the headers; even the simplest query
usually benefits from a sentence or two of elaboration and context, usually benefits from a sentence or two of elaboration and context,
and not all reading agents display all headers. On the other hand the and not all reading agents display all headers. On the other hand the
summary should give more detail than the Subject. summary should give more detail than the Subject.
6.6. Distribution 6.6. Distribution
The Distribution header is an inheritable header (see 4.2.2.2) which The Distribution-header is an inheritable header (see 4.2.5.2) which
specifies geographical or organizational limits to an article's specifies geographical or organizational limits to an article's
propagation. propagation.
header =/ Distribution-header
Distribution-header = "Distribution" ":" SP Distribution-content
*( ";" other-parameter )
Distribution-content= distribution *( dist-delim distribution ) Distribution-content= distribution *( dist-delim distribution )
dist-delim = "," dist-delim = ","
distribution = positive-distribution / distribution = [FWS] distribution-name [FWS]
negative-distribution
positive-distribution
= *FWS distribution-name *FWS
negative-distribution
= *FWS "!" distribution-name *FWS
distribution-name = ALPHA 1*distribution-rest distribution-name = ALPHA 1*distribution-rest
distribution-rest = ALPHA / "+" / "-" / "_" distribution-rest = ALPHA / "+" / "-" / "_"
NOTE: The use of ALPHA in the syntax ensures that distribution NOTE: The use of ALPHA in the syntax ensures that distribution
names are always in US-ASCII. names are always in US-ASCII.
Articles MUST NOT be passed between relaying agents or to serving Articles MUST NOT be passed between relaying agents or to serving
agents unless the sending agent has been configured to supply and the agents unless the sending agent has been configured to supply and the
receiving agent to receive BOTH of receiving agent to receive at least one of the distributions in the
(a) at least one of the newsgroups in the article's Newsgroups Distribution-header. Additionally, reading agents MAY also be
header, and configured so that unwanted distributions do not get displayed.
(b) at least one of the positive-distributions (if any) in the
article's Distribution header and none of the negative-
distributions.
Additionally, reading agents MAY be configured so that unwanted
distributions do not get displayed.
NOTE: Although it would seem redundant to filter out unwanted NOTE: Although it would seem redundant to filter out unwanted
distributions at both ends of a relaying link (and it is clearly distributions at both ends of a relaying link (and it is clearly
more efficient to do so at the sending end), many sending sites more efficient to do so at the sending end), many sending sites
have been reluctant, historically speaking, to apply such have been reluctant, historically speaking, to apply such
filters (except to ensure that distributions local to their own filters (except to ensure that distributions local to their own
site or cooperating subnet did not escape); moreover they tended site or cooperating subnet did not escape); moreover they tended
to configure their filters on an "all but those listed" basis, to configure their filters on an "all but those listed" basis,
so that new and hitherto unheard of distributions would not be so that new and hitherto unheard of distributions would not be
caught. Indeed many "hub" sites actually wanted to receive all caught. Indeed many "hub" sites actually wanted to receive all
possible distributions so that they could feed on to their possible distributions so that they could feed on to their
clients in all possible geographical (or organizational) clients in all possible geographical (or organizational)
regions. regions.
Therefore, it is desirable to provide facilities for rejecting Therefore, it is desirable to provide facilities for rejecting
unwanted distributions at the receiving end. Indeed, it may be unwanted distributions at the receiving end. Indeed, it may be
simpler to do so locally than to inform each sending site of simpler to do so locally than to inform each sending site of
what is required, especially in the case of specialized what is required, especially in the case of specialized
distributions (for example for control messages, such as cancels distributions (for example for control messages, such as cancels
from certain issuers) which might need to be added at short from certain issuers) which might need to be added at short
notice. Tha possibility for reading agents to filter notice. The possibility for reading agents to filter
distributions has been provided for the same reason. distributions has been provided for the same reason.
Exceptionally, ALL relaying agents are deemed willing to supply or Exceptionally, ALL relaying agents are deemed willing to supply or
accept the distribution "world", and NO relaying agent should supply accept the distribution "world", and NO relaying agent should supply
or accept the distribution "local". However, "world" SHOULD NEVER be or accept the distribution "local". However, "world" SHOULD NEVER be
mentioned explicitly since it is the default when the Distribution mentioned explicitly since it is the default when the Distribution-
header is absent entirely. "All" MUST NOT be used as a header is absent entirely. "All" MUST NOT be used as a
distribution-name. Distribution-names SHOULD contain at least three distribution-name. Distribution-names SHOULD contain at least three
characters, except when they are two-letter country names as in [ISO characters, except when they are two-letter country names as in [ISO
3166]. Distribution-names are case-insensitive (i.e. "US", "Us" and 3166]. Distribution-names are case-insensitive (i.e. "US", "Us" and
"us" all specify the same distribution). "us" all specify the same distribution).
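
The rules above for one end of a relaying link can be sketched in Python. This is a minimal illustration that follows the revised grammar (no negative distributions); the names parse_distribution and may_relay, and the idea of passing the locally configured distributions as a set, are assumptions made for the example.

```python
import re

# distribution-name = ALPHA 1*distribution-rest
DIST_NAME = re.compile(r"[A-Za-z][A-Za-z+_-]+\Z")


def parse_distribution(content):
    """Return the lower-cased distribution-names in a Distribution-content."""
    names = []
    for item in content.split(","):               # dist-delim = ","
        name = item.strip()                       # drop surrounding FWS
        if not DIST_NAME.match(name):
            raise ValueError("bad distribution-name: %r" % name)
        if name.lower() == "all":                 # "All" MUST NOT be used
            raise ValueError('"all" is not a valid distribution-name')
        names.append(name.lower())                # names are case-insensitive
    return names


def may_relay(content, configured):
    """Decide whether this agent supplies/accepts the article.

    content is the Distribution-content, or None when the header is
    absent (in which case the default "world" applies).
    """
    names = ["world"] if content is None else parse_distribution(content)
    wanted = {c.lower() for c in configured}
    for name in names:
        if name == "local":                       # never supplied or accepted
            continue
        if name == "world" or name in wanted:     # "world" is always accepted
            return True
    return False
```

Both the sending and the receiving agent would apply such a check against their own configuration, since the text requires agreement at both ends of the link.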
NOTE: "Distribution: !us" can be used to cause an article to go Posting agents Ought Not to provide a default Distribution-header
to the whole of "world" except for "us".
Posting agents Ought Not to provide a default Distribution header
without giving the poster an opportunity to override it. Followup without giving the poster an opportunity to override it. Followup
agents SHOULD initially supply the same Distribution header as found agents SHOULD initially supply the same Distribution-header as found
in the precursor. in the precursor.
6.7. Followup-To 6.7. Followup-To
The Followup-To header specifies which newsgroup(s) followups should The Followup-To-header specifies which newsgroup(s) followups should
be posted to. be posted to.
Followup-To-content = Newsgroups-content / "poster" header =/ Followup-To-header
Followup-To-header = "Followup-To" ":" SP Followup-To-content
*( ";" other-parameter )
Followup-To-content = Newsgroups-content / [FWS] "poster" [FWS]
The syntax is the same as that of the Newsgroups-content, with the The syntax is the same as that of the Newsgroups-content, with the
exception that the magic word "poster" is allowed. In the absence of addition that the keyword "poster" is allowed. In the absence of a
a Followup-To header, the default newsgroup(s) for a followup are Followup-To-header, the default newsgroup(s) for a followup are those
those in the Newsgroups header, and for this reason the Followup-To in the Newsgroups header, and for this reason the Followup-To-header
header SHOULD NOT be included if it just duplicates the Newsgroups SHOULD NOT be included if it just duplicates the Newsgroups-header.
header.
A Followup-To header consisting of the magic word "poster" indicates A Followup-To-header consisting of the keyword "poster" indicates
that the poster requests no followups to be sent in response to this that the poster requests no followups to be sent in response to this
article, only personal replies to the article's reply address. article, only personal replies to the article's reply address.
NOTE: A poster who wishes both a personal reply and a followup NOTE: A poster who wishes both a personal reply and a followup
post should include a Mail-Copies-To header (6.8). post should include a Mail-Copies-To-header (6.8).
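
How a followup agent might choose its targets can be sketched as follows. This is an illustration only: the function name followup_targets and the use of a plain dictionary for the precursor's headers are assumptions, and Newsgroups-content is taken to be a comma-separated list of newsgroup names as defined elsewhere in this standard.

```python
def followup_targets(headers):
    """Return ("mail", []) or ("post", [newsgroup, ...]) for a followup.

    headers maps header names of the precursor to their content.
    """
    content = headers.get("Followup-To")
    # Quoted literals in ABNF match case-insensitively, hence lower()
    if content is not None and content.strip().lower() == "poster":
        return ("mail", [])                       # personal reply only
    if content is None:
        content = headers["Newsgroups"]           # default when header absent
    groups = [g.strip() for g in content.split(",") if g.strip()]
    return ("post", groups)
```

In the "mail" case the reply address is then determined as described under the Reply-To-header (6.1).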
6.8. Mail-Copies-To 6.8. Mail-Copies-To
The Mail-Copies-To header indicates whether or not the poster wishes The Mail-Copies-To-header indicates whether or not the poster wishes
to have followups to an article emailed in addition to being posted to have followups to an article emailed in addition to being posted
to Netnews and, if so, establishes the address to which they should to Netnews and, if so, establishes the address to which they should
be sent. be sent.
The content syntax makes use of syntax defined in [RFC 2822], but The content syntax makes use of syntax defined in [RFC 2822], but
subject to the revised definition of local-part given in section 5.2. subject to the revised definition of local-part given in section 5.2.
Mail-Copies-To-content = copy-addr / "nobody" / "poster" header =/ Mail-Copies-To-header
copy-addr = mailbox Mail-Copies-To-header
= "Mail-Copies-To" ":" SP Mail-Copies-To-content
Mail-Copies-To-content
= copy-addr / [CFWS] ( "nobody" / "poster" ) [CFWS]
copy-addr = address-list
The keyword "nobody" indicates that the poster does not wish copies The keyword "nobody" indicates that the poster does not wish copies
of any followup postings to be emailed. This indication is widely of any followup postings to be emailed. This indication is widely
seen as a very strong wish, and is to be taken as the default when seen as a very strong wish, and is to be taken as the default when
this header is absent. this header is absent.
The keyword "poster" indicates that the poster wishes a copy of any The keyword "poster" indicates that the poster wishes a copy of any
followup postings to be emailed to him. followup postings to be emailed to him.
Otherwise, this header contains a copy-addr to which the poster Otherwise, this header contains a copy-addr to which the poster
skipping to change at page 45, line 38 skipping to change at page 48, line 35
The automatic actions of a followup agent in the various cases The automatic actions of a followup agent in the various cases
(subject to manual override by the user) are as follows: (subject to manual override by the user) are as follows:
nobody (or when the header is absent) nobody (or when the header is absent)
The followup agent SHOULD NOT, by default, email such a copy and The followup agent SHOULD NOT, by default, email such a copy and
Ought, especially when there is an explicit "nobody", to issue a Ought, especially when there is an explicit "nobody", to issue a
warning and ask for confirmation if the user attempts to do so. warning and ask for confirmation if the user attempts to do so.
poster poster
The followup agent Ought, by default, to email a copy, which MUST The followup agent Ought, by default, to email a copy, which MUST
then be sent to the address in the Reply-To header, and in the then be sent to the address(es) in the Reply-To-header, and in the
absence of that to the address(es) in the From header. absence of that to the address(es) in the From-header.
copy-addr copy-addr
The followup agent Ought, by default, to email a copy, which MUST The followup agent Ought, by default, to email a copy, which MUST
then be sent to the copy-addr. then be sent to the copy-addr.
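
The three cases above can be sketched as a default (before any manual override by the user). This is an illustration under stated assumptions: the function name copy_recipients and the dictionary of precursor headers are hypothetical, comments within CFWS are not handled, and address parsing is delegated to Python's email.utils, which follows the mail standards rather than the revised local-part of section 5.2.

```python
from email.utils import getaddresses


def copy_recipients(headers):
    """Return the addresses to which a followup is emailed by default."""

    def addresses(value):
        # getaddresses yields (display-name, address) pairs
        return [addr for _name, addr in getaddresses([value]) if addr]

    content = headers.get("Mail-Copies-To")
    keyword = "nobody" if content is None else content.strip().lower()
    if keyword == "nobody":                       # also the default when absent
        return []
    if keyword == "poster":
        # Reply-To if present, otherwise the address(es) in From
        return addresses(headers.get("Reply-To") or headers["From"])
    return addresses(content)                     # an explicit copy-addr
```

An agent that does send such a copy would also add "Posted-And-Mailed: yes" to both versions of the article.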
NOTE: This header is only relevant when posting followups to NOTE: This header is only relevant when posting followups to
Netnews articles, and is to be ignored when sending pure email Netnews articles, and is to be ignored when sending pure email
replies to the poster, which are handled as prescribed under the replies to the poster, which are handled as prescribed under the
Reply-To header (6.1). Whether or not this header will also Reply-To-header (6.1). Whether or not this header will also
find similar usage for replies to messages sent to mailing lists find similar usage for replies to messages sent to mailing lists
falls outside the scope of this standard. falls outside the scope of this standard.
When emailing a copy, the followup agent SHOULD also include a When emailing a copy, the followup agent SHOULD also include a
"Posted-And-Mailed: yes" header (6.9). "Posted-And-Mailed: yes" header (6.9).
NOTE: In addition to the Posted-And-Mailed header, some followup NOTE: In addition to the Posted-And-Mailed-header, some followup
agents also include within the body a mention that the article agents also include within the body a mention that the article
is both posted and mailed, for the benefit of reading agents is both posted and mailed, for the benefit of reading agents
that do not normally show that header. that do not normally show that header.
6.9. Posted-And-Mailed 6.9. Posted-And-Mailed
Posted-And-Mailed-content = "yes" / "no" header =/ Posted-And-Mailed-header
Posted-And-Mailed-header
= "Posted-And-Mailed" ":" SP Posted-And-Mailed-content
*( ";" other-parameter )
Posted-And-Mailed-content
= [CFWS] ( "yes" / "no" ) [CFWS]
This header, when used with the "yes" keyword, indicates that the This header, when used with the "yes" keyword, indicates that the
article has been both posted to the specified newsgroups and emailed. article has been both posted to the specified newsgroups and emailed.
It SHOULD be used when replying to the poster of an article to which It SHOULD be used when replying to the poster of an article to which
this one is a followup (see the Mail-Copies-To header in section 6.8) this one is a followup (see the Mail-Copies-To-header in section 6.8)
and it MAY be used when any article is also mailed to a recipient(s) and it MAY be used when any article is also mailed to a recipient(s)
identified in a To and/or Cc header that is also present. The "no" identified in a To- and/or Cc-header that is also present. The "no"
keyword is included for the sake of completeness; it MAY be used to keyword is included for the sake of completeness; it MAY be used to
indicate the opposite state, but is redundant insofar as it only indicate the opposite state, but is redundant insofar as it only
describes the default state when this header is absent. describes the default state when this header is absent.
This header, if present, MUST be included in both the posted and This header, if present, MUST be included in both the posted and
emailed versions of the article. The Newsgroups header of the posted emailed versions of the article. The Newsgroups-header of the posted
article SHOULD be included in the email version as recommended in article SHOULD be included in the email version as recommended in
section 5.5. All other headers defined in this standard (excluding section 5.5. All other headers defined in this standard (excluding
variant headers, but including specifically the Message-ID header) variant headers, but including specifically the Message-ID-header)
MUST be identical in both the posted and mailed versions of the MUST be identical in both the posted and mailed versions of the
article, and so MUST the body. article, and so MUST the body.
NOTE: This leaves open the question of whether a To or a Cc NOTE: This leaves open the question of whether a To- or a Cc-
header should appear in the posted version. Naturally, a Bcc header should appear in the posted version. Naturally, a Bcc-
header should not appear, except in a form which indicates that header should not appear, except in a form which indicates that
there are additional unspecified recipients. there are additional unspecified recipients.
6.10. References 6.10. References
The References header lists optionally CFWS-separated message The References-header lists CFWS-separated message identifiers of
identifiers of precursors. The content syntax makes use of syntax precursors. The content syntax makes use of syntax defined in [RFC
defined in [RFC 2822]. 2822].
header =/ References-header
References-header = "References" ":" SP References-content
*( ";" other-parameter )
References-content = msg-id *( CFWS msg-id ) References-content = msg-id *( CFWS msg-id )
NOTE: This differs from the syntax of [RFC 2822] by requiring at NOTE: This differs from the syntax of [RFC 2822] by requiring at
least one CFWS between the msg-ids (this was an [RFC 1036] least one CFWS between the msg-ids (a SP at this point was an
requirement). [RFC 1036] requirement).
A followup MUST have a References header, and an article that is not A followup MUST have a References-header, and an article that is not
a followup MUST NOT have a References header. In a followup, if the a followup MUST NOT have a References-header. In a followup, if the
precursor did not have a References header, the followup's precursor did not have a References-header, the followup's
References-content MUST be formed by the message identifier of the References-content MUST be formed by the message identifier of the
precursor. A followup to an article which had a References header precursor. A followup to an article which had a References-header
MUST have a References header containing the precursor's References- MUST have a References-header containing the precursor's References-
content (subject to trimming as described below) plus the precursor's content (subject to trimming as described below) plus the precursor's
message identifier appended to the end of the list (separated from it message identifier appended to the end of the list (separated from it
by CFWS). by CFWS).
Followup agents SHOULD NOT trim message identifiers out of a Followup agents SHOULD NOT trim message identifiers out of a
References header unless the number of message identifiers exceeds References-header unless the number of message identifiers exceeds
21, at which time trimming SHOULD be done by removing sufficient 21, at which time trimming SHOULD be done by removing sufficient
identifiers starting with the second so as to bring the total down to identifiers starting with the second so as to bring the total down to
21 (but the first message identifier MUST NOT be trimmed). However, 21 (but the first message identifier MUST NOT be trimmed). However,
it would be wrong to assume that References headers containing more it would be wrong to assume that References-headers containing more
than 21 message identifiers will not occur. than 21 message identifiers will not occur.
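The construction and trimming rules above can be sketched in Python as follows. This is an illustration only and is not taken from either draft: the names build_references, format_references and MAX_REFS are hypothetical, and the msg-ids are assumed to have been extracted already as plain strings.

```python
MAX_REFS = 21  # trimming threshold given in the text above


def build_references(precursor_refs, precursor_msgid):
    """Return the msg-ids for a followup's References-content.

    precursor_refs:  msg-ids from the precursor's References header
                     (an empty list if it had none).
    precursor_msgid: the precursor's own message identifier.
    """
    ids = list(precursor_refs) + [precursor_msgid]
    excess = len(ids) - MAX_REFS
    if excess > 0:
        # Keep the first identifier; remove identifiers starting with
        # the second until only MAX_REFS remain.
        ids = [ids[0]] + ids[1 + excess:]
    return ids


def format_references(ids):
    # A single SP is the simplest CFWS separator between msg-ids.
    return "References: " + " ".join(ids)
```

A reader of such headers still has to accept more than 21 identifiers, since the trimming is only a SHOULD.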
6.10.1. Examples 6.10.1. Examples
References: <i4g587y@site1.example> References: <i4g587y@site1.example>
References: <i4g587y@site1.example> <kgb2231+ee@site2.example> References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
References: <i4g587y@site1.example> <kgb2231+ee@site2.example> References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
<222@site1.example> <87tfbyv@site7.example> <222@site1.example> <87tfbyv@site7.example>
<67jimf@site666.example> <67jimf@site666.example>
References: <i4g587y@site1.example> <kgb2231+ee@site2.example> References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
<tisjits@smeghead.example> <tisjits@smeghead.example>
6.11. Expires 6.11. Expires
The Expires header specifies a date and time when the article is The Expires-header specifies a date and time when the article is
deemed to be no longer relevant and could usefully be removed deemed to be no longer relevant and could usefully be removed
("expired"). The content syntax makes use of syntax defined in [RFC ("expired"). The content syntax makes use of syntax defined in [RFC
2822]. 2822].
header =/ Expires-header
Expires-header = "Expires" ":" SP Expires-content
*( ";" other-parameter )
Expires-content = date-time Expires-content = date-time
An Expires header should only be used in an article if the requested An Expires-header should only be used in an article if the requested
expiry time is earlier or later than the time typically to be expiry time is earlier or later than the time typically to be
expected for such articles. Local policy for each serving agent will expected for such articles. Local policy for each serving agent will
dictate whether and when this header is obeyed and posters SHOULD NOT dictate whether and when this header is obeyed and posters SHOULD NOT
depend on it being completely followed. depend on it being completely followed.
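As a rough sketch of how a serving agent might consult an Expires-content, the following Python uses the standard library's RFC 2822 date parser. The function name has_expired is hypothetical, and treating a date without usable zone information as UTC is an assumption of this sketch. Whether the result is acted upon at all remains a matter of local policy.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def has_expired(expires_content, now=None):
    """Return True if the requested expiry time has been reached."""
    expiry = parsedate_to_datetime(expires_content.strip())
    if expiry.tzinfo is None:
        # Assumption: treat a date without usable zone information as UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expiry
```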
6.12. Archive 6.12. Archive
This optional header provides an indication of the poster's intent This optional header provides an indication of the poster's intent
regarding preservation of the article in publicly accessible long- regarding preservation of the article in publicly accessible long-
term or permanent storage. term or permanent storage.
Archive-content = [CFWS] ("no" | "yes" ) [CFWS] header =/ Archive-header
Archive-header-parameter Archive-header = "Archive" ":" SP Archive-content
= Filename-token "=" value *( ";" ( Archive-parameter /
; for USENET-header-parameters see 4.1 other-parameter ) )
Filename-token = [CFWS] "filename" [CFWS] Archive-content = [CFWS] ("no" / "yes" ) [CFWS]
Archive-parameter = <a parameter with attribute "filename"
and any value>
The presence of an "Archive: no" header in an article indicates that The presence of an "Archive: no" header in an article indicates that
the poster does not permit redistribution from publicly accessible the poster does not permit redistribution from publicly accessible
long-term or permanent archives. The absence of this header, or an long-term or permanent archives. The absence of this header, or an
explicit "Archive: yes", indicates that the poster is willing for explicit "Archive: yes", indicates that the poster is willing for
such redistribution to take place. The optional Filename parameter such redistribution to take place. The optional "filename" parameter
can then be used to suggest a filename under which the article should can then be used to suggest a filename under which the article should
be stored. Further extensions to this standard may provide additional be stored. Further extensions to this standard may provide additional
parameters for administration of the archiving process. parameters for administration of the archiving process.
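The interpretation described above (absence of the header or "yes" permits archiving, "no" forbids it, and "filename" is only a suggestion) might be sketched as follows. The function names are hypothetical, and the parsing is deliberately simplified: it ignores comments and assumes that no quoted value contains a ";".

```python
def parse_archive(content):
    """Return (archive_permitted, suggested_filename_or_None)."""
    main, *params = content.split(";")
    permitted = main.strip().lower() != "no"
    filename = None
    for param in params:
        attribute, sep, value = param.partition("=")
        if sep and attribute.strip().lower() == "filename":
            filename = value.strip().strip('"')
    return permitted, filename


def archiving_permitted(headers):
    """headers: a dict mapping header names to their content."""
    content = headers.get("Archive")
    # Absence of the header means the poster permits archiving.
    return True if content is None else parse_archive(content)[0]
```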
NOTE: This standard does not attempt to define the length of NOTE: This standard does not attempt to define the length of
"long-term", since it is dependent on many factors, including "long-term", since it is dependent on many factors, including
the retention policies of individual sites, and the customs or the retention policies of individual sites, and the customs or
policies established for particular newsgroup or hierarchies. policies established for particular newsgroups or hierarchies.
NOTE: Posters are cautioned that some sites may not implement NOTE: Posters are cautioned that some sites may not implement
the "no" option of the Archive header correctly. In some the "no" option of the Archive-header correctly. In some
jurisdictions non-compliance with this header may constitute a jurisdictions non-compliance with this header may constitute a
breach of copyright or of other legal provisions. Moreover, breach of copyright or of other legal provisions. Moreover,
even if this header prevents the poster's words from being even if this header prevents the poster's words from being
archived publicly, it does nothing to prevent the archiving of a archived publicly, it does nothing to prevent the archiving of a
followup in which those words are quoted. followup in which those words are quoted.
6.13. Control 6.13. Control
The Control header marks the article as a control message, and The Control-header marks the article as a control message, and
specifies the desired actions (other than the usual ones of storing specifies the desired actions (additional to the usual ones of
and/or relaying the article). storing and/or relaying the article).
Control-content = CONTROL-verb CONTROL-argument header =/ Control-header
CONTROL-verb = <the verb defined in this standard Control-header = "Control" ":" SP Control-content
(or an extension of it) for a specific *( ";" other-parameter )
CONTROL message> Control-content = [CFWS] control-message [CFWS]
verb = token control-message = <empty>
CONTROL-arguments = <the argument defined in this standard
(or an extension of it) for a specific However, the rule given above for control-message is incomplete.
CONTROL message> Further alternatives will be added incrementally as the various
arguments = *( CFWS value ) ; see 4.1 control-messages are introduced in section 7, or in extensions to
[Observe that <value> reqires the use of a quoted-string if any this standard, using the "=/" notation defined in [RFC 2234]. For
tspecials or non-ASCII characters are involved. This is a restriction on example, a typical CONTROL-message would be defined as follows:
present usage, but follows MIME practice.]
control-message =/ CONTROL-message
CONTROL-message = "CONTROL" CONTROL-arguments
CONTROL-arguments = <the argument(s) specific to that
CONTROL-message>
where "CONTROL" is a "verb" which is (and MUST be) of the syntactic
form of a token and CONTROL-arguments MUST be of the syntactic form
of a CFWS-separated list of values (which may require the use of
quoted-strings if any tspecials or non-ASCII characters are
involved).
The verb indicates what action should be taken, and the argument(s) The verb indicates what action should be taken, and the argument(s)
(if any) supply details. In some cases, the body of the article may (if any) supply details. In some cases, the body of the article may
also contain details. Section 7 describes all of the standard verbs. also contain details.
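A minimal sketch of splitting a Control-content into its verb and arguments is given below. The name parse_control is hypothetical; the sketch assumes that arguments are separated by white space, that quoted-strings use backslash escapes, and that no comments are present.

```python
import re

# Either a quoted-string (group 1) or a run of non-space characters (group 2).
_ITEM = re.compile(r'"((?:[^"\\]|\\.)*)"|(\S+)')


def parse_control(content):
    """Return (verb, [arguments]) for a Control-content string."""
    parts = []
    for quoted, bare in _ITEM.findall(content):
        if bare:
            parts.append(bare)
        else:
            # Undo backslash escapes inside the quoted-string.
            parts.append(re.sub(r"\\(.)", r"\1", quoted))
    if not parts:
        raise ValueError("empty Control-content")
    return parts[0], parts[1:]
```

What each verb requires of its arguments is left to the definition of the individual control message.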
An article with a Control header MUST NOT also have a Supersedes header. An article with a Control-header MUST NOT also have a Supersedes-header.
NOTE: The presence of a Subject header starting with the string NOTE: The presence of a Subject-header starting with the string
"cmsg " and followed by a Control-content MUST NOT be construed, "cmsg " and followed by a Control-message MUST NOT be construed,
in the absence of a proper Control header, as a request to in the absence of a proper Control-header, as a request to
perform that control action (as may have occurred in some legacy perform that control action (as may have occurred in some legacy
software). See also section 5.4. software). See also section 5.4.
6.14. Approved 6.14. Approved
The Approved header indicates the mailing addresses (and possibly the The Approved-header indicates the mailing addresses (and possibly the
full names) of the persons or entities approving the article for full names) of the persons or entities approving the article for
posting. posting.
header =/ Approved-header
Approved-header = "Approved" ":" SP Approved-content
*( ";" other-parameter )
Approved-content = From-content ; see 5.2 Approved-content = From-content ; see 5.2
Each mailbox contained in the Approved-content MUST be that of the Each mailbox contained in the Approved-content MUST be that of one of
person or entity in question, and one of those mailboxes MUST be that the person(s) or entity(ies) in question, and one of those mailboxes
of the actual injector of the article. MUST be that of the actual injector of the article.
An Approved header is required in all postings to moderated An Approved-header is required in all postings to moderated
newsgroups. If this header is not present in such postings, then newsgroups. If this header is not present in such postings, then
relaying and serving agents MUST reject the article. Please see relaying and serving agents MUST reject the article. Please see
section 8.2.2 for how injecting agents should treat postings to section 8.2.2 for how injecting agents should treat postings to
moderated groups that do not contain this header. moderated groups that do not contain this header.
An Approved header is also required in certain control messages, to An Approved-header is also required in certain control messages, to
reduce the risk of accidental posting of same. reduce the risks of accidental or unauthorized posting of same.
NOTE: The presence of an Approved header indicates that the NOTE: The presence of an Approved-header indicates that the
person or entity identified claims to have the necessary person or entity identified claims to have the necessary
authority to post the article in question, thus enabling sites authority to post the article in question, thus enabling sites
that dispute that authority to refuse to accept or to act upon that dispute that authority to refuse to accept or to act upon
it. However, the mere presence of the header is insufficient to it. However, the mere presence of the header is insufficient to
provide assurance that it indeed originated from that person or provide assurance that it indeed originated from that person or
entity, and it is therefore desirable that it be included within entity, and it is therefore desirable that it be included within
some digital signature scheme (see 7.1), especially in the case some digital signature scheme (see 7.1), especially in the case
of control messages (section 7). of control messages (section 7).
6.15. Supersedes 6.15. Supersedes
The Supersedes header contains a message identifier specifying an The Supersedes-header contains a message identifier specifying an
article to be superseded upon the arrival of this one. The specified article to be superseded upon the arrival of this one. The specified
article MUST be treated as though a "cancel" control message had article MUST be treated as though a "cancel" control message had
arrived for the article (but observe that a site MAY choose not to arrived for the article (but observe that a site MAY choose not to
honour a "cancel" message, especially if its authenticity is in honour a "cancel" message, especially if its authenticity is in
doubt). The content syntax makes use of syntax defined in [RFC 2822]. doubt). The content syntax makes use of syntax defined in [RFC 2822].
header =/ Supersedes-header
Supersedes-header = "Supersedes" ":" SP Supersedes-content
*( ";" other-parameter )
Supersedes-content = msg-id Supersedes-content = msg-id
NOTE: There is no "c" in "Supersedes". NOTE: There is no "c" in "Supersedes".
If an article contains a Supersedes header, then the old article NOTE: The Supersedes-header defined here has no connection with
the Supersedes-header that sometimes appears in Email messages
converted from X.400 according to [RFC 2156]; in particular, the
syntax here permits only one msg-id in contrast to the multiple
msg-ids in that Email version.
If an article contains a Supersedes-header, then the old article
mentioned SHOULD be withdrawn from circulation or access, as in a mentioned SHOULD be withdrawn from circulation or access, as in a
cancel message (7.3), and the new article inserted into the system as cancel message (7.3), and the new article inserted into the system as
any other new article would have been. any other new article would have been.
Whatever security or authentication checks are normally applied to a Whatever security or authentication checks are normally applied to a
Control cancel message (or may be prescribed for such messages by Control cancel message (or may be prescribed for such messages by
some extension to this standard - see the remarks in 7.1 and 7.3) some extension to this standard - see the remarks in 7.1 and 7.3)
MUST also be applied to an article with a Supersedes header. In the MUST also be applied to an article with a Supersedes-header. In the
event of the failure of such checks, the article SHOULD be discarded, event of the failure of such checks, the article SHOULD be discarded,
or at most stored as an ordinary article. or at most stored as an ordinary article.
6.16. Xref 6.16. Xref
The Xref header is a variant header (4.2.2.3) which indicates where The Xref-header is a variant header (4.2.5.3) which indicates where
an article was filed by the last server to process it. an article was filed by the last server to process it.
Xref-content = [CFWS] server-name 1*( CFWS location ) header =/ Xref-header
Xref-header = "Xref" ":" SP Xref-content
*( ";" other-parameter )
Xref-content = [CFWS] server-name 1*( CFWS location ) [CFWS]
server-name = path-identity ; see 5.6.1 server-name = path-identity ; see 5.6.1
location = newsgroup-name ":" article-locator location = newsgroup-name ":" article-locator
article-locator = 1*( %x21-7E ) ; US-ASCII printable characters article-locator = 1*( %x21-27 / %x29-3A / %x3C-7E )
; US-ASCII printable characters
; except '(' and ';'
The server-name is included so that software can determine which The server-name is included so that software can determine which
serving agent generated the header. The locations specify what serving agent generated the header. The locations specify what
newsgroups the article was filed under (which may differ from those newsgroups the article was filed under (which may differ from those
in the Newsgroups header) and where it was filed under them. The in the Newsgroups-header) and where it was filed under them. The
exact form of an article-locator is implementation-specific. exact form of an article-locator is implementation-specific.
NOTE: The traditional form of an article-locator is a decimal NOTE: The traditional form of an article-locator is a decimal
number, with articles in each newsgroup numbered consecutively number, with articles in each newsgroup numbered consecutively
starting from 1. NNTP demands that such a model be provided, and starting from 1. NNTP demands that such a model be provided, and
much other software expects it, but it seems desirable to permit much other software expects it, but it seems desirable to permit
flexibility for unorthodox implementations. flexibility for unorthodox implementations.
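The Xref-content syntax above can be read with a few lines of Python. This is a sketch only: parse_xref is a hypothetical name, comments are not handled, and the newsgroup-name is assumed to contain no ":" so that the first colon separates it from the article-locator.

```python
def parse_xref(content):
    """Return (server_name, [(newsgroup_name, article_locator), ...])."""
    fields = content.split()
    if len(fields) < 2:
        raise ValueError("Xref-content needs a server-name and a location")
    server_name, locations = fields[0], []
    for field in fields[1:]:
        newsgroup, sep, locator = field.partition(":")
        if not (newsgroup and sep and locator):
            raise ValueError("malformed location: %r" % field)
        # The locator is kept as a string because its form is
        # implementation-specific and need not be a decimal number.
        locations.append((newsgroup, locator))
    return server_name, locations
```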
An agent inserting an Xref header into an article MUST delete any An agent inserting an Xref-header into an article MUST delete any
previous Xref header(s). A relaying agent MAY delete it before previous Xref-header(s). A relaying agent MAY delete it before
relaying, but otherwise it SHOULD be ignored (and usually replaced) relaying, but otherwise it SHOULD be ignored by any relaying or
by any relying or serving agent receiving it. serving agent receiving it.
An agent MUST use the same serving-name in Xref headers as the path-identity it uses in Path headers. An agent MUST use the same serving-name in Xref-headers as the path-identity it uses in Path-headers.
6.17. Lines 6.17. Lines
The Lines header indicates the number of lines in the body of the The Lines-header indicates the number of lines in the body of the
article. article.
Lines-content = [CFWS] 1*digit header =/ Lines-header
Lines-header = "Lines" ":" SP Lines-content
*( ";" other-parameter )
Lines-content = [CFWS] 1*DIGIT [CFWS]
The line count includes all body lines, including the signature if The line count includes all body lines, including the signature if
any, including empty lines (if any) at the beginning or end of the any, including empty lines (if any) at the beginning or end of the
body, and including the whole of all MIME message and multipart parts body, and including the whole of all MIME message and multipart parts
contained in the body (the single empty separator line between the contained in the body (the single empty separator line between the
headers and the body is not part of the body). The "body" here is the headers and the body is not part of the body). The "body" here is the
body as found in the posted article as transmitted by the posting body as found in the posted article as transmitted by the posting
agent. agent.
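The counting rule above might be sketched as follows, assuming the article is held as a single string with CRLF line endings (an assumption of this sketch, as is the name count_body_lines). The empty separator line is excluded, while empty lines at the start or end of the body are counted.

```python
def count_body_lines(article):
    """Count the lines in the body of an article with CRLF line endings."""
    head, sep, body = article.partition("\r\n\r\n")
    if not sep or body == "":
        return 0  # no separator line or nothing after it, so no body lines
    lines = body.split("\r\n")
    if lines[-1] == "":
        lines.pop()  # the final CRLF terminates the last line
    return len(lines)
```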
This header is to be regarded as obsolete, and it will likely be This header is to be regarded as obsolete, and it will likely be
removed entirely in a future version of this standard. In the removed entirely in a future version of this standard. In the
meantime, its use is deprecated. meantime, its use is deprecated.
6.18. User-Agent 6.18. User-Agent
The User-Agent header contains information about the user agent The User-Agent-header contains information about the user agent
(typically a newsreader) generating the article, for statistical (typically a newsreader) generating the article, for statistical
purposes and tracing of standards violations to specific software purposes and tracing of standards violations to specific software
needing correction. Although not one of the mandatory headers, needing correction. Although not one of the mandatory headers,
posting agents SHOULD normally include it. posting agents SHOULD normally include it.
header =/ User-Agent-header
User-Agent-header = "User-Agent" ":" SP User-Agent-content
*( ";" other-parameter )
User-Agent-content = product-token *( CFWS product-token ) User-Agent-content = product-token *( CFWS product-token )
product-token = value ["/" product-version] ; see 4.1 product-token = value ["/" product-version] ; see 4.1
product-version = value product-version = value
This header MAY contain multiple product-tokens identifying the agent This header MAY contain multiple product-tokens identifying the agent
and any subproducts which form a significant part of the posting and any subproducts which form a significant part of the posting
agent, listed in order of their significance for identifying the agent, listed in order of their significance for identifying the
application. Product-tokens should be short and to the point - they application. Product-tokens should be short and to the point - they
MUST NOT be used for information beyond the canonical name of the MUST NOT be used for information beyond the canonical name of the
product and its version. Injecting agents MAY include product product and its version. Injecting agents MAY include product
information for servers (such as "INN/1.7.2"), but serving and information for themselves (such as "INN/1.7.2"), but relaying and
relaying agents MUST NOT generate or modify this header to list serving agents MUST NOT generate or modify this header to list
themselves. themselves.
NOTE: Variations from [RFC 2616] which describes a similar NOTE: Variations from [RFC 2616] which describes a similar
facility for the HTTP protocol: facility for the HTTP protocol:
1. use of arbitrary text or octets from character sets other 1. use of arbitrary text or octets from character sets other
than US-ASCII in a product-token may require the use of a than US-ASCII in a product-token may require the use of a
quoted-string, quoted-string,
2. "{" and "}" are allowed in a value (product-token and 2. "{" and "}" are allowed in a value (product-token and
product-version) in Netnews, product-version) in Netnews,
skipping to change at page 51, line 48 skipping to change at page 55, line 42
User-Agent: Gnus/5.4.64 XEmacs/20.3beta17 ("Bucharest") User-Agent: Gnus/5.4.64 XEmacs/20.3beta17 ("Bucharest")
User-Agent: Pluto/1.05h (RISC-OS/3.1) NewsHound/1.30 User-Agent: Pluto/1.05h (RISC-OS/3.1) NewsHound/1.30
User-Agent: inn/1.7.2 User-Agent: inn/1.7.2
User-Agent: telnet User-Agent: telnet
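Examples such as these can be split into product-tokens with a short Python sketch. The name parse_user_agent is hypothetical, and the sketch assumes that comments are not nested and that no product-token is a quoted-string.

```python
import re

_COMMENT = re.compile(r"\([^()]*\)")  # a non-nested comment


def parse_user_agent(content):
    """Return [(product, version_or_None), ...] in order of appearance."""
    without_comments = _COMMENT.sub(" ", content)
    products = []
    for token in without_comments.split():
        name, _, version = token.partition("/")
        products.append((name, version or None))
    return products
```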
NOTE: This header supersedes the role performed redundantly by NOTE: This header supersedes the role performed redundantly by
experimental headers such as X-Newsreader, X-Mailer, X-Posting- experimental headers such as X-Newsreader, X-Mailer, X-Posting-
Agent, X-Http-User-Agent, and other headers previously used on Agent, X-Http-User-Agent, and other headers previously used on
Usenet for this purpose. Use of these experimental headers Usenet for this purpose. Use of these experimental headers
SHOULD be discontinued in favor of the single, standard User- SHOULD be discontinued in favor of the single, standard User-
Agent header which can be used freely both in Netnews and mail. Agent-header which can be used freely both in Netnews and Email
(except that non-ASCII characters would be inappropriate in
email).
6.19. Injector-Info 6.19. Injector-Info
The Injector-Info header SHOULD be added to each article by the The Injector-Info-header SHOULD be added to each article by the
injecting agent in order to provide information as to how that injecting agent in order to provide information as to how that
article entered the Netnews system and to assist in tracing its true article entered the Netnews system and to assist in tracing its true
origin. origin.
header =/ Injector-Info-header
Injector-Info-header
= "Injector-Info" ":" SP Injector-Info-content
*( ";" ( Injector-Info-parameter /
other-parameter ) )
Injector-Info-content Injector-Info-content
= path-identity = [CFWS] path-identity [CFWS]
Injector-Info-header-parameter Injector-Info-parameter
= posting-host-parameter / = posting-host-parameter /
posting-account-parameter / posting-account-parameter /
posting-sender-parameter / posting-sender-parameter /
posting-logging-parameter / posting-logging-parameter /
posting-date-parameter posting-date-parameter
; for USENET-header-parameters see 4.1 ; for {USENET}-parameters see 4.1
posting-host-parameter posting-host-parameter
= [CFWS] "posting-host" [CFWS] "=" [CFWS] = <a parameter with attribute "posting-host"
( host-value / and value some host-value>
DQUOTE host-value DQUOTE ) [CFWS]
host-value = dot-atom / host-value = dot-atom /
[ dot-atom ":" ] [ dot-atom ":" ]
( dotted-quad / ; see [RFC 820] ( IPv4address / IPv6address ); see [RFC 2373]
ipv6-numeric ) ; see [RFC 2373]
posting-account-parameter posting-account-parameter
= [CFWS] "posting-account" [CFWS] "=" value = <a parameter with attribute "posting-account"
and any value>
posting-sender-parameter posting-sender-parameter
= [CFWS] "sender" [CFWS] "=" [CFWS] = <a parameter with attribute "sender"
( sender-value / and value some sender-value>
DQUOTE sender-value DQUOTE ) [CFWS] sender-value = mailbox / "verified"
sender-value = ( mailbox / "verified" )
posting-logging-parameter posting-logging-parameter
= [CFWS] "logging-data" [CFWS] "=" value = <a parameter with attribute "logging-data"
and any value>
posting-date-parameter posting-date-parameter
= [CFWS] "posting-date" [CFWS] "=" [CFWS] = <a parameter with attribute "posting-date"
( date-value / and value some date-time>
DQUOTE date-value DQUOTE ) [CFWS]
date-value = 1*DIGIT [ ":" date-time ]
An Injector-Info header MUST NOT be added to an article by any agent An Injector-Info-header MUST NOT be added to an article by any agent
other than an injecting agent. Any Injector-Info header present when other than an injecting agent. Any Injector-Info-header present when
an article arrives at an injecting agent MUST be removed. In an article arrives at an injecting agent MUST be removed. In
particular if, for some exceptional reason (8.2.2), an article gets particular if, for some exceptional reason (8.2.2), an article gets
injected twice, the Injector-Info header will always relate to the injected twice, the Injector-Info-header will always relate to the
second injection. second injection.
The path-identity MUST be the same as the path-identity prepended to The path-identity MUST be the same as the path-identity prepended to
the Path header by that same injecting agent which, following section the Path-header by that same injecting agent which, following section
5.6.2, MUST therefore be a fully qualified domain name (FQDN) 5.6.2, MUST therefore be a fully qualified domain name (FQDN)
mailable address. mailable address.
Although comments and folding of white space are permitted throughout Although comments and folding of white space are permitted throughout
the Injector-Info-content specification, it is RECOMMENDED that the Injector-Info-content specification, it is RECOMMENDED that
folding is not used within any header-parameter (but only before or folding is not used within any parameter (but only before or after
after the ";" separating parameters), and that comments are only used the ";" separating those parameters), and that comments are only used
following the last parameter. It is also RECOMMENDED that such following the last parameter. It is also RECOMMENDED that such
parameters as are present are included in the order in which they parameters as are present are included in the order in which they
have been defined in the syntax above. An injecting agent SHOULD use have been defined in the syntax above. An injecting agent SHOULD use
a consistent form of this header for all articles emanating from the a consistent form of this header for all articles emanating from the
same or similar origins. same or similar origins.
NOTE: The effect of those recommendations is to facilitate the NOTE: The effect of those recommendations is to facilitate the
recognition of articles arising from certain designated origins recognition of articles arising from certain designated origins
(as in the so-called "killfiles" which are available in some (as in the so-called "killfiles" which are available in some
reading agents). Observe that the order within the syntax has reading agents). Observe that the order within the syntax has
been chosen to place last those parameters which are most likely been chosen to place last those parameters which are most likely
to change between successive articles posted from the same to change between successive articles posted from the same
origin. origin.
NOTE: To comply with the overall "attribute = value" syntax of NOTE: To comply with the overall "attribute = value" syntax of
USENET-header-parameters, any value containing an ipv6-numeric, parameters, any value containing an IPv6address, a date-time, a
a date-time, a mailbox or any CFWS MUST be quoted using mailbox or any CFWS MUST be quoted using <DQUOTE>s (the quoting
<DQUOTE>s (the quoting is optional in other cases). is optional in other cases).
NOTE: This header is intended to replace various currently-used NOTE: This header is intended to replace various currently-used
but nowhere-documented headers such as "NNTP-Posting-Host", but nowhere-documented headers such as "NNTP-Posting-Host",
"NNTP-Posting-Date" amd "X-Trace". Any of these headers present "NNTP-Posting-Date" and "X-Trace". These headers are now
when an article arrives at an injecting agent SHOULD also be deprecated, and any of them present when an article arrives at
removed as above. an injecting agent SHOULD also be removed as above.
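The ordering and quoting recommendations above might be applied by an injecting agent roughly as follows. This is a sketch under stated assumptions: the names ORDER, quote_if_needed and build_injector_info are hypothetical, the set of characters accepted in an unquoted value is modelled on a MIME token (with "{" and "}" allowed) rather than taken from section 4.1, and "; " is used as the separator between parameters.

```python
import re

# Assumption: characters permitted in an unquoted value.
_UNQUOTED = re.compile(r"^[A-Za-z0-9!#$%&'*+\-.^_`{|}~]+$")

# Parameters in the order in which the syntax above defines them.
ORDER = ["posting-host", "posting-account", "sender",
         "logging-data", "posting-date"]


def quote_if_needed(value):
    """Wrap value in DQUOTEs unless it is safe to leave unquoted."""
    if _UNQUOTED.match(value):
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_injector_info(path_identity, params):
    """params: dict mapping attribute names to unquoted values."""
    parts = [path_identity]
    for attribute in ORDER:
        if attribute in params:
            parts.append("%s=%s" % (attribute, quote_if_needed(params[attribute])))
    return "Injector-Info: " + "; ".join(parts)
```

With this character set an IPv4 address stays unquoted, while an IPv6 address, a date-time or a mailbox is quoted because of its ":", "," or "@".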
6.19.1. Usage of Injector-Info-header-parameters 6.19.1. Usage of Injector-Info-parameters
The purpose of these parameters is to enable the injecting agent to The purpose of these parameters is to enable the injecting agent to
make assertions about the origin of the article, in fulfilment of its make assertions about the origin of the article, in fulfilment of its
responsibilities towards the rest of the network as set out in responsibilities towards the rest of the network as set out in
section 8.2. These assertions can then be utilized as follows: section 8.2. These assertions can then be utilized as follows:
1. To enable the administrator of the injecting agent to respond to 1. To enable the administrator of the injecting agent to respond to
complaints and queries concerning the article. For this purpose, complaints and queries concerning the article. For this purpose,
the parameters included SHOULD be sufficient to enable the the parameters included SHOULD be sufficient to enable the
administrator to identify its true origin (which parameters are administrator to identify its true origin (which parameters are
skipping to change at page 53, line 49 skipping to change at page 57, line 43
there is no benefit in including parameters which contribute there is no benefit in including parameters which contribute
nothing to this aim). An administrator MAY, with those parameters nothing to this aim). An administrator MAY, with those parameters
where the syntax so allows, use cryptic notations interpretable where the syntax so allows, use cryptic notations interpretable
only by himself if he considers it appropriate to protect the only by himself if he considers it appropriate to protect the
privacy of that origin. privacy of that origin.
2. To enable relaying, serving and reading agents to recognize 2. To enable relaying, serving and reading agents to recognize
articles from origins which they might wish to reject, divert, or articles from origins which they might wish to reject, divert, or
otherwise handle specially, for reasons of site policy. otherwise handle specially, for reasons of site policy.
3. To enable the timely identification of spews af articles arising 3. To enable the timely identification of spews of articles arising
from a common origin. from a common origin.
An injecting agent MUST NOT include any Injector-Info-header- An injecting agent MUST NOT include any Injector-Info-parameter
parameter unless it has positive evidence of its correctness. An unless it has positive evidence of its correctness. An injecting
injecting agent MAY include other-header-parameters with x-token agent MAY include other-parameters with x-token attributes which will
attributes which will assist in identifying the origin of the assist in identifying the origin of the article.
article.
NOTE: It will be observed that the range of parameters provided NOTE: Administrators of injecting agents can choose which
allows much choice as to the precise manner in which an injecting selection of the following parameters best enables them to fulfil
agent fulfils its responsibilities. Whilst this standard does not their responsibilities. Some of these parameters identify the
source of the article explicitly whereas others do so indirectly,
thus affording more privacy to posters who value their anonymity,
but also making harder the tracking of malicious disruption of the
network, especially so if the administrators choose not to
cooperate. There is thus a balance to be struck between the needs
of privacy on the one hand and the good order of Usenet on the
seek to establish any preferences in this matter, administrators other, and administrators need to be aware of this when
of injecting agents need to be aware of the privacy implications formulating their policies.
of the choices that they make.
6.19.1.1. The posting-host-parameter 6.19.1.1. The posting-host-parameter
If a dot-atom is present, it MUST be a FQDN identifying the specific If a dot-atom is present, it MUST be a FQDN identifying the specific
host from which the injecting agent received the article. host from which the injecting agent received the article.
Alternatively, an IP address (dotted-quad or ipv6-numeric) identifies Alternatively, an IP address (IPv4address or IPv6address) identifies
that host. If both forms are present, then they MUST identify the that host. If both forms are present, then they MUST identify the
same host, or at least have done so at the time the article was same host, or at least have done so at the time the article was
injected. injected.
NOTE: It is commonly the case that this header identifies a NOTE: It is commonly the case that this parameter identifies a
dial-up point-of-presence, in which case a posting-account or dial-up point-of-presence, in which case a posting-account or
logging-data may need to be consulted to find the true origin of logging-data may need to be consulted to find the true origin of
the article. the article.
6.19.1.2. The posting-account-parameter 6.19.1.2. The posting-account-parameter
This parameter identifies the source from which the injecting agent This parameter identifies the source from which the injecting agent
received the article. It MAY be in a cryptic notation understandable received the article. It SHOULD be in a cryptic notation
only by the administrator of the injecting agent, but it MUST be such understandable only by the administrator of the injecting agent, but
that a given source always gives rise to the same posting-account (if it MUST be such that a given source gives rise to the same posting-
the injecting agent is unable to meet that obligation, then it should account, at least in the short term. If the injecting agent is unable
use a posting-logging-parameter instead). to meet that obligation, then it should use a posting-logging-
parameter instead.
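One way an administrator might satisfy both halves of this requirement (a notation that is cryptic to outsiders, yet stable for a given source) is a keyed hash of the internal account name. The following is only a sketch under stated assumptions: the function name, the "client-" prefix, the 16-character truncation and the choice of HMAC-SHA-256 are all illustrative and are not taken from either draft, and the draft's syntax for parameter values would still need to be checked against the result.

```python
import hashlib
import hmac


def posting_account_token(account_name: str, site_secret: bytes) -> str:
    """Derive a stable, opaque posting-account value (hypothetical scheme).

    The same account_name and site_secret always give the same token, so a
    given source always maps to the same posting-account. Only someone who
    holds site_secret can confirm which account a token belongs to.
    """
    digest = hmac.new(site_secret, account_name.encode("utf-8"), hashlib.sha256)
    # Truncation is an illustrative choice to keep the header short.
    return "client-" + digest.hexdigest()[:16]
```

Rotating the secret would change every token, which is compatible with the -07 wording ("at least in the short term") but not with the stricter -06 wording ("always").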
6.19.1.3. The posting-sender-parameter 6.19.1.3. The posting-sender-parameter
This parameter identifies the mailbox of the verified sender of the This parameter identifies the mailbox of the verified sender of the
article (alternatively, it uses the token "verified" to indicate that article (alternatively, it uses the token "verified" to indicate that
at least any addr-spec in the Sender header of the article, or in the at least any addr-spec in the Sender-header of the article, or in the
From header if the Sender header is absent, is correct). From-header if the Sender-header is absent, is correct).
NOTE: An injecting agent is unlikely to be able to make use of NOTE: An injecting agent is unlikely to be able to make use of
this parameter except in cases where it is running on a machine this parameter except in cases where it is running on a machine
which is aware of the user-space in which the posting agent is which is aware of the user-space in which the posting agent is
operating. This parameter should be used in preference to a operating. This parameter should be used in preference to a
posting-account-parameter in such situations. posting-account-parameter in such situations.
6.19.1.4. The posting-logging-parameter 6.19.1.4. The posting-logging-parameter
This parameter contains information (typically a serial number or a This parameter contains information (typically a session number or
session number) which will enable the true origin of the article to other non-persistent means of identifying a posting account) which
be determined by reference to logging information kept by the will enable the true origin of the article to be determined by
injecting agent. reference to logging information kept by the injecting agent.
6.19.1.5. The posting-date-parameter 6.19.1.5. The posting-date-parameter
This parameter identifies the time at which the article was injected This parameter identifies the time at which the article was injected
(as distinct from the Date header, which indicates when it was (as distinct from the Date-header, which indicates when it was
written). It is in the form of the number of seconds elapsed since written).
January 1st 1970, optionally followed by a date-time which MUST
indicate the same time.
6.19.2. Example 6.19.2. Example
Injector-Info: news2.isp.net; posting-host=modem-15.pop.isp.net; Injector-Info: news2.isp.net; posting-host=modem-15.pop.isp.net;
posting-account=client0002623; logging-data=2427; posting-account=client0002623; logging-data=2427;
posting-date="965243133: Wed 2 Aug 2000 20:05:33 -0100 (BST)" posting-date="Wed, 2 Aug 2000 20:05:33 -0100 (BST)"
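As an illustration of how a reading or serving agent might pick this header apart, the sketch below unfolds the value, takes the first element as the path-identity, and collects the remaining "name=value" parameters. It is a simplified reading of the example above, not the draft's grammar: it assumes parameters are separated by ";", that values are either bare tokens or double-quoted strings, and it ignores comments and backslash escapes. The function name is hypothetical, and the test value uses the -07 form of posting-date.

```python
import re


def parse_injector_info(value):
    """Return (path_identity, params) for an Injector-Info header value."""
    # Unfold: a line break followed by whitespace is one logical space.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", value).strip()
    # Split on ';' only where an even number of '"' follows,
    # i.e. where the ';' is outside a quoted-string.
    parts = re.split(r';(?=(?:[^"]*"[^"]*")*[^"]*$)', unfolded)
    path_identity = parts[0].strip()
    params = {}
    for part in parts[1:]:
        part = part.strip()
        if not part:
            continue
        name, _, raw = part.partition("=")
        raw = raw.strip()
        # Strip one pair of surrounding double quotes, if present.
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            raw = raw[1:-1]
        params[name.strip().lower()] = raw
    return path_identity, params


example = (
    "news2.isp.net; posting-host=modem-15.pop.isp.net;\n"
    "    posting-account=client0002623; logging-data=2427;\n"
    '    posting-date="Wed, 2 Aug 2000 20:05:33 -0100 (BST)"'
)
identity, params = parse_injector_info(example)
```

Here `identity` holds the injecting agent's path-identity and `params` maps each parameter name to its unquoted value.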
6.20. Complaints-To 6.20. Complaints-To
The Complaints-To header is added to an article by an injecting agent The Complaints-To-header is added to an article by an injecting agent
in order to indicate the mailbox to which complaints concerning the in order to indicate the mailbox to which complaints concerning the
poster of the article may be sent. poster of the article may be sent.
header =/ Complaints-To-header
Complaints-To-header
= "Complaints-To" ":" SP Complaints-To-content
Complaints-To-content Complaints-To-content
= mailbox = address-list
A Complaints-To header MUST NOT be added to an article by any agent A Complaints-To-header MUST NOT be added to an article by any agent
other than an injecting agent. Any Complaints-To header present when other than an injecting agent. Any Complaints-To-header present when
an article arrives at an injecting agent MUST be removed. In an article arrives at an injecting agent MUST be removed. In
particular if, for some exceptional reason (8.2.2), an article gets particular if, for some exceptional reason (8.2.2), an article gets
injected twice, the Complaints-To header will always relate to the injected twice, the Complaints-To-header will always relate to the
second injection. second injection.
The specified mailbox is for sending complaints concerning the The specified mailbox is for sending complaints concerning the
behaviour of the poster of the article; it SHOULD NOT be used for behaviour of the poster of the article; it SHOULD NOT be used for
matters concerning propagation, protocol problems, etc. In the matters concerning propagation, protocol problems, etc. which should
absence of this header, such complaints should be sent to "usenet@" be addressed to "usenet@" or "news@" the path-identity which was
or "news@" the path-identity which was prepended to the Path header prepended to the Path-header by the injecting agent, in accordance
by the injecting agent following section 5.6.2. with section 5.6.2. In the absence of this header, complaints
concerning a poster's behaviour MAY be addressed to "abuse@" that
path-identity (although section 5.6.2 provides no obligation for that
address to be mailable at an injecting agent that is not provided for
the use of the general public).
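The routing of complaints described in the right-hand (-07) wording can be sketched as follows. This is an illustration only: the function name is hypothetical, the headers are assumed to be available as a plain dictionary, the path-identity prepended by the injecting agent is assumed to be supplied by the caller, and the address-list is split naively on commas (which would break on display names containing commas).

```python
def complaint_addresses(headers, injector_path_identity):
    """Return (poster_complaints, propagation_complaints) address lists."""
    complaints_to = headers.get("Complaints-To")
    if complaints_to:
        # Naive split of the address-list; real parsing needs RFC 2822 rules.
        poster = [a.strip() for a in complaints_to.split(",") if a.strip()]
    else:
        # Fallback permitted (MAY) when the header is absent.
        poster = ["abuse@" + injector_path_identity]
    # Propagation and protocol problems never go to Complaints-To.
    propagation = [
        "usenet@" + injector_path_identity,
        "news@" + injector_path_identity,
    ]
    return poster, propagation
```

Note that the "abuse@" fallback is not guaranteed to be mailable at an injecting agent that is not provided for the general public, so a caller might treat a bounce from that address as unsurprising.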
6.21. MIME headers 6.21. MIME headers
6.21.1. Syntax 6.21.1. Syntax
The following headers, as defined within [RFC 2045] and its The following headers may be used within articles conforming to this
extensions, may be used within articles conforming to this standard. standard.
MIME-Version: MIME-Version: [RFC 2045]
Content-Type: Content-Type: [RFC 2045],[RFC 2046]
Content-Transfer-Encoding: Content-Transfer-Encoding: [RFC 2045]
Content-ID: Content-ID: [RFC 2045]
Content-Description: Content-Description: [RFC 2045]
Content-Disposition: Content-Disposition: [RFC 2183]
Content-MD5: Content-Location: [RFC 2557]
Content-MD5: [RFC 1864]
Insofar as the syntax for these headers, as given in [RFC 2045], does The RFCs listed are deemed to be incorporated into this standard to
the extent necessary to facilitate their usage within Netnews,
subject to the revised syntax of parameter given in this standard
(which permits UTF8-xtra-chars to appear within quoted-strings used as
values), and subject to curtailment of that usage as described in the
following sections. Moreover, extensions to those standards
registered in accordance with [RFC 2048] are also available for use
within Netnews, as indeed is any other header in the Content-* series
which has a sensible interpretation within Netnews.
Insofar as the syntax for these headers, as given in those RFCs does
not specify precisely where whitespace and comments may occur not specify precisely where whitespace and comments may occur
(whether in the form of WSP, FWS or CFWS), the usage defined in this (whether in the form of WSP, FWS or CFWS), the usage defined in this
standard, and failing that in [RFC 2822], and failing that in [RFC standard, and failing that in [RFC 2822], and failing that in [RFC
822] MUST be followed. In particular, there MUST NOT be any WSP 822] MUST be followed. In particular, there MUST NOT be any WSP
between a header-name and the following colon and there MUST be a SP between a header-name and the following colon and there MUST be a SP
following that colon. following that colon.
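The last rule (no WSP before the colon, a SP after it) is easy to check mechanically. The sketch below is a minimal illustration; it assumes the header-name may contain any printable US-ASCII character other than the colon, which is the RFC 2822 field-name range and may be wider than the header-name syntax this draft defines elsewhere. The names used are hypothetical.

```python
import re

# header-name (printable US-ASCII except ':'), then ':' immediately,
# then the mandatory SP.
HEADER_START = re.compile(r"^[!-9;-~]+: ")


def header_start_ok(line):
    """True if the line begins 'header-name: ' with no WSP before ':'."""
    return HEADER_START.match(line) is not None
```

For example, "Content-Type: text/plain" passes, whereas "Content-Type : text/plain" and "Content-Type:text/plain" do not.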
The meaning of the various MIME headers is as defined in [RFC 2045]
and [RFC 2046], and in extensions registered in accordance with [RFC
2048]. However, their usage is curtailed as described in the
following sections.
6.21.2. Content-Type 6.21.2. Content-Type
The Content-Type: "text/plain" is the default type for any news The Content-Type: "text/plain" is the default type for any news
article, but the recommendations and limits on line lengths set out article, but the recommendations and limits on line lengths set out
in section 4.5 Ought to be observed. in section 4.5 Ought to be observed.
The acceptability of other subtypes of Content-Type: "text" (such as The acceptability of other subtypes of Content-Type: "text" (such as
"text/html") is a matter of policy (see 1.1), and posters Ought Not "text/html") is a matter of policy (see 1.1), and posters Ought Not
to use them unless established policy or custom in the particular to use them unless established policy or custom in the particular
hierarchies or groups involved so allows. Moreover, even in those hierarchies or groups involved so allows. Moreover, even in those
cases, for the benefit of readers who see it only in its transmitted cases, for the benefit of readers who see it only in its transmitted
form, the material SHOULD be "pretty-printed" (for example by form, the material SHOULD be "pretty-printed" (for example by
restricting its line length as above and by keeping sequences which restricting its line length as above and by keeping sequences which
control its layout or style separate from the meaningful text). control its layout or style separate from the meaningful text).
In the same way, Content-Types requiring special processing for their In the same way, Content-Types requiring special processing for their
display, such as "application", "image", "audio", "video" and display, such as "application", "image", "audio", "video" and
"multipart/related" are discouraged except in groups specifically "multipart/related" are discouraged except in groups specifically
intended (by policy or custom) to include them. Exceptionally, those intended (by policy or custom) to include them. Exceptionally, those
application types defined in [RFC 1847] and [RFC 2015] and/or [RFC application types defined in [RFC 1847] and [RFC 3156] for use within
2015bis] for use within "multipart/signed" articles, and the type "multipart/signed" articles, and the type "application/pgp-keys" (or
"application/pgp-keys" (or other similar types containing digital other similar types containing digital certificates) may be used
certificates) may be used freely. freely.
Reading agents SHOULD NOT, unless explicitly configured otherwise, Reading agents SHOULD NOT, unless explicitly configured otherwise,
act automatically on Application types which could change the state act automatically on Application types which could change the state
of that agent (e.g. by writing or modifying files), except in the of that agent (e.g. by writing or modifying files), except in the
case of those prescribed for use in control messages (7.2.1.2 and case of those prescribed for use in control messages (7.2.1.2 and
7.2.4.1). 7.2.4.1).
6.21.2.1. Message/partial 6.21.2.1. Message/partial
The Content-Type "message/partial" MAY be used to split a long news The Content-Type "message/partial" MAY be used to split a long news
article into several smaller ones. article into several smaller ones.
NOTE: This Content-Type is not recommended for textual articles NOTE: This Content-Type is not recommended for textual articles
because the Content-Type, and in particular the charset, of the because the Content-Type, and in particular the charset, of the
complete article cannot be determined by examination of the complete article cannot be determined by examination of the
second and subsequent parts, and hence it is not possible to second and subsequent parts, and hence it is not possible to
read them as separate articles (except when they are written in read them as separate articles (except when they are written in
pure US-ASCII). Moreover, for full compliance with [RFC 2046] it pure US-ASCII). Moreover, for full compliance with [RFC 2046] it
would be necessary to use the "quoted-printable" encoding to would be necessary to use the "quoted-printable" encoding to
ensure the material was 7bit-safe. In any case, breaking such ensure the material was 7bit-safe. In any case, breaking such
long texts into several parts is usually unnecessary, since long texts into several parts is usually unnecessary, since
modern transport agents should have no difficulty in handling modern transport agents should have no difficulty in handling
articles of arbitrary length. articles of arbitrary length.
On the other hand, "message/partial" may be useful for binaries On the other hand, "message/partial" may be useful for binaries
of excessive length, since reading of the individual parts on of excessive length, since reading of the individual parts on
their own is not required and they would in any case be encoded their own is not required and they would in any case be encoded
in a manner that was 7bit-safe. in a manner that was 7bit-safe.
If this Content-Type is used, then the "id" parameter SHOULD be in If this Content-Type is used, then the "id" parameter SHOULD be in
the form of a unique message identifier (but different from that in the form of a unique message identifier (but different from that in
the Message-ID header of any of the parts). The second and subsequent the Message-ID-header of any of the parts). The second and subsequent
parts SHOULD contain References headers referring to all the previous parts SHOULD contain References-headers referring to all the previous
parts, thus enabling reading agents with threading capabilities to parts, thus enabling reading agents with threading capabilities to
present them in the correct order. Reading agents MAY then provide a present them in the correct order. Reading agents MAY then provide a
facility to recombine the parts into a single article (but this facility to recombine the parts into a single article (but this
standard does not require them to do so). standard does not require them to do so).
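The header arrangement described above might be generated along the following lines. This is a sketch, not a posting agent: the function name is hypothetical, the message identifiers are assumed to be supplied by the caller, and only the three headers discussed here are produced. The "number" and "total" parameters come from the definition of "message/partial" in [RFC 2046].

```python
def partial_headers(part_ids, partial_id):
    """Build Message-ID, Content-Type and References for each part.

    part_ids   -- Message-IDs of the parts, in order (with angle brackets)
    partial_id -- the shared "id" parameter, without angle brackets
    """
    # The shared id must differ from the Message-ID of every part.
    if any(partial_id == mid.strip("<>") for mid in part_ids):
        raise ValueError("id must differ from every part's Message-ID")
    total = len(part_ids)
    result = []
    for number, msg_id in enumerate(part_ids, start=1):
        headers = {
            "Message-ID": msg_id,
            "Content-Type": 'message/partial; id="%s"; number=%d; total=%d'
            % (partial_id, number, total),
        }
        if number > 1:
            # Second and subsequent parts reference all previous parts.
            headers["References"] = " ".join(part_ids[: number - 1])
        result.append(headers)
    return result
```

A threading reader can then order the parts from the References chain alone, without having to understand "message/partial".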
6.21.2.2. Message/rfc822 6.21.2.2. Message/rfc822
The Content-Type "message/rfc822" should be used for the The Content-Type "message/rfc822" should be used for the
encapsulation (whether as part of another news article or, more encapsulation (whether as part of another news article or, more
usually, as part of a mail message) of complete news articles which usually, as part of an email message) of complete news articles which
have already been posted to Netnews and which are for the information have already been posted to Netnews and which are for the information
of the recipient, and do not constitute a request to repost them. of the recipient, and do not constitute a request to repost them.
In the case where the encapsulated article has Content-Transfer- In the case where the encapsulated article has Content-Transfer-
Encoding "8bit", it will be necessary to change that encoding if it Encoding "8bit", it will be necessary to change that encoding if it
is to be forwarded over some mail transport that only supports is to be forwarded over some email transport that only supports
"7bit". However, this should not be necessary for any mail transport "7bit". However, this should not be necessary for any email transport
that supports the 8BITMIME feature [RFC 2821]. Moreover, where the that supports the 8BITMIME feature [RFC 2821]. Moreover, where the
headers of the encapsulated article contain any UTF8-xtra-chars headers of the encapsulated article contain any UTF8-xtra-chars
(2.4), it may not be possible to transport them over mail transports (2.4.2), it may not be possible to transport them over email
even where 8BITMIME is supported. In such cases, it will be necessary transports even where 8BITMIME is supported. In such cases, it will
to encode those headers as provided in [RFC 2047] (notwithstanding be necessary to encode those headers as provided in [RFC 2047]
that such usage is deprecated for news headers by this standard, and (notwithstanding that such usage is deprecated for news headers by
actually forbidden in the case of the Newsgroups header). this standard, and actually forbidden in the case of the Newsgroups-
header).
In the event that the encapsulated article has to be encoded for In the event that the encapsulated article has to be encoded for
either of these reasons, it may be necessary to reverse that encoding either of these reasons, it may be necessary to reverse that encoding
if certain forms of digital signatures have been employed, or if the if certain forms of digital signatures have been employed, or if the
article is to be reintroduced into some Netnews system (however, in article is to be reintroduced into some Netnews system (however, in
the latter case, the Content-Type "application/news-transmission" the latter case, the Content-Type "application/news-transmission"
should have been used instead). should have been used instead).
NOTE: It is likely, though not guaranteed, that headers NOTE: It is likely, though not guaranteed, that headers
containing UTF8-xtra-chars will pass safely through mail containing UTF8-xtra-chars will pass safely through email
transports supporting 8BITMIME if the "message/rfc822" object is transports supporting 8BITMIME if the "message/rfc822" object is
sent as an attachment (i.e. as a part of a multipart) rather sent as an attachment (i.e. as a part of a multipart) rather
than as the top-level body of the mail message. Moreover, it is than as the top-level body of the email message. Moreover, it is
anticipated that future extensions to the mail standards will anticipated that future extensions to the Email standards will
permit headers containing UTF8-xtra-chars to be carried without permit headers containing UTF8-xtra-chars to be carried without
further ado over conforming transports. further ado over conforming transports.
[In fact, of current transports supporting 8BITMIME, only sendmail will
have problems with UTF-8 in top-level headers.]
6.21.2.3. Message/external-body 6.21.2.3. Message/external-body
The Content-Type "message/external-body" could be apropriate for The Content-Type "message/external-body" could be appropriate for
texts which it would be uneconomic (in view of the likely readership) texts which it would be uneconomic (in view of the likely readership)
to distribute to the entire network. to distribute to the entire network.
6.21.2.4. Multipart types 6.21.2.4. Multipart types
The Content-Types "multipart/mixed", "multipart/parallel" and The Content-Types "multipart/mixed", "multipart/parallel" and
"multipart/signed" may be used freely in news articles. However, "multipart/signed" may be used freely in news articles. However,
except where policy or custom so allows, the Content-Type: except where policy or custom so allows, the Content-Type:
"multipart/alternative" SHOULD NOT be used, on account of the extra "multipart/alternative" SHOULD NOT be used, on account of the extra
bandwidth consumed and the difficulty of quoting in followups, but bandwidth consumed and the difficulty of quoting in followups, but
reading agents MUST accept it. reading agents MUST accept it.
The Content-Type: "multipart/digest" is commended for any article The Content-Type: "multipart/digest" is commended for any article
composed of multiple messages more conveniently viewed as separate composed of multiple messages more conveniently viewed as separate
entities, thus enabling reading agents to move rapidly between them. entities, thus enabling reading agents to move rapidly between them.
The "boundary" should be composed of 28 hyphens (US-ASCII 45) (which The "boundary" should be composed of 28 hyphens (US-ASCII 45) (which
makes each boundary delimiter 30 hyphens, or 32 for the final one) so makes each boundary delimiter 30 hyphens, or 32 for the final one) so
as to enable reading agents which currently support the digest usage as to enable reading agents which currently support the digest usage
described in [RFC 1153] to continue to operate correctly. described in [RFC 1153] to continue to operate correctly.
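The hyphen counts follow directly from the MIME multipart rules, under which each delimiter line is "--" followed by the boundary, and the final delimiter carries a further trailing "--". A small sketch of the arithmetic (the variable names are illustrative only):

```python
# The recommended "boundary" parameter value: 28 hyphens.
BOUNDARY = "-" * 28

# MIME prefixes every delimiter line with "--" ...
delimiter = "--" + BOUNDARY
# ... and appends another "--" to the one that closes the multipart.
close_delimiter = "--" + BOUNDARY + "--"

content_type = 'multipart/digest; boundary="%s"' % BOUNDARY
```

Here `delimiter` is a line of 30 hyphens and `close_delimiter` a line of 32.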
[Actually, this conflicts with some present digest usage (such as the
news.answers rules), but should still be the right way to go. There
remains the possibility that future MIME-compliant readers could enable
one to proceed directly to some particular message by clicking on it in
a table of contents, but that feature is not yet supported by the
current MIME standards.]
NOTE: The various recomendations given above regarding the usage NOTE: The various recommendations given above regarding the
of particular Content-Types apply also to the individual parts usage of particular Content-Types apply also to the individual
of these multiparts. parts of these multiparts.
6.21.3. Content-Transfer-Encoding 6.21.3. Content-Transfer-Encoding
"Content-Transfer-Encoding: 7bit" is sufficient for article bodies "Content-Transfer-Encoding: 7bit" is sufficient for article bodies
(or parts of multiparts) written in pure US-ASCII (or most other (or parts of multiparts) written in pure US-ASCII (or most other
material representable in 7 bits). Posting agents SHOULD specify material representable in 7 bits). Posting agents SHOULD specify
"Content-Transfer-Encoding: 8bit" for all other cases unless there "Content-Transfer-Encoding: 8bit" for all other cases unless there
are pressing reasons to do otherwise. They MAY use "8bit" encoding are pressing reasons to do otherwise. They MAY use "8bit" encoding
even when "7bit" encoding would have sufficed. Examples of such even when "7bit" encoding would have sufficed. Examples of such
pressing reasons are the following: pressing reasons are the following:
1. The content type implies that the content is (or may be) "8bit- 1. The content type implies that the content is (or may be) "8bit-
unsafe"; i.e. it may contain octets equivalent to the US-ASCII unsafe"; i.e. it may contain octets equivalent to the US-ASCII
characters CR or LF (other than in the combination CRLF) or NUL. characters CR or LF (other than in the combination CRLF) or NUL.
In that case one of the Content-Transfer-Encodings "base64" or In that case one of the Content-Transfer-Encodings "base64" or
"quoted-printable" MUST be used, and reading agents MUST be able "quoted-printable" MUST be used, and reading agents MUST be able
to handle both of them. Encoding "binary" MUST NOT be used (except to handle both of them. Encoding "binary" MUST NOT be used (except
in cooperating subnets with alternative transport arrangements) in cooperating subnets with alternative transport arrangements)
because this standard does not mandate a transport mechanism that because this standard does not mandate a transport mechanism that
could support it. could support it.
NOTE: If a future extension to the MIME standards were to NOTE: If a future extension to the MIME standards were to
provide a more compact encoding of binary suited to transport provide a more compact encoding of binary suited to transport
over an 8bit channel, it could be considered as an alternative over an 8bit channel, it could be considered as an alternative
to base64 once it had gained widespread acceptance. to base64 once it had gained widespread acceptance.
2. It is often the case that "application" Content-Types are textual 2. It is often the case that "application" Content-Types are textual
in nature, and intelligible to humans as well as to machines, and in nature, and intelligible to humans as well as to machines, and
where this state can be recognized by the posting agent (either where this state can be recognized by the posting agent (either
through knowledge of the particular application type or by through knowledge of the particular application type or by
skipping to change at page 59, line 39 skipping to change at page 63, line 40
"application/news-transmission", "application/news-groupinfo" "application/news-transmission", "application/news-groupinfo"
and "application/news-checkgroups" are textual, and indeed and "application/news-checkgroups" are textual, and indeed
designed for human reading. designed for human reading.
3. Although the "text" Content-Types should normally be encoded as 3. Although the "text" Content-Types should normally be encoded as
8bit (or 7bit), if the character set specified by the "charset=" 8bit (or 7bit), if the character set specified by the "charset="
parameter can include the 3 disallowed octets, then the material parameter can include the 3 disallowed octets, then the material
MUST be encoded as for 8bit-unsafe. This is most likely to arise MUST be encoded as for 8bit-unsafe. This is most likely to arise
in the case of 16-bit character sets such as UTF-16 ([UNICODE3.1] in the case of 16-bit character sets such as UTF-16 ([UNICODE3.1]
or [ISO/IEC 10646]). In addition, where it is known that the or [ISO/IEC 10646]). In addition, where it is known that the
material is subseqently to be gatewayed from news to mail (8.8), material is subsequently to be gatewayed from Netnews to Email
the encoding "quoted-printable" MAY be used (otherwise the gateway (8.8), the encoding "quoted-printable" MAY be used (otherwise the
might have to re-encode it itself). gateway might have to re-encode it itself).
4. Some protocols REQUIRE the use of a particular Content-Transfer- 4. Some protocols REQUIRE the use of a particular Content-Transfer-
Encoding. In particular, the authentication protocol based on Encoding. In particular, the authentication protocol based on
[Open]PGP defined in [RFC 2015] and/or [RFC 2015bis] mandates the [Open]PGP defined in [RFC 3156] mandates the use of one of the
use of one of the encodings "quoted-printable" or "base64". encodings "quoted-printable" or "base64". Whilst posters might be
Whilst posters might be tempted to risk the use of "8bit" or tempted to risk the use of "8bit" or "7bit" encodings (and indeed
"7bit" encodings (and indeed the referenced standard recommends the referenced standard recommends that signed messages using
that signed messages using those encodings be accepted and those encodings be accepted and interpreted), they should be
interpreted), they should be warned that differences in the warned that differences in the treatment of trailing whitespace
treatment of trailing whitespace between OpenPGP [RFC 2440] and between OpenPGP [RFC 2440] and earlier versions of PGP may render
earlier versions of PGP may render signatures written with the one signatures written with the one unverifiable by the other; and,
unverifiable by the other; and, moreover, Usenet articles are very moreover, Usenet articles are very likely to include trailing
likely to include trailing whitespace in the form of a personal whitespace in the form of a personal signature (4.3.2).
signature (4.3.2).
[It is to be hoped that [RFC 2015bis] will have progressed to a full RFC
by the time this draft is finalized.]
5. The Content-Type message/partial [RFC 2046] is required to use 5. The Content-Type message/partial [RFC 2046] is required to use
encoding "7bit" (the encapsulated complete message may itself use encoding "7bit" (the encapsulated complete message may itself use
encoding "quoted-printable" or "base64", but that information is encoding "quoted-printable" or "base64", but that information is
only conveyed along with the first of the partial parts). only conveyed along with the first of the partial parts).
NOTE: Although there would actually be no problem using encoding NOTE: Although there would actually be no problem using encoding
"8bit" in a pure Netnews (as opposed to mail) environment, this "8bit" in a pure Netnews (as opposed to Email) environment, this
standard discourages (see 6.21.2.1) the use of "message/partial" standard discourages (see 6.21.2.1) the use of "message/partial"
except for binary material, which will be encoded to pass except for binary material, which will be encoded to pass
through "7bit" in any case. through "7bit" in any case.
Injecting and relaying agents MUST NOT change the encoding of Injecting and relaying agents MUST NOT change the encoding of
articles passed to them. Gateways SHOULD NOT change the encoding articles passed to them. Gateways SHOULD NOT change the encoding
unless absolutely necessary. unless absolutely necessary.
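The basic choice a posting agent faces can be sketched as below. This covers only the default rule and reason 1 of the list above (it assumes the body is already in its on-the-wire form with CRLF line endings); it does not model the textual "application" types, the protocols that mandate a particular encoding, "message/partial", or line-length limits. The function name is hypothetical, and "base64" is chosen arbitrarily where "quoted-printable" would be equally permitted.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def choose_transfer_encoding(body: bytes) -> str:
    """Pick a Content-Transfer-Encoding for an article body (sketch)."""
    # "8bit-unsafe" content: NUL, or CR/LF outside the CRLF combination.
    if b"\x00" in body or _BARE_CR_OR_LF.search(body):
        return "base64"
    # Pure 7-bit material needs nothing more than "7bit".
    if all(octet < 0x80 for octet in body):
        return "7bit"
    # Everything else SHOULD be sent as "8bit".
    return "8bit"
```

Since a posting agent MAY use "8bit" even where "7bit" would have sufficed, an implementation could equally drop the second test and return "8bit" for all safe content.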
6.21.4. Character Sets 6.21.4. Character Sets
In principle, any character set may be specified in the "charset=" In principle, any character set may be specified in the "charset="
parameter of a content type. However, only those character sets (and parameter of a content type. However, only those character sets (and
the corresponding parts of UTF-8) should be used which are the corresponding parts of UTF-8) should be used which are
appropriate for the customary language(s) of the hierarchy or appropriate for the customary language(s) of the hierarchy or
newsgroup concerned (whose readers could be expected to possess newsgroup concerned (whose readers could be expected to possess
agents capable of displaying them). agents capable of displaying them).
6.21.5. Content Disposition 6.21.5. Content Disposition
Reading agents Ought to honour any Content-Disposition header that is Reading agents Ought to honour any Content-Disposition-header that is
provided (in particular, they Ought to display any part of a provided (in particular, they Ought to display any part of a
multipart for which the disposition is "inline", possibly multipart for which the disposition is "inline", possibly
distinguished from adjacent parts by some suitable separator). In the distinguished from adjacent parts by some suitable separator). In the
absence of such a header, the body of an article or any part of a absence of such a header, the body of an article or any part of a
multipart with Content-Type "text" Ought to be displayed inline. multipart with Content-Type "text" Ought to be displayed inline.
Followup agents which quote parts of a precursor (see 4.3.2) Ought Followup agents which quote parts of a precursor (see 4.3.2) Ought
initially to include all parts of the precursor that were displayed initially to include all parts of the precursor that were displayed
inline, as if they were a single part. inline, as if they were a single part.
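The display rule above can be sketched as follows. This is a minimal illustration only, assuming the part has already been parsed with Python's standard "email" package; the function name display_inline is hypothetical and not part of this standard.

```python
from email import message_from_string
from email.message import Message


def display_inline(part: Message) -> bool:
    """Decide whether a reading agent would show this part inline."""
    # get_content_disposition() returns the lower-cased disposition
    # value without parameters, or None if the header is absent.
    disposition = part.get_content_disposition()
    if disposition is not None:
        return disposition == "inline"
    # No Content-Disposition header: fall back to the Content-Type
    # rule, under which "text" parts are displayed inline.
    return part.get_content_maintype() == "text"


if __name__ == "__main__":
    part = message_from_string("Content-Type: text/plain\n\nhello\n")
    print(display_inline(part))
```

A part with no Content-Type header at all is treated by the parser as "text/plain", so it also falls under the inline default.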
6.21.6. Definition of some new Content-Types 6.21.6. Definition of some new Content-Types
skipping to change at page 60, line 55 skipping to change at page 64, line 52
require to be registered with IANA as provided for in [RFC 2048]. require to be registered with IANA as provided for in [RFC 2048].
For "application/news-groupinfo" see 7.2.1.2, for "application/news- For "application/news-groupinfo" see 7.2.1.2, for "application/news-
checkgroups" see 7.2.4.1, and for "application/news-transmission" see checkgroups" see 7.2.4.1, and for "application/news-transmission" see
the following section. the following section.
6.21.6.1. Application/news-transmission 6.21.6.1. Application/news-transmission
The Content-Type "application/news-transmission" is intended for the The Content-Type "application/news-transmission" is intended for the
encapsulation of complete news articles where the intention is that encapsulation of complete news articles where the intention is that
the recipient should then inject them into Netnews. This Application the recipient should then inject them into Netnews. This Application
type SHOULD be used when mailing articles to moderators and to mail- type SHOULD be used when mailing articles to moderators and to
to-news gateways (see 8.2.2). email-to-news gateways (see 8.2.2).
NOTE: The benefit of such encapsulation is that it removes NOTE: The benefit of such encapsulation is that it removes
possible conflict between news and email headers and it provides possible conflict between news and email headers and it provides
a convenient way of "tunnelling" a news article through a a convenient way of "tunnelling" a news article through a
transport medium that does not support 8bit characters. transport medium that does not support 8bit characters.
The MIME content type definition of "application/news-transmission" The MIME content type definition of "application/news-transmission"
is: is:
MIME type name: application MIME type name: application
MIME subtype name: news-transmission MIME subtype name: news-transmission
Required parameters: none Required parameters: none
Optional parameters: usage=moderate Optional parameters: usage=moderate
usage=inject usage=inject
usage=relay usage=relay
Encoding considerations: A transfer-encoding (such as Quoted- Encoding considerations: A transfer-encoding (such as Quoted-
skipping to change at page 61, line 40 skipping to change at page 65, line 35
article. However, such control messages article. However, such control messages
also occur in normal news flow, so most also occur in normal news flow, so most
hosts will already be suitably defended hosts will already be suitably defended
against undesired effects. against undesired effects.
Published specification: [USEFOR] Published specification: [USEFOR]
Body part: A complete article or proto-article, ready Body part: A complete article or proto-article, ready
for injection into Netnews, or a batch of for injection into Netnews, or a batch of
such articles. such articles.
NOTE: It is likely that the recipient of an "application/news- NOTE: It is likely that the recipient of an "application/news-
transmission" will be a specialised gateway (e.g. a moderator's transmission" will be a specialized gateway (e.g. a moderator's
submission address) able to accept articles with only one of the submission address) able to accept articles with only one of the
three usage parameters "moderate", "inject" and "relay", hence three usage parameters "moderate", "inject" and "relay", hence
the reason why they are optional, being redundant in most the reason why they are optional, being redundant in most
situations. Nevertheless, they MAY be used to signify the situations. Nevertheless, they MAY be used to signify the
originator's intention with regard to the transmission, so originator's intention with regard to the transmission, so
removing any possible doubt. removing any possible doubt.
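As an illustration of the encapsulation described above, the following sketch wraps a complete article in an "application/news-transmission" entity using Python's standard "email" package. The function name encapsulate and the choice of base64 as the transfer-encoding are assumptions made for the example; this standard does not mandate either.

```python
from email.message import EmailMessage

USAGES = ("moderate", "inject", "relay")


def encapsulate(article: bytes, usage: str = "moderate") -> EmailMessage:
    """Wrap a complete news article for mailing (illustrative only)."""
    if usage not in USAGES:
        raise ValueError(f"unknown usage parameter: {usage!r}")
    mail = EmailMessage()
    # base64 keeps the mail 7bit-clean even if the article contains
    # 8bit characters, which is the "tunnelling" benefit noted above.
    mail.set_content(
        article,
        maintype="application",
        subtype="news-transmission",
        cte="base64",
        params={"usage": usage},
    )
    return mail


if __name__ == "__main__":
    sample = b"Newsgroups: example.test\r\nSubject: test\r\n\r\nbody\r\n"
    print(encapsulate(sample).as_string())
```

The sender would still add the usual mail headers (From, To and so on) before submission; those are omitted here because they depend on the moderator or gateway concerned.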
When the parameter "relay" is used, or implied, the body part MAY be When the parameter "relay" is used, or implied, the body part MAY be
a batch of articles to be transmitted together, in which case the a batch of articles to be transmitted together, in which case the
following syntax MUST be used. following syntax MUST be used.
batch = 1*( batch-header article ) batch = 1*( batch-header article )
batch-header = "#!" SP "rnews" SP article-size CRLF batch-header = "#!" SP rnews SP article-size CRLF
article-size = 1*digit rnews = %x72.6E.65.77.73 ; case sensitive "rnews"
article-size = 1*DIGIT
where the "rnews" is case-sensitive. Thus a batch is a sequence of
articles, each prefixed by a header line that includes its size. The
article-size is a decimal count of the octets in the article,
counting each CRLF as one octet regardless of how it is actually
represented.
Thus a batch is a sequence of articles, each prefixed by a header
line that includes its size. The article-size is a decimal count of
the octets in the article, counting each CRLF as one octet regardless
of how it is actually represented.
NOTE: Despite the similarity of this format to an executable NOTE: Despite the similarity of this format to an executable
UNIX script, it is EXTREMELY unwise to feed such a batch into a UNIX script, it is EXTREMELY unwise to feed such a batch into a
command interpreter in anticipation of it running a command command interpreter in anticipation of it running a command
named "rnews"; the security implications of so doing would be named "rnews"; the security implications of so doing would be
disastrous. disastrous.
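The batch syntax and the article-size rule can be sketched as a small splitter. This is an illustration under stated assumptions, not a reference implementation: the name split_batch is hypothetical, and the input is assumed to be the raw octets of the body part, with line endings stored either as CRLF or as a bare LF. The batch is parsed as data and is never passed to a command interpreter.

```python
import re

# batch-header = "#!" SP rnews SP article-size CRLF  ("rnews" is case
# sensitive; Python regular expressions are case sensitive by default).
HEADER_RE = re.compile(rb"#! rnews ([0-9]+)\r?\n")


def split_batch(data: bytes) -> list:
    """Split a batch into its articles, honouring article-size."""
    articles = []
    pos = 0
    while pos < len(data):
        match = HEADER_RE.match(data, pos)
        if match is None:
            raise ValueError(f"bad batch-header at offset {pos}")
        size = int(match.group(1))
        start = pos = match.end()
        counted = 0
        while counted < size:
            if pos >= len(data):
                raise ValueError("article truncated")
            # Each CRLF counts as one octet, however it is stored.
            pos += 2 if data[pos:pos + 2] == b"\r\n" else 1
            counted += 1
        articles.append(data[start:pos])
    if not articles:
        raise ValueError("a batch contains at least one article")
    return articles


if __name__ == "__main__":
    art = b"Subject: a\r\n\r\nhi\r\n"
    print(len(split_batch(b"#! rnews 15\r\n" + art)))
```

In the usage example the article holds 12 visible characters and three line endings, hence a declared size of 15 whichever line-ending representation is in use.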
6.21.6.2. Message/news withdrawn 6.21.6.2. Message/news obsoleted
The Content-Type "message/news", as previously registered with IANA, The Content-Type "message/news", as previously registered with IANA,
is hereby obsoleted and should be withdrawn. It was never widely is hereby declared obsolete. It was never widely implemented, and its
implemented, and its default treatment as "application/octet-stream" default treatment as "application/octet-stream" by agents that did
by agents that did not recognise it was counterproductive. The not recognize it was counterproductive. The Content-Type
Content-Type "message/rfc822" SHOULD be used in its place, as already "message/rfc822" SHOULD be used in its place, as already described
described above. above.
6.22. Obsolete Headers 6.22. Obsolete Headers
Persons writing new agents SHOULD ignore any former meanings of the Persons writing new agents SHOULD ignore any former meanings of the
following headers: following headers:
Also-Control Also-Control
See-Also See-Also
Article-Names Article-Names
Article-Updates Article-Updates
7. Control Messages 7. Control Messages
The following sections document the control messages. "Message" is The following sections document the control messages. "Message" is
used herein as a synonym for "article" unless context indicates used herein as a synonym for "article" unless context indicates
otherwise. otherwise.
The Newsgroups header of each control message SHOULD include the The Newsgroups-header of each control message SHOULD include the
newsgroup-name(s) for the group(s) affected (i.e. groups to be newsgroup-name(s) for the group(s) affected (i.e. groups to be
created, modified or removed, or containing articles to be canceled). created, modified or removed, or containing articles to be canceled).
This is to ensure that the message progagates to all sites which This is to ensure that the message propagates to all sites which
receive (or would receive) that group(s). It MAY include other receive (or would receive) that group(s). It MAY include other
newsgroup-names so as to improve propagation (but this practice may newsgroup-names so as to improve propagation (but this practice may
cause the control message to propagate also to places where it is cause the control message to propagate also to places where it is
unwanted, or even cause it not to progatate where it should, so it unwanted, or even cause it not to propagate where it should, so it
should not be used without good reason). should not be used without good reason).
The descriptions below set out REQUIREMENTS to be followed by sites The descriptions below set out REQUIREMENTS to be followed by sites
that receive control messages and choose to honour them. However, that receive control messages and choose to honour them. However,
nothing in these descriptions should be taken as overriding the right nothing in these descriptions should be taken as overriding the right
of any such site, in accordance with its local policy, to deny any of any such site, in accordance with its local policy, to deny any
particular control message, or to refer it to an administrator for particular control message, or to refer it to an administrator for
approval (either as a class or on a case-by-case basis). In approval (either as a class or on a case-by-case basis). In
particular, sites Ought to deny messages not issued by the particular, sites Ought to deny messages not issued by the
appropriate administrative agencies, and therefore SHOULD take such appropriate administrative agencies, and therefore SHOULD take such
steps as are reasonably practicable to validate their authenticity steps as are reasonably practicable to validate their authenticity
(see, for example, section 7.1 below). (see, for example, section 7.1 below).
Relaying Agents MUST propagate even control messages that they do not Relaying Agents MUST propagate even control messages that they do not
recognise. recognize.
In the following sections, each type of control message is defined In the following sections, each type of control message is defined
syntactically by defining its verb, its arguments, and possibly its syntactically by defining its verb, its arguments, and possibly its
body. body.
7.1. Digital Signature of Headers 7.1. Digital Signature of Headers
It is most desirable that group control messages (7.2) in particular It is most desirable that group control messages (7.2) in particular
be authenticated by incorporating them within some digital signature be authenticated by incorporating them within some digital signature
scheme that encompasses other headers closely associated with them scheme that encompasses other headers closely associated with them
(including at least the Approved, Message-ID and Date headers). At (including at least the Approved-, Message-ID- and Date-headers). At
the time of writing, this is usually done by means of a protocol the time of writing, this is usually done by means of a protocol
known as "PGPverify" ([PGPVERIFY]), and continued usage of this is known as "PGPverify" ([PGPVERIFY]), and continued usage of this is
encouraged at least as an interim measure. encouraged at least as an interim measure.
However, PGPverify is not considered suitable for standardization in However, PGPverify is not considered suitable for standardization in
its present form, for various technical reasons. It is therefore its present form, for various technical reasons. It is therefore
expected that an early extension to this standard will provide a expected that an early extension to this standard will provide a
robust and general purpose digital authentication mechanism with robust and general purpose digital authentication mechanism with
applicability to all situations requiring protection against applicability to all situations requiring protection against
malicious use of, or interference with, headers. That extension malicious use of, or interference with, headers. That extension
would also address other Netnews security issues. would also address other Netnews security issues.
7.2. Group Control Messages 7.2. Group Control Messages
"Group control messages" are the sub-class of control messages that "Group control messages" are the sub-class of control messages that
request some update to the configuration of the groups known to a request some update to the configuration of the groups known to a
serving agent, namely "newgroup", "rmgroup", "mvgroup" and serving agent, namely "newgroup", "rmgroup", "mvgroup" and
"checkgroups", plus any others created by extensions to this "checkgroups", plus any others created by extensions to this
standard. standard.
All of the group control messages MUST have an Approved header All of the group control messages MUST have an Approved-header
(6.14). Moreover, in those hierarchies where appropriate (6.14). Moreover, in those hierarchies where appropriate
administrative agencies exist (see 1.1), group control messages Ought administrative agencies exist (see 1.1), group control messages Ought
Not to be issued except as authorized by those agencies. Not to be issued except as authorized by those agencies.
7.2.1. The 'newgroup' Control Message 7.2.1. The 'newgroup' Control Message
newgroup-verb = "newgroup" control-message =/ Newgroup-message
newgroup-arguments = CFWS newsgroup-name [ CFWS newgroup-flag ] Newgroup-message = "newgroup" Newgroup-arguments
Newgroup-arguments = CFWS newsgroup-name [ CFWS newgroup-flag ]
newgroup-flag = "moderated" newgroup-flag = "moderated"
The "newgroup" control message requests that the specified group be The "newgroup" control message requests that the specified group be
created or changed. If the request is honoured, or if the group created or changed. If the request is honoured, or if the group
already exists on the serving agent, and if the newgroup-flag already exists on the serving agent, and if the newgroup-flag
"moderated" is present, then the group MUST be marked as moderated, "moderated" is present, then the group MUST be marked as moderated,
and vice versa. "Moderated" is the only such flag defined by this and vice versa. "Moderated" is the only such flag defined by this
standard; other flags MAY be defined for use in cooperating subnets, standard; other flags MAY be defined for use in cooperating subnets,
but newgroup messages containing them MUST NOT be acted on outside of but newgroup messages containing them MUST NOT be acted on outside of
those subnets. those subnets.
NOTE: Specifically, some alternative flags such as "y" and "m", NOTE: Specifically, some alternative flags such as "y" and "m",
which are sent and recognised by some current software, are NOT which are sent and recognized by some current software, are NOT
part of this standard. Moreover, some existing implementations part of this standard. Moreover, some existing implementations
treat any flag other than "moderated" as indicating an treat any flag other than "moderated" as indicating an
unmoderated newsgroup. Both of these usages are contrary to this unmoderated newsgroup. Both of these usages are contrary to this
standard. standard and control messages with such non-standard flags
should be ignored.
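The handling of the newgroup arguments might be sketched as below. Several simplifications are assumed and are not taken from this standard: CFWS is reduced to plain white space (comments are not handled), the function name parse_newgroup is hypothetical, and the verb and flag are compared without regard to case, following the general ABNF rule that quoted literals are case-insensitive. A message carrying any flag other than "moderated" is reported as None so that the caller can ignore it.

```python
from typing import Optional, Tuple


def parse_newgroup(control: str) -> Optional[Tuple[str, bool]]:
    """Parse the content of a Control header for a newgroup message.

    Returns (newsgroup_name, moderated), or None when the message
    carries a flag that this standard does not define.
    """
    tokens = control.split()
    if not tokens or tokens[0].lower() != "newgroup":
        raise ValueError("not a newgroup control message")
    if len(tokens) == 2:
        # No flag present: the group is to be marked unmoderated.
        return tokens[1], False
    if len(tokens) == 3 and tokens[2].lower() == "moderated":
        return tokens[1], True
    if len(tokens) == 3:
        # Flags such as "y" or "m" are not part of the standard.
        return None
    raise ValueError("malformed newgroup arguments")


if __name__ == "__main__":
    print(parse_newgroup("newgroup example.admin.info moderated"))
```

Validation of the newsgroup-name itself (section 5.5) is deliberately left out of this sketch.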
The message body comprises or includes an "application/news- The message body comprises or includes an "application/news-
groupinfo" (7.2.1.2) part containing machine- and human-readable groupinfo" (7.2.1.2) part containing machine- and human-readable
information about the group. information about the group.
It is REQUIRED that the newsgroup-name conforms to all requirements It is REQUIRED that the newsgroup-name conforms to all requirements
set out in section 5.5. This includes the restrictions as to the set out in section 5.5. This includes the restrictions as to the
permitted characters, and the requirement that they be invariant permitted characters, and the requirement that they be invariant
under NFKC normalization. It is essential that those who issue under NFKC normalization. It is essential that those who issue
newgroup messages are aware of their responsibility to enforce this newgroup messages are aware of their responsibility to enforce this
skipping to change at page 64, line 52 skipping to change at page 68, line 45
2. Other parts containing useful information about the background of 2. Other parts containing useful information about the background of
the newgroup message (typically of type "text/plain"). the newgroup message (typically of type "text/plain").
3. Parts containing initial articles for the newsgroup. See section 3. Parts containing initial articles for the newsgroup. See section
7.2.1.3 for details. 7.2.1.3 for details.
In the event that there is only the single (i.e. application/news- In the event that there is only the single (i.e. application/news-
groupinfo) subpart present, it will suffice to include a "Content- groupinfo) subpart present, it will suffice to include a "Content-
Type: application/news-groupinfo" amongst the headers of the control Type: application/news-groupinfo" amongst the headers of the control
message. Otherwise, a "Content-Type: multipart/mixed header" will be message. Otherwise, a "Content-Type: multipart/mixed" header will be
needed, and each separate part will then need its own Content-Type needed, and each separate part will then need its own Content-Type-
header. header.
7.2.1.2. Application/news-groupinfo 7.2.1.2. Application/news-groupinfo
The "application/news-groupinfo" body part contains brief information The "application/news-groupinfo" body part contains brief information
about a newsgroup, i.e. the group's name, its newsgroup-description about a newsgroup, i.e. the group's name, its newsgroup-description
and the moderation-flag. and the moderation-flag.
NOTE: The presence of the newsgroups-tag "For your newsgroups NOTE: The presence of the newsgroups-tag "For your newsgroups
file:" is intended to make the whole newgroup message compatible file:" is intended to make the whole newgroup message compatible
with current practice as described in [Son-of-1036]. with current practice as described in [Son-of-1036].
The MIME content type definition of "application/news-groupinfo" is: The MIME content type definition of "application/news-groupinfo" is:
MIME type name: application MIME type name: application
MIME subtype name: news-groupinfo MIME subtype name: news-groupinfo
Required parameters: none Required parameters: none
Disposition: by default, inline Disposition: by default, inline
Encoding considerations: "7bit" or "8bit" is sufficient and MUST be Encoding considerations: "7bit" or "8bit" is sufficient and MUST be
used to maintain compatibility. used to maintain compatibility.
Security considerations: this type MUST NOT be used except as part Security considerations: this type MUST NOT be used except as part
of a control message for the creation or of a control message for the creation or
skipping to change at page 65, line 38 skipping to change at page 69, line 34
newsgroups-line CRLF newsgroups-line CRLF
newsgroups-tag = %x46.6F.72 SP %x79.6F.75.72 SP newsgroups-tag = %x46.6F.72 SP %x79.6F.75.72 SP
%x6E.65.77.73.67.72.6F.75.70.73 SP %x6E.65.77.73.67.72.6F.75.70.73 SP
%x66.69.6C.65.3A %x66.69.6C.65.3A
; case sensitive ; case sensitive
; "For your newsgroups file:" ; "For your newsgroups file:"
newsgroups-line = newsgroup-name newsgroups-line = newsgroup-name
[ 1*HTAB newsgroup-description ] [ 1*HTAB newsgroup-description ]
[ 1*WSP moderation-flag ] [ 1*WSP moderation-flag ]
newsgroup-description newsgroup-description
= 1*( [WSP] utext) = utext *( *WSP utext )
moderation-flag = %x28.4D.6F.64.65.72.61.74.65.64.29 moderation-flag = %x28.4D.6F.64.65.72.61.74.65.64.29
; case sensitive "(Moderated)" ; case sensitive "(Moderated)"
The whole groupinfo-body is intended to be interpreted as a text
written in the UTF-8 character set. The newsgroup-description MUST NOT contain any occurrence of the
string "(Moderated)" within it. The whole groupinfo-body is intended
to be interpreted as a text written in the UTF-8 character set.
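A reader of the groupinfo-body syntax might be sketched as follows. The function names are hypothetical, the body is assumed to have been decoded from UTF-8 already, and validation of the newsgroup-name is reduced to a check that it contains no space. The check on "(Moderated)" inside the description follows the newer of the two texts shown above.

```python
TAG = "For your newsgroups file:"   # newsgroups-tag, case sensitive
FLAG = "(Moderated)"                # moderation-flag, case sensitive


def parse_newsgroups_line(line: str):
    """Return (newsgroup_name, description_or_None, moderated)."""
    moderated = False
    # The moderation-flag, if present, comes last and follows 1*WSP.
    if line.endswith(FLAG) and line[:-len(FLAG)].endswith((" ", "\t")):
        moderated = True
        line = line[:-len(FLAG)].rstrip(" \t")
    # The description is separated from the name by 1*HTAB.
    name, _, rest = line.partition("\t")
    description = rest.lstrip("\t") or None
    if not name or " " in name:
        raise ValueError("malformed newsgroups-line")
    if description is not None and FLAG in description:
        raise ValueError('description contains "(Moderated)"')
    return name, description, moderated


def parse_groupinfo(body: str):
    """Parse a groupinfo-body: an optional tag line, then one line."""
    lines = body.splitlines()
    if lines and lines[0] == TAG:
        lines = lines[1:]
    if len(lines) != 1:
        raise ValueError("expected exactly one newsgroups-line")
    return parse_newsgroups_line(lines[0])


if __name__ == "__main__":
    print(parse_groupinfo(TAG + "\r\nexample.test\tTesting\r\n"))
```

The returned name can then be compared with the newsgroup-name in the control message, with which it has to agree.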
The "application/news-groupinfo" is used in conjunction with the The "application/news-groupinfo" is used in conjunction with the
"newgroup" (7.2.1) and "mvgroup" (7.2.3) control messages. The "newgroup" (7.2.1) and "mvgroup" (7.2.3) control messages. The
newsgroup-name(s) in the newsgroups-line MUST agree with the newsgroup-name in the newsgroups-line MUST agree with the newsgroup-
newsgroup-name(s) in the "newgroup" or "mvgroup" control message. name in the "newgroup" or "mvgroup" control message. The Content-
The Content-Type "application/news-groupinfo" MUST NOT be used except Type "application/news-groupinfo" MUST NOT be used except as a part
as a part of such control messages. Although optional, the of such control messages. Although optional, the newsgroups-tag
newsgroups-tag SHOULD be included until such time as this standard SHOULD be included until such time as this standard has been widely
has been widely adopted, to ensure compatibility with present adopted, to ensure compatibility with present practice.
practice.
Moderated newsgroups MUST be marked by appending the case sensitive Moderated newsgroups MUST be marked by appending the case sensitive
text " (Moderated)" at the end. It is NOT recommended that the text " (Moderated)" at the end. It is NOT recommended that the
moderator's email address be included in the newsgroup-description as moderator's email address be included in the newsgroup-description as
has sometimes been done. has sometimes been done.
Although, in accordance with [RFC 2822] and section 4.5 of this Although, in accordance with [RFC 2822] and section 4.5 of this
standard, a newsgroups-line could have a maximum length of 998 standard, a newsgroups-line could have a maximum length of 998
octets, as a matter of policy a far lower limit, expressed in octets, as a matter of policy a far lower limit, expressed in
characters, Ought to be set. The current convention is to limit its characters, Ought to be set. The current convention is to limit its
length so that the newsgroup-name, the HTAB(s) (interpreted as 8- length so that the newsgroup-name, the HTAB(s) (interpreted as 8-
character tabs that take one at least to column 24) and the character tabs that take one at least to column 24) and the
newsgroup-description (excluding any moderation-flag) fit into 79 newsgroup-description (excluding any moderation-flag) fit into 79
characters. However, this standard does not seek to enforce any such characters. However, this standard does not seek to enforce any such
rule, and reading agents SHOULD therefore enable a newsgroups-line of rule, and reading agents SHOULD therefore enable a newsgroups-line of
any length to be displayed, e.g. by wrapping it as required. any length to be displayed, e.g. by wrapping it as required.
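The layout convention just described can be sketched as a small helper. The function name is hypothetical, and "column 24" is interpreted here as meaning that at least 24 character positions precede the description once 8-character tabs are expanded; that interpretation is an assumption. The helper only reports whether the line fits the conventional width; it does not reject longer lines, since this standard does not enforce the limit.

```python
def format_newsgroups_line(name: str, description: str,
                           width: int = 79, min_col: int = 24,
                           tab: int = 8):
    """Return (line, fits) for a name and description."""
    line = name + "\t"
    # Add HTABs until the description would start at or beyond
    # min_col when tabs are expanded to 8-character stops.
    while len(line.expandtabs(tab)) < min_col:
        line += "\t"
    line += description
    fits = len(line.expandtabs(tab)) <= width
    return line, fits


if __name__ == "__main__":
    line, fits = format_newsgroups_line(
        "example.admin.info", "About the example.* groups")
    print(repr(line), fits)
```

Any moderation-flag would be appended afterwards, as it is excluded from the 79-character convention.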
NOTE: The newsgroups-line is intended to provide a brief NOTE: The newsgroups-line is intended to provide a brief
description of the newsgroup, written in the UTF-8 character description of the newsgroup, written in the UTF-8 character
set. Since newsgroup-names are required to be expressed in set. Since newsgroup-names are required to be expressed in
UTF-8 when they appear in headers, and since [NNTP] requires the UTF-8 when they appear in headers, and since [NNTP] requires the
use of UTF-8 when such a description is transmitted by the LIST use of UTF-8 when such a description is transmitted by the LIST
NEWSGROUPS command, it would also be convenient for servers that NEWSGROUPS command, it would also be convenient for servers that
keep a "newsgroups" file to store them in that form, so as to keep a "newsgroups" file to store them in that form, so as to
avoid unnecessary conversions. avoid unnecessary conversions.
[If, at the time of publication of this standard, [NNTP] is still [RFC
977], that NOTE will need to be changed to indicate that "it is expected
that a future extension of [RFC 977] will require ...".]
7.2.1.3. Initial Articles 7.2.1.3. Initial Articles
Some subparts of a "newgroup" or "mvgroup" control message MAY Some subparts of a "newgroup" or "mvgroup" control message MAY
contain an initial set of articles to be posted to the affected contain an initial set of articles to be posted to the affected
newsgroup(s) as soon as it has been created or modified. These parts newsgroup(s) as soon as it has been created or modified. These parts
are identified by having the Content-Type "application/news- are identified by having the Content-Type "application/news-
transmission", possibly with the parameter "usage=inject". The body transmission", possibly with the parameter "usage=inject". The body
of each such part should be a complete proto-article, ready for of each such part should be a complete proto-article, ready for
posting. This feature is intended for the posting of charters, posting. This feature is intended for the posting of charters,
initial FAQs and the like to the newly formed group(s). initial FAQs and the like to the newly formed group(s).
The Newsgroups header of the proto-article MUST include the The Newsgroups-header of the proto-article MUST include the
newsgroup-name of the newly created or modified group. It MAY include newsgroup-name of the newly created or modified group. It MAY include
other newsgroup-names. If the proto-article includes a Message-ID other newsgroup-names. If the proto-article includes a Message-ID-
header, the message identifier in it MUST be different from that of header, the message identifier in it MUST be different from that of
any existing article and from that of the control message as a whole. any existing article and from that of the control message as a whole.
Alternatively such a message identifier MAY be derived by the Alternatively such a message identifier MAY be derived by the
injecting agent when the proto-article is posted. The proto-article injecting agent when the proto-article is posted. The proto-article
SHOULD include the header "Distribution: local". SHOULD include the header "Distribution: local".
The proto-article SHOULD be injected at the serving agent that The proto-article SHOULD be injected at the serving agent that
processes the control message AFTER the newsgroup(s) in question has processes the control message AFTER the newsgroup in question has
been created or modified. It MUST NOT be injected if the newsgroup is been created or modified. It MUST NOT be injected if the newsgroup is
not, in fact, created (for whatever reason). It MUST NOT be submitted not, in fact, created (for whatever reason). It MUST NOT be submitted
to any relaying agent for transmission beyond the server(s) upon to any relaying agent for transmission beyond the server(s) upon
which the newsgroup creation has just been effected (in other words, which the newsgroup creation has just been effected (in other words,
it is to be treated as having a "Distribution: local" header, it is to be treated as having a "Distribution: local" header,
whether such a header is actually present or not). whether such a header is actually present or not).
NOTE: The "$p=<n>" convention, if applied uniformly, should
ensure that initial articles relayed beyond the local server in
contravention of the above prohibition will not propagate in
competition with similar copies injected at other local servers.
NOTE: It is not precluded that the proto-article is itself a NOTE: It is not precluded that the proto-article is itself a
control message or other type of special article, to be control message or other type of special article, to be
activated only upon creation of the new newsgroup. However, activated only upon creation of the new newsgroup. However,
except as might arise from that possibility, any except as might arise from that possibility, any
"application/news-transmission" within some nested "multipart/*" "application/news-transmission" within some nested "multipart/*"
structure within the proto-article is not to be activated. structure within the proto-article is not to be activated.
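The rules of this section can be sketched with Python's standard "email" package. The function name initial_articles, the group_created argument and the sample message are hypothetical and serve only to illustrate the text; actual injection is outside the scope of the sketch. Only direct subparts of the control message are examined, so that an "application/news-transmission" nested more deeply is not activated.

```python
from email import message_from_string

SAMPLE = """\
Control: newgroup example.admin.info moderated
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nxtprt"

This is a MIME control message.
--nxtprt
Content-Type: application/news-groupinfo

For your newsgroups file:
example.admin.info\t\tAbout the example.* groups (Moderated)

--nxtprt
Content-Type: application/news-transmission

Newsgroups: example.admin.info
Subject: Charter for example.admin.info

The group example.admin.info contains regularly posted
information on the example.* hierarchy.

--nxtprt--
"""


def initial_articles(control_text: str, group_created: bool) -> list:
    """Return the proto-articles to inject locally, if any."""
    if not group_created:
        return []  # nothing is injected unless the group now exists
    msg = message_from_string(control_text)
    if not msg.is_multipart():
        return []
    protos = []
    for part in msg.get_payload():  # direct subparts only
        if part.get_content_type() != "application/news-transmission":
            continue
        if part.get_param("usage") not in (None, "inject"):
            continue
        proto = message_from_string(part.get_payload())
        # Treated as "Distribution: local" whether present or not.
        del proto["Distribution"]
        proto["Distribution"] = "local"
        protos.append(proto)
    return protos


if __name__ == "__main__":
    for proto in initial_articles(SAMPLE, group_created=True):
        print(proto["Subject"])
```

The check that the Newsgroups-header of each proto-article names the affected group, and the handling of any Message-ID-header, are left to the caller.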
7.2.1.4. Example 7.2.1.4. Example
A "newgroup" with bilingual charter and policy information: A "newgroup" with its charter:
From: "example.all Administrator" <admin@example.invalid> From: "example.all Administrator" <admin@noc.example>
Newsgroups: example.admin.groups,example.admin.announce Newsgroups: example.admin.info,example.admin.announce
Date: 27 Feb 1997 12:50:22 +0200 Date: 27 Feb 2002 12:50:22 +0200
Subject: cmsg newgroup example.admin.info moderated Subject: cmsg newgroup example.admin.info moderated
Approved: admin@example.invalid Approved: admin@noc.example
Control: newgroup example.admin.info moderated Control: newgroup example.admin.info moderated
Message-ID: <ng-example.admin.info-19970227@example.invalid> Message-ID: <ng-example.admin.info-20020227@noc.example>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nxtprt" Content-Type: multipart/mixed; boundary="nxtprt"
Content-Transfer-Encoding: 8bit Content-Transfer-Encoding: 8bit
This is a MIME control message. This is a MIME control message.
--nxtprt --nxtprt
Content-Type: application/news-groupinfo Content-Type: application/news-groupinfo
For your newsgroups file: For your newsgroups file:
example.admin.info About the example.* groups (Moderated) example.admin.info About the example.* groups (Moderated)
--nxtprt --nxtprt
Content-Type: application/news-transmission Content-Type: application/news-transmission
Newsgroups: example.admin.info Newsgroups: example.admin.info
From: "example.all Administrator" <admin@example.invalid> From: "example.all Administrator" <admin@noc.example>
Subject: Charter for example.admin.info Subject: Charter for example.admin.info
Message-ID: <ng-example.admin.info-19970227$p=1@example.invalid> Message-ID: <charter-example.admin.info-20020227@noc.example>
Distribution: local Distribution: local
Content-Type: multipart/alternative ;
differences = content-language ;
boundary = nxtlang
--nxtlang
Content-Type: text/plain; charset=us-ascii Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit Content-Transfer-Encoding: 7bit
Content-Language: en
The group example.admin.info contains regularly posted The group example.admin.info contains regularly posted
information on the example.* hierarchy. information on the example.* hierarchy.
--nxtlang
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Content-Language: de
Die Gruppe example.admin.info enthaelt regelmaessig versandte
Informationen ueber die example.*-Hierarchie.
--nxtlang--
--nxtprt-- --nxtprt--
7.2.2. The 'rmgroup' Control Message 7.2.2. The 'rmgroup' Control Message
rmgroup-verb = "rmgroup" control-message =/ Rmgroup-message
rmgroup-arguments = CFWS newsgroup-name Rmgroup-message = "rmgroup" Rmgroup-arguments
Rmgroup-arguments = CFWS newsgroup-name
The "rmgroup" control message requests that the specified group be The "rmgroup" control message requests that the specified group be
removed from the list of valid groups. The Content-Type of the body removed from the list of valid groups. The Content-Type of the body
is unspecified; it MAY contain anything, usually an explanatory text. is unspecified; it MAY contain anything, usually an explanatory text.
NOTE: It is entirely proper for a serving agent to retain the NOTE: It is entirely proper for a serving agent to retain the
group until all the articles in it have expired, provided that group until all the articles in it have expired, provided that
it ceases to accept new articles. it ceases to accept new articles.
7.2.2.1. Example 7.2.2.1. Example
Plain "rmgroup": Plain "rmgroup":
From: "example.all Administrator" <admin@example.invalid> From: "example.all Admin