Network Working Group                                         P. Hoffman
Internet-Draft                                                     ICANN
Intended status: Informational                             J. Hildebrand
Expires: January 1, 2017                                           Cisco
                                                           June 30, 2016


                      RFC v3 Prep Tool Description
                      draft-iab-rfcv3-preptool-02

Abstract

   This document describes some aspects of the "prep tool" that is
   expected to be created when the new RFC v3 specification is deployed.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 1, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.





Hoffman & Hildebrand     Expires January 1, 2017                [Page 1]


Internet-Draft        RFC v3 Prep Tool Description             June 2016


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  v3 Prep Tool Usage Scenarios  . . . . . . . . . . . . . . . .   4
   3.  Internet-Draft Submission . . . . . . . . . . . . . . . . . .   4
   4.  Canonical RFC Preparation . . . . . . . . . . . . . . . . . .   5
   5.  What the v3 Prep Tool Does  . . . . . . . . . . . . . . . . .   5
     5.1.  XML Sanitization  . . . . . . . . . . . . . . . . . . . .   5
       5.1.1.  XInclude Processing . . . . . . . . . . . . . . . . .   6
       5.1.2.  DTD Removal . . . . . . . . . . . . . . . . . . . . .   6
       5.1.3.  Processing Instruction Removal  . . . . . . . . . . .   6
       5.1.4.  Validity Check  . . . . . . . . . . . . . . . . . . .   6
       5.1.5.  Check "anchor"  . . . . . . . . . . . . . . . . . . .   6
     5.2.  Defaults  . . . . . . . . . . . . . . . . . . . . . . . .   6
       5.2.1.  "version" Insertion . . . . . . . . . . . . . . . . .   6
       5.2.2.  "seriesInfo" Insertion  . . . . . . . . . . . . . . .   6
       5.2.3.  <date> Insertion  . . . . . . . . . . . . . . . . . .   7
       5.2.4.  "prepTime" Insertion  . . . . . . . . . . . . . . . .   7
       5.2.5.  <ol> Group "start" Insertion  . . . . . . . . . . . .   7
       5.2.6.  Attribute Default Value Insertion . . . . . . . . . .   7
       5.2.7.  Section "toc" attribute . . . . . . . . . . . . . . .   7
       5.2.8.  "removeInRFC" Warning Paragraph . . . . . . . . . . .   8
     5.3.  Normalization . . . . . . . . . . . . . . . . . . . . . .   8
       5.3.1.  "month" Attribute . . . . . . . . . . . . . . . . . .   8
       5.3.2.  ASCII Attribute Processing  . . . . . . . . . . . . .   8
       5.3.3.  "title" Conversion  . . . . . . . . . . . . . . . . .   8
     5.4.  Generation  . . . . . . . . . . . . . . . . . . . . . . .   9
       5.4.1.  "expiresDate" Insertion . . . . . . . . . . . . . . .   9
       5.4.2.  <boilerplate> Insertion . . . . . . . . . . . . . . .   9
         5.4.2.1.  Compare <rfc> "submissionType" and <seriesInfo>
                   "stream"  . . . . . . . . . . . . . . . . . . . .   9
         5.4.2.2.  'Status of this Memo' Insertion . . . . . . . . .   9
         5.4.2.3.  Copyright Insertion . . . . . . . . . . . . . . .   9
       5.4.3.  <reference> "target" Insertion  . . . . . . . . . . .   9
       5.4.4.  <name> Slugification  . . . . . . . . . . . . . . . .  10
       5.4.5.  <reference> Sorting . . . . . . . . . . . . . . . . .  10
       5.4.6.  "pn" Numbering  . . . . . . . . . . . . . . . . . . .  10
       5.4.7.  <iref> Numbering  . . . . . . . . . . . . . . . . . .  10
       5.4.8.  <xref> processing . . . . . . . . . . . . . . . . . .  11
         5.4.8.1.  "derivedContent" Insertion (With Content) . . . .  11
         5.4.8.2.  "derivedContent" Insertion (Without Content)  . .  11
       5.4.9.  <relref> Processing . . . . . . . . . . . . . . . . .  11
     5.5.  Inclusion . . . . . . . . . . . . . . . . . . . . . . . .  12
       5.5.1.  <artwork> Processing  . . . . . . . . . . . . . . . .  12
       5.5.2.  <sourcecode> Processing . . . . . . . . . . . . . . .  13
     5.6.  RFC Production Mode Cleanup . . . . . . . . . . . . . . .  14
       5.6.1.  <note> Removal  . . . . . . . . . . . . . . . . . . .  14
       5.6.2.  <cref> Removal  . . . . . . . . . . . . . . . . . . .  14
       5.6.3.  <link> Processing . . . . . . . . . . . . . . . . . .  14
       5.6.4.  XML Comment Removal . . . . . . . . . . . . . . . . .  15
       5.6.5.  "xml:base" and "originalSrc" Removal  . . . . . . . .  15
       5.6.6.  Compliance Check  . . . . . . . . . . . . . . . . . .  15
     5.7.  Finalization  . . . . . . . . . . . . . . . . . . . . . .  15
       5.7.1.  "scripts" Insertion . . . . . . . . . . . . . . . . .  15
       5.7.2.  Pretty-Format . . . . . . . . . . . . . . . . . . . .  15
   6.  Additional Uses for the Prep Tool . . . . . . . . . . . . . .  15
   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  16
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  16
   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  16
   10. Informative References  . . . . . . . . . . . . . . . . . . .  16
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  17

1.  Introduction

   For the future of the RFC format, the RFC Editor has decided that XML
   (using the XML2RFCv3 vocabulary [I-D.iab-xml2rfc]) is the canonical
   format, in the sense that it is the data that is the authorized,
   recognized, accepted, and archived version of the document.  See
   [RFC6949] for more detail on this.

   Most people, however, will read other formats, such as HTML, PDF,
   ASCII text, or formats yet to be developed.  To ensure that each of
   these formats is as similar as possible to the others and to the
   canonical XML, the translation from XML into the other formats should
   be a straightforward syntactic translation.  To make that happen, a
   good amount of data that is not there today will need to be in the
   XML format.  That data will be added by a program called the "prep
   tool", which will often run as a part of the xml2rfc process.

   This draft specifies the steps that the prep tool will have to take.
   As changes to [I-D.iab-xml2rfc] are made, this document will be
   updated.

   The details (particularly any vocabularies) described in this
   document are expected to change based on experience gained in
   implementing the RFC Production Center's (RPC) toolset.  Revised
   documents will be published capturing those changes as the toolset is
   completed.  Other implementers must not expect those changes to
   remain backwards-compatible with the details described in this
   document.

2.  v3 Prep Tool Usage Scenarios

   The prep tool will have two settings:

   o  Internet-Draft preparation

   o  Canonical RFC preparation

   There are only a few differences between the two settings.  For
   example, the boilerplate output will be different, as will the date
   output on the front page.

   Note that this only describes what the IETF-sponsored prep tool does.
   Others might create their own work-alike prep tools for their own
   formatting needs.  However, an output format developer does not need
   to change the prep tool in order to create their own formatter: they
   only need to be able to consume prepared text.  The IETF-sponsored
   prep tool runs in two different modes: "I-D" mode when the tool is
   run during Internet-Draft submission and processing, and "RFC
   production mode" when the tool is run by the RFC Production Center
   while producing an RFC.

   This tool is described as if it is a separate tool so that we can
   reason about its architectural properties.  In actual implementation,
   it might be a part of a larger suite of functionality.

3.  Internet-Draft Submission

   When the IETF draft submission tool accepts v3 XML as an input
   format, the submission tool runs the submitted file through the prep
   tool.  This is called "I-D mode" in this document.  If the tool finds
   no errors, it keeps two XML files: the submitted file and the prepped
   file.

   The prepped file provides a record of what a submitter was attesting
   to at the time of submission.  It represents a self-contained record
   of what any external references resolved to at the time of
   submission.

   The prepped file is used by the IETF formatters to create outputs
   such as HTML, PDF, and text (or the tools act in a way
   indistinguishable from this).  The message sent out by the draft
   submission tool includes a link to the original XML as well as the
   other outputs, including the prepped XML.

   The prepped XML can be used by tools not yet developed to output new
   formats that have as similar output as possible to the current IETF
   formatters.  For example, if the IETF creates a .mobi output renderer
   later, it can run that renderer on all of the prepped XML that has
   been saved, ensuring that the content of included external references
   and all of the part numbers and boilerplate will be the same as what
   was produced by the previous IETF formatters at the time the document
   was first uploaded.

4.  Canonical RFC Preparation

   During AUTH48, the RPC will run the prep tool in canonical RFC
   production mode and make the results available to the authors so they
   can see what the final output might look like.  When the document has
   passed AUTH48 review, the RPC runs the prep tool in canonical RFC
   production mode one last time, locks down the canonicalized XML, runs
   the formatters for the publication formats, and publishes all of
   those.

   This document assumes that the RPC will use the prep tool in the
   manner described above; the RPC may use a different process or a
   different configuration.

   As with Internet-Drafts, the prepped XML can be used later to
   re-render the output formats or to generate new formats.

5.  What the v3 Prep Tool Does

   The steps listed here are in order of processing.  In all cases where
   the prep tool would "add" an attribute or element, if that attribute
   or element already exists, the prep tool will check that the
   attribute or element has valid values.  If the value is incorrect,
   the prep tool will warn with the old and new values, then replace the
   incorrect value with the new value.

   Currently, the IETF uses a tool called "idnits" to check text input
   to the Internet-Drafts publishing process.  idnits indicates whether
   it encountered any errors, and also provides text with all of the
   warnings and errors in a human-readable form.  The prep tool should
   probably check for as many of these errors and warnings as possible
   when it is processing the XML input.  For the moment, tooling might
   run idnits on the text output from the prepared XML.  The list below
   contains some of those errors and warnings, but the deployed version
   of the prep tool may contain additional steps to include more of the
   checks from idnits.

5.1.  XML Sanitization

   These steps will ensure that the input document is properly
   formatted, and that all XML processing has been performed.

5.1.1.  XInclude Processing

   Process all <x:include> elements.  Note: <x:include>d XML may include
   more <x:include>s (with relative references resolved against the base
   URI potentially modified by a previously inserted xml:base
   attribute).  The tool may be configurable with a limit on the depth
   of recursion.

5.1.2.  DTD Removal

   Fully process any Document Type Definitions (DTDs) in the input
   document, then remove the DTD.  At a minimum, this entails processing
   the entity references and includes for external files.

5.1.3.  Processing Instruction Removal

   Remove processing instructions.

5.1.4.  Validity Check

   Check the input against the RNG in [I-D.iab-xml2rfc].  If the input
   is not valid, give an error.

5.1.5.  Check "anchor"

   Check all elements for "anchor" attributes.  If any "anchor"
   attribute begins with "s-", "f-", "t-", or "i-", give an error.
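
   The check above can be sketched in Python as follows.  This is a
   minimal illustration, assuming the document has been parsed with the
   standard xml.etree.ElementTree module; the function name
   find_reserved_anchors and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefixes that this step rejects; presumably they are reserved for
# identifiers that the prep tool generates itself (an inference).
RESERVED_PREFIXES = ("s-", "f-", "t-", "i-")


def find_reserved_anchors(root):
    """Return every "anchor" value that begins with a reserved prefix."""
    return [
        elem.get("anchor")
        for elem in root.iter()
        if elem.get("anchor", "").startswith(RESERVED_PREFIXES)
    ]


doc = ET.fromstring(
    "<rfc><middle>"
    '<section anchor="intro"/>'
    '<section anchor="s-overview"/>'
    '<figure anchor="f-1"/>'
    "</middle></rfc>"
)
bad_anchors = find_reserved_anchors(doc)
if bad_anchors:
    print("error: reserved anchor prefix:", bad_anchors)
```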

5.2.  Defaults

   These steps will ensure that all default values have been filled in
   to the XML, in case the defaults change at a later date.  Steps in
   this section will not overwrite existing values in the input file.

5.2.1.  "version" Insertion

   If the <rfc> element has a "version" attribute with a value other
   than "3", give an error.  If the <rfc> element has no "version"
   attribute, add one with the value "3".
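
   A minimal Python sketch of this rule, assuming an <rfc> element
   parsed with xml.etree.ElementTree.  The function name ensure_version
   is hypothetical, and raising ValueError is only one way to "give an
   error".

```python
import xml.etree.ElementTree as ET


def ensure_version(rfc):
    """Apply the "version" rule to an <rfc> element, in place."""
    version = rfc.get("version")
    if version is None:
        rfc.set("version", "3")
    elif version != "3":
        raise ValueError(f'unsupported <rfc> version "{version}"')
    return rfc
```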

5.2.2.  "seriesInfo" Insertion

   If the <front> element of the <rfc> element does not already have a
   <seriesInfo> element, add a <seriesInfo> element with the name
   attribute based on the mode in which the prep tool is running
   ("Internet-Draft" for I-D mode and "RFC" for RFC production mode)
   and a value that is the input filename minus any extension for
   Internet-Drafts, and is a number specified by the RFC Editor for
   RFCs.

5.2.3.  <date> Insertion

   If the <front> element in the <rfc> element does not contain a <date>
   element, add it and fill in the "day", "month", and "year" attributes
   from the current date.  If the <front> element in the <rfc> element
   has a <date> element with "day", "month", and "year" attributes, but
   the date indicated is more than three days in the past or is in the
   future, give a warning.  If the <front> element in the <rfc> element
   has a <date> element with some but not all of the "day", "month", and
   "year" attributes, give an error.

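   The <date> rules of Section 5.2.3 can be sketched in Python as
   follows.  This is an illustration only: the function name check_date
   is hypothetical, the "month" attribute is assumed to be numeric
   already, and a <date> element with none of the three attributes is
   left untouched because the text does not cover that case.

```python
import datetime
import xml.etree.ElementTree as ET

DATE_PARTS = ("day", "month", "year")


def check_date(front, today):
    """Apply the <date> rules to <front>; return a warning string or None."""
    date = front.find("date")
    if date is None:
        # Sketch only: a real tool must place <date> where the schema allows.
        date = ET.SubElement(front, "date")
        date.set("day", str(today.day))
        date.set("month", str(today.month))
        date.set("year", str(today.year))
        return None
    present = [part for part in DATE_PARTS if date.get(part) is not None]
    if not present:
        return None  # the text does not cover a <date> with no attributes
    if len(present) < len(DATE_PARTS):
        raise ValueError("<date> has only some of day, month, and year")
    # Assumption: "month" is already numeric (see Section 5.3.1).
    stated = datetime.date(
        int(date.get("year")), int(date.get("month")), int(date.get("day"))
    )
    age_in_days = (today - stated).days
    if age_in_days > 3 or age_in_days < 0:
        return f"<date> {stated.isoformat()} is stale or in the future"
    return None
```
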
5.2.4.  "prepTime" Insertion

   If the input document includes a "prepTime" attribute of <rfc>, exit
   with an error.

   Fill in the "prepTime" attribute of <rfc> with the current datetime.

5.2.5.  <ol> Group "start" Insertion

   Add a "start" attribute to every <ol> element that has a "group"
   attribute but does not already have a "start" attribute.

5.2.6.  Attribute Default Value Insertion

   Fill in any default values for attributes on elements, except
   "keepWithNext" and "keepWithPrevious" of <t>, and "toc" of <section>.
   Some default values can be found in the RELAX NG schema, while others
   can be found in the prose describing the elements in
   [I-D.iab-xml2rfc].

5.2.7.  Section "toc" attribute

   For each <section>, modify the "toc" attribute to be either "include"
   or "exclude":

   o  for sections that have an ancestor of <boilerplate>, use "exclude"

   o  else for sections that have a descendant that has toc="include",
      use "include".  If the section has toc="exclude" in the input,
      this is an error.

   o  else for sections that are children of a section with
      toc="exclude", use "exclude".

   o  else for sections that are deeper than rfc/@tocDepth, use
      "exclude"

   o  else use "include"
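
   The rule cascade above can be sketched in Python as follows.  This
   is one possible reading, not a definitive implementation: the
   function name resolve_toc is hypothetical, sections are processed
   top-down so that a parent's resolved value is known, an explicit
   "include" or "exclude" from the author is kept on the assumption
   that steps in Section 5.2 do not overwrite existing values, and a
   default "tocDepth" of 3 is assumed.

```python
import xml.etree.ElementTree as ET


def resolve_toc(section, depth, toc_depth, parent_toc=None,
                in_boilerplate=False):
    """Set "toc" on a <section> and, recursively, on its subsections."""
    explicit = section.get("toc")
    # Descendants are still unmodified here because parents go first.
    include_below = any(
        sub is not section and sub.get("toc") == "include"
        for sub in section.iter("section")
    )
    if in_boilerplate:
        toc = "exclude"
    elif include_below:
        if explicit == "exclude":
            raise ValueError("excluded section contains an included one")
        toc = "include"
    elif explicit in ("include", "exclude"):
        # Assumption: an explicit author value is kept as it is.
        toc = explicit
    elif parent_toc == "exclude" or depth > toc_depth:
        toc = "exclude"
    else:
        toc = "include"
    section.set("toc", toc)
    for child in section.findall("section"):
        resolve_toc(child, depth + 1, toc_depth, toc, in_boilerplate)


rfc = ET.fromstring(
    "<rfc><middle>"
    '<section anchor="a"><section anchor="b"><section anchor="c">'
    '<section anchor="d"/></section></section></section>'
    '<section anchor="e" toc="exclude"><section anchor="f"/></section>'
    "</middle></rfc>"
)
toc_depth = int(rfc.get("tocDepth", "3"))  # assumed default of 3
for top in rfc.find("middle").findall("section"):
    resolve_toc(top, 1, toc_depth)
resolved = {s.get("anchor"): s.get("toc") for s in rfc.iter("section")}
```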

5.2.8.  "removeInRFC" Warning Paragraph

   If in I-D mode, if there is a <note> or <section> element with a
   "removeInRFC" attribute that has the value "true", add a paragraph to
   the top of the element with the text "This note is to be removed
   before publishing as an RFC." or "This section...", unless a
   paragraph consisting of that exact text already exists.
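
   A minimal Python sketch of this step, assuming a tree parsed with
   xml.etree.ElementTree.  The function name add_removal_warnings is
   hypothetical, and placing the paragraph directly after a leading
   <name> child is an assumption about what "the top of the element"
   means.

```python
import xml.etree.ElementTree as ET

WARNING = "This {kind} is to be removed before publishing as an RFC."


def add_removal_warnings(root):
    """Add the warning <t> to each <note>/<section> marked removeInRFC."""
    # Iterate over a snapshot because the loop inserts new elements.
    for elem in list(root.iter()):
        if elem.tag not in ("note", "section"):
            continue
        if elem.get("removeInRFC") != "true":
            continue
        text = WARNING.format(kind=elem.tag)
        if any(t.text == text and len(t) == 0 for t in elem.findall("t")):
            continue  # the exact paragraph is already present
        para = ET.Element("t")
        para.text = text
        # Assumption: "top" means directly after a leading <name>, if any.
        index = 1 if len(elem) > 0 and elem[0].tag == "name" else 0
        elem.insert(index, para)
```

   Running the function twice leaves the document unchanged the second
   time, because the existing paragraph is detected.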

5.3.  Normalization

   These steps will ensure that ideas that can be expressed in multiple
   different ways in the input document are only found in one way in the
   prepared document.

5.3.1.  "month" Attribute

   Normalize the values of "month" attributes in all <date> elements in
   <front> elements in <rfc> elements to numeric values.
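
   A minimal Python sketch of this normalization.  The function names
   are hypothetical, and the set of accepted inputs (full English month
   names or digits) is an assumption; the text does not list the forms
   that must be recognized.

```python
import xml.etree.ElementTree as ET

MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
]


def normalize_month(value):
    """Return the month as a decimal string, for example "June" -> "6"."""
    text = value.strip()
    if text.isdigit():
        number = int(text)
    else:
        # list.index raises ValueError for anything that is not a name.
        number = MONTH_NAMES.index(text.lower()) + 1
    if not 1 <= number <= 12:
        raise ValueError(f"month out of range: {value!r}")
    return str(number)


def normalize_front_date(rfc):
    """Normalize "month" on <date> elements in the <front> of <rfc>."""
    for date in rfc.findall("front/date"):
        if date.get("month") is not None:
            date.set("month", normalize_month(date.get("month")))
```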

5.3.2.  ASCII Attribute Processing

   In every <email>, <organization>, <street>, <city>, <region>,
   <country>, and <code> element, if there is an "ascii" attribute and
   the value of that attribute is the same as the content of the
   element, remove the "ascii" attribute and issue a warning about the
   removal.

   In every <author> element, if there is an "asciiFullname",
   "asciiInitials", or "asciiSurname" attribute, check the value of
   that attribute against its matching "fullname", "initials", or
   "surname" attribute (respectively).  If the two are the same, remove
   the "ascii*" attribute and issue a warning about the removal.
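
   Both rules can be sketched in Python as follows, assuming a tree
   parsed with xml.etree.ElementTree.  The function name
   drop_redundant_ascii is hypothetical, and the warnings are simply
   collected in a list for the caller to report.

```python
import xml.etree.ElementTree as ET

ASCII_ELEMENTS = (
    "email", "organization", "street", "city", "region", "country", "code",
)
AUTHOR_PAIRS = (
    ("asciiFullname", "fullname"),
    ("asciiInitials", "initials"),
    ("asciiSurname", "surname"),
)


def drop_redundant_ascii(root):
    """Remove ASCII attributes that only repeat the non-ASCII value."""
    warnings = []
    for elem in root.iter():
        if elem.tag in ASCII_ELEMENTS:
            if "ascii" in elem.attrib and elem.get("ascii") == (elem.text or ""):
                del elem.attrib["ascii"]
                warnings.append(f'removed "ascii" from <{elem.tag}>')
        elif elem.tag == "author":
            for ascii_name, name in AUTHOR_PAIRS:
                if ascii_name not in elem.attrib:
                    continue
                if elem.get(ascii_name) == elem.get(name):
                    del elem.attrib[ascii_name]
                    warnings.append(f'removed "{ascii_name}" from <author>')
    return warnings
```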

5.3.3.  "title" Conversion

   For every <section>, <note>, <figure>, <references>, and <texttable>
   element that has a (deprecated) "title" attribute, remove the "title"
   attribute and insert a <name> element with the title from the
   attribute.
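
   A minimal Python sketch of this conversion.  The function name
   convert_titles is hypothetical.  The text does not say what to do
   when the element already has a <name> child; this sketch keeps the
   existing <name> and only drops the attribute.

```python
import xml.etree.ElementTree as ET

TITLED = ("section", "note", "figure", "references", "texttable")


def convert_titles(root):
    """Turn deprecated "title" attributes into leading <name> children."""
    # Iterate over a snapshot because the loop inserts new elements.
    for elem in list(root.iter()):
        if elem.tag not in TITLED or "title" not in elem.attrib:
            continue
        title = elem.attrib.pop("title")
        if elem.find("name") is None:
            name = ET.Element("name")
            name.text = title
            elem.insert(0, name)
```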







5.4.  Generation

   These steps will generate new content, overriding existing similar
   content in the input document.  Some of these steps are important
   enough that they specify a warning to be generated when the content
   being overwritten does not match the new content.

5.4.1.  "expiresDate" Insertion

   If in I-D mode, fill in "expiresDate" attribute of <rfc> based on the
   <date> element of the document's <front> element.

5.4.2.  <boilerplate> Insertion

   Create a <boilerplate> element if it does not exist.  If there are
   any children of the <boilerplate> element, produce a warning that
   says "Existing boilerplate being removed.  Other tools, specifically
   the draft submission tool, will treat this condition as an error" and
   remove the existing children.

5.4.2.1.  Compare <rfc> "submissionType" and <seriesInfo> "stream"

   Verify that <rfc> "submissionType" and <seriesInfo> "stream" are the
   same if they are both present.  If either is missing, add it.  Note
   that both have a default value of "IETF".

5.4.2.2.  'Status of this Memo' Insertion

   Add the "Status of this Memo" section to the <boilerplate> element
   with current values.  The application will use the "submissionType"
   and "consensus" attributes of the <rfc> element, the <workgroup>
   element, and the "status" and "stream" attributes of the <seriesInfo>
   element, to determine which [I-D.iab-rfc5741bis] boilerplate to
   include, as described in Appendix A of [I-D.iab-xml2rfc].

5.4.2.3.  Copyright Insertion

   Add the "Copyright Notice" section to the <boilerplate> element.  The
   application will use the "ipr" and "submissionType" attributes of the
   <rfc> element and the <date> element to determine which portions and
   which version of the TLP to use, as described in A.1 of
   [I-D.iab-xml2rfc].

5.4.3.  <reference> "target" Insertion

   For any <reference> element that does not already have a "target"
   attribute, fill the target attribute in if the element has one or
   more <seriesInfo> child element(s) and the "name" attribute of the
   <seriesInfo> element is "RFC", "Internet-Draft", or "DOI" or other
   value for which it is clear what the "target" should be.  The
   particular URLs for RFCs, Internet-Drafts, and DOIs for this step
   will be specified later by the RFC Editor and the IESG.  These URLs
   might also be different before and after the v3 format is adopted.

5.4.4.  <name> Slugification

   Add a "slugifiedName" attribute to each <name> element that does not
   contain one; replace the attribute if it contains a value that begins
   with "n-".
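
   A Python sketch of this step follows.  The text does not define how
   a slug is derived, so the slugify function below (lowercase, runs of
   other characters replaced by "-", prefixed with "n-") is purely an
   assumption made for illustration, as are the function names.  The
   sketch also does not try to make slugs unique within the document.

```python
import re
import xml.etree.ElementTree as ET


def slugify(text):
    """Hypothetical slug rule: "XML Sanitization" -> "n-xml-sanitization"."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return "n-" + slug


def add_slugified_names(root):
    """Add or refresh "slugifiedName" on every <name> element."""
    for name in root.iter("name"):
        current = name.get("slugifiedName")
        # Values that do not begin with "n-" are left untouched.
        if current is None or current.startswith("n-"):
            name.set("slugifiedName", slugify("".join(name.itertext())))
```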

5.4.5.  <reference> Sorting

   If the "sortRefs" attribute of the <rfc> element is true, sort the
   <reference>s and <referencegroup>s lexically by the value of the
   "anchor" attribute, as modified by the "to" attribute of any
   <displayreference> element.  The RFC Editor needs to determine what
   the rules for lexical sorting are.  The authors of this document
   acknowledge that getting consensus on this will be a difficult task.

5.4.6.  "pn" Numbering

   Add "pn" attributes for all parts.  Parts are:

   o  <section> in <middle>: pn='s-1.4.2'

   o  <references>: pn='s-12' or pn='s-12.1'

   o  <abstract>: pn='s-abstract'

   o  <note>: pn='s-note-2'

   o  <section> in <boilerplate>: pn='s-boilerplate-1'

   o  <table>: pn='t-3'

   o  <figure>: pn='f-4'

   o  <artwork>, <aside>, <blockquote>, <dt>, <li>, <sourcecode>, <t>:
      pn='p-[section]-[counter]'
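
   Part of this numbering can be sketched in Python as follows.  The
   sketch covers only <section> elements in <middle>, <figure>, and
   <table>; the other part types in the list are omitted.  The function
   names are hypothetical.

```python
import xml.etree.ElementTree as ET


def number_sections(parent, prefix=""):
    """Give nested <section> elements "pn" values such as s-1.4.2."""
    for index, section in enumerate(parent.findall("section"), start=1):
        number = f"{prefix}{index}"
        section.set("pn", f"s-{number}")
        number_sections(section, number + ".")


def number_parts(rfc):
    """Number sections in <middle>, then figures and tables."""
    middle = rfc.find("middle")
    if middle is not None:
        number_sections(middle)
    for tag, letter in (("figure", "f"), ("table", "t")):
        # Figures and tables are counted in document order.
        for count, elem in enumerate(rfc.iter(tag), start=1):
            elem.set("pn", f"{letter}-{count}")
```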

5.4.7.  <iref> Numbering

   In every <iref> element, create a document-unique "pn" attribute.
   The value of the "pn" attribute will start with 'i-', and use the
   item attribute, the subitem attribute (if it exists), and a counter
   to ensure uniqueness.  For example, the first instance of "<iref
   item='foo' subitem='bar'>" will get the "pn" value 'i-foo-bar-1'.
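
   A minimal Python sketch of this numbering.  The function name
   number_irefs is hypothetical, and the item and subitem values are
   used verbatim; whether they need to be sanitized first is not stated
   in the text.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def number_irefs(root):
    """Set a document-unique "pn" on every <iref> element."""
    seen = Counter()
    for iref in root.iter("iref"):
        parts = [iref.get("item", "")]
        if iref.get("subitem"):
            parts.append(iref.get("subitem"))
        key = "-".join(parts)
        seen[key] += 1  # one counter per item/subitem combination
        iref.set("pn", f"i-{key}-{seen[key]}")
```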

5.4.8.  <xref> processing

5.4.8.1.  "derivedContent" Insertion (With Content)

   For each <xref> element that has content, fill the "derivedContent"
   with the element content, having first trimmed the whitespace from
   ends of content text.  Issue a warning if the "derivedContent"
   attribute already exists and has a different value from what was
   being filled in.
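
   A minimal Python sketch of this case.  The function name
   fill_derived_content is hypothetical, and treating an <xref> whose
   content is only whitespace as having no content is an assumption.

```python
import xml.etree.ElementTree as ET


def fill_derived_content(root):
    """Copy trimmed <xref> content into "derivedContent"; return warnings."""
    warnings = []
    for xref in root.iter("xref"):
        content = "".join(xref.itertext()).strip()
        if not content:
            continue  # handled by the "without content" rules
        old = xref.get("derivedContent")
        if old is not None and old != content:
            warnings.append(
                f'derivedContent changed from "{old}" to "{content}"'
            )
        xref.set("derivedContent", content)
    return warnings
```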

5.4.8.2.  "derivedContent" Insertion (Without Content)

   For each <xref> element that does not have content, fill the
   "derivedContent" based on the "format" attribute.

   o  For format='counter', the "derivedContent" is the section, figure,
      table, or ordered list number of the element with anchor equal to
      the xref target.

   o  For format='default', if the "target" attribute points to a
      <reference> or <referencegroup> element, the "derivedContent" is
      the value of the "target" attribute (or the "to" attribute of a
      <displayreference> element for the targeted <reference>).

   o  For format='default', if the "target" attribute points to a
      <section>, <figure>, or <table>, the "derivedContent" is the name
      of the thing pointed to, such as "Section 2.3", "Figure 12", or
      "Table 4".

   o  For format='title', if the target is a <reference> element, the
      "derivedContent" attribute is the name of the reference, extracted
      from the <title> child of the <front> child of the reference.

   o  For format='title', if the target element has a <name> child
      element, the "derivedContent" attribute is the text content of
      that <name> element concatenated with the text content of each
      descendant node of <name> (that is, stripping out all of the XML
      markup, leaving only the text).

   o  For format='title', if the target element does not contain a
      <name> child element, the "derivedContent" attribute is the value
      of the "target" attribute with no other adornment.

   Issue a warning if the "derivedContent" attribute already exists and
   has a different value from what was being filled in.

5.4.9.  <relref> Processing

   If any <relref> element's "target" attribute refers to anything but a
   <reference> element, give an error.

   For each <relref> element, fill in the "derivedLink" attribute.

5.5.  Inclusion

   These steps will include external files into the output document.

5.5.1.  <artwork> Processing

   1.  If an <artwork> element has a "src" attribute where no scheme is
       specified, copy the "src" attribute value to the "originalSrc"
       attribute, and replace the "src" value with a URI that uses the
       "file:" scheme in a path relative to the file being processed.
       See Section 8 for warnings about this step.  This will likely be
       one of the most common authoring approaches.

   2.  If an <artwork> element has a "src" attribute with a "file:"
       scheme, and if processing the URL would cause the processor to
       retrieve a file that is not in the same directory, or a
       subdirectory, as the file being processed, give an error.  If the
       "src" has any shellmeta strings (such as "`", "$USER", and so on)
       that would be processed, give an error.  Replace the "src"
       attribute with a URI that uses the "file:" scheme in a path
       relative to the file being processed.  This rule attempts to
       prevent <artwork src='file:///etc/passwd'> and similar security
       issues.  See Section 8 for warnings about this step.

   3.  If an <artwork> element has a "src" attribute, and the element
       has content, give an error.

   4.  If an <artwork> element has type='svg' and there is a "src"
       attribute, the data needs to be moved into the content of the
       <artwork> element.

       *  If the "src" URI scheme is "data:", fill the content of the
          <artwork> element with that data and remove the "src"
          attribute.

       *  If the "src" URI scheme is "file:", "http:", or "https:", fill
          the content of the <artwork> element with the resolved XML
          from the URI in the "src" attribute.  If there is no
          "originalSrc" attribute, add an "originalSrc" attribute with
          the value of the URI and remove the "src" attribute.

       *  If the <artwork> element has an "alt" attribute, and the SVG
          does not have a <desc> element, add the <desc> element with
          the contents of the "alt" attribute.

   5.  If an <artwork> element has type='binary-art', the data needs to
       be in a "src" attribute with a URI scheme of "data:".  If the
       "src" URI scheme is "file:", "http:", or "https:", resolve the
       URL.  Replace the "src" attribute with a "data:" URI, and add an
       "originalSrc" attribute with the value of the URI.  For the
       "http:" and "https:" URI schemes, the mediatype of the "data:"
       URI will be the Content-Type of the HTTP response.  For the
       "file:" URI scheme, the mediatype of the "data:" URI needs to be
       guessed with heuristics (this is possibly a bad idea).  This also
       fails for content that includes binary images but uses a type
       other than "binary-art".  Note: since this feature can't be used
       for RFCs at the moment, this entire feature might be de-
       prioritized.

   6.  If an <artwork> element does not have type='svg' or type='binary-
       art' and there is a "src" attribute, the data needs to be moved
       into the content of the <artwork> element.  Note that this step
       assumes that all of the preferred types other than "binary-art"
       are text, which is possibly wrong.

       *  If the "src" URI scheme is "data:", fill the content of the
          <artwork> element with the correctly-escaped form of that data
          and remove the "src" attribute.

       *  If the "src" URI scheme is "file:", "http:", or "https:", fill
          the content of the <artwork> element with the correctly-
          escaped form of the resolved text from the URI in the "src"
          attribute.  If there is no "originalSrc" attribute, add an
          "originalSrc" attribute with the value of the URI and remove
          the "src" attribute.
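
   The path checks in steps 1 and 2 above can be sketched in Python as
   follows.  This is an illustration under stated assumptions: the
   function name check_src and the SHELL_META character set are
   hypothetical, base_dir must be an absolute POSIX path, and symbolic
   links are not resolved, so a real tool would need further checks.

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Assumed set of shell metacharacters; the text names only "`" and "$".
SHELL_META = set("`$|;&<>(){}*?!~\\")


def check_src(src, base_dir):
    """Return src as a path relative to base_dir, or raise ValueError."""
    parts = urlsplit(src)
    if parts.scheme not in ("", "file"):
        raise ValueError(f"not a local reference: {src}")
    path = unquote(parts.path)
    if any(ch in SHELL_META for ch in path):
        raise ValueError(f"shell metacharacter in src: {src}")
    base = posixpath.normpath(base_dir)
    # An absolute path in src replaces base entirely in posixpath.join.
    resolved = posixpath.normpath(posixpath.join(base, path))
    if posixpath.commonpath([base, resolved]) != base:
        raise ValueError(f"src escapes the document directory: {src}")
    return posixpath.relpath(resolved, base)
```

   For example, check_src("figures/a.svg", "/home/alice/draft") returns
   "figures/a.svg", while "file:///etc/passwd" and "../secret.txt" are
   both rejected.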

5.5.2.  <sourcecode> Processing

   1.  If a <sourcecode> element has a "src" attribute where no scheme
       is specified, copy the "src" attribute value to the "originalSrc"
       attribute, and replace the "src" value with a URI that uses the
       "file:" scheme in a path relative to the file being processed.
       See Section 8 for warnings about this step.  This will likely be
       one of the most common authoring approaches.

   2.  If a <sourcecode> element has a "src" attribute with a "file:"
       scheme, and if processing the URL would cause the processor to
       retrieve a file that is not in the same directory, or a
       subdirectory, as the file being processed, give an error.  If the
       "src" has any shellmeta strings (such as "`", "$USER", and so on)
       that would be processed, give an error.  Replace the "src"
       attribute with a URI that uses the "file:" scheme in a path
       relative to the file being processed.  This rule attempts to
       prevent <sourcecode src='file:///etc/passwd'> and similar
       security issues.  See Section 8 for warnings about this step.

   3.  If a <sourcecode> element has a "src" attribute, and the element
       has content, give an error.

   4.  If a <sourcecode> element has a "src" attribute, the data needs
       to be moved into the content of the <sourcecode> element.

       *  If the "src" URI scheme is "data:", fill the content of the
          <sourcecode> element with that data and remove the "src"
          attribute.

       *  If the "src" URI scheme is "file:", "http:", or "https:", fill
          the content of the <sourcecode> element with the resolved XML
          from the URI in the "src" attribute.  If there is no
          "originalSrc" attribute, add an "originalSrc" attribute with
          the value of the URI and remove the "src" attribute.

5.6.  RFC Production Mode Cleanup

   These steps provide extra cleanup of the output document in RFC
   production mode.

5.6.1.  <note> Removal

   If in RFC production mode, if there is a <note> or <section> element
   with a "removeInRFC" attribute that has the value "true", remove the
   element.

5.6.2.  <cref> Removal

   If in RFC production mode, remove all <cref> elements.
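
   The two removal steps above can be sketched in Python as follows,
   assuming a tree parsed with xml.etree.ElementTree.  The function
   names are hypothetical.  Because ElementTree stores the text that
   follows an element in that element's "tail", the sketch moves the
   tail to the preceding node so that text after an inline <cref> is
   not lost.

```python
import xml.etree.ElementTree as ET


def remove_child(parent, child):
    """Remove child but keep the text that followed it (its tail)."""
    index = list(parent).index(child)
    if child.tail:
        if index == 0:
            parent.text = (parent.text or "") + child.tail
        else:
            previous = parent[index - 1]
            previous.tail = (previous.tail or "") + child.tail
    parent.remove(child)


def is_removable(elem):
    """True for <cref> and for <note>/<section> marked removeInRFC."""
    if elem.tag == "cref":
        return True
    return elem.tag in ("note", "section") and elem.get("removeInRFC") == "true"


def clean_for_rfc(root):
    """Apply the <note>/<section> and <cref> removal steps in place."""
    # Iterate over snapshots because the loop removes elements.
    for parent in list(root.iter()):
        for child in list(parent):
            if is_removable(child):
                remove_child(parent, child)
```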

5.6.3.  <link> Processing

   1.  If in RFC production mode, remove all <link> elements whose "rel"
       attribute has the value "alternate".

   2.  If in RFC production mode, check if there is a <link> element
       with the current ISSN for the RFC series (2070-1721); if not, add
       <link rel="item" href="urn:issn:2070-1721">.

   3.  If in RFC production mode, check if there is a <link> element
       with a DOI for this RFC; if not, add one of the form <link
       rel="describedBy" href="https://dx.doi.org/10.17487/rfcdd"> where
       "dd" is the number of the RFC, such as
       "https://dx.doi.org/10.17487/rfc2109".  The URI is described in
       [RFC7669].  If there was already a <link> element with a DOI for
       this RFC, check that the "href" value has the right format.




   4.  If in RFC production mode, check if there is a <link> element
       with the file name of the Internet-Draft that became this RFC,
       of the form <link rel="convertedFrom"
       href="https://datatracker.ietf.org/doc/draft-tttttttttt/">.  If
       one does not exist, give an error.
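
   The four steps above might be sketched as follows.  This is an
   illustration under stated assumptions, not a definitive
   implementation: it assumes that <link> elements are the first
   children of <rfc>, that a link "with a DOI" is one whose "href"
   contains "doi.org/", and that the "right format" means an exact
   match with the URI built from the RFC number.  PrepError and
   process_links are hypothetical names.

```python
import xml.etree.ElementTree as ET

ISSN_HREF = "urn:issn:2070-1721"


class PrepError(Exception):
    """Hypothetical error type for a failed prep step."""


def process_links(rfc, rfc_number):
    """Apply steps 1 through 4 to the <rfc> element, in place."""
    # Step 1: remove links whose "rel" is "alternate".
    for link in rfc.findall("link"):
        if link.get("rel") == "alternate":
            rfc.remove(link)

    def add_link(rel, href):
        # Assumption: links come first, so append after the last one.
        position = len(rfc.findall("link"))
        rfc.insert(position, ET.Element("link", {"rel": rel, "href": href}))

    hrefs = [link.get("href", "") for link in rfc.findall("link")]

    # Step 2: make sure the ISSN link is present.
    if ISSN_HREF not in hrefs:
        add_link("item", ISSN_HREF)

    # Step 3: add the DOI link, or check the one already there.
    expected = "https://dx.doi.org/10.17487/rfc%d" % rfc_number
    doi_hrefs = [href for href in hrefs if "doi.org/" in href]
    if not doi_hrefs:
        add_link("describedBy", expected)
    elif any(href != expected for href in doi_hrefs):
        raise PrepError("DOI link does not have the expected format")

    # Step 4: the "convertedFrom" link is required.
    if not any(link.get("rel") == "convertedFrom"
               for link in rfc.findall("link")):
        raise PrepError("missing convertedFrom link")
    return rfc
```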

5.6.4.  XML Comment Removal

   If in RFC production mode, remove XML comments.

5.6.5.  "xml:base" and "originalSrc" Removal

   If in RFC production mode, remove all "xml:base" or "originalSrc"
   attributes from all elements.
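
   A minimal sketch of this step, using Python's
   xml.etree.ElementTree, is shown below.  ElementTree exposes
   "xml:base" under its namespace-qualified name, which is why the
   constant looks different from the attribute as written in the XML.
   The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree's internal name for the "xml:base" attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def strip_prep_attributes(root):
    """Remove "xml:base" and "originalSrc" from every element, in place."""
    for elem in root.iter():
        elem.attrib.pop(XML_BASE, None)
        elem.attrib.pop("originalSrc", None)
    return root
```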

5.6.6.  Compliance Check

   If in RFC production mode, ensure that the result is in full
   compliance with the v3 schema, without any deprecated elements or
   attributes, and give an error if any issues are found.

5.7.  Finalization

   These steps provide the finishing touches on the output document.

5.7.1.  "scripts" Insertion

   Determine all the characters used in the document, and fill in the
   "scripts" attribute for <rfc>.

5.7.2.  Pretty-Format

   Pretty-format the XML output.  (Note: there are many tools that do an
   adequate job.)
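
   The two finalization steps might be sketched as follows.  This is an
   illustration only.  Python's standard library has no lookup for the
   Unicode script of a character, so the sketch uses a very small
   hand-written table (SCRIPT_RANGES) that covers basic Latin, Greek,
   and Cyrillic letters, treats other ASCII characters as "Common", and
   raises an error for anything else.  The comma-separated value format
   for "scripts" is an assumption, and all function names are
   hypothetical.  ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

# Assumption: a tiny illustrative subset of the Unicode script data.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x00C0, 0x00D6, "Latin"),
    (0x00D8, 0x00F6, "Latin"),
    (0x00F8, 0x024F, "Latin"),
    (0x0391, 0x03A1, "Greek"),
    (0x03A3, 0x03A9, "Greek"),
    (0x03B1, 0x03C9, "Greek"),
    (0x0400, 0x0481, "Cyrillic"),
]


def script_of(ch):
    """Return the script name for one character, per the table above."""
    code_point = ord(ch)
    for low, high, name in SCRIPT_RANGES:
        if low <= code_point <= high:
            return name
    if code_point < 0x80:
        return "Common"
    raise LookupError("U+%04X is outside this sketch's table" % code_point)


def fill_scripts(rfc):
    """Set "scripts" on <rfc> from the characters used in the tree."""
    chars = set()
    for elem in rfc.iter():
        chars.update(elem.text or "")
        chars.update(elem.tail or "")
        for value in elem.attrib.values():
            chars.update(value)
    rfc.set("scripts", ",".join(sorted({script_of(c) for c in chars})))
    return rfc


def finalize(xml_text):
    """Fill in "scripts", then pretty-format and serialize the result."""
    rfc = ET.fromstring(xml_text)
    fill_scripts(rfc)
    ET.indent(rfc, space="  ")
    return ET.tostring(rfc, encoding="unicode")
```

   Note that ET.indent can add whitespace after inline elements in
   mixed content, so a real tool would need to treat elements such as
   <t> more carefully than this sketch does.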

6.  Additional Uses for the Prep Tool

   There will be a need for Internet-Draft authors who include files
   from their local disk (such as for <artwork src="mydrawing.svg"/>) to
   have the contents of those files inlined into their drafts before
   submitting them to the Internet-Draft processor.  (There is a
   possibility that the Internet-Draft processor will allow XML files
   and accompanying files to be submitted at the same time, but this
   seems troublesome from a security, portability, and complexity
   standpoint.)  For these users, having a local copy of the prep tool
   that has an option to just inline all local files would be very
   useful.  That option would be a proper subset of the steps given in
   Section 5.



   A feature that might be useful in a local prep tool would be the
   inverse of the "just inline" option: "extract all".  This would
   allow a user who has a v3 RFC or Internet-Draft to dump all of the
   <artwork> and <sourcecode> elements into local files instead of
   having to find each one in the XML.  This option might even do as
   much validation as possible on the extracted <sourcecode> elements.
   This feature might also remove some of the features added by the prep
   tool (such as part numbers and "slugifiedName" attributes starting
   with "n-") in order to make the resulting file easier to edit.
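
   A possible shape for the "extract all" option is sketched below.
   This is only an illustration: the function name and the fallback
   file-naming scheme are hypothetical, and the sketch handles text
   content only, so SVG artwork (which is XML markup rather than text)
   is out of scope.  It returns a mapping from file name to content and
   leaves writing the files to the caller.

```python
import os
import xml.etree.ElementTree as ET


def extract_all(root):
    """Return {file name: text} for each <artwork> and <sourcecode>."""
    extracted = {}
    counter = 0
    for elem in root.iter():
        if elem.tag not in ("artwork", "sourcecode"):
            continue
        counter += 1
        # Use only the last path component of "name", so that a crafted
        # name cannot point outside the output directory.
        name = os.path.basename(elem.get("name") or "")
        if not name:
            name = "%s-%d.txt" % (elem.tag, counter)
        extracted[name] = "".join(elem.itertext())
    return extracted
```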

7.  IANA Considerations

   None.

8.  Security Considerations

   Steps in this document attempt to prevent the <artwork> and
   <sourcecode> elements from exposing the contents of files outside the
   directory in which the document being processed resides.  For
   example, values starting with "/", "./", or "../" should generate
   errors.
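
   One way to implement such a check is sketched below.  It is a
   minimal illustration, assuming that "src" has already been reduced
   to a local path; the function name is hypothetical.  In addition to
   rejecting the prefixes listed above, it resolves the path and
   confirms that the result is still inside the document's directory,
   which also catches "../" segments that appear later in the value.

```python
import os

FORBIDDEN_PREFIXES = ("/", "./", "../")


def check_src_path(base_dir, path):
    """Return the resolved path, or raise ValueError if it is unsafe."""
    if path.startswith(FORBIDDEN_PREFIXES):
        raise ValueError("src path has a forbidden prefix: %r" % path)
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, path))
    # The resolved target must stay at or below the base directory.
    if os.path.commonpath([base, target]) != base:
        raise ValueError("src path escapes the document directory")
    return target
```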

   The security considerations in [RFC3470] apply here.  Specifically,
   processing XML external references can expose a prep tool
   implementation to various threats by causing the implementation to
   access external resources automatically.  It is important to disallow
   arbitrary access to such external references within XML data from
   untrusted sources.

9.  Acknowledgements

   Many people contributed valuable ideas to this document.  Special
   thanks go to Robert Sparks for his in-depth review and contributions
   early in the development of this document, and to Julian Reschke for
   his help getting the document structured more clearly.

10.  Informative References

   [I-D.iab-rfc5741bis]
              Halpern, J., Daigle, L., and O. Kolkman, "On RFC Streams,
              Headers, and Boilerplates", draft-iab-rfc5741bis-02 (work
              in progress), February 2016.

   [I-D.iab-xml2rfc]
              Hoffman, P., "The "xml2rfc" version 3 Vocabulary", draft-
              iab-xml2rfc-04 (work in progress), June 2016.





   [RFC3470]  Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for
              the Use of Extensible Markup Language (XML) within IETF
              Protocols", BCP 70, RFC 3470, DOI 10.17487/RFC3470,
              January 2003, <http://www.rfc-editor.org/info/rfc3470>.

   [RFC6949]  Flanagan, H. and N. Brownlee, "RFC Series Format
              Requirements and Future Development", RFC 6949,
              DOI 10.17487/RFC6949, May 2013,
              <http://www.rfc-editor.org/info/rfc6949>.

   [RFC7669]  Levine, J., "Assigning Digital Object Identifiers to
              RFCs", RFC 7669, DOI 10.17487/RFC7669, October 2015,
              <http://www.rfc-editor.org/info/rfc7669>.

Authors' Addresses

   Paul Hoffman
   ICANN

   Email: paul.hoffman@icann.org


   Joe Hildebrand
   Cisco

   Email: jhildebr@cisco.com