Applications Area Working Group                               S. Leonard
Internet-Draft                                             Penango, Inc.
Intended Status: Informational                         December 16, 2014
Expires: June 19, 2015

                      The text/markdown Media Type
                  draft-ietf-appsawg-text-markdown-04

Abstract

   This document registers the text/markdown media type for use with
   Markdown, a family of plain text formatting syntaxes that optionally
   can be converted to formal markup languages such as HTML.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. This Is Markdown! Or: Markup and Its Discontents
     1.2. Markdown Is About Writing and Editing
     1.3. RFC 2119 Definitions
   2. Markdown Media Type Registration Application
   3. Optional Parameters
     3.1. syntax
     3.2. output-type
   4. Fragment Identifiers
     4.1. General-Purpose Fragment Identifiers
     4.2. Parameters
   5. Example
   6. IANA Considerations
     6.1. Markdown Variants
     6.2. Reserved Identifiers
     6.3. Standard of Review
     6.4. Provisional Registration
   7. Security Considerations
   8. References
     8.1. Normative References
     8.2. Informative References
   Appendix A.  Change Log
   Author's Address

1. Introduction

1.1. This Is Markdown! Or: Markup and Its Discontents

   In computer systems, textual data is stored and processed using a
   continuum of techniques. On the one end is plain text: a linear
   sequence of characters in some character set (code), possibly
   interrupted by line breaks, page breaks, or other control characters.
   The repertoire of these control characters (a form of in-band
   signaling) is necessarily limited, and not particularly extensible.
   Because they are non-printing, these characters are also hard to
   enter with standard keyboards.

   Markup offers an alternative means to encode this signaling
   information by overloading certain characters with additional
   meanings. Therefore, markup languages allow for annotating a document
   in such a way that annotations are syntactically distinguishable from
   the printing information. Markup languages are (reasonably) well-
   specified and tend to follow (mostly) standardized syntax rules.
   Examples of formal markup languages include SGML, HTML, XML, and
   LaTeX. Standardized rules lead to interoperability between markup
   processors, but impose skill requirements on new users that lead to
   markup languages becoming less accessible to beginners. These rules
   also reify "validity": content that does not conform to the rules is
   treated differently (i.e., is rejected) than content that conforms.

   In contrast to formal markup languages, lightweight markup languages
   use simple syntaxes; they are designed to be easy for humans to enter
   and understand with basic text editors. Markdown, the subject of this
   document, began as an /informal/ plain text formatting syntax
   [MDSYNTAX] and Perl script HTML/XHTML processor [MARKDOWN] targeted
   at non-technical users using unspecialized tools, such as plain text
   e-mail clients. [MDSYNTAX] explicitly rejects the notion of validity:
   there is no such thing as "invalid" Markdown. If the Markdown content
   does not result in the "right" output (defined as output that the
   author wants, not output that adheres to some dictated system of
   rules), the expectation is that the author should continue
   experimenting by changing the content or the processor to achieve the
   desired output.

   Since its development in 2004 [MARKDOWN], a number of web- and
   Internet-facing applications have incorporated Markdown into their
   text entry systems, frequently with custom extensions. Markdown has
   thus evolved into a kind of Internet meme [INETMEME] as different
   communities encounter it and adapt the syntax for their specific use
   cases. Markdown now represents a family of related plain text
   formatting syntaxes and implementations that, while broadly
   compatible with humans [HUMANE], are intended to produce different
   kinds of outputs that push the boundaries of mutual intelligibility
   between software systems.

   To support identifying and conveying Markdown, this document defines
   a media type and parameters that indicate the author's intent on how
   to interpret the Markdown. This registration draws particular
   inspiration from text/troff [RFC4263], which is a plain text
   formatting syntax for typesetting based on tools from the 1960s
   ("RUNOFF") and 1970s ("nroff", et al.). In that sense, Markdown is a
   kind of troff for modern computing. A companion document [MDMTUSES]
   provides additional Markdown background and philosophy.

1.2. Markdown Is About Writing and Editing

     "HTML is a *publishing* format; Markdown is a *writing* format.
      Thus, Markdown's formatting syntax only addresses issues
      that can be conveyed in plain text." [MDSYNTAX]

   The paradigmatic use case for text/markdown is the Markdown editor:
   an application that presents Markdown content (which looks like an e-
   mail or other piece of plain text writing) alongside a published
   format, so that an author can see results instantaneously and can
   tweak his or her input in real-time. A significant number of Markdown
   editors have adopted "split-screen view" (or "live preview")
   technology that looks like Figure 1:

+----------------------------------------------------------------------+
| File  Edit  (Cloud Stuff)  (Fork Me on GitHub)  Help                 |
+----------------------------------------------------------------------+
| [ such-and-such identifier ]                 [ useful statistics]    |
+----------------------------------++----------------------------------+
| (plain text, with                || (text/html, likely               |
|  syntax highlighting)            ||  rendered to screen)             |
|                                  ||                                  |
|# Introduction                    ||<h1>Introduction</h1>             |
|                                  ||                                  |
|## Markdown Is About Writing and  /|<h2>Markdown Is About Writing and |
/ Editing                          ||Editing</h2>                      |
|                                  ||                                  |
|> HTML is a *publishing* format;  ||<blockquote><p>HTML is a          |
|> Markdown is a *writing* format. || <em>publishing</em> format;      |
|> Thus, Markdown's formatting     || Markdown is a <em>writing</em>   |
|> syntax only addresses issues    || format. Thus, Markdown's         |
|> that can be conveyed in plain   <> formatting syntax only addresses |
|> text. [MDSYNTAX][]              || issues that can be conveyed in   |
|                                  || plain text. <a href="http://darin/
|The paradigmatic use case for     |/gfireball.net/projects/markdown/sy/
|`text/markdown` is the Markdown   |/ntax#html" title="Markdown: Syntax/
|editor: an application that       |/: HTML">MDSYNTAX</a>              |
|presents Markdown content         ||</p></blockquote>                 |
|...                               ||                                  |
|                                  ||<p>The paradigmatic use case for  |
|[MDSYNTAX]: http://daringfireball./| <code>text/markdown</code> is the|
/net/projects/markdown/syntax#html || Markdown editor: an application  |
|"Markdown: Syntax: HTML"          || that presents Markdown content   |
|                                  || ...</p>                          |
+----------------------------------++----------------------------------+

 LEGEND: "/" embedded in a vertical line represents a line-continuation
  marker, since a line break is not supposed to occur in that content.

          Figure 1: Markdown Split-Screen/Live Preview Editor

   Users on diverse platforms SHOULD be able to collaborate with their
   tools of choice, whether those tools are desktop-based (MarkdownPad,
   MultiMarkdown Composer), browser-based (Dillinger, Markable),
   integrated widgets (Discourse, GitHub), general-purpose editors
   (emacs, vi), or plain old "Notepad". Additionally, users SHOULD be
   able to identify particular areas of Markdown content when the
   Markdown becomes appreciably large (e.g., book chapters and
   Internet-Drafts--not just blog posts). Users SHOULD be able to use
   text/markdown to convey their works in progress, not just their
   finished products (for which full-blown markups ranging from
   text/html to application/pdf are appropriate). This registration
   facilitates interoperability between these Markdown editors by
   conveying the syntax of the particular Markdown variant and the
   desired output format.

1.3. RFC 2119 Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Since Markdown signifies a family of related formats with varying
   degrees of formal documentation and implementation, this
   specification uses the term "variant" to identify such formats.

2. Markdown Media Type Registration Application

   This section provides the media type registration application for the
   text/markdown media type (see [RFC6838], Section 5.6).

    Type name: text

    Subtype name: markdown

    Required parameters:

     charset: Per Section 4.2.1 of [RFC6838], charset is REQUIRED. There
       is no default value. [MDSYNTAX] clearly describes Markdown as a
       writing format; its syntax rules operate on characters
       (specifically, on punctuation) rather than code points. Neither
       [MDSYNTAX] nor many popular implementations at the time of this
       registration actually require or assume any particular encoding.
       Many Markdown processors will get along just fine by operating on
       character codes that lie in printable US-ASCII, blissfully
       oblivious to coded values outside of that range.

    Optional parameters:

     The following parameters reflect the author's intent regarding the
     content. A detailed specification can be found in Section 3.

     syntax: The Markdown-derivative syntax of the content, with
       optional version and named extensions. Default value: none
       (receiver's choice).

     output-type: The Content-Type (Internet media type) of the output,
       with optional parameters. Default value: "text/html".

     variant: An optional identifier that serves as a "hint" to the
       recipient of the specific Markdown variant that the author
       intended. When omitted, there is no hint; the interpretation is
       entirely up to the receiver and context. This identifier is plain
       US-ASCII and case-insensitive. To promote interoperability,
       identifiers MAY be registered in the registry defined in Section
       6. If a receiver does not recognize the variant identifier, the
       receiver MAY present the identifier to a user to inform him or
       her of it.

     Other parameters MAY be included with the media type. The variant
     SHOULD define the semantics of such parameters. Additionally, the
     variant MAY be registered under another media type; this
     text/markdown registration does not preclude other registrations.

    Encoding considerations: Text.

    Security considerations:

     Markdown interpreted as plain text is relatively harmless. A text
     editor need only display the text. The editor SHOULD take care to
     handle control characters appropriately, and to limit the effect of
     the Markdown to the text editing area itself; malicious Unicode-
     based Markdown could, for example, surreptitiously change the
     directionality of the text. An editor for normal text would already
     take these control characters into consideration, however.

     Markdown interpreted as a precursor to other formats, such as HTML,
     carries all of the security considerations as the target formats.
     For example, HTML can contain instructions to execute scripts,
     redirect the user to other webpages, download remote content, and
     upload personally identifiable information. Markdown also can
     contain islands of formal markup, such as HTML. These islands of
     formal markup may be passed as-is, transformed, or ignored (perhaps
     because the islands are conditional or incompatible) when the
     Markdown is processed. Since Markdown may have different
     interpretations depending on the tool and the environment, a better
     approach is to analyze (and sanitize or block) the output markup,
     rather than attempting to analyze the Markdown.

     Security provides a significant motivator for the output-type
     parameter. Most Markdown processors emit byte (octet) streams.
     Without a well-defined means for a Markdown processor to pass
     metadata onwards, it is perilous for post-processing to assume that
     the content is always HTML or XHTML. A processor might emit
     PostScript (application/postscript) content, for example, in which
     case an HTML sanitizer would fail to excise dangerous instructions.

   Interoperability considerations:

     Markdown syntaxes are designed to be broadly compatible with humans
     ("humane"), but not necessarily with each other. Therefore, syntax
     in one Markdown derivative may be ignored or treated differently in
     another derivative. The overall effect is a general degradation of
     the output, proportional to the quantity of syntax-specific
     Markdown used in the text. When it is desirable to reflect the
     author's intent in the output, stick with the syntax identified in
     the syntax parameter.

   Published specification: This specification; [MDSYNTAX].

   Applications that use this media type:

     Markdown conversion tools, Markdown WYSIWYG editors, and plain text
     editors and viewers; markup processor targets indirectly use
     Markdown (e.g., web browsers for Markdown converted to HTML).

   Fragment identifier considerations:

     Markdown content acts as a "bridge" between plain text and formal
     markup, so this specification permits fragment identifiers [[NB:
     used to be #i]] #t for the [[NB: used to be input]] source text and
     #o for the output content. The #l and #ldef fragment identifiers
     identify link references. A detailed specification can be found in
     Section 4.

   Additional information:

     Magic number(s): None
     File extension(s): .md, .markdown
     Macintosh file type code(s):
       TEXT. A uniform type identifier (UTI) of
       "net.daringfireball.markdown", which conforms to "public.plain-
       text", is RECOMMENDED [MDUTI]. Additionally, implementations
       SHOULD record syntax and output-type parameters along with the
       Markdown, such as in extended attributes; however, the exact
       manner of storage is a local matter.

   Person & email address to contact for further information:

     Sean Leonard <dev+ietf@seantek.com>

   Restrictions on usage: None.

   Author/Change controller: Sean Leonard <dev+ietf@seantek.com>

   Intended usage: COMMON

   Provisional registration? No
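
   As an illustration of the registration above, the following Python
   sketch labels a MIME entity as text/markdown with the REQUIRED
   charset parameter and one optional parameter. It relies only on the
   standard-library "email" package. The sample body and the parameter
   value "Original" are assumptions chosen for illustration; this
   registration does not mandate them.

```python
from email.message import EmailMessage

# Hypothetical Markdown body used only for this sketch.
markdown_source = "# Introduction\n\nHello, *world*.\n"

msg = EmailMessage()
# charset is REQUIRED by the registration, so it is always stated.
msg.set_content(markdown_source, subtype="markdown", charset="utf-8")
# Optional parameter carrying the author's intent (illustrative value).
msg.set_param("variant", "Original")

content_type = msg.get_content_type()
charset = msg.get_param("charset")
variant = msg.get_param("variant")
```

   A receiver can read the same values back with get_content_type()
   and get_param(), as the last three lines do.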

3.  Optional Parameters

   The optional parameters "syntax" and "output-type" can be used by an
   author to indicate the author's intent regarding how the Markdown
   ought to be processed.

   All identifiers are case-sensitive; receivers MUST compare for exact
   equality. At the same time, identifiers MUST NOT be registered in the
   IANA registry (see Section 6) if another registration differs only in
   the casing, as these registrations may cause confusion.
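
   The two rules in the preceding paragraph (exact comparison by
   receivers, and no registrations that differ only in casing) can be
   sketched in Python as follows. The class name, method names, and
   in-memory storage are hypothetical; this is a toy stand-in for a
   registry, not a description of how IANA operates one.

```python
class SyntaxRegistry:
    """Toy in-memory registry; all names here are hypothetical."""

    def __init__(self):
        self._entries = {}

    def register(self, identifier, description):
        # Refuse an identifier that matches an existing one when case
        # is ignored (this also rejects exact duplicates).
        folded = identifier.casefold()
        for existing in self._entries:
            if existing.casefold() == folded:
                raise ValueError(
                    f"{identifier!r} conflicts with {existing!r}")
        self._entries[identifier] = description

    def lookup(self, identifier):
        # Receivers compare for exact equality: no case folding here.
        return self._entries.get(identifier)


registry = SyntaxRegistry()
registry.register("Original", "Syntax described in [MDSYNTAX]")
```

   With this sketch, looking up "original" finds nothing, while an
   attempt to register "ORIGINAL" is refused.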

   The following ABNF definitions are used in this section:

          EXTCHAR  = <any character outside the printable US-ASCII
                      range, essentially amounting to Unicode code
                      points less than U+0020 or greater than U+007E
                      without requiring Unicode or any particular
                      encoding>

          REXTCHAR = <EXTCHAR without separators (Z category) or
                      control characters (C category)>

                  Figure X: ABNF Used in This Section

   The discussion in this section presumes that the parameter values are
   discrete strings. When encoded in protocols such as MIME [RFC2045],
   however, the value strings MUST be escaped properly. [MDMTUSES]
   provides some strategies to preserve this information when it leaves
   the domain of IETF protocols.

   [[NB: OMITTED from this draft. This section may be replaced with
   Content-Disposition: ... preview-type=...]]

4. Fragment Identifiers

   Many types of content (such as HTML or PDF) that are output from a
   Markdown processor will have well-defined fragment identifier
   semantics associated with such content (such as named anchors or page
   numbers, respectively). However, the original [MDSYNTAX] neither
   defines a syntax for naming such content parts, nor associates such
   parts with fragment identifiers. Several variants have since defined
   such content parts, making them suitable for use with fragment
   identifiers.

4.1. General-Purpose Fragment Identifiers

   A Markdown fragment identifier is a sequence of characters that
   identifies some area of Markdown content. Each Markdown variant can
   formally define a syntax for such fragment identifiers. (In
   practice, identifiers that are similar to HTML's anchors are used by
   many variants, usually by surrounding the identifier with "{#" and
   "}" and placing the production at the end of a line that comprises
   particular kinds of content, such as a header, table, or image.)
   [[NB: citation necessary to PHP Markdown Extra as an exemplary
   syntax?]]

   When encoded in a URI, the production SHALL conform to the fragment
   production of [RFC3986] (specifically: pchar, "/", and "?"
   characters). Characters that are outside of that production SHALL be
   percent-encoded. The character set for percent-encoded octets SHALL
   be the same as the Markdown content, i.e., identified by the charset
   parameter or by other contextual means. Variants are free to specify
   how fragment identifiers are compared. In the absence of a variant-
   specific rule, fragment identifiers SHOULD be considered case-
   sensitive, which maintains consistency with HTML. [[NB: citation
   necessary to HTML4/HTML5?]]

   At least the first equals sign "=" SHOULD be percent-encoded to
   prevent ambiguity as described in the following section.

4.2. Parameters

   Similar to application/pdf [RFC3778] and text/plain [RFC5147], this
   registration permits a parameter syntax for fragment identifiers. The
   syntax is a parameter name, the equals sign "=" (which MUST NOT be
   percent-encoded), and a parameter value. To the extent that multiple
   parameters can appear in a fragment production, the parameters SHALL
   be separated by the ampersand "&" (which MUST NOT be percent-
   encoded).

   The only parameter defined in this registration is "line", which has
   the same meaning as in [RFC5147] (i.e., counting is zero-based). For
   example: "#line=10" identifies the eleventh line of Markdown input.
   Implementers should take heed that different environments and
   character sets may have a wide range of code sequences to divide
   lines.

   Markdown variants are free to define additional parameters.

   [[NB: This draft does not import all of text/plain's fragment
   identifier schemes, mainly because the utility of the other schemes
   is

3.1. syntax

   The syntax parameter indicates the Markdown-derivative syntax in
   which the author composed the content, without regard to any
   particular implementation. With reference to the "paradigmatic use
   case" (i.e., collaborative Markdown editing) in Section 1.2, the
   syntax parameter primarily affects the "left-hand" side of a Markdown
   editor. The entire parameter is case-sensitive.

   Syntaxes other than [MDSYNTAX] extend the original rules in some way.
   These extensions fall into broad categories: clarifying ambiguities
   in [MDSYNTAX], adding brand new features, repurposing [MDSYNTAX] for
   completely new use cases, and adding metadata or other structured
   data blocks. Occasionally new syntaxes directly contradict [MDSYNTAX]
   based on seasoned experience.

   A syntax identifier is composed of two or more characters excluding
   (Unicode) separators, control characters, the hyphen-minus "-",
   quotation marks """, and angle brackets "<" and ">"; however, ASCII
   characters alone SHOULD be used. To promote interoperability, only
   registered syntaxes are permissible. An IANA registry of syntaxes
   will be created as discussed in Section 6.

   When omitted, the default value is unspecified, which means that the
   syntax interpretation is up to the receiver. However, the receiver
   SHOULD NOT "guess" based on content-sniffing, as this methodology is
   error-prone. Generators SHOULD always specify a syntax, whether
   explicitly or by context in embedding protocols or formats. All
   implementations MUST support the syntax value "Original", with the
   meaning covered in Section 6. Generators MUST omit the parameter
   rather than transmitting an empty string (""); the empty string is a
   syntax error per the ABNF below. The full ABNF of the syntax
   parameter is:

      syntax-param     = syntax-id [ "-" version ]
                         *( 1*WSP extension ) *WSP

      syntax-id        = 2*sid-char

      version          = 1*sid-char

      sid-char         = %d33 / %d35-44 / %d46-59 / %d61 /
                         %d63-126 / REXTCHAR

      extension        = ext-name [ ":" ( ext-string / ext-uri ) ]

      ext-name         = 1*( %d33 / %d35-57 / %d59 / %d61 /
                             %d63-126 / REXTCHAR )

      ext-string       = ext-quoted [ ext-string ] /
                           ( ext-safe-char / ">" )
                           *( ext-safe-char / "<" / ">" / ext-quoted )

      ext-safe-char    = %d33 / %d35-59 / %d61 / %d63-126 / REXTCHAR
        ; [[NB: Could be EXTCHAR ? depends on how we feel about Unicode
        ; high-order separators]]

      ext-quoted       = DQUOTE *eqcontent DQUOTE

      ext-uri          = "<" URI-reference ">"         ; from [RFC3986]

      eqcontent        = %d0-33 / %d35-127 / EXTCHAR / DQUOTE DQUOTE

                 Figure X: ABNF of the syntax parameter

3.1.1. syntax version

   For better precision, an author MAY include the syntax version. The
   version is delimited from the syntax identifier with a hyphen-minus
   "-" and has the same repertoire as the syntax identifier. The version
   string itself is an opaque string of at least one character. Version
   strings (e.g., "2.0", "3.0.5") are registered and updated along with
   the syntax registration. Updates to syntax registrations SHOULD only
   add new versions when those new versions have a material difference
   on the interpretation of the Markdown content. If a syntax has a
   version "2014.10" and a version "2014.11", for example, but "2014.11"
   only fixes typos in the specification, the registration SHOULD NOT
   separately register the "2014.11" version. The repertoire of the
   version string is the same as the syntax identifier (and like the
   processor identifier, ASCII characters alone SHOULD be used).

   A receiver that recognizes the syntax but not the version MAY use any
   version of the syntax, preferably the latest version.
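
   A minimal Python sketch of the version fallback described above
   follows. It assumes the receiver keeps a table of the versions it
   understands, ordered oldest to newest; the table, the syntax name
   "ExampleDown", and the function name are hypothetical. Because a
   syntax identifier cannot contain a hyphen-minus, the first "-"
   separates the identifier from the version.

```python
# Hypothetical table of syntaxes this receiver understands, with the
# versions it knows listed from oldest to newest.
KNOWN_VERSIONS = {
    "Original": [],
    "ExampleDown": ["1.0", "2.0"],
}


def resolve_syntax(head):
    """Return (syntax_id, version) to use, or None if unrecognized.

    `head` is the first token of the syntax parameter, for example
    "ExampleDown-2.0" or just "ExampleDown".
    """
    syntax_id, sep, version = head.partition("-")
    versions = KNOWN_VERSIONS.get(syntax_id)  # exact, case-sensitive
    if versions is None:
        return None
    if sep and version in versions:
        return syntax_id, version
    # Version missing or unrecognized: prefer the latest known one.
    return syntax_id, (versions[-1] if versions else None)
```

   For example, resolve_syntax("ExampleDown-9.9") falls back to the
   latest known version of the hypothetical "ExampleDown" syntax.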

3.1.2. syntax extensions

   Some Markdown syntaxes are self-contained, with no options. However,
   others have optional rules or features that may be applied with
   discretion. For those syntax systems where optional rules are an
   integral feature, the author MAY indicate that those named extensions
   be applied in a whitespace-separated list. The syntax for extensions
   derives in significant part from pandoc [PANDOC].

   All extensions for a particular syntax are to be registered as part
   of the syntax registration in Section 6.

   An extension identifier is composed of any sequence of characters
   excluding (Unicode) separators, control characters, the colon ":",
   quotation marks """, and angle brackets "<" and ">"; however,
   lowercase ASCII letters and the underscore "_" alone SHOULD be used,
   where the underscore SHOULD NOT be at the beginning or end.

   When present, an extension is "enabled", "enabled, with string", or
   "enabled, with URI". When absent, an extension is "disabled". An
   extension can have different semantics depending on whether a string
   or URI is supplied. For example, an extension "bullet" could specify
   whether and how to render bulleted lists. "Disabled" could mean
   "bulleted" lists do not have bullets; "enabled" could mean that the
   bullet is some default character; "enabled, with string" could mean
   that the string is used as the bullet; finally, "enabled, with URI"
   could mean that the image identified by URI is used as the bullet.

3.1.2.1. Enabled, with String

   According to the ABNF above, extensions are delimited by whitespace.
   Quotation marks are used to support zero-length strings, whitespace
   or quotation marks in a single string, or strings where the first
   character is "<". If a quotation mark appears anywhere in the string,
   the following text is considered quoted; two successive quotation
   marks "" within quoted text mean one quotation mark in the string. A
   single quotation mark ends the quoting. Generators MUST NOT generate
   unterminated quoted strings; however, parsers SHOULD treat an
   unterminated quoted string as if it were terminated. Because of this
   rule, quotation marks do not have to appear at the termini of a
   string; embedded quotation marks start (and end) quoting within a
   single argument. For example:
      a""b
   means:
      ab
   for the actual argument. In spite of this relaxed positioning rule,
   for human readability generators SHOULD quote the entire string in
   lieu of embedding quoted sub-strings.
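
   The quoting rules above can be sketched in Python as follows. This
   is one possible reading of the prose, not a reference parser: the
   function names are hypothetical, ext-uri values are not handled, and
   an unterminated quoted string is treated as if it were terminated.

```python
def split_extensions(raw):
    """Split on whitespace outside quoted text; quotes are kept."""
    tokens, current, in_quotes = [], [], False
    for ch in raw:
        if ch == '"':
            # A doubled quote toggles twice, so the state stays right.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch in " \t" and not in_quotes:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def unquote_ext_string(ext_string):
    """Resolve quoting in one ext-string to the actual argument."""
    out, in_quotes, i = [], False, 0
    while i < len(ext_string):
        ch = ext_string[i]
        if ch != '"':
            out.append(ch)
        elif not in_quotes:
            in_quotes = True               # quotation mark starts quoting
        elif ext_string[i + 1:i + 2] == '"':
            out.append('"')                # "" inside quotes is one "
            i += 1
        else:
            in_quotes = False              # single mark ends quoting
        i += 1
    return "".join(out)
```

   A caller would first split the extension list, then separate each
   token at its first colon and unquote the part after it.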

3.1.2.2. Enabled, with URI

   Certain syntaxes can take supplementary content, such as metadata,
   from other resources. To support these workflows, an extension can
   use the URI delimiters "<" and ">" to signal a URI, such as a cid: or
   mid: URL [RFC2392] in the context of MIME messages. The URI MUST
   comply with [RFC3986], and MAY be a relative reference if the subject
   Markdown content has a base URI. The charset parameter specifies the
   character encoding that is relevant to the URI's semantics (to the
   extent that the URI needs it).

3.2. output-type

   The output-type parameter indicates the Internet media type (and
   parameters) of the output from the processor. With reference to the
   "paradigmatic use case" (i.e., collaborative Markdown editing) in
   Section 1.2, the output-type parameter primarily affects the "right-
   hand" side of a Markdown editor.

   When omitted, the default value is "text/html". Implementations
   SHOULD anticipate and support HTML (text/html) and XHTML
   (application/xhtml+xml) output, to the extent that a syntax targets
   those markup languages.

   The default value of text/html ought to be suitable for the majority
   of current purposes. However, Markdown is increasingly becoming
   integral to workflows where HTML is not the target output; examples
   range from TeX, to PDF, to OPML, and even to entire e-books (e.g.,
   [PANDOC]). Anticipated output types for a particular syntax are to be
   registered as part of the syntax registration in Section 6.

3.2.1. Value Format and Semantics

   The value of output-type is an Internet media type with optional
   parameters. The syntax (including case sensitivity considerations) is
   the same as specified in [RFC2045] for the Content-Type header (with
   updates over time), namely:

          type "/" subtype *(";" parameter)
                          ; Matching of media type and subtype
                          ; is ALWAYS case-insensitive.

              Figure X: Content-Type ABNF (from [RFC2045])

   The Internet media type in the output-type parameter MUST be
   observed.

   Although arbitrary parameters may be passed along with the Internet
   media type, receivers are under no obligation to honor or interpret
   them in any particular way. For example, the parameter value
   "text/plain; format=flowed; charset=ISO-2022-JP" obligates the
   receiver to output text/plain (and to treat the output as plain text:
   no sneaking in or labeling the output as HTML!). In contrast, such a
   parameter value neither obligates the receiver to follow [RFC3676]
   (for flowed output) nor to output ISO-2022-JP Japanese character
   encoding (see [RFC1468]).
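
   The following Python sketch shows one way a receiver might split an
   output-type value into its media type and parameters. It reuses the
   Content-Type parsing of the standard-library "email" package; the
   function name is hypothetical. Only the media type is binding on the
   receiver, so the parameters are returned for optional use.

```python
from email.message import Message


def parse_output_type(value=None):
    """Return (media_type, params) for an output-type value."""
    if value is None:
        value = "text/html"          # default when the parameter is omitted
    msg = Message()
    msg["Content-Type"] = value
    # get_content_type() lowercases the type and subtype, which suits
    # case-insensitive matching. It falls back to "text/plain" when
    # the value is malformed.
    media_type = msg.get_content_type()
    # The first item of get_params() is the media type itself.
    params = {k.lower(): v for k, v in (msg.get_params() or [])[1:]}
    return media_type, params
```

   A receiver may then honor or ignore entries such as "format" and
   "charset" while still producing output of the stated media type.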

   The output-type parameter does not distinguish between fragment
   content and whole-document content. A Markdown processor MAY (and
   typically will) output HTML or XHTML fragment content, without
   preambles or postambles such as <!DOCTYPE>, <html>, <head>, </head>,
   <body>, </body>, or </html> elements. Receivers MUST be aware of this
   behavior and take appropriate precautions. Whether a processor emits
   a fragment or a whole document is best addressed in syntax
   specifications, either as part of the syntax or by a syntax
   extension.

3.2.2. text/markdown Special Value

   The author may specify the output-type "text/markdown", which has a
   special meaning. "text/markdown" means that the author does not want
   to invoke Markdown processing at all: the receiver SHOULD view the
   Markdown source as-is.

   This output-type is not the default because one generally assumes
   that Markdown is meant for composing rather than reading: readers
   expect to see the output format (or dual-display of the output and
   the Markdown). However, if authors are collaboratively editing a
   document or are discussing Markdown, "text/markdown" may make sense.
   Furthermore, "text/markdown" differs from "text/plain" in that
   "text/plain" encompasses a wide range of characters and formatting
   techniques (in Unicode, examples include bullet points, roman
   numerals, unambiguous line and paragraph separators, and interlinear
   annotation). While the optional parameter output-type may be used
   recursively (as a sneaky way to stash the author's follow-on or
   secondary intent), receivers are not obligated to recognize it;
   optional parameters internal to output-type MAY be ignored.
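
   One way a receiver might act on the special value is sketched
   below in Python. The function name and the DEFAULT_OUTPUT_TYPE
   constant are hypothetical; because the text above uses SHOULD, a
   real receiver may still choose differently.

```python
DEFAULT_OUTPUT_TYPE = "text/html"  # default stated for output-type


def wants_markdown_processing(output_type=None):
    """Return False when the author asked to see the source as-is."""
    value = output_type or DEFAULT_OUTPUT_TYPE
    # Parameters inside output-type may be ignored, so only the
    # media type itself is inspected (case-insensitively).
    media_type = value.split(";", 1)[0].strip().lower()
    return media_type != "text/markdown"
```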

4. Fragment Identifiers

4.1. #t

   [[NB: This section used to say: The fragment #i refers to the content
   input into a Markdown processor, which for purposes of this fragment
   identifier, MUST be treated as plain text (text/plain).]]

   The fragment #t refers to the Markdown content treated as plain text
   (text/plain). A specific area of the text can be identified with a
   text/plain sub-fragment identifier (e.g., [RFC5147] or its
   successors) delimited by a second "#" character. For example:
   #t#line=10 identifies the eleventh line of Markdown input.
   Implementers should take heed that the "char" scheme counts by
   characters rather than octets (or, for that matter, code points);
   thus proper interpretation of the charset parameter is REQUIRED for
   interoperability of the "char" scheme. For example, "character" and
   "code point" are NOT synonymous in the Unicode Standard.

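   The difference between octets, code points, and characters can be
   illustrated with a short Python sketch. Counting the code points
   that are not combining marks is only a rough stand-in for
   user-perceived characters, used here as an assumption; it is not a
   full grapheme segmentation algorithm.

```python
import unicodedata

# "resume" written with two combining acute accents (U+0301).
text = "re\u0301sume\u0301"

octets = len(text.encode("utf-8"))   # bytes on the wire in UTF-8
code_points = len(text)              # Python str length
# Approximation: treat each non-combining code point as one character.
characters = sum(1 for ch in text if not unicodedata.combining(ch))

print(octets, code_points, characters)  # prints "10 8 6"
```

   The three counts disagree, so an offset in the "char" scheme lands
   in different places depending on which unit an implementation
   counts.
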
4.2. #o

   The fragment #o refers to the content output from a Markdown
   processor, which is governed by the output-type parameter. A specific
   area of the output can be identified with a sub-fragment identifier
   delimited by a second "#" character. The encoding and semantics of
   sub-fragment identifiers are also governed by the output-type
   parameter. Examples: when the output-type is text/html [RFC2854],
   #o#section6 identifies the named anchor "section6" specified by the
   input that the Markdown processor converts to <a
   name=section6>...</a>. When the output-type is application/pdf
   [RFC3778], #o#page=6 causes the sixth page to open.

   When the output-type is "text/markdown" (regardless of parameters),
   the #o fragment identifier has no semantics; generators MUST use #t
   in lieu of #o.
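
   The two-level structure of these fragment identifiers can be
   sketched in Python as follows. Both function names are
   hypothetical, and the sketch assumes the leading "#" may or may
   not be present in the input.

```python
def split_fragment(fragment):
    """Split '#o#page=6' into ('o', 'page=6')."""
    if fragment.startswith("#"):
        fragment = fragment[1:]
    # The sub-fragment is delimited by a second "#" character.
    prefix, _, sub = fragment.partition("#")
    return prefix, (sub or None)


def generator_prefix(output_type):
    """Pick #t instead of #o when the output-type is text/markdown."""
    media_type = output_type.split(";", 1)[0].strip().lower()
    return "t" if media_type == "text/markdown" else "o"
```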

4.3. #l and #ldef

   The fragment prefix #l refers to links by their link identifiers. The
   sub-component of this identifier is delimited by a second "#"
   character, followed by the encoded link identifier, optionally
   followed by a 1-based index number. Without the index number, the
   fragment refers to all such identified links. Example: #l#eS matches
   links such as "The rain in [Spain][ES]" and "The word [es][] means
   'is' in Spanish." #l#es#2 only matches the second instance of the
   "es" link identifier.

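   A minimal Python sketch of #l matching follows. It assumes that
   the link identifiers have already been extracted from the Markdown
   content in document order; the function name match_links and the
   use of str.casefold() for case-insensitive comparison are
   illustrative choices, not requirements of this document.

```python
from urllib.parse import unquote


def match_links(fragment, link_ids):
    """Return positions in link_ids selected by an 'l#id[#n]' fragment."""
    parts = fragment.lstrip("#").split("#")
    if parts[0] != "l" or len(parts) not in (2, 3):
        raise ValueError("not an #l fragment: %r" % fragment)
    wanted = unquote(parts[1]).casefold()
    hits = [i for i, ident in enumerate(link_ids)
            if ident.casefold() == wanted]
    if len(parts) == 3:
        n = int(parts[2])  # 1-based index number
        return hits[n - 1:n] if n >= 1 else []
    return hits  # no index number: all such identified links


ids = ["ES", "foo", "es"]   # e.g. [Spain][ES], [Hooray!][foo], [es][]
print(match_links("#l#eS", ids))    # prints "[0, 2]"
print(match_links("#l#es#2", ids))  # prints "[2]"
```
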
   The fragment prefix #ldef refers to link reference definitions. The
   sub-component of this identifier is delimited by a second "#"
   character, followed by the encoded link identifier. There is no index
   number; in the case of multiple link reference definitions, the last
   definition wins.

   Both the #l and #ldef REQUIRE that "#" characters be percent-encoded
   if they are part of the link identifier. The percent-encoding of
   other characters follows the regular rules of [RFC3986]. [MDSYNTAX]
   states that identifiers (or names) "may consist of letters, numbers,
   spaces, and punctuation--but they are NOT case sensitive." Characters
   outside of the URI character set SHALL be percent-encoded with the
   same encoding as the Markdown content. For maximum compatibility and
   readability, authors who intend to reference links in fragment
   identifiers SHOULD limit themselves to URI characters that do not
   require percent-encoding.

4.4. Other Fragment Identifiers

   Specific syntaxes may define additional fragment identifiers specific
   to the syntax. For example, a syntax that incorporates "header"
   information might consider #h to refer to the "header" part, and #b
   to refer to the "body" part.

5.  Example

   The following is an example of Markdown as an e-mail attachment:

    MIME-Version: 1.0
    Content-Type: text/markdown; charset=UTF-8; syntax=Original;
     output-type="application/xhtml+xml"
    Content-Disposition: attachment; filename=readme.md

    Sample HTML 4 Markdown
    =============

    This is some sample Markdown. [Hooray!][foo]
    (Remember that link identifiers are not case-sensitive.)

    Bulleted Lists
    -------

    Here are some bulleted lists...

    * One Potato
    * Two Potato
    * Three Potato

    - One Tomato
    - Two Tomato
    - Three Tomato

    More Information
    -----------

    [.markdown, .md](http://daringfireball.net/projects/markdown/)
    has more information.

    [fOo]: http://example.com/loc 'Will Not Work with Markdown.pl-1.0.1'
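
   As an illustration only, the header fields of the example above
   can be read with Python's standard email package. The message text
   is abbreviated to its first body lines, and the variable names are
   arbitrary.

```python
from email import message_from_string

RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/markdown; charset=UTF-8; syntax=Original;\n"
    ' output-type="application/xhtml+xml"\n'
    "Content-Disposition: attachment; filename=readme.md\n"
    "\n"
    "Sample HTML 4 Markdown\n"
    "=============\n"
)

msg = message_from_string(RAW)
content_type = msg.get_content_type()       # media type, lowercased
syntax = msg.get_param("syntax")            # syntax parameter
output_type = msg.get_param("output-type")  # quotes are removed
filename = msg.get_filename()               # from Content-Disposition
```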

6.  IANA Considerations

   IANA is asked to register the media type text/markdown in the
   Standards tree using the application provided in Section 2 of this
   document.

   IANA is also asked to establish a subtype registry called "Markdown
   Syntaxes". Each entry in this registry shall consist of a syntax
   identifier and information about the syntax, as follows:

6.1. Syntax Template

      {if provisional}
      PROVISIONAL REGISTRATION EXPIRES [YYYY-MM-DD date format]

      Identifier: [Identifier]

      Description: [Concise, prose description of the syntax, with
                    emphasis on its purpose and notable variations
                    from [MDSYNTAX] or another syntax. If the syntax
                    permits structured data, this fact ought to be
                    included. Other Markdown syntaxes may be referenced
                    by quoting their registered identifiers.]

      Documentation: [References to documentation.]

      Community of Use: [Concise, prose description of the
                         community of use, such as
                         "scholarly publications" or "screenwriting".
                         "General" may be entered if the community
                         encompasses general users of the Internet.]
                        [[TODO: Users (screenwriters) or use cases
                          (screenwriting)?]]

      [[NB: Should Versions: and Extensions: be {optional} and
        therefore omittable, or should they have "None." to
        indicate that no versions or extensions apply?]]
      Versions:
       {for each version}
        Identifier: [Identifier]
        Description: [Optional, concise, prose description of the
          version. "N/A" SHALL be used to indicate no description.]

      Extensions:
       {for each extension}
        Identifier: [Identifier]
        Syntax:
         {if Enabled}
          Enabled
         {if Enabled, with String}
          Enabled, with String: [prose description of what the
                                 string is (not what it does)]
         {if Enabled, with URI}
          Enabled, with URI: [prose description of what the URI
                              is (not what it does)]
        Description: [Concise, prose description of the extension,
                      i.e., what it does.]
        Documentation: [References to documentation.]

      Anticipated Output Types:
       {for each output-type}
        [media type]
          {optional} [prose description of parameter considerations]

      {optional}
      Additional Fragment Identifiers:
       [Prose description of additional fragment identifiers,
        sufficient for interoperability.]

      Responsible Parties:
       {for each party}
        ([type: individual, corporate, representative])
        [Name] <contact info 1>...<contact info n>

      Currently Maintained? [Yes/No]

      {optional}
      Implementations:
       {for each implementation}
        Name: [Name]
        Version(s): [Significant version or versions that
                     implement the syntax]

        Type: ["Processor" or some other type]
        References: <contact info 1>...<contact info n>
        Purpose: [Concise, prose description of the implementation.]

   A responsible party can be an individual author or maintainer, a
   corporate author or maintainer (plus an individual contact), or a
   representative of a community of interest dedicated to the Markdown
   syntax.

   The Versions, Extensions, Additional Fragment Identifiers, and
   Implementations sections are optional.

6.2. Initial Registration

   The registry shall have the following initial registration;
   implementations conforming to this document MUST handle this syntax.
   [MDMTUSES] provides additional exemplary syntaxes.

      Identifier: Original

      Description: Gruber's original Markdown syntax.

      Documentation:
        [MDSYNTAX]. For the "2004" version, the documentation is
        provided in HTML and in Markdown, as follows:
        syntax:      Content-Type: text/html; charset=UTF-8
                     Accessed at October 12, 2014 8:27 PM (-0700)
                     38570 bytes
                     SHA-256 hash: B2EC2A62 3257F164 FBC88AE8 C7E76F3F
                                   80F16845 105D9F3E 3E8CE25B 6F0CB33B

        syntax.text: Content-Type: text/plain; charset=UTF-8
                                   (actually text/markdown;
                                    syntax=Original;
                                    output-type="text/markdown")
                     Accessed at October 12, 2014 8:27 PM (-0700)
                     27784 bytes
                     SHA-256 hash: 01A6A07A F51838E1 8749454B 06D716BC
                                   B1BC0EAA A21B67B7 D6FB5A6B 4FFB5D5B

      Community of Use: General.

      Versions:
        Identifier: 2004
        Description: [MDSYNTAX] as it (is rumored to have) existed
                     since December 14, 2004, corresponding to
                     Markdown.pl 1.0.1. The version "2004" SHOULD NOT
                     be specified until further notice; it is only
                     documented for completeness (in case Gruber
                     revises the syntax with material contradictions).

      Anticipated Output Types:
        text/html
        application/xhtml+xml

      Responsible Parties:
       (individual) John Gruber <http://daringfireball.net/>
                                <comments@daringfireball.net>

      Currently Maintained? No

      Implementations:
       Name: Markdown.pl
       Version(s): 1.0.1, 1.0.2b8
       Type: Processor
       References: [MARKDOWN]
       Purpose: Converts Markdown to HTML or XHTML circa 2004. The
                argument "--html4tags" causes HTML output.

6.3. Reserved Identifiers

   The registry SHALL have the following identifiers RESERVED. No one is
   allowed to register them (or any case variations of them).
      Standard
      Common
      Markdown

6.4. Standard of Review

   Registrations are made on a First-Come, First-Served [RFC5226] basis
   by anyone with a need to interoperate. While documentation is
   required, any level of documentation is sufficient; thus, neither
   Specification Required nor Expert Review are warranted. The checks
   prescribed by this section can be performed automatically.

   Syntax, version, and extension identifiers MUST comply with the
   syntaxes specified in this document. Additionally, the identifier
   MUST NOT differ from other registered identifiers merely by case.
   Identifiers MUST conform to [[TODO: PRECIS? STRINGPREP?]]. The
   purpose of this requirement is to eliminate confusingly similar
   identifiers, placing the burden on the registration process rather
   than on syntax parameter parsers.
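
   Because these checks are mechanical, they can be sketched in a few
   lines of Python. The function name may_register is hypothetical,
   "GFM" is only an example identifier, and the sketch covers just
   the reserved-identifier and case-collision checks, not reference
   verification.

```python
RESERVED = {"standard", "common", "markdown"}


def may_register(identifier, registered):
    """Check an identifier against reserved and existing entries."""
    folded = identifier.lower()
    if folded in RESERVED:
        return False  # reserved, including any case variation
    # Reject identifiers that differ from an existing one only by case.
    return all(folded != existing.lower() for existing in registered)


print(may_register("GFM", ["Original"]))       # prints "True"
print(may_register("ORIGINAL", ["Original"]))  # prints "False"
print(may_register("Markdown", []))            # prints "False"
```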

   All references (including contact information) MUST be verified as
   functional at the time of the registration.

   If a registration is being updated, the contact information MUST
   either match the prior registration and be verified, or the prior
   registrant MUST confirm that the updating registrant has authority to
   update the registration. As a special "escape valve", registrations
   can be updated with IETF Review [RFC5226]. [[NB: Two purposes: 1) to
   deal with "harmful" registrations (stale references are not a
   sufficient justification); 2) to deal with registrations that are
   IETF registrations, like RFC-related Markdown (but this could be
   handled by listing the IETF as the contact organization, right?).]]
   All fields may be updated except the syntax identifier, which is
   permanent: not even case may be changed.

6.5. Provisional Registration

   Any registrant may make a provisional registration to reserve a
   syntax identifier. Provisional registrations include the ALL-CAPS
   legend as shown in Section 6.1. All fields are optional except for
   the syntax identifier and contact information. Provisional
   registrations expire after three months, after which time the syntax
   identifier may be reused.

7. Security Considerations

   See the Security considerations entry in Section 2.

8. References

8.1. Normative References

   [MARKDOWN] Gruber, J., "Daring Fireball: Markdown", December 2004,
              <http://daringfireball.net/projects/markdown/>.

   [MDSYNTAX] Gruber, J., "Daring Fireball: Markdown Syntax
              Documentation", December 2004,
              <http://daringfireball.net/projects/markdown/syntax>.

   [MDUTI]    Gruber, J., "Daring Fireball: Uniform Type Identifier for
              Markdown", August 2011,
              <http://daringfireball.net/linked/2011/08/05/markdown-
              uti>.

   [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part One: Format of Internet Message
              Bodies", RFC 2045, November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2854]  Connolly, D. and L. Masinter, "The 'text/html' Media
              Type", RFC 2854, June 2000.

   [RFC3778]  Taft, E., Pravetz, J., Zilles, S., and L. Masinter, "The
              application/pdf Media Type", RFC 3778, May 2004.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66, RFC
              3986, January 2005.

   [RFC5147]  Wilde, E. and M. Duerst, "URI Fragment Identifiers for the
              text/plain Media Type", RFC 5147, April 2008.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", RFC 5226, May 2008.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              October 2008.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13, RFC
              6838, January 2013.

8.2. Informative References

   [HUMANE]   Atwood, J., "Is HTML a Humane Markup Language?", May 2008,
              <http://blog.codinghorror.com/is-html-a-humane-markup-
              language/>.

   [INETMEME] Solon, O., "Richard Dawkins on the internet's hijacking of
              the word 'meme'", June 2013,
              <http://www.wired.co.uk/news/archive/2013-06/20/richard-
              dawkins-memes>, <http://www.webcitation.org/6HzDGE9Go>.

   [MDMTUSES] Leonard, S., "text/markdown Use Cases", draft-seantek-
              text-markdown-use-cases-00 (work in progress), October
              2014.

   [PANDOC]   MacFarlane, J., "Pandoc", 2014,
              <http://johnmacfarlane.net/pandoc/>.

   [RAILFROG] Railfrog Team, "Railfrog", April 2009,
              <http://railfrog.com/>.

   [RFC1468]  Murai, J., Crispin, M., and E. van der Poel, "Japanese
              Character Encoding for Internet Messages", RFC 1468, June
              1993.

   [RFC2392]  Levinson, E., "Content-ID and Message-ID Uniform Resource
              Locators", RFC 2392, August 1998.

   [RFC3676]  Gellens, R., "The Text/Plain Format and DelSp Parameters",
              RFC 3676, February 2004.

   [RFC4263]  Lilly, B., "Media Subtype Registration for Media Type
              text/troff", RFC 4263, January 2006.

   [FOUNTAIN] Maschwitz, S. and J. August, "Fountain | A markup language
              for screenwriting.", 2014, <http://fountain.io/>.

   [FTSYNTAX] Maschwitz, S. and J. August, "Syntax - Fountain | A markup
              language for screenwriting.", 1.1, March 2014,
              <http://fountain.io/syntax>.

Appendix A.  Change Log

   This draft is a continuation from draft-ietf-appsawg-text-markdown-
   02.txt. These technical changes were made:

      1.  Proposed that the document be split into two documents: the
          main document (which is normative), and a second document. The
          second document (draft-seantek-text-markdown-use-cases-00)
          [MDMTUSES] provides additional background information,
          suggestions for preserving metadata, registration templates
          for common Markdown syntaxes, and examples for common Markdown
          syntaxes. RFC 2119 key words are not included in draft-
          seantek-text-markdown-use-cases because this content is not
          normative (at least, not as normative) compared with the main
          document.
      2.  De-emphasized Unicode (and UTF-8 encoding) after close
          consideration of the original [MDSYNTAX], and the various
          proposed extensions to Markdown in the intervening time.
          "CommonMark", for example, places stronger emphasis on
          Unicode (and UTF-8).
      3.  Deleted processor parameter.
      4.  Renamed flavor parameter to syntax parameter.
      5.  Renamed "rules" to "extensions" in the syntax parameter.
      6.  Parameterized "extensions" so that it can have a string or a
          URI.
      7.  Simplified the syntax parameter (compared to draft-02, in any
          event) with fewer exceptional cases in the ABNF.
      8.  Rewrote significant parts of the output-type parameter, and
          gave text/markdown additional explanation.
      9.  Rewrote the introduction so that it is much shorter.
      10. Moved the example towards the end.
      11. Added Fragment Identifier Considerations.
      12. Consolidated the Security Considerations into the registration
          template.
      13. Rewrote the IANA Considerations section so that it only
          creates one new registry.
      14. Redefined the flavors registry (now called the Markdown
          Syntaxes registry).
      15. Rewrote the "Original" syntax registration to conform to the
          new registration template.
      16. Added a discussion and example of the Paradigmatic Use Case
          (Markdown Editors).

Author's Address

   Sean Leonard
   Penango, Inc.
   5900 Wilshire Boulevard
   21st Floor
   Los Angeles, CA  90036
   USA

   EMail: dev+ietf@seantek.com
   URI:   http://www.penango.com/