[Docs] [txt|pdf] [Tracker] [Email] [Nits]

Versions: 00 01 02 03 04 05 RFC 4180

Network Working Group                                    Y. Shafranovich
Internet-Draft                            SolidMatrix Technologies, Inc.
Expires: August 6, 2005                                 February 2, 2005


                        MIME Type for CSV Files
                   draft-shafranovich-mime-csv-00.txt

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 6, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document defines MIME types "text/csv" and
   "text/comma-separated-values"  which used for Comma-Separated Values
   (CSV) files.







Shafranovich             Expires August 6, 2005                 [Page 1]

Internet-Draft           MIME Type for CSV Files           February 2005


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  MIME Type Registration of text/csv and
       text/comma-separated-values  . . . . . . . . . . . . . . . . .  3
   3.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  4
   4.  Security Considerations  . . . . . . . . . . . . . . . . . . .  5
   5.  References . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     5.1   Normative References . . . . . . . . . . . . . . . . . . .  5
     5.2   Informative References . . . . . . . . . . . . . . . . . .  5
       Author's Address . . . . . . . . . . . . . . . . . . . . . . .  5
   A.  Appendix A - Discussion of the CSV format  . . . . . . . . . .  6
       Intellectual Property and Copyright Statements . . . . . . . .  8






































Shafranovich             Expires August 6, 2005                 [Page 2]

Internet-Draft           MIME Type for CSV Files           February 2005


1.  Introduction

   The comma separated values format (CSV) has been used for exchanging
   and converting data between various spreadsheet programs for quite
   some time.  Surprisingly, while this file is very common it has never
   been formally documented.  Additionally, while the IANA MIME
   registration tree includes a registraton for
   "text/tab-separated-values" type, no MIME types have ever been
   registered with IANA for CSV.  At the same time, various programs and
   operating systems have begun to use different MIME types for this
   format, many of which vary from system to system.  This document
   seeks to formally register two MIME types for CSV in accordance with
   RFC 2048 [4].

2.  MIME Type Registration of text/csv and text/comma-separated-values

   This section provides the media-type registration application (as per
   RFC 2048 [4], which will be submitted to IANA after IESG approval of
   this document.

   To: ietf-types@iana.org

   Subject: Registration of MIME media types text/csv and
   text/comma-separated-values

   MIME media type name: text

   MIME subtype name: csv, comma-separated-values

   Required parameters: none

   Optional parameters: charset

      Common usage of CSV is US-ASCII, but other character sets as
      defined by IANA for the "text" tree may be used.

   Encoding considerations:

      While section 4.1.1.  of RFC 2046 [1] stipulates that "text"
      subtypes MUST use a CRLF sequence as a line break, in practice
      that is not always true for CSV.  Therefore, implementors should
      be aware that either CR or CRLF maybe used as a line break for
      this format.

   Security considerations:

      CSV files contain passive text data which should not pose any
      risks.  However, it is possible in theory that malicious binary



Shafranovich             Expires August 6, 2005                 [Page 3]

Internet-Draft           MIME Type for CSV Files           February 2005


      data maybe included in order to exploit potential buffer overruns
      in the program processing CSV data.  Additionally, private data
      maybe shared via this format which of course applies to any text
      data.

   Interoperability considerations:

      Due to lack of a single specification there are considerable
      differences among different implementations as described in
      appendix A.  The most common difference among various format is
      whether double quotes (") are used to enclose strings.
      Implementors should "be conservative in what you do, be liberal in
      what you accept from others" (RFC 793 [2]) when processing CSV
      files.

   Published specification:

      While numerous private specifications exist for various programs
      and systems, there is no single "master" specification for this
      format.  A sampling of formats and discussion of differences is
      included in appendix A.

   Applications which use this media type:

      Spreadsheet programs and various data conversion utilities

   Additional information:

      Magic number(s): none

      File extension(s): CSV

      Macintosh File Type Code(s): TEXT

   Person & email address to contact for further information:

      Yakov Shafranovich <ietf@shaftek.org>

   Intended usage: COMMON

   Author/Change controller: IESG

3.  IANA Considerations

   After IESG approval, IANA is expected to register these two types
   "text/csv" and "text/comma-separated-values" using the application
   provided in this document.




Shafranovich             Expires August 6, 2005                 [Page 4]

Internet-Draft           MIME Type for CSV Files           February 2005


4.  Security Considerations

   See discussion above

5.  References

5.1  Normative References

   [1]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part Two: Media Types", RFC 2046, November
        1996.

   [2]  Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
        September 1981.

   [3]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997.

5.2  Informative References

   [4]  Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet
        Mail Extensions (MIME) Part Four: Registration Procedures",
        BCP 13, RFC 2048, November 1996.

   [5]  Repici, J., "HOW-TO: The Comma Separated Value (CSV) File
        Format", 2004,
        <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm>.

   [6]  Edoceo, Inc., "CSV Standard File Format", 2004,
        <http://www.edoceo.com/utilis/csv-file-format.php>.

   [7]  Rodger, R. and O. Shanaghy, "Documentation for Ricebridge CSV
        Manager", February 2005,
        <http://www.ricebridge.com/products/csvman/reference.htm>.

   [8]  Raymond, E., "The Art of Unix Programming, Chapter 5", September
        2003,
        <http://www.catb.org/~esr/writings/taoup/html/ch05s02.html>.


Author's Address

   Yakov Shafranovich
   SolidMatrix Technologies, Inc.

   Email: ietf@shaftek.org
   URI:   http://www.shaftek.org




Shafranovich             Expires August 6, 2005                 [Page 5]

Internet-Draft           MIME Type for CSV Files           February 2005


Appendix A.  Appendix A - Discussion of the CSV format

   While there are various specifications and implementations for the
   CSV format (for ex.  [5], [6], [7] and [8]), no formal specification
   exists.  This causes a wide variety of interpretations for CSV files.
   While this document does not seek to document the CSV format,
   nevertheless we want to document the format that seems to be followed
   by most implementations:

   1.  Each record is located on a separate line delimited by a line
       break (either CR or CR/LF).  For example:

       aaa,bbb,ccc CRLF
       zzz,yyy,xxx CRLF

   2.  The last record in the file may or may not have an ending
       linebreak.  For example:

       aaa,bbb,ccc CRLF
       zzz,yyy,xxx

   3.  There maybe an optional header line appearing as the first line
       of the file with the same format as normal record lines.  This
       header will contain names corresponding to the fields in the file
       and will usually contain the same number of fields as the records
       in the rest of the file.  For example:

       field_name,field_name,field_name CRLF
       aaa,bbb,ccc CRLF
       zzz,yyy,xxx CRLF

   4.  Within the header and each record there may be one or more
       fields, delimited by commas.  The last field in the record may or
       may not be followed by a comma.  For example:

       aaa,bbb,ccc

   5.  Each field may or may not be enclosed in double quotes, however
       some programs such as Microsoft Excel do not use double quotes at
       all.  For example:

       "aaa","bbb","ccc" CRLF
       zzz,yyy,xxx

   6.  Field containing line breaks (CR or CR/LF) and commas should be
       enclosed in double-quotes.  For example:

       "aaa","b CRLF



Shafranovich             Expires August 6, 2005                 [Page 6]

Internet-Draft           MIME Type for CSV Files           February 2005


       bb","ccc" CRLF
       zzz,yyy,xxx

   7.  If double-quotes are used to enclosed fields, then double-quotes
       inside fields must be surounded by double quotes.  For example:

       "aaa","b"""bb","ccc"

   8.  Whitespace immediately before and after commas maybe removed
       unless it appears inside double-quotes.  For example:

       zzz,   yyy   ,   xxx

       would be processed as if it was:

       zzz,yyy,xxx

   The ABNF grammar [3] appears as follows:

      COMMA = %x2C

      file = [header] *record

      end-of-field = COMMA / (CR / CRLF)

      header = *(*WSP field *WSP end-of-field)

      record = *(*WSP field *WSP end-of-field)

      field = escaped / non-escaped

      escaped = DQUOTE *(VCHAR / CR / CRLF / 3*DQUOTE) DQUOTE

      non-escaped = *VCHAR

















Shafranovich             Expires August 6, 2005                 [Page 7]

Internet-Draft           MIME Type for CSV Files           February 2005


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.




Shafranovich             Expires August 6, 2005                 [Page 8]


Html markup produced by rfcmarkup 1.109, available from https://tools.ietf.org/tools/rfcmarkup/