Ticket #74 (closed design: fixed)

Opened 7 years ago

Last modified 6 years ago

Character Encodings in TEXT

Reported by: mnot@pobox.com
Owned by:
Priority:
Milestone: 07
Component: p6-cache
Severity:
Keywords:
Cc:
Origin: http://www.w3.org/mid/6.0.0.20.2.20070610165356.0a69cec0@localhost

Description (last modified by mnot@pobox.com)

RFC 2616 prescribes that header values containing non-ASCII characters have to use either iso-8859-1 or RFC 2047 encoding. This is unnecessarily complex, and it is not necessarily followed by implementations or by specifications of new headers.

This issue is limited to:

  • determining whether UTF-8 can be allowed in some way (e.g., in current uses of TEXT, and/or new headers), and
  • possibly tightening up use of iso-8859-1 in TEXT (in particular, C1 controls).
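
To make the options concrete, here is a minimal Python sketch of the two forms RFC 2616 permits for a non-ASCII value, next to the raw UTF-8 form this issue asks about. The sample value and the helper name `has_c1_controls` are illustrative assumptions, not taken from any specification text.

```python
from email.header import Header, decode_header

value = "Dürst"  # illustrative sample value

# Form 1 permitted by RFC 2616: raw iso-8859-1 octets in TEXT.
latin1_bytes = value.encode("iso-8859-1")

# Form 2 permitted by RFC 2616: an RFC 2047 encoded-word (pure ASCII on the wire).
encoded_word = Header(value, charset="utf-8").encode()

# The form under discussion: raw UTF-8 octets.
utf8_bytes = value.encode("utf-8")


def has_c1_controls(octets: bytes) -> bool:
    """Hypothetical helper: True if any octet is in 0x80-0x9F,
    the range that iso-8859-1 assigns to C1 control characters."""
    return any(0x80 <= b <= 0x9F for b in octets)


# Decoding the encoded-word recovers the original text.
raw, charset = decode_header(encoded_word)[0]
roundtrip = raw.decode(charset)
```

The two bullet points interact: UTF-8 continuation octets can fall in the C1 range, so a recipient that reads UTF-8 octets as iso-8859-1 may see control characters. For example, `has_c1_controls("À".encode("utf-8"))` is true because the second octet is 0x80.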

See also #63, #111.

Change History

comment:1 Changed 7 years ago by mnot@pobox.com

  • Component set to messaging
  • Milestone set to unassigned

comment:2 follow-up: ↓ 3 Changed 7 years ago by mnot@pobox.com

  • Summary changed from Character encodings for Headers to Encodings for non-ASCII Headers

There was discussion of this at the APPS Area Architecture Workshop, with some disagreement as to whether it's possible to encode IRI->URI->IRI. Specific advice to IRIs may be necessary.

comment:3 in reply to: ↑ 2 Changed 7 years ago by julian.reschke@gmx.de

Replying to mnot@pobox.com:

There was discussion of this at the APPS Area Architecture Workshop, with some disagreement as to whether it's possible to encode IRI->URI->IRI. Specific advice to IRIs may be necessary.

Is this about round-tripping IRIs through URIs? Obviously that's not possible.

For example, consider the two IRIs:

I1: http://www.example.org/Dürst

I2: http://www.example.org/D%C3%BCrst

Both would be converted to the URI:

U: http://www.example.org/D%C3%BCrst

Whether that distinction is relevant depends, of course, on which kind of URI/IRI comparison is needed; but there are cases where it is relevant (for instance, XML namespace names using IRIs (urg!)).

(see also http://tools.ietf.org/html/rfc3987#section-3.2.1)
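
The collision between I1 and I2 can be reproduced with a short Python sketch. The function `iri_to_uri` is a hypothetical, simplified stand-in for the RFC 3987 mapping: it percent-encodes non-ASCII characters as UTF-8 and leaves existing percent-escapes alone, and it assumes an ASCII host name (no IDNA handling).

```python
from urllib.parse import quote

# ASCII characters left untouched; "%" is included so that
# existing percent-escapes are not encoded a second time.
_SAFE = "%/:?#[]@!$&'()*+,;=~-._"


def iri_to_uri(iri: str) -> str:
    """Simplified IRI-to-URI mapping (assumption: ASCII host name)."""
    return quote(iri, safe=_SAFE)


i1 = "http://www.example.org/Dürst"
i2 = "http://www.example.org/D%C3%BCrst"

u1 = iri_to_uri(i1)
u2 = iri_to_uri(i2)

# Both IRIs map to the same URI, so the original IRI cannot be
# recovered from the URI alone.
same_uri = (u1 == u2)
```

Since `u1` and `u2` are identical while `i1` and `i2` differ, no inverse function from URI to IRI can return both originals.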

comment:4 Changed 7 years ago by mnot@pobox.com

  • Description modified
  • Summary changed from Encodings for non-ASCII Headers to Character Encodings in TEXT

comment:5 Changed 7 years ago by mnot@pobox.com

  • Description modified

comment:6 Changed 7 years ago by mnot@pobox.com

  • Description modified

comment:7 Changed 6 years ago by mnot@pobox.com

  • Milestone changed from unassigned to 06

comment:8 Changed 6 years ago by fielding@gbiv.com

From [395]:

Deprecate line folding, addresses #77. Require that invalid whitespace around field-names be rejected, addresses #30. Make non-ASCII content obsolete and opaque in header fields and reason phrase, addresses #63, #74, #94, #111.

comment:9 Changed 6 years ago by julian.reschke@gmx.de

  • Status changed from new to closed
  • Resolution set to fixed

Fixed in [398]:

Resolve #63, #74, #94, #111: Issues around TEXT rule closed with revision [395] (closes #63, #74, #94, #111)

comment:10 Changed 6 years ago by julian.reschke@gmx.de

  • Status changed from closed to reopened
  • Resolution fixed deleted

re-open until reviewed

comment:11 Changed 6 years ago by julian.reschke@gmx.de

  • Component changed from p1-messaging to p6-cache
  • Milestone changed from 06 to 07

Part 6 still allows RFC 2047 encoding for the Warning header.

comment:12 Changed 6 years ago by mnot@pobox.com

  • Status changed from reopened to closed
  • Resolution set to fixed