I saw the "one of…", but they aren't defined in the RFC? Your spirit of Limited Use sounds about right for big5 though.
Thanks,
Shawn
From: Ira McDonald [mailto:***@gmail.com]
Sent: Wednesday, September 21, 2011 9:41 AM
To: Shawn Steele; Ira McDonald
Cc: "Martin J. Dürst"; ietf-***@mail.apps.ietf.org; Makoto Murata (eb2m-***@asahi-net.or.jp)
Subject: Re: Big5 / CP950
Hi Shawn,
RFC 2978 section 5 'Charset Registration Template'
"Intended usage:
(One of COMMON, LIMITED USE or OBSOLETE)"
The spirit of LIMITED USE has been to discourage the use
of legacy charsets that are particularly problematic - Big5.
Not sure if OBSOLETE has ever been used.
Martin - searching for this made me realize that the
plaintext IANA Charset Registry at
ftp://ftp.iana.org/assignments/character-sets
contains 257 entries - they don't include the Intended
Usage field.
I suggest we work w/ IANA to change the plaintext
registry.
In most cases this data is long lost (if ever submitted)
because the directory
ftp://ftp.iana.org/assignments/charset-reg
contains only 55 entries.
Cheers,
- Ira
Ira McDonald (Musician / Software Architect)
Chair - Linux Foundation Open Printing WG
Co-Chair - IEEE-ISTO PWG IPP WG
Chair - TCG Embedded Systems Hardcopy SWG
IETF Designated Expert - IPP & Printer MIB
Blue Roof Music/High North Inc
http://sites.google.com/site/blueroofmusic
http://sites.google.com/site/highnorthinc
mailto:***@gmail.com
Christmas through April:
579 Park Place Saline, MI 48176
734-944-0094
May to Christmas:
PO Box 221 Grand Marais, MI 49839
906-494-2434
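The three Intended usage values quoted above from the RFC 2978 template can be checked mechanically. The following is only a minimal Python sketch, assuming a registration is available as plain text with one "Label: value" field per line; the names `ALLOWED_USAGE`, `intended_usage` and `is_valid_usage` are hypothetical and not part of any IANA tooling.

```python
# Allowed values, as quoted in this thread from the RFC 2978 section 5 template.
ALLOWED_USAGE = ("COMMON", "LIMITED USE", "OBSOLETE")


def intended_usage(registration: str):
    """Return the 'Intended usage' value from template text, or None if absent."""
    for line in registration.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "intended usage":
            return value.strip()
    return None


def is_valid_usage(value) -> bool:
    """True only when the value is exactly one of the three template choices."""
    return value is not None and value.upper() in ALLOWED_USAGE


# Hypothetical, shortened registration text for illustration.
sample = """Charset name: big5
MIBenum: 2026
Intended usage: LIMITED USE"""

if __name__ == "__main__":
    usage = intended_usage(sample)
    print(usage, is_valid_usage(usage))
```

A registry file that lacks the field altogether, like the plaintext registry described above, would simply yield None for every entry.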
On Wed, Sep 21, 2011 at 11:13 AM, Shawn Steele <***@microsoft.com> wrote:
Moved the note and removed big5+. If anyone knows other examples, I'd include those.
Martin J. Dürst wrote:
> You have "COMMON" here while your Shift_JIS registration has "LIMITED".
> Is that by accident, or is there some rationale behind it?
Um, by accident. I copied the original shift-jis registration, and used the windows-1252 one as a template for this. I have no clue what the distinction is :) Changed to LIMITED USE (reasoning that the variations cause instability between implementations, so I'd much rather have people picking something like UTF-8). Is there a definition of these terms? All of them should be OBSOLETE in favor of UTF-* ;-) I'd use that if I could get away with it.
-Shawn
http://blogs.msdn.com/shawnste
________________________________________
From: "Martin J. Dürst" [***@it.aoyama.ac.jp]
Sent: Wednesday, September 21, 2011 1:00 AM
To: Shawn Steele
Cc: 'ietf-***@mail.apps.ietf.org'; Makoto Murata (eb2m-***@asahi-net.or.jp)
Subject: Re: Big5 / CP950
Hello Shawn,
> Here's some proposed text for a more complete registration.
Many thanks for doing this work. Some comments below, mostly nits.
> Comments welcome. AFAICT this code page is quite a bit less stable than others, and there are a plethora of mappings. I've included two ISO10646 equivalency tables for that reason.
> Thanks,
> Shawn
-----------------------------------
Charset name: big5
Charset aliases: (None)
MIBenum: 2026
Suitability for use in MIME text:
Yes, big5 is suitable for use with subtypes of the "text"
Content-Type. Note that big5 is an 8-bit charset. Care should
be taken to choose an appropriate Content-Transfer-Encoding.
Two example ISO 10646 equivalency tables:
http://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT
http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT
Note that Big5 has many variants, so these exemplars provide two
common mappings.
Additional information:
Several vendor-specific charsets that derive from Big5 often use
the Big5 name instead of a more specific vendor charset name.
Big5-HKSCS is one example; Microsoft Code Page 950 and
several font-specific variations are others.
Although not authoritative, the following references may also be of
interest:
Printed mapping table:
Dr. International "Developing International Software, Second Edition",
Microsoft Press, ISBN 0-7356-1583-7, 2003, p. 778 and appendixes on CD.
Microsoft Windows extended "best fit" behavior:
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt
Additional information about the many variants of Big5:
http://en.wikipedia.org/wiki/Big-5
The wide variety of existing variations of Big5 may make it
unsuitable for many modern applications. Developers should
consider whether UTF-8 or UTF-16 would be more appropriate for
new applications.
This is an update of an existing registration of this charset. This
charset name is in use.
This charset is also known as Windows Code Page 950 or cp950 for
short; these are NOT aliases.
Person & email address to contact for further information:
Shawn Steele
Email: Shawn.Steele@microsoft.com
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
U.S.A.
Intended usage: LIMITED USE
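
The registration text above notes that Big5 has many variants and cites two different mapping tables. One way to see such differences in practice is to decode the same bytes with two Big5-family codecs and compare the results. The sketch below is a rough illustration only: it assumes Python's bundled "big5" and "cp950" codecs as stand-ins for the two cited tables (they are not guaranteed to match those files exactly), and the helper names are hypothetical.

```python
def decode_or_none(raw: bytes, codec: str):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


def mapping_differences(codec_a: str = "big5", codec_b: str = "cp950"):
    """List (bytes, text_a, text_b) for double-byte sequences that decode differently."""
    # Trail bytes 0x40..0x7E and 0xA1..0xFE, the usual Big5 trail-byte ranges.
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))
    differences = []
    for lead in range(0x81, 0xFF):  # lead bytes 0x81..0xFE
        for trail in trail_bytes:
            raw = bytes([lead, trail])
            text_a = decode_or_none(raw, codec_a)
            text_b = decode_or_none(raw, codec_b)
            if text_a != text_b:
                differences.append((raw, text_a, text_b))
    return differences


if __name__ == "__main__":
    diffs = mapping_differences()
    print(f"{len(diffs)} double-byte sequences decode differently")
    for raw, text_a, text_b in diffs[:10]:
        print(raw.hex(), repr(text_a), repr(text_b))
```

A sequence counts as different both when the two codecs map it to different characters and when only one of them accepts it, which is the kind of instability between implementations discussed in this thread.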