Discussion:
windows-874
peter green
2006-01-13 01:32:49 UTC
This is the only Thai charset that Outlook 2000 supports for sending mail,
so it is almost certainly in wide use online already. Furthermore, all the
other Windows code pages are already registered.

http://www.microsoft.com/globaldev/reference/sbcs/874.mspx

P.S. I am not in any way affiliated with Microsoft, nor do I speak Thai. I
just noticed a strange omission from the list of registered character sets.

P.P.S. Please cc any replies to me, as I am not on the ietf-charsets list.
Erik van der Poel
2006-01-14 21:43:45 UTC
The procedure for registering a charset is documented here:

http://www.rfc-editor.org/rfc/rfc2978.txt

One of the required pieces of info is:

Person & email address to contact for further information

I'm guessing that that wouldn't be you. Microsoft should really register
this themselves, especially if they're already using it on the Internet.
However, I did notice that when they sent something related to the other
windows-* charsets to this list, they appeared to feel that they weren't
getting much response. I don't know whether that was due to the
unavailability of a stable, openly-available spec. (But UTF-8 isn't
stable either -- it grows with Unicode.)

The name "windows-874" appears to be used in Java too. See the ICU list:

http://dev.icu-project.org/cgi-bin/viewcvs.cgi/icu/source/data/mappings/convrtrs.txt?rev=1.169&content-type=text/vnd.viewcvs-markup

Erik
Markus Scherer
2006-01-14 21:53:56 UTC
I don't think it is *required* that the author/"owner" of a charset
submit a registration for it.
Post by Erik van der Poel
However, I did notice that when they sent something related to the other
windows-* charsets to this list, they appeared to feel that they weren't
getting much response. I don't know whether that was due to the
unavailability of a stable, openly-available spec. (But UTF-8 isn't
stable either -- it grows with Unicode.)
In one sense, this is correct. However, for the purposes of the IANA
charset registration, the name "UTF-8" is stable and suitable. See the
discussion in RFC 3629 - UTF-8, a transformation format of ISO 10646,
Section 8 MIME registration.
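
One way to see why the name "UTF-8" can be stable while the Unicode repertoire grows: the byte layout depends only on the numeric value of a code point, not on whether a character has been assigned to it. A minimal Python sketch, in which the function name utf8_encode is hypothetical and the bit layout is the one described in RFC 3629:

```python
def utf8_encode(code_point):
    """Encode one Unicode scalar value with the fixed UTF-8 bit layout."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:
        # One byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# The same arithmetic applies whether or not a character is assigned
# to the code point, so later assignments do not change any byte sequence.
print(utf8_encode(0x20AC))
print(utf8_encode(0x0378))
```

A table-driven charset such as windows-1252 has no such arithmetic to fall back on, which is why the status of its unassigned bytes is a separate question.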

Best regards,
markus

--
Opinions expressed here may not reflect my company's positions unless
otherwise noted.
Erik van der Poel
2006-01-15 00:30:18 UTC
I agree that the author/owner of a charset shouldn't be required to
submit a registration for it, but I think it would be inappropriate for
Peter Green to put his name on the windows-874 registration.

Actually, Microsoft already attempted to register windows-874 on March
15th, 2005:

http://mail.apps.ietf.org/ietf/charsets/msg01510.html

It appears that some of the items listed in the template in section 5 of
the RFC are missing from the windows-874 registration:

http://www.rfc-editor.org/rfc/rfc2978.txt

But I don't know whether that is one of the reasons why this charset has
not been registered. Also, if you take a look at the email archive, you
will see that no authority has bothered to respond to Microsoft's
repeated questions:

http://mail.apps.ietf.org/ietf/charsets/maillist.html#01510

The RFC mentions a "charset reviewer". Would that be Paul Hoffman, as
mentioned in the following?

http://www.iana.org/numbers.html#C

I've Cc'ed Paul in the hope that he might respond.

Erik
Frank Ellermann
2006-01-15 04:13:10 UTC
Post by Erik van der Poel
if you take a look at the email archive, you will see that no
authority has bothered to respond to Microsoft's repeated
questions
The requester claiming to speak for MS didn't bother to address
the questions about his registrations, e.g. discrepancies of
his 1252 with the cp-1252 listed in

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT

The latter was apparently supplied by another person claiming
to speak for MS. Either 0x81, 0x8D, 0x8F, 0x90, and 0x9D are
mapped one to one, u+0081 etc., or they are not. As long as
that's not absolutely clear and also reflected in the Unicode
mappings modifying the IANA registry about 1252 makes no sense.

Bye, Frank
Erik van der Poel
2006-01-15 06:05:18 UTC
Post by Frank Ellermann
if you take a look at the email archive, you will see that no
authority has bothered to respond to Microsoft's repeated
questions
The requester claiming to speak for MS didn't bother to address
the questions about his registrations, e.g. discrepancies of
his 1252 with the cp-1252 listed in
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
What discrepancies do you claim to exist, exactly?
Post by Frank Ellermann
The latter was apparently supplied by another person claiming
to speak for MS. Either 0x81, 0x8D, 0x8F, 0x90, and 0x9D are
mapped one to one, u+0081 etc., or they are not. As long as
that's not absolutely clear and also reflected in the Unicode
mappings modifying the IANA registry about 1252 makes no sense.
RFC 2978 does not require a Unicode mapping. It says that there "SHOULD"
be a 10646 mapping, but it does not use the word "MUST".

I agree that it is nice to have the 10646 mapping, but are unassigned
codepoints not allowed to exist in IANA-registered charsets (other than
UTF-8 and all the 10646-based charsets)? If so, where in the RFC does it
say that?

Erik
Frank Ellermann
2006-01-15 07:22:34 UTC
Post by Erik van der Poel
What discrepancies do you claim to exist, exactly?
Either 0x81, 0x8D, 0x8F, 0x90, and 0x9D are mapped one to
one, u+0081 etc., or they are not.
Post by Erik van der Poel
RFC 2978 does not require a Unicode mapping. It says that
there "SHOULD" be a 10646 mapping, but it does not use the
word "MUST".
You need a good excuse to ignore a SHOULD; a typical example
is old implementations (here, old charset registrations).

It also says "MUST be stable", that's why we got tons of new
registered charsets doing something for the "Euro", like 858
instead of 850.
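
The 850/858 case can be checked against the tables that ship with Python. This is only a sketch: it assumes Python's bundled cp850 and cp858 codecs reflect the registered charsets, and the helper name codec_differences is hypothetical.

```python
def codec_differences(codec_a, codec_b):
    """List (byte, char_a, char_b) for every byte the two codecs decode differently."""
    differences = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" turns any undefined byte into U+FFFD instead of raising.
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            differences.append((value, char_a, char_b))
    return differences


for value, old, new in codec_differences("cp850", "cp858"):
    print(f"0x{value:02X}: U+{ord(old):04X} -> U+{ord(new):04X}")
```

In these tables the only difference is at 0xD5, where the dotless i of 850 gives way to the euro sign in 858. Registering that change under a new name is what keeps 850 itself unchanged.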
Post by Erik van der Poel
are unassigned codepoints not allowed to exist in
IANA-registered charsets
Not that I'm aware of, unassigned code points are fine. In
the case of 1252 all it takes is to explain what the five
interesting octets are supposed to be: Maybe "cp-1252" and
windows-1252 are two different charsets, the former with one
to one mappings, the latter with five unassigned code points.

But that's a rather important difference for implementations.
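
To illustrate how much this matters to an implementation, the sketch below asks one particular codec which bytes it refuses to decode. It uses Python's bundled cp1252 table, which is one implementation's reading and not an authoritative definition of either "cp-1252" or "windows-1252"; the helper name undefined_bytes is hypothetical.

```python
def undefined_bytes(codec):
    """Return every single-byte value that the codec cannot decode."""
    result = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            result.append(value)
    return result


print([hex(value) for value in undefined_bytes("cp1252")])
```

Python takes the "five unassigned code points" reading, so decoding any of those bytes fails, whereas an implementation following the one-to-one reading would return U+0081 and so on for the same input.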

From my POV windows-1252 is one of the most important charsets,
in practice more relevant than say Latin-9. While I now know
that what my OS considers as "1004" is in fact cp-1252 (after
an embarrassing episode with ICU when I didn't know this), I'm
still interested if that's "windows-1252" or "cp-1252" or both,
if they are identical.
Bye, Frank
Erik van der Poel
2006-01-15 08:15:28 UTC
Post by Frank Ellermann
Post by Erik van der Poel
RFC 2978 does not require a Unicode mapping. It says that
there "SHOULD" be a 10646 mapping, but it does not use the
word "MUST".
You need a good excuse to ignore a SHOULD; a typical example
is old implementations (here, old charset registrations).
I agree that it is really a good idea to provide the 10646 mapping.
Post by Frank Ellermann
It also says "MUST be stable", that's why we got tons of new
registered charsets doing something for the "Euro", like 858
instead of 850.
It is true that the RFC says "stable", but it does not say what "stable"
means in the context of charsets. Does it mean that assigned codepoints
must not change? Of course. Does it mean that unassigned codepoints must
not change? That is debatable. (And remember that UTF-8 is specifically
permitted to have unassigned codepoints that might change later.)
Post by Frank Ellermann
In the case of 1252 all it takes is to explain what the five
interesting octets are supposed to be: Maybe "cp-1252" and
windows-1252 are two different charsets, the former with one
to one mappings, the latter with five unassigned code points.
I can't find "cp-1252" in the IANA charset registry:

http://www.iana.org/assignments/character-sets

I have been wondering, however, about this "re-registration" of
windows-1252. Why is it being registered again? Is it because the
contact person/email address is being changed? If so, then that should
be stated explicitly. I'm not very happy about these 2 URLs supplied
with it either:

http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_19mb.asp

Those documents are not 10646 equivalency tables, nor are they even
published specs, for 1252.

Mike, I have a suggestion. How about dealing with windows-1252
separately? If we can agree on a change for the windows-1252
registration (e.g. contact person/email), then you can apply the same
fix to the other re-registrations. It might even be a good idea to have
some non-person email address at Microsoft be the contact. E.g.
iana-***@microsoft.com. Then it doesn't matter whether Chris Wendt
or Mike Ksar leave Microsoft.

Then, or in parallel, discuss windows-874 separately. If we can agree on
a pattern for that charset, you can apply the same pattern to the other
new windows-NNN charsets.

Erik
Frank Ellermann
2006-01-15 14:01:19 UTC
Post by Erik van der Poel
It is true that the RFC says "stable", but it does not say
what "stable" means in the context of charsets. Does it mean
that assigned codepoints must not change? Of course.
Yes, otherwise silently deleting UNICODE-1-1 would be an idea.
Post by Erik van der Poel
Does it mean that unassigned codepoints must not change?
That is debatable.
New Unicode code points are assigned all the time. Nothing's wrong
with that as long as you know it can happen and only do things where
it doesn't matter (e.g. no canonicalization).
Post by Erik van der Poel
http://www.iana.org/assignments/character-sets
"cp-1252" is on the quoted MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
page at unicode.org, submitted by ***@microsoft.com

The old (RFC 2278) = proposed new (RFC 2978) table at...
<http://www.microsoft.com/globaldev/reference/sbcs/1252.htm>
...is most probably simply the same charset. In that case it
would be nice to register "cp-1252" as an alias with a pointer
to CP1252.TXT. The old RFC 2278 registration template is
<http://www.iana.org/assignments/charset-reg/windows-1252>

The proposed RFC 2978 update offers two additional links and,
as you said, a new contact address.
Post by Erik van der Poel
Then, or in parallel, discuss windows-874 separately.
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT
http://www.microsoft.com/globaldev/reference/sbcs/874.htm

JFTR. We should check if there's a potential conflict for an
alias cp-874. ICU says http://purl.net/net/cp/874 => TIS-620.
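
Whether "cp-874" could safely name both can be probed with two tables. The sketch below compares Python's bundled cp874 and tis_620 codecs; these are Python's tables, not the ICU data referred to above, and the helper name decode_byte is hypothetical.

```python
def decode_byte(value, codec):
    """Return the decoded character, or None if the codec leaves the byte undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Six Thai characters, all encoded as bytes at 0xA1 or above.
THAI_SAMPLE = b"\xca\xc7\xd1\xca\xb4\xd5"
same_thai_text = THAI_SAMPLE.decode("cp874") == THAI_SAMPLE.decode("tis_620")

# Bytes in the upper half that the two codecs treat differently.
differing = [value for value in range(0x80, 0x100)
             if decode_byte(value, "cp874") != decode_byte(value, "tis_620")]

print(same_thai_text)
print([hex(value) for value in differing])
```

The Thai letters decode identically in both, while bytes such as 0x80 (a euro sign in cp874) do not. An alias that pointed at TIS-620 in one place and at windows-874 in another would therefore disagree on real data.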

For http://purl.net/net/cp/1252 it says "windows-1252", mapping
the five interesting octets 0x81 to u+0081 etc. I've no idea
what ibm1252-P100-2000 is supposed to be; my IBM OS/2 199mumble
has a Euro at 0x80, not u+0080. OTOH it also says that this
is "1004" and not 1252, so probably that's beside the point.

Bye, Frank
Erik van der Poel
2006-01-15 17:37:33 UTC
Post by Frank Ellermann
It's on the quoted MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
Neither the original windows-1252 registrant (Chris Wendt) nor the new
windows-1252 re-registrant (Mike Ksar) has pointed to the unicode.org
site for the definition of "windows-1252". Chris and Mike both gave
microsoft.com URLs and Mike provided a reference to a Microsoft Press
book. I don't think there is any rule that says that the URL(s) must
point to a site other than the registrant's. The book is "openly
available" so it's probably OK to include that reference.

However, it still would be nice if Microsoft provided a machine-readable
10646 mapping for windows-1252.
Post by Frank Ellermann
In that case it
would be nice to register "cp-1252" as an alias
In general, I think we should avoid adding aliases. However, if
consensus can be reached that "cp-1252" is in fact used a lot and almost
always means the same as "windows-1252", then my feeling is that it's a
good idea to add that alias.
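
For a sense of what an alias amounts to in an implementation, here is a small sketch using Python's codecs module. Python's alias table is its own and is not the IANA registry; the helper name same_codec is hypothetical.

```python
import codecs


def same_codec(label_a, label_b):
    """True if both labels resolve to the same codec in this Python installation."""
    return codecs.lookup(label_a).name == codecs.lookup(label_b).name


# Both labels end up at the one codec that Python calls "cp1252".
print(codecs.lookup("windows-1252").name)
print(same_codec("windows-1252", "cp1252"))
```

A label that is in neither the codec list nor the alias table raises LookupError, so each alias that is added or omitted decides whether some received mail decodes at all.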

Erik
Markus Scherer
2006-01-15 07:39:58 UTC
Post by Erik van der Poel
Post by Frank Ellermann
The latter was apparently supplied by another person claiming
to speak for MS. Either 0x81, 0x8D, 0x8F, 0x90, and 0x9D are
mapped one to one, u+0081 etc., or they are not. As long as
that's not absolutely clear and also reflected in the Unicode
mappings modifying the IANA registry about 1252 makes no sense.
RFC 2978 does not require a Unicode mapping. It says that there "SHOULD"
be a 10646 mapping, but it does not use the word "MUST".
Section 1.3 does require that "the definition associated with a
charset name must fully specify the mapping to be performed." - even
though it does not require a *Unicode* conversion table.

Given that much modern software operates in Unicode internally, it is
desirable to have an authoritative Unicode conversion table,
especially if it could be kept up to date with (hopefully compatible)
changes.
Post by Erik van der Poel
I agree that it is nice to have the 10646 mapping, but are unassigned
codepoints not allowed to exist in IANA-registered charsets (other than
UTF-8 and all the 10646-based charsets)? If so, where in the RFC does it
say that?
Unassigned code points are of course allowed. Without double-checking
the tables which Mike pointed to in his requests, I think what Frank
alluded to is how Windows treats unassigned codes in its SBCS
charsets: It usually roundtrips unassigned bytes xx to/from Unicode
U+00xx, rather than mapping unassigned codes to some SUBstitution
character, and some but not all published tables reflect this.

As a result, when such an undefined but roundtripping code gets a
character assigned to it, then its roundtrip mapping changes to the
new Unicode code point. There is usually no fallback (one-way) mapping
from the old U+00xx to the now-assigned xx.

If you treat the presence of a roundtrip mapping (to/from Unicode or
any other character set) as equivalent to establishing a byte
sequence's character identity, then windows-1252 used to have a C1
control code at 0x80 while now it has a prominent currency symbol
assigned to that code - which would be an incompatible change. Of
course the documentation of windows-1252 used to show 0x80 as
undefined before the change.
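
The two treatments of unassigned bytes can be contrasted in a short Python sketch. It is not Windows' actual conversion code: it uses Python's cp1252 table as a stand-in, works only for single-byte charsets, and the function names decode_strict and decode_roundtrip are hypothetical.

```python
def decode_strict(raw, codec="cp1252"):
    """Treat unassigned bytes as errors, which is what Python's codec does."""
    return raw.decode(codec)


def decode_roundtrip(raw, codec="cp1252"):
    """Map each unassigned byte xx to U+00xx instead of failing."""
    chars = []
    for value in raw:
        try:
            chars.append(bytes([value]).decode(codec))
        except UnicodeDecodeError:
            # Pass-through: byte 0x81 becomes U+0081, and so on.
            chars.append(chr(value))
    return "".join(chars)


data = b"\x80\x81A"
print([f"U+{ord(ch):04X}" for ch in decode_roundtrip(data)])
```

Under the pass-through policy, a byte that is unassigned today decodes to a C1 control, and the same byte would decode to a different code point once a character is assigned to it. That is the 0x80 situation described above.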

I am not arguing for or against any registration here, merely
explaining what I have seen in Windows conversion.

Note that a charset is defined as "a method of converting a sequence
of octets into a sequence of characters" so roundtrip mappings don't
formally come into play. There is also no definition for when a
charset is "stable", for allowing or disallowing extension by
assigning characters to formerly unassigned codes (or otherwise), or
for whether two charsets are considered different or compatible.

Final note: windows-874 and windows-1252 and the other charsets that
Mike Ksar requested to register are of course the Windows "ANSI"
system code pages which are very widely used.

markus

Erik van der Poel
2006-01-15 08:30:40 UTC
Post by Markus Scherer
Unassigned code points are of course allowed. Without double-checking
the tables which Mike pointed to in his requests, I think what Frank
alluded to is how Windows treats unassigned codes in its SBCS
charsets: It usually roundtrips unassigned bytes xx to/from Unicode
U+00xx, rather than mapping unassigned codes to some SUBstitution
character, and some but not all published tables reflect this.
The current registration of windows-1252 points to a document that lists
those codepoints as unassigned. It does not matter what other tables say
about them. If an implementation decides to round-trip these unassigned
codepoints, they do so for particular reasons. But that does not change
the fact that those codepoints are actually unassigned, and that people
should not use them.

Erik
Mike Ksar
2006-01-15 04:47:49 UTC
Erik,

Thanks for pursuing the registrations that I submitted last year. I have already prepared a disposition of comments, but I have not heard from anyone about the status.

I am ready to pursue this when IANA is ready and when Paul Hoffman is ready to iron out any pending issues to register Windows-874 and the rest of my contribution. I have no idea who Peter Green is. If he has any questions he can contact me.

Mike Ksar

Mike Ksar
2006-01-15 10:16:14 UTC
I am open to doing the registrations one at a time. I am only trying to
follow the registration procedures as documented, and I think I have done
so. I have met with and talked over the phone with several of the
people who submitted comments. No one has any objection to the
registration, but some expressed a preference to include additional
information that is not required by the registration rules.

Mike Ksar

Erik van der Poel
2006-01-15 16:54:19 UTC
My feeling is that it is inappropriate to say that you met with and
phoned people that had comments. I believe those discussions are
supposed to happen on this mailing list, so that we can try to achieve
consensus in an open fashion.

Maybe the charset reviewer should say something at this point. Paul, is
that you?

Erik
Mike Ksar
2006-01-16 04:09:37 UTC
My proposed disposition of comments was posted on the alias. It was not
done behind the scenes. Unfortunately no one responded and my
contribution and disposition of comments have been sitting idly for
almost one year.

Again, I followed the procedures strictly as outlined for the
registration. Any additional preferences are not a must but nice to
have.

Mike Ksar
Frank Ellermann
2006-01-16 18:30:24 UTC
Post by Mike Ksar
Unfortunately no one responded and my contribution and
disposition of comments have been sitting idly for almost
one year.
I've no clear idea what went wrong, but there are only 47
articles in the GMaNe archive of this list since 2004-12-03,
see also <http://dir.gmane.org/gmane.ietf.charsets>

Your request got a few replies from, among others, Bruce, Ira,
Mark, and Markus; you can see that March 2005 thread on GMaNe
or at <http://mail.apps.ietf.org/ietf/charsets/threads.html>

Paul's last article was apparently posted December 2004, and
maybe he missed your later re-registration request. The last
registration I see in Ned's official charsets list archive
at <http://mail.apps.ietf.org/ietf/charsets/threads.html> was
apparently something about "ECMA cyrillic" in February 2004.
Post by Mike Ksar
Again, I followed the procedures strictly as outlined for the
registration. Any additional preferences are not a must but
nice to have.
There are a few minor differences between 2278 and 2978, it's
up to the reviewer to decide this after reading the discussion.

But he wouldn't let you wait nine months for his decision; the
time for discussion is two weeks, so this is probably a case of
"Murphy" somewhere between Paul and this list.

Bye, Frank
