Request for registration of character set TSCII for TAMIL language

Discussion:

Martin Duerst

2007-02-23 10:02:39 UTC

[I prepared this yesterday, but it didn't get out because
my Win box was suddenly getting slower and slower and I had
to restart. My comments seem to be more or less parallel
to what Ned wrote.]

Hello Kuppuswamy,

I'm extremely sorry that I haven't replied to you earlier.
There are two reasons for this: I have been extremely busy during
the term here at the University, and the mailing list archive
(at http://mail.apps.ietf.org/ietf/charsets/maillist.html)
wasn't accessible for a long time. It is now accessible, but
strangely enough doesn't contain your messages.

Ned, can you check? THANKS!
(one message came back today saying that the posting limit
is too low, so the ppt attachment didn't make it through.)

Looking at your submission, it looks very good in terms of
all the data and documentation available. The main problem
is that the data isn't grouped the way we would expect it.
You sent in a .pdf file as the main registration document.
However, the main registration document should be in ASCII
only. So the main thing to do is to take some of the information
from the .pdf document, maybe with some other information
(e.g. correspondence table to Unicode) and put the ASCII
part of it directly in the registration itself.

Anything else (e.g. code chart, glyph tables,...), in particular
things that obviously cannot be put into ASCII, should remain
in .pdf (or a similar form), but should not be entitled
"Draft Proposal for formal registration of TSCII..." or
anything similar. It's just a document describing the encoding.

Some more detailed comments below.

Registration of new language charset for Tamil
TSCII
(TAMIL SCRIPT CODE FOR INFORMATION INTERCHANGE)
None
YES

maybe add: As 8bit or with base64 or quoted-printable encoding
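
For illustration, here is a minimal Python sketch of what the two transfer encodings do to 8-bit data. The sample bytes are arbitrary placeholders with the high bit set, not values looked up in the TSCII code chart, and the variable names are made up for this example.

```python
import base64
import quopri

# Arbitrary sample: ASCII text followed by two bytes with the high bit set.
# 0xAB and 0xAC are placeholders, NOT values taken from the TSCII chart.
payload = b"Tamil: \xab\xac"

# base64 turns the whole payload into a 7-bit-safe alphabet.
b64_body = base64.b64encode(payload)

# quoted-printable leaves ASCII readable and escapes each high byte as =XX.
qp_body = quopri.encodestring(payload)

print(b64_body)
print(qp_body)

# Both encodings are reversible, so the 8-bit data survives 7-bit transport.
assert base64.b64decode(b64_body) == payload
assert quopri.decodestring(qp_body) == payload
```

Quoted-printable keeps the ASCII half of a bilingual text legible in transit, while base64 is usually more compact when most bytes are in the upper half.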

www.tscii.org/tsciispec.html
http://www.unicode.org/notes/tn15/

Other registrations (see e.g.
http://mail.apps.ietf.org/ietf/charsets/msg01739.html)
have included this. In your case, the mapping is more
complex, so maybe including it is not such a good idea,
but probably adding something like "please note that not
all codepoints can be converted one-to-one" or so
would help.
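
To make the "not one-to-one" point concrete, here is a small Python sketch of a byte-to-Unicode table in which one byte expands to two code points. The table is hypothetical: the byte values 0x80 and 0x81 and their mappings are placeholders chosen for illustration, not entries from the TSCII specification, and the function name is made up.

```python
# Hypothetical conversion table. The byte values and their targets are
# placeholders for illustration only, NOT taken from the TSCII specification.
TO_UNICODE = {
    0x80: "\u0B95",        # one byte -> one Unicode code point
    0x81: "\u0B9F\u0BBF",  # one byte -> a sequence of two code points
}

def decode_glyph_bytes(data: bytes) -> str:
    """Convert bytes to text: ASCII passes through, high bytes use the table."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))
        else:
            # Raises KeyError for bytes that are missing from the table.
            out.append(TO_UNICODE[b])
    return "".join(out)

data = b"A\x80\x81"
text = decode_glyph_bytes(data)

# Three input bytes become four code points, so lengths and offsets differ.
print(len(data), len(text))
```

A converter built this way cannot assume that byte offsets and character offsets line up, which is the kind of caveat a short note in the registration could warn implementers about.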

COMMON

The intent of this section, as I understand it, is mostly
for implementers. But your text reads more like a pamphlet
trying to convince us to register. If necessary (I'm already
convinced, so I don't think it's necessary), such arguments
can be given on this mailing list.

Tamil is one of the main Indian languages (Dravidian in Origin)
currently spoken by over 70 million people worldwide. TSCII
(Tamil Script Code for Information Interchange) is a bilingual 8-bit
glyph-based encoding scheme (Roman and Tamil) to deal with
Tamil materials on computers and for Information Interchange
across platforms using different protocols and document formats.
The TSCII scheme was collectively worked out through Net-based
discussions in 1998. TSCII is modelled on the ISO-8859-XX scheme
with standard plain ASCII set filling the 7-bit part and a set of Tamil
character glyphs filling the 8-bit part.
The TSCII scheme has been widely in use for over 5 years in all three
popular computer platforms (Windows, Macintosh and Unix/Linux).
In addition to millions of home-users (particularly in India,
Singapore,
Malaysia, Sri Lanka, Western Europe and North America), TSCII
encoding is used widely in Net-based mailing lists, newspapers
and ezines on-line, digital library etc. Legacy data in TSCII format
generated during the last 5 years is quite substantial and is growing
constantly.
TSCII as an established language encoding is already recognized
by major IT players like the Unicode Consortium, Microsoft, Apple,
Oracle and Sun Microsystems. With OS-level support for Tamil in
Microsoft Windows 2000 and later OS releases and very recently in
Apple's Mac OS X 10.4 (Tiger) release, Tamil Diaspora has started to
use Unicode already. The Purpose of this formal registration with IETF
is to facilitate migration of the vast amounts of legacy data in TSCII
and
multitude of users.

In general, the idea is to have a small number of
people, but with more address information. Please
see other examples. The contact information is not
there to show that the proposal has wide support, but
to list somebody who can be contacted in cases of questions.

Regards, Martin.

TSCII USER GROUP represented by
Kalyanasundaram, Kuppuswamy (Switzerland)
Manivannan, Mani (USA)
Nedumaran, Muthu (Malaysia)
Kaviarasan, S (USA)
Paul, Ravindran K (Malaysia)
Doddannan, Sivaraj (India)
RM. Krishnan (India)
Kumar Mallikarjunan (USA)
Sinnathurai Srivas (UK)

#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:***@it.aoyama.ac.jp

Ned Freed

2007-02-24 19:40:33 UTC

Post by Martin Duerst
[I prepared this yesterday, but it didn't get out because
my Win box was suddenly getting slower and slower and I had
to restart. My comments seem to be more or less parallel
to what Ned wrote.]
Hello Kuppuswamy,
I'm extremely sorry that I haven't replied to you earlier.
There are two reasons for this: I have been extremely busy during
the term here at the University, and the mailing list archive
(at http://mail.apps.ietf.org/ietf/charsets/maillist.html)
wasn't accessible for a long time.

Please let me know if/when this happens.

Post by Martin Duerst
It is now accessible, but
strangely enough doesn't contain your messages.

There seems to be a problem with the iana.org forwarder - some messages aren't
making it through. (But plenty of spam is, unfortunately.)

Post by Martin Duerst
Ned, can you check? THANKS!
(one message came back today saying that the posting limit
is too low, so the ppt attachment didn't make it through.)

I don't impose such restrictions on the list at this end.

Post by Martin Duerst
Looking at your submission, it looks very good in terms of
all the data and documentation available. The main problem
is that the data isn't grouped the way we would expect it.
You sent in a .pdf file as the main registration document.
However, the main registration document should be in ASCII
only. So the main thing to do is to take some of the information
from the .pdf document, maybe with some other information
(e.g. correspondence table to Unicode) and put the ASCII
part of it directly in the registration itself.

The approach I suggested in private email was to have a reference from the
ASCII version to the PDF version somewhere else - either on the IANA site if
IANA is willing to put up a PDF, or somewhere else if they aren't.

Post by Martin Duerst
Anything else (e.g. code chart, glyph tables,...), in particular
things that obviously cannot be put into ASCII, should remain
in .pdf (or a similar form), but should not be entitled
"Draft Proposal for formal registration of TSCII..." or
anything similar. It's just a document describing the encoding.
Some more detailed comments below.

Registration of new language charset for Tamil
TSCII
(TAMIL SCRIPT CODE FOR INFORMATION INTERCHANGE)
None
YES

maybe add: As 8bit or with base64 or quoted-printable encoding

www.tscii.org/tsciispec.html
http://www.unicode.org/notes/tn15/

I made the same suggestion.

Post by Martin Duerst

COMMON

Agreed.

Post by Martin Duerst

I'd prefer this to say "ISO-8859-XX charsets" rather than "ISO-8859-XX scheme".
There's more to the structure of the iso-8859-n family of CCSes than the
7-is-ascii/8-is-something-else split.

Post by Martin Duerst

character glyphs filling the 8-bit part.
The TSCII scheme has been widely in use for over 5 years in all three
popular computer platforms (Windows, Macintosh and Unix/Linux).
In addition to millions of home-users (particularly in India,
Singapore,
Malaysia, Sri Lanka, Western Europe and North America), TSCII
encoding is used widely in Net-based mailing lists, newspapers
and ezines on-line, digital library etc. Legacy data in TSCII format
generated during the last 5 years is quite substantial and is growing
constantly.
TSCII as an established language encoding is already recognized
by major IT players like the Unicode Consortium, Microsoft, Apple,
Oracle and Sun Microsystems. With OS-level support for Tamil in
Microsoft Windows 2000 and later OS releases and very recently in
Apple's Mac OS X 10.4 (Tiger) release, Tamil Diaspora has started to
use Unicode already. The Purpose of this formal registration with IETF
is to facilitate migration of the vast amounts of legacy data in TSCII
and
multitude of users.

Seems reasonable to me.

Post by Martin Duerst
Regards, Martin.

#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University

Ned Freed

2007-02-25 01:10:57 UTC

Post by Ned Freed

Please let me know if/when this happens.

Post by Martin Duerst
It is now accessible, but
strangely enough doesn't contain your messages.

There seems to be a problem with the iana.org forwarder - some messages aren't
making it through. (But plenty of spam is, unfortunately.)

Turns out the list address was incorrect - ietf-***@iana.org, not
ietf-***@iana.org. The proper address appears to be working, although IANA
has some greylisting facility in place now that may cause replies to be delayed
a bit.

Ned