Re: tlug: A couple of questions about Unicode
- To: tlug@example.com
- Subject: Re: tlug: A couple of questions about Unicode
- From: Gaspar Sinai <gsinai@example.com>
- Date: Mon, 12 Jan 1998 00:27:26 +0900 (JST)
- Content-Type: TEXT/PLAIN; charset=US-ASCII
- In-Reply-To: <199801091717.CAA03920@example.com>
- Reply-To: tlug@example.com
- Sender: owner-tlug@example.com
Hi,

I feel compelled to contribute to this thread. So here are my thoughts:

o It is very unfortunate that the practical standard for Unicode has
  become 16-bit (UCS2) instead of 32-bit (UCS4). There is no 7-bit
  transformation format for UCS4 (the 8-bit format is UTF8). I think the
  people involved in the standard were influenced too much by NT, and
  they had to make very Microsoft-ish hacks:

    o there is a codespace where two 16-bit characters are used to map
      a portion of the UCS4 space into UCS2.
    o if you want to process some Indian or Arabic scripts you need to
      combine two 16-bit Unicode characters to form a single glyph.

  I think Linux only gains if it uses UTF8 instead of UCS2.

o When you weigh up sharing codes between Japanese and Chinese
  characters, I think there are more advantages than disadvantages. The
  disadvantages go away when you are allowed to change the font within
  the document.

o Unicode is not consistent with the rules it set for itself. You would
  expect the wide ASCII characters to have the ASCII values, just like
  wide Cyrillic or Greek, but this is not the case. For some strange
  reason they kept the wide ASCII as separate codes.

o I know that there are some people in Japan who do not like Unicode.
  I bash Unicode - still I like it. And Japan is very lucky when it
  comes to Unicode (the Tamil and Malayalam scripts come to my mind...)

o Someone mentioned inconsistency with SJIS. IMHO SJIS is not sufficient
  and should not be used. It cannot encode a lot of characters that JIS
  and EUC can. (Yes, I know that SJIS is the standard format in Win95.)

o The NT Unicode format is a simple dump of UCS2 with a magic U+FEFF
  code at the beginning. This code is used to determine endianness.

So much for now. Sorry for the short-ish style.

BTW: I released yudit-0.95 yesterday. It now compiles with egcs and
fixes some bugs. It supports the NT notepad format.
You can get it from sunsite or:

  http://www2.gol.com/users/gsinai/yudit-0.95..tar.gz

cheers,
gaspar

---------------------------------------------------------------
Next TLUG Nomikai: 14 January 1998 19:15 Tokyo station
Yaesu Chuo ticket gate. Or go directly to Tengu TokyoEkiMae 19:30
Chuo-ku, Kyobashi 1-1-6, EchiZenYa Bld. B1/B2 03-3275-3691
Next Saturday Meeting: 14 February 1998 12:30 Tokyo Station
Yaesu Chuo ticket gate.
---------------------------------------------------------------
a word from the sponsor:
TWICS - Japan's First Public-Access Internet System
www.twics.com info@example.com Tel:03-3351-5977 Fax:03-3353-6096
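
Two of the mechanisms mentioned in the message, the pairing of 16-bit codes to reach beyond the 16-bit range and the U+FEFF mark used to determine byte order, can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the function names (to_surrogates, from_surrogates, detect_byte_order) are hypothetical and are not taken from yudit or from NT, and the arithmetic follows the surrogate ranges D800-DBFF and DC00-DFFF as defined for UTF-16.

```python
import codecs


def to_surrogates(cp):
    """Split a code point above U+FFFF into two 16-bit surrogate codes.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside the surrogate-encodable range")
    v = cp - 0x10000                      # 20 bits remain after the offset
    high = 0xD800 + (v >> 10)             # top 10 bits
    low = 0xDC00 + (v & 0x3FF)            # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a high/low surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def detect_byte_order(data):
    """Guess the byte order of 16-bit text from a leading U+FEFF mark.

    Returns "little", "big", or None when no mark is present.
    """
    if data[:2] == codecs.BOM_UTF16_LE:   # bytes FF FE
        return "little"
    if data[:2] == codecs.BOM_UTF16_BE:   # bytes FE FF
        return "big"
    return None


if __name__ == "__main__":
    high, low = to_surrogates(0x10400)
    print(hex(high), hex(low))
    print(detect_byte_order(b"\xff\xfeh\x00i\x00"))
```

A file without the leading mark gives None here, so a caller has to fall back on some other assumption about byte order in that case.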
- Follow-Ups:
  - Re: tlug: A couple of questions about Unicode
    - From: "J. David Beutel" <jdb@example.com>
  - UTF-8 [was: Re: tlug: A couple of questions about Unicode]
    - From: "Stephen J. Turnbull" <turnbull@example.com>
- References:
  - tlug: A couple of questions about Unicode
    - From: "Jonathan Byrne" <jbyrne@example.com>