Mailing List Archive
tlug.jp Mailing List
Re: [tlug] Re: Unicode (Was: apache2...)
- Date: Sat, 12 Jul 2003 19:30:17 +1000 (EST)
- From: Jim Breen <jwb@example.com>
- Subject: Re: [tlug] Re: Unicode (Was: apache2...)
Shimpei Yamashita <shimpei@example.com> wrote:
>> A few questions, from a complete amateur....
>>
>> On Sat, Jul 12, 2003 at 12:45:28AM +1000,
>> Jim Breen wrote:
>> > Things don't "look" like anything in Unicode. The look comes from the
>> > font. You choose the font. You buy a Chinese-style Unicode font where
>> > the hanzi look Chinese, or you buy a Japanese-style font. The codes
>> > stay the same.
>>
>> Does that mean that a multilingual text document, rendered with a single
>> Unicode font, may only "look" correct in one Asian language at a time?

Depending on the font, yes.

>> If so,
>> does it not mean that Unicode only *pretends* to be context-independent, and
>> actually depends on the user (which could be the application or the human
>> being) to provide that context because it fails to provide a
>> context-presentation mechanism internally?

Not at all. There are language codes in Unicode, and if the document has
been prepared with them, a smart application can do things like selecting
fonts according to them, or invoking spell-checkers according to the
language, or all the other language-dependent things.

It's the same with A,a,B,b, etc. Different European cultures actually have
their preferred fonts and think others look foreign, but no-one has accused
ISO-8859-* of pretense or cultural hegemony on this score.

>> > Be that as it may, EVERY kanji in JIS X 0208 and JIS X 0212 ended up in
>> > Unicode 1.0. What is called the "source separation rule" meant that if
>> > a kanji/hanzi/hanja pair that would otherwise be unified occurs
>> > multiply in one of the national standards, then it appears multiply in
>> > Unicode. Thus all six versions of the "ken" kanji, which blind Freddie
>> > could tell are really the same, are dutifully replicated in Unicode,
>> > because that's the way they are in JIS X 0208.
>>
>> That doesn't seem to solve the above problem at all, which involves
>> *different* countries using different glyphs for the "same" character.

No, I mentioned that because people still say Unicode is "missing some
kanji", and "was prepared ignoring national wishes", which is where this
thread started.

>> Jim, what I don't quite understand is this: exactly what problem is Unicode
>> meant to solve anyway?

The key problem was the inability of the pre-Unicode codes to mix languages
in a usable way. Have you ever tried to mix Japanese with French or German?
It was only possible before Unicode by using ISO-2022 escaping which is a
truly horrible way to handle text.

In the case of the "CJK" languages it was worse. At least with ordinary
alphabetics an "a" or a "b" tended to be the same regardless of language,
but with the CJK languages, something like [...]

[...] http://www.csse.monash.edu.au/~jwb/)
Computer Science & Software Engineering,  Tel: +61 3 9905 3298
Monash University, VIC 3800, Australia    Fax: +61 3 9905 5146
(Monash Provider No. 00008C)              ジム・ブリーン@モナシュ大学
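
For readers who have not met ISO-2022 escaping, here is a minimal Python sketch of the point made above about mixing languages. It is an illustration only, not something from the original message: the sample strings are made up, and it relies on Python's standard `iso2022_jp` and `utf-8` codecs. It shows the in-band escape sequences that switch character sets, what is left if those escape bytes get lost, and that plain ISO-2022-JP cannot carry Japanese and French in the same string, while UTF-8 can.

```python
# Illustrative sample strings (assumptions, not taken from the thread).
ESC = b"\x1b"

mixed_ja_en = "日本語 and English"
mixed_ja_fr = "日本語 et français"

# ISO-2022-JP switches character sets in-band with escape sequences:
# ESC $ B selects JIS X 0208, ESC ( B switches back to ASCII.
jis = mixed_ja_en.encode("iso2022_jp")
escape_count = jis.count(ESC)

# If the ESC bytes are lost in transit, only ASCII-looking debris remains.
stripped = jis.replace(ESC, b"").decode("ascii")

# Plain ISO-2022-JP has no accented Latin letters, so Japanese plus French
# cannot be encoded in it at all.
try:
    mixed_ja_fr.encode("iso2022_jp")
    french_ok = True
except UnicodeEncodeError:
    french_ok = False

# UTF-8 carries both scripts with no mode switching and round-trips cleanly.
utf8 = mixed_ja_fr.encode("utf-8")
round_trip = utf8.decode("utf-8") == mixed_ja_fr

print(jis)
print(stripped)
print(french_ok, round_trip)
```

The stateful design is the fragile part: a decoder has to track which character set is currently selected, so a single dropped escape byte changes the meaning of everything that follows it.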
- Follow-Ups:
- Re: [tlug] Re: Unicode (Was: apache2...)
- From: simon colston