Mailing List Archive



[tlug] CJK Unicode problem . . . . . . . . (was: Re: Learn a Variety of Languages)



Jim writes:

 > I worry: 
 > 
 >    http://en.wikipedia.org/wiki/Han_unification
 > 
 > Unicode solves many problems and maintains or creates others. 

Han unification isn't a problem, it's the solution.  Everybody, except
for a certain breed of nationalist dinosaur, agrees that those are the
same characters.  Buddhist scholars, Japanese Chinese poets, and
Gang-of-Six translators may wish to differentiate Han flavors in a
single document, but they should be using explicit markup anyway.
It's not as if *any* of the Han-using languages can collate by
codepoint order, and most especially not Japanese, whose speakers,
curiously enough, count as the most fervent opponents of Han unification.
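
To make the codepoint-order point concrete, here is a minimal sketch in
Python.  The three words and their kana readings are a hand-made
illustration (an assumption of this example, not data from Unicode,
which carries no reading information); real Japanese collation needs a
dictionary of readings and is more involved than this.

```python
# Hypothetical word list: (written form, kana reading).
# The readings are supplied by hand for illustration only.
words = [("山", "やま"), ("川", "かわ"), ("愛", "あい")]

# Plain string sort compares codepoints: U+5C71, U+5DDD, U+611B.
by_codepoint = sorted(w for w, _ in words)

# Sorting on the reading approximates dictionary (gojuon) order.
by_reading = [w for w, _ in sorted(words, key=lambda pair: pair[1])]

print(by_codepoint)
print(by_reading)
```

The two lists come out in opposite orders here, and that would be so
whether or not the Han characters had been unified.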

For everybody else, it's a matter of font preference and a character
database (e.g., collation order).  But this is true even for Latin
characters.  I forget the details, but in proper typography in some
European languages accents are considered part of the character, and
most (but not all!) of those languages prefer fonts that put less
space between the main character and the accent.  In others, accents
are considered additions, and the reverse is true.  This distinction
also has implications for collation order, some languages collating
accented versions of a character with the base, others collating them
in some other, fairly arbitrary place.
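
The two collation conventions can be sketched as sort keys.  This is
only an illustration using Python's standard unicodedata module: the
function names, the word list, and the choice of "åäö" after "z" (as in
the Swedish alphabet) are assumptions of the example, and a real
application would use a proper collation library rather than
hand-rolled keys.

```python
import unicodedata

def base_key(word):
    """Accents as additions: compare base letters first, accents break ties."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (stripped, decomposed)

def separate_letter_key(word, extra="åäö"):
    """Accented letters as letters in their own right, placed after 'z'."""
    alphabet = "abcdefghijklmnopqrstuvwxyz" + extra
    composed = unicodedata.normalize("NFC", word.lower())
    # Characters outside the alphabet sort after it, in codepoint order.
    return [alphabet.index(c) if c in alphabet else len(alphabet) + ord(c)
            for c in composed]

words = ["zebra", "äpple", "apple", "år", "öl", "olja"]

print(sorted(words))                           # raw codepoint order
print(sorted(words, key=base_key))             # accented with the base
print(sorted(words, key=separate_letter_key))  # accented after 'z'
```

All three orderings differ for this list, and none of them depends on
how the characters happen to be numbered.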



