Mailing List Archive



[tlug] Unicode and kanji (was apache2 setup and japanese)




--- Jim Breen <jwb@example.com> wrote:
> 
> (*) Since Unicode 4.0, it has every kanji known to mankind, apart
> from those created in the last 6 months.
> 

For Japanese, I think it is probably about 99.99% there.
Unfortunately, Unicode doesn't have all the characters from
CNS 11643, the character set used in Taiwan for government systems
like the household registry and tax administration. And they may
never all be added, since some of the characters that CNS 11643
encodes as separate code points would be unified by Unicode.

I used to build document management systems for the government
in Taiwan, and it gets even worse. People often came in with
documents containing "incorrect" characters in personal or
place names (e.g. a two-stroke water radical where there
should be a three-stroke water radical).
The departments went to the standards office in Taiwan and asked
for code points for these characters, but were refused.
So they create their own font glyphs whenever necessary and
use the equivalent of Adobe Acrobat to exchange documents with
other departments. The documents can be printed out, but the
text is not searchable. What a mess :^P
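
A minimal sketch of how one might spot such characters in text,
assuming (this is not stated above) that the home-made glyphs were
mapped to Unicode Private Use Area code points. The function name
and the sample code point U+E000 are hypothetical. Private-use
characters have no agreed meaning outside the font that defines
them, which is one reason text like this is hard to search or
exchange.

```python
import unicodedata


def find_private_use(text):
    """Return (index, 'U+XXXX') for each private-use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        # General category "Co" marks private-use code points.
        if unicodedata.category(ch) == "Co"
    ]


# Hypothetical sample: a standard kanji, a custom glyph assumed to
# live at U+E000, then another standard kanji.
sample = "\u6797\ue000\u6c34"
print(find_private_use(sample))
```

This only reports where such characters occur; it can't say which
glyph a given department's font actually draws at that code point.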


