Mailing List Archive


RE: nkf [was: RE: tlug: RE: cathedral and bazaar Japanese]



>>>>> "jb" == Jonathan Byrne--3Web <jq@example.com> writes:

    jb> That whole thing with Netscrape not being able to read that
    jb> page is rather disappointing.  Apparently, "Japanese
    jb> auto-detect" in the Linux version of Communicator means
    jb> something less than that in reality.  The versions of

Fully reliable "Japanese autodetect" is impossible with anything less
than artificial intelligence.  There are some heuristics that help,
but there is no purely arithmetical way to distinguish EUC-JP from
SJIS with 100% reliability.  You basically have to be able to look at
the text and realize it's nonsense.  (That's harder than you think,
because such a detector would invariably decide that Windows Help is
not written in any known character set :-)
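
To make the ambiguity concrete, here is a minimal Python sketch.  It
assumes Python's built-in "euc_jp" and "shift_jis" codecs; the helper
name plausible_encodings is made up for illustration.  It only checks
which codecs accept the bytes without error, which is roughly the
"arithmetical" test, and shows that both can accept the same input.

```python
# Hypothetical helper: list the candidate codecs that decode the
# bytes without raising an error.  This is a validity check only,
# not a real detector.
CANDIDATES = ("euc_jp", "shift_jis")


def plausible_encodings(data: bytes) -> list[str]:
    accepted = []
    for codec in CANDIDATES:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        accepted.append(codec)
    return accepted


# Two hiragana encoded as EUC-JP.  Each byte also happens to fall in
# the range SJIS uses for single-byte half-width katakana, so the SJIS
# decoder accepts them too and yields different (nonsense) text.
sample = "あい".encode("euc_jp")
print(plausible_encodings(sample))
print(repr(sample.decode("euc_jp")), repr(sample.decode("shift_jis")))
```

Both codecs accept the sample, so telling them apart means judging
which decoded text looks like real Japanese, and that is where the
heuristics come in.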

The charset label "x-euc-jp" is even less reliable.  That "x-" means
that this is a private name, not registered with the IANA.  (By
definition, it is not possible to register an "x-" name with the
IANA.)  Few MIME implementations understand it; the registered
charset name for this encoding is plain "EUC-JP".

The right thing to do with localized Japanese pages is to use the
ISO-2022-JP encoding and label it with charset=ISO-2022-JP in the
MIME Content-Type header; essentially all MIME implementations can
handle that.  Alternatively, you can use UTF-7 or UTF-8; the
registered charset names are simply "UTF-7" and "UTF-8".
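
As a rough illustration of the ISO-2022-JP suggestion, the sketch
below uses Python's built-in "iso2022_jp" codec to encode a page body
and builds the matching header line.  The function name
make_iso2022jp_page and the text/html media type are assumptions made
for the example, not anything a particular server requires.

```python
# Hypothetical helper: encode text as ISO-2022-JP and return the
# header line that should accompany it.
def make_iso2022jp_page(text: str) -> tuple[str, bytes]:
    payload = text.encode("iso2022_jp")
    header = "Content-Type: text/html; charset=ISO-2022-JP"
    return header, payload


header, payload = make_iso2022jp_page("日本語")
print(header)
print(payload)

# ISO-2022-JP is a 7-bit encoding: every byte is below 0x80, and
# escape sequences switch between ASCII and the kanji set.
print(all(b < 0x80 for b in payload))
```

Because the payload is pure 7-bit and the escape sequences are
explicit, a receiver does not have to guess between EUC-JP and SJIS.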

BTW, HotJava is probably fine if you have infinite RAM; it may be
better now, too, but when I tried it a year ago, after an hour of
browsing it had a footprint of 100MB.  Stomp!  Stomp!


