Mailing List Archive


Re: [tlug] Re: UTF-8 Terminal Emulators?



>>>>> "simon" == simon colston <simon@example.com> writes:

    simon> I'm trying to convert the encoding of Japanese email to
    simon> EUC-JP using libjconv (a convenience wrapper for iconv).

Another victim of Uli Drepper's admirable conformance-mania.  ;-)

    simon> The problem is that single-byte katakana cause mojibake
    simon> when converting from iso-2022-jp.  Does anyone know the
    simon> reason for this?

Yes.  There are no single-byte katakana in ISO-2022-JP.  For more
information, get RFC 1468 from your nearest repository.  It's short (6
screens or so).
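
A minimal sketch of that point, using Python's bundled codecs rather than
iconv or libjconv (that substitution is an assumption made for illustration;
glibc's converters may behave differently in detail). The helper name
can_encode is hypothetical. It checks whether half-width katakana can be
represented in strict ISO-2022-JP, in Python's extended variant
iso2022_jp_ext, and in EUC-JP.

```python
# Half-width (single-byte) katakana A is U+FF71; the full-width form is U+30A2.
HALFWIDTH_A = "\uff71"
FULLWIDTH_A = "\u30a2"


def can_encode(text, codec):
    """Return True if `codec` can represent `text`, False otherwise."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Columns: codec name, full-width encodable?, half-width encodable?
for codec in ("iso2022_jp", "iso2022_jp_ext", "euc_jp"):
    print(codec, can_encode(FULLWIDTH_A, codec), can_encode(HALFWIDTH_A, codec))
```

With Python's strict iso2022_jp codec the half-width character should be
rejected, while EUC-JP can carry it, which is why the conversion target is
not the problem here.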

    simon> I've discovered that converting the same text from
    simon> iso-2022-jp-2 works perfectly.  Are the content-type
    simon> headers lying to me when they say 'charset="iso-2022-jp"'?

Strictly speaking, yes.

    simon> Do they really mean 'charset="iso-2022-jp-2"' ??

No.  They really mean "nobody else fully conforms to standards, so why
should I---expect the worst, because you'll get it."  ISO-2022-JP-2
aka ISO-2022-INT is a good way to prepare for that; in theory it can
even handle Unicode, although I've never tested iconv with it.
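
One way to "expect the worst" can be sketched as follows, again with Python's
codecs standing in for iconv (an assumption, not what libjconv does). The
function to_euc_jp and its parameters are hypothetical names. It tries the
declared charset first and falls back to a more permissive one only when
strict decoding fails; iso2022_jp_ext is Python's katakana-capable variant
and is not the same thing as ISO-2022-JP-2.

```python
def to_euc_jp(raw, declared="iso2022_jp", fallbacks=("iso2022_jp_ext",)):
    """Decode `raw` with the declared charset, then with each fallback,
    and re-encode the first successful result as EUC-JP."""
    last_error = None
    for codec in (declared, *fallbacks):
        try:
            return raw.decode(codec).encode("euc_jp")
        except UnicodeDecodeError as exc:
            last_error = exc  # remember the failure and try the next codec
    raise last_error


# ESC ( I designates JIS X 0201 katakana, an escape RFC 1468 does not list,
# so this message is mislabeled if it claims charset="iso-2022-jp".
mislabeled = b"\x1b(I1\x1b(B"
print(to_euc_jp(mislabeled))
```

Trying the declared charset first keeps conforming mail on the strict path;
the fallback only runs for messages that already violate their own header.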

I-have-5-dan-in-kijundo-ly y'rs,

Steve

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
              Don't ask how you can "do" free software business;
              ask what your business can "do for" free software.

