Mailing List Archive



Re: [tlug] Japanese regex question



On Wednesday 24 August 2005 19:50, Brett Robson wrote:

> But he said he was using raw 2022 encoding and those numbers look
> correct for katakana in 2022. I was thinking though that he would
> probably be better off converting to unicode internally for that
> exact reason.

Brett, you can't imagine how much I wish that were an option :(

I'm working with regexes that have to match Russian (in KOI8 and
Windows-1251), Japanese, Chinese (mostly Big5 and GB 2312), and (soon)
Korean, all through an application that wasn't designed for anything
but ASCII when it was created.  Wheeeeeeeeeee!

That's why I'm dabbling in raw character codes :p
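
For readers wondering what matching on raw character codes looks like in practice, here is a minimal sketch in Python. It assumes the input is ISO-2022-JP, where ESC $ B (or ESC $ @) switches to two-byte JIS X 0208 and katakana sit in row 5 (first byte 0x25, second byte 0x21-0x76). The function name katakana_runs and both patterns are hypothetical and are not taken from the application discussed in this thread.

```python
import re

# A two-byte JIS X 0208 segment: ESC $ B or ESC $ @, then whole byte pairs.
SEGMENT = re.compile(rb"\x1b\$[@B]((?:[\x21-\x7e]{2})*)")

# Inside a segment, consume either a run of katakana pairs or one other pair.
# Because every pair matches one alternative, scanning stays pair-aligned.
PAIRS = re.compile(rb"(?P<kana>(?:\x25[\x21-\x76])+)|[\x21-\x7e]{2}")


def katakana_runs(raw: bytes) -> list[bytes]:
    """Return the raw byte runs of katakana found in ISO-2022-JP data."""
    runs = []
    for segment in SEGMENT.finditer(raw):
        for match in PAIRS.finditer(segment.group(1)):
            if match.group("kana") is not None:
                runs.append(match.group("kana"))
    return runs


if __name__ == "__main__":
    raw = "abc カタカナ と ひらがな".encode("iso2022_jp")
    for run in katakana_runs(raw):
        # Re-wrap the run in escape sequences so the codec can decode it.
        print((b"\x1b$B" + run + b"\x1b(B").decode("iso2022_jp"))
```

The two-step approach is deliberate: a bare pattern such as \x25[\x21-\x76] applied to the whole byte stream can match across a character boundary (the second byte of one character plus the first byte of the next) or inside plain ASCII text. Note that the long-vowel mark is encoded in row 1, so this sketch does not treat it as katakana.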

And yes, Ken Lunde's CJKV Information Processing is worth much more than 
its fairly steep price :-)

Jonathan

