tlug.jp Mailing List Archive
Re: [tlug] i18n Primer
- Date: Wed, 11 Aug 2004 10:26:37 +1000 (EST)
- From: Jim Breen <Jim.Breen@example.com>
- Subject: Re: [tlug] i18n Primer

"Stephen J. Turnbull" <stephen@example.com> wrote:

>> >>>>> "Josh" == Josh Glover <tlug@example.com> writes:
>>
>> Josh> Will one of the i18n gurus on this list (Steve, Jim, et al.)
>> Josh> please recommend *the* i18n primer for software developers?
>> Josh> I need to remedy my glaring lack of knowledge in the area.
>>
>> There isn't one that I know of. Jim and I were going to write one,
>> but it hasn't happened yet.

Naruhodo. I am bearing a huge load of guilt because I have done very
little towards it.

>> Josh> Using Google, I found "Introduction to i18n", by Kubota
>> Josh> Tomohiro,[1] which I have printed off for reading. Is this a
>> Josh> good intro?
>>
>> No. It's written by a Japanese, which normally implies a quite warped
>> point of view toward i18n. Kubota is no exception. Nor am I (I may
>> not have been born Japanese, but my I18N baptism was in the Church of
>> Shit-JIS Mojibake).

I've read some of Kubota's stuff. It's not bad, but a bit idiosyncratic
(but then, aren't we all?)

>> However, it may be the best there is in English. Suzuki et al have a
>> book in Japanese, but it's oriented to the specialist. O'Reilly's X
>> Window System series had an X11R5 update volume which introduced a
>> bunch of things reasonably well, but obviously it's heavily
>> X-oriented. And I don't know how that's been treated in the X11R6
>> editions. My article in the LJ in 1999 was too heavy on theory, not
>> very strong on practice, as was my chapter in Wrox's Professional
>> Linux Programming.

'Twas early days though. I think things are a bit clearer now in a
practical sense.

>> Josh> I am trying to avoid writing software that is difficult to
>> Josh> internationalise, so I am looking to become familiar with
>> Josh> the basics of i18n.
>>
>> There are no basics of I18N, really. It's all advanced details.

And that's the hurdle we are trying to climb over. There is a heap of
stuff you have to do and do right.

>> However, if you plan to leave the details to others, it's not too
>> hard. The first principle is to convert to Unicode (if you're working
>> in a low-level language like C/C++, preferably widechars, not UTF-8,
>> so as to ensure that English doesn't work serendipitously if you use
>> the wrong API) as soon as possible, and do all internal string
>> processing in Unicode.

Um. I'm not so sure about this. I can think of many situations where
you can comfortably leave the internals in UTF8. The hit converting
UTF8->Unicode->UTF8 while working on a large file can be horrible.
For example, the main part of the internal format of my JMdict is in
EUC, and I can open it in an EUC-capable editor in 3 seconds. Opening
the UTF8 version in something like Yudit takes about as long as making
a cup of coffee.

>> The second issue is to decide whether you are supporting localization
>> (ie, users are normally monolingual), or multilingualization (the user
>> community is multilingual, even if the users are not). In the former
>> case, you just need to make sure you always do the conversion, and the
>> default external encoding can be a global setting. In the latter
>> case, you need to strictly control which modules are allowed to do
>> I/O, because otherwise it's possible for different modules to get
>> conflicting ideas about what encodings are being used. Furthermore,
>> if somebody later decides to do more sophisticated conversions etc,
>> they'll be chasing bugs forever as different parts of the program get
>> updated at different times because there's no complete list.
>>
>> The third issue is message localization using gettext. This has a
>> moderate number of tricky parts if you want to do it right (for
>> example, dealing with printf when the variable parts come in different
>> orders in different languages), but it's also something that you can
>> typically leave to a specialist, since these issues are normally
>> localized to each message.
>> That is, your program's architecture can't
>> make it harder or easier for the translation team. However, if you
>> want to encourage L10N from the get-go, learn that stuff and provide
>> message catalogs. There are lots of tools, and the suite in the
>> gettext package is quite complete.

All well put.

>> As long as you use a language (a p-language, for example) or toolkit
>> (GTK) that supports Unicode internally, you generally do not have to
>> worry about issues like font handling or input methods. Those are
>> somebody else's problem. ;-)

Many people think these are the problem, but I agree that if you have
done the earlier bits properly, fonts and inputs are a done deal.

Jim
-- 
Jim Breen                         http://www.csse.monash.edu.au/~jwb/
Computer Science & Software Engineering,         Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia           Fax: +61 3 9905 5146
(Monash Provider No. 00008C)                ジム・ブリーン@モナシュ大学
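
Two of the points quoted above can be sketched roughly in Python (one of the "p-languages" with Unicode strings built in): converting between the external encoding and Unicode only at the I/O boundary, and letting each translated message place its variable parts in its own order. This is only an illustration under stated assumptions. The names `to_internal`, `to_external`, `CATALOG` and `render` are hypothetical, the Japanese string is an illustrative translation, and a real program would load its messages from gettext catalogs rather than from a dict.

```python
# Sketch only: all function names and the CATALOG dict are hypothetical.

def to_internal(raw: bytes, encoding: str) -> str:
    """Decode external bytes to a Unicode string at the input boundary."""
    return raw.decode(encoding)


def to_external(text: str, encoding: str) -> bytes:
    """Encode a Unicode string to the external encoding at the output boundary."""
    return text.encode(encoding)


# Stand-in for a message catalog. Named placeholders let each language
# put the variable parts wherever its grammar needs them.
CATALOG = {
    "en": "{user} copied {count} files",
    "ja": "{count} 個のファイルを {user} がコピーしました",
}


def render(lang: str, user: str, count: int) -> str:
    """Fill in a message; the argument order is decided by the template."""
    return CATALOG[lang].format(user=user, count=count)


if __name__ == "__main__":
    # The same internal string has different external byte forms.
    text = to_internal(to_external("日本語", "euc_jp"), "euc_jp")
    print(text == "日本語")
    print(len(to_external(text, "euc_jp")), len(to_external(text, "utf-8")))

    # Same call, different placeholder order per language.
    print(render("en", "jim", 3))
    print(render("ja", "jim", 3))
```

In C the same ordering problem is usually handled with POSIX positional conversions such as `%1$s` and `%2$d` in the translated format string, so the translator rather than the programmer decides the order.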
- Follow-Ups:
- Re: [tlug] i18n Primer
- From: Stephen J. Turnbull