Re: tlug: Kanji to Hiragana soft
- To: tlug@example.com
- Subject: Re: tlug: Kanji to Hiragana soft
- From: Jonathan Byrne <jq@example.com>
- Date: Sat, 24 Oct 1998 20:39:20 +0900 (JST)
- Content-Type: TEXT/PLAIN; charset=US-ASCII
- In-Reply-To: <Pine.LNX.3.96.981024183116.663A-100000@example.com>
- Reply-To: tlug@example.com
- Sender: owner-tlug@example.com
On Sat, 24 Oct 1998, Eric S. Standlee wrote:

> Is there a software package that will change kanji in to hiragana
> (furigana) for linux. I need to take a large amount of Japanese text and
> filter kanji into hiragana for those who cannot read kanji well.

I've never seen anything like that on any platform, so you may have your work cut out for you in this search.

If I may, I'd like to suggest that what you really need here is a program that will supply hiragana readings in addition to the kanji, rather than replacing them. A page of pure hiragana is quite difficult to read. It's such a nuisance, in fact, that I probably wouldn't bother reading it if someone gave me such a page.

There are two approaches that could be taken to this. One is to add furigana to the kanji, which AFAIK requires a GUI-based solution. The other would be to insert hiragana readings in parentheses or square brackets as inline text after the kanji in question.

A difficult point in doing this is that it will need a *big* dictionary, and also a pretty accurate parser to decide where one word ends and the next begins in cases where there are several kanji compounds in a row that are not broken up by punctuation or interspersed kana.

Put another way, the program you are talking about is more or less an inverted input method: it takes kanji compounds, compares them against its dictionary, and outputs its best guess as to what the correct readings are. It will need not only to find word boundaries accurately and check its dictionary, but also to have algorithms for deciding what to do about compounds that aren't in the dictionary (ignore them, or try to figure them out and mark them as unsure). A pretty necessary feature would also be the ability to add to the dictionary. The one part that's easier than making an IME is that it doesn't have to deal with accepting output from applications.
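To make the inline-reading idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not an existing tool: the tiny READINGS table and the names is_kanji and annotate are hypothetical, word boundaries are guessed by greedy longest match (one simple strategy among several, and not an especially accurate one), and kanji runs that are not in the table are marked with (?) rather than guessed at.

```python
# Hypothetical toy dictionary; a real tool would need a very large one,
# plus a way for the user to add entries.
READINGS = {
    "日本語": "にほんご",
    "日本": "にほん",
    "東京": "とうきょう",
    "大学": "だいがく",
    "勉強": "べんきょう",
}


def is_kanji(ch):
    """Rough test: is ch in the CJK Unified Ideographs block?

    Assumption: this ignores extension blocks and marks such as 々.
    """
    return "\u4e00" <= ch <= "\u9fff"


def annotate(text, readings=READINGS):
    """Insert a hiragana reading in parentheses after each known compound."""
    max_len = max(map(len, readings), default=1)
    out = []
    i = 0
    while i < len(text):
        if not is_kanji(text[i]):
            out.append(text[i])  # kana, punctuation, etc. pass through
            i += 1
            continue
        # Find the end of this unbroken run of kanji.
        end = i
        while end < len(text) and is_kanji(text[end]):
            end += 1
        unknown = ""  # kanji in this run with no dictionary entry
        while i < end:
            # Greedy longest match: try the longest candidate first.
            for n in range(min(max_len, end - i), 0, -1):
                word = text[i:i + n]
                if word in readings:
                    if unknown:
                        out.append(unknown + "(?)")
                        unknown = ""
                    out.append(f"{word}({readings[word]})")
                    i += n
                    break
            else:
                unknown += text[i]  # no entry starts here; mark as unsure
                i += 1
        if unknown:
            out.append(unknown + "(?)")
    return "".join(out)


if __name__ == "__main__":
    print(annotate("東京大学で日本語を勉強する"))
```

With this table, the run 東京大学 is split into 東京 and 大学 because the longer candidate 東京大 has no entry. Greedy matching will pick wrong boundaries or wrong readings in plenty of real text, which is exactly why a large dictionary and a better parser would be needed.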
This could be written so that it just accepted a text file as input and produced another one (with kana readings added) as output. A really sophisticated one would work interactively and allow the user to correct readings that were wrong, or flag in red those that seemed questionable.

This is certainly not a trivial program, and one for which there has probably been, and will continue to be, little demand on Linux (or on other platforms too, maybe). However, an accurate and reasonably fast tool that could add furigana could potentially be a very useful item for language teachers and students, etc.

I wish I had the ability to write something like this myself; I really do. If one couldn't be located anywhere, I'd start working on it myself.

I have a CD with the Monash U. Nihongo archive on it. I'll search through it and see if I can find something. I'll let you know what I come up with.

Cheers,
Jonathan

---------------------------------------------------------------
Next Nomikai: 20 November, 19:30 Tengu TokyoEkiMae 03-3275-3691
Next Meeting: 12 December, 12:30 Tokyo Station Yaesu central gate
---------------------------------------------------------------
Sponsor: PHT, makers of TurboLinux http://www.pht.co.jp
- References:
- tlug: Kanji to Hiragana soft
- From: "Eric S. Standlee" <fwiw3980@example.com>