Mailing List Archive



[tlug] [OT/long] Yet another JMdict front-end



[apologies if this appears twice--I sent it earlier from the wrong account]

This really has nothing to do with Linux, but I know many of you are
interested in Japanese and Japanese dictionaries, and many of you also
are knowledgeable about Web design & development, so I thought I'd let
you know about a little project I have begun, and solicit some feedback.

To make a long story short, a couple of weeks ago I was looking for an
online Kanji dictionary, and couldn't find one I really liked. Or 
rather, I couldn't find an *interface* I really liked. So I decided to 
create my own. The site is intended to be fast, easy to navigate, and 
aesthetically pleasing. The target audience is English-speaking learners 
of Japanese (intermediate-to-advanced?), so the emphasis is on providing 
easy access to phrases including the target kanji, readings, and 
definitions--basically the kind of info you find in edict/jmdict. It 
currently does not provide information of more scholarly interest, such 
as Nelson index numbers and all that, though such info could maybe be 
added later on as an "advanced option."

Another thing you should know is that my site makes heavy use of the
latest in Web-standard[*] technology. It is an AJAX application, so at
a minimum you need a browser that has JavaScript enabled and supports
XMLHttpRequest. Recent Gecko-based browsers should be fine, along
with IE 5.5+ (?) and Opera 8+. Since the whole point of the project is
to develop a nicer interface to content that is easily available 
elsewhere, I don't feel obligated to create an alternative for older 
browsers, but of course I provide links to other online Kanji dictionaries.

Here's the URL: <http://matt.gushee.net:8250/index.html>. That's
probably temporary, so even if you really like it, please don't post any 
links to it just yet. If you have comments and don't want to clutter 
this list, just send an e-mail to <matt@example.com>.

SOME ISSUES TO CONSIDER
=======================

First of all, the title. I am tentatively calling the thing "楽漢摘." I
like to think it's rather a clever pun, but if any native Japanese
speakers are reading this, I'd like to know how it sounds to you. Is it
just a wake-wakaranai gaijin joke? Please don't worry about offending
me--I will be happy to change the title if it is too weird.

Now on to more substantive issues:

Indexing approach
-----------------

There will probably be several indexes in the future, but currently I
provide one way to look up Kanji: a traditional radical/stroke-count
index. Specifically, you select the radical's stroke count, then the
radical itself, then the stroke count for the whole character, then the
specific character that you want. Although it is a linear process and
thus easy to understand in principle, it has the disadvantage that most
people don't know by heart how many strokes are in a character, and the
count can be very hard to figure out for the more complex ones. In a
printed dictionary that's less of a problem because you can easily shift
your eyes to another part of the page; in a browser I think it will be
awkward at best.
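
The four-step lookup described above can be sketched as a nested
mapping. This is only an illustration, not the site's actual code: the
record layout, the function names, and the sample data are all
assumptions, and the "near" lookup is just one possible way to soften
the stroke-count problem.

```python
from collections import defaultdict

# Hypothetical sample records, for illustration only:
# (kanji, radical, radical stroke count, total stroke count)
SAMPLE_RECORDS = [
    ("休", "人", 2, 6),
    ("体", "人", 2, 7),
    ("作", "人", 2, 7),
    ("林", "木", 4, 8),
    ("校", "木", 4, 10),
]


def build_index(records):
    """Nest kanji as radical strokes -> radical -> total strokes -> [kanji]."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for kanji, radical, radical_strokes, total_strokes in records:
        index[radical_strokes][radical][total_strokes].append(kanji)
    return index


def lookup(index, radical_strokes, radical, total_strokes):
    """Follow the four menu steps; return [] when any step has no match."""
    return index.get(radical_strokes, {}).get(radical, {}).get(total_strokes, [])


def lookup_near(index, radical_strokes, radical, total_strokes, tolerance=1):
    """Also accept kanji whose total count is within `tolerance` strokes."""
    by_total = index.get(radical_strokes, {}).get(radical, {})
    found = []
    for count in sorted(by_total):
        if abs(count - total_strokes) <= tolerance:
            found.extend(by_total[count])
    return found


index = build_index(SAMPLE_RECORDS)
print(lookup(index, 2, "人", 7))
print(lookup_near(index, 2, "人", 6))
```

The exact lookup mirrors the menu path; the tolerant variant trades a
longer candidate list for not having to count strokes precisely.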

What other alternatives might work well (when you don't know the
pronunciation)? I've seen Jim Breen's "multi-radical" method and was 
initially resistant to it for a couple of reasons: first, it is 
non-linear, and thus is superficially more complex than the 
radicals/strokes method.

Second, I have been taught (for both Chinese and Japanese) that the
radical is the "meaning" component, and that in general a character has 
exactly one radical. At any rate, I believe the radical has etymological 
significance, and that understanding which part of the Kanji is the 
radical can contribute to an overall mastery of the language. And a 
single-radical dictionary index reinforces that understanding.

But I'm thinking that a multi--can I say "component" instead of
"radical"?  Then maybe I could set aside the philosophical objection. 
Anyway, a well-designed multi-thing index might after all be an easier 
way to look up Kanji.
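
For comparison, a multi-component lookup can be sketched as a set
intersection: the user picks any components they recognize, and the
candidates are the kanji that contain all of them. This is a rough
sketch and not necessarily how Jim Breen's multi-radical method is
implemented; the component table below is hand-made and hypothetical,
as are the function names.

```python
# Hypothetical, hand-made component table for illustration only:
# kanji -> set of visual components (not taken from any real data file).
COMPONENTS = {
    "休": {"人", "木"},
    "体": {"人", "木", "一"},
    "林": {"木"},
    "校": {"木", "亠", "父"},
    "明": {"日", "月"},
}


def build_component_index(components):
    """Invert the table: component -> set of kanji containing it."""
    inverted = {}
    for kanji, parts in components.items():
        for part in parts:
            inverted.setdefault(part, set()).add(kanji)
    return inverted


def find_by_components(inverted, selected):
    """Return the kanji that contain every selected component."""
    if not selected:
        return set()
    candidate_sets = [inverted.get(part, set()) for part in selected]
    return set.intersection(*candidate_sets)


inverted = build_component_index(COMPONENTS)
print(sorted(find_by_components(inverted, ["木"])))
print(sorted(find_by_components(inverted, ["人", "木"])))
```

Each extra component can only shrink the candidate set, and the order
of selection does not matter, which is what makes the method non-linear.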

Strokes/radicals index navigation
---------------------------------

If I decide to go to a multi-component index, this might not matter any
more. But for the moment, there is an issue with the index menus: since
the user will often not be sure how many strokes there are in a
character, I have created dynamic menus such that ... actually it's best
if you try it out. Basically, if you move your mouse over an item in one
row of the menu, the next row is *temporarily* displayed. Thus, let's
say you have chosen a given radical. There is a row of numbers
representing stroke counts of characters with that radical; if you run
your mouse along that row you can easily see what characters exist for
each stroke count.

So, do you think this is (a) useful, and (b) intuitive? It would be a
lot easier to make the menus so that the next row only changes when you 
click something. But if people find the transient display a very helpful 
feature, I will make it work.

Presentation of results
-----------------------

Currently when you select a Kanji, a request goes to the server, which
returns a document containing all phrases that start with that Kanji.
This document is dumped into a table with 3 columns: [Kanji] Phrase,
Reading, and Definitions. This is reasonable in some cases, but
sometimes the response document is quite large, so I think some kind of
chunking and/or filtering would be helpful. It gets worse if we want to
look up all phrases *containing* the selected character. My server-side
script can indeed do that, but sometimes it's just way too much data, so
I've disabled that behavior for the moment.

Another issue with the result sets is that they're not sorted in any
useful way--actually I believe they are ordered according to the JMdict 
entry sequence number.

So, how can I improve the processing and presentation of the results?
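
One possible shape for the filtering, sorting, and chunking discussed
above, as a minimal sketch. Everything here is an assumption: the Entry
fields, the sequence numbers, the sort key (shorter phrases first, then
reading), and the page size are made up for illustration and say
nothing about the actual server-side script.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    seq: int          # stand-in for a JMdict sequence number (made-up values)
    phrase: str
    reading: str
    definitions: str


# Hypothetical sample data, deliberately out of any useful order.
SAMPLE = [
    Entry(3, "学校", "がっこう", "school"),
    Entry(1, "学生", "がくせい", "student"),
    Entry(2, "大学", "だいがく", "university"),
    Entry(4, "学", "がく", "learning"),
]


def select(entries, kanji, mode="prefix"):
    """Filter by prefix or containment, then sort short phrases first.

    Readings are compared by code point, which only approximates
    dictionary order for kana.
    """
    if mode == "prefix":
        matches = [e for e in entries if e.phrase.startswith(kanji)]
    else:
        matches = [e for e in entries if kanji in e.phrase]
    return sorted(matches, key=lambda e: (len(e.phrase), e.reading))


def page(entries, number, size=2):
    """Return one chunk of results; page numbers start at 0."""
    start = number * size
    return entries[start:start + size]


results = select(SAMPLE, "学", mode="contains")
for entry in page(results, 0):
    print(entry.phrase, entry.reading, entry.definitions)
```

Returning one page at a time keeps the response small even in the
"containing" mode, at the cost of extra requests as the user pages on.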

Miscellaneous technical stuff
-----------------------------

Preparing the index: my list of radicals is derived from Jim Breen's
KANJIDIC, but since his data is prepared for a multi-radical lookup
system, I can't automatically extract a radicals-and-strokes index, so I
am currently creating the index manually. That's why it's so incomplete,
of course. Does anyone know of another database somewhere that lists
each kanji by (single) radical and stroke count?

Glyphs for radicals: if my understanding of the KANJIDIC documentation
is correct, there is a glyph of each radical in Japanese Kanji, but some 
of them only exist in JIS X 0212. If so, you either have to require the
user to have a JIS X 0212 font, use images to represent some radicals, or
use substitute glyphs from JIS X 0208. The last option is not really
acceptable, I don't think. E.g., 化 for 人偏??

Nice Japanese font: this is purely subjective, of course, but I find
Mincho rather ugly. I have a font family called DFKaisho which I find to 
be an excellent combination of elegance and readability; my stylesheet 
specifies it for some of the Kanji display elements (with "serif" as a 
fallback, of course). But in the interest of a more beautiful 
Kanji-browsing experience, are there other Kaisho or similar fonts that 
are widely used? Let me know their names and I'll stick 'em in the 
stylesheet. Or tell me to just use Mincho if that's your view. But be 
advised: I am very stubborn about fonts.

[*] Using the term 'standard' to include some de facto standards as well
    as official published ones.

--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language :     matt.gushee.net/rascl/ :

