Mailing List Archive
Re: [tlug] Search MySQL for Japanese Names]
- Date: Tue, 20 Oct 2009 10:58:50 +0900
- From: 黒鉄章 <akira@example.com>
- Subject: Re: [tlug] Search MySQL for Japanese Names]
- References: <5634e9210910191749m675cdf8cl3ca73efa0fcbeccb@example.com>
> There are several sources of possible readings of names-in-kanji. Don't
> rely on things like MeCab or Chasen and their lexicons are rather limited
> for names. ENAMDICT has a huge name collection and you can get the
> possibilities by looking them up on
> http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?2C
>
> HOWEVER you really must get confirmation on how people read their names.
> A significant number of names are read in unusual ways.
>
> Jim

Absolutely right. The MeCab/Chasen dictionaries (IPADIC, UniDic, whichever one you plug into them) don't include anywhere near as many name readings as ENAMDICT. By design these parsers don't want multiple readings for a name. They just want the most likely one.

I've made account-registration web forms which, being AJAX-y, do a lot of things dynamically as the user types. E.g. when they type the yomi (a.k.a. furigana, a.k.a. readings) for their names, I create the romaji version simultaneously. Or when they type the postcode, the address field is filled out to the town/suburb level.

But I don't fill in the furigana when they type the kanji version of the name, for fear of pissing off those whose name readings are not the most common.

Jim, curious question: how many names in ENAMDICT resolve to just one reading? Even an I-would-have-thought-surefire candidate for uniqueness such as 田中 (tanaka) resolves to ten different readings in ENAMDICT (tanata, tanka, danaka, nunoka, ....). 鈴木 (suzuki) has seven.

Yours,
Akira Kurogane
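To make the "don't auto-fill the furigana" point concrete, here is a minimal sketch of the check a form backend could run before offering a reading. It is only an illustration: the function name `suggest_furigana`, the `SAMPLE_READINGS` table, and the dict-of-lists layout are assumptions made for this example, not anything taken from ENAMDICT tooling or from the forms described above. The sample entry holds only the readings quoted in this message, so it is deliberately partial.

```python
from typing import Dict, List, Optional

# Hypothetical lookup table: kanji name -> candidate readings.
# Only the 田中 readings quoted in this message are listed; ENAMDICT
# itself reportedly has ten, so this list is partial on purpose.
SAMPLE_READINGS: Dict[str, List[str]] = {
    "田中": ["tanaka", "tanata", "tanka", "danaka", "nunoka"],
}


def suggest_furigana(kanji_name: str,
                     readings_by_name: Dict[str, List[str]]) -> Optional[str]:
    """Return a reading only when the name has exactly one candidate.

    Returns None when the name is unknown or ambiguous, so the caller
    leaves the furigana field empty and lets the user type it.
    """
    candidates = readings_by_name.get(kanji_name, [])
    # Drop duplicate readings while keeping their original order.
    unique = list(dict.fromkeys(candidates))
    if len(unique) == 1:
        return unique[0]
    return None


if __name__ == "__main__":
    # 田中 has several candidate readings, so nothing is suggested.
    print(suggest_furigana("田中", SAMPLE_READINGS))
```

Even when exactly one reading is found, treating it as a suggestion the user can overwrite seems safer than a silent fill, since the person registering is the only authority on how their name is read.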
- Follow-Ups:
- Re: [tlug] Search MySQL for Japanese Names]
- From: Stephen Lee
- Re: [tlug] Search MySQL for Japanese Names]
- From: Jim Breen
- References:
- Re: [tlug] Search MySQL for Japanese Names]
- From: Jim Breen