Re: Japanese search engines
- To: tlug@example.com
- Subject: Re: Japanese search engines
- From: "Frank BENNETT (=?iso-2022-jp?B?GyRCJVUlaSVzJS8hISVZJU0lQyVIGyhC?= )" <bennett@example.com>
- Date: Sun, 17 Dec 2000 21:48:35 +0900
- Content-Transfer-Encoding: 7bit
- Content-Type: text/plain; charset=iso-2022-jp
- In-Reply-To: <4.2.0.58.J.20001217181239.02a4a028@example.com>; from YAMAGATA Hiroo on Sun, Dec 17, 2000 at 06:14:13PM +0900
- References: <20001216192102.A2171@example.com> <20001216192102.A2171@example.com> <200012170902.SAA05208@example.com> <4.2.0.58.J.20001217181239.02a4a028@example.com>
- Reply-To: tlug@example.com
- Resent-From: tlug@example.com
- Resent-Message-ID: <9uCVkB.A.Q6E.lYLP6@example.com>
- Resent-Sender: tlug-request@example.com
On Sun, Dec 17, 2000 at 06:14:13PM +0900, YAMAGATA Hiroo wrote:
> At 18:07 00/12/17 +0900, you wrote:
> >I stick to Namazu + Kakashi. It's well designed Japanese search engine, if
> >you installed properly. I did not have a experience to handle such huge
> >data archive you mentioned , however it worth to test it.
>
> Maybe better to use ChaSen rather than Kakashi. Those archaic Kanpo
> languages may not score well with Kakashi... but you need to test.

Honda-san, Yamagata-san, thank you. I will definitely look at ChaSen,
and once things settle down, I will take a stab at running Namazu
over the sources, to see how it performs.

I did sit down and write syntax-checking code in Python for
freeWAIS-sf-jp during the weekend. The attractions of freeWAIS-sf
are its support for free-text parsing of the target document (so we
can define a date field, keyword fields, etc.), and its support for
proximity operators (for example, "Prime w/2 Mori" would find
documents containing "Prime Minister Mori", as well as "George Mori
likes prime rib", but not "Mori, it must be said, is a poor excuse
for a Prime Minister"). I absolutely need the first feature, because
of the way my data set is built. The second is nice, because it
looks and feels like the Lexis service, with which most legal
practitioners are familiar.

With Honda-san's encouragement, I'll follow this one up myself.
Many thanks.

Oh, and I should mention that the archive I'm working on _will_ be
thrown open for general access in due course. More news in a few
more days :-)

Cheers,
Frank
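
For readers unfamiliar with proximity operators, the "Prime w/2 Mori" behaviour described in the message can be sketched in a few lines of Python. This is only an illustration that reproduces the three examples given; it is not freeWAIS-sf's implementation, and the function name `within` and the regex-based word splitting are assumptions made for the sketch. It also assumes whitespace-delimited text, so Japanese input would first need a segmenter such as ChaSen.

```python
import re


def within(text, a, b, n):
    """Return True if words a and b occur at most n word positions apart.

    Hypothetical helper: order-insensitive and case-insensitive, which
    matches the three examples in the message but may differ from how
    freeWAIS-sf itself defines w/N.
    """
    # Split on runs of word characters; punctuation is discarded.
    words = [w.lower() for w in re.findall(r"\w+", text)]
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    # Any pair of distinct positions close enough counts as a hit.
    return any(0 < abs(i - j) <= n for i in pos_a for j in pos_b)


if __name__ == "__main__":
    samples = [
        "Prime Minister Mori",
        "George Mori likes prime rib",
        "Mori, it must be said, is a poor excuse for a Prime Minister",
    ]
    for s in samples:
        print(within(s, "Prime", "Mori", 2), "-", s)
```

The first two samples match because the two words are two positions apart in either order; the third does not because they are eleven positions apart.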
References:
- Japanese search engines
  From: Frank BENNETT <bennett@example.com>
- Re: Japanese search engines
  From: Shigeo Honda <shige@example.com>
- Re: Japanese search engines
  From: YAMAGATA Hiroo <hiyori13@example.com>