Mailing List Archive
Re: [tlug] oneliners, Was: Moving on from xterm
- Date: Thu, 25 Aug 2016 08:34:26 +0900
- From: NOKUBI Takatsugu <knok@example.com>
- Subject: Re: [tlug] oneliners, Was: Moving on from xterm
- References: <20160819111442.GA30780@quadratic.cynic.net> <9f9cc5f579c92c3ddf7f29865d5862c2@jp.sometwo.net> <20160822114101.GA3944@fluxcoil.net> <87h9ace7zm.wl-knok@daionet.gr.jp> <CABHGxq4gBx39m0+TPZe3LLYPFetAvoc1wfZj0_0YGz3+w2A=1w@mail.gmail.com> <87fupvdqna.wl-knok@daionet.gr.jp> <CABHGxq5=zHNYVLEK+SPg8jg3jJsD54rFsM9=-hNwZwQ2jhkEOw@mail.gmail.com>
- User-agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (Gojō) APEL/10.8 EasyPG/1.0.0 Emacs/24.4 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
At Wed, 24 Aug 2016 12:11:58 +1000,
Jim Breen wrote:
> Apart from its age, IPADIC also had/has problems with release permissions
> dating back to its ICOT source. For that reason the people at NAIST built
> a replacement "NAIST DIC". (https://en.osdn.jp/projects/naist-jdic/)

Oh... This "IPADIC/ICOT license issue" was caused by me...
The problem was discussed on the debian-legal mailing list, and the
following is the summary:

https://wiki.debian.org/IpadicLicense

Debian now treats ipadic as DFSG-free.

BTW, this was only discussed within the Debian Project. Other
distributions (like Fedora and openSUSE) don't care about it.
I think this problem is overrated in the public mind.

> > On the other hand, Toshinori Sato said that mecab-ipadic-neologd is
> > better performance than plain ipadic on text classification task.
> > It's really hard problem...
>
> What do you mean by "text classification task"? For getting the right yomikata (aka furigana)
> on a proper name longer sequences can be useful, but there's a lot of
> text analysis where the stuff Sato has added would cause quite some grief.
> His addition of "中居正広のミになる図書館" as an entry is a hoot.

It means using mecab-ipadic-neologd for word segmentation only, and
not using the features. Word segmentation is widely used in text
classification tasks.

I didn't make this clear. Toshinori said that he uses a text
classification task as a quantitative evaluation of the dictionary.
I heard this from him at a public event, but there is no presentation
material, so I don't know the details.

For general natural language processing, mecab-ipadic-neologd is not
good. I agree with you.

By the way, I made a script to convert SKKJISYO to kakasidict.
I think it may be useful for others as well.

http://www.namazu.org/gitweb/?p=dictconv.git;a=tree

The original kakasidict is also based on a very old SKKJISYO, but
SKKJISYO itself has been updated since then.
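
For readers unfamiliar with the two formats, here is a rough sketch of what such a conversion involves. It is not the dictconv script linked above. It assumes SKK-JISYO lines of the form "reading /candidate1/candidate2;annotation/" and a kakasidict layout of one "reading word" pair per line. The function name skk_to_kakasi is made up for illustration, and okuri-ari entries (readings ending in an ASCII letter) are simply skipped.

```python
def skk_to_kakasi(lines):
    """Yield 'reading word' strings from SKK-JISYO lines.

    Sketch only: handles plain okuri-nasi entries; candidates that are
    Lisp expressions or contain escaped slashes are not treated specially.
    """
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and ';;' comment/header lines.
        if not line or line.startswith(";"):
            continue
        # The reading and the candidate list are separated by " /".
        reading, sep, rest = line.partition(" /")
        if not sep or not reading:
            continue
        # Okuri-ari readings end with an ASCII letter (e.g. "おもu");
        # this sketch leaves them out.
        last = reading[-1]
        if last.isascii() and last.isalpha():
            continue
        for candidate in rest.split("/"):
            # Drop the optional ";annotation" part of a candidate.
            word = candidate.split(";", 1)[0]
            if word:
                yield f"{reading} {word}"


if __name__ == "__main__":
    sample = [
        ";; okuri-nasi entries.",
        "かんじ /漢字/感じ;feeling/",
        "おもu /思/",
    ]
    for entry in skk_to_kakasi(sample):
        print(entry)
```

Both dictionaries are traditionally distributed in EUC-JP, so a real conversion would probably open the input and output files with encoding="euc_jp"; that detail is left out of the sketch.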
- References:
- [tlug] Moving on from xterm
- From: Curt Sampson
- Re: [tlug] Moving on from xterm
- From: Furkan Mustafa
- [tlug] oneliners, Was: Moving on from xterm
- From: Christian Horn
- Re: [tlug] oneliners, Was: Moving on from xterm
- From: NOKUBI Takatsugu
- Re: [tlug] oneliners, Was: Moving on from xterm
- From: Jim Breen
- Re: [tlug] oneliners, Was: Moving on from xterm
- From: NOKUBI Takatsugu
- Re: [tlug] oneliners, Was: Moving on from xterm
- From: Jim Breen