
Mailing List Archive
Re: [tlug] searching for kanji strings, ignore punctuation and endof lines
- Date: Mon, 16 Jan 2006 17:00:48 +0900
- From: Ramil Sagum <ramil@example.com>
- Subject: Re: [tlug] searching for kanji strings, ignore punctuation and endof lines
- References: <43CB4F48.1060200@example.com>
- User-agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)
David Riggs wrote:
> If I could take a two line unit spat out by grep -A2, then process it
> as a separate set, I could do it rather easily. Strip out stuff after
> the match for the first kanji: newline, punctuation, and line numbers.
> Then if there is a match print out the working data area.
How about making a second copy of the text with the punctuation stripped
(preserving the line count) and then searching for the phrase there?
It's a bit of a kludge, but if disk space isn't a problem, this is an easy
way. Since you have to do this a lot, the processed copy might even give you
the speed boost you need. (I'm assuming your haystack won't change much. Will
it always be the CBETA canon?)
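
Something like the following is what I have in mind. It's only a rough sketch in
Python, not tested against the real data: the function names are made up for
illustration, and I'm assuming that "punctuation" means any character in the
Unicode punctuation or separator categories. Each line is stripped but kept as
its own entry, so line N of the copy still corresponds to line N of the
original, and a phrase that runs over a line break is still found.

```python
import bisect
import unicodedata


def strip_punctuation(line):
    """Drop punctuation (Unicode P*) and separators/spaces (Unicode Z*)."""
    return "".join(
        ch for ch in line
        if unicodedata.category(ch)[0] not in ("P", "Z")
    )


def make_stripped_copy(lines):
    """Return one stripped line per input line, so the line count is preserved."""
    return [strip_punctuation(line.rstrip("\n")) for line in lines]


def find_phrase(stripped_lines, phrase):
    """Return the 1-based line numbers where each match of phrase begins.

    The stripped lines are joined without newlines, so a phrase that
    continues onto the next line is matched as well.
    """
    phrase = strip_punctuation(phrase)
    if not phrase:
        return []

    # Offset of the first character of each line in the joined text.
    starts = []
    pos = 0
    for line in stripped_lines:
        starts.append(pos)
        pos += len(line)

    text = "".join(stripped_lines)
    hits = []
    i = text.find(phrase)
    while i != -1:
        # bisect_right gives the 1-based number of the line containing offset i.
        hits.append(bisect.bisect_right(starts, i))
        i = text.find(phrase, i + 1)
    return hits
```

To use it on a file, you would read the lines once, write the result of
make_stripped_copy() out as the second copy, and run find_phrase() against that
copy from then on. The line numbers it returns can be looked up in the original
text to print the unstripped context.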
-moogs