Re: [tlug] Limits on file numbers in sort -m
- Date: Fri, 30 May 2014 11:14:06 +1000
- From: Jim Breen <jimbreen@example.com>
- Subject: Re: [tlug] Limits on file numbers in sort -m
- References: <CABHGxq7jYkDDLkF8uzzNK8WeU+37t1wgpVhk6VD2HQKyEi7wBw@mail.gmail.com> <CAJMSLH618MfmhL9ufAOfLXxw52i4STpF8dsc_+xe-2GRB3JM8g@mail.gmail.com> <87bnui8sky.fsf@uwakimon.sk.tsukuba.ac.jp> <CABHGxq4NEBMVR8jndiEvcgsGkc_B0f-qcrs2sFjqaAdWH3n9sw@mail.gmail.com> <CAJMSLH6SdSUmvHsjmZBZP-g1graNuPV51vdwLzpPf7ipmz7+zA@mail.gmail.com> <CABHGxq7eCk9Pk1JtNrZuqK_8yv4bt7ftoWwyXqf5P+GKYQH=5w@mail.gmail.com> <87sins7mhy.fsf@uwakimon.sk.tsukuba.ac.jp> <CAJA1Y2b6XyFNsFhDbK+ktgWk0cE5Lzfv9OrhimBH8RyN78yzLQ@mail.gmail.com> <87d2ew76yd.fsf@uwakimon.sk.tsukuba.ac.jp> <CAJA1Y2Y2vaH06nJyt25uREjCT9RELoTnfwDpeXX5Z97W45oZUQ@mail.gmail.com> <5387D422.2070302@extellisys.com>
On 30 May 2014 10:43, Travis Cardwell <travis.cardwell@example.com> wrote:
> The `sort -m` command does not sum counts, which is why Jim said that he
> will need to use external software to do so.

Exactly, and since I'm aggregating counts, I can't use "uniq -c".

I'm building an n-gram corpus from a large text corpus. So as I work
through the text, I'm collecting things like the 4-gram:

これ は 何 です

As I'm merging and counting I'll have interim files such as:

file-n: これ は 何 です 19
file-m: これ は 何 です 27

leading ultimately to:

file-x: これ は 何 です 46

"sort -m" is the thing to use for merging the presorted initial and
intermediate files, but I still need my own utility to aggregate them,
because it can handle the interim counts.

Cheers

Jim

--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
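For illustration only, the aggregation step described above might look roughly like the sketch below. It is not Jim's actual utility. It assumes each line holds a space-separated n-gram followed by its count as the last field (the "file-n:" labels above are taken to be file names, not file content), and that the input has already been merged so that identical n-grams sit on adjacent lines. The names `parse` and `aggregate` are hypothetical.

```python
from itertools import groupby


def parse(line):
    """Split 'w1 w2 ... wn count' into (ngram, count).

    Assumption: the count is the last space-separated field.
    """
    ngram, count = line.rstrip("\n").rsplit(" ", 1)
    return ngram, int(count)


def aggregate(lines):
    """Sum the counts of adjacent lines that share the same n-gram.

    Only adjacency matters here, so any input in which equal n-grams
    are grouped together (for example, merged presorted files) works.
    Blank lines are skipped.
    """
    pairs = (parse(line) for line in lines if line.strip())
    for ngram, group in groupby(pairs, key=lambda pair: pair[0]):
        yield ngram, sum(count for _, count in group)
```

Called as `aggregate(sys.stdin)` from a small script, something like this could sit at the end of a pipeline fed by `sort -m`, printing one `ngram total` line per distinct n-gram. Because it streams through the input and holds only one group at a time, memory use should stay small even for large interim files.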