tlug Mailing List Archive
[tlug] Seeking recommendations for file consolidation
- Date: Thu, 17 Aug 2006 20:42:31 +0900
- From: <stephen@example.com>
- Subject: [tlug] Seeking recommendations for file consolidation
- References: <44E44C72.4050509@example.com>
Dave M G writes:

> There are so many of them now, though, that I don't need them all,
> and I know most of the files will be duplicates anyway.  Not to
> mention a lot of junk that simply isn't needed at all anymore.

You come up with the neatest problems!

Here's a cute hack: put the whole schmeer into a git repository.  That
will automatically (1) compress the files and (2) content-index them.
Two files which are byte-for-byte identical will become aliases for the
same object in the git database.  If you have lots of duplicate files,
this also speeds up a tree diff immensely.

Unfortunately, git is designed for *comparing a sequence of related
trees*, not for *identifying duplicates*.  However, for sufficient
quantities of pizza (in advance, hold the mayo) and beer (completion
bonus) I could probably be convinced to hack up a script to do the
identification of duplicates for you.

Since git's database is designed for tracking tree changes, you can
move stuff around to your heart's content without confusing it, too.

> Now that my current computer has many gigabytes of free space, I'm
> copying the contents of all the CD-ROMs to a directory on my hard
> drive.  Each CD-ROM's contents goes into its own sub-directory to
> prevent accidental over-writing.

A possible alternative for a series of disks that probably cover
basically the same material, with mostly identical files from
generation to generation, would be to copy, git commit, copy, git
commit, etc.  However, it's unlikely to do what you want unless you
were extremely systematic and consistent about your backup policy, and
it certainly won't catch file renames.

> Once all the data is in one place, I hoped to find a way I can weed
> out duplicates and be left with one set of just the most recent
> versions of unique files.

"Most recent."  Hm, that may take another iteration of pizza and beer
... nah, that's pretty simple, too.  I think.  :-)

> I also downloaded and ran Kompare.  It says on their web site that it
> can recursively compare subdirectories.  But I can't find any such
> feature in the interface.

`diff -rq dir-1 dir-2' will compare two directories, recursing into
subdirectories, announcing only which files with the same names differ.
Kompare probably will do the same thing.  However, the comment above
about being extremely systematic and consistent in your backups applies
here, too.  And diff -rq will be *very* slow; I bet Kompare is too.
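As a rough, untested sketch of the duplicate-identification pass
described above (it assumes GNU find, sha1sum, sort, and uniq are
available, and `cdroms/' is just a placeholder for wherever the CD-ROM
copies were dumped):

    # hash every file under the consolidated tree, then sort by hash
    find cdroms/ -type f -exec sha1sum {} + | sort > hashes.txt

    # keep only the groups whose first 40 characters (the SHA-1) repeat,
    # i.e. files with byte-for-byte identical content
    uniq -w 40 --all-repeated=separate hashes.txt > duplicates.txt

Each blank-line-separated group in duplicates.txt is one set of
identical files; keep whichever copy you like and delete the rest.  The
same content-addressing idea is why identical files collapse into a
single object in git's database, which you can see with `git
hash-object' (the file names below are made up):

    $ git hash-object disk1/report.txt
    <40-hex blob id>
    $ git hash-object disk2/old/report.txt    # byte-identical copy
    <the same 40-hex blob id>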