In this pursuit, I just downloaded and ran a duplicate-file finder/cleaner program for the first time. It found 18,305 duplicate groups (each group is one set of identical files, whether a file is duplicated twice, three times, or more), 52,300 duplicated files, and 4.26 GB of duplicate data. Holy #^&%! These are TRUE duplicate files (compared byte-to-byte). The program has a "quick and dirty" button to instantly delete all but the first file in each duplicate group. That's certainly an easy solution, but I think it's also dangerous. The way I see it, I need to look at each of the files (think about what they do, why they're where they are, and whether I want to keep them there or put them somewhere else), and then decide if I want to get rid of one or more of the duplicates.
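(For anyone curious how these tools decide two files are "true duplicates": below is a minimal sketch in Python of the usual approach: group files by size, then by content hash, then confirm byte-to-byte. This is illustrative only, not the actual program's code, and the function name is made up.)

```python
import filecmp
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Return groups of byte-identical files under `root` (illustrative sketch)."""
    # Pass 1: only files with the same size can possibly be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                pass  # unreadable file: skip it

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        # Pass 2: narrow same-size files down by a content hash.
        by_hash = defaultdict(list)
        for path in paths:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 16), b""):
                    h.update(chunk)
            by_hash[h.hexdigest()].append(path)
        # Pass 3: confirm "true duplicates" with a byte-to-byte comparison.
        for candidates in by_hash.values():
            if len(candidates) > 1:
                confirmed = [candidates[0]] + [
                    p for p in candidates[1:]
                    if filecmp.cmp(candidates[0], p, shallow=False)
                ]
                if len(confirmed) > 1:
                    groups.append(confirmed)
    return groups
```

The three-pass funnel (size, then hash, then byte compare) is why these programs can scan tens of thousands of files quickly: the expensive byte-to-byte check only runs on files that already match on the cheap tests.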
This is complicated. Example: Let's say I have a folder of customized icons, and that folder happens to be duplicated in several places, and other files/folders use those icons, i.e., they get the icons via a path (link, shortcut, whatever you call it) to ONE of the duplicated icon source folders. Right? Right. So now what happens if I move or delete the wrong duplicate source folder? No custom icons is what happens, because the path is a dead end. I know this because I just did it. I deleted the source icon folder -> no icons. Restored it -> icons. Changed the path to the other duplicated icon source folder -> icons.
Wouldn't this principle apply across the board? If I delete one of two duplicated files, any path to it will dead-end, so I have to change that path to point at the remaining file. Some files are connected to hundreds of others. So, to delete dupes properly, it appears I need to check every single path to/from all 52,300 duplicated files/folders and reroute every last one if necessary?
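(One possible middle ground worth asking about, sketched below in Python: instead of deleting a duplicate, replace it with a hard link to the copy you keep, so every existing path still resolves to the same bytes and nothing dead-ends. This is an illustration of the idea, not a feature of the program above; the helper name is invented, hard links only work when both paths are on the same volume, and shortcuts that store an absolute path are a separate issue.)

```python
import os

def merge_as_hardlink(keep, dupe):
    """Replace the duplicate file `dupe` with a hard link to `keep`.

    After this, the path `dupe` still exists and still opens the same
    content, but it shares storage with `keep` instead of duplicating it.
    (Illustrative sketch; assumes `keep` and `dupe` are on the same volume.)
    """
    tmp = dupe + ".dedupe-tmp"
    os.link(keep, tmp)       # create a second directory entry for keep's data
    os.replace(tmp, dupe)    # atomically swap it in place of the duplicate
```

The point of the sketch is the principle: a hard link keeps the path alive, so nothing that referenced the duplicate breaks, while the wasted space goes away.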
So now that I've vented (I feel better already), how DO I get rid of duplicate files responsibly, i.e., without using the "quick and dirty" button, AND do it in a reasonable length of time: hours, not days or weeks? Is there some kind of tutorial or guru geek's web site explaining how to do this in a methodical, organized way? (I've looked and haven't found one.) And most important, whatever the answer is, it has to be understandable to an average-joe end-user like me.
And then... I could really use some way of preventing dupes in the future; not duplicating the problem, someone might say. I wouldn't say that, of course, but someone might. So if you have a handy-dandy method of preventing dupes, let me know that, too.
Thanks for any suggestions. They are greatly appreciated.
Edited by brillo, 18 August 2006 - 02:51 AM.