I have a gigantic deduplicating/reorganizing job ahead of me. I had no plan over the years, and I made backups of backups and then backups of that -- proliferating exponentially.
I'm using rmlint, since it seems to do the most with the least hardware. dupeGuru wasn't up to the job.
I've had to write a script that moves deeply nested folders up to the top level, so that I don't tax my software or hardware with extremely large and complex structures. This is taking a looooong time -- maybe twelve hours for a 50 GB folder.
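For context, here's a simplified Python sketch of what the flattening script does. Treat it as a sketch, not my exact script: the path is a placeholder, and the collision-renaming scheme (_1, _2, ...) is just one choice.

```python
import shutil
from pathlib import Path

SRC = Path("/Volumes/Backup/messy")  # placeholder: the folder to flatten

def unique_dest(dest: Path) -> Path:
    """Append _1, _2, ... to the name until it no longer collides."""
    if not dest.exists():
        return dest
    n = 1
    while True:
        candidate = dest.with_name(f"{dest.stem}_{n}{dest.suffix}")
        if not candidate.exists():
            return candidate
        n += 1

# Snapshot the file list first (deepest paths first), then move each file
# to the top level. Files already at the top level are left alone.
files = sorted((p for p in SRC.rglob("*") if p.is_file()),
               key=lambda p: len(p.parts), reverse=True)
for f in files:
    if f.parent != SRC:
        shutil.move(str(f), str(unique_dest(SRC / f.name)))

# Delete the now-empty directory skeleton, deepest first.
for d in sorted((p for p in SRC.rglob("*") if p.is_dir()),
                key=lambda p: len(p.parts), reverse=True):
    try:
        d.rmdir()  # rmdir only removes empty directories
    except OSError:
        pass  # not empty (e.g. hidden files like .DS_Store); leave it
```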
I'm also trying to sort the data by type and have rmlint dedupe one type of data at a time -- again, to prevent CPU bottlenecks or other forms of failure.
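The sorting pass boils down to something like the sketch below. The category table and paths are my own placeholders, and name collisions are deliberately left in place, since rmlint will judge true duplicates later anyway. I then point rmlint at one category folder at a time.

```python
import shutil
from pathlib import Path

SRC = Path("/Volumes/Backup/flattened")   # placeholder
DEST = Path("/Volumes/Backup/by_type")    # placeholder

# My own buckets -- nothing rmlint requires.
CATEGORIES = {
    "images": {".jpg", ".jpeg", ".png", ".gif", ".heic", ".tiff"},
    "video":  {".mov", ".mp4", ".avi", ".mkv"},
    "audio":  {".mp3", ".aac", ".flac", ".wav"},
    "docs":   {".pdf", ".doc", ".docx", ".txt", ".pages"},
}

def category_for(path: Path) -> str:
    ext = path.suffix.lower()
    for name, exts in CATEGORIES.items():
        if ext in exts:
            return name
    return "other"

for f in SRC.rglob("*"):
    if not f.is_file():
        continue
    target_dir = DEST / category_for(f)
    target_dir.mkdir(parents=True, exist_ok=True)
    dest = target_dir / f.name
    if dest.exists():
        continue  # leave collisions where they are; rmlint finds real dupes
    shutil.move(str(f), str(dest))
```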
I've also written scripts that clean file names and folder names.
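The name cleaner amounts to roughly this; the "allowed characters" set is purely my preference, and renames that would collide are skipped rather than overwritten.

```python
import re
from pathlib import Path

ROOT = Path("/Volumes/Backup/by_type")  # placeholder

def clean(name: str) -> str:
    name = name.strip()
    name = re.sub(r"[^\w.\- ]", "_", name)  # replace shell-hostile characters
    name = re.sub(r"\s+", " ", name)        # collapse runs of whitespace
    return name

# Rename deepest paths first, so children are handled before their parents
# and the snapshotted paths stay valid.
for path in sorted(ROOT.rglob("*"), key=lambda p: len(p.parts), reverse=True):
    new_name = clean(path.name)
    target = path.with_name(new_name)
    if new_name != path.name and not target.exists():
        path.rename(target)
```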
It's taking so long that I'm tempted to just run rmlint now and let it deal with the deeply nested folders, but I'm afraid it might choke on the data. I'm also considering rmlint's --merge-directories feature, but it sounds experimental and I don't fully understand it yet.
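For anyone curious, the invocation I have in mind is roughly the following. Hedging here: -D / --merge-directories is the flag as I understand it from the man page, -g / --progress just adds a progress bar, and the path is a placeholder. By default rmlint doesn't delete anything itself; it writes out an rmlint.sh script for you to review and run.

```
rmlint -g --merge-directories /Volumes/Backup/by_type/images
```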
Moral of the story -- keep current with your data organization, and have a good backup system.
I'm using a 2015 27-inch iMac running macOS Monterey, with a 4 GHz CPU and 32 GB of RAM.
Any pointers on how I can proceed?
Thanks.