The way I have been hoarding data is to buy a few SSDs at a time and slowly fill each one to around 60% (some more, some less; I buy new drives once or twice a year) while copying the data for backups. The obvious problem with this method is that none of the drives is dedicated to any one type or topic of data. Now that I have reached a decent amount of storage, I am trying to rectify this sorry state of affairs, and I am finding it somewhat tedious.
I could bang out a script myself in a few days, but I was hoping there was already some software or a script that would perform the following tasks:
- Scan attached drives for directory and file hierarchy
- Add each file to a database with a record similar to
  { "drive_UUID": "", "filename": filename, "duplicate_exists": boolean, "duplicate_drive_UUID": [drive_UUID, drive_UUID, ...etc...] }
- Save to YAML with direct code serialization and/or JSON and/or some sort of SQL database (preferably sqlite3)
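
To make the request concrete, here is a rough sketch of the kind of thing I mean, in Python with only the standard library. It is an illustration under my own assumptions, not a finished tool: the function names (`open_catalog`, `scan_drive`, `find_duplicates`), the table layout, and the use of SHA-256 content hashes to detect duplicates are all placeholders I made up, and the drive UUID is simply passed in by hand rather than read from the system.

```python
import hashlib
import os
import sqlite3


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def open_catalog(db_path):
    """Open (or create) the catalog database. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            drive_uuid TEXT NOT NULL,
            rel_path   TEXT NOT NULL,
            filename   TEXT NOT NULL,
            size       INTEGER NOT NULL,
            sha256     TEXT NOT NULL,
            PRIMARY KEY (drive_uuid, rel_path)
        )
        """
    )
    return conn


def scan_drive(conn, drive_uuid, mount_point):
    """Walk one mounted drive and record every regular file in the catalog."""
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            try:
                size = os.path.getsize(full)
                digest = file_digest(full)
            except OSError:
                continue  # unreadable file: skip it in this sketch
            rel_path = os.path.relpath(full, mount_point)
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?, ?)",
                (drive_uuid, rel_path, name, size, digest),
            )
    conn.commit()


def find_duplicates(conn):
    """Map each repeated content hash to its (drive_uuid, rel_path) locations."""
    query = """
        SELECT sha256, drive_uuid, rel_path FROM files
        WHERE sha256 IN (
            SELECT sha256 FROM files GROUP BY sha256 HAVING COUNT(*) > 1
        )
        ORDER BY sha256, drive_uuid, rel_path
    """
    duplicates = {}
    for digest, drive_uuid, rel_path in conn.execute(query):
        duplicates.setdefault(digest, []).append((drive_uuid, rel_path))
    return duplicates
```

The idea would be to call `scan_drive` once per attached drive against the same catalog file and then read `find_duplicates` to see which drives hold copies of the same content. Hashing every file is slow on large drives, so a real tool would probably compare sizes first and only hash the candidates.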
Thank you for any information you can provide to help with this. I know I should have done it differently, but I focused on duplicates more than on acquiring storage space quickly (cost prohibitive).