It's A Digital Disease!


Help Organizing and Deduplicating a Large Volume of Recovered Data (zerobytes.monster)

submitted 1 year ago by bOt@zerobytes.monster to c/datahoarder@zerobytes.monster


The original post: /r/datahoarder by /u/silvermir on 2025-01-04 00:39:13.

Hello everyone,

I recently lost a significant amount of data across multiple HDDs to hard drive failures, a virus, and an accidentally deleted partition. Using tools like Recuva and UFS Explorer, I managed to recover a substantial amount of it. However, the recovered files are now badly disorganized: a mix of file types with inconsistent names, in varying states of repair, alignment, and compression.

For example, I have photos in multiple versions:

- Thumbnails and full-size images
- With and without EXIF data
- Same photo in black and white, and in color
- Different orientations and alignments

My goal is to efficiently organize and deduplicate this data. I'm looking for the fastest and most effective methods or tools to help structure these complex datasets and remove duplicates. Specifically, I'm interested in:

- Recommended software for organizing large, mixed data sets
- Best practices for handling multiple versions of the same files (e.g., photos with different metadata or formats)
- Scripts or automation tools that can streamline the deduplication and organization process
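
To make the scripting part concrete, here is a minimal sketch of the kind of thing meant, covering only the simplest case: byte-for-byte identical copies. It assumes Python 3 with the standard library only, and the function names (`file_sha256`, `find_exact_duplicates`) are made up for illustration. It would not catch the photo variants listed above (thumbnails, EXIF-stripped copies, black-and-white versions, rotations), since those differ at the byte level and would need some form of visual similarity comparison instead.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return groups of paths under root whose contents are identical."""
    # Pass 1: bucket by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[(size, file_sha256(path))].append(path)

    # Keep only groups with more than one member.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("/path/to/recovered")` (a placeholder path) returns a list of groups, each a sorted list of paths with identical content. The sketch only reports duplicates and never deletes anything, which seems the safer default for recovered data.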

Has anyone tackled a similar situation? Any strategies, tool recommendations, or tips would be greatly appreciated!

Thanks in advance for your help!
