It's A Digital Disease!

This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

3951
 
 
The original post: /r/datahoarder by /u/WesternWitchy52 on 2025-02-02 02:32:22.

I'm not sure if this is the right sub to ask, but I have an older PC that has been very good to me - 7 years old, custom built machine. It still works well; however, I ran out of space on my main drive and it's getting a bit old in terms of software and upgrades. I contacted the store that built it (Memory Express) and they suggested a new build for me, reasonably priced.

My question is -- what to do with the old computer? I've already backed up all the software and files I want to transfer over to the new one. And I have everything double backed up in a couple of places (I'm paranoid about losing personal files and projects after a Seagate crashed on me and I lost 10,000+ mp3 files).

Would it make sense to use the older pc for my creative projects like music production? The software is the culprit for taking up so much storage. I thought I'd use the older PC for just music stuff. And then the new PC for gaming and other art projects.

Thoughts? I mean, the computer still works (apart from some graphics issues) and I'm sad to have to upgrade, but I needed something more robust for art projects. The computer was a custom built machine from 2018.

TL;DR: I don't know what to do with my old PC that still works but has run out of storage and is too old to upgrade. I regret not upgrading it earlier.

3952
 
 
The original post: /r/datahoarder by /u/didyousayboop on 2025-02-02 01:59:53.

In recent months the Harvard Law School Library Innovation Lab has created a data vault to download, sign as authentic, and make available copies of public government data that is most valuable to researchers, scholars, civil society and the public at large across every field. To begin, we have collected major portions of the datasets tracked by data.gov, federal GitHub repositories, and PubMed.


As a first step, we have collected the metadata and primary contents for over 300,000 datasets available on data.gov.


In coming weeks we will share full data and metadata for our collection so far. We look forward to seeing how our archive will be used by scholarly researchers and the public.

https://lil.law.harvard.edu/blog/2025/01/30/preserving-public-u-s-federal-data/

3953
 
 
The original post: /r/datahoarder by /u/JollyPreparation747 on 2025-02-01 23:25:27.

I've got 218GB of crawled CDC website artifacts (including links to FDA and NIH artifacts), plus 60GB covering about 1200 datasets from data.cdc.gov. I also have lots of NIH PubMed data. Where is a useful place to put this? I checked with the EoT folks, but they just wanted nominated URLs because of provenance issues. Can I still upload it as a separate collection on archive.org anyway? Can anyone enlighten me?

3954
 
 
The original post: /r/datahoarder by /u/llJohnnyboy13ll on 2025-02-02 06:58:22.
3955
 
 
The original post: /r/datahoarder by /u/Pfacejones on 2025-02-02 05:43:26.

Hi, there is a fashion website and I want to save all the images on it. It looks like there are thousands. How would I go about doing that? I don't know anything about computers.
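
One possible starting point, sketched in Python using only the standard library: pull every image address out of a page's HTML so the list can be handed to a downloader. The names `ImageCollector` and `extract_image_urls` and the example.com addresses are made up for illustration. It assumes the images sit in ordinary `<img src=...>` tags, which many shops do not do (lazy loading, scripts), so treat it as a sketch and not a finished tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Turn relative addresses like /looks/1.jpg into full URLs.
            absolute = urljoin(self.base_url, src)
            if absolute not in self.urls:
                self.urls.append(absolute)


def extract_image_urls(html, base_url):
    """Return the de-duplicated image URLs found in one page of HTML."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.urls


if __name__ == "__main__":
    # Hypothetical page content; a real run would fetch the HTML first.
    page = '<img src="/looks/1.jpg"><img src="https://cdn.example.com/2.jpg">'
    for url in extract_image_urls(page, "https://shop.example.com/collection"):
        print(url)
```

Fetching the pages and saving the files is left out on purpose; check the site's terms before pulling thousands of images.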

3956
 
 
The original post: /r/datahoarder by /u/Mindless-Can5751 on 2025-02-02 04:02:52.

Are there any liberty loving organizations mirroring or serving this data?

3957
 
 
The original post: /r/datahoarder by /u/helpmehomeowner on 2025-02-02 02:36:32.

This relates to the recent scramble to preserve gov data before the current administration flushes it down the toilet. For the files being scraped, mirrored, or torrented, how do we know the content hasn't been altered from the original? Seems like a good place to poison the well. Are there HMACs, CRCs, SHA hashes, etc. to verify integrity or authenticity?
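
For the integrity half of the question, here is a minimal sketch of checking a mirror against a checksum list, assuming the publisher ships a manifest in the common `sha256sum` layout (one `<hex digest>  <relative path>` per line). The function names and the manifest name used in the example are hypothetical. Note that a matching hash only shows the copy equals whatever the manifest author hashed; authenticity additionally needs the manifest itself to come from, or be signed by, a source you trust.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path, root):
    """Return (name, status) pairs for every entry that is missing or altered."""
    problems = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with '*'
        target = Path(root) / name
        if not target.is_file():
            problems.append((name, "missing"))
        elif sha256_of(target) != expected.lower():
            problems.append((name, "mismatch"))
    return problems
```

An empty list from `verify_manifest("SHA256SUMS", "mirror/")` would mean every listed file is present and unchanged relative to that manifest.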

3958
 
 
The original post: /r/datahoarder by /u/spicy_placenta on 2025-02-02 02:31:27.

This could be a bit of a niche question, and may be a question for another subreddit. But I thought I would try here first, as the collection is substantially bigger than the typical music collector's, and you guys might have experience.

I have a pretty big music collection. Many, many terabytes. Many thousands of bands. I have been collecting for over 15 years. Initially, my folder structure was:

Music / Band Name / Year - Album Name / Songs

This was my preferred layout, as I could easily find the band I was looking for and it prevented duplicates. However, as the collection grew, loading the Music folder became slower and slower, taking 20 seconds or more.

I then re-did the layout to create smaller sub-folders based on genre. This resolved the load time issues, but it has quickly become convoluted.

Music / Genre / Sub Genre / Band Name / Compressed or Lossless / Year - Album / Songs

I tried to keep this as basic as possible. An example.

Music / Metal / Thrash Metal / Band Name...

But then I had to further break down sub genres, as some single genres still had many thousands of bands in the one folder. It wasn't as bad as before, but would still cause a considerable lag. Using Metal Archives and other sources as a guide, I split them up, so most parent sub genres now sit alongside their sub-sub genres: Technical Thrash, Progressive Thrash, Death Thrash, Black Thrash etc.

A byproduct of this layout is that many bands change genres over their career. I have sorted their placement based on the genre of their latest album. That can make it tricky, though: if I was a fan of their earlier work when they were one genre, but then they changed, that band ends up in a different subgenre or sub-subgenre, and I need to remember or go hunting. It also means I can end up with the same band duplicated in different genres if I think I am missing their albums. I have maintained this as best I can, and it is mostly clean, but it's almost a certainty I have duplicated whole discographies.
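
The suspected duplicate discographies could be hunted down with a small script. The sketch below assumes a band folder can be recognised by having a child folder literally named "Compressed" or "Lossless" (as in the layout above), and that a duplicated band keeps the same folder name apart from upper/lower case. Both are assumptions about this particular collection, and `find_duplicate_bands` is a made-up name.

```python
import os
from collections import defaultdict

# Assumed names of the per-band format folders described in the post.
FORMAT_FOLDERS = {"compressed", "lossless"}


def find_duplicate_bands(music_root):
    """Map each band name that appears in several places to all of its folders."""
    locations = defaultdict(list)
    for current, subdirs, _files in os.walk(music_root):
        if any(name.lower() in FORMAT_FOLDERS for name in subdirs):
            band = os.path.basename(current)
            locations[band.casefold()].append(current)
            subdirs.clear()  # a band folder was found, so skip its albums
    return {
        band: sorted(paths)
        for band, paths in locations.items()
        if len(paths) > 1
    }
```

Because it keys on the band folder and not on the genre depth, it does not matter whether a band sits under a genre, a sub genre, or a sub-sub genre.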

There are also some genres that are difficult to subdivide: broad, general genres such as Rock or Pop. I have made an effort to intelligently divide these up so artists avoid these massive genre folders, such as subdividing into Indie Rock, Alternative Rock, Post Grunge. But many bands and artists still fall into these generic genres.

I don't think the genre system is working, or maintainable long term.

Further to this, in each band folder, I have a Compressed and a Lossless folder: Compressed holds mp3, and Lossless holds FLAC. This means I can't use programs or tools like Lidarr to sort, as they insist on changing this folder structure. And with the obscurity of many bands, Lidarr often incorrectly recognises bands or albums and starts moving data.

I need a manual system to simplify this somehow. Does anyone have any ideas?

3959
 
 
The original post: /r/datahoarder by /u/Elkhose on 2025-02-01 22:29:14.

So I just shucked the 8TB Seagate Backup Plus. I know some of these might need a jumper or something to work on SATA inside the PC. Is this one of those, or can I use it directly as an internal drive?

3960
 
 
The original post: /r/datahoarder by /u/CaputHumerus on 2025-02-01 20:22:41.

I’ve got 5 HDDs (3x4TB, 1x6TB, 1x12TB) in a Yottamaster 5-bay enclosure. Drivepool makes them into one drive for Plex (Win11) purposes.

The Yottamaster is driving me crazy. Drives randomly unmount, which causes Drivepool to freak out, and I have to physically pull the drive out and reinsert it to get it to come back. By the time I do that, Plex has updated to remove the missing files and now thinks everything on that remounted drive is a new addition. It’s infuriating.

I now know there are issues with port multipliers and the controller in Yottamaster boxes doing this—I should’ve dug deeper before I bought it. But I haven’t seen another brand that doesn’t have the same issue (Mediasonic boxes did it to me too).

I’m looking for suggestions for different setups. Here are my constraints:

  • I’m not going Linux. The Plex computer is also my primary gaming rig.
  • I would do NAS if folks think that’ll be more stable
  • My PC tower (a Y60) doesn’t have 5 free bays for me to move the drives internal
  • I considered a PCI SATA expander to add some SATA ports to my motherboard, but (1) still not enough bays to put the drives internally and (2) I think the PCI ports are blocked because of how the GPU is mounted (Y60 has a riser that flips the GPU onto its side). Probably surmountable if this is 100% the answer.
  • I considered a PCI->SATA card with long enough SATA cables to run to an external HDD array, but that seems… a little inelegant.

Ideally, someone knows of a USB-C HDD enclosure that won’t be plagued by this issue, but folks seem pretty down on these devices, so I’m at a loss.

3961
 
 
The original post: /r/datahoarder by /u/Selfhostert on 2025-02-01 19:35:56.

Would you buy a used 10TB Ironwolf disk with 27,500 hours on it when it gives a "Reallocated sectors" warning in Synology?

Would you pay like 15 bucks for it as a throwaway drive that holds no important data? Or would you pass on it?

Thanks in advance!

3962
 
 
The original post: /r/datahoarder by /u/modernDayKing on 2025-02-01 18:22:41.

So I'm finally on meds and organizing my life. 40 years of video tapes, optical disks, hard drives, and more. Luckily I have a giant bucket to pour them all in.

anyway...

As I've been a scattered mess with a "hoard first, sort it out later" mentality, the time has come to pay that piper.

I am wondering if there is a clear favorite go-to for finding and managing duplicate files, irrespective of OS, i.e., not the best for Windows or the best for Mac, but the best of the best. If such a thing exists.

I guess while we are here and I am trauma dumping, if you have any cool utils for scanning hard drive contents and/or tips in general for folding it all up, please do share.
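
For what it's worth, the usual OS-independent technique behind duplicate finders fits in a few lines of Python: group files by size first (cheap), then hash only the files whose size collides. This is a minimal sketch of that idea, not a recommendation of any particular tool, and the function names are invented for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_files(root):
    """Return groups (lists of paths) of files with identical content."""
    by_size = defaultdict(list)
    for current, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(current, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

It only reports groups; deciding which copy to keep is deliberately left to a human.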

Thanks in advance.

3963
 
 
The original post: /r/datahoarder by /u/SomeComparison on 2025-02-01 17:37:15.

For a while now I have been set up to grab in-the-clear (unencrypted) backhaul news feeds. I usually watch what is interesting and then let it record over as it runs out of space.

In the current environment I feel like it may be beneficial to save these long term. These are the feeds the news stations draw on for their newscasts: they are picked apart and used in various portions of the broadcast. This will give you a good idea of the content that I'm referring to. A lot of this is saved, but it is behind a paywall and not really accessible.

I don't have a ton of time to dedicate to this. So I was curious if others thought this was a worthwhile endeavor. And if anyone here is also doing something similar I would be happy to contribute.

Most of this content is 1080i/p at various bit rates depending on the feed. There is little to no 4K, and I'm also not set up to receive that anyway.

3964
 
 
The original post: /r/datahoarder by /u/toyguy2k on 2025-02-01 17:32:35.

New here, and hopefully asking the right people! My goal is to rip 3D Blu-rays into 3D SBS MP4s for viewing on a Quest 3S headset. I bought EaseFab LosslessCopy based on reading their online tutorials. It seemed very user friendly and capable of accomplishing what I wanted. Unfortunately, the software I downloaded (latest version) doesn’t quite reflect what they illustrate in their tutorials: there are fewer output choices, though I did find a preset for SBS MP4. The output file LOOKS like a proper SBS file, but when viewed it’s simply not in 3D. There is a depth setting in the software, from 0-100, and I’ve tried multiple settings. No 3D effect. I’m guessing my queries for support are going unanswered due to Chinese New Year, and was wondering if anyone here is familiar with this software. Thanks!

3965
 
 
The original post: /r/datahoarder by /u/Polyscript on 2025-02-01 17:08:10.

Hi Guys,

I need help choosing the right NAS to buy. I have been watching videos and reading other reddit posts but have no idea what I'm doing.

I live in the UK and am looking for a NAS with 4 bays. I will be storing personal stuff on it and some business files. The main reason for the NAS is media: movies that I would like to be able to stream wherever I am, though mostly I will be streaming to my Samsung TV.

I don't know which one to buy, I know I want 4 bays and need video transcoding and would like a fairly quick transfer speed.

I've seen various Synology, QNAP and Terramaster units, but every time I find one I like and do more research, people give me a reason not to buy one or the other lol

Any advice would be appreciated. I check, get confused, go away, and come back after a few weeks to start searching again.

Thank you in advance.

3966
 
 
The original post: /r/datahoarder by /u/andreifasola on 2025-02-01 16:18:39.

I have one Dockcase PLP 10 sec unit (nr 2) stuck on "detecting SSD", even after swapping in a good SSD from unit 1. I also tested its own SSD (nr 2) in the other Dockcase unit (nr 1) and it works well, so I ruled out SSD failure.

What happened: I wasn't able to delete some files on Dockcase 2. It was taking a long time, something was bugging. I manually unplugged the cable, and upon reconnecting the files were corrupted. I ejected and reconnected. Windows found it, but slowly. I formatted it and then started copying files to it (I was aiming to back something up). I let it copy and went away. When I returned to the computer I could not find the drive; gone just like that. I could see the Dockcase plugged in, but the drive was gone in Explorer. I ejected it, and ever since it has been stuck on "detecting SSD" every time I plug it in, regardless of the SSD I put in.

Is it fried? Anybody got a clue?

3967
 
 
The original post: /r/datahoarder by /u/theswedishguy94 on 2025-02-01 16:14:54.

Hey fellow data hoarders,

I've got a situation with my archive (18TB drive, ~10TB used, thousands of files) that I need advice on. I want to maintain an exact backup on another 18TB drive, but here's the catch:

For safety reasons, I don't want both drives connected at the same time. Ever.

The main challenge is: When I reorganize folders or rename stuff on my main drive, how can I efficiently sync these changes to the backup drive later? I need a solution that can:

  • Track structural changes on main drive while backup is disconnected

  • Safely mirror everything when I connect the backup

  • Handle thousands of files

  • Be reliable with large datasets (10TB+)

  • Ideally have a GUI (not comfortable with command line)

I've looked at FreeFileSync but not sure if it's the best solution for this specific need. What do you recommend?
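
One way the "track structural changes while the backup is disconnected" part is sometimes handled is with a saved listing of the main drive: take a snapshot right after each backup, take another before the next one, and diff the two to find files that only moved or were renamed. The sketch below illustrates that idea under stated assumptions: a file counts as "the same" if its size and modification time match, which is cheap but can be ambiguous, so ambiguous matches are skipped and left to a normal copy. All names here (`take_snapshot`, `plan_moves`, the JSON file) are hypothetical, and it is command-line Python, not the GUI asked for.

```python
import json
import os


def take_snapshot(root):
    """Map each relative file path to a cheap [size, mtime] fingerprint."""
    snapshot = {}
    for current, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(current, name)
            info = os.stat(full)
            relative = os.path.relpath(full, root).replace(os.sep, "/")
            snapshot[relative] = [info.st_size, int(info.st_mtime)]
    return snapshot


def save_snapshot(snapshot, json_path):
    """Persist a snapshot so it survives until the next backup session."""
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle)


def load_snapshot(json_path):
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


def plan_moves(old, new):
    """Pair vanished paths with new paths when the fingerprint match is unique."""
    vanished = {}
    for path in old.keys() - new.keys():
        vanished.setdefault(tuple(old[path]), []).append(path)
    appeared = {}
    for path in new.keys() - old.keys():
        appeared.setdefault(tuple(new[path]), []).append(path)

    moves = []
    for fingerprint, sources in vanished.items():
        targets = appeared.get(fingerprint, [])
        if len(sources) == 1 and len(targets) == 1:
            moves.append((sources[0], targets[0]))
    return sorted(moves)
```

The resulting `(old_path, new_path)` list can be replayed as renames on the backup drive before the regular sync runs, so 10TB of unchanged data is not copied again just because a folder was reorganized.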

3968
 
 
The original post: /r/datahoarder by /u/SurjitShow on 2025-02-01 15:00:27.

I'm not sure where to begin with M-disc. Is there a good guide on using M-disc? A video would be good.

Few questions I have:

I would like to use the 25GB because I have read single layer is safer. Are all the 25GB M Discs the same?

Where should I order in the UK to get a legitimate Disc?

Before I burn the files, should I put them in any kind of order? I read you need to verify the data, but I'm not sure what that means.
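
On the "verify" point: it generally means reading the burned disc back and confirming every file matches the original. Burning programs often offer this as a checkbox, but the idea can also be sketched by hand. The snippet below assumes the source folder and the mounted disc share the same folder layout; `verify_copy` is an invented name.

```python
import filecmp
import os


def verify_copy(source_dir, disc_dir):
    """Return relative paths that are missing on the disc or differ in content."""
    bad = []
    for current, _dirs, files in os.walk(source_dir):
        for name in files:
            original = os.path.join(current, name)
            relative = os.path.relpath(original, source_dir)
            burned = os.path.join(disc_dir, relative)
            # shallow=False forces a byte-for-byte comparison of the contents.
            if not os.path.isfile(burned) or not filecmp.cmp(original, burned, shallow=False):
                bad.append(relative)
    return sorted(bad)
```

An empty result means every source file was read back from the disc intact.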

What software do I need to burn the discs?

Thanks for any advice.

3969
 
 
The original post: /r/datahoarder by /u/TW-Twisti on 2025-02-01 13:09:06.

I've been using Storage Spaces and while I have no major complaints, I always felt unsettled by how little it seems to be used and by how rare resources seem to be when anything goes wrong. I would feel considerably more at ease with a Linux based solution, which admittedly I might only prefer due to more familiarity.

That being said, I am also looking for a feature Storage Spaces doesn't support, namely the MergerFS behaviour of simply putting whole files on single disks. My use case is primarily media files which I can get back with some effort, so I don't want to waste the space on data mirroring, but I also wouldn't want one out of five drives failing to mean a 100% loss rate due to bits of files being spread across all of them.

I was thinking I could detach the drives from Windows management and hand them over to WSL2, where I could then set up MergerFS/ZFS/whatever to my liking, and access the file systems via \\wsl$ or whatever that notation was. Would that be reasonable (for media files, so performance is less of a concern)? Has anyone tried anything like that?

3970
 
 
The original post: /r/datahoarder by /u/awildwildlife on 2025-02-01 08:39:48.

First of all, it sounds like a lot of you are doing really important work in light of what's going on.

This is a really simple task for you guys, but it's currently way beyond my skill set.

I ran a website that used a template from a company called Zenfolio. I still have it, but just want to download all of the blog entries, ideally with pics.

I haven't been sure who to ask until I saw this sub mentioned a lot today. An ELI10 format would be very much appreciated.

3971
 
 
The original post: /r/datahoarder by /u/Venkat97 on 2025-02-01 23:55:40.

Looking for leads on anyone who might have already archived the USAID site or subsites before it went down. Thanks!

3972
 
 
The original post: /r/datahoarder by /u/PlasticPluto on 2025-02-01 23:45:17.
3973
 
 
The original post: /r/datahoarder by /u/alchemist1e9 on 2025-02-01 23:16:04.

Original Title: OWC Archive Pro: LTO-9 Thunderbolt Tape Drive; “Ruggedly small with a built-in handle, the Archive Pro is able to go on-set or move among studio, department, or office computers for a shared data protection solution.”


Does anyone have one of these? And ideally used it from a Linux host. I need an old school sneaker/postal/car solution which is relative based on LTO-9 tapes but in a rugged portable enclosure.

What do you 12! r/LTO redditors think?

3974
 
 
The original post: /r/datahoarder by /u/flaminglasrswrd on 2025-02-01 21:27:00.
3975
 
 
The original post: /r/datahoarder by /u/7dayintern on 2025-02-01 21:12:27.