It's A Digital Disease!


This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

founded 2 years ago
2201
 
 
The original post: /r/datahoarder by /u/doyoueventdrift on 2025-04-08 07:00:38.

Hi, so I'm finally getting there in terms of a proper 3-2-1 backup solution.

Setup:

  • DS124 - active dataset
  • External USB harddisk - for weekly backups, always connected
  • External USB harddisk dock - for quarterly/half-year/yearly backups
    • Quarterly - onsite air-gap backup
    • half-year/yearly - offsite air-gap backup

I still need to store my critical data in a cloud somewhere, but I haven't gotten to that yet.

That means I'll have two hard disks in rotation using the HDD dock.

I've seen many storage enclosures, from just a plastic case to crush resistant to EMP resistant.

What do you store air-gap backups in?

2202
 
 
The original post: /r/datahoarder by /u/piefanart on 2025-04-08 06:09:49.

My father-in-law was a computer programmer at the dawn of the internet for a few large companies. We have a lot of random old computers and hard drives in our possession. I don't know exactly what is on them. I know some of it has to do with the groundwork for hospital programs from the 70s and 80s. One of the hard drives has a receipt showing it cost around $5000 in the 80s. It is huge.

This is all being stored on my enclosed back porch and in my shed, neither of which is fully protected from the elements. My partner, who technically owns the house, doesn't seem concerned with this rotting away because he thinks it is obsolete or not worth preserving. But he can't get rid of it. He has actual hoarding tendencies, where he keeps everything but doesn't do anything to keep it safe: piles and piles of broken computers, some 50+ years old, etc.

What concerns me the most is the reels of actual paper code, the type where it's spools of thin paper with holes punched in it. My father-in-law made these in the 70s.

I don't know what this code is, but I want to digitize it. I don't think we still have the computers that read it, as most of his stuff from that era was owned by the companies he worked for; my partner recalls he would go to an office to work on it. The reels offer no help, only stating his name and sometimes the year. I can go take some photos tomorrow.

This is in Salt Lake City, Utah.

If anyone has help on how to archive this, please let me know.

2203
 
 
The original post: /r/datahoarder by /u/Boombapdoobs on 2025-04-08 05:46:35.

Hello Data Hoarders!

I have a 1 TB external hard drive that is almost entirely full of data (pictures, videos, audio). However, this hard drive is encrypted on Mac and won't work with Windows.

I am exiting the Apple ecosystem and getting an Android phone and a Windows laptop. How can I duplicate the 1 TB HDD and have it accessible on Windows?

Apple's closed ecosystem has really stressed me out.

2204
 
 
The original post: /r/datahoarder by /u/magikarpower on 2025-04-08 05:30:52.

https://web.archive.org/web/20100102033631/http://www.vbox7.com:80/play:f853e171

This is a lost Soulja Boy song. I was hoping to hear it, and I think the video may have been successfully saved, but I have no clue how to extract videos from the Wayback Machine that aren't YouTube videos. Any ideas?
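
One possible starting point, as a sketch rather than a confirmed fix: the Wayback Machine exposes a CDX index that lists captures, and inserting `id_` after the timestamp in a capture URL returns the archived bytes without the Wayback toolbar. The helper names below are hypothetical, and the sketch assumes the video file was captured at all; the media file usually lives at a different URL than the player page, so it may have to be found in the archived page source first.

```python
from typing import Optional
from urllib.parse import urlencode

# Public CDX index endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(url_prefix: str, mime_regex: Optional[str] = None, limit: int = 50) -> str:
    """Build a CDX query that lists captures whose URL starts with url_prefix."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",
        "output": "json",
        "limit": str(limit),
    }
    if mime_regex is not None:
        # Optional filter on the recorded MIME type, e.g. "video/.*".
        params["filter"] = f"mimetype:{mime_regex}"
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def raw_capture_url(timestamp: str, original_url: str) -> str:
    """Build a URL for the unmodified archived bytes (the id_ modifier)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original_url}"


# Values taken from the link in the post.
query = cdx_query_url("vbox7.com/play:f853e171")
raw_page = raw_capture_url("20100102033631", "http://www.vbox7.com:80/play:f853e171")
print(query)
print(raw_page)
```

Opening the first printed URL in a browser shows which captures exist; the second returns the archived player page as it was saved, which can then be searched for the actual media URL.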

2205
 
 
The original post: /r/datahoarder by /u/moliteashirt on 2025-04-08 03:05:01.

I like the idea of having everything stored in one place, but is a NAS complicated to maintain? I’ve seen posts about remote access, automatic backups, AI sorting, etc., but how smooth is it really once you’re using it day-to-day? Not looking for a super techy solution, just something that works and doesn’t break all the time. Honest pros & cons would be helpful. Ty.

2206
 
 
The original post: /r/datahoarder by /u/CilicianCrusader on 2025-04-08 02:07:42.

Hello, does anyone have experience ordering the 1TB SanDisk microSD cards from Amazon? The Extreme is going for $90, but I'm wondering if people have seen it at a better price recently. I know a few years ago 1TB cards were in the hundreds, but I'm not sure if $90 is the floor. Also, does anyone know the differences between Ultra and Extreme?

2207
 
 
The original post: /r/datahoarder by /u/Eastern-Bluejay-8912 on 2025-04-08 01:58:14.

Hey, so I am planning on building a home movie server/streaming service for myself on a Ugreen or another NAS. And I was wondering a few things:

1. If I were to get a setup, should I run two 2.5" SSDs as my main storage and have two standard 3.5" drives as the backups? Or vice versa? With a write-once, read-1000+-times workload, will this wear the SSDs down to the point where I might as well just use standard 3.5" drives? Or would I be fine with SSDs as the main storage, since reading causes so much less wear on the drive than writing?

2. For a movie server, would it matter if they are standard drives or NAS drives? I’m looking at WD Black, WD Red, and WD Blue and can’t decide which is best. 🤔

Edit: Yes, I did search the site beforehand but haven’t found a definitive answer on either the drive type or the WD drive lines.

Edit 2: • Software I’m going to run: Jellyfin or Plex • Hardware: might be an Intel PC or a Ugreen/Synology/TerraMaster NAS with sliding drive bays • Storage: thinking at least 2 main drives and 2 for redundancy/backup, each at least 8TB, as described in my questions above.

2208
 
 
The original post: /r/datahoarder by /u/BostonDrivingIsWorse on 2025-04-08 01:49:42.

name: zimit
services:
    zimit:
        volumes:
            - ${OUTPUT}:/output
        shm_size: 1gb
        image: ghcr.io/openzim/zimit
        # --depth is the number of hops. -1 (infinite) is default.
        command: zimit --seeds ${URL} --name ${FILENAME} --depth ${DEPTH}

#The image accepts the following parameters, as well as any of the Browsertrix crawler and warc2zim ones:
#    Required: --seeds URL - the url to start crawling from ; multiple URLs can be separated by a comma (even if usually not needed, these are just the seeds of the crawl) ; first seed URL is used as ZIM homepage
#    Required: --name - Name of ZIM file
#    --output - output directory (defaults to /output)
#    --pageLimit U - Limit capture to at most U URLs
#    --scopeExcludeRx <regex> - skip URLs that match the regex from crawling. Can be specified multiple times. An example is --scopeExcludeRx="(\?q=|signup-landing\?|\?cid=)", where URLs that contain either ?q= or signup-landing? or ?cid= will be excluded.
#    --workers N - number of crawl workers to be run in parallel
#    --waitUntil - Puppeteer setting for how long to wait for page load. See page.goto waitUntil options. The default is load, but for static sites, --waitUntil domcontentloaded may be used to speed up the crawl (to avoid waiting for ads to load for example).
#    --keep - in case of failure, WARC files and other temporary files (which are stored as a subfolder of output directory) are always kept, otherwise they are automatically deleted. Use this flag to always keep WARC files, even in case of success.

For the four variables, you can add them individually in Portainer (like I did), use a .env file, or replace ${OUTPUT}, ${URL}, ${FILENAME}, and ${DEPTH} directly.
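
For the .env route, Docker Compose reads `KEY=value` lines from a `.env` file next to the compose file and substitutes them into `${...}` placeholders. A minimal sketch of generating that file follows; the `render_env` helper and every example value are hypothetical placeholders, not settings from the post.

```python
REQUIRED = ("OUTPUT", "URL", "FILENAME", "DEPTH")


def render_env(values: dict) -> str:
    """Render the four compose variables as KEY=value lines for a .env file."""
    missing = [key for key in REQUIRED if key not in values]
    if missing:
        raise KeyError(f"missing variables: {', '.join(missing)}")
    return "".join(f"{key}={values[key]}\n" for key in REQUIRED)


# Hypothetical example values; replace them with your own.
example = {
    "OUTPUT": "/srv/zim",          # host directory mounted at /output
    "URL": "https://example.com",  # seed URL to start crawling from
    "FILENAME": "example-site",    # name of the resulting ZIM file
    "DEPTH": "2",                  # number of hops from the seed
}

env_text = render_env(example)
print(env_text, end="")
```

Saving the printed text as `.env` in the same directory as the compose file should be enough for `docker compose up` to pick the values up.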

2209
 
 
The original post: /r/datahoarder by /u/BostonDrivingIsWorse on 2025-04-08 01:46:09.

I just grabbed a CDC.gov zim from January. Anyone have links to other gov sites before they were scrubbed?

2210
 
 
The original post: /r/datahoarder by /u/KingAlex105X on 2025-04-07 23:07:10.

If this isn't the best place to ask, please recommend where I should. I ordered this USB drive and planned to use it to move a bunch of video files, but now whenever I do, once about 900 GB is on it, the files seem to get corrupted.

So I'm asking here if people have any recommendations for reliable ones (preferably not too expensive). A similar size is fine; I'd accept 800 GB.

2211
 
 
The original post: /r/datahoarder by /u/coetaneity92 on 2025-04-07 22:18:40.

My partner's grandmother has passed away and left a collection of hundreds, possibly thousands, of DVDs. These range from official releases to pirated and bootleg copies.

What would be the best way to digitize and archive this collection? Is there an external device out there that will let me burn and convert the DVDs? I'd want to possibly upload on archive.org if the copyright expired, store on backblaze or maybe another digital archiving site besides a regular torrent, would appreciate any recs on sites and advice in general. I haven't gone through these yet but figure the project would be a fun learning experience.

2212
 
 
The original post: /r/datahoarder by /u/VobsandBagene on 2025-04-07 22:09:33.

So I'm completely new to stashapp, and I'm trying to figure out how to scrape properly. I installed the community scrapers, and some are working fine right out of the box, but a number of them say "could not unmarshal json from script output: EOF" whenever I try to use them, and I don't have the first clue as to what that means. Any help would be much appreciated.

2213
 
 
The original post: /r/datahoarder by /u/NCResident5 on 2025-04-07 22:01:56.

I have an Easy Store that is filling up and need something else. At one time I heard the passport was really good about surviving drips, but I was not sure if there still is a real difference.

2214
 
 
The original post: /r/datahoarder by /u/Powerful-World4181 on 2025-04-07 19:06:30.
2215
 
 
The original post: /r/datahoarder by /u/salty_greens on 2025-04-07 16:40:11.

Hi everyone,

I recently bought a Seagate IronWolf Pro 20TB drive (model ST20000NT001-3MB101), and today, within just one week, I had to return the second replacement drive. I got it from B&H, and they’ve been very helpful: they replaced the first drive in 3 days, and today I had to send back the second one. Both exhibited filesystem corruption early on, even after clean formats with different file systems (Btrfs and EXT4). Despite passing SMART tests initially, the latest drive quickly failed with group descriptor checksum errors, rendering it unusable.

I rely on these drives for important backups, and it's unacceptable to have to RMA two brand-new units in a row. At this price and with the "Pro" label, I expected enterprise-grade reliability — not drives that fail in under 200 hours.

I looked this drive up on Amazon as well, and Amazon actually warns that this item is frequently returned. On B&H, some reviewers mentioned they ordered a batch and many of the drives were dead on arrival. I have other Seagate drives and they’ve been pretty good, and this brand has been on the market forever. The question is, what’s happening to Seagate? Is this series really so bad? I have the option to get a refund as well, and I was thinking of getting a 20TB Exos drive instead, but this is just ridiculous at this point. How can I rely on a drive long term? Of course I could buy another backup drive, but that’s insane!

2216
 
 
The original post: /r/datahoarder by /u/wells68 on 2025-04-07 19:28:48.

Is this a typo at Newegg? The deal ends in 11 hours.

Seagate BarraCuda ST24000DM001 24TB - $249.99

That's about $10.42 per TB. They show the regular price as $299.99, so something is weird.

They also have a 16TB Seagate BarraCuda drive for $329, so over $20/TB.
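
The per-terabyte comparison is just price divided by capacity. A small sketch using the two prices quoted in this post (the function name and labels are illustrative):

```python
def price_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the cost in US dollars per terabyte."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


# (price in USD, capacity in TB) as quoted in the post.
deals = {
    "BarraCuda 24TB": (249.99, 24),
    "BarraCuda 16TB": (329.00, 16),
}

per_tb = {name: price_per_tb(price, size) for name, (price, size) in deals.items()}
for name, value in per_tb.items():
    print(f"{name}: ${value:.2f}/TB")
```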

2217
 
 
The original post: /r/datahoarder by /u/R0b0tWarz on 2025-04-07 16:25:49.

I currently have an 8-bay QNAP NAS in my wall-mounted rack. It has 2x 1TB SSDs and 6x 8TB spinners. I want to replace the 2x 1TB SSDs with regular spinners. If I replace both of them with drives larger than the current 8TB IronWolf Pros that occupy the rest of the bays, will it cause an issue with the RAID setup? I'm really asking whether all the HDDs in the RAID setup need to be the same size.

Cheers

2218
 
 
The original post: /r/datahoarder by /u/sofitapulga on 2025-04-07 16:23:38.

Good morning! I need to digitize a set of almost 1,000 videos in various formats: VHS, Betacam, Betacam SP, Data Cartridge, and CD. Could someone please help me find the best software and the equipment I will need?

2219
 
 
The original post: /r/datahoarder by /u/anotherjunkie on 2025-04-07 16:15:26.

Right now my hoard is spread across drives of various sizes, generations, and operating systems — mostly stored in my closet. Maybe 20-24TB in all at the moment. The thing is, almost none of it is replicated at the moment.

So I want to get a single drive enclosure (& drives) where I can store everything with some redundancy, as well as make the media available on my home network. I’d like something that I can build out over time, ie. multiple replaceable drive bays that may not all be filled in the beginning. My questions are:

  • Is it better to get a networked enclosure, or network it using something like a Pi?
  • Are there enclosures that accept HDD and SSD? Should I be looking for one that also takes NVME?
  • I’m a RAID newbie. Do these enclosures have built in RAID or do they need to be connected to something running software?
  • What kind of enclosure is recommended for this?
  • Where is a good source of drives that won’t break the bank, and what should I look for?

Thanks for any help you can offer. I’m hoping to not break the bank since this is unplanned/ I’m trying to sneak it in before the prices go up too much.

2220
 
 
The original post: /r/datahoarder by /u/Pythonistar on 2025-04-07 15:58:29.
2221
 
 
The original post: /r/datahoarder by /u/whitenack on 2025-04-07 15:58:24.

Hey all,

I have a Synology and am trying to juggle the storage capacity of my backups. I have backups set to run daily, with settings to keep versions for a certain period of time. I also have snapshots set up on my backup folder, set to run at certain intervals and to keep versions for a certain period of time. This has created a huge storage concern, as my snapshots are filling up my storage capacity. I have gone in and tried to reduce the number of stored snapshots, but my snapshots are still huge...the same size as my backups.

I can always buy more storage, but I don't want to waste money if I am doing something silly with my retention policies. But I also don't want to leave myself exposed if hackers were to delete my backups and I should have done something more with my snapshots.

2222
 
 
The original post: /r/datahoarder by /u/vghgvbh on 2025-04-07 15:36:56.

I really want to add an NVMe SSD to a Proxmox mini PC via USB and monitor the drive's health and temperature via S.M.A.R.T. values.

But like 90% of all articles on the internet are false. Enclosures with a Realtek RTL9220 chip, for example, are marketed as supporting S.M.A.R.T. pass-through, but they only do so with SATA drives. Then there are ones from Sabrent that do pass values through via USB, but they are unreliable and get hot.

Are there any proven USB cases out there that work?

2223
 
 
The original post: /r/datahoarder by /u/_stracci on 2025-04-07 14:42:33.

I am a bit lost, I need to buy a case for Exos X24 24TB SAS, Model No: ST24000NM002H. What do I need to check?

Thank you.

2224
 
 
The original post: /r/datahoarder by /u/HakoForge on 2025-04-07 12:23:03.
2225
 
 
The original post: /r/datahoarder by /u/Historical_Flight_91 on 2025-04-07 02:52:51.

I had a few questions about ReFS since documentation is not very good. Directed at anybody with experience using it.

Objective - want checksumming of files for alerting of present bitrot. ReFS has file integrity streams that in theory do exactly this. I have backups, so I don't care for redundancy. I just need to know which files are bad ASAP.

Setup - The ReFS drive is an external drive connected to Windows 11 (Pro). (I'm using another PC with Enterprise to format it.)

A couple questions/concerns

#1 - ReFS "salvage" feature. It removes files from the namespace if they are corrupted and can't be repaired (which is always the case on a single disk). Is this tied to the -Enforce option being on for integrity streams, or is having integrity streams enabled sufficient for this to happen? I absolutely do not want files to disappear (acknowledging removed from namespace != deleted) without me knowing.

https://learn.microsoft.com/en-us/archive/blogs/b8/building-the-next-generation-file-system-for-windows-refs

#2 - I noticed that data integrity scans are not enabled in Task Scheduler (and, contradicting the documentation, the task has triggers set to run every day instead of every 4 weeks, though it's disabled). There also seem to be three different options:

https://preview.redd.it/37ey264atbte1.png?width=2261&format=png&auto=webp&s=0c662ac841bcddb3a3bd98757291673e50bb7cf2

What's the difference between the first 2 apart from the triggers? Does this scan even work on Windows 11 (non-Server)?
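
Independent of how ReFS behaves, the stated objective (knowing which files have gone bad) can also be approximated at the file level with a checksum manifest. The sketch below is a filesystem-agnostic alternative, not ReFS integrity streams; all function names are hypothetical. It records a SHA-256 hash per file and later reports files whose hash changed or that went missing. It cannot tell bitrot from an intentional edit, so it suits data that is not supposed to change.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_changes(root: Path, manifest: dict) -> dict:
    """Compare the current tree against a previously saved manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "mismatched": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "new": sorted(set(current) - set(manifest)),
    }


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.bin").write_bytes(b"hello")
    baseline = build_manifest(root)
    (root / "a.bin").write_bytes(b"hellp")  # simulate a silently changed byte
    report = find_changes(root, baseline)

print(report)
```

In practice the manifest would be saved somewhere other than the drive being checked (for example as JSON alongside the backups) and the comparison run on a schedule.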
