It's A Digital Disease!

This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

4001
 
 
The original post: /r/datahoarder by /u/unimportantfuck on 2025-01-31 23:49:25.

Data.gov is dropping datasets fast!

I just checked: the number of datasets rose steadily and substantially until Jan 21, 2025, when it stood at 307,854 datasets http://web.archive.org/web/20250120135355/https://data.gov/

Now it has lost 2,290 datasets in 9 days!

Look at this huge decrease on Jan 21, between 03:04:19 and 15:15:42 http://web.archive.org/web/20250120135355/https://data.gov/ http://web.archive.org/web/20250121233247/https://data.gov/

It drops from 307,854 to 306,012 datasets!!! It's been decreasing every day, and today it's at 305,564 data.gov
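For anyone who wants to double-check the figures, here is a quick sketch of the arithmetic. It uses only the three counts quoted in this post; the variable names are just for illustration.

```python
# Dataset counts quoted in the post (read off the linked data.gov snapshots).
peak = 307_854        # count quoted for Jan 21, 2025, before the decrease
after_drop = 306_012  # count quoted after the Jan 21 decrease
today = 305_564       # count quoted on the day of the post

one_day_drop = peak - after_drop          # datasets lost in the Jan 21 decrease
total_drop = peak - today                 # datasets lost since the peak
percent_lost = 100 * total_drop / peak    # share of the peak that is gone

print(one_day_drop)            # 1842
print(total_drop)              # 2290
print(round(percent_lost, 2))  # 0.74
```

So the single Jan 21 decrease accounts for 1,842 of the 2,290 datasets lost so far.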

This needs to be on the news! We need to back up as much information as possible relating to climate change and anything tangentially connected to the "Green New Deal"!

4002
 
 
The original post: /r/datahoarder by /u/AwaitingCombat on 2025-01-31 23:46:48.
4003
 
 
The original post: /r/datahoarder by /u/grumpy-systems on 2025-01-31 23:34:08.

Given the news, I'm planning on putting my TubeArchivist instance to use for good. I don't think these are in the EOT archives, but if they are, feel free to ignore me.

So far I'm collecting:

  • CDC
  • HHS
  • Census Department
  • Department of State (large channel, will take time)

I'm sure there are more, but the first two are my highest priority right now; I've had a handful of videos removed already.

4004
 
 
The original post: /r/datahoarder by /u/coasterghost on 2025-01-31 22:01:25.

You know what to do.

4005
 
 
The original post: /r/datahoarder by /u/ComprehensivePast428 on 2025-01-31 21:30:34.

I didn't realize DHS (the Demographic and Health Surveys program) was federal US data, but it is funded by USAID.

DHS is no longer taking new data requests, but you can still download all data from old requests. If you have any, download all the data ASAP.

4006
 
 
The original post: /r/datahoarder by /u/Raenoke on 2025-01-31 20:56:17.
4007
 
 
The original post: /r/datahoarder by /u/FrozenDog6880 on 2025-01-31 20:34:20.
4008
 
 
The original post: /r/datahoarder by /u/busytransitgworl on 2025-01-31 19:51:16.

I just wanted to ask if there's a way to help your efforts to save and archive public data from Trump's actions.

I got an Unraid setup at home and I want to do something to help you all out, because knowledge is so damn important.

Is there a simple Docker container I could set up? Can I lend a hand somehow?

I hope this is the right sub...

Thanks in advance xxo

4009
 
 
The original post: /r/datahoarder by /u/lucyditeaa on 2025-01-31 18:02:53.
4010
 
 
The original post: /r/datahoarder by /u/ThroyRoy on 2025-01-31 17:50:03.

3pm Eastern they're going to be offline, content and data scrubbed of politically inconvenient material.

Some things have already been taken down, so this could be the last chance to get some datasets.

Source: friend of friend at CDC

4011
 
 
The original post: /r/datahoarder by /u/PricePerGig on 2025-01-31 17:44:00.
4012
 
 
The original post: /r/datahoarder by /u/dnsoares on 2025-01-31 14:04:14.

Hi everyone, (first time posting something, but this has pissed me off)

I'm looking for others who have experienced the same issue with Google deleting a deceased loved one's account before the 2-year inactivity period had expired.

In my case, Google deleted my late father's account before the 2-year inactivity rule was met. I have proof of an email sent from his account to me on January 24, 2022. My father left me his passwords, and I know I accessed the account after that date, but I still need to find the exact records. However, by November 17, 2023, when I tried to access it again, the account had already been permanently deleted.

Google never contacted me via the recovery email (which was mine) before deletion. This account contained not only emails but also irreplaceable photos and memories stored in Google Drive, which my family (my mother, siblings, and I) may now never be able to recover.

I believe this might not be an isolated case, and Google could be deleting accounts without properly notifying users or respecting their own policies. If you or someone you know has faced a similar situation, let’s connect!

We are considering filing a joint complaint with European data protection authorities to demand an explanation and explore recovery options.

Thank you for your help and any suggestions are welcome!

Best Regards from Portugal

If you’ve gone through this, please share your experience here!

PS: Enthusiastic about saving and democratising access to data. I was responsible for implementing the professional management of the archive of my students' union association, which had more than 100 years of history (40 of them fighting against the dictatorship in Portugal)

4013
 
 
The original post: /r/datahoarder by /u/kennyj313 on 2025-01-31 17:21:02.

Hi all,

My current NAS build is reaching capacity, and I'm looking to upgrade it soon and add capacity/performance.

Currently, I'm running a Fractal Define R5 which I've absolutely loved, but I've filled the 8 drive bays and need to expand. Current hardware includes a 6600k passed down from my previous gaming setup, a P400 for transcoding, 16GB RAM, 8x8TB Seagate drives along with a 1TB NVMe cache drive and a 512GB SATA SSD for application data, and an LSI 9211-8i for connecting the HDDs. Use case wise, I currently run Unraid (got that lifetime license while it was cheap!), and run Plex, a few game servers, some assorted dockers, and a Windows VM. I also like the idea of setting up a gaming VM in the future to play on low power devices, but that's more aspirational.

First thing I'm debating is rack mount vs. freestanding case - my wife and I were fortunate enough to have bought our first (and hopefully forever) home recently, and we've got a few different spaces that could support a rack. I think long term that's a solution that both I and, perhaps more importantly, she are open to. However, I know there are also a few full ATX cases out there that can support upwards of 18 drives, although I figure adding another rack mount enclosure would be easier/cleaner than adding more freestanding capacity.

Second is drives - I've been opening up to the idea of refurbed enterprise drives, but it looks like a lot of others have too, and the prices/availability seem to reflect that. I'm also thinking I need to finally make the jump to larger capacity drives now that the price/TB sweet spot seems to be shifting.

Third is the other core hardware - I'm strongly leaning towards 12th gen Intel for the price/performance, core count, and QSV. My understanding is that different tiers of UHD don't offer particularly big differences in transcode performance (most of mine is 4K HDR to 1080p for <5 concurrent users), but I figure the extra CPU cores will be useful for additional dockers/game servers and if I decide to run more VMs in the future. I'm also aware that Unraid doesn't recognize the efficiency cores "intelligently", but I figure I can just pin them manually as needed.

Thoughts/input? Thank you in advance for any advice or callouts on specific part choices!

4014
 
 
The original post: /r/datahoarder by /u/sa3bbb on 2025-01-31 16:46:51.

I would really appreciate some help. I have over 500GB of .flac files that I recovered after formatting the hard drive by mistake, but the problem is that they are corrupt. I cannot find a solution, and nothing the search engines gave me has worked. Does anyone know a way on Mac to actually restore the files?

I don't know if this is the right forum to post on, but if someone can direct me I would really appreciate it. Thank you

4015
 
 
The original post: /r/datahoarder by /u/timgrahamart on 2025-01-31 16:02:27.

I'm trying to back up an entire site using WinHTTrack, but it's only grabbing the index page (which is called "index.cfm").

It's not grabbing any of the other pages, which have the following structure:

https://www.websitename.com/index.cfm?page=store&action=categories&do=view&categoryID=16

Does anyone know if there are parameters I could adjust to try and get those pages? Thank you.
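Not an HTTrack answer, but as a rough sketch of what a crawler has to do with pages like this: every distinct query string on index.cfm is its own page, so links must be followed with the query intact and each one needs its own local filename. The Python below is a minimal illustration using only the standard library; `LinkCollector`, `same_site_links` and `local_name` are hypothetical names made up for this example, and nothing here fetches anything from the network.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(base_url, html):
    """Return absolute links on the same host, keeping query strings intact."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(base_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)
        if urlsplit(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


def local_name(url):
    """Build a distinct local filename from the path plus the query parameters."""
    parts = urlsplit(url)
    stem = parts.path.strip("/").replace("/", "_") or "index"
    params = "_".join(f"{key}-{value}" for key, value in parse_qsl(parts.query))
    return f"{stem}_{params}.html" if params else f"{stem}.html"
```

With the example URL above, `local_name` would give `index.cfm_page-store_action-categories_do-view_categoryID-16.html`, so each category page is saved separately instead of overwriting index.cfm.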

4016
 
 
The original post: /r/datahoarder by /u/SchoolOfElectro on 2025-01-31 15:10:35.

https://www.youtube.com/watch?v=vDDXmghT848

4017
 
 
The original post: /r/datahoarder by /u/rudeer_poke on 2025-01-31 14:40:55.

I have a bunch of small files on my server (backups created by rsnapshot) that I don't want to delete yet, but they are slowing down my storage, especially during scrub tasks. I was thinking about compressing them, so I went for .tar.zst, but I ended up with a 60 GB file that contains millions of files and cannot be browsed by a file manager like Midnight Commander or Total Commander (via a network share), as it looks like it has to be extracted first.

What other format would be more suitable, so that if I were looking for a file from the archive I could browse it and extract just that single file? I don't really care about compression ratio, but it would be nice to have something better than plain .tar

In the meantime I have moved to restic based backups, but this is a leftover I don't want to delete yet.
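One candidate, offered as a sketch and not a tested recommendation: ZIP keeps a central directory and compresses each member separately, so an archive can be listed and a single file pulled out without unpacking the rest. The Python below shows the idea with the standard `zipfile` module; the function names are made up for this example. Note the assumptions: it does not preserve the hard links or permissions that rsnapshot relies on, and I have not checked how Midnight Commander or Total Commander cope with a ZIP holding millions of entries.

```python
import zipfile
from pathlib import Path


def pack_directory(src_dir, archive_path):
    """Add every regular file under src_dir to a ZIP archive; return the file count."""
    src = Path(src_dir)
    count = 0
    # ZIP64 extensions are enabled by default, so archives over 4 GB are allowed.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=path.relative_to(src).as_posix())
                count += 1
    return count


def list_members(archive_path):
    """List member names by reading only the archive's central directory."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()


def read_member(archive_path, member):
    """Decompress and return one member without extracting the others."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(member)
```

Usage would be along the lines of `pack_directory("/path/to/rsnapshot", "leftover.zip")` once, then `read_member("leftover.zip", "daily.0/etc/hosts")` whenever a single file is needed (both paths are placeholders).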

4018
 
 
The original post: /r/datahoarder by /u/AshleyAshes1984 on 2025-01-31 13:59:50.

Original Title: Been gearing up to host LAN Parties and wanted to practice 'The Old Ways' for fun. Between a 2gbps internet connection and also running LANCache on the network, this is the slowest way to install CS2 in my home.

4019
 
 
The original post: /r/datahoarder by /u/BesterFriend on 2025-01-31 13:40:58.

If you’re serious about data hoarding, a few rules will save you headaches:

  1. Storage Redundancy – Use a 3-2-1 backup strategy: 3 copies, 2 different media types, 1 offsite.
  2. File Organization – Automate sorting with tools like FileBot for media and FreeFileSync for backups.
  3. Data Integrity – Use PAR2 for recovery, hashing tools for verification, and ZFS/Btrfs for bit-rot protection.
  4. Efficient Compression – Zstd or 7z (ultra) can save space without killing performance.
  5. Self-Hosting – If using a NAS, consider Unraid or TrueNAS for flexibility.
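As a small illustration of the "hashing tools for verification" part of rule 3, here is a minimal Python sketch that records a SHA-256 checksum for every file under a folder and later reports which files have changed or gone missing. The function names are made up for this example, and it is a stand-in for a dedicated hashing tool, not a replacement for PAR2 or a checksumming filesystem.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Return the names from the manifest that are now missing or have changed."""
    current = build_manifest(root)
    return sorted(name for name, digest in manifest.items() if current.get(name) != digest)
```

The idea is to save the result of `build_manifest` next to the data (for example as JSON) and run `verify` against each copy of the 3-2-1 set from time to time; an empty list means every file still matches its recorded hash.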

What’s your biggest regret as a data hoarder? Let’s share lessons!

4020
 
 
The original post: /r/datahoarder by /u/MotorcycleDreamer on 2025-01-31 13:40:10.
4021
 
 
The original post: /r/datahoarder by /u/kinisonkhan on 2025-01-31 13:27:29.

I have a full tower with 12 HDDs, Win11, Ryzen 7 5700, 32GB RAM, Seasonic 850-watt PSU.

LSI MegaRaid 9240 6Gb/s

LSI MegaRaid 9361 12Gb/s

Both RAID cards used to be able to handle all 12 drives, but now the 9361 is failing and I had to move 3 HDDs to the internal controller built into the motherboard.

I'm not doing any mirroring, striping, etc... just connecting multiple drives as a Plex server. What should I buy to replace both LSI MegaRaid controllers, in the $200-350 range?

4022
 
 
The original post: /r/datahoarder by /u/inlinesix81 on 2025-01-31 12:42:21.

Looks like the perfect unit for the optical-disc datahoarder, but here in Europe (Italy) it is nowhere to be found, and some sites in Germany say the price is more than 400€. Is it worth it? Does anybody know anything about it?

4023
 
 
The original post: /r/datahoarder by /u/DEEP_HURTING on 2025-01-31 12:28:33.

I use FF, and just want to have all these URLs on the clipboard - I use JDownloader to snag stuff en masse. Is there some way of doing this? It seems hopelessly arcane. I've hit F12 and tried all kinds of approaches, and looked high and low for an extension or program that does such a thing. I've installed yt-dl-gui, but it doesn't seem to know how to scrape a page for URLs like what I'm talking about - or maybe I just haven't read the wiki enough.

Augh! Oh, for the days where crap like this was just all on a page you had access to, and IDM would pull it all down for you, no mess no fuss.
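One possible workaround, sketched under assumptions: save the page from Firefox as HTML, then pull every link out of the saved file and paste the resulting list into JDownloader. The Python below uses only the standard library; `UrlScraper` and `extract_urls` are names made up for this example, and it only sees URLs that are actually present in the saved HTML, so anything the site loads later via scripts will be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class UrlScraper(HTMLParser):
    """Collects href and src attributes from any tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                url = urljoin(self.base_url, value)
                if url not in self.urls:  # keep first occurrence only
                    self.urls.append(url)


def extract_urls(html, base_url, suffixes=()):
    """Return unique URLs in page order, optionally filtered by file extension."""
    scraper = UrlScraper(base_url)
    scraper.feed(html)
    if not suffixes:
        return scraper.urls
    wanted = tuple(suffix.lower() for suffix in suffixes)
    return [u for u in scraper.urls if urlsplit(u).path.lower().endswith(wanted)]
```

Printing the result with `print("\n".join(extract_urls(saved_html, page_url)))` gives one URL per line, ready to copy; `saved_html` and `page_url` stand for the saved page's contents and its original address.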

4024
 
 
The original post: /r/datahoarder by /u/MaKraMc on 2025-01-31 12:16:57.
4025
 
 
The original post: /r/datahoarder by /u/AF4Q on 2025-01-31 11:25:56.

I currently have a 2-bay Synology NAS (with 2x 8TB drives) which I want to upgrade. My main usage is Plex streaming to my Apple TV, torrent downloads, some not-so-critical data storage and Time Machine backups for my Macs. The main reason for upgrading is that I need more storage and future expandability. I also looked at 4-bay Synology NASes (DS923+ and DS423+), but both of them lack something or other. The DS423+ is very barebones but works well with Plex, whereas the DS923+ is feature rich but has an AMD CPU, which isn't really future proof for a NAS device (as far as I know).

So I am looking into building my own custom box. I mainly need recommendations for the following things:

  • Case: Which one to buy? My NAS sits in my lounge, so I would prefer that it looks nice and isn't too big. I am currently looking at the Jonsbo N4 and the Fractal Design Node 804. I wouldn't really need more than 4-6 drives, as you can always put in higher capacity drives once you run out of space.

  • Which motherboard to get? I don't really need the latest or greatest, but it should be Intel and should have 4-6 SATA ports. A 2.5gig or 10gig NIC would be a huge plus, but I can't really find any such board. Another option is to get any mATX board and pop in an HBA card for extra SATA ports and a 2.5/10gig NIC.

  • Lastly, budget. I wouldn't want this custom build to cost more than the DS923+ or DS423+.

Thanks.
