It's A Digital Disease!


This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

founded 2 years ago
3851
 
 
The original post: /r/datahoarder by /u/CGG0 on 2025-02-04 19:17:05.

My Jellyfin server went rogue a few nights ago and started deleting EVERY single show/episode I had flagged as "watched" (10gb+ worth). Files are on a Synology NAS.

Is data recovery possible? Recommended tools?

Edit: 10TB+, not GB.

3852
 
 
The original post: /r/datahoarder by /u/ChaosxBadger on 2025-02-04 18:06:21.

I’m looking to upgrade my current storage setup, which right now is just a single external drive. I’d like to expand to around 4 drives. Over the past few days, I’ve been researching solutions, but I’m still unsure about the best option.

Part of me wants to build my own NAS or repurpose an old office PC, but since I’m moving in about a year, I’m not in a position to commit to that right now. I’m considering getting a 4-bay DAS for the time being. If I go that route, would I be able to later connect it to a NUC, old PC, or a future NAS build without any major limitations? I’ve read a lot about RAID arrays over USB not being ideal. If I choose this setup, are there other connection options that would make it faster or more reliable?

3853
 
 
The original post: /r/datahoarder by /u/Scanner771_The_2nd on 2025-02-04 18:05:19.

Workers at NASA Told to ‘Drop Everything’ to Scrub Mentions of Indigenous People, Women from Its Websites - 404media.co

Does anyone know of a backup of the NASA site? I saw that the NTRS (NASA Technical Reports Server) is backed up.

3854
 
 
The original post: /r/datahoarder by /u/nostrademons on 2025-02-04 17:14:27.

I'm talking about technical documentation or videos, precise enough to replicate the steps and finished product, for things like:

  • Agriculture - which seeds grow where, and how to start and care for them?
  • Seed banks
  • Mining at scale
  • Geologic maps of mineral deposits
  • Metallurgy
  • Manufacturing processes
  • Construction techniques. How do we build buildings today? Would we be able to replicate the supply chain so that people used to getting drywall, plumbing fixtures, and electrical outlets can actually get drywall, plumbing fixtures, and electrical outlets?
  • Chemistry
  • How to make and mold things like plastics
  • Electrical infrastructure - how do you run and repair a grid?
  • Modern medicine. Diagnoses, treatments, anatomy, etc.
  • Semiconductor fabrication. It doesn't have to be the latest generation (which is insanely complicated), but any group that can get a ~2000s-era fab up and running while everybody else is struggling not to starve would have a huge quality of life advantage
  • Other electronic manufacture
  • Etc.

Sort of like the Doomsday Vault in Svalbard, but with the knowledge distributed across many communities, because Svalbard is likely to be the last place that people will be able to get to in a collapse of civilization.

3855
 
 
The original post: /r/datahoarder by /u/Hospuales on 2025-02-04 17:10:39.

I would like to export my Twitter/X followers to a CSV list. Is there a free way to do so? I am not subscribed to Twitter Blue.

3856
 
 
The original post: /r/datahoarder by /u/puzzle_nova on 2025-02-04 16:32:55.

Hey all, hope this kind of question is allowed (I think it follows the sub rules but I'm new here). I use a lot of NCES data (nces.ed.gov), and given the administration's removal of Census data and threats to the Department of Education, I'm wondering if anyone is backing up NCES data. There's a lot that they produce about the number of students in K-12, higher education, and beyond; these data are used in so, so many reports about the state of education in the US. I'm happy to contribute to ongoing efforts but didn't see anything else in this sub, and I wanted to ask before spending a lot of time duplicating efforts.

3857
 
 
The original post: /r/datahoarder by /u/imawesomehello on 2025-02-04 15:25:11.

Thanks to those archiving data and protecting information! I've begun backing up content that's important to me and other critical sources.

A few thoughts:

  1. Having spent my career in tech, I approach this with healthy skepticism. A key challenge will be verifying the integrity of backups if they need to be used, especially when we can't directly verify the trustworthiness of those maintaining them.
  2. I'm particularly interested in understanding the preservation status of economic data. For instance, data series like egg prices (e.g. from FRED/St. Louis Fed) https://fred.stlouisfed.org/series/APU0000708111 - are these being systematically backed up anywhere beyond their source institutions?
  3. While Archive.org serves a crucial role, I have concerns about its long-term resilience given its public stance on certain issues. Their mission statement about providing "universal access to all knowledge" is admirable but could potentially make them a target.
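
The integrity concern in point 1 is commonly handled with a checksum manifest: hash every file when the backup is made, then re-hash later and compare. A minimal sketch, assuming SHA-256 and hypothetical helper names (`sha256_of`, `build_manifest`, `changed_files`); none of this comes from the post itself:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """Return manifest entries that are now missing or hash differently."""
    current = build_manifest(root)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A manifest only proves the files match what was hashed, so it helps most when it was produced at download time and is stored separately from the backup it describes.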
3858
 
 
The original post: /r/datahoarder by /u/StardustLegend on 2025-02-04 14:45:10.

I've been casually trying to get into data archiving, saving information from things like the Emursive/Punchdrunk show "Sleep No More", which recently closed. However, with the CDC website recently scrubbing data on anything queer/LGBT, I want to start helping with the effort to preserve what is being erased.

I've just been going through the "banned" terms on the CDC website, downloading any PDFs and saving any of the pages I can as PDFs, as well as attempting to save links to the Wayback Machine and using it for any CDC pages that are already down/scrubbed.
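
Submitting links to the Wayback Machine one by one can be scripted. The sketch below assumes the public "Save Page Now" address form `https://web.archive.org/save/<url>`; the function names, the User-Agent string, and the 15-second pause are illustrative choices, not documented requirements:

```python
import time
import urllib.request

SAVE_ENDPOINT = "https://web.archive.org/save/"


def save_request_url(page_url):
    """Build the Save Page Now address for a single page."""
    return SAVE_ENDPOINT + page_url


def submit(page_urls, delay_seconds=15):
    """Request a capture of each page; returns {page_url: HTTP status or error text}."""
    results = {}
    for page_url in page_urls:
        request = urllib.request.Request(
            save_request_url(page_url),
            headers={"User-Agent": "personal-archiving-script"},  # assumed value
        )
        try:
            with urllib.request.urlopen(request, timeout=120) as response:
                results[page_url] = response.status
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            results[page_url] = str(error)
        time.sleep(delay_seconds)  # pause between requests to be polite
    return results
```

Calling `submit([...])` with a list of page addresses returns a dictionary that can be saved as a log of what was captured and what failed.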

Anybody have any tips for methods/tools to make this more efficient than just panic downloading whatever I can? Any tips on places to post these for others who may want to access this information?

Thank y'all in advance!

3859
 
 
The original post: /r/datahoarder by /u/humor4fun on 2025-02-04 14:26:49.
3860
 
 
The original post: /r/datahoarder by /u/coincidencenator on 2025-02-04 11:35:43.

Check out my script and give me some feedback.

I kindly ask you to star 🌟 the project on GitHub so I can get a trophy (it helps with job hunting).

Regards

3861
 
 
3862
 
 
The original post: /r/datahoarder by /u/timchenw on 2025-02-04 08:45:05.

Hey everyone,

Given how the number of SATA ports on motherboards keeps dwindling, the time has come for me to seriously consider moving to an external NAS, but I am a little lost as to which NAS system or filesystem I should use.

Basically, my requirements are:

  1. At least 30TB of starting effective capacity with enough redundancy to survive 2 drive failures, mostly to give me room to replace drives when one drive fails.
  2. Preferably a ready-made NAS enclosure.
  3. Data must survive and be retrievable when the NAS hardware or OS fails. This is mandatory.
  4. Being able to upgrade capacity without needing to do a lengthy or complete migration would be nice.
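
As a rough sizing aid for requirement 1, a dual-parity layout (RAID 6 or RAIDZ2, which the post does not name; this is an assumption) gives up two drives' worth of space to redundancy. A minimal sketch with hypothetical function names, ignoring filesystem overhead and the TB/TiB difference:

```python
import math


def drives_needed(target_tb, drive_tb, parity_drives=2):
    """Smallest drive count whose data drives cover target_tb, plus parity."""
    data_drives = math.ceil(target_tb / drive_tb)
    return data_drives + parity_drives


def usable_tb(drive_count, drive_tb, parity_drives=2):
    """Capacity left after parity in an array of equal-sized drives."""
    return (drive_count - parity_drives) * drive_tb


# 30 TB target with 12 TB drives: 3 data drives + 2 parity = 5 drives, 36 TB usable.
print(drives_needed(30, 12), usable_tb(5, 12))
```

Real usable space will come out somewhat lower, so it is worth leaving headroom above the 30TB target.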

Thanks!

3863
 
 
The original post: /r/datahoarder by /u/didyousayboop on 2025-02-04 05:30:15.

Archive Team is a collective of volunteer digital archivists led by Jason Scott (u/textfiles), who holds the job title of Free Range Archivist and Software Curator at the Internet Archive.

Archive Team has a special relationship with the Internet Archive and is able to upload captures of web pages to the Wayback Machine.

Currently, Archive Team is running a US Government project focused on webpages belonging to the U.S. federal government.


Here's how you can contribute.

Step 1. Download Oracle VirtualBox: https://www.virtualbox.org/wiki/Downloads

Step 2. Install it.

Step 3. Download the ArchiveTeam Warrior appliance: https://warriorhq.archiveteam.org/downloads/warrior4/archiveteam-warrior-v4.1-20240906.ova

Step 4. Run Oracle VirtualBox. Select "File" → "Import Appliance..." and select the .ova file you downloaded in Step 3.

Step 5. Click "Next" and "Finish". The default settings are fine.

Step 6. Click on "archiveteam-warrior-4.1" and click the "Start" button. (Note: If you get an error message when attempting to start the Warrior, restarting your computer might fix the problem. Seriously.)

Step 7. Wait a few moments for the ArchiveTeam Warrior software to boot up. When it's ready, it will display a message telling you to go to a certain address in your web browser. (It will be a bunch of numbers.)

Step 8. Go to that address in your web browser or you can just try going to http://localhost:8001/

Step 9. Choose a nickname (it could be your Reddit username or any other name).

Step 10. Select your project. Next to "US Government", click "Work on this project".

Step 11. Confirm that things are happening by clicking on "Current project" and seeing that a bunch of inscrutable log messages are filling up the screen.
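
If the page in Step 8 does not load, it can help to check whether anything is listening on port 8001 at all. A minimal sketch using only the standard library; the function name `port_is_open` is hypothetical and this is not part of the Warrior tooling:

```python
import socket


def port_is_open(host="127.0.0.1", port=8001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


print("Warrior web interface reachable:", port_is_open())
```

A `False` result usually means the virtual machine has not finished booting or the port is not being forwarded, in which case the address shown in the VirtualBox window (Step 7) is the one to use.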

For more documentation on ArchiveTeam Warrior, check the Archive Team wiki: https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior

You can see live statistics and a leaderboard for the US Government project here: https://tracker.archiveteam.org/usgovernment/


For technical support, go to the #warrior channel on Hackint's IRC network.

To ask questions about the US Government project, go to #UncleSamsArchive on Hackint's IRC network.

Please note that using IRC reveals your IP address to everyone else on the IRC server.

You can somewhat (but not fully) mitigate this by getting a cloak on the Hackint network by following the instructions here: https://hackint.org/faq

To use IRC, you can use the web chat here: https://chat.hackint.org/#/connect

You can also download one of these IRC clients: https://libera.chat/guides/clients

For Windows, I recommend KVIrc: https://github.com/kvirc/KVIrc/releases

Archive Team also has a subreddit at r/Archiveteam

3864
 
 
The original post: /r/datahoarder by /u/Mission-Employee-405 on 2025-02-04 04:38:03.

I'm trying to use the Wayback Machine, but I'm hearing it doesn't work well for YouTube videos. Is that right? I'm a total newbie at this stuff. I really want to make sure all of the Census videos don't get removed and lost. It looks like most, if not all, of Census' videos are on YouTube.

Any ideas on how to save these would be appreciated: https://www.youtube.com/@uscensusbureau/featured
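
One common approach for whole channels is the `yt-dlp` command-line tool, which the post does not mention and which must be installed separately. The sketch below only assembles and runs a command; the output folder name and the helper names `build_command` and `run` are assumptions:

```python
import subprocess
from pathlib import Path


def build_command(channel_url, output_dir="census_videos"):
    """Assemble a yt-dlp command that keeps metadata and skips finished videos."""
    return [
        "yt-dlp",
        "--download-archive", f"{output_dir}/downloaded.txt",  # remembers completed IDs
        "--write-info-json",   # title, description, upload date, etc.
        "--write-subs",        # subtitles where the channel provides them
        "--write-thumbnail",
        "-o", f"{output_dir}/%(upload_date)s - %(title)s [%(id)s].%(ext)s",
        channel_url,
    ]


def run(channel_url, output_dir="census_videos"):
    """Create the output folder, run yt-dlp, and return its exit code."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.run(build_command(channel_url, output_dir), check=False).returncode
```

Calling `run("https://www.youtube.com/@uscensusbureau")` again later should pick up only new uploads, because the archive file records what has already been downloaded.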

3865
 
 
The original post: /r/datahoarder by /u/Strange-Life-2 on 2025-02-04 04:33:04.

Hi, I recently bought a 7-season TV show on DVD and don't intend to watch it just yet. That said, I'd like to check the discs for errors in case I need to return it. Is there a way to do so?

3866
 
 
The original post: /r/datahoarder by /u/r01pea on 2025-02-04 04:16:27.

Historically I've used MacBook Pros backed up to an external drive using Time Machine, plus external SSDs holding a few TB of various media. I'm buying a newer ThinkPad P1 and moving to Linux, and now is a good time to take a look at creating a more reliable backup routine. It seems simple, but to be honest I can read for a few hours and feel like I haven't learned anything.

I have my data on my laptop and the external drives I can restore data from, but after I experienced what an SSD failure looks like, I decided I want to have an additional HDD I can back everything up to. When I started looking at 3.5" enclosures, I came across RAID enclosures like the OWC Mercury Elite Pro Dual and it got me thinking about setting up a RAID 1 array as surely having my backups mirrored to a 2nd drive can't be worse than backing up to a single HDD. I have since learned that is not a good plan because of Reasons™ but I do plan to mirror my backup to a 2nd drive manually. I understand this doesn't protect me in case of a fire, but it does greatly reduce the risk from drive failure which is my main concern.

I should note the drives will only be powered up and attached during backups. After giving up on the idea of a RAID, I planned to buy a plain dual bay enclosure (or a RAID enclosure but use the drives individually) for the 2x 8TB UltraStar HDDs I've already bought. But basically every enclosure out there has reviews saying the drives started disconnecting randomly and that BOTH drives were suddenly corrupted. This is true from $50 on up to several hundred dollar enclosures and too common to ignore when the whole point is to help me rest easy.

My question is: what is going on with all these failures? Shouldn't it be harder to make a mistake so bad that all your data gets corrupted when you're just trying to make a backup? I haven't been able to find any good answers about this. I'd prefer a single enclosure to avoid double the cords and power supplies, plus I imagine the speed is better transferring between the drives inside, but if 2 separate enclosures are safer I'm good with that. My needs are simple, but I know a lot of the same 4/5/6/10 bay enclosures come in a dual bay version, so hopefully someone has some good experience. Is it that all enclosures use crappy controllers? Is there a reliable one out there?

I've been told you should always have a backup to be safe, but come on - this is the backup that I'm already making to be safe. It's not reasonable to need a backup for my backup for my backup, with the expectation that whole drives being corrupted is a normal contingency. I think I've planned out a solution that is better than average, and I'm confident there is a method that is "pretty darn good" even if I don't run my own data center deep within a mountain or something. So I'd appreciate any info/tips from those with experience!

3867
 
 
The original post: /r/datahoarder by /u/Packet7hrower on 2025-02-04 04:10:47.

Hey Team! I’m looking for a quiet-ish solution to add additional 3.5” drives.

I have a 12 Bay JBOD right now, but the PSUs are very loud.

I’m not opposed to normal fan noise, but I can’t do enterprise grade high pitched PSUs or fans.

Are there any decent Dell / Supermicro chassis that I can make quiet, or a custom JBOD solution?

3868
 
 
The original post: /r/datahoarder by /u/d-e-void on 2025-02-04 03:46:33.

Okay so I've mostly got LFF drives in my current setup, but I'm looking at upgrading to a 4 blade system that has 24 SFF bays.

I'll absolutely be keeping my primary storage running, but I'd like to look into actually using the SFF bays and not just having them as decoration.

However, in my (admittedly quick and cursory) research, I'm seeing that M.2 form factor drives seem to be more available than SFF drives.

Does anyone have any experience with using SFF adapters to run M.2 drives, and are the adapter cards that carry multiple M.2 drives in a single SFF bay actually worth it?

(Or would I be better off replacing the backplane with something that I can throw M.2 drives straight into?)

3869
 
 
The original post: /r/datahoarder by /u/ApricotDismal3740 on 2025-02-04 02:38:01.

I am a sociologist and criminologist, and I was wondering if anyone has archived the Bureau of Justice Statistics and/or the FBI Uniform Crime Reports/National Incident-Based Reporting System (NIBRS). They haven't disappeared yet, but I fear they will.

3870
 
 
The original post: /r/datahoarder by /u/polawiaczperel on 2025-02-03 23:40:33.
3871
 
 
The original post: /r/datahoarder by /u/stfn1337 on 2025-02-03 22:32:05.

I wrote two blog posts on how to hoard YouTube videos and serve them locally without ads and other bloat. I think other datahoarders will find them interesting. I also have other posts about NASes and homelabs under the "homelab" tag.

How to Archive Youtube

Using Pinchflat and Jellyfin to download and watch Youtube videos

3872
 
 
The original post: /r/datahoarder by /u/[deleted] on 2025-02-03 20:47:02.

Hey Guys,

First, thank you so much for defending public health. I am Asian, and just witnessing this chaos and destruction in the USA has me shook.

I am an HIV counsellor and public health student. For 3 years I have been working on HIV prevention initiatives, and one of my biggest resources has been the free American course on HIV: https://www.hiv.uw.edu/alternate

The National HIV Curriculum is an important resource for GPs and others who get certified to become counsellors and specialists. Unfortunately, it has disappeared like every other HIV resource from the US govt.

Do you guys have any backups or solutions for this, and do you think they will bring it back?

Thanks for everything you guys are doing.

3873
 
 
The original post: /r/datahoarder by /u/squarlo on 2025-02-03 18:29:08.

Heads up that CDC STACKS may soon be removing all their publications in the “Advisory Committee on Immunization Practices” (ACIP) collection.

Not sure who to tell, but this community seems like a good place.

3874
 
 
The original post: /r/datahoarder by /u/8_Miles_8 on 2025-02-03 18:23:26.

Without giving too much identifying info, I’m a nerd and an activist and desperately working to slow down The Administration’s attempts to burn everything down. I’m also transgender, and the loss of CDC and medical library info is directly screwing up my ability to research and my healthcare provider’s ability to make informed decisions about my care. Y’all are doing extremely important work, and you’ve been doing it for decades.

From the entire activism and transgender communities, thank you.

3875
 
 
The original post: /r/datahoarder by /u/RoxxieMuzic on 2025-02-03 18:19:59.

Is anyone concerned about this resource? There is a high probability that if they ban what I think they are aiming for, it will go dark.

I am digitizing a ton of music, my current ebook library, and a ton of audiobooks, but I only have so much time, space (48 TB), download speed/bandwidth, money (fixed income that soon may disappear), and limited digital knowledge; old person here. The Gutenberg Library is an important resource of books in ebook format. It is also free.
