It's A Digital Disease!

This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

founded 2 years ago
3826
 
 
The original post: /r/datahoarder by /u/jim_ocoee on 2025-02-05 08:57:35.

I appreciate the work being done here, and I know this might not be the perfect place to ask. But does anyone know where people are continuing to gather data? Something along the lines of what u/kaimingtao posted? I know a lot of CDC data, for example, are aggregated from local sources, but I can't find who is following up on that. Any tips are appreciated. I want to help!

*edited to fix link

3827
 
 
The original post: /r/datahoarder by /u/Hamilcar_Barca_17 on 2025-02-05 07:34:16.

Hey all!

One thing I've noticed with the data hoarding of government websites is that not everyone who needs access to the data is tech savvy enough to download torrents or use archive.org, has permission to install Kiwix on their work machine, or even has the space to download some of these sites.

Access to this data is critical, and for the time being, sharing the data is not illegal. So, in the interest of posterity and ease of access, I figure there's no such thing as too many mirrors.

So I want to get your thoughts on a possible solution that's as close to a federated site for hosting all these archived sites and data as possible.

I own a domain that I can easily create subdomains for, i.e. cdc.thearchive.info, pubmed.thearchive.info, etc., and suppose I point the subdomains to hosts that host the sites and make them available again via Kiwix. This would make it easier for any health care workers, researchers, etc. who are not tech savvy to access the data again in a way they're familiar with and can figure out more easily.

Then, the interesting twist: if anyone else also wants to help host this data via Kiwix or any other means, you'd give me the host you want me to add to DNS and I'd add it on my end, and on your end you'd create the Let's Encrypt certificates for the subdomain using the same Proton Mail address I used to create the domain.

What are your thoughts? Would this work and be something you all see as useful? I just want to make the data more easily available and I figure there can't be enough mirrors of it for posterity.
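
To make the DNS side of this idea concrete, here is a minimal sketch of how a list of volunteer mirrors could be turned into zone-file style records. Everything in it is an assumption for illustration: the `MIRRORS` table, the `zone_records` helper, the TTL, and the addresses (taken from reserved documentation ranges, not real hosts). Labels are relative to the zone origin, and publishing several address records for one name simply gives round-robin DNS across the mirrors.

```python
from ipaddress import ip_address

# Hypothetical mirror table: subdomain label -> volunteer host addresses.
# These addresses come from reserved documentation ranges, not real mirrors.
MIRRORS = {
    "cdc": ["203.0.113.10", "198.51.100.7"],
    "pubmed": ["203.0.113.11"],
}


def zone_records(mirrors, ttl=300):
    """Return zone-file style lines, one A or AAAA record per mirror host."""
    lines = []
    for label in sorted(mirrors):
        for raw in mirrors[label]:
            addr = ip_address(raw)  # raises ValueError on a malformed address
            rtype = "A" if addr.version == 4 else "AAAA"
            lines.append(f"{label} {ttl} IN {rtype} {addr}")
    return lines
```

Calling `print("\n".join(zone_records(MIRRORS)))` prints one line per mirror, which could then be pasted into whatever DNS provider or zone file is actually in use.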

3828
 
 
The original post: /r/datahoarder by /u/dr100 on 2025-02-05 05:14:09.

Literally that. The irony is thick for this one in multiple ways, particularly under the "What do you mean DELETE?" banner.

Update: It also appears that whoever is currently handling the modmail doesn't distinguish between DELETED and DOWNVOTED, because this is the answer I got:

That’s how Reddit works. People decide what content surfaces with their votes

3829
 
 
The original post: /r/datahoarder by /u/a_Ninja_b0y on 2025-02-04 19:58:20.

3830
 
 
The original post: /r/datahoarder by /u/Walmart_Valet on 2025-02-05 04:08:13.

Here are three YouTube channels for USAID. I'm scraping them now, but could probably use some help.

https://youtube.com/@usaidrdma

https://youtube.com/@usaidafrica

https://youtube.com/@usaidkenyaandeastafrica
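
For anyone who wants to pitch in, here is a rough sketch of one way to do it with yt-dlp. This is not necessarily how the scrape above is being run. The `ytdlp_command` helper, the `usaid` output folder, and the `downloaded.txt` archive file are made-up names for illustration; the flags themselves are standard yt-dlp options. The function only builds the argument list, so nothing is downloaded until you run it.

```python
CHANNELS = [
    "https://youtube.com/@usaidrdma",
    "https://youtube.com/@usaidafrica",
    "https://youtube.com/@usaidkenyaandeastafrica",
]


def ytdlp_command(channel_url, out_dir="usaid", archive_file="downloaded.txt"):
    """Build a yt-dlp argument list for one channel (does not run it)."""
    return [
        "yt-dlp",
        "--download-archive", archive_file,  # skip videos already recorded here
        "--write-info-json",                 # keep metadata next to each video
        "--write-subs",                      # save subtitles when available
        "--ignore-errors",                   # keep going past failed videos
        "-o", f"{out_dir}/%(uploader)s/%(upload_date)s - %(title)s [%(id)s].%(ext)s",
        channel_url,
    ]
```

With yt-dlp installed, `subprocess.run(ytdlp_command(CHANNELS[0]))` would start on the first channel. Because of the download archive, an interrupted run can be restarted without fetching the same videos again.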

3831
 
 
The original post: /r/datahoarder by /u/SoloTravelz on 2025-02-05 03:59:59.

With the US government deleting everything on trans people (and a lot of LGBTQIA+ stuff in general) I'd love to do my part in preserving some of that data. Does anyone have a torrent link? I already have some of the CDC datasets but I feel like it wasn't everything.

3832
 
 
The original post: /r/datahoarder by /u/Universal-Magnet on 2025-02-05 03:31:07.

I got a Seagate 2TB hard drive in 2016 and used it in my PS4 for the past 9 years. I didn’t use my PS4 very much. I just sold my PlayStation and am now using the hard drive to store movies. It seems to be working great, but it would suck if it failed without warning. Should I get a new hard drive, or do they last longer than you’d expect if not used frequently?

3833
 
 
The original post: /r/datahoarder by /u/kaimingtao on 2025-02-05 02:44:38.

Storing and archiving the data is just the beginning. We need professionals to teach people how to understand it, how to use it, and how to get new data. Datasets therefore need active communities to maintain them and keep them alive. As long as the community exists, the data is alive.

3834
 
 
The original post: /r/datahoarder by /u/maybeofftopic365 on 2025-02-05 01:29:56.

I decided when things started getting bad to start downloading everything I could find on the internet relating to Jews and Judaism. A big part of Judaism is preservation of texts. The prohibition on throwing out anything containing God's name has almost accidentally functioned as a demand to preserve material culture. A prime example of this is the Cairo Geniza, a collection of texts found in a Cairo synagogue's attic that is a mind-blowing resource for historians because it's pretty much every scrap of parchment a community used for over 200 years.

So I've got my Geniza going and I've got something like 70 GB. I have a couple of 128 GB USB sticks and an extremely limited budget. I also kind of want to write a novel about someone who finds my Geniza in the far off future.

It's cool to see other people with the same impulse I have. I've got my own little corner of reality to preserve. So do the rest of you, apparently. That's cool.

Anyway, any tips for organizing an extremely large and unwieldy library of PDFs?

3835
 
 
The original post: /r/datahoarder by /u/NovarisLight on 2025-02-05 01:19:03.

Preserve real history. Don't let the money rule people's lives.

3836
 
 
The original post: /r/datahoarder by /u/Kat-Attack-52 on 2025-02-05 01:10:15.

I know it’s probably not much, but I think it’s a start for my data hoarding journey. The things I’ve been downloading so far are US history, LGBT info, and anything related to CDC and WHO.

I also have a spare laptop that’s mostly for various news articles from different media sites, and it has Tor and a VPN installed.

Any advice or insight for a newbie here on what else to download would be greatly appreciated!

3837
 
 
The original post: /r/datahoarder by /u/alchenn on 2025-02-05 01:03:29.

3838
 
 
The original post: /r/datahoarder by /u/Pandamm0niumNO3 on 2025-02-05 00:59:46.

Large HDDs are cheap and I have good internet.

Most importantly, I don't live in America.

How can I help?

3839
 
 
The original post: /r/datahoarder by /u/Jaden_Social on 2025-02-05 00:09:31.

Hello, I got a 50-pack of 25GB BD-R discs a while ago to make another backup of my storage. I wasn't aware that you can only write to BD-R once. Is there any way I could still write more data to them after the first write? If this isn't possible, is it possible to remove the first write and create another?

3840
 
 
The original post: /r/datahoarder by /u/charlesGodman on 2025-02-04 23:57:45.

I am deciding between a WD Gold 10TB-14TB drive filled with helium or air.

Use case: Make a backup of important data, remove it from power and let it sit in my drawer. I will just plug it in once a year to check if it works. It might be that I need it twice a year but no more.

About 5-6 years ago it seemed that everyone was strongly recommending air, but I read that helium technology (especially in enterprise-grade drives) has come a long way, and now some people recommend helium over air for longevity. WD themselves state that the helium drives live longer, citing better MTBF and better AFR. I am just not sure that these are "relevant" for my use case.

Any advice for me?

Addendums:

  • I checked the posts on this forum, I didn't find any helium longevity posts >=2024.

  • Of course I will keep multiple copies.

  • I will not get tapes because of the price

  • I will not use Amazon S3 Glacier. Uploading and (maybe) retrieving data (approx 10TB) seems horrendously expensive.
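
On the once-a-year check: this does not settle helium vs. air, but whichever drive it ends up being, a checksum manifest is one way to make "check if it works" mean more than "it spins up". Below is a minimal sketch, assuming the manifest is stored somewhere other than the backup drive itself. The function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Return {relative path: sha256} for every file under root."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Compare the drive's current contents against a previously saved manifest."""
    current = build_manifest(root)
    missing = sorted(set(manifest) - set(current))
    changed = sorted(k for k in manifest if k in current and current[k] != manifest[k])
    return {"missing": missing, "changed": changed}
```

The idea would be to run `build_manifest` right after writing the backup, save the result (for example as JSON) on another machine, and run `verify` at each yearly check. Empty `missing` and `changed` lists mean every file still reads back bit-for-bit.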

3841
 
 
The original post: /r/datahoarder by /u/Frozen-Dragon-626 on 2025-02-04 22:01:52.

Hello,

I have a massive collection of movies, TV shows, music, video games, ebooks, audiobooks, comic books, etc., and I want a way to easily search my library to see if I have something. Manually making spreadsheets would take years. How do you guys catalog everything? Is there special software?

Update: I am on Windows 10
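
Not a replacement for dedicated cataloging software, but as a rough sketch of the "searchable index instead of a spreadsheet" idea: Python's standard library can walk a folder tree into a SQLite table and search it by name. The `build_catalog` and `search` names, and the table layout, are made up for illustration. It should work the same on Windows 10 as anywhere else.

```python
import sqlite3
from pathlib import Path


def build_catalog(root, db_path=":memory:"):
    """Walk root and record every file's name, folder, and size in SQLite."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (name TEXT, folder TEXT, size INTEGER)"
    )
    conn.execute("DELETE FROM files")  # rebuild from scratch on each scan
    rows = (
        (p.name, str(p.parent), p.stat().st_size)
        for p in Path(root).rglob("*")
        if p.is_file()
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, term):
    """Substring search on file names (SQLite LIKE ignores ASCII case)."""
    cur = conn.execute(
        "SELECT name, folder FROM files WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return cur.fetchall()
```

Passing a real file path as `db_path` keeps the index between runs, so `search(conn, "blade")` answers "do I have this?" without rescanning the drives. Note that `%` and `_` in the search term act as wildcards.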

3842
 
 
The original post: /r/datahoarder by /u/xilet on 2025-02-04 19:13:09.

I have been working on another effort to archive as much gov data as I can find. Is there a chat (IRC/Discord/whatever) anywhere for the folks working on it?

3843
 
 
The original post: /r/datahoarder by /u/morgcraft on 2025-02-04 19:09:57.

Specifically, there's a LOT of data listed on this webpage that I'm not sure the Wayback Machine will get.

https://www.fedscope.opm.gov/

3844
 
 
The original post: /r/datahoarder by /u/hymnmarch on 2025-02-04 06:42:27.

Is there any good frontend for Windows that can display my videos with cover art or some sort of thumbnail? I use the MPC-HC player with madVR and really just want the covers for a better browsing experience. Thanks

3845
 
 
The original post: /r/datahoarder by /u/OrganTrafficker900 on 2025-02-04 23:30:20.

I used AOMEI Backupper and did a partition backup; however, none of my CAD, 3ds Max, or SketchUp files were backed up. I am doing this with an external SSD, and the PC that I am moving these files to also has the same versions of these programs installed. What can I do?

3846
 
 
The original post: /r/datahoarder by /u/can_of_spray_taint on 2025-02-04 23:07:08.

It's insane what the Trump admin is doing to US federal data. Why would user data, backed up using services such as BackBlaze, be considered safe?

Yes, probably freaking out a little hard, but also, if someone can tell me of Europe-based alternatives to look into, that'd be just dandy.

I know BackBlaze has some servers in the EU, but they appear to be majority US-based and I just don't think we can trust the current US admin at all. So I'd like to be able to consider my options.

3847
 
 
The original post: /r/datahoarder by /u/FriendRaven1 on 2025-02-04 22:54:21.

All of you. You're preserving history, preparing for the future, and we're all in awe.

Keep going, Champions! You're helping the entire world.

3848
 
 
The original post: /r/datahoarder by /u/Intellectual_INFJ on 2025-02-04 22:35:54.

Hello all,

Brand new data hoarder here. My goal is to back up media content - photos, videos.

I've selected the "Synology 2-Bay NAS DS223 (Diskless)" as my NAS.

I've selected two "WD Red Plus 10TB" drives as my NAS hard drives.

Is this a suitable selection for my small-scale archival purposes?

Any insight is appreciated.

3849
 
 
The original post: /r/datahoarder by /u/Elrecoal19-0 on 2025-02-04 22:26:40.

So, in light of recent events in the US (like the deletion of CDC data), I want to start saving data so others can access it through torrenting (and not just limited to US stuff like the CDC, it was just what triggered me to get into this), and a guide, or some pointers to guides, would be wonderful. Things like

  • Important stuff that would need torrenting (like the CDC, Wikipedia, data (or software) from other important organizations...)
  • Setup tips (HDD or SSD? external or internal? a dedicated PC/server [asking because I have no idea]?)
  • Good practices (good trackers, bad trackers, should I use a VPN, should I structure the torrent folders a certain way [again, asking because I have no idea]?)

Right now I'm planning on getting a 1TB HDD just for it (and I'm aware it's too small, but I guess I gotta start with something?)

3850
 
 
The original post: /r/datahoarder by /u/Average-Addict on 2025-02-04 21:20:50.

Why are they so damn expensive here lol.

I was wondering if you guys have any recommendations for where to buy cheap drives. I'm looking for around 4-12TB drives. Thanks in advance!
