It's A Digital Disease!


This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

founded 2 years ago
4326
 
 
The original post: /r/datahoarder by /u/zyzhu2000 on 2025-01-23 15:12:07.

I collect a ton of structured and unstructured data. There are several cases:

  1. Web pages and PDF files I saved from subscription services (completely unstructured),
  2. Data that I periodically scrape, parse, and extract from web pages. It is mostly structured, but fields occasionally change. An example is real estate info.
  3. Data I download from APIs I pay for, typically JSON files that each describe one record. These are very structured, but the fields can still change when the API changes versions.

My questions are:

  1. For long-term archiving, should I keep the raw format (i.e., downloaded web pages as is) or the extracted data?
  2. How do I deal with the occasional field changes when I archive data?
  3. In what file format should I archive? Parquet, SQLite, CSV, JSON tarball?

It’s a bit like I need to create a personal data lake.
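As a rough illustration of question 2, here is a minimal Python sketch of one common way to tolerate field changes: keep a fixed set of known columns and push anything unexpected into a JSON catch-all column, so nothing is dropped when a scraper or API version adds or renames a field. The field names and the `KNOWN_FIELDS` list are hypothetical, and this is only one possible approach, not a recommendation from the post.

```python
import json

# Hypothetical list of fields the archive currently understands.
KNOWN_FIELDS = ["id", "price", "address"]


def normalize(record, known_fields):
    """Split a raw record into known columns plus an 'extra' catch-all.

    Missing known fields become None; unknown fields are preserved as a
    JSON string so a later schema change never loses data.
    """
    row = {name: record.get(name) for name in known_fields}
    extra = {k: v for k, v in record.items() if k not in known_fields}
    row["extra"] = json.dumps(extra, sort_keys=True)
    return row


# Two made-up records: the second one comes from a "newer" source version
# that renamed 'address' and added 'beds'.
old = {"id": 1, "price": 500000, "address": "1 Main St"}
new = {"id": 2, "price": 510000, "street_address": "2 Main St", "beds": 3}

rows = [normalize(r, KNOWN_FIELDS) for r in (old, new)]
for row in rows:
    print(row)
```

Rows shaped like this can be written to any of the formats listed in question 3. Keeping the raw download alongside them means the extraction can be rerun later if the known-field list turns out to be wrong.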

4327
 
 
The original post: /r/datahoarder by /u/didyousayboop on 2025-01-23 14:43:06.

The podcast The Backup Wrap-Up has two episodes about M-Discs.

(The links below go to episodes.fm pages for these episodes, which provide links for every major podcast app.)

First episode: Is M-Disc the ultimate archive medium for SMBs and home users? (June 27, 2022)

This week we talk about this exciting "new" medium for archiving data that is especially attractive to SMBs and home users. It's an optical disc that looks like a DVD and is readable in all Blu-Ray drives, but underneath it's something very different. If you haven't heard of it, then you're in luck! Thanks to Daniel Rosehill, backup anorak and friend of the show, we're going to talk about it – and its competitors on this week's episode! We discuss the good and bad about using all of the following for archiving: paper, SSD, disk, tape, DVD, Blu-Ray, ending with M-Disc. Learn what's wrong with these other mediums, and what's so great about this one in another fun episode of Restore it All! [Note: Restore it All is the old name of the podcast.]

Second episode: M-disc founder explains how it keeps data for 1000 years (August 15, 2022)

This week we have Barry Lunt, one of two founders of Millenniata, the creators of M-Disc. The company may be gone, but the format lives on. Most modern DVD and Blu-Ray drives can write to M-Disc, and Verbatim still sells it. Barry explains to us why they decided to make M-Disc, and why it's different than any other optical product. He also offers a shocker: a study done many years ago that shows that recordable DVDs are nowhere near as good at holding onto data as they claim. There is a lot of good info in this episode. Hope you like it.

Apart from M-Disc, I'm wondering if any archival grade optical discs, such as Blu-rays or DVDs, exist, are available for purchase, and have credible evidence supporting claims about their longevity.

For example, I see that Verbatim sells "archival grade" DVD-Rs with a gold layer. Verbatim says, "these discs are designed to last up to 100 years when properly stored." The Canadian Conservation Institute (part of the Canadian federal government) estimates the longevity of DVD-Rs with a gold metal layer at "50 to 100 years". The big downside here is each disc only holds 4.7 GB. Seems like it would be a pain to burn that many DVDs.

4328
 
 
The original post: /r/datahoarder by /u/Ron3lack on 2025-01-23 14:20:44.

Hey, I just got a 5TB WD Ultra HDD to fully back up my Windows 10 PC.

I've never used an HDD before and I have a few questions:

  1. Is it possible to do a full backup if I activate this drive's PIN lock system?
  2. What's the simplest free way to do it? I tried through Windows' interface, but it's been stuck on 0 bytes for hours.
  3. Is there any way I can plug in the HDD once a month, for example, and just update the backup, without removing everything and copying it all over again?
  4. Is there any way to also back up all my iPhone photos, WhatsApp chats, Notes, and Files to the HDD? I use iCloud to back up everything, but I want to make an extra copy.

Thanks, everyone!

4329
 
 
The original post: /r/datahoarder by /u/raymate on 2025-01-23 11:57:26.

Been burned by fake cards before and I need a bigger card.

I use SanDisk for everything and just noticed I can buy direct from the SanDisk website. Surprisingly, of the two cards I'm looking at, one is the same price as on Amazon and the other is cheaper than on Amazon. So I figured I'd just get them direct, since they are going to be real for sure.

Anyone have experience buying them from the SanDisk Canada online store?

How fast did they ship out? Was the buying experience good?

4330
 
 
The original post: /r/datahoarder by /u/Faditt on 2025-01-23 11:28:48.

Original Title: GitHub - beveradb/youtube-bulk-upload: Upload all videos in a folder to youtube, e.g. to help re-populate an unfairly terminated channel. This great repo needs contributors, as the owner is not interested in maintaining it.

4331
 
 
The original post: /r/datahoarder by /u/Dominy990784 on 2025-01-23 10:03:04.

I just bought a new PC, so I'm transferring the whole user folder from the old PC's C drive to an external SSD, and from there to the new PC.

But when it got to 79%, it got stuck on a file and the speed is 0 B/s. I don't want to redo the whole 3-hour process. Is there a way to forcefully skip that file? Any way to fix this? Please... Thank you.

https://preview.redd.it/1d2kr18xvpee1.png?width=447&format=png&auto=webp&s=8a1ca0a267a26a2ba283833477e7a1d5e0af4ee6

4332
 
 
The original post: /r/datahoarder by /u/newfireorange on 2025-01-23 09:26:36.
4333
 
 
The original post: /r/datahoarder by /u/djtron99 on 2025-01-23 06:58:28.

Would a $65 second-hand 6-bay DIY NAS with an i7 5675C, 16 GB RAM, and an H97N-WIFI board running Windows Server 2019 be OK as a main low-power home file server with 1080p streaming? The 3.3-3.7 GHz 5675C has a 65 W TDP with a configurable 37 W TDP.

I also have an almost decade-old QNAP and a noisy Asus NAS with slow 1.8-2.5 GHz dual-core Celeron N3060 processors, and I plan to use these as backups or sell them.

4334
 
 
The original post: /r/datahoarder by /u/GoldNux on 2025-01-23 06:34:46.

Hoarding for the end of the world and for entertainment. Any suggestions?

4335
 
 
The original post: /r/datahoarder by /u/wickedplayer494 on 2025-01-23 06:09:04.
4336
 
 
The original post: /r/datahoarder by /u/unrebigulator on 2025-01-23 03:34:30.

I signed up for a Medium.com account, to read a specific article. And I'm kinda salty about it.

Hypothetically, if I wanted to download and save a bunch of content during my year of membership, where would I start?

4337
 
 
The original post: /r/datahoarder by /u/pagem4 on 2025-01-23 03:17:18.
4338
 
 
The original post: /r/datahoarder by /u/Had_to_make_this_up on 2025-01-23 02:53:37.

I have one that was working just fine. I recently moved, and it's been roughly a year since it was powered up. To my dismay, it does not power up. I don't know if it's the power supply or the backplate. I was hoping someone who also has one of these could pull the power supply from the chassis, plug it in, turn on the power switch, and tell me if any lights illuminate when it's not plugged into the chassis.

Thanks in advance!

4339
 
 
The original post: /r/datahoarder by /u/WhosItHanging on 2025-01-23 02:40:36.

I have several Exos drives (2x 12TB, 4x 16TB, and 2x 18TB) and I was wondering if I could boost my transfer speeds any more.

I was always curious if a NAS server plugged in via a 2.5 Gbps Ethernet port would provide any more speed over just a SATA 3 connection direct to the PC.

Or would an HDD dock with USB-C USB 3.whatever Gen whatever (10 Gbps) be any quicker? I'm guessing it's all limited to SATA 3 speeds anyway. It's sad that there isn't a faster way to move stuff around for us hoarders when it comes to spinning disks.

(Don't recommend switching to SSDs, lol. I'm trying to be cost-effective.)
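To put rough numbers on the three links mentioned above, here is a small back-of-the-envelope Python sketch. The 270 MB/s figure for a single large Exos drive's sustained sequential rate is an assumption for illustration, not a measurement, and the link figures ignore protocol and filesystem overhead (only line encoding is accounted for).

```python
def usable_mb_per_s(line_rate_gbps, encoding_efficiency):
    """Convert a link's line rate to MB/s after line-encoding overhead."""
    return line_rate_gbps * 1000 / 8 * encoding_efficiency


# Theoretical ceilings for each connection type in the post.
links = {
    "2.5 Gbps Ethernet": usable_mb_per_s(2.5, 1.0),
    "SATA 3 (6 Gbps, 8b/10b)": usable_mb_per_s(6.0, 8 / 10),
    "USB 10 Gbps (128b/132b)": usable_mb_per_s(10.0, 128 / 132),
}

# ASSUMPTION: sustained sequential rate of one large 7200 rpm drive.
HDD_SEQUENTIAL_MB_S = 270

for name, limit in links.items():
    bottleneck = "drive" if limit > HDD_SEQUENTIAL_MB_S else "link"
    print(f"{name}: ~{limit:.0f} MB/s ceiling, bottleneck: {bottleneck}")
```

Under these assumptions every link has a higher ceiling than a single spinning drive, so the drive itself is the limit rather than SATA 3. Reading from several drives at once, for example a striped array behind a NAS, changes the picture, and 2.5 Gbps Ethernet would then be the first link to saturate.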

4340
 
 
The original post: /r/datahoarder by /u/Officer-K_2049 on 2025-01-23 02:32:46.

I went into Micro Center to pick up a WD Red Pro 18TB at $379, but the salesman convinced me that Toshiba NAS drives are a better choice. I got a 22TB N300 for $419, which isn't bad: an extra $40 for an additional 4TB. I should have looked up the Amazon reviews, as the rating for these units is 4.2 stars, WD is 4.3, and Seagate IronWolfs are 4.5.
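For reference, the cost per terabyte of the two drives can be worked out from the prices quoted above. This is just arithmetic on the post's own numbers, sketched in Python:

```python
# (price in USD, capacity in TB), as quoted in the post.
drives = {
    "WD Red Pro 18TB": (379, 18),
    "Toshiba N300 22TB": (419, 22),
}

per_tb = {name: price / tb for name, (price, tb) in drives.items()}
for name, cost in per_tb.items():
    print(f"{name}: ${cost:.2f}/TB")

# Marginal cost of the capacity gained by choosing the larger drive.
extra_cost = 419 - 379
extra_tb = 22 - 18
print(f"Extra {extra_tb} TB: ${extra_cost / extra_tb:.2f}/TB")
```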

I am using the HDD in a dedicated tower with 4 swappable bays, running an ancient AMD 8350 Bulldozer on an ASUS M5A99FX PRO R2.0 AM3+ ATX mobo, basically as a storage center. The tower is on only a few times a month when I need to update files. Currently it has a few WD Red 14TBs, but I have filled those up. I expect to get another 22TB HDD to mirror the Toshiba, but maybe I will go WD Red next time for redundancy.

What do you think about Toshiba HDDs? Are they reliable? Should I exchange it for a different drive?

4341
 
 
The original post: /r/datahoarder by /u/RodoCapsule on 2025-01-23 00:52:39.
4342
 
 
The original post: /r/datahoarder by /u/4EcwXIlhS9BQxC8 on 2025-01-22 23:20:57.

I currently have an Unraid server in a SilverStone 8-bay case, and frankly the airflow isn't great and the noise isn't amazing either, as Unraid doesn't make it particularly easy to control the case fan speeds depending on HDD temps (the CPU is generally always cool).

Anyway, I currently have 44TB of usable space with 4 drives and I'm only using 50% of it so it's going to be a while before I need to add any more.

So I'm considering migrating to a prebuilt 4-bay NAS just for noise levels and efficiency. I will be going to TrueNAS SCALE with RAIDZ1.

I then thought, with interfaces such as OcuLink gaining traction, is there anything that would let me buy a 4-bay host node now, and then down the line buy another 4-bay dumb box which is essentially a backplane, fan, and power connector? I could then just add another 4-disk vdev to the storage pool.

I'm less keen on using USB, but I suppose with USB4 offering Thunderbolt functionality, which is essentially PCIe, it's almost as good as OcuLink.

Thoughts?

4343
 
 
The original post: /r/datahoarder by /u/Spaduf on 2025-01-22 23:11:01.
4344
 
 
The original post: /r/datahoarder by /u/n1ght_watchman on 2025-01-22 21:48:17.

Hey all,

I want to move my finished video projects and assets off my M.2 SSDs and archive them on an external drive. I'm thinking about getting an IronWolf Pro or WD Red Pro and I want to keep them externally.

I was looking to get a docking station, although apparently they are pretty unreliable.

What is the best/best-buy setup in my case? Should I just use a WD Elements instead?

Thanks

4345
 
 
The original post: /r/datahoarder by /u/PhonicSword on 2025-01-22 21:42:09.

Sorry in advance for the long post! I’m planning to set up a family server for storing and viewing all our photos, but I’m pretty new to home servers and feeling a bit lost after doing some research. My primary goals are:

  1. Allow all family members to upload their photos to a shared server
  2. Organize photos and remove duplicates
  3. Make photos searchable by categories
  4. Automate sorting newly uploaded photos

For the first two steps, my idea is to create a NAS server with a folder for each family member based on who took the photos. Each folder would have two subfolders: "unorganized," where they'd upload their photos, and "organized." I would then remove all duplicates between our photos, rename old or Apple photos to the Android naming structure based on date, and then sort them into subfolders based on year.

Based on my research, Czkawka seems to be best for finding duplicates and Namexif is best for batch renaming files. However, I’d love recommendations if there are better options.
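The dedupe, rename, and sort-by-year workflow described above can also be sketched in a few lines of standard-library Python. This is a minimal illustration, not a replacement for the tools named in the post. It assumes that exact byte-for-byte duplicates are the only ones that matter, that the file's modification time is a good enough stand-in for the capture date (reading EXIF would need a third-party library), and that `IMG_YYYYMMDD_HHMMSS` is the desired Android-style name. The function name `organize` and the example paths are hypothetical.

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def organize(unorganized, organized):
    """Move unique files into organized/<year>/ under a date-based name.

    Exact duplicates (same content hash) are left in place and returned
    so they can be reviewed before deletion.
    """
    organized = Path(organized)
    organized.mkdir(parents=True, exist_ok=True)
    # Hash everything already organized so re-uploads are caught too.
    seen = {file_digest(p) for p in organized.rglob("*") if p.is_file()}
    duplicates = []
    for src in sorted(Path(unorganized).iterdir()):
        if not src.is_file():
            continue
        digest = file_digest(src)
        if digest in seen:
            duplicates.append(src)
            continue
        seen.add(digest)
        # ASSUMPTION: modification time approximates the capture date.
        taken = datetime.fromtimestamp(src.stat().st_mtime)
        stem = taken.strftime("IMG_%Y%m%d_%H%M%S")
        dest_dir = organized / str(taken.year)
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / (stem + src.suffix.lower())
        if dest.exists():
            # Different photo with the same timestamp: disambiguate.
            dest = dest_dir / f"{stem}_{digest[:8]}{src.suffix.lower()}"
        shutil.move(str(src), str(dest))
    return duplicates
```

A call might look like `organize("/srv/photos/alice/unorganized", "/srv/photos/alice/organized")`, where both paths are made up for the example. Running it on a copy of the photos first is the safer way to try it.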

Where I’m struggling is with tagging and viewing the photos. I’ve read that tools like Adobe Lightroom, Synology, or Google Photos can add tags for easy searching, but I’m unclear if the photos would retain the metadata after leaving the program. Could my family search directly on the NAS server itself, or would I need something like a Plex server for them to search via the metadata from any device?

I’d also appreciate suggestions for family members to categorize photos during uploading. For example, could they choose from a dropdown menu (e.g., dog photos, Christmas party, family vacation) to assign categories? I’ve seen examples of custom scripts for automating tasks like renaming files during uploads, but I’m unsure if these can work across multiple users uploading from different devices.

My backup plan is to use the NAS and sort new uploads myself periodically. However, the harsh reality is that if my backup solution isn't convenient or it isn't easy to search for photos, my family won't use it. Any advice would be greatly appreciated, even if it's just showing me resources to learn how to code. Thanks in advance!

4346
 
 
The original post: /r/datahoarder by /u/Oxeda on 2025-01-22 21:31:23.

Hello friends, I'm looking for a unified cloud program that syncs my cloud services (Google Drive and OneDrive) to my USB drive. This is a necessity for my work; I know it's not optimal.

Previously I had Google Drive installed on my PC and synced with my USB drive, but I can't do the same with OneDrive. Right now I'm using odrive.com, but it's just too slow: if I modify something locally, I have to wait up to 10 minutes to see the change online.

Can you recommend an alternative?

Thanks.

4347
 
 
The original post: /r/datahoarder by /u/lkjmnb423 on 2025-01-22 21:31:07.
4348
 
 
The original post: /r/datahoarder by /u/dunkyavell on 2025-01-22 21:00:08.

Hey everyone, I am moving from an old Synology DS214play to a newly built Unraid machine. I don't have a ton of experience with Unraid. Currently I have two 8 TB drives mirrored in the Synology box, maybe 70% full. I am wondering what a good strategy would be for purchasing new drives for the new box, as I'd like to start downloading a lot more. I was looking at starting with 3 SAS drives in the 14-18TB range, but I don't know where to start with regard to evaluating which brands/models are good.

4349
 
 
The original post: /r/datahoarder by /u/Unsungghost on 2025-01-22 20:57:13.

Trakt.tv has long been my favorite place for tracking TV and movies that I have on Plex, and more importantly, what I don't have. Recently, they put a limit of 100 items on all types of lists and even on your own collection. What's more, you can't create new lists to just have like 20 lists be your collection. This makes the core functionality basically useless. Of course you could subscribe, but that is basically the price of a streaming service, and who wants another subscription?

So, I'm asking, does anyone have a good solution that is self-hosted? It would also be a high-priority feature if it would help me find things that I'm missing. That means if I want to get all top 250 IMDb movies, I can see which ones I already have. Or if I'm trying to get every Tom Hanks movie, it will show me the ones I'm missing.
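The "show me what I'm missing" feature described above boils down to a set difference between a wanted list and the titles already owned. A minimal Python sketch follows. Matching on lowercased, trimmed titles is an assumption made for illustration; a real tool would more likely match on stable IDs such as IMDb IDs. The function names and the short title lists are just sample data.

```python
def normalize_title(title):
    """Fold case and trim whitespace so trivially different titles match."""
    return title.strip().casefold()


def missing_titles(wanted, owned):
    """Return the wanted titles that are not in the owned collection,
    preserving the order of the wanted list."""
    owned_keys = {normalize_title(t) for t in owned}
    return [t for t in wanted if normalize_title(t) not in owned_keys]


# Example: a few Tom Hanks movies versus a small local library.
wanted = ["Big", "Forrest Gump", "Cast Away"]
owned = ["forrest gump ", "Apollo 13"]

missing = missing_titles(wanted, owned)
print(missing)
```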

4350
 
 
The original post: /r/datahoarder by /u/keenedge422 on 2025-01-22 20:55:07.

Was in the hospital this last week getting my gallbladder out. Finally was prepping for surgery and got talking about pc gaming with the anesthesia nurse because we'd just recently upgraded our gaming pcs and she asked "so did you spring for something like a 2TB NVME for all these games?"

"Oh, actually I went a little spendhappy and put in two 4TB NVMEs."

"Holy crap!"

"Yeah, I have a data hoarding issue."

"I guess I do, too. Not to sound like I'm trying to one-up you, but we just set up a 16TB NAS for media and it's already half full."

"Oh, neat. My media server is nearing a quarter petabyte."

"... a quarter-"

"petabyte. Yes."

"...ok, we're talking when you get to recovery."
