It's A Digital Disease!

This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

founded 2 years ago
6901
 
 
The original post: /r/datahoarder by /u/raphaeladidas on 2024-09-01 22:07:14.

I swear I've had Micro-B cables fail on me at a rate 1000% higher than any cable I've ever used since I got an Atari 800XL in the 80s. I want to punch Mr. Western Digital in the face for using that interface on his drives.

6902
 
 
The original post: /r/datahoarder by /u/ItalianMothMan on 2024-09-01 21:32:34.

I've got thousands of slides with some cool stuff on them, and I want to get them scanned. They're in different kinds of cases. Most of them are in hard plastic cases, some are glass, some are in paper. Basically they are all different shapes and I'm struggling to find a scanner to buy. I found one on Amazon that looks like it takes the plastic ones, but will I need a new one for the others? Help me, I'm technologically impaired.

6903
 
 
The original post: /r/datahoarder by /u/automaton11 on 2024-09-01 21:05:47.

Hey everyone. I figured someone here might be able to help guide me. I'm trying to mirror some pages from a forum at https://ampgarage.com and having an issue.

Here is an example of a page I am trying to mirror. If you scroll through, you can see that some posts include attachments which are unavailable unless the user is logged in, which my mirror reflects.

I signed in on Firefox and exported my cookies with the cookies.txt extension, which I passed to my httrack command, but the mirror still failed to get the attachments, showing the same red bar as if I wasn't signed in.

I did ask ChatGPT, which provided a number of possible alternate avenues, but I don't understand them well, so it will take me a while to investigate each possible solution. Since this seems like a relatively simple site, I figured someone on here might be able to give me more direct advice.

Here is the httrack command I used:

httrack "https://ampgarage.com/forum/viewtopic.php?f=5&t=32047" -O "/home/automaton11/amp_garage_mirror/test2/" "+*.ampgarage.com/" --sockets 1 --max-rate 50K -%c.2 --cookies /home/automaton11/Desktop/cookies.txt -r1 --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15"

NOTE: The site is unstable and has been going up and down for the past few days (why I want to archive it). The biggest issues seem to be with the 'Dumble Files' and 'Dumble Discussion' sections so if you need an example of the problem, here is a page that seems to have a lot more uptime.

Thanks a lot for your help
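
One thing that can be checked offline before re-running the mirror is whether the exported cookies.txt actually contains cookies for the forum's domain. The sketch below is only a diagnostic under assumptions: it reads a Netscape-format export with Python's standard `http.cookiejar.MozillaCookieJar`, `cookies_for_domain` is a hypothetical helper name, and it says nothing about how httrack itself consumes the file.

```python
import http.cookiejar


def cookies_for_domain(cookie_file, domain):
    """Return the cookies in a Netscape-format cookies.txt that belong to `domain`."""
    jar = http.cookiejar.MozillaCookieJar()
    # Session cookies may be exported with an expiry of 0, so keep
    # "discardable" and "expired" entries instead of silently dropping them.
    jar.load(cookie_file, ignore_discard=True, ignore_expires=True)
    matches = []
    for cookie in jar:
        host = cookie.domain.lstrip(".")
        if host == domain or host.endswith("." + domain):
            matches.append(cookie)
    return matches
```

Calling `cookies_for_domain("/home/automaton11/Desktop/cookies.txt", "ampgarage.com")` and printing each cookie's `name` and `expires` would show whether a session cookie for the forum was exported at all. An empty result would suggest the problem is in the export rather than in the mirroring tool.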

6904
 
 
The original post: /r/datahoarder by /u/watermons on 2024-09-01 19:50:31.

I tried looking through old posts but I’m getting lost with some of the recommendations. I need at least 256GB, but not necessarily 1TB. Can I get a drive for $60-$70?

I was thinking of the WD external but then I read a comment saying it could be better to buy an internal and put it in a case? If so, does the quality of the case matter?

HDD/SSD? CMR/SMR?

I am considering using Time Machine or SuperDuper or CCC.

Main concern is getting a full backup before the old computer fails. I have some backed up but I don’t have a total backup. Thanks

6905
 
 
The original post: /r/datahoarder by /u/AfterSwordfish6342 on 2024-09-01 16:44:24.

Hi guys.

I'm in the process of setting up a cluster of PCs for my homelab. I have 3 workstations with 2 HDDs each, and might add 1-2 more workstations and maybe a few hard drives as well (to have more storage for Jellyfin and pictures).

So now I'm looking for a good storage system that would fit my use case.

My goal is to have the following:

  • A way to pool all this storage across the different servers.
  • Something that Kubernetes can consume easily.
  • Ideally, I’d like to present all this storage as a single block device, even though the drives are spread out across different machines.

I have a separate backup solution for pictures and important files I don't want to lose. This is purely to be able to work with the data.

I'd really appreciate any experiences or suggestions you guys might have.

6906
 
 
The original post: /r/datahoarder by /u/oldassveteran on 2024-09-01 16:41:25.

6907
 
 
The original post: /r/datahoarder by /u/dpunk3 on 2024-09-01 15:53:41.

One argument for piracy I don’t see is saving the data off your already purchased disks so you can burn them later once the original disks fail and create a new set of disks that are fully functional.

6908
 
 
The original post: /r/datahoarder by /u/BigMickDo on 2024-09-01 15:49:24.

I've been scraping certain pages daily for the past year or so, nothing too crazy, but right now I have about 50 GB and over a million text files (JSON + HTML).

I have been lazy and haven't done anything with the data, but given the annoying number of files at this point, I think I need to zip them for archiving (reducing both the number of files and the size).

The only thing I need to store to keep going is the file names of the ones that I've successfully downloaded.

I'm thinking about writing a text file of that info, then automatically having the files themselves added to zip.

Looking for suggestions.

Things I'm considering:

  • Storing the file name + the file content as text in DuckDB
  • Storing in Parquet
  • Storing in a SQLite Archive
  • Just a zip file as compression and container

SQLite seems like a good solution, but how good is its compression? I know there are add-ons like https://sqlite.org/com/zipvfs.html
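
As a rough illustration of the "one container plus a list of downloaded names" idea, here is a minimal sketch that packs a directory into a single SQLite file with per-file zlib compression. It is not the official SQLite Archive format and not ZIPVFS: the `files` table and the function names are hypothetical, and compressing each file separately will usually compress less well than compressing many similar files together.

```python
import sqlite3
import zlib
from pathlib import Path


def archive_files(src_dir, db_path):
    """Store every file under src_dir in one SQLite database, zlib-compressed."""
    src = Path(src_dir)
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, data BLOB)")
    with con:  # commit the whole batch as a single transaction
        for path in sorted(src.rglob("*")):
            if path.is_file():
                name = path.relative_to(src).as_posix()
                blob = zlib.compress(path.read_bytes(), 9)
                # Names already archived are left untouched.
                con.execute("INSERT OR IGNORE INTO files VALUES (?, ?)", (name, blob))
    con.close()


def archived_names(db_path):
    """Return the set of file names already stored (the 'already downloaded' list)."""
    con = sqlite3.connect(db_path)
    names = {row[0] for row in con.execute("SELECT name FROM files")}
    con.close()
    return names


def read_file(db_path, name):
    """Return the original bytes of one archived file, or None if it is absent."""
    con = sqlite3.connect(db_path)
    row = con.execute("SELECT data FROM files WHERE name = ?", (name,)).fetchone()
    con.close()
    return zlib.decompress(row[0]) if row else None
```

With this layout the scraper only needs `archived_names(db_path)` to decide what to skip, so the original loose files can be deleted once they have been archived and spot-checked. The database file should live outside `src_dir` so it does not try to archive itself.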

6909
 
 
The original post: /r/datahoarder by /u/Party_9001 on 2024-09-01 15:34:40.

6910
 
 
The original post: /r/datahoarder by /u/pancakeforpresident on 2024-09-01 14:32:23.

I have a YouTube channel to share home movies with my family. How can I start preserving the edited and raw home movie footage? I have close to 10TB but it will always keep growing. Is there an online resource or do I need to buy a data center for my home?

I am new to this.

6911
 
 
The original post: /r/datahoarder by /u/tellmewhy24 on 2024-09-01 12:50:24.

I mainly archive stuff on cloud services but have never had a physical or more reliable way of storing my data. Any ideas?

6912
 
 
The original post: /r/datahoarder by /u/seekingadvice331 on 2024-09-01 11:56:46.

I am currently in the process of building a small home server. I just purchased a small (128GB) M.2 SSD as a boot drive and have been wondering what kind of disks I should get for storage. I saw a lot of discussions about various disks (HDDs) but can't really make a decision. The main idea of the server is to store videos and pictures and to offload data from our computers (maybe some games or other documents). The plan is to run the disks in RAID1 (I am aware of the 3-2-1 rule).

Also, there is a chance that I will be running some small game servers (like terraria, minecraft, etc) on the server, maybe some websites and what not.

What kind of disks should I get? So far I've learned to look for CMR drives instead of SMR. Seagate or WD? SkyHawk, Purple, Barracuda, IronWolf, does it really matter for my use case? I heard drives can live for anywhere from a day to tens of years, which is why I think a cheap drive (not NAS grade) is the way to go. The server is not gonna be running 24/7 for years; it might run for a couple of weeks straight, then maybe be turned off for a while, etc. The budget is kinda tight, so I was looking at 2TB drives to start out.

Thanks in advance.

6913
 
 
The original post: /r/datahoarder by /u/manzurfahim on 2024-09-01 11:06:45.

https://preview.redd.it/aeg92qr0j6md1.jpg?width=497&format=pjpg&auto=webp&s=7567232d086bdcddf9af398a634fc51a722fe4dc

Just wondering if anyone here is still trying to keep RARBG alive? I know we can't download new torrents from them, but judging from my upload, it seems there are a lot of users still trying to get files from the network. So much so that I had to move my RARBG torrents to SSDs; I was afraid my RAID would fail, the way 200+ torrents are getting accessed and uploaded.

I still have a lot of files from RARBG that I need to find torrent files for, so that I can seed them again. I got a lot of files from RARBG when it was alive, so I'm just trying to give back what I can to the community.

6914
 
 
The original post: /r/datahoarder by /u/wholeloadaquestions on 2024-09-01 10:15:44.

Hi folks, what software is the go-to recommendation for backing up Windows 11? After 2 recent blue screen scares on my laptop, I don't want to risk playing a game of chance with my data on it.

I have a 12 bay Synology NAS and I have set it up now to sync documents in real time between my laptop and the NAS using Synology Drive Client.

Is this best practice? Can it only be used for documents? Can it also back up things like settings, programmes and their settings?

From a previous data recovery adventure (about 8 or 9 years ago) I have an AOMEI professional license. I installed it again yesterday on my laptop and ran a backup using it, but I don't think it has a real time sync option.

By real time sync, I mean on detection of a change to a file that file is freshly backed up.

I do have 1TB of cloud backup with OneDrive, I guess I could use this? But 1) I'm not sure if it can do what I want and 2) I'd rather use my own NAS than the cloud.

Wants: a) Real time sync b) All laptop documents, folders, software, settings, preferences

Any help is appreciated - thanks

6915
 
 
The original post: /r/datahoarder by /u/joeboe12345 on 2024-09-01 09:57:13.

Hi

First of all, sorry English is not my native language.

Input

  • 1200+ old printed photos for a few generations of my family.
  • Most of them are black and white. Quality and sizes vary.
  • No time restriction to complete the project.
  • No big limitations for budget (aim < 2000$).
  • Don't want to use "scanning" services. I want to do it by myself.

Goal

  • Want to do it once (and do it correctly) and then not think about it again for the rest of my life.
  • I plan to use the maximum possible quality for scanning (don't care about the image size).
  • I plan to have 2 copies of each image: 1 - Lossless TIFF (archive). 2 - Lossy JPG for day-to-day use.

Strategy

  • Plan to use an Epson Perfection V750 Pro (I found a used one in my country in very, very good condition; the previous owner did the same type of scanning).
  • For lossless TIFF scanning, I'm planning to use the maximum available 6400 DPI.
  • 48-bit / 16 bits per channel for RGB for color photos.
  • 24-bit / 8 bits per channel for RGB for black and white photos.
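
To put the 6400 DPI / 48-bit setting in perspective, the uncompressed pixel data of a scan can be estimated from print size, resolution, and bit depth. The sketch below is plain arithmetic under assumptions: the 4x6 inch print size is only an example (the post says sizes vary), `uncompressed_scan_bytes` is a hypothetical helper, and TIFF headers and any lossless compression are ignored.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel data size of a scan, ignoring TIFF headers and compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: a 4x6 inch print at 6400 DPI, 48-bit colour.
size = uncompressed_scan_bytes(4, 6, 6400, 48)
print(f"{size / 1e9:.1f} GB")  # prints "5.9 GB"
```

At that rate, 1200 prints of this size would come to roughly 7 TB of uncompressed data, so trying a few test scans at lower resolutions before committing to the maximum may be worthwhile.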

Q1: What do you think about the scanner and scanning quality options?

Q2: Do you have any recommendations for TIFF parameters?

Q3: What do you think in general?

Thank you.

6916
 
 
The original post: /r/datahoarder by /u/Sugardaddy_satan on 2024-09-01 09:21:10.

https://preview.redd.it/z0zinm3b16md1.png?width=653&format=png&auto=webp&s=4fdda802b39b2b151f4db9519ce1e455375b9782

I don't understand what most of these mean. Is there something to worry about for this hard disk? Which parameters can tell me if the disk is failing and I need to replace it?

6917
 
 
The original post: /r/datahoarder by /u/Coolerthanicecubez on 2024-09-01 07:53:15.

I have not used it in years, but today I plugged it in and tried to copy files onto it. The lights turn on, the machine begins to hum, and the computer even recognizes it. But it won't show up for transferring files. Please tell me this isn't the end. The one time I actually need it!

6918
 
 
The original post: /r/datahoarder by /u/redditAgain3x on 2024-08-31 22:24:08.

Hello DataHoarders! I am trying to figure out how to set up the simplest local offline data backup solution I can, with a focus on medium to longer term data integrity (preventing data rot), mainly from macOS and other OSes (using ZFS or something similar, and as automated as possible). The data ranges from highly active to being in need of longer term archiving.

Please forgive my ignorance as I'm new both to reddit and this topic, and comp sci isn't my calling. I've been researching this topic as much as I can but am finding it a rather complicated and confusing rabbit hole and am trying to come up with a workable solution for it as soon as I can... my most limited resource on this project is time.

Due to iOS development (and wanting to avoid fighting with 'non native' OS/FS/hardware) I mainly have to use macOS and APFS on one of the computers (Mac Mini M2) that produces the data. How do I get this to work with ZFS (or something that has similar capabilities for data integrity)? Does a DAS or NAS or neither make more sense for this? Is there a way to use ZFS in this context without building an entire separate computer (with RAID[X], ECC, etc.) or is that inevitable? If I can't use ZFS on the host computer then is DAS already out of the question? If DAS is a viable option how do I use it with the Mac while avoiding USB (causes problems with ZFS, right?). If a NAS makes more sense, how do I use it as 'offline' and as securely as possible to protect against malware, etc? Though I'd prefer not to, if I do have to build a separate computer for this, what would be the fastest, easiest 'min-spec' setup details I should focus on to get it working and sufficiently usable?

As far as I'm aware using APFS and Time Machine (and/or SD or CCC) doesn't provide nearly the same data integrity functionality that ZFS does (e.g. not even checksums for user data, and/or makes it more obscure or harder to check yourself). I was originally hoping I could just do manual backups to some external disks, but once I became aware of how important data integrity / file fixity is (and how awesome something like ZFS is compared to other tools), I can't 'un-see' that now. I then thought maybe I could just do some manual backups using checksums but that seems like a horribly slow and inefficient long term solution, especially for active data that will most likely keep growing in size. Zooming out this would be part of a 3-2-1 or x-x-x strategy for me with varied media as suitable (SSD, HDD, and then something like archival grade optical or magnetic tape if needed), but I want to try to get the data integrity piece of this right.

I greatly appreciate any feedback, guidance, or wisdom you're willing to share with me on this. I can tell from these forums you guys have TONS of knowledge and experience with this stuff that I don't have anything close to.
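
For reference, the "manual backups using checksums" idea mentioned above can be sketched in a few lines. This is only an illustration under assumptions, not a substitute for what ZFS does: it detects silent changes but cannot repair them, the function names are hypothetical, and the manifest would need to be stored somewhere safe alongside the backup.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the names whose content is missing or no longer matches the manifest."""
    root = Path(root)
    damaged = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(name)
    return damaged
```

Building the manifest at backup time and running `verify` against the copy later gives a basic fixity check. Every verification pass rereads all the data, which is the slowness the post anticipates for large, growing datasets.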

6919
 
 
The original post: /r/datahoarder by /u/jamiw_ on 2024-08-31 22:06:52.

Hello! I want to have multiple hard drives for data redundancy in my homemade server (which I also use for my Nextcloud instance, among other things). I don't plan on buying a NAS, but I am looking for an external 2+ bay hard drive enclosure that doesn't use USB, since my server doesn't have USB 3.0. I don't want anything over $150-ish, as at that point I would be better off buying a NAS, which I don't want to do. Any recommendations?

6920
 
 
The original post: /r/datahoarder by /u/teenagemutntredtjnky on 2024-08-31 21:53:22.

Gemini doesn't recommend Rapidgator. Is there a cloud site dedicated to adult films? Drive, Dropbox, etc. might terminate the whole account, and I don't wanna waste my time uploading them only to have them deleted by the site.

6921
 
 
The original post: /r/datahoarder by /u/Empty_Tax3607 on 2024-09-01 02:03:29.

Hello, I’m sure this question has been asked in many different forms, but I can’t find anything that meets my specific needs.

I have a DVD writer and want to copy just the main movie from several DVDs (no need for the menu, scene selection, etc.) into MP4.

I want to do this quickly and for free on a laptop with an AMD graphics processor (I think). I’m unsure of the concept of codec but want to have the easiest to play on any device, if that makes any sense.

Not gonna be super picky on a degradation in movie quality so long as it isn’t ridiculously noticeable.

On MKV, it said there was licensing software not allowing the download, so I assume I need some software to counteract that.

Thank you in advance to anyone who is willing to help!

6922
 
 
The original post: /r/datahoarder by /u/omgitsadad on 2024-09-01 00:15:46.

TLDR: My current storage need is 46TB, growing at 16TB / year. I would like to buy something that will serve me well for the next 5 years. What setup would you recommend?

Details:

I shoot photos & videos as my primary hobby, and my storage needs keep growing.

Critical Storage: 16TB on NVMEs, backed up to cloud, growing at 4-8TB / year.

Less Critical Storage: 30 TB on 3 external HDDs (16 TB drives), not backed up to cloud, growing at 8-10TB/year

Still have an old QNAP 469L with 4x4TB drives, but not actively using it anymore.
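
The five-year target implied by these figures is simple linear arithmetic, sketched below. It assumes growth stays constant at the stated rates and counts only the data itself, with no allowance for redundancy, backup copies, or filesystem overhead. `projected_tb` is a hypothetical helper.

```python
def projected_tb(current_tb, growth_tb_per_year, years):
    """Linear projection of storage need, assuming a constant yearly growth."""
    return current_tb + growth_tb_per_year * years


# TLDR figures: 46 TB today, growing 16 TB per year, 5-year horizon.
print(projected_tb(46, 16, 5))  # prints 126

# Range from the detailed figures: growth of 4+8 = 12 to 8+10 = 18 TB per year.
print(projected_tb(46, 12, 5), projected_tb(46, 18, 5))  # prints 106 136
```

So a setup sized for five years would need to hold somewhere around 106 to 136 TB of data before any mirroring or parity is counted.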

6923
 
 
The original post: /r/datahoarder by /u/mookie8 on 2024-08-31 22:32:50.

I know it's silly, but I'm cataloguing the titles and artists in a spreadsheet because the nerd in me wants to know what I listened to the most, and also to see what classics I've just forgotten over the years. Sitting on the schoolbus listening to Kirsty MacColl, Pogues, Sublime etc. are core memories.

I got a portable CD player for my MacBook, but all the info about the CD and its tracks shows today's date, with no track names. I was kind of hoping it would show the date that the CD/track was first created, just because I'm curious, but again, it has no info.

My dad still has the old family Windows computer kicking around (the one I used to burn some of the CDs), and I'm tempted to see if the CDs show any of the original data. Do you think it's feasible? It would be a real pain to arrange.

If this is the wrong subreddit to post to, let me know and I'll remove. :)

6924
 
 
The original post: /r/datahoarder by /u/Tystros on 2024-08-31 21:43:08.

With M.2 SSDs now being cheaper per TB than SATA SSDs, how come there is no proper external M.2 SSD enclosure with integrated RAID1 (mirroring)?

Basically, just a simple fast external SSD with automatic mirroring, because I like to have the security that a drive failure doesn’t mean I have to restore 4 TB from my cloud backup, swapping a broken drive is much quicker.

I'm generally surprised about anyone who can sleep with having an external drive that's not automatically mirrored somehow, because all drives eventually break, and something like an external dual-M.2 USB drive with integrated mirroring would be super small and portable, and offer good safety for the data.

I found a total of 1 product that fits this description, but it seems to be from a no-name Chinese company and it has relatively bad reviews on Amazon: the “MAIWO NVMe RAID Enclosure 20Gbps USB3.2 GEN2x2, RAID 0/RAID 1/PM/Large, 8TB Capacity Tool Free Aluminum, Dual Bay RAID Enclosure for M.2 NVMe PCIe M-Key 2230/2242/2260/2280mm SSD”.

So that likely is not a good choice.

I have an equivalent of this as a SATA SSD version already from IcyBox (https://icybox.de/product/externe_speicherloesungen/IB-RD2253-C31), and that works well, but it really feels like a waste of performance to still buy SATA SSDs when M.2 NVMe SSDs cost significantly less per TB and are way, way faster.

6925
 
 
The original post: /r/datahoarder by /u/WannurSyafiqah74 on 2024-08-31 20:34:33.

Best subreddit I could find for help with storage. Hard drive brand is Kingston, I got one that has 526 GB.

I got a hard drive yesterday, wanted to move old stuff I archived into said drive today, turns out that the stuff within said folder couldn't work. Weird. So I put it back and the files in it are all gone. Oh god...

I didn't back the folder up, so I wonder if it's possible to restore everything. Help?
