It's A Digital Disease!


This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

6051
 
 
The original post: /r/datahoarder by /u/rohithkumarsp on 2024-10-03 10:24:47.

ORICO 5 Bay USB 3.0 3.5 inch External Hard Drive Enclosure Support 80TB (5 x 16TB) Aluminum Alloy HDD Enclosure with Fan / 150W / Disk Data Storage (Hard Drive Not Included) https://amzn.in/d/7yGv2gm

There's one with RAID as well: ORICO 9558RU3-V1

ORICO 5 Bay Raid Hard Drive Enclosure Aluminum USB 3.0 to SATA Hard Drive Tray Less Docking Station Max Up to 100TB Support RAID 0/1/3/5/10/JBOD Single Mode, Designed with Safety Lock-9558RU3 https://amzn.in/d/7nHSrkY

I have the option to sell off the non-RAID version and replace it with the RAID version. Or should I build my own setup rather than trusting this enclosure?

6052
 
 
The original post: /r/datahoarder by /u/Korgeman on 2024-10-03 09:52:42.

Hi,

Super new to the data hoarder thing. I have under a dozen WD/Seagate drives and I want to give RAID a try. Don't really need a NAS as I don't want to make all my files accessible online.

I've read that the LSI 9300 seems to be the best card to do this, but after some searching I find there are different models. For example, I see LSI SAS 9300 16I, LSI Broadcom SAS 9300-8i and LSI 9300 8e listed on Amazon. What's the difference between these cards, and which one best suits my needs? Also, is there a huge difference between 6Gb/s and 12Gb/s speeds in terms of performance?

6053
 
 
The original post: /r/datahoarder by /u/TheBlacktom on 2024-10-03 09:16:46.

Original Title: I suspect AI text generators accessible to everyone will spam the internet with marketing & propaganda indistinguishable from other content. What solutions are there to archive the pre-ChatGPT internet? I think the quality 5 years ago is likely better than what the internet will be like in 5 years.


The goal is a version of the internet without AI generated text.

Is there a quicker version of Internet Archive? The main issue with it is it's slow.

Maybe something that we can download as our own archives of selected websites, forums?

Or a browser plugin that shows when a website/article was written, with a button that instantly takes you back to a pre-AI version of the site? Same for forums and Reddit. For example, highlight comments that were written after AI text generators became widely accessible.

Or a Wikipedia mirror that shows articles as of 2020 for example?

6054
 
 
The original post: /r/datahoarder by /u/Buhdurkachomp on 2024-10-03 08:06:56.

I was looking into some drives for backing up my personal stuff, and one I was interested in was the 8TB FireCuda external drive. I saw somewhere that they are just BarraCudas, so I looked it up again and found someone who said they bought one to shuck and inside was a BarraCuda ST8000DM004. This was interesting because I had the exact same drive picked out as an alternative. So do the FireCuda external 8TB drives really use this drive? The BarraCuda ST8000DM004 runs at 5,400 rpm, but the internal FireCuda drives are listed at 7,200. It seems strange if they aren't using actual FireCuda drives in the external FireCudas.

I was also wondering, would an Exos X16 12TB drive be good for using as a desktop drive? I plan to put music, pictures, videos, music production programs and games on it. The games will probably take up the most space, but I don't intend to play the games from the drive, just to store them on it and move them to my SSD when I decide to play.

So I'm wondering if the external FireCuda is a bad idea, or would an internal FireCuda be better? I don't exactly know how to hook an internal one up, but I guess I can YouTube it. Or should I just use the Exos? I was also considering one of those renewed WD Ultrastar 10TB drives for $69.99. I keep reading that enterprise drives are longer lasting, and I'm not going to use them anywhere near a commercial level, so I'm hoping that would mean they would last even longer.

You guys are probably a lot more knowledgeable than me, and hopefully y'all can help me make a better choice with my drive purchases.

6055
 
 
The original post: /r/datahoarder by /u/Selethor on 2024-10-03 07:08:29.

Hello,

I recently installed RaiDrive to use an FTP server as a local resource and it has been working great. Much better than I expected, actually. My only issue is that when I create a hyperlink to a file on the remote drive, it includes the current username in the link. Obviously I can manually change the username and make it work, but is there a way to make the links uniform for all users?

6056
 
 
The original post: /r/datahoarder by /u/brotherxim on 2024-10-03 05:48:49.

Hi all, I recently managed to get my hands on an HBA card and am looking to add it to my server for more storage. However, this card has an internal SFF 8088 plug and an external 8088 plug. For the life of me I can't seem to find SFF 8088 to SAS breakout cables. Are they a thing? Alternatively, would there be another way, through adapters (?), to connect the drives to this board, or do I need to source a different board instead?

6057
 
 
The original post: /r/datahoarder by /u/Aggressive-Bath-6190 on 2024-10-03 05:22:43.

Just so you know, I do have my stuff backed up. I'm running a 256GB SanDisk Ultra card on my phone right now with my photos and videos. Just want to hear others' experiences.

6058
 
 
The original post: /r/datahoarder by /u/rcg8tor on 2024-10-03 04:22:16.

Does anyone have a recommendation for a 4 or preferably 5 bay 3.5" SATA HDD dock? Ideally I want the following.

  • Dock, not enclosure, so I don't need to worry about airflow. Also less expensive.
  • Preferably USB 3.1 Gen 2 (10Gbps). Anyone know if this could be saturated with software RAID 0 (i.e. mdadm)?
  • Ability to read S.M.A.R.T. data from the drives
  • Power Disable compatible
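
A rough back-of-the-envelope sketch for the saturation question in the list above. The 10 Gbps line rate and 128b/132b encoding are properties of USB 3.1 Gen 2; the per-drive throughput and the extra protocol overhead are assumptions picked for illustration, not measurements of any particular dock or drive.

```python
# Rough estimate: can N striped HDDs saturate a USB 3.1 Gen 2 link?
# The per-drive rate and protocol overhead below are assumptions, not measurements.

USB_LINE_RATE_GBPS = 10.0          # USB 3.1 Gen 2 signalling rate
ENCODING_EFFICIENCY = 128 / 132    # 128b/132b line encoding used by Gen 2
ASSUMED_PROTOCOL_OVERHEAD = 0.10   # hypothetical allowance for protocol/bridge overhead
ASSUMED_DRIVE_MBPS = 200.0         # hypothetical sustained sequential rate per HDD


def usable_link_mbps() -> float:
    """Approximate usable payload bandwidth of the link in MB/s."""
    raw_mbps = USB_LINE_RATE_GBPS * 1000 / 8  # 10 Gbps is 1250 MB/s before overhead
    return raw_mbps * ENCODING_EFFICIENCY * (1 - ASSUMED_PROTOCOL_OVERHEAD)


def drives_to_saturate(drive_mbps: float = ASSUMED_DRIVE_MBPS) -> int:
    """Smallest number of striped drives whose combined rate reaches the link limit."""
    link = usable_link_mbps()
    count = 1
    while count * drive_mbps < link:
        count += 1
    return count


if __name__ == "__main__":
    print(f"usable link bandwidth: about {usable_link_mbps():.0f} MB/s")
    print(f"drives needed at {ASSUMED_DRIVE_MBPS:.0f} MB/s each: {drives_to_saturate()}")
```

Under these assumed figures, five drives at 200 MB/s each fall a little short of the link limit, while five drives at 250 MB/s each would reach it. Real results depend heavily on the dock's bridge chip and on where on the platters the data sits.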

Thanks!

6059
 
 
The original post: /r/datahoarder by /u/hotdogsoup-nl on 2024-10-03 03:21:26.

Why YSK: because if there's ever a cyber attack, or a future government censors the internet, or you're on a plane or a boat or camping with no internet, you can still access like the entirety of human knowledge.

The full English Wikipedia is about 6 million pages including images and is less than 100GB.

Wikipedia themselves support this, and there's a variety of tools and torrents available to download compressed versions. You can even download the entire dump to a flash drive as long as it's formatted as exFAT.

The same software (Kiwix) that lets you download Wikipedia also lets you save other wiki-type sites, so you can save other medical guides, travel guides, or anything you think you might need.

6060
 
 
The original post: /r/datahoarder by /u/Clone_Two on 2024-10-03 03:11:25.

In short, there's this YouTube channel with a bunch of videos I'd like to download. While all their videos have since been privated, the Wayback Machine has more than enough profile snapshots that you could piece together a solid list of a good chunk of their videos (not all, but good enough for me). Does there exist any tool for automatically extracting these URLs and then passing them through to yt-dlp to download? I know there are some extensions that can automatically extract URLs from your current page, but this is spread between many different snapshots, so that'd take too long for me to do manually.
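
One possible way to approach the extraction step, as a minimal sketch. It assumes the snapshot pages have already been saved locally as HTML text (fetching them from the Wayback Machine is not shown), and that video links appear in the usual `watch?v=` or `youtu.be/` forms. The function names and the `urls.txt` file name are made up for illustration.

```python
import re
from typing import Iterable, List

# Matches the 11-character video ID in watch?v=... and youtu.be/... links,
# including links that the archive has rewritten with its own prefix.
VIDEO_ID_RE = re.compile(r"(?:watch\?v=|youtu\.be/)([A-Za-z0-9_-]{11})")


def extract_video_urls(snapshot_pages: Iterable[str]) -> List[str]:
    """Collect unique watch URLs from saved snapshot HTML, in first-seen order."""
    seen = set()
    urls = []
    for html in snapshot_pages:
        for video_id in VIDEO_ID_RE.findall(html):
            if video_id not in seen:
                seen.add(video_id)
                urls.append(f"https://www.youtube.com/watch?v={video_id}")
    return urls


def write_batch_file(urls: List[str], path: str = "urls.txt") -> None:
    """Write one URL per line, the format yt-dlp reads with --batch-file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")
```

The resulting file can then be handed to yt-dlp with `yt-dlp --batch-file urls.txt`. Whether a privated video can still be retrieved is a separate question that this sketch does not address.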

6061
 
 
The original post: /r/datahoarder by /u/jrbearboy on 2024-10-03 02:48:24.

So I've been hoarding my data for years and years, I probably still have several school papers floating around on random drives.

Over the years, as happens, things fail. The dreaded tick tick sounds of something not being quite right. The quiet scream of a drive finally deciding it's lived too long. And so I've backed things up and moved things around.

So, naturally problems arise. I know I must have that file somewhere, but where???? Strange, I could have sworn I had at least 20 more pages done on that writing thing I haven't looked at in years??? Wow, didn't I already download that PDF like eight times already? Why can I never find it?

So, I'm looking for a way to basically scan in my dozens of external drives, old internals, USB sticks, and what have you, and create an index of things. The types of files would be everything from old Minecraft worlds to videos to word docs to PDFs to mp3s.

At the very least, I'm looking for a program that doesn't need me to have all the drives plugged in at the same time to compare stuff, because if nothing else I can't even imagine how you would plug in so many things at once. Just index what's on drive A and drive B and tell me if the same file pops up. Then index drive C and tell me if anything matches, and so on...

Ideally, I want a program that can index and scan the data, not just file names. And can tell me "hey, drive A has a file called ABCD, and drive B has a file called EFGH, but it looks like it might be the same file", because I (know I) might have changed names on the same files over the years. Or downloaded something from 2 different sources. Being able to find different versions of a file would also be great, like a Word doc where I for some reason have 5 drafts saved.
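
The one-drive-at-a-time indexing described above can be sketched with a small local database. This is a minimal illustration, not an existing tool: it stores a SHA-256 hash of each file's contents in SQLite, so drives can be indexed separately and compared later, fully offline. The table layout, function names, and drive labels are all hypothetical. Exact hashes only catch byte-identical copies (renamed or re-downloaded files), not edited drafts or similar-looking documents.

```python
import hashlib
import os
import sqlite3


def open_index(db_path):
    """Open (or create) the local index database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "drive TEXT, path TEXT, size INTEGER, sha256 TEXT, "
        "PRIMARY KEY (drive, path))"
    )
    return conn


def hash_file(path, chunk_size=1 << 20):
    """Hash file contents in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_drive(conn, drive_label, root):
    """Record every readable file under root, tagged with a user-chosen drive label."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                row = (drive_label, os.path.relpath(full, root),
                       os.path.getsize(full), hash_file(full))
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            conn.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)", row)
    conn.commit()


def find_duplicates(conn):
    """Return {sha256: [(drive, path), ...]} for content stored more than once."""
    rows = conn.execute(
        "SELECT sha256, drive, path FROM files WHERE sha256 IN ("
        "SELECT sha256 FROM files GROUP BY sha256 HAVING COUNT(*) > 1) "
        "ORDER BY sha256, drive, path"
    ).fetchall()
    dupes = {}
    for sha, drive, path in rows:
        dupes.setdefault(sha, []).append((drive, path))
    return dupes
```

A typical session would call `index_drive(conn, "drive A", "/mnt/a")`, unplug that drive, index the next one under a different label, and then call `find_duplicates(conn)`. The mount path here is only an example.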

Best case would maybe be some kind of AI tool that could look through the files and take notes on each one, then as new stuff gets indexed, it goes back to its notes to flag stuff that's similar. And then I can also look over these notes to know, generally speaking, what I have where. Massive bonus points if it worked on not just Word docs and PDFs but pictures and videos as well.

Must have: I don't want anything needing to connect to Wi-Fi or cloud. Some of these files are things like old doctor notes, tax stuff, banking info. It's all staying under my roof.

What I have: I have auxiliary laptops that I can just set up to run in the background so time isn't an issue, one is Linux and the other Windows 10. I would like it not to be monstrously expensive, but if the program is good enough, I'd spend maybe $100 max to finally clean up and know where my files all are.

If anyone can suggest a super duper magic box that can do all this, and runs 8000% faster than any laptop I could think of, and can do 12 drives at once, and all I need to do is plug and play, that I might be willing to spend considerably more on.

So, any advice anyone wants to give would be greatly appreciated. And if you know of any programs that can do what I'm asking, please let me know. Or any hardware advice. Thanks for reading.

6062
 
 
The original post: /r/datahoarder by /u/yecnum on 2024-10-03 02:21:58.

Trying to figure out what solutions to try/use for my needs. I need an offsite remote backup solution (around 10TB of data, ideally just one solution) that can automagically back up macOS, Windows, and FreeNAS. Data would be encrypted locally, sent to servers (a combination of leased Linux servers in a data center, S3, and PC/Mac/FreeNAS computers at an offsite location locally), and stored encrypted, of course. Ideally, there is a control panel/dashboard I can bring up to see the status of backups, servers, etc.

In the past I have used CrashPlan, which worked nicely, but their Mac solution is pretty awful. I've also used Arq Backup, which works well and is super reliable, but has no dashboard to look at all the clients. So, I'm currently using a combination of Arq and Duplicati.

What do you all use? I don't mind a paid solution if it's "affordable" and does everything I want, but ideally, I use something I host myself. thanks!! <3

EDIT: forgot to add, the data source is around 10TB.

6063
 
 
The original post: /r/datahoarder by /u/Crafty_DIY on 2024-10-03 01:52:25.

Does anybody know of a large format flatbed scanner that is available to purchase? Money is not an object.

I would like a flatbed, not a feeder, that does 18x24. Even a 12x24 would be great. The largest that I can seem to find is the Epson Expression 10000 XL, which is only 12x17.

I want to avoid copy-standing and want a really high integrity image from a flatbed scanner.

6064
 
 
The original post: /r/datahoarder by /u/Chance-Permit4247 on 2024-10-02 23:28:55.

What’s the best way to approach this? I’ve been wanting to do this for a few months now. I know a lot of C# and Python, and was going to try and build my own downloader for this purpose (just my saved/liked videos, of which I have a lot). Does anyone have recommendations or thoughts?

6065
 
 
The original post: /r/datahoarder by /u/moksha04 on 2024-10-02 21:52:11.

Here's a reason to stay away from G-RAID / SanDisk Pro products:

In late 2022, we purchased a SanDisk Professional 160TB G-RAID Shuttle 8. A few months later, in February 2023, the enclosure failed. We're a film production company, so this became an emergency. We had 140TB of critical data stored on the drives and downtime wasn’t an option.

We contacted Western Digital support to try and place an advance RMA. We attempted this twice, once while online with a customer representative. Those attempts didn't work due to issues with their portal. We immediately bought an identical enclosure and transferred all our drives into it, and with that the array was back online.

WD wanted our faulty enclosure back with the original drives in it -- which is a tall order considering we had 140TB of stuff on them. After a lot of back and forth, we agreed with them to just send the faulty enclosure back for repair under warranty—no need to send the drives.

(do they actually expect users to extract 140TB of sensitive data before wiping? That would have required buying TWO additional enclosures, one to read the drives with and the other one to back 'em up.)

Fast forward a month, and our warranty claim was denied. The reason? “You didn’t send the drives.” After reminding them that they had told us to send just the enclosure, they finally agreed to replace the enclosure if we sent it along with the new, identical drives from the new enclosure. We did that.

Two months went by, and again, our claim was denied... Now, they claimed we had sent the “wrong drives” or that the enclosure had been “tampered with.” What a joke.

Six months of chats and calls with support, during which we were repeatedly assured that our issue would be “escalated to the warranty team” and that we would receive a call back within a day or two. This led to absolutely nothing.

Now, after all this time, we finally heard from the actual warranty department. They are asking for the serial numbers of all the old drives we've kept (currently stored in a location that's hard for us to access), proof of purchase, and other documentation. Alternatively, they just informed us—again, after six months of back-and-forth—that we could send a letter with additional documentation to request permission to keep our drives. After six months they told us what the correct procedure was.

They sell a pro-oriented, expensive product that has design flaws. Couple that with their atrocious customer service and unreasonable warranty procedures and it's a shit-show. We spent $12,000 and are left with one working drive, when all we needed was a fix or replacement of a defective enclosure...

Beware of SanDisk Pro drives.

6066
 
 
The original post: /r/datahoarder by /u/QING-CHARLES on 2024-10-02 21:16:28.
6067
 
 
The original post: /r/datahoarder by /u/eastcoastzen94 on 2024-10-02 20:26:07.

How do you guys manage your tabs? I've always got so many open, usually things I want to download, articles to read, etc. It's overwhelming.

6068
 
 
The original post: /r/datahoarder by /u/makemeking706 on 2024-10-02 20:21:17.

Hi all, I built my own TrueNAS server some time ago just to back up my desktop and scratch the computer building itch. To get the project off the ground I used two 8TB drives in a mirrored vdev.

At this point, the drives are getting full and the server has become so much its own thing that it needs its own backup. The data is mostly media, but aside from movies there is also a lot of difficult-to-replace/irreplaceable stuff.

I am trying to figure out where to go from here, but there are so many considerations that I am having trouble making decisions.

For bigger drives, I am thinking of upgrading to 14TB drives at least, but am unsure if I should move beyond the simple mirror to a configuration like RAID5. Since it is mainly just my backup, uptime is not a super huge concern, so maybe a mirror is still the best?

For the backup of the backup, should the disk configuration be as complex as the system it's backing up? I.e., since I have a mirrored vdev currently, should my backup be on a mirrored vdev as well? At the same time, would it be a bad idea to use something like a couple of MyBooks or large single disks to back up the NAS? For example, getting two 24TB drives that are stored in two different off-site places and alternating full backups of the NAS between them periodically.

6069
 
 
The original post: /r/datahoarder by /u/Devilslave84 on 2024-10-02 19:12:18.

I'm new to hoarding and figured I would pick up some HDDs and get a NAS. I came across a lot of 20 WD Golds for 100 + 68 shipping and handling. It took about a week to arrive, and I tested all 20. All seem to have roughly 11k hours on them but otherwise seem to be in like-new condition. I don't think I will need all 20, so I will likely sell 10 of them. Also, HD Sentinel says they're all in perfect condition with more than 1000 days remaining.

My questions: Are these a good brand of HDD? Did I get ripped off or get a decent deal? What NAS would you recommend? And where would I be able to sell the other 10 HDDs I don't need? My friend said something about a Mediasonic NAS, but I'd like to get your thoughts before I buy a NAS. Thanks, I appreciate any help.

6070
 
 
The original post: /r/datahoarder by /u/seronlover on 2024-10-02 19:02:16.

In the past I used gallery-dl, but recently I keep getting this error, even with new accounts:

"[twitter][error] AuthenticationError: Login rejected as suspicious"

I guess they also started blacklisting IP addresses, so I will try that next, but I was just curious how you people handle archiving from there?

6071
6072
 
 
The original post: /r/datahoarder by /u/Green-Scratch-1230 on 2024-10-02 18:27:28.

What are everyone's thoughts on this unit?

https://www.kickstarter.com/projects/1945743381/ut2-redefining-portable-storage-solutions

I was considering backing it to do a review on the unit.

It seems too good to be true, right?

Apparently it's an all-in-one device that runs a Linux OS and seems to have tons of features.

6073
 
 
The original post: /r/datahoarder by /u/D-Alembert on 2024-10-02 17:37:31.

Original Title: Flash/SSD loses data when the charge slowly bleeds off bits over years. When you periodically plug in a USB drive or an SSD, does anyone know (with certainty) what processes will replenish the charge of every bit of data on a drive, to set the entire drive's storage up to last another few years?


This information has been infuriatingly hard to find. The vague suggestions I've found so far suggest that it depends; for a simple device like a thumbdrive or SD card, you probably have to read (and write?) every bit on the drive to replenish their charge level, but an SSD with a high-end management system might replenish everything simply when it gets powered up. (If so, is that instantaneous, or is it a background process that takes a while? How would you find out whether your model of SSD does what?)

Most discussion is rumor and guesswork, but this seems like something we should KNOW about.

Does anyone have proper knowledge or good sources?

6074
 
 
The original post: /r/datahoarder by /u/Minute-Session8703 on 2024-10-02 17:13:03.

https://preview.redd.it/q9rv4gelldsd1.png?width=988&format=png&auto=webp&s=68335dda3eb111fa3b630f47a5bff59f1545df06

Hey everyone,

I’m looking to upgrade the storage on my NAS and want to get a durable SSD. The server won’t be running 24/7, but reliability and longevity are still really important to me. It’ll be used for a home server setup, so I’m looking for an SSD that can handle occasional heavy read/write loads. I don't care much about speed, but I'm not considering HDDs. I can choose from the list in the photo.

I’m considering the Kingston KC3000 because of its high TBW rating. Personal experiences with SSDs that have worked well for your NAS (especially in setups that aren’t always running) would be much appreciated!

Thanks in advance!

6075
 
 
The original post: /r/datahoarder by /u/madhurw on 2024-10-02 16:54:16.

Hey,

I work for a client that wants to analyse what people are saying about their brand on reddit. They want to track multiple metrics/keywords across subreddits and need a 24h/weekly/monthly breakdown.

What would be the best way to get this data from reddit? I can see apps like Gummy Search providing a similar functionality, so I'm sure this can be done. What is missing?
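
Once posts or comments have been collected by whatever permitted means, the 24h/weekly/monthly breakdown is a small aggregation step. The sketch below assumes each item is a dict with `subreddit`, `text`, and `created_utc` (a Unix timestamp) keys. That shape is a hypothetical convention chosen for illustration, and fetching the data from Reddit is deliberately not shown.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_key(created_utc, granularity):
    """Map a Unix timestamp to a day, ISO week, or month label (UTC)."""
    ts = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    if granularity == "day":
        return ts.strftime("%Y-%m-%d")
    if granularity == "week":
        year, week, _weekday = ts.isocalendar()
        return f"{year}-W{week:02d}"
    if granularity == "month":
        return ts.strftime("%Y-%m")
    raise ValueError(f"unknown granularity: {granularity}")


def count_mentions(posts, keywords, granularity="day"):
    """Count posts mentioning each keyword, keyed by (subreddit, keyword, bucket)."""
    counts = Counter()
    for post in posts:
        text = post["text"].lower()
        key = bucket_key(post["created_utc"], granularity)
        for keyword in keywords:
            # Simple case-insensitive substring match; no stemming or sentiment.
            if keyword.lower() in text:
                counts[(post["subreddit"], keyword, key)] += 1
    return counts
```

Substring matching is the crudest possible metric. A real brand-tracking setup would likely need word-boundary matching and some handling of misspellings, which this sketch leaves out.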
