It's A Digital Disease!


This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

8251
 
 
The original post: /r/datahoarder by /u/ADPL34 on 2024-07-15 19:58:52.

I am looking to buy a 3.5" internal HDD that I can put in an enclosure and use as an external HDD for my laptop. I am looking for at least 6TB, or larger depending on the current £/TB sweet spot. I want to get the right drive so I can later use it in my NAS.

The main content on this drive will be photos and videos. I currently have my data on a 4TB SSD that is quickly filling up; I have used around 2.5TB since the start of 2023.

I want to get my feet wet with a good storage system, but I'm not sure which HDD to begin with that would also be a good choice for a NAS bay down the road (in around 2+ years' time).

I am based in the UK and looking to keep my budget under £300. I don't mind buying used if I can get a good deal on a quality drive.

Thank you for the help.

8252
 
 
The original post: /r/datahoarder by /u/BarsMonster on 2024-07-15 19:57:42.

For many years I ran mdadm RAID6 + ext4 on my two previous NASes.

But now I am getting to a serious size (10x Exos X20 20TB disks) where mdadm's tendency to "harden" soft read errors during weekly scrubs worries me (mdadm does not try to figure out which drive had the soft error; it always assumes the parity is incorrect). I know about btrfs and its filesystem checksums, but I would prefer to avoid/autocorrect errors rather than just detect them.

I didn't like TrueNAS/ZFS due to the complexity of growing an array; mdadm really spoiled me.

What is the most robust way to go in your experience? Obvious ideas:

  1. Switch raid6 weekly scrub to raid6check to autocorrect random errors.
  2. Manually directly assemble dm-integrity + mdadm + btrfs array
  3. Use LVM to assemble dm-integrity + mdadm + btrfs array
  4. Any other ways?
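
For options 2 and 3, the dm-integrity layer is what turns silent corruption into a hard read error that md can then repair from parity, instead of guessing which member is wrong. A rough sketch of the manual assembly, with defaults (the default checksum is crc32c) and placeholder device names:

```shell
# Give each member disk a standalone dm-integrity layer
# (device names are placeholders -- adjust for your system).
integritysetup format /dev/sdb
integritysetup open /dev/sdb int-sdb
# ...repeat for each member, then build the md array on top:
mdadm --create /dev/md0 --level=6 --raid-devices=10 \
      /dev/mapper/int-sdb /dev/mapper/int-sdc  # ...and the rest
mkfs.btrfs /dev/md0
```

A sector that fails its checksum now surfaces as a read error on the /dev/mapper node, so the weekly md scrub rebuilds it from parity rather than "hardening" it. The trade-off is write overhead from the integrity journal, so benchmark before committing 200TB to it.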
8253
 
 
The original post: /r/datahoarder by /u/nzodd on 2024-07-15 18:56:20.

In light of recent news regarding the imminent shutdown of downloads and DVD-R sales from Something Weird Video (https://www.reddit.com/r/DataHoarder/comments/1e0p9qo/something_weird_video_is_ending_downloads/), I have started building a (so far very) small collection. There's more material than I can possibly grab before Nov. 1. I am trying to find like-minded folks to help pool resources to preserve this legacy before it gets converted into low-bitrate streaming garbage (best case scenario) or is simply deleted en masse without further consideration (worst case scenario). A lot of this stuff is phenomenally rare, even the public domain content.

Does anybody have any tips regarding the coordination of a project of this scope? Is there another subreddit that might be more on-topic for such a project?

8254
 
 
The original post: /r/datahoarder by /u/Forest_Lam0927 on 2024-07-15 17:57:09.

I was trying to archive some Twitter posts with gallery-dl and twmd. My target is large: about 500 profiles need to be pulled, and the first few profiles I succeeded in downloading had a few thousand media files each, so I hit the quota soon after those first profiles.

A while ago, when I was doing the same with yt-dlp on other sites, the same problem happened. Back then, --flat-playlist was a must-have for large playlists.

I've tried -g, -G, and -j with gallery-dl, thinking there might be a chance to get a whole list of URLs with one request, but the quota caps this too.

I also need to constantly update the files, adding new files from new posts. Any idea what could be done to avoid hitting the quota? Or is there documentation of request counts for gallery-dl?
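
On the update side, gallery-dl's download archive helps repeat runs: already-seen files are skipped rather than re-fetched (it still has to page through the timeline, so combine it with a per-request sleep to spread the remaining calls out). A hypothetical config fragment; the option names are as I recall them from the gallery-dl docs, so double-check them there:

```json
{
    "extractor": {
        "twitter": {
            "archive": "~/gallery-dl/twitter.sqlite3",
            "sleep-request": 6.0
        }
    }
}
```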

8255
 
 
The original post: /r/datahoarder by /u/xaplexus on 2024-07-15 16:34:49.

Original Title: I dropped an easystore drive 6" onto the floor. It doesn't show up in the Win 11 "This PC" folder. It does show up in the list of USB devices, but not with the drive name I gave it. I can't see the drive or its file contents. DOA?

Tried multiple computers/cables/power adapters. Is it even worth the hour I would spend (semi-noob) cracking open the easystore case and putting the drive into a SATA-to-USB docking station?

8256
 
 
The original post: /r/datahoarder by /u/WannaNetflixAndChill on 2024-07-15 14:58:18.

I have two 12TB drives formatted as NTFS and mounted on a Windows computer. They're backing up to Backblaze, but I'm afraid of data on the drives getting corrupted and the bad data being uploaded, overwriting my online backup.

What's the best way to monitor whether a drive is going bad or data is starting to get corrupted? Preferably not RAID or anything requiring more drives.
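
One drive-count-neutral approach is a checksum manifest: hash everything once, re-verify on a schedule, and investigate any file whose hash changed without you editing it, before Backblaze picks up the bad copy. A minimal sketch (paths are placeholders; run it from WSL or Git Bash on Windows, or use PowerShell's Get-FileHash the same way):

```shell
# Build a manifest of every file's SHA-256 (do this while data is known-good).
cd /mnt/d && find . -type f -exec sha256sum {} + > ~/manifest-2024-07.txt

# Later: re-verify; any mismatch is a candidate for silent corruption.
cd /mnt/d && sha256sum --quiet -c ~/manifest-2024-07.txt
```

Pair this with periodic SMART checks, and mismatched files can be restored from a Backblaze version history before the retention window expires.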

8257
 
 
The original post: /r/datahoarder by /u/macemaster11 on 2024-07-15 14:31:32.

Hi everyone, first time posting here and I plan on putting this together over the summer/fall.

I picked up this enclosure from a local wholesaler for under $400; it came with two 1500W PSUs, front/rear backplanes, and all the drive caddies. I'm looking for an Intel build that starts at ~140TB and grows to maybe 450TB in the long run. I plan on purchasing in batches of ~10 drives as I scale. I've read a lot about going the LSI 9000 route. I don't plan on any insanely intensive loads, just the usual store-as-I-go; maybe put Home Assistant on it, but I already have another machine handling that. Should I go TrueNAS Scale or Unraid?

My goal for this server is long-term storage and pulling recorded video. I have a little over 30TB of data currently stored in two different systems; they're not network-accessible, and a USB connection isn't ideal for me in the long run. Recommendations on getting this build started would be great.

Link to the box I got.

https://www.supermicro.com/en/products/system/4U/6048/SSG-6048R-E1CR36L.cfm

8258
 
 
The original post: /r/datahoarder by /u/TyNyne on 2024-07-15 14:18:43.

Some context: I've got a team of people who run all sorts of content-related projects. Across the team, they personally store terabytes of pictures and video. Google Drive just isn't cutting it anymore, since I can't find a way for files they own in the shared drives to count against my storage rather than theirs. I also want to get away from subscription cloud services entirely.

My goal is to support my team in handling the mass amount of data we have without asking them to pay for their own subscriptions. I have lots of hardware that I don't have a use for, and would love to run a machine where my team can access it remotely and upload our project data to it. Anyone have any good suggestions?

8259
 
 
The original post: /r/datahoarder by /u/The_other_kiwix_guy on 2024-07-15 14:07:22.

This was announced last week at r/Kiwix and I should have crossposted here earlier, but here we go.

Zimit is a (near-) universal website scraper: insert a URL and voilà, a few hours later you can download a fully packaged, single zim file that you can store and browse offline using Kiwix.

You can already test it at zimit.kiwix.org (will crawl up to 1,000 pages; we had to put an arbitrary limit somewhere) or compare this website with its zimit copy to try and find any difference.

The important point here is that this new architecture, while far from perfect, is a lot more powerful than what we had before, and also that it does not require Service Workers anymore (a source of constant befuddlement and annoyance, particularly for desktop and iOS users).

As usual, all code is available for free at github.com/openzim/zimit, and the docker is here. All existing recipes have been updated already and you can find them at library.kiwix.org (or grab the whole repo at download.kiwix.org/zim, which also contains instructions for mirroring)

If you are not the techie type but know of freely-licensed websites that we should add to our library, please open a zim-request and we will look into it.

Last but not least, remember that Kiwix is run by a non-profit that pushes no ads and collects no data, so please consider making a donation to help keep it running.
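
For anyone who wants to run Zimit locally instead of through zimit.kiwix.org, the invocation looks roughly like this; flags can differ between image versions, so treat it as a sketch and check the README at github.com/openzim/zimit (URL and name below are examples):

```shell
# Crawl a site into a .zim file placed in ./output
docker run -v $PWD/output:/output ghcr.io/openzim/zimit \
    zimit --url https://example.com/ --name example
```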

8260
 
 
The original post: /r/datahoarder by /u/furllamm on 2024-07-15 11:40:09.

I'm downloading a website, but it's not downloading the JS and CSS.

Is there a working alternative?
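
Assuming the tool in question is wget (the post doesn't say), the usual fix is that a bare recursive download skips page assets unless you ask for them:

```shell
# --page-requisites fetches the CSS/JS/images each page needs;
# --convert-links rewrites references so the copy browses offline.
wget --mirror --page-requisites --convert-links \
     --adjust-extension --no-parent https://example.com/
```

Sites that build the page in JavaScript still won't capture well this way; a headless-browser-based crawler is the usual next step for those.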

8261
 
 
The original post: /r/datahoarder by /u/grathontolarsdatarod on 2024-07-15 08:50:41.

But thanks to the generous and educational words of all the hoarders here, I survived!!! (I think).

I just want to say THANKS EVERYONE!!

So... I've always been a hoarder at heart. I used to back-up my files by dragging and dropping windows user directories and dumping them into a 'back-up' directory on the same harddrive, a USB stick, or external harddrive. From about 1997 until last year, this had been my MO.

I knew data recovery was a thing, so between new computers I'd repeat my insanity, shuck the drive from the old computer, chuck it in a drawer... for years. Not just between computers, though... before major life events, OS updates, experimenting with risky processes, you name it. Drag-and-drop. I started volumes because I started hitting naming limits.

Pictures, papers, online sources, memes, clips of hilarious nights out, legally important documents and recordings.

It was like a matryoshka doll of fractal history.

Over the last year I bought a desk and a desktop, then three monitors, shucked the drives, consolidated the keys and the memory sticks. I had a mission. I even read about the duodecimal system to try and help out. Which was basically the state of the art when I "started" my habit.

I found Linux. I used the three monitors to maximize my efforts. The split-view file manager (just Plasma, lol) was LIFE CHANGING.

Then I did some reading here. I bought a 4-bay enclosure and consolidated even more. Used the "hiccup" program (Czkawka, I think) and over a few days took over a million files down to about 50,000.

Then I read the room and started hoarding. Everything I could find. 4x10TB in RAID6 was filling up FAST.

I had outgrown my back-up capacity.

Bought another one, a 5-bay this time. Worried the heat would kill my server before amazon showed up. But jellyfin MUST play on.

BUILD THE RAIDS...

I still don't know what happened... But if you read this far, maybe you'll find it useful.

I was running a Ryzen 5 in a Beelink, but put a 1TB 2.5-inch drive in there, because waste not, want not.

Bought an N100 with the second enclosure for transcoding (works AWESOME).

In building the new raids, either there was too much heat on the main board, too much heat on the USB connector, a caching problem, or I messed up mdadm by attaching two "/dev/md0"s to the same machine, even though they were created with UUIDs and show different names in --detail.

My primary raid went down.

The latest JBODs-in-a-drawer backup was only 3 weeks old. But I had just de-duped (very carefully) and dumped in a LOT of Jellyfin material from another Beelink whose only purpose is to handle such things.

When I did the Jellyfin dump I wasn't impressed with the internal temps (mid-40s), and I had to use a spinner because my SSD (in a caddy) wasn't working well, so I thought I'd wait out the heat wave.

I suck at diagnostics because I'm still VERY fresh at all of this. I was trying to figure out if it was the drives, the OS, the USB, the new enclosure... I still don't really know.

I took the 2.5-inch out of the Beelink for better ventilation and sweated through rebuilding the primary raid.

I'll save you the DAY of reading and waiting: --assemble --force worked on a fresh Debian install (no actual data is kept on the "computers" anymore). Thank god. For good measure I now use custom names for the raid devices.

mdadm said one of the drives failed, but after re-assembly and a reboot (it auto-mounted) it seems to be ticking along with the rest of them again. SMART tests never showed a problem, so I suspect heat on the cheap Beelink hardware.

Now everything is running at high 30s to mid-40s, the backup raid is building, and a panic sync of a few TBs is running to a spare drive.

So... build your data storage backwards, people. Don't overreach, keep things cool, schedule your damn backup processes, RAID6 is not overkill, and an untested backup doesn't do you much good at all!

Hope this entertains or helps someone. I still have about 15 of 20 hours to wait until the build is done, and no Jellyfin...

Thanks to this community!!!!

8262
 
 
The original post: /r/datahoarder by /u/theicarussystem on 2024-07-15 13:09:40.

So... I decided to try and sort out all my online storage and consolidate it into one neat structure.

I joined Jungle Disk when they were a new company (and I was a young man), uploaded all my old photos, and then used the app maybe a handful of times over the last decade to upload batches of personal data.

I logged on today and saw Jungle Disk is no more.

My questions are:

Is there a way of connecting to my old S3 buckets without the jungledisk client?

If not, would copying all my stuff to S3 again using Cryptomator and CloudMounter be a good alternative solution?
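
On the first question: if the account was the bring-your-own-S3 kind, the buckets live in your own AWS account and the standalone AWS CLI can reach them without the Jungle Disk client (bucket name below is a placeholder). One caveat, stated as an assumption: Jungle Disk may have stored objects in its own, possibly encrypted, container format, in which case the raw objects won't be directly readable and the Cryptomator re-upload plan is the cleaner path.

```shell
# Needs credentials configured first (aws configure).
aws s3 ls s3://my-old-jungledisk-bucket --recursive
aws s3 sync s3://my-old-jungledisk-bucket ./local-copy
```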

btw kudos to jungledisk for keeping my grandfathered account. I didn't use it much but in all those years I never received any email asking me to pay or limit my account. Really nice people.

8263
 
 
The original post: /r/datahoarder by /u/spotanjo3 on 2024-07-15 12:53:33.

https://preview.redd.it/r59at0t1jocd1.png?width=635&format=png&auto=webp&s=7d58445052e04ceef467bf6a834c8bab727fa439

I always drag and drop my larger files onto my external hard drive, but I'm worried about two things: Raw Read Error Rate and Current Pending Sector Count.
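
Both of those are SMART attributes, and the practical move is to trend them over time rather than react to a single snapshot: a *rising* Current Pending Sector count is the one that usually precedes real trouble, while Raw Read Error Rate is reported in vendor-specific units (huge raw values are normal on Seagate drives). With smartmontools, where /dev/sdX is a placeholder for your drive:

```shell
# Dump all SMART attributes; watch ID 1 (Raw_Read_Error_Rate)
# and ID 197 (Current_Pending_Sector) across runs.
smartctl -A /dev/sdX

# Kick off a long self-test, then check the result later:
smartctl -t long /dev/sdX
smartctl -l selftest /dev/sdX
```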

8264
 
 
The original post: /r/datahoarder by /u/bitpandajon on 2024-07-15 11:47:31.

Hey all, I've been lurking here briefly and decided you all would probably have some good advice on redundant backups. Right now I have about 2TB of data I want to keep long-term. It's copied manually to another drive for redundancy. Super-critical data like family photos is saved again on microSD cards, since it doesn't take much room. I'd really like to automate these backups and probably add another drive or two. Should I be learning about SnapRAID and getting a hard-drive bay? My current hardware is all external drives. Thanks!
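
SnapRAID is a reasonable fit for exactly this setup, since it works over plain drives and folders rather than requiring a rebuilt array; the drives stay independently readable. A hypothetical snapraid.conf sketch, with all paths as placeholders:

```
# One parity file on a dedicated drive, content lists in two places,
# and each data drive registered by mount point.
parity /mnt/parity1/snapraid.parity
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
data d1 /mnt/disk1/
data d2 /mnt/disk2/
```

From there, a scheduled `snapraid sync` updates parity and `snapraid scrub` verifies older data; see the SnapRAID manual for the exact semantics.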

8265
 
 
The original post: /r/datahoarder by /u/Glass-Bank on 2024-07-15 09:11:03.
8266
 
 
The original post: /r/datahoarder by /u/ChillCaptain on 2024-07-15 06:18:36.

I recently bought two Seagate 14TB external drives and keep them open in Windows Explorer, but lately, every few hours, my Explorer windows close and crash.

Is this a sign of bad drives?

8267
 
 
The original post: /r/datahoarder by /u/shibalore on 2024-07-15 05:30:06.

Hi all!

I'm not looking for anything fancy, but I am a very gentle person and somehow managed to accidentally destroy a Seagate 2TB external hard drive (after spending all night loading things onto it...), and I am overwhelmed looking at replacement options. I mainly would like something a tad less fragile.

I ruined the hard drive by moving my laptop to my left (from my lap) and forgetting the drive was plugged in; it dangled in the air by the cord for maybe <5 seconds, and that completely destroyed it. It beeps now, which other threads on this subreddit tell me means that something inside cannot spin and it is, in fact, undetectable.

I take responsibility for it, no complaints there. I simply would like to replace it with something that can take the occasional less-than-100%-perfect use, because I was shocked at how fragile it was!

Thank you!!

8268
 
 
The original post: /r/datahoarder by /u/Redrock_Jr on 2024-07-15 05:06:40.

Hello, and thanks for taking the time to read this. I have attempted everything I could find on the internet and have exhausted everything I can short of throwing in the towel. I am going to include as much detail as possible. Sorry if the fix is simple but I just frankly am lost at the current moment.

I bought SAS drives (Seagate Exos 7E8) off eBay, and the listing claims they are working. My LSI card does not see them when they're plugged in. The card is an LSI 9300-16i flashed to IT mode, firmware version 16. The hard drives are plugged into the backplane of my server case, a CX4712 from Sliger. Sliger says the backplane uses SFF-8482 connectors and that SAS HDDs are supported, which, according to my research, should work. I can confirm the drives are getting power since I can feel them vibrating (maybe there's a better way to check?), so I don't think the PSU is the issue. The LSI card is seen by the BIOS, and yes, I was able to Ctrl-C into the card's utility and check everything out: the card does not see any devices plugged into it. In the off chance it matters, my motherboard is a Supermicro X10DAL-i, and yes, the PCIe slot is working. The board has no native SAS support, so I could not test the drives by bypassing the SAS controller. I was able to plug one into a Windows machine, and it showed that the device did exist, but since the USB adapter is SATA-only, I could not make any changes to it. My running theory is that they are formatted badly, but the seller claims the format is 512e, which from my research should be fine. The sticker from the seller says "NIST 800-88 clear, passed," so I am unsure if that makes a difference. I have confirmed with a SATA drive that the backplane is not the issue. The problem seems to be either the drives themselves or the LSI SAS card.

In my head, at least, this is the connection path for these hard drives:

Seagate Exos -> SFF-8482 adaptor -> Mini SAS Cable -> LSI 9300-16i.

Kind of asking for any steps I should take to fix this. I am really close to either returning them or just ordering a SAS dock and going from there. Any advice is appreciated. Ask me any questions you think might be helpful! Thanks
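
A few non-destructive checks from a Linux live USB can help split "bad drives" from "bad HBA/cabling path" (device names are placeholders; sas3flash ships with Broadcom's tools, sg_inq with the sg3_utils package):

```shell
sas3flash -list          # does the HBA report itself and firmware 16?
lsscsi                   # what does the kernel enumerate behind it?
sg_inq /dev/sg2          # raw SCSI INQUIRY straight to one drive
smartctl -a /dev/sg2     # SAS drives answer smartctl via the sg node
```

If sas3flash sees the card but lsscsi lists nothing, suspicion shifts to the cable/backplane path or the drives themselves, and a SAS-capable dock or a direct SFF-8482 breakout cable becomes the cheapest tiebreaker.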

8269
 
 
The original post: /r/datahoarder by /u/PrestigiousGarlic909 on 2024-07-15 04:40:54.
8270
 
 
The original post: /r/datahoarder by /u/2600_yay on 2024-07-15 03:58:39.

Does anyone have any backups of the old Los Alamos Arxiv server from the mid-2000s or so? I know it's a long shot – someone having a backup of a probably-large server of academic papers from the 2000s (or I'll even take earlier backups too from the 1990s!), but wanted to ask here as maybe one of you knows someone who knows someone who knows someone with a tape drive backup somewhere - haha

https://web.archive.org/web/20070520024759/http://lanl.arxiv.org/

The preprint server 'only' got about 50,000 papers per year sent to it in the mid-2000s, per this Arxiv history blurb here, which is 'only' about 200 GB, which I know is small now, but would have been quite the feat to back up personally, back in the day. Or if anyone knows of a search for bibliographic records that would have existed on lanl.arxiv.org back in the day – obviously the search server which used to run on :8081 - isn't usable via the Wayback Machine backups, I'm all ears. Basically, I'm trying to find a few papers with a certain substring mentioned in the paper that no longer exist on the live/current arxiv.org website.


Somewhat related to LANL: does anyone have any of the old 'libraries' backed up from sites like the DoE? The Department of Energy used to have a library online with a few hundred thousand papers in it. Alas, that too was taken down many years ago.


In general, if anyone has a list of what large corpora of scientific literature used to exist online - from National Laboratories, the DoE, or other science R&D orgs - I'm all ears.

8271
 
 
The original post: /r/datahoarder by /u/Bross535 on 2024-07-15 03:36:30.

I need everyone's advice on which of these two to choose: the LC-35U3-Becrux-C1 or the LC-35U3-C. I have a laptop that is low on storage and an HDD with some free space, so I want to transfer some files, keep them handy, and have them with me on the move. Any advice is highly appreciated! Thank you!

8272
 
 
The original post: /r/datahoarder by /u/Intelg on 2024-07-15 00:03:41.

I'm looking for a smart, automatic way of using snapshots for my media files (right now I have none). Which tools or solutions would you recommend?

(Not sure if this is still true, but) a post from 2021 mentions some of the pitfalls of snapshot management and the need for a "smart pruning algorithm" so only the important or most relevant snapshots are kept, while those with the fewest file changes can be discarded.

Filesystem-wise, ZFS or btrfs is fine with me. I primarily run Proxmox or native Debian as headless servers... so a web-based solution would be ideal, but I'm totally OK with command-line tools.

So far I have found:
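
On the ZFS side, sanoid is the usual answer to exactly this "keep the important snapshots, prune the rest" problem: you declare a retention policy per dataset and it snapshots and prunes on a timer. A hypothetical sanoid.conf sketch (dataset name and numbers are placeholders; check the sanoid docs for the real key names):

```
[tank/media]
        use_template = media

[template_media]
        frequently = 0
        hourly = 24
        daily = 30
        monthly = 6
        autosnap = yes
        autoprune = yes
```

On btrfs, snapper and btrbk follow the same declare-a-policy model.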

8273
 
 
The original post: /r/datahoarder by /u/-Hexenhammer- on 2024-07-14 23:51:36.

Hey,

1] So I'm looking for a (preferably free) solution that will let me use RAID5 on Windows, something like OWC SoftRAID.

2] Are there any tricks/hacks to convert a USB HDD to Dynamic? If I do it over SATA and then connect it via USB, will it work? [Dynamic is the only way to use Windows 11 soft RAID5.]

I'm using USB enclosures; each enclosure is capable of USB 10Gbps and holds 4 HDDs.

I had a huge 12-disk unRAID box in that tall dual-PC Phanteks case, but its size and the hassle of an extra PC sitting under my table (always on, or powered up whenever I needed it) pushed me to disassemble it and buy enclosures. These are small, I have 4 already, and I can dedicate each enclosure to a topic and use it only when I need it.

I tested Windows Storage Spaces; it works fine over USB, but because I have 4 drives and the best-speed stripe width is 3, I get about 38.7TB from 4x 16TB HDDs.

These are my speeds with FastCopy: large 60GB movie file write: 430MB/s; medium file write [audio file folders]: 225MB/s; small file write [PDF books]: 86MB/s.

Read speed is over 500MB/s.

If there is any software, maybe something open source and free, I'd rather try that and compare.

Tomorrow I'll test the OWC demo and see if it's faster than Storage Spaces and everything.

Thanks

8274
 
 
The original post: /r/datahoarder by /u/megamoto85 on 2024-07-14 23:41:52.

Hello! I'm about to go crazy :D

I have a group on Facebook where we share YouTube links (music).

I try to scrape all the links from said group into JDownloader so I can download them.

This was possible before, but now all I get is junk data: Facebook wraps each link in its own redirect URL that I have to resolve manually by clicking it (which then opens the YouTube URL), and of course pasting all of that into JDownloader doesn't work. I have tried saving the page with Firefox to HTML/TXT, but it refuses to work. Can anyone please help me find a working method? I also tried the DownThemAll extension, but it doesn't find any music/video.

8275
 
 
The original post: /r/datahoarder by /u/dm_lucas on 2024-07-14 20:45:53.

I'm trying to share part of my music collection (I'm sending approx. 280GB of FLAC) with one of my friends who's abroad and just started using iPods. The issue is that I don't know how to do this without a cloud subscription.

Is there a direct way I can send this amount of data without uploading it to a cloud storage solution or getting an expensive file-sharing subscription (e.g., WeTransfer)?

I did attempt a search on the internet, but I'm not getting any good solutions because of all the advertisements for software packages...
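
For a one-off 280GB transfer, a direct machine-to-machine copy avoids the cloud middleman entirely, at the cost of both ends being online for a while. If either side can run an SSH server, rsync is resumable, so interruptions pick up where they left off (host and paths below are placeholders):

```shell
# -a preserves metadata; --partial keeps half-finished files so an
# interrupted run resumes instead of restarting from zero.
rsync -a --partial --progress ~/music-share/ friend@friends-host:/incoming/music/
```

Peer-to-peer sync tools like Syncthing, or simply mailing a cheap external drive, are the other common answers here.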
