It's A Digital Disease!

This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

founded 2 years ago
2601
 
 
The original post: /r/datahoarder by /u/bexkali on 2025-03-24 20:36:04.

Specifically, stuff like profiles of anyone who isn't a white cis male in their "Profiles in Science" area? That's in case, at some point, certain 'too DEI' profiles come down off that page...

2602
 
 
The original post: /r/datahoarder by /u/Mean_Article_9960 on 2025-03-24 20:34:28.

Over the years, I backed up all my digital camera and phone photos onto my PC, but they ended up in one huge folder. Sorting it manually would have taken weeks.

So I built a small app that automatically reads the file metadata and sorts all your photos/videos into Year/Month folders.

It saved me hours, and I figured others might find it useful.

If that sounds like something you need, it’s available here: PixOrganizer

I'd love feedback from anyone who tries it or has better ideas!
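
For readers curious how this kind of Year/Month sorting can work, here is a minimal sketch. It is not PixOrganizer's code: the function names and the extension list are hypothetical, and it uses the filesystem modification time as a stand-in for the capture date, whereas a real tool would more likely read EXIF or video metadata.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Assumption: a small, illustrative set of media extensions.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".mp4", ".mov"}


def target_dir(dest_root: Path, taken: datetime) -> Path:
    """Return dest_root/YYYY/MM for the given timestamp."""
    return dest_root / f"{taken.year:04d}" / f"{taken.month:02d}"


def sort_media(source: Path, dest_root: Path) -> int:
    """Move media files from the top level of source into Year/Month folders.

    Returns the number of files moved. Existing files are never overwritten.
    """
    moved = 0
    # Materialize the listing first so moving files does not disturb iteration.
    for path in sorted(source.iterdir()):
        if not path.is_file() or path.suffix.lower() not in MEDIA_EXTENSIONS:
            continue
        # Assumption: modification time approximates the capture date.
        taken = datetime.fromtimestamp(path.stat().st_mtime)
        folder = target_dir(dest_root, taken)
        folder.mkdir(parents=True, exist_ok=True)
        destination = folder / path.name
        if destination.exists():
            continue  # skip name collisions rather than overwrite
        shutil.move(str(path), str(destination))
        moved += 1
    return moved
```

Calling `sort_media(Path("unsorted"), Path("sorted"))` (both paths are placeholders) would move each recognized file into a folder such as `sorted/2020/06/`. Skipping on name collisions is a deliberately conservative choice; renaming duplicates would be another option.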

2603
 
 
The original post: /r/datahoarder by /u/Dmcgrath009 on 2025-03-24 20:28:42.

I have a QNAP TS-664 at home with 3x 12TB drives configured in RAID 5 and I'm looking for a cheap way to do offsite backups. I would like to set up another NAS at my dad's house across town, but I don't want to make the same investment I made in my QNAP again. I have ten 3TB drives from an old NAS I can use for storage, but I just need to come up with a system to do it. It does not need to be a powerful system; its only function will be to be a backup of my QNAP. Has anyone put together a Pi NAS or something similar for something like this?

2604
 
 
The original post: /r/datahoarder by /u/rofflez911 on 2025-03-24 19:31:47.

Is there any way to have offline usage of stashapp in some way on iOS? E.g., integration with Infuse mobile app or something? Thanks!

2605
 
 
The original post: /r/datahoarder by /u/COMEONSTEPITUP on 2025-03-24 19:23:52.

Good Afternoon everyone,

I have a Corsair Obsidian 750D case that I'm running Unraid on right now. I love it, it's simple, and it's just sitting in a closet working just fine.

I'm at the point now where I'm looking at adding additional Hard Drive bays past the initial bay count. I'm aware they sell additional storage bays. But the site says they're sold out. I saw comments claiming you can still email Corsair asking for more of these and they'll send them to you, but given that the case is long past discontinued, should I still try to contact Corsair about this?

Is there some sort of 3D printed option that can function similarly?

Or should I perhaps look at other case options like the Jonsbo N5 or another high capacity drive case? I've currently got

  • 1x NVME drive
  • 2x 2.5" SSDs
  • 5x HDDs

I'm looking at adding a PCIe SAS-SATA card too. Just trying to get some bays installed to future-proof for potentially upwards of 8-12 more drives.

If you have any suggestions of similar cases, or experiences with getting more drive bays for this, I'd love to hear about them.

2606
 
 
The original post: /r/datahoarder by /u/jaromir83 on 2025-03-24 18:07:30.

trying to batch download an instagram profile, no files found, writes

https://preview.redd.it/dc1mntpsgoqe1.png?width=1024&format=png&auto=webp&s=d82ab7eebdf4469715074e0dad65ae0788bcb7b4

how can I batch download from instagram again? fix pls? another program/web w/ the same functionality pls? thanks

2607
 
 
The original post: /r/datahoarder by /u/Artistic_Pear1834 on 2025-03-24 18:05:58.

Epson FastFoto 840 - any hotkeys or AppleScript to trigger the Start Scanning button? I am so sick of fiddling around with my mouse for each scan (batch doesn't work, old photos a zillion sizes).

I'm staring at the latest family member's "would you be able to scan these please" piles of albums & just can't bear the manual "mouse to start scanning, image into position, then press" for days on end.

I've tried using ChatGPT to figure out how to assign a keyboard shortcut, but I can't find any documentation about hotkeys, and I can't find the button code to link to it. Anyone have any luck?

I normally use VueScan with my Canon scanner, but with the Epson 840 it produces very pink scans (and I'm a standard VueScan subscriber of many years, not ponying up more cash for Professional to reduce the weird red hue it's producing with this scanner - it doesn't happen with the standard Epson scanning app). Just need some way to start scans without needing to fiddle about with my mouse. TIA!!

2608
 
 
The original post: /r/datahoarder by /u/HeftySoil on 2025-03-24 17:45:10.

I’m looking to purchase an SSD that I can carry around with me with a lot of different folders. Is there an SSD or SSD enclosure with a small display where I can view things like the directories and select a folder to access before plugging it in to my devices?

2609
 
 
The original post: /r/datahoarder by /u/sep222 on 2025-03-24 15:49:56.

I just started using mergerfs + snapraid and I'm having a really hard time with syncing. Snapraid sync typically runs smoothly through about 40GB running at 200 MB/s or more but then falls off a cliff and slowly gets all the way down to 1 MB/s, making it unusable.

I've been trying to use the official documentation but also chatgpt and claude to troubleshoot. The chatbots typically run me through troubleshooting steps with disk read and write speeds but everything always comes back clean. The drives aren't the greatest but they aren't in bad health either.

Writing and reading tests on both drives are ~130MB/s

Troubleshooting steps:

  • enabled disk cache on all drives (hdparm -W 1 /dev/sdX)

  • ran fsck on all drives

  • reformatted parity drive

  • adjusted fstab attributes for mergerfs (see below snapraid.conf)

  • changed block_size in snapraid.conf

  • started snapraid setup from scratch multiple times

2 14TB media drives

1 14TB parity drive

*I'd like to add that I did have one successful sync which ran at a constant 138MB/s throughout. After that sync worked, I waited about a day and ran the sync again after adding over 100GB of data and it was back to the same problem of 1MB/s. I have deleted that parity file and all of the snapraid content files to start from scratch multiple times.

# SnapRAID configuration
block_size 512

# Parity file
parity /mnt/parity/snapraid.parity

# Content files
content /mnt/etc/snapraid/snapraid.content
content /mnt/plex.main/snapraid.content
content /mnt/plex.main2/snapraid.content

# Data disks
data d1 /mnt/plex.main/
data d2 /mnt/plex.main2/

# Excludes
exclude *.unrecoverable
exclude *.temp
exclude *.tmp
exclude /tmp/
exclude /lost+found/
exclude .DS_Store
exclude .Thumbs.db
exclude ._.Trashes
exclude .fseventsd
exclude .Spotlight-V100
exclude .recycle/
exclude /***/__MACOSX/
exclude .localized

# Auto save during sync
autosave 500
______________________________________________
#/etc/fstab
all media drives and parity drive attributes:
- ext4 defaults,auto,users,rw,nofail,noatime 0 0

mergerfs attributes:
- defaults,allow_other,use_ino,cache.files=partial,dropcacheonclose=true,category.create=mfs 0 0

2610
 
 
The original post: /r/datahoarder by /u/EnsilZah on 2025-03-24 14:46:59.

https://preview.redd.it/dilpcc3xgnqe1.jpg?width=4032&format=pjpg&auto=webp&s=911ccb6e140372be5ea054c6f507f263faa6175d

https://preview.redd.it/qxpl9c3xgnqe1.jpg?width=4032&format=pjpg&auto=webp&s=136790ff48ca9b3fe064d505b32483c84956d8e9

I built this NAS/server a bit over a decade ago and it has served (heh) me well.

I like the minimalist look of the Node 304 case, and while access to the HDD brackets is not great I didn't really need to screw around with them too much.

It currently houses a 240GB SSD for the OS (Windows Server), 3x WD RED 10TB, 1x Barracuda 8TB in a Storage Spaces pool.

Recently I started planning for a move to another country and I was trying to figure out the best way to take my data with me.

I thought I'd just remove the drives and build a new computer for them at the destination, I even ordered protective cases for them.

I've also been thinking time might be near where going all SSD might be viable for me.

I looked into second-hand SATA SSDs, but it looks like very few are available right now.

I then came across some reviews of all-NVMe NAS devices, specifically the Terramaster F8 and Asustor Flashstor 12.

The Flashstor had the advantage of expandability, but I really hated the gamer-wannabe look, and the hardware specs were weaker.

With the Terramaster F8 Plus, I liked the size and look (reminded me of my old WD My Book) and the specs.

So recently I bought the Terramaster and started populating it with NVMe drives (3x WD Blue 4TB, 3x WD Black 8TB).

I installed Windows Server on it rather than use the OS it comes with because I want to run a bunch of other software on it and I'm familiar with Windows and Storage Spaces (though I guess maybe running a VM might be another option).

A few snags I ran into were:

  • I had to remove the internal OS USB drive for the Windows installer to prepare partitions correctly.

  • I had to track down the network driver to bring it online.

  • At first I didn't put the provided heatsinks on the NVMes because I figured network transfer speeds won't be high enough to heat them up significantly, but then I had a drive drop out of the pool due to overheating when I was doing some internal transfers.

  • I haven't yet tracked down the issue that makes it lose connection to the network every few days, not sure if it's a hardware/driver issue, something in the OS, maybe my router.

But now that all my data is transferred I can shut down my old NAS, use it as backup and hopefully sell it to recoup some of the cost after zeroing the drives.

2611
 
 
The original post: /r/datahoarder by /u/magicmikela on 2025-03-14 17:10:56.

I'm trying to pull some videos and haven't found any add-on or app that can do it from Podia.com (an online course platform).

Thanks in advance for any thoughts.

2612
 
 
The original post: /r/datahoarder by /u/canigetahint on 2025-03-14 15:43:12.

Forgive me for my ignorance on this, as I'm still pretty inexperienced with this, but is there a group or a project that makes data available from various sources, such as Kiwix for downloading Wikipedia? I figure the last 2 months have been a real wake-up call and I have since downloaded the .zim for Wiki, but wonder if there is something similar that crawls .gov sites or .uni/.edu sites for archiving purposes and packages them for easy distribution/downloading?

Keep in mind, I have no idea how much effort goes into projects like that, and I can definitely appreciate it now that we have seen what happens when we take something for granted.

Just a thought that crossed my mind this morning and I wanted to post it before I forgot.

2613
 
 
The original post: /r/datahoarder by /u/JohnDorian111 on 2025-03-14 14:45:33.

There was someone trying to dedupe 1 million videos which got me interested in the project again. I made a bunch of improvements to the video part as a result, though there is still a lot left to do. The video search is much faster, has a tunable speed/accuracy parameter (-i.vradix) and now also supports much longer videos which was limited to 65k frames previously.

To help index all those videos (not giving up on decoding every single frame yet ;-), hardware decoding is improved and exposes most of the capabilities in ffmpeg (nvdec,vulkan,quicksync,vaapi,d3d11va...) so it should be possible to find something that works for most gpus and not just Nvidia. I've only been able to test on nvidia and quicksync however so ymmv.

New binary release and info here

If you want the best performance I recommend using a Linux system and compiling from source. The codegen for binary release does not include AVX instructions which may be helpful.

2614
 
 
The original post: /r/datahoarder by /u/Rick-Valassi on 2025-03-14 14:10:05.

Looking for a new solution to backup my raw photos that are currently about 5 TB and have a few questions:

  1. Should I use 2 separate external HDDs and sync them from time to time or is 1 enclosure with 2 mirrored HDDs better? I am leaning towards 2 separate ones as it appears to be more redundant.
  2. If I get 2 separate HDDs should I buy 2 different brands or is it safe enough to buy 2 of the same model?
  3. Anyone here who could share their experience with the G-Drive Project 12 TB?
  4. Any other suggestions?
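
For the two-separate-drives approach in question 1, one way to check that both copies still match after a sync is to compare checksums. The following is only a sketch under stated assumptions: the function names are hypothetical, and SHA-256 is chosen here for illustration; it is not a feature of any particular drive or sync tool.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_copies(primary: Path, mirror: Path) -> dict[str, list[str]]:
    """Report files that are missing on either side or differ in content."""
    a, b = tree_digests(primary), tree_digests(mirror)
    return {
        "missing_on_mirror": sorted(a.keys() - b.keys()),
        "missing_on_primary": sorted(b.keys() - a.keys()),
        "content_differs": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

Running `compare_copies` on the two drive mount points (placeholders for wherever the drives are mounted) reads every file on both drives, so for about 5 TB it will take hours; it is meant as an occasional verification pass, not something to run after every sync.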

Thanks in advance.

2615
 
 
The original post: /r/datahoarder by /u/jonasrosland on 2025-03-14 14:05:41.

Hello fellow Data Hoarders!

I've been eagerly awaiting Gitea's PR 20311 for over a year, but since it keeps getting pushed out for every release I figured I'd create something in the meantime.

This tool sets up and manages pull mirrors from GitHub repositories to Gitea repositories, including the entire codebase, issues, PRs, releases, and wikis.

It includes a nice web UI with scheduling functions, metadata mirroring, safety features to not overwrite or delete existing repos, and much more.

Take a look, and let me know what you think!

https://github.com/jonasrosland/gitmirror

2616
 
 
The original post: /r/datahoarder by /u/Goofcheese0623 on 2025-03-14 14:05:24.

2617
 
 
The original post: /r/datahoarder by /u/PricePerGig on 2025-03-14 13:46:59.

Hi All

First off,

Thank you for all the support while I've been building out https://pricepergig.com/ (it will be the best place to find digital storage on the internet, and is right now for Amazon imo, but I would say that right :) )

If you were to sign up for price alerts (e.g. the cheapest HDD, or the cheapest NVMe price per TB for example) or in the future alerts for your saved searches HOW would you like to be alerted?

If you could also let me know your country that would help me understand, perhaps it's different in different locations.

Backstory, you don't need to read this!

Many people asked for 'alerts', and I assumed email would be ok/good/great. Perhaps I was wrong: not many people have signed up. It could well be that the form just looks scary, or that I need to point it out more (I can work on that), or that email isn't what you guys wanted (I know I have plenty of emails I don't look at). So, let's find out.

Today PricePerGig 'only' does Amazon, but I will be adding other marketplaces once we've figured out the base feature set, so please do participate assuming your large marketplace is also in here.

Thanks


2618
 
 
The original post: /r/datahoarder by /u/cartrouble111112 on 2025-03-14 11:00:41.

Hi all,

There are a wide number of sites which offer paid access to film references, including:

  • Shotdeck
  • Film Grab
  • Eyecandy
  • Filmboard
  • Shot Cafe
  • Frame Set
  • Screenmusings

They are paid archives, rather than being true data hoarding / open access.

Is there a centralised resource for this form of data hoarding, does anyone know? A group project?

2619
 
 
The original post: /r/datahoarder by /u/storytracer on 2025-03-14 10:53:31.

2620
 
 
The original post: /r/datahoarder by /u/alexlazar98 on 2025-03-14 10:37:22.

https://preview.redd.it/zp9vlha0vmoe1.png?width=1200&format=png&auto=webp&s=25233afd4d8804e65b7d6dff7bab03f33fe6ef53

I want to start a personal project where I scan, OCR and index markdown for old books. This is a book with ALL of Romania's roads back in 1974. It has tables and maps and all sorts of other interesting historical data points.

I already have some idea of data engineering. I'm a software engineer and I've made a project that helps with RAG, search and indexing of markdown files (even very big ones). My problem is the OCR part. Any tips?

2621
 
 
The original post: /r/datahoarder by /u/Famous_Assistant5390 on 2025-03-14 08:54:39.

Is there a way to tell Ripme to download only images from a URL that contains both images and videos? And can I set a minimum resolution for downloaded images? I am new to all this. There doesn't seem to be a setting. Can this be done via a config file?

2622
 
 
The original post: /r/datahoarder by /u/0SwifTBuddY0 on 2025-03-14 07:36:29.

Does anyone have a good grasp or understanding from experience if hiding usb drives (or things in general) in plain sight is more effective than concealing from sight?

I have important data I'd like to keep backed up, but mobile and offline. I don't care if the data gets destroyed over time or corrupted, but I want to keep it safe from prying eyes. (I have backups; I just need this data offline and portable for my own convenience.)

I'm also somewhat new to using BitLocker encryption. It's easy to use, but I do find myself wondering how hackable it is, if at all (for the common attacker on a common person like myself). Is it even worth it to buy a dedicated, disguised, cheap USB drive (pen style, to throw in my massive pen collection in the office)? Or can I just write the data to 1 or 2 of my old USB drives? I guess my concern is that if an attacker came through my home, they'd check for things that might be valuable, like my safe and obvious data storage or certain paperwork. But again, would that even matter if 99.9% of attackers can't fathom breaking BitLocker encryption?

Thanks for any input

2623
 
 
The original post: /r/datahoarder by /u/Zavad6404 on 2025-03-14 06:03:51.

I have an Orico 9958C3 with hard drives (WD Red and IronWolf drives) formatted and showing in Windows Disk Management (NTFS). However, they do not show up in Orico's proprietary RAID Manager software. I have reformatted the drives, changed slots, restarted, etc. Any advice on how to set up RAID 5?

2624
 
 
The original post: /r/datahoarder by /u/itsthexypat on 2025-03-14 05:29:54.

I've been thinking about trying various software raids, truenas, unraid, freenas, etc. and I'm not sure which one to try first. Are there other major software options that I'm not listing? Which do you recommend I try first and which would you ultimately implement to be the central backup to about 5-6 pcs/laptops and three Synology 8 bay NAS?

I've been building my own PCs since I was a kid and I pretty much have most of the pcs I've ever built, some 8 cores and a spare 16 core pc. Only about a year ago did I finally dive into the world of NAS and RAID and ended up getting three eight bay Synology NAS boxes. They are doing alright for what I'm using them for. I thought at first I'd not be good at learning about these things but I dedicated about three months of reading and youtubing and feel I have a good understanding of the synology ecosystem and some general raid knowledge.

Now I'm ready to take the next leap. Instead of buying a different brand NAS I would like to build my own and try some of these free software options using old hardware.

I am a tinkerer but I've never really had to get into much of anything dealing with NAS, servers, and commercial IT stuff. Once I'm done tinkering and learning the software I'd like to pick one and build a cheap, huge cold storage for more tinkering and to back the other computers and three Synology boxes up to.

What do you all think? Any tips? Any suggestions?

TLDR: another newb decided to post a question instead of researching this topic ad nauseam and wants to know if he should play around with truenas, unraid, freenas, or other software using older hardware, 8-16 cores, 16 to 64gigs ram.

2625
 
 
The original post: /r/datahoarder by /u/Metallica93 on 2025-03-14 04:53:28.

I'm creating my first Plex server and have not purchased any drive larger than 2 TB before. Right now, Western Digital is having a deal where two 12 TB drives are going for $200 each (i.e., ~$16.7/terabyte).
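
The per-terabyte figure above can be reproduced with a quick calculation. This is just a sketch of the arithmetic; the function name is made up for illustration, and the four-drive total assumes the same $200 price applies to every drive.

```python
def price_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the cost per terabyte for a single drive."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


# One 12 TB drive at $200: 200 / 12 = 16.666..., i.e. about $16.67/TB.
per_tb = price_per_tb(200, 12)
print(f"${per_tb:.2f}/TB")

# Assumption: four drives at the same price, giving $800 for 48 TB.
drives = 4
print(f"${drives * 200} for {drives * 12} TB")
```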

Is $15-17 good enough to buy four and take advantage of the limited-time offer or is that "Just buy a couple" territory?

How much do you usually spend new per terabyte? Used?
