this post was submitted on 12 Jun 2024
The original post: /r/datahoarder by /u/1000Zebras on 2024-06-11 14:54:35.

Hi, 

I'm curious what you guys would implement in this situation. Above all, I want to maintain at least one solid, off-site backup of all of my data files. Beyond that, if something happens to my main on-site data drive, I'd like to reduce downtime as much as possible.

Here is my current setup as is relevant to the question at hand: 

  • OrangePi 5 Plus running dietpi (so pretty much just debian) as my main server on-site
  • One eMMC boot drive on the OrangePi containing the OS and all of my docker-compose files
  • Recently acquired 14TB external USB drive that houses purely my data for all of my docker containers (and then some outside of those, as well, but not much)
  • OrangePi is running Tailscale
  • A second RPi that lives at my brother's house, also running Tailscale (so any connection between the two will most likely go over the internet, but through Tailscale), with a second 14TB drive, identical to the first, connected to it and ready for data storage

What I'm wondering is what the best strategy may be for maintaining a backup of the main data drive on the secondary drive, ideally in a mirrored fashion. That way, were the main drive to fail, I could simply plug the secondary drive into the OrangePi, mount it at the same mountpoint the primary used, and be back up and running nearly immediately (once the drive was physically moved between locations, of course).

It's worth noting that, at present, I am dealing with nearly 4.5TB of data on the main data drive (also currently backed up to the cloud via Kopia and iDrive E2).
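
For a sense of scale, here is a rough back-of-the-envelope sketch of how long a first full copy of that data might take over the internet. The upload speeds and the 80% efficiency factor are purely illustrative assumptions, not measurements of my actual connection, and the function name is made up for this example.

```python
def transfer_days(data_tb: float, upload_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to push data_tb terabytes over an uplink.

    data_tb     -- amount of data in decimal terabytes (1 TB = 1e12 bytes)
    upload_mbps -- nominal upload speed in megabits per second (assumed value)
    efficiency  -- fraction of the nominal speed actually achieved (assumed value)
    """
    bits = data_tb * 1e12 * 8                           # bytes -> bits
    seconds = bits / (upload_mbps * 1e6 * efficiency)   # bits / (bits per second)
    return seconds / 86400                              # seconds -> days


if __name__ == "__main__":
    # Hypothetical upload speeds; substitute the real uplink speed.
    for mbps in (10, 20, 50, 100):
        print(f"{mbps:>4} Mbit/s: about {transfer_days(4.5, mbps):.1f} days for 4.5 TB")
```

Whatever the real numbers turn out to be, the initial copy is the expensive part. After that, only the day-to-day changes would need to cross the connection.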

I've been considering: 

  • Trying out lsyncd or DRBD in order to literally have the drives mirror each other in as near real time as the connection will allow. I have not used either of these tools yet, however, so I'm not familiar with exactly how they work behind the scenes. I also realize that it is a lot of data to keep in sync over an internet connection, especially at the file-level (lsyncd) or block-level (DRBD) granularity that I believe those tools are designed for. In "normal" usage, I am not adding or changing all that much data on a day-to-day basis, but were I to make any major shifts in organization, or suddenly add a lot more data into the mix, I'm wondering if the tools would be able to keep up
  • Running an rsync job over ssh at a specified interval (say, maybe, a couple of times a day) in order to keep the two up to date. I would of course run into the same problem as with the first option were I to make any drastic changes, but theoretically I'd eventually always have a 1 to 1 sync/backup between the two drives
  • Simply running some sort of backup program from the main OrangePi data drive to the RPi's data drive, again at whatever specified interval (say, maybe, daily). Were I to use Kopia, I'd probably have to run some sort of WebDAV server on the secondary RPi in order to facilitate backups between the two. Or, I suppose I could even expose the RPi's data drive through a MinIO instance and have Kopia back up via the S3 protocol, but this seems like a little bit of overkill, and it wouldn't necessarily be the sort of 1 to 1 sync I'm shooting for, as Kopia would organize the backup data in a fashion that it understands. This would be acceptable, though, as at the end of the day the most important thing is to have all of the data itself stored safely in both locations, one way or another.
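
To make the rsync option a bit more concrete, here is a minimal sketch of how the command for such a job might be assembled. The host name `rpi-offsite`, the `/mnt/data` paths, and the helper name `build_rsync_cmd` are all hypothetical placeholders; the flags themselves (`-a`, `-H`, `--delete`, `--partial`, `--bwlimit`, `--dry-run`, `-e ssh`) are standard rsync options.

```python
import shlex


def build_rsync_cmd(src, dest_host, dest_path, bwlimit_kbps=None, dry_run=False):
    """Build an rsync-over-ssh command that mirrors src onto dest_host:dest_path."""
    cmd = [
        "rsync",
        "-aH",        # archive mode, and preserve hard links
        "--delete",   # remove files on the destination that no longer exist on the source
        "--partial",  # keep partially transferred files so large transfers can resume
        "-e", "ssh",  # use ssh as the transport (here it would run over Tailscale)
    ]
    if bwlimit_kbps is not None:
        cmd.append(f"--bwlimit={bwlimit_kbps}")  # cap bandwidth, roughly in KB/s
    if dry_run:
        cmd.append("--dry-run")                  # only report what would change
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    cmd.append(src.rstrip("/") + "/")
    cmd.append(f"{dest_host}:{dest_path.rstrip('/')}/")
    return cmd


if __name__ == "__main__":
    # Hypothetical Tailscale host name and mountpoints.
    cmd = build_rsync_cmd("/mnt/data", "rpi-offsite", "/mnt/data",
                          bwlimit_kbps=2000, dry_run=True)
    print(shlex.join(cmd))
```

Because `--delete` removes files on the destination, it would be worth running with `dry_run=True` first and checking the output. The printed command could then be scheduled from cron or a systemd timer on the OrangePi, which is an assumption about the setup rather than something already in place.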

How would you guys go about keeping things in sync between the two data drives? Or should I just eschew that idea, given the limitations of the bandwidth/connection between the two, and go for straight backups using Kopia or some other backup system?

Please, if you have any thoughts on how you'd architect this scenario, I'd very much appreciate any and all perspectives/insights.

Hopefully that all makes sense. If you need anything clarified, by all means speak up and I'll do my best to address it.

Thank you so very much for your time, expertise, and patience with my rambling question. I look forward to hearing how people weigh in. 

Kind Regards,

LS
