The original post: /r/datahoarder by /u/Lav0c on 2024-08-16 04:36:42.
The tl;dr lists what I need and want. Below are the options I've looked into and what their issues are.
Snapraid:
good:
- stable
- 2+ parity drives supported
- able to just chuck in another drive of any make/model/size (this is crucial)
bad:
- not real-time
- absolutely destroys resources when syncing or scrubbing (CPU, RAM and disk usage) - de-prioritising the jobs helps a little, see the sketch below
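As an aside, the thing that has made the sync/scrub pain more bearable for me is running the jobs at the lowest CPU and I/O priority and only scrubbing a slice of the array per run. A minimal sketch (the percentages are just placeholders, tune to taste):

    # run the sync at the lowest CPU and I/O priority so it doesn't starve everything else
    nice -n 19 ionice -c3 snapraid sync

    # scrub only ~5% of the array per run, and only blocks not checked in the last 10 days
    nice -n 19 ionice -c3 snapraid scrub -p 5 -o 10

This doesn't reduce the total work SnapRAID does, it just keeps it out of the way of whatever else the box is doing.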
BTRFS:
good:
- tons of cool features like block-level de-duplication (via reflinks/dedupe tools) and copy-on-write snapshots that effectively store changes as diffs (of a sort)
- RAID 6-ish support
- real-time parity calculation
- real-time error checking and file repair (silently fixes bit rot when a file is read, doesn't match its checksum, AND a good parity/mirror copy exists)
bad:
- RAID 6 still has the write hole issue that apparently got fixed for RAID 5, so it's not considered stable/reliable (the usual partial mitigation is sketched after this list)
- spins up all drives when accessing a single file
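For context, the advice I keep seeing for btrfs raid5/6 is to keep metadata out of the raid5/6 profile entirely and scrub regularly so checksum errors get repaired from parity. Roughly what that looks like - device names and the mount point are placeholders, not my actual hardware:

    # data striped as raid6, metadata mirrored three ways (raid1c3) instead of raid6
    mkfs.btrfs -d raid6 -m raid1c3 /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # scrub in the foreground, then check the per-device error counters
    btrfs scrub start -B /mnt/pool
    btrfs device stats /mnt/pool

None of this closes the write hole, it just limits the blast radius to data blocks and makes silent corruption visible.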
UNRAID:
good:
- stable
- up to 2 parity drives supported
- add a drive of any size at any time
bad:
- spins up all drives when accessing a single file
- doesn't checksum on read to catch bit rot (third-party plugins can sort of bolt this on, but relying on them is disconcerting to say the least)
- proprietary
ZFS:
good:
- stable
- 2+ drive redundancy
- ability to throw in another drive (caveat later on)
- real-time parity calculation
- real-time error correction
bad:
- spins up all drives when accessing a single file
- you can't really mix sizes: every disk in a raidz vdev only contributes the capacity of the smallest member, so 12TB 12TB 10TB 8TB basically turns into 8TB 8TB 8TB 8TB of usable raw space (see the sketch below)
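To illustrate (pool name and device paths are made up, and as far as I know -f is needed because zpool complains about mismatched member sizes by default):

    # raidz2 vdev built from the four mismatched drives above;
    # each member is treated as if it were only 8TB, the size of the smallest disk
    zpool create -f tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
    zpool list tank   # reported size reflects 4 x 8TB, minus two disks' worth of parity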
MDADM:
good:
- stable
- 2+ drive redundancy
bad:
FlexRAID: dead
I've also seen mentions of Ceph offering a RAID 6-like erasure-coded setup that can take mismatched drives, but I haven't done enough research into it yet.
From everything I know, my options are as follows.
BTRFS RAID 6
Accept that it doesn't have the most reliable/safest RAID 6 implementation and do my best with my current understanding, my decent scripting skill (limited Linux knowledge), ChatGPT, and prayer: put 2 SSDs in RAID 1 for metadata, write scripts that watch for anything that could indicate a write hole and fix it before doing anything else with the array, and keep my own logs that I'll have to check and hope are actually working. (A rough sketch of that kind of monitoring script follows.)
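Very rough sketch of what I mean by "monitor and stop before making it worse" - the mount point and alert address are placeholders, and this only catches errors btrfs itself reports, not the write hole directly:

    #!/usr/bin/env bash
    # Check btrfs error counters before doing anything else with the array;
    # if anything is non-zero, alert and bail, otherwise run a repairing scrub.
    set -euo pipefail

    MOUNT=/mnt/pool          # placeholder mount point
    ALERT=me@example.com     # placeholder alert address

    # --check makes 'device stats' exit non-zero if any error counter is non-zero
    if ! btrfs device stats --check "$MOUNT" >/dev/null; then
        # 'mail' assumes mailx / a local MTA is set up
        btrfs device stats "$MOUNT" | mail -s "btrfs errors on $MOUNT" "$ALERT"
        exit 1
    fi

    # scrub verifies checksums and repairs from the good copy/parity where possible
    btrfs scrub start -B -d "$MOUNT"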
Frankenstein by essentially writing my own software (MergerFS, BTRFS, PAR2, Snapraid, scripting hell)
Go full-crazy and glue it all together myself: MergerFS for pooling, BTRFS for real-time integrity checks (plus all its other bonuses, e.g. file de-dupe), with those checks triggering repairs via either PAR2 (file-level) or Snapraid (for anything PAR2 can't fix). A sketch of the repair flow is below.
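The core of the "scripting hell" would be something like this - the file naming convention, the array mount prefix and the overall flow are all assumptions on my part, not a finished design:

    #!/usr/bin/env bash
    # Given a file flagged bad (e.g. by a btrfs scrub), try a file-level PAR2
    # repair first, and fall back to a SnapRAID fix for just that file.
    set -euo pipefail

    FILE="$1"                # path of the damaged file
    PAR2SET="${FILE}.par2"   # assumed naming convention for the PAR2 set

    if [ -e "$PAR2SET" ] && par2 repair "$PAR2SET"; then
        echo "repaired via PAR2: $FILE"
    else
        # SnapRAID filters are relative to the array, so strip the pool prefix (placeholder)
        snapraid fix -f "${FILE#/mnt/pool/}"
    fi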
MergerFS + Snapraid
Settle for my current setup, which is lacking/disappointing/frustrating (the gist of it is below for reference)
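For anyone unfamiliar, this is roughly what that setup looks like - the paths, drive count and mount options here are generic placeholders rather than my exact config:

    # /etc/snapraid.conf - two parity drives, content files spread across disks
    parity   /mnt/parity1/snapraid.parity
    2-parity /mnt/parity2/snapraid.2-parity
    content  /var/snapraid/snapraid.content
    content  /mnt/disk1/snapraid.content
    data d1  /mnt/disk1
    data d2  /mnt/disk2

    # /etc/fstab - pool only the data disks with mergerfs; parity drives stay outside the pool
    /mnt/disk*  /mnt/pool  fuse.mergerfs  defaults,allow_other,category.create=mfs,moveonenospc=true  0 0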
tl;dr
need:
- to have a stable system with a minimum of 2 parity or redundant drives
- ability to add drive of any size to array
would like:
- real-time parity
- silent error fixing
- only spin up the one drive needed when accessing a single file