this post was submitted on 30 Aug 2024

The original post: /r/datahoarder by /u/QuickNick123 on 2024-08-30 16:34:31.

Just need to get some excitement off my chest:

tl;dr My raidz2 lost 3 devices and continued working (mostly).

For the last week or so my LSI HBA has been suffering from the current heatwave, which results in it "losing" some drives every now and then. They go offline and come back online a second later. I have 16x 16TB HDDs in a zpool consisting of two raidz2 vdevs with 8 drives each.
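For anyone who wants to sanity-check the numbers for a layout like this, here is a rough back-of-the-envelope sketch. The function name is made up for illustration, and it only counts data drives versus parity drives; it ignores padding, metadata, and reserved space, so real usable capacity will be lower.

```python
# Rough capacity arithmetic for the layout described in the post:
# 16x 16TB drives, two raidz2 vdevs of 8 drives each.
# Assumption: ignores ZFS padding, metadata and reserved space.

DRIVE_TB = 16         # size of each drive
VDEVS = 2             # number of raidz2 vdevs in the pool
DRIVES_PER_VDEV = 8   # drives in each vdev
PARITY = 2            # raidz2 keeps two drives' worth of parity per vdev


def usable_tb(vdevs: int, drives_per_vdev: int, parity: int, drive_tb: int) -> int:
    """Approximate data capacity: non-parity drives times drive size."""
    return vdevs * (drives_per_vdev - parity) * drive_tb


raw_tb = VDEVS * DRIVES_PER_VDEV * DRIVE_TB
data_tb = usable_tb(VDEVS, DRIVES_PER_VDEV, PARITY, DRIVE_TB)

print(f"raw: {raw_tb} TB, approx. usable before overhead: {data_tb} TB")
```

Each vdev here can lose up to two drives without losing data, and that budget is per vdev, not per pool.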

Today, for the first time ever, one of my vdevs had three devices missing at the same time. I would have expected this to result in something catastrophic, make the pool inaccessible or maybe even corrupt it entirely.
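To put the "three devices missing" part in perspective, here is a deliberately simplified counting model of how many missing drives a raidz vdev can absorb. The function names are hypothetical, and this is not how ZFS actually decides vdev state (it tracks per-device state and per-block errors); it just shows why three missing drives in a raidz2 vdev is past the redundancy budget.

```python
# Simplified counting model, NOT ZFS's real state machine.
# Assumption: a raidz vdev with `parity` parity drives can reconstruct
# data as long as no more than `parity` drives are missing.

def redundancy_margin(parity: int, missing: int) -> int:
    """How many more drives the vdev could lose (negative = over budget)."""
    return parity - missing


def describe(parity: int, missing: int) -> str:
    """Human-readable summary of the counting model above."""
    if missing == 0:
        return "healthy"
    if redundancy_margin(parity, missing) >= 0:
        return "degraded, still within redundancy"
    return "beyond redundancy"


for missing in range(4):
    print(f"raidz2 with {missing} missing: {describe(2, missing)}")
```

By this count, three missing drives in one raidz2 vdev lands in the last bucket, which is why I expected the worst.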

But instead zpool status just told me that the pool is degraded, some files are inaccessible and I should use zpool status -v to get more details. That returned a list of files that can't be accessed at the moment.

I was still able to read from and write to the pool, even files that I knew must be located on the degraded vdev (since they were old and ZFS doesn't rebalance when adding new vdevs). I'm guessing writing worked because I had a second vdev that wasn't affected?

After I cleared the pool, it resilvered and everything was back to "No known data errors".

How great is this behavior? I was fully prepared to restore the entire pool from backup, but was pleasantly surprised by how hard ZFS works to keep your files accessible, even when a vdev becomes degraded beyond the point of available redundancy.
