Just need to get some excitement off my chest:
tl;dr: One of my raidz2 vdevs lost 3 devices and the pool continued working (mostly).
For the last week or so my LSI HBA has been suffering from the current heatwave, which results in it "losing" some drives every now and then. They go offline and come back online a second later. I have 16x 16TB HDDs in a zpool consisting of two raidz2 vdevs with 8 drives each.
Today, for the first time ever, one of my vdevs had three devices missing at the same time. I would have expected this to result in something catastrophic, such as making the pool inaccessible or maybe even corrupting it entirely.
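For anyone wondering why I expected the worst, here is the arithmetic as a quick Python sketch. It only counts parity against missing devices for a layout like mine; it does not model what ZFS actually does internally, and the class, function, and vdev names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RaidzVdev:
    """Toy model of a raidz vdev: only counts drives, parity, and missing devices."""
    name: str      # hypothetical label, not read from a real pool
    drives: int    # total drives in the vdev
    parity: int    # 2 for raidz2
    missing: int = 0


def classify(vdev: RaidzVdev) -> str:
    """Compare the number of missing devices with the vdev's parity level."""
    if vdev.missing == 0:
        return "all devices present"
    if vdev.missing <= vdev.parity:
        return "within redundancy"
    return "beyond redundancy"


# Assumed layout from the post: two raidz2 vdevs with 8 drives each.
pool = [
    RaidzVdev("raidz2-0", drives=8, parity=2, missing=3),
    RaidzVdev("raidz2-1", drives=8, parity=2, missing=0),
]

for vdev in pool:
    print(f"{vdev.name}: {vdev.missing} of {vdev.drives} missing -> {classify(vdev)}")
```

With three devices gone, the first vdev is one past what raidz2 parity can cover, which is why I assumed the whole pool was in trouble.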
But instead zpool status just told me that the pool was degraded, that some files were inaccessible, and that I should use zpool status -v to get more details. That returned a list of files that couldn't be accessed at the moment.
I was still able to read from and write to the pool, including files that I knew must be located on the degraded vdev (since they were old and ZFS doesn't rebalance when adding new vdevs). I'm guessing writing worked because I had a second vdev that wasn't affected?
After I cleared the pool's errors, it resilvered and everything was back to "No known data errors".
How great is this behavior? I was fully prepared to restore the entire pool from backup, but was pleasantly surprised by how hard ZFS works to keep your files accessible, even when a vdev becomes degraded beyond the point of available redundancy.