The original post: /r/datahoarder by /u/digitalsignalperson on 2024-08-29 23:15:19.
While converting a setup from ZFS to XFS I had an interesting real-world test to try. I'm experimenting with keeping my own checksums (a combo of reflink snapshots and efficient xxHash scrubbing), so the test compares the speed of mass checksumming all files with:
find . -type f | parallel -j $(($(nproc)-2)) -X xxhsum -H3 > /tmp/test.xxh3
The set of files I'm testing is from a ZFS dataset with ~130 git repos totaling 730k files. These are mostly small files and with LZ4 compression the reported compressratio by ZFS is 1.54x. The space used by refer is shown as 18GB. On the XFS side du reports it as 29G.
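As a rough sanity check on those sizes, here is a small sketch that compares the XFS du figure against the ZFS refer figure and estimates the average file size. The variable names are made up for illustration, and treating "18GB" and "29G" as the same unit is an assumption.

```shell
# Figures copied from the post; variable names are hypothetical.
zfs_refer_gb=18      # "refer" as reported by ZFS (compressed)
xfs_du_gb=29         # du on the XFS copy (uncompressed)
file_count=730000    # ~730k files

# Ratio of uncompressed-on-XFS to compressed-on-ZFS space.
observed_ratio=$(awk -v a="$xfs_du_gb" -v b="$zfs_refer_gb" 'BEGIN { printf "%.2f", a / b }')

# Average space per file on XFS in KiB, assuming the du figure is in GiB.
avg_kib=$(awk -v g="$xfs_du_gb" -v n="$file_count" 'BEGIN { printf "%.1f", g * 1024 * 1024 / n }')

echo "XFS du / ZFS refer: ${observed_ratio}x (ZFS reports compressratio 1.54x)"
echo "average space per file on XFS: ${avg_kib} KiB"
```

The two ratios are close but not identical. One plausible contributor is that du counts allocated blocks, which rounds small files up, though that is a guess rather than something measured here.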
This should provide an interesting test of whether XFS is slower dealing with small files, or if ZFS has an advantage in a case where there is a decent compression ratio. My original thought was that ZFS/Btrfs may be faster than ext4/XFS because of the faster read speeds with compressed data (per this benchmark).
The setup:
- 2x TEAMGROUP MP34 4TB with DRAM SLC Cache 3D NAND TLC NVMe 1.3 PCIe Gen3x4
- the original ZFS pool is on LUKS in raid 1, including params: compression=lz4, atime=off
- I detached the 2nd drive from the pool and reformatted it for the stack of mdadm -> luks -> XFS
- mdadm --create /dev/md0 --level=1 --force --raid-devices=1 --size=3709G /dev/nvme0n1p2 so that I can add the 2nd drive to RAID1 later
- cryptsetup on /dev/md0 with same params as original pool
- mkfs.xfs with 2KB inodes (I'm experimenting with extended attributes)
- copying the dataset over with rsync -a took 5 minutes averaging 93MB/sec
- ZFS is using zfs_arc_max set to 4GB so I'm not bothering clearing l2arc which might be favorable to ZFS
- otherwise doing echo 3 > /proc/sys/vm/drop_caches before runs
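The drop-caches-then-run procedure can be sketched as a small timing helper. This is a minimal sketch, not the exact method used in the post: run_cold and DROP_CACHES are hypothetical names, the sleep command is a stand-in workload, and date +%s.%N assumes GNU date.

```shell
# run_cold CMD...: optionally drop the Linux page cache, run CMD,
# and print the elapsed wall-clock time in seconds.
# Set DROP_CACHES=1 (as root) to enable the cache drop; it is off by default.
run_cold() {
    if [ "${DROP_CACHES:-0}" = "1" ]; then
        sync                                # flush dirty pages so they can be evicted
        echo 3 > /proc/sys/vm/drop_caches   # requires root
    fi
    rc_start=$(date +%s.%N)
    "$@" > /dev/null
    rc_end=$(date +%s.%N)
    awk -v s="$rc_start" -v e="$rc_end" 'BEGIN { printf "%.1f\n", e - s }'
}

# Three runs of a placeholder workload.
for i in 1 2 3; do
    echo "run $i: $(run_cold sleep 0.1)s"
done
```

Because the helper takes a single command, a pipeline such as the find | parallel one would need to be wrapped, for example by passing it to sh -c as one quoted string.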
Results:
- ZFS three runs: 1m39s, 2m4s, 1m51s; best 1m39s
- XFS three runs: 24.3s, 25.9s, 26.6s; best 24.3s
XFS was about 4x faster, comparing the best runs (99s vs 24.3s).
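The speedup can be recomputed from the run times listed above. In this sketch the timings are copied from the post and converted to seconds, and the summarize function is a hypothetical helper name.

```shell
# Run times in seconds (1m39s = 99, 2m4s = 124, 1m51s = 111).
zfs_runs="99 124 111"
xfs_runs="24.3 25.9 26.6"

# summarize "ZFS times" "XFS times": print best-run and mean-run speedups.
summarize() {
    awk -v z="$1" -v x="$2" '
    # Smallest value in a space-separated list.
    function best(s,    n, i, a, m) {
        n = split(s, a, " ")
        m = a[1] + 0
        for (i = 2; i <= n; i++) if (a[i] + 0 < m) m = a[i] + 0
        return m
    }
    # Arithmetic mean of a space-separated list.
    function mean(s,    n, i, a, sum) {
        n = split(s, a, " ")
        for (i = 1; i <= n; i++) sum += a[i]
        return sum / n
    }
    BEGIN {
        printf "best-run speedup: %.2fx\n", best(z) / best(x)
        printf "mean-run speedup: %.2fx\n", mean(z) / mean(x)
    }'
}

summarize "$zfs_runs" "$xfs_runs"
```

Using the mean instead of the best run gives a slightly larger gap, since the ZFS runs varied more between attempts than the XFS runs did.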
Think this was a fair test? Possible caveat that the ZFS pool is at 66% capacity and has a bunch of other datasets and snapshots, but I'm not sure if that matters.