I have some very large collections of files organized into nested folder structures, currently spread across a few 8 TB USB drives. All in all it's more than 100 million files, and I have a particular read pattern I usually follow where it would make sense to split them up across different drives based on the last part of the filename.
For example, *A.data should be stored on drive 1, *B.data on drive 2, and *C.data on drive 3. The * is an incremented number that is the same for all three files, and it basically represents a timestamp. The program I use to access these files always reads them in groups of three, so this organization scheme optimizes throughput 3x.
This is a program I wrote, and that's how I currently lay out the files across different drive letters. It's become a pain to manage the code, though, so I want to offload that functionality and just have my program think it's reading everything from a single drive letter.
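To make the layout concrete, here's a minimal sketch of the kind of suffix-based routing I mean (illustrative only; the drive letters, the `.data` extension, and the helper names are just examples, not my actual code):

```python
from pathlib import Path

# Hypothetical mapping from the trailing letter of the filename stem
# to the drive root where that file lives.
DRIVE_MAP = {"A": "D:/", "B": "E:/", "C": "F:/"}

def drive_for(filename: str) -> str:
    """Pick the drive root for a file named like '123456A.data'."""
    stem = Path(filename).stem   # e.g. '123456A'
    return DRIVE_MAP[stem[-1]]   # route on the last letter, e.g. 'A'

def group_paths(timestamp: int) -> list[str]:
    """Build the three sibling paths that get read together."""
    return [
        str(Path(DRIVE_MAP[letter]) / f"{timestamp}{letter}.data")
        for letter in ("A", "B", "C")
    ]
```

The point is that each group of three files lands on three different spindles, so the three reads never contend for the same disk. This is the logic I'd like DrivePool (or something like it) to take over.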
Can DrivePool handle that kind of placement rule based on the last part of a filename? And does its performance suffer much compared to native NTFS once 100+ million files are involved?
Edit: in case it's not obvious, my program is multithreaded so it issues multiple file read requests to the OS in parallel.