this post was submitted on 20 Sep 2024
 
The original post: /r/datahoarder by /u/--dany-- on 2024-09-20 14:14:50.

So I had a directory with 4 million images, each about 100 KB to 1 MB... And it turns out that reading anything from that directory is extremely slow. We're talking about rsync or tar running at a few GB/hr. It never occurred to me it would be this slow. I used to serve the images through a web server. No idea what happened, but now I'm paying off my tech debt by archiving everything into one big tarball and then backing it up. I know I should have split the directory by file prefix... By contrast, another directory with 600 GB of images/videos, organized in subdirectories by date, took only an hour to archive.
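For anyone wondering what "split the directory by file prefix" could look like in practice, here is a minimal Python sketch. It is not what I actually ran. The names `shard_path` and `shard_directory`, the two-level layout, and the two-character width are all assumptions made for illustration. It also assumes the destination lives outside the flat source directory.

```python
import os
import shutil
from pathlib import Path


def shard_path(root: Path, name: str, depth: int = 2, width: int = 2) -> Path:
    """Map 'abcdef.jpg' to root/ab/cd/abcdef.jpg (hypothetical layout)."""
    stem = Path(name).stem
    # Pad short names so every file gets `depth` levels, and replace dots
    # so a prefix can never turn into '.' or '..'.
    padded = stem.ljust(depth * width, "_").replace(".", "_")
    parts = [padded[i * width:(i + 1) * width] for i in range(depth)]
    return root.joinpath(*parts, name)


def shard_directory(src: Path, dst: Path, depth: int = 2, width: int = 2) -> int:
    """Move every regular file in src into prefix subdirectories under dst."""
    moved = 0
    # os.scandir yields entries lazily instead of building one huge list.
    with os.scandir(src) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue  # skip subdirectories and symlinks
            target = shard_path(dst, entry.name, depth, width)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(entry.path, str(target))
            moved += 1
    return moved
```

Calling `shard_directory(Path("images"), Path("images_sharded"))` would move `images/abcdef.jpg` to `images_sharded/ab/cd/abcdef.jpg`. If the file names share a long common prefix (camera-style names, for example), the buckets will be very uneven, and a prefix taken from a hash of the name is probably a better key.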

But can anybody enlighten me: why is it so slow? And hopefully nobody on this sub will repeat my stupidity. :D

Some additional info: CPU usage is about 4%, memory 2%, and disk I/O is a few hundred KB/s. The file system is ext4 on RAID0. These images are not tiny text files, so hopefully the disks are not doing purely random I/O. And the disks are only about 70% full, so severe fragmentation is an unlikely cause.
