Hi,
I have done quite a lot of research on still-image (and video, for that matter) and pure-text compression.
I have found that JPEG XL (libjxl 0.10.3) seems to be the best in terms of time/performance ratio and in its ability to perform well in a lossless context on common already-compressed images like .jpg and .png. The ability to losslessly transcode JPEG input files instead of fully re-encoding them is also nice.
Through testing I have found that in lossless mode (-q 100), effort 7 is the sweet spot. Anything above just takes too much time (especially effort 10; in lossless mode, the extra compression you gain per effort level shrinks while the time needed grows steeply).
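To make the settings concrete, here is a minimal sketch of the cjxl invocation described above (lossless via -q 100, effort via -e). It only assumes the cjxl binary from libjxl is on your PATH; the file names are placeholders.

```python
# Sketch of the lossless JXL settings discussed above.
# Assumes the cjxl binary (libjxl) is installed; file names are placeholders.
import shutil
import subprocess

def jxl_lossless_cmd(src: str, dst: str, effort: int = 7) -> list[str]:
    """Build a cjxl command line: -q 100 means lossless, -e is effort (1-10)."""
    return ["cjxl", src, dst, "-q", "100", "-e", str(effort)]

cmd = jxl_lossless_cmd("page_001.png", "page_001.jxl")
print(" ".join(cmd))

# Only actually run the encoder if cjxl is available on this machine:
if shutil.which("cjxl"):
    subprocess.run(cmd, check=True)
```

The same command with a .jpg input triggers the lossless JPEG transcoding path rather than a full re-encode.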
I have tested HEIC, TIFF, AVIF and WebP, and I have also tested the still quite experimental WebP2 a lot. WebP2 is impressive but compresses less and takes more time (in lossless mode at least). It seems this codec still needs time to reach maturity. (That said, I find that it performs surprisingly well on very small images, e.g. less than 25 KB.)
For JXL, one can hope for roughly 20% lossless savings on already-compressed standard JPEGs and 40 to 50% on PNGs. I should mention that I measured these percentages on manga images, both grayscale and color; in both cases these are images that most likely tend to compress well because of their large areas of flat color.
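As rough arithmetic on what those percentages mean in practice (the library sizes below are made-up illustration values, not measurements):

```python
# Back-of-the-envelope for the savings figures above (20% on JPEG, 40-50% on PNG).
# The 10 GB library sizes are hypothetical, chosen only to illustrate the math.
def compressed_size(size_mb: float, saving_pct: float) -> float:
    """Size after compression, given a percentage saved."""
    return size_mb * (1 - saving_pct / 100)

jpeg_library_mb = 10_000  # a hypothetical 10 GB collection of JPEGs
png_library_mb = 10_000   # a hypothetical 10 GB collection of PNGs

print(compressed_size(jpeg_library_mb, 20))  # 8000.0 MB after JXL transcoding
print(compressed_size(png_library_mb, 45))   # 5500.0 MB at the 40-50% midpoint
```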
Text compression is more complex. I have tested the well-known formats and algorithms: ZIP, 7-Zip, Brotli, Zstd, BZip2, XZ, Gzip, ARC, and *PAQ. I haven't done a lot of tests yet, but so far, on diverse text-based files, I find that ARC > *PAQ > BZip2 > Brotli > Zstd.
It is quite surprising that the now quite old ARC performs so well. I should mention that I wasn't compressing one big file of tens or hundreds of MB, but rather a multitude of ~1 MB TAR files. The ratio difference between all the compression algorithms isn't that big, but still.
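For anyone who wants to reproduce this kind of comparison quickly, here is a small harness in the same spirit, limited to the compressors that ship with Python's standard library (zlib for the DEFLATE family behind ZIP/Gzip, bz2 for BZip2, lzma for XZ/7-Zip); ARC, Brotli, Zstd and *PAQ are not in the stdlib, so they are left out.

```python
# Compare stdlib compressors on the same input and report
# compressed/original size ratios (lower is better).
import bz2
import lzma
import zlib

def ratios(data: bytes) -> dict[str, float]:
    sizes = {
        "zlib": len(zlib.compress(data, 9)),      # DEFLATE (ZIP/Gzip family)
        "bz2": len(bz2.compress(data, 9)),        # BZip2
        "lzma": len(lzma.compress(data, preset=9)),  # LZMA (XZ/7-Zip family)
    }
    return {name: size / len(data) for name, size in sizes.items()}

# A repetitive text sample, standing in for a ~1 MB TAR of text files:
sample = ("some repetitive text-based content\n" * 2000).encode()
for name, r in sorted(ratios(sample).items(), key=lambda kv: kv[1]):
    print(f"{name}: {r:.4f}")
```

Feeding each ~1 MB TAR file through this loop gives a per-algorithm ratio table comparable to the ranking above.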
For better compression, I read the Large Text Compression Benchmark. It truly shows jaw-dropping compression ratios, but more often than not at the expense of compression time, and therefore of usability. And a lot of these algorithms seem to be (or are) abandonware (I had to go to archive.org to find the files for NanoZip!) and were never meant to be used in "production".
But there are some nice algorithms that seem reasonable time-wise, seem alive, and could be used in "production", the likes of: zpaq 1.09, BCM 2.03, BSC 3.25 (maybe bsc-m03 0.4.0), pcompress 3.1.
One can hope for compression savings of 40 to 80% on pure text-based files. That is not negligible: one could compress 1 TB into maybe just 200 GB. The savings in HDD space, and therefore in money, can be interesting.
What are your thoughts on the need to losslessly re-encode/compress versus just leaving the originals as they are?