this post was submitted on 23 Feb 2025

It's A Digital Disease!


This is a sub that aims at bringing data hoarders together to share their passion with like minded people.

The original post: /r/datahoarder by /u/m4d40 on 2025-02-23 14:00:34.

Hi, I have a lot of large txt, csv, and sql (dump) files and I'm wondering what the best way is to organize them and make them easier to search.

First I thought about pushing everything into a NoSQL database, but it would be over 1 TB, which I think would be overkill to ever set up and run queries against.

My next thought was to search for common IDs or fields and build my own tree structure of files, with an index-like file for each one that holds references to the big files where the detailed data about that ID/field is stored. Then, if I want detailed information, another script could go to the specific files and lines and grep/collect it.
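
To make the index idea a bit more concrete, here is a minimal Python sketch of how it might look. Everything in it is an assumption for illustration: the function names `build_index` and `lookup` are made up, the files are assumed to be simple UTF-8 CSVs with a header row and no line breaks inside quoted fields, and the index is kept in memory as a plain dict. It stores byte offsets rather than line numbers so the lookup can seek straight to a row without rereading the file.

```python
import csv
from collections import defaultdict


def build_index(paths, key_column):
    """Map each key value to a list of (path, byte_offset) pairs.

    Assumes each file is a UTF-8 CSV with a header row and that no
    field contains an embedded newline (hypothetical simplification).
    """
    index = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            # Parse the header to find which column holds the key.
            header = next(csv.reader([f.readline().decode("utf-8")]))
            col = header.index(key_column)
            while True:
                offset = f.tell()  # byte position where this row starts
                line = f.readline()
                if not line:
                    break  # end of file
                if not line.strip():
                    continue  # skip blank lines
                row = next(csv.reader([line.decode("utf-8")]))
                if len(row) > col:
                    index[row[col]].append((path, offset))
    return index


def lookup(index, key):
    """Return the raw lines for a key by seeking to each stored offset."""
    rows = []
    for path, offset in index.get(key, []):
        with open(path, "rb") as f:
            f.seek(offset)
            rows.append(f.readline().decode("utf-8").rstrip("\r\n"))
    return rows
```

Usage would be something like `idx = build_index(["a.csv", "b.csv"], "id")` followed by `lookup(idx, "2")`, which returns the matching raw lines from both files. For 1 TB of data the index itself probably would not fit in memory, so it would need to be written to disk in some form; that part is left out here.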

(I also thought about Elasticsearch, Apache Solr, or something similar, but I have no knowledge in this area yet.)
