this post was submitted on 17 Jul 2025
The original post: /r/datahoarder by /u/exiledfan on 2025-07-17 14:02:21.

I've figured out a way to use wget to mirror login-only Tumblr blogs, and I've worked out a script that removes the seemingly inescapable privacy popup. Each individual HTML file still needs to be revised, but this is a massive win in my book.
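For anyone wanting to try the popup cleanup on their own mirror, here is a minimal sketch of one way to do it. It is not the script mentioned above. It assumes the popup lives in a single element with a known `id`; the id `privacy-popup` is a made-up placeholder, so check the saved HTML for the real one. It also assumes the markup closes its tags properly inside the popup, since an unclosed `<p>` or `<li>` there would throw off the depth counting.

```python
from html.parser import HTMLParser
from pathlib import Path

# Tags that never have a closing tag, so they must not change the nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ElementStripper(HTMLParser):
    """Re-emit a document while dropping the element with a given id."""

    def __init__(self, target_id):
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.out = []
        self.skip_depth = 0  # > 0 while inside the element being dropped

    def _emit(self, text):
        if not self.skip_depth:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth += 1
            return
        if dict(attrs).get("id") == self.target_id:
            if tag not in VOID:
                self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if dict(attrs).get("id") != self.target_id:
            self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def strip_element(html, target_id):
    """Return html without the element whose id equals target_id."""
    parser = ElementStripper(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def clean_tree(root, target_id="privacy-popup"):
    """Rewrite every .html file under root in place (placeholder id!)."""
    for path in Path(root).rglob("*.html"):
        text = path.read_text(encoding="utf-8", errors="replace")
        path.write_text(strip_element(text, target_id), encoding="utf-8")
```

Calling `clean_tree("path/to/mirror", "the-real-popup-id")` would then rewrite the whole mirror in one pass, so it is worth running it on a copy first.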

However, the problem now is that wget only scrapes a handful of posts, presumably because it gets tripped up by the pagination.

Does anyone have any ideas on how this can be worked around?

(TumblThree is not a viable alternative, as it only downloads a fraction of the posts in the first place.)
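One possible workaround for the pagination, offered as a sketch rather than a confirmed fix: if the blog exposes classic numbered pages of the form `/page/N`, the page URLs can be generated up front and handed to wget as a list, so it does not have to discover the "next page" link itself. The `/page/N` pattern, the example blog URL, and the page count are all assumptions here; a blog that loads posts by endless scrolling or through the dashboard may not have such pages at all.

```python
def page_urls(base_url, last_page):
    """Build the list of paginated URLs for a blog.

    Assumes the blog serves page N at <base_url>/page/N and that the
    first page is the base URL itself. last_page has to be found by
    hand, for example by trying high page numbers in a browser.
    """
    base = base_url.rstrip("/")
    urls = [base + "/"]
    urls.extend(f"{base}/page/{n}" for n in range(2, last_page + 1))
    return urls


def write_url_list(path, urls):
    """Write one URL per line, the format wget expects for --input-file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")
```

After something like `write_url_list("urls.txt", page_urls("https://example.tumblr.com", 40))`, the list can be passed to wget with `--input-file=urls.txt`, alongside whatever login options are already in use (for instance `--load-cookies` with an exported cookie file, if that is how the login is handled).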
