this post was submitted on 19 Jan 2025
 
The original post: /r/datahoarder by /u/ontic00 on 2025-01-18 16:51:44.

TL;DR SingleFile saves Reddit content more accurately than Save Page WE, which does not display comments, buttons, or search bars correctly. Why could this be? File sizes also differ for Reddit content (~125MB for SingleFile, ~10MB for Save Page WE).

I've been a long-time user of Save Page WE (a browser extension that saves a page as a single HTML file using data URIs) and was recently trying to see how many of my extensions I could use on Android. I noticed that an alternative extension, SingleFile, is supported on both Firefox and Kiwi, whereas Save Page WE is supported only on Kiwi. Given that, and seeing that it's been a while since Save Page WE was updated (late 2023), I decided to do a more extensive comparison between Save Page WE and SingleFile. For the most part, they produced nearly identical-looking pages with similar file sizes. The one exception I found was Reddit: SingleFile was far more accurate at saving threads, and its files were also much larger (100-150MB vs ~10MB for Save Page WE). In the Save Page WE copies, the comments weren't clearly visible without inspecting the page, and the buttons and search bars looked different from the live site.

Since the Save Page WE file was smaller, I'm assuming it's not saving some sort of resource. I had Save Page WE show its list of unsaved resources, and they were:

https://accounts.google.com/gsi/style

https://emoji.redditmedia.com/p9sxc1zh1guz_t5_3nqvj/cat_blep

https://accounts.google.com/gsi/client

https://www.google.com/recaptcha/enterprise.js?render=6LfirrMoAAAAAHZOipvza4kpp_VtTwLNuXVwURNQ
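
One way to narrow down where the size gap comes from might be to tally what each saved file actually embeds. The snippet below is only a rough sketch, not how either extension works internally: it assumes both tools inline resources as data URIs, and the names `embedded_chars_by_type` and `report` are made up for illustration. It counts the encoded characters of data-URI payload per MIME type, which is an approximation (base64 text is roughly a third larger than the decoded bytes, and unusual quoting could be missed).

```python
import re
from collections import Counter

# Matches data URIs such as "data:image/png;base64,AAAA...".
# Group 1: optional MIME type. Group 2: the payload, read up to the next
# quote, closing parenthesis, or whitespace (a heuristic, not a full parser).
DATA_URI = re.compile(r"data:([\w.+-]+/[\w.+-]+)?[^,\"')\s]*,([^\"')\s]*)")


def embedded_chars_by_type(html: str) -> Counter:
    """Return the number of encoded payload characters per MIME type."""
    totals = Counter()
    for match in DATA_URI.finditer(html):
        # Data URIs without an explicit type default to text/plain.
        mime = match.group(1) or "text/plain"
        totals[mime] += len(match.group(2))
    return totals


def report(path: str) -> None:
    """Print the per-type totals for one saved page, largest first."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        totals = embedded_chars_by_type(handle.read())
    for mime, size in totals.most_common():
        print(f"{size:>12,}  {mime}")
```

Calling `report()` on the SingleFile copy and then on the Save Page WE copy of the same thread (the file names are whatever you saved them as) should show whether the extra ~100MB sits in images, fonts, stylesheets, or something else. If the totals for both files are small compared with the file sizes, the difference is more likely in the saved markup itself than in embedded resources.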

I'm an amateur when it comes to web archiving, mostly just using Save Page WE and Fireshot to save important webpages, so I was wondering if anyone who knows more about HTML and archiving has any idea why SingleFile saves Reddit content so much better than Save Page WE. Could it be related to one of Save Page WE's unsaved resources above, or are there other possible explanations? I think I'm on the verge of switching to SingleFile because of its more frequent updates, its customizable infobar, and the fact that it seems to save at least Reddit content better, and possibly other sites I haven't run into yet. Thank you for any knowledge!
