The original post: /r/datahoarder by /u/garden-3750 on 2025-01-01 14:01:38.

The majority of insecure HTTP websites are likely parked and/or abandoned domains. I have a reasonable amount of experience with this, having used Firefox's HTTPS-Only Mode since its introduction in late 2020.

The only major HTTP-only websites I recall having encountered are specific Wikidot wikis (e.g. http://darksouls.wikidot.com/), Hardcore Gaming 101 and Projekti Lönnrot (a Project Gutenberg-like undertaking for Finnish literature).

Since the HTTP-only sites tend to be basic HTML pages, archiving should be simple: mirroring with wget may be viable (for personal use), and the URLs can be scraped (ideally from the site map) and then fed into the Wayback Machine.
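
A rough sketch of the "scrape the site map, feed the Wayback Machine" part, in case it helps. It assumes the site publishes a standard XML sitemap and that the Wayback Machine's Save Page Now address (https://web.archive.org/save/ followed by the page URL) accepts plain GET requests without a login. The function names, the User-Agent string and the 10-second delay are my own placeholders, not anything official. Note that a sitemap index file also uses `<loc>` entries, but those point at further sitemaps rather than pages.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

# Save Page Now prefix; assumed to accept an unauthenticated GET request.
SAVE_PREFIX = "https://web.archive.org/save/"


def extract_urls(sitemap_xml):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for elem in root.iter():
        # Drop any XML namespace, e.g. "{http://...}loc" -> "loc".
        if elem.tag.rsplit("}", 1)[-1] == "loc" and elem.text:
            urls.append(elem.text.strip())
    return urls


def save_request_url(page_url):
    """Build the address that asks the Wayback Machine to capture page_url."""
    return SAVE_PREFIX + page_url


def submit_all(urls, delay_seconds=10.0):
    """Request a capture of each URL, pausing between requests to stay polite."""
    for url in urls:
        request = urllib.request.Request(
            save_request_url(url),
            headers={"User-Agent": "sitemap-archiver-sketch"},  # placeholder
        )
        try:
            with urllib.request.urlopen(request, timeout=60) as response:
                print(response.status, url)
        except OSError as exc:  # covers URLError, HTTPError and timeouts
            print("failed", url, exc)
        time.sleep(delay_seconds)
```

To use it, download the site's sitemap, pass its contents to `extract_urls`, and hand the resulting list to `submit_all`. The delay is a guess; captures may be rate-limited, so slower is probably safer for large sites.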


There is one list on GitHub; it seems unmaintained.