I just installed Memex today to help with this problem for myself. I would really like an extension that records everything and saves it as WARC or some other replayable format. I hope Memex can do what I want right now, which is to answer "Crap, which web page was that quote/keyword/idea on?". Chrome history is iffy at best.
SingleFile [1] can do that, i.e. automatically save viewed pages and/or bookmarks, but it will save pages in HTML. Alternatively, SingleFileZ [2] can do the same but will produce zip files (disguised as HTML files). Disclosure: I'm the author.
I'm using something similar, I believe. I simply wrote a Puppeteer-automated browser that goes through every page and saves it as `.mhtml`.
This works quite well for my purpose. I was archiving a site with content that I pay for and that sits behind my login.
I often use material from it when I'm offline and hence needed to put together this hack.
The below code does the job of saving the page as a single file.
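A minimal sketch of how saving a page as a single `.mhtml` file can look with Puppeteer, under stated assumptions; it is not the commenter's original code. It uses the Chrome DevTools Protocol command `Page.captureSnapshot`. The names `fileNameFor` and `savePages` are made up for illustration, and the login step for paid content is left out.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Hypothetical helper: turn a URL into a safe file name ending in .mhtml.
function fileNameFor(pageUrl) {
  const { hostname, pathname } = new URL(pageUrl);
  const slug = (hostname + pathname)
    .replace(/[^a-zA-Z0-9.-]+/g, '_') // collapse unsafe characters into "_"
    .replace(/^_+|_+$/g, '');         // trim leading/trailing "_"
  return `${slug}.mhtml`;
}

// Hypothetical driver: visit each URL and write one MHTML snapshot per page.
async function savePages(urls, outDir) {
  // Loaded lazily; assumes the puppeteer package is installed separately.
  const puppeteer = require('puppeteer');
  await fs.mkdir(outDir, { recursive: true });
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    for (const url of urls) {
      // Wait until the network is mostly idle so late resources are included.
      await page.goto(url, { waitUntil: 'networkidle2' });
      const cdp = await page.target().createCDPSession();
      const { data } = await cdp.send('Page.captureSnapshot', { format: 'mhtml' });
      await fs.writeFile(path.join(outDir, fileNameFor(url)), data);
      await cdp.detach();
    }
  } finally {
    await browser.close();
  }
}
```

Calling `savePages(['https://example.com/'], './archive')` would write `./archive/example.com.mhtml`. A real run against a site behind a login would also need to sign in or reuse cookies before the loop.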
I've been looking for something like this for a while now, to store all pages I visit into a personal archive, but all the options I found either involved setting up a proxy and MITMing all your requests (too much effort to set up) or saved to a format I could not easily access.
So far, SingleFile looks like a perfect fit, thanks!
produces replayable archives in a format that doesn't have the issues of WARC. It's a work in progress tho and currently archives everything, instead of just what you select.
You can of course edit the archive by hand. It's very simple to do, but not as simple as being able to select only what you want.
I think there's a domain whitelist option that works per archive.
Sites that use JavaScript-based cache-busting code are probably the hardest to archive. Fetching CDN.com/jQuery.js?rnd=63926195 isn't exactly easy to replay without hand-coded workarounds, because the random value differs on every load.
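One possible hand-coded workaround, sketched under assumptions: strip known cache-busting query parameters before using a URL as the archive lookup key, so that requests differing only in the random value map to the same stored resource. The parameter list and the `normalizeUrl` name are hypothetical; real sites use many different names, so the list would need tuning per site.

```javascript
// Hypothetical list of query parameters that only exist to defeat caching.
const CACHE_BUSTERS = new Set(['rnd', '_', 'cb', 'cachebust']);

// Return the URL with cache-busting parameters removed, for use as a lookup key.
function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  // Copy the keys first, since deleting while iterating would skip entries.
  for (const name of [...url.searchParams.keys()]) {
    if (CACHE_BUSTERS.has(name.toLowerCase())) {
      url.searchParams.delete(name);
    }
  }
  return url.toString();
}
```

With this, `normalizeUrl('https://CDN.com/jQuery.js?rnd=63926195')` and the same URL with any other `rnd` value both produce the same key. It does not help when the random value is embedded in the path or generated under an unknown parameter name.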