Hacker News
Show HN: SingleFile is finally available on Safari (macOS/iOS) (apps.apple.com)
91 points by gildas on Nov 17, 2022 | 50 comments


Related:

Feel the power of the Manifest v3 - https://news.ycombinator.com/item?id=33063619 - Oct 2022 (273 comments)

SingleFile: Save a complete web page into a single HTML file - https://news.ycombinator.com/item?id=30527999 - March 2022 (240 comments)

Show HN: SingleFile Lite, new version of SingleFile compatible with Manifest V3 - https://news.ycombinator.com/item?id=29331038 - Nov 2021 (2 comments)


It would be great if the browser makers adopted something similar to SingleFile as a standard to replace MIME HTML. Though I think SingleFile should be updated a bit to take advantage of new APIs to make the output page cleaner.

With a small bit of JS at the top of the page (or provided by the browser), all the assets could be included at the bottom of the page in an organized JSON blob containing the base64-encoded files, and the rest of the page could remain untouched. The JS simply intercepts the elements as they are parsed and swaps out the src/hrefs for the appropriate data URLs below. By adding a super restrictive CSP to the top of the page, it's guaranteed to be safe and private to open. Here's a prototype:

https://www.hypertext.plus/demo/htmldoc-test1.html
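
To make the src/href swap described above concrete, here is a minimal offline sketch in Python. It is not the prototype linked above and not how SingleFile works internally; the function names `to_data_url` and `inline_assets` and the shape of the `assets` mapping are assumptions made for illustration, and the regex only handles double-quoted attributes.

```python
import base64
import re


def to_data_url(mime_type: str, payload: bytes) -> str:
    """Encode raw bytes as a base64 data URL."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Simplification: only matches double-quoted src/href attributes.
_ATTR = re.compile(r'\b(src|href)="([^"]*)"')


def inline_assets(html: str, assets: dict[str, tuple[str, bytes]]) -> str:
    """Replace src/href values found in `assets` with data URLs.

    `assets` is a hypothetical mapping: original URL -> (MIME type, bytes).
    URLs that are not in the mapping are left untouched.
    """

    def swap(match: re.Match) -> str:
        attr, url = match.group(1), match.group(2)
        if url not in assets:
            return match.group(0)
        mime_type, payload = assets[url]
        return f'{attr}="{to_data_url(mime_type, payload)}"'

    return _ATTR.sub(swap, html)
```

For example, `inline_assets('<img src="a.png">', {"a.png": ("image/png", b"abc")})` rewrites the `src` to a `data:image/png;base64,` URL, while links that are not in the mapping stay as they are. A real implementation would use an HTML parser rather than a regex.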


I agree that browsers should offer an API to get all the resources easily. It would make things much easier. However, for security reasons, this API would still be restricted to environments like Web Extensions or automation tools like Playwright. FYI, I also took another approach which might interest you, see https://github.com/gildas-lormeau/SingleFileZ. The main drawback is that the HTML produced by SingleFileZ is not valid (though it is tolerated by the HTML5 spec).


I’ve been using this extension in Firefox for a while and couldn’t be happier with it. Kudos!


I have created a completely offline knowledge management tool that can connect to SingleFile and store web pages with one click.

Once saved, it supports highlighting, full text search, and creating RSS feeds.

In addition to HTML, it also supports importing MHTML and webarchive files.

Thanks to SingleFile, I don't need to develop a separate browser plugin.

Document: https://hamsterbase.com/docs/integrations/singlefile.html


Why not charge $5? Am I the product?


No, the source code is licensed under AGPL. The business model is to eventually sell licenses to companies selling proprietary products/services.

Edit: I forgot to say that it's thanks to donations (a big thank you to Zotero) that I was able to buy a Mac and pay for the developer license.


It's interesting you mention Zotero. I use Zotero all the time, especially for university work. It has similar functionality in that it saves webpages. Is there any overlap with SingleFile? Or might they integrate?


I confirm they use SingleFile under the hood to save snapshots :)


That's excellent. I do use SingleFile as well, but Zotero offers the database/citation functionality on top. Brilliant extension though; after uBlock it's probably my number 2 install.


Don't worry, even if you only use Zotero, I wouldn't blame you. It's a very good product. Thanks a lot for your support :)


Does this have advantages over .webarchive?


Only Safari can open .webarchive files. Files saved with SingleFile can be opened in any browser (without needing to install any extension).

Edit: I just did a test with https://www.theguardian.com/world/2022/nov/17/three-men-foun.... The CSS of the resulting webarchive is very broken and the file weighs 4 times more than the one saved with SingleFile.


.webarchive is just a binary plist, only works on Apple OSes and has security issues. I posted about the various ways to encapsulate HTML last week, including SingleFile, for those who are interested.

https://www.russellbeattie.com/notes/posts/the-decades-long-...


There’s nothing I love to see more on an App Store page than that super cool “Data Not Collected” check mark. :)


I thought the same, until I added automatic crash reporting to my app. It was a revelation: the app that I thought was very stable actually crashed all the time! But nobody told me about it!!! Customers apparently just restarted the app and ignored the crash, despite me offering free email support!

So, I really think some telemetry is needed to make reliable software. It should be anonymous, of course, and limited to things you need to know (e.g. crashes or assertion failures) and not include things that don't help the customer (e.g. usage statistics for market research are not OK in my opinion).

But if you collect no data at all, you'll never know of your customers' issues.


It’s fine if it’s opt-in. Most developers simply assume consent, however, which has all of the normal problems that come along with assuming consent in any context (ie your assumption being wrong in a significant number of cases and the completely avoidable consent violations that occur thereby).


Apple has a framework for doing this without data collection now


Apple has had anonymous crash reports for years. Unfortunately, they are close to useless, because you get reports only for a tiny fraction of crashes. I assume it's either because nobody opts in to share analytics with 3rd party developers, or because Apple is just incompetent and their tech doesn't work.

In any case, I received only a handful of crash reports from Apple. I thought everything was okay, except that customers occasionally reported an issue, but I never got a crash report from Apple that could explain the situation. Until I built my own reporting solution, which just sends a stack trace to my server in a signal handler. I started receiving dozens of crash reports per week.

So while on the one hand I applaud Apple for trying to do the right thing, as a developer I can only say that the crash reports they share with developers are so few they are close to useless.


You’re referring to something older


So what's that new framework? Can I use it to reliably transmit stack traces when the app crashes?


have a look at https://github.com/ChimeHQ/MeterReporter which uses the wwdc20 frameworks


Thank you for the link, this is interesting. It's nice Apple is apparently building support for custom crash reporting into the OS.

From a privacy perspective it sounds like it logs exactly the same info that I currently log with PLCrashReporter, except that it only works on macOS 12.


Ah ok


Is there anything like this, except a CLI tool? Something like yt-dlp. I know of curl and wget, but only the basics. I tend to use qutebrowser, which isn't supported, and I don't really expect it to be supported either. Something more neutral run from the shell would be great.


You can run SingleFile from the command line actually :) See https://github.com/gildas-lormeau/single-file-cli


Is it possible to use the CLI to download authenticated/logged-in pages? Is there any way I can pass cookie/localStorage data?


I confirm you can pass cookies. You can also add data in the local storage via a user script.


Can someone give a practical reason for saving an entire webpage locally vs a PDF export in 2022?


A PDF doesn’t represent a responsive/resizable version of the page so it will look awkward on most screen sizes even if the original would have handled it.

A PDF doesn’t capture scrolling behavior, so a nested scrolling element will lose most of its content, and a page with a chat prompt or cookie notice might have part of its content covered.

A PDF won’t capture even simple interactive elements like image carousels, lightboxes, and collapsible sections, so content may be lost (“oops, I saved it on the second slide of the image carousel, but I really wanted the first one”).

As far as I know, a PDF won’t include embedded audio/video.

Many PDF exporters chunk the document into paper-sized pages (but, to be fair, some don’t).

Not sure if this tool nails all of those cases, but those are reasons why I’ve saved local copies of pages in the past.


The reason is the same in 2022 as it's always been and will remain - PDF is a lossy conversion - you throw away big piles of structure and metadata, often actual text, code, image resolution. Fidelity often matters a lot to digital packrats.


I'd say the difference is similar to source code vs. compiled-binary-for-a-single-target. The webpage is the source code and can be rendered for various targets if it's responsive. The PDF is just how it looks on a single target, and does not include how it behaves (animations, video, hover effects, etc). So both have benefits.


SingleFile only "compiles" the resources (stylesheets, images, fonts etc.) into an HTML file. You can always transform this HTML file into a PDF afterwards. You will get the same or even better result thanks to the prior transformation.


I use SingleFile occasionally for saving and sharing interactive pages. E.g. ones with zoomable charts. Works great.


Been using SingleFile for a while.

Could it be made possible to save images externally instead of base64? I need a quick way to replace images. It contradicts the name but scrolling & missing in my editor sucks. Unless someone knows the pro tools.


That is the default behavior in Chromium browsers. Right-click the page > Save as > (Webpage, Complete). It gives you one HTML file and a folder containing all images.


I completely missed that. Unfortunately some sites have multi-size links and Chrome doesn't export them if not rendered in DOM.
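
For the question above about replacing base64 images, one possible workaround is to post-process the saved file. The sketch below is not a SingleFile feature; `externalize_images` is a hypothetical helper that assumes the images appear as double-quoted `src="data:image/...;base64,..."` attributes and ignores images embedded in CSS or SVG data URLs.

```python
import base64
import re
from pathlib import Path

# Simplification: only double-quoted src attributes with common raster types.
_DATA_IMG = re.compile(
    r'src="data:image/(png|jpeg|gif|webp);base64,([A-Za-z0-9+/=]+)"'
)


def externalize_images(html: str, out_dir: Path) -> str:
    """Write each embedded base64 image to `out_dir` and point src at the file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    counter = 0

    def swap(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        ext = "jpg" if match.group(1) == "jpeg" else match.group(1)
        name = f"image-{counter}.{ext}"
        # Decode the payload and store it next to the HTML file.
        (out_dir / name).write_bytes(base64.b64decode(match.group(2)))
        # Relative path, assuming the HTML sits beside the output folder.
        return f'src="{out_dir.name}/{name}"'

    return _DATA_IMG.sub(swap, html)
```

Run against the text of a saved page, this returns HTML whose `img` tags reference files such as `assets/image-1.png`, which can then be swapped out on disk without scrolling through base64 in an editor.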


Is the extension needed?

Could this be an entirely client side API?

For a bookmarking project I'm working on, I'd like to be able to click a button "capture screen" and have the entire rendered single page sent to an s3 bucket.


The Web Extension API is required if you want to download resources blocked via the "Same-origin policy", see https://developer.mozilla.org/en-US/docs/Web/Security/Same-o.... This is the main problem and it is quite common. This could be circumvented by using a proxy, but this would introduce security issues.
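
As a rough illustration of what "same origin" means here: two URLs share an origin only when scheme, host, and port all match, so an image served from a CDN subdomain is already cross-origin. The sketch below only models that comparison; it is not the browser's implementation, and the helper names are made up for this example.

```python
from typing import Optional
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""
    # Fill in the default port so that ":443" and no port compare equal on https.
    port = parts.port if parts.port is not None else _DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def is_same_origin(a: str, b: str) -> bool:
    """True when both URLs have the same scheme, host, and port."""
    return origin(a) == origin(b)
```

With these definitions, `https://example.com/page` and `https://cdn.example.com/logo.png` are different origins, which is the common case where a plain page script cannot read the resource and an extension (or a proxy) is needed.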


Lots of detailed info in the github repo

https://github.com/gildas-lormeau/SingleFile


Thank you for doing this in the open and all the work you've put into this.


Is this better than good ol' mht files?


I think this is the case in terms of longevity and convenience. Pages are saved in HTML. As long as you have software that can read this type of file, you'll be able to view your saved pages.


This is excellent news, thank you so much!


Nice. Thanks for making it and sharing it.


Did we really need 30 years to finally get a snappy way to save websites in a single file?


Maybe, I started coding SingleFile (for Chrome) almost 13 years ago


Just started using SingleFile and I love it. Are there any other utilities/projects that you think would have such personal and broad utility that they would have a decade+ runway?


I've been using SingleFile for Firefox for a long time. Works great, and is way better than what I used to do when traveling in 2007, which was to go to an internet cafe, and save .html files to a USB stick for offline reading when I got back home.


Safari has been able to save pages as webarchives since the beginning.



