Pre-rendering static websites with the 23 year-old wget command (apex.sh)
34 points by tjholowaychuk 55 days ago | 15 comments

Thank you for sharing this; I find the technique refreshingly simple.

> You may have seen people achieve this with a more complex headless Chrome-based solution, but for many sites, this will be perfectly fine!

Can you elaborate on the difference between using wget and a heavier solution? I assume the main difference is that a headless browser can execute JavaScript and then serialize the resulting DOM back to HTML, allowing you to build sites in client-side frameworks (React, Vue) and then make static versions of them for deployment. Are there other benefits of using a full browser vs. simply using wget?

I think this article unnecessarily conflates the pre-rendering techniques used for JS-heavy websites with what I would call a static website build/compile process.

Using wget to compile your website is a clever idea. But it won't work if your website uses JS to generate links, so I wouldn't call it pre-rendering (since there had better not be any post-rendering).

Yeah, it doesn't work if you're relying on JS for interactions and layout, but wget's crawling technique works great if you're happy with using server-side rendering for content.

The title makes it look like wget is obsolete. Why not use the original title?

The HN post is by the author himself and in the first paragraph the author also mentions '23 year-old wget'.

Wouldn’t it make more sense to generate the html and save it to the appropriate file from the blog generator itself?

What if you have a page that exists but isn't linked from any other page (a landing page, for example)? It would never be pre-rendered.
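To make the unlinked-page problem concrete, here is a minimal Python sketch of a link-following crawl, which is roughly the way a recursive wget run discovers pages. The `SITE` dictionary, its paths, and the `crawl` helper are all made up for illustration; a real crawl would fetch pages over HTTP instead of reading them from memory.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start="/"):
    """Return the set of paths reachable from `start` by following links."""
    seen = set()
    queue = [start]
    while queue:
        path = queue.pop()
        if path in seen or path not in site:
            continue
        seen.add(path)
        parser = LinkParser()
        parser.feed(site[path])
        queue.extend(parser.links)
    return seen


# Hypothetical site: path -> HTML the server would return for it.
SITE = {
    "/": '<a href="/blog">Blog</a>',
    "/blog": '<a href="/">Home</a> <a href="/blog/post-1">Post</a>',
    "/blog/post-1": "<p>Hello</p>",
    "/landing": "<p>Unlinked landing page</p>",
}

reachable = crawl(SITE)
missed = sorted(set(SITE) - reachable)
print(missed)  # ['/landing']
```

Nothing links to `/landing`, so the crawl never visits it. Any page like that has to be requested explicitly, or linked from somewhere.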

You can definitely do that, but I find this appealing for development: just write some routes as you normally would and the templates all re-render, with no need to watch for file changes and re-compile. But you're right, if you have a 404 template, for example, you have to `curl ... > build/404.html`, which is a bit lame.

I did exactly this with a Rails project once: I called `render_to_string` on each page I was interested in and saved the results as HTML files. The wget method is clever, but I agree that working within the same system makes the most sense.
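For comparison, a rough sketch of the "render from inside the app" approach. This is Python, not Rails, and it does not use `render_to_string`; the render functions, the `PAGES` table, and the `build` helper are hypothetical stand-ins for whatever your framework provides.

```python
import tempfile
from pathlib import Path


def render_home():
    # Stand-in for a real template render.
    return "<h1>Home</h1>"


def render_not_found():
    return "<h1>404</h1>"


# Hypothetical page table: output file -> function that returns its HTML.
# Pages nobody links to (like the 404 page) are simply listed here.
PAGES = {
    "index.html": render_home,
    "404.html": render_not_found,
}


def build(pages, out_dir):
    """Render every page and write it under out_dir; return the files written."""
    out = Path(out_dir)
    written = []
    for rel_path, render in pages.items():
        target = out / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(render(), encoding="utf-8")
        written.append(rel_path)
    return written


build_dir = tempfile.mkdtemp()
written = build(PAGES, build_dir)
print(sorted(written))  # ['404.html', 'index.html']
```

The trade-off is that you have to maintain the page list yourself, whereas the crawl finds pages for you as long as something links to them.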

If you are using a web framework designed for dynamic processing (think something like Java servlets/JSP, or in the author's case Go), it's often non-trivial to find or implement a render_to_page function, let alone enumerate all possible pages.

It would be interesting to combine wget with HTMLDOC[0] to convert static websites into a PDF book.

[0] https://github.com/michaelrsweet/htmldoc

Would it be less maintenance to use your web server's cache feature? Both Apache and Nginx can cache dynamic pages to static files.

These days I think most people deploy static sites to a CDN. They give you such great performance that I can't imagine not using one; my site loads in 10ms in London, for example.

Ha, I remember this being a thing to pre-warm caches in Java systems a zillion years ago. What was once old is new again.

wget is not a command, it is a program

i would argue that it's both. what it isn't is an internal command built into the shell. but what it is is a series of characters that identifies a binary, and the act of submitting those characters is a metaphorical command to the shell to invoke said binary. but maybe i'm just being pedantic ¯\_(ツ)_/¯
