edit: https://github.com/matrix-org/matrix.org/ if any indignant gatsby fans want to tell us what we’re doing wrong
There are literally hundreds of static site generators: https://www.staticgen.com
Most of them should get markdown right too!
Incorrect DOM-output is likely caused by common mistakes (e.g. conditional rendering based on `typeof window !== 'undefined'`) which screw up rehydration. Dealt with it in the past and seen a lot of developers struggle with it. This article describes it well:
Gigantic page-data.json files are caused by querying more data than necessary to render a given template/component. Let's say you define a component named `EmployeeCard` that renders a name and photo given an `employeeId`. Now you need to query all employees and render the right one using `.find()`. All this data (including base64 thumbnails) ends up in page-data.json, even if you only need to render a single employee.
This is solved by querying only the relevant employee(s) in the template and providing the data (name + photo) directly to the `EmployeeCard` component as props.
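To make the size difference concrete, here is a rough, framework-free sketch. It does not use Gatsby at all: the `Employee` shape, the fake dataset and both function names are assumptions made up for illustration, and the JSON is only a stand-in for what ends up in page-data.json.

```typescript
// Hypothetical data shape; field names are illustrative, not from a real schema.
interface Employee {
  id: string;
  name: string;
  photoBase64: string; // stands in for an inlined base64 thumbnail
}

// Fake dataset: 200 employees, each with a 2000-character placeholder thumbnail.
const employees: Employee[] = Array.from({ length: 200 }, (_, i) => ({
  id: `emp-${i}`,
  name: `Employee ${i}`,
  photoBase64: "x".repeat(2000),
}));

// Anti-pattern: the query returns every employee and the component later
// picks one with .find(). Everything that was queried gets serialized.
function pageDataQueryAll(all: Employee[]): string {
  return JSON.stringify({ data: { allEmployee: { nodes: all } } });
}

// Preferred: filter while building the page, so only the record the
// template actually renders ends up in the payload.
function pageDataQueryOne(all: Employee[], employeeId: string): string {
  const employee = all.find((e) => e.id === employeeId) ?? null;
  return JSON.stringify({ data: { employee } });
}

const bloated = pageDataQueryAll(employees);
const lean = pageDataQueryOne(employees, "emp-42");
console.log(`query all: ${bloated.length} chars, query one: ${lean.length} chars`);
```

The payload for the "query all" version grows with the size of the whole dataset, while the "query one" version stays roughly constant per page.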
I've developed quite a few websites using Gatsby (mostly backed by various GraphQL APIs such as WPGraphQL and Strapi) and while there's lots to learn, it's been an enjoyable experience so far.
For example, we have what roughly amounts to an <Include> component. It uses some contextual information to include language-specific code. That include component pulls in a markdown file from disk based on 1) the includePath you set, and 2) the language selected. In order to make this function, we have to pull in _all_ possible includes within the static query, and then filter it down to the ones we need. There is literally no better way we could do this, because static queries are exactly that - static. There is no fixing this problem because the design of this data layer is broken by default.
You could publish the different languages in different folders/path, thus, using the static queries to build static pages.
If you want them on the same path while loading different language content dynamically, that might not be the best use case for Gatsby. In that case you probably need to load every translation for each of the included paths, but not every path with every language for all of the different pages.
The basic idea is your page context just contains identifiers for the content you need to display on the page and then the page query uses those identifiers to query for just those objects and request all the properties of those objects that the page needs.
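A minimal sketch of that idea, assuming a local stand-in type for the argument of Gatsby's `createPage` action (path, component, context) so it runs without Gatsby installed. The template path, the URL scheme and the `employeeId` key are hypothetical. In a real project the context values become variables available to the template's page query.

```typescript
// Minimal stand-in for the object passed to Gatsby's createPage action.
// Declared locally so the sketch runs on its own.
interface PageInput {
  path: string;
  component: string;
  context: { employeeId: string };
}

type CreatePage = (page: PageInput) => void;

// Hypothetical template path, used only for illustration.
const EMPLOYEE_TEMPLATE = "./src/templates/employee.tsx";

// Create one page per id. The context carries only the identifier;
// the template's page query is expected to fetch the fields it needs.
function createEmployeePages(ids: string[], createPage: CreatePage): void {
  for (const id of ids) {
    createPage({
      path: `/employees/${id}/`,
      component: EMPLOYEE_TEMPLATE,
      context: { employeeId: id },
    });
  }
}

// Usage with a recording stub in place of the real action.
const createdPages: PageInput[] = [];
createEmployeePages(["emp-1", "emp-2"], (page) => createdPages.push(page));
console.log(createdPages.map((p) => p.path));
```

Keeping the context down to identifiers means the per-page payload is decided by the page query, not by whatever happened to be loaded while the pages were being created.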
It's in a very weird spot in terms of difficulty/productivity.
It's got a very steep learning curve, with a lot of complexity. Things like how it resolves certain values involve following multiple steps that may or may not exist in your project and require knowing framework-specific rules and terminology. Stuff like which template is used is so complicated. All the other stuff about taxonomy and archetypes, etc. also gets really confusing.
But all that complexity translated to very little productivity (for me at least). I struggled with setting up beautiful templates, and found all the extra configuration cumbersome because the simple things I wanted to do were buried under all this other crap I had to learn about.
And when I did need advanced features, I didn't want to use Hugo. Or the things that I did need were core aspects that were difficult to do with Hugo because its core configuration is so bloated.
Not to mention all the problems with versions and documentation that I had on at least one of the occasions I tried it.
It doesn't do much, it doesn't handle Markup (because that's not how I want to work), and it probably took less time than really mastering someone else's SSG.
Yes, if you want to add logic in addition to the templating primitives provided by default, then being able to write code into the source pages will be of great help. However I think that when such a level of dynamism is required, then one can start debating the utility of a static generator in the first place.
Two different purposes, IMO.
I think for a startup with the liberty to choose a new tech stack, Nuxt.js is a great choice. It allows you to start with simple S3 bucket hosting, and move up to Vercel/Docker on ECS/etc. as you grow.
Hugo is a fantastic choice for blogs, but not for much else.
Coming from Hakyll which just gives you all the power of Pandoc and a programming language, it was really a chore to work with Hugo. I’ll never use it again.
Maybe I was doing something wrong, or the template author was, but I am thinking about my best options now, because I am 100% sure I don't want to go through that mess again. Even building plain HTML pages by hand would have been a lot easier for me than finding out why the updated template didn't compile anymore.
In the end, I just took a completely fresh version of the template and rebuilt the documentation by copying and pasting the relevant parts.
For context: I initially built the documentation in November 2018 and the update happened in April 2020.
It's a moving project and yeah, there have been a lot of changes, with some breaking ones (usually for good reason).
I know React and Next let me solve my problems faster.
(Aside: Both Gatsby and Next have been around for a while and will be for the foreseeable future. So not a flavour of the week.)
Hugo has exactly one thing going for it compared to other static site builders: it's fast. I have about 100 pages now and they build in <1second.
Aside from that, it's a pain to work with. The documentation is sparse but serviceable, but the community on GitHub and Discourse leaves a lot to be desired. Grumpy and unresponsive and much more inclined to close your issue or tersely tell you to read the docs than provide help or admit you have a valid feature request/bug report.
Django is of course a lot bigger, with many more moving parts, but it's much more explicit, and the documentation is superb; you're up and running a lot quicker.
That has a strong smell of losing focus on your core principles and obsessing over adoption stats at any cost
(1) No amateur is especially proud of setting up a static site, because it either seems easy as expected or it wasn't easy and they don't feel like trumpeting that.
I have seen good results from Hugo. It is easy and typically works. Good for a small business website.
For the content (not done yet), I want humans to write it, but haven't found a scalable approach.
Not only does Gatsby feel complicated for simple things, it also doesn't offer the same level of flexibility as Hugo does. And Hugo also has the most appreciated benefit of not having to battle with the JS ecosystem. Hugo just works.
Once you figure out how to really use it effectively, you can be productive and build out some huge websites, and it's fast as hell.
I don't doubt that, but a site that a) is huge, and b) doesn't require interactivity seems rather niche to me. SSGs really shine to me on small sites, and to some degree mid-sized ones. Code documentation is a good example, where I made some new library at work and now I want to dump some nice documentation somewhere. It's only a few pages, all I need is a way to convert Markdown to HTML in a semi-pleasing format, and have the SSG handle creating an index or ToC sidebar so people can navigate.
Hugo sucks for this. The templating language is Go's, which I don't care for much, and doesn't seem to be very widely used outside of Hugo. So now I have to learn a DSL just to use Hugo, which is unlikely to be useful for anything else, ever.
The templating language also seems to change constantly. I feel like every time I update Hugo, I have to go find a new theme, because now I get 10,000 warnings that they're going to redo the whole template API, and like 7 things my theme is doing aren't compatible with their new API. Of course, someone will say that I can build my own theme. And you're right! If I want to go become an expert in this esoteric DSL so I can figure out how to iterate over pages to build my ToC. Plus now that I'm building a theme, part of me starts wondering if it wouldn't be easier to just hand-write the HTML.
These NodeJS site generators are at least an order of magnitude easier to work with. I get code completion, so I don't have to try to remember all the various built-in and custom macros. I get unit tests, so when I change something in my theme, I can test it. I find that it's easier to genericize things from tutorials as well. If I want to add a navigation sidebar to all my pages, I just need a tutorial that shows me where I hook into the base page template (i.e. add a customizePageTemplate() call when you instantiate the renderer or something). From there I can use code completion and docs to figure out what I have access to to do the things I want. Again, that's because it's a familiar domain. If I want to do the same thing in Hugo, I pretty much need a tutorial specifically dedicated to customizing the base page templates. Even if someone tells me where to customize it, I end up having issues because the things you can access inside Hugo depend on where you are in the rendering pipeline, and Hugo's templating language doesn't make it easy to tell what you have access to at the moment.
I do admire their speed, but I have had effectively no situations where I have a site that is both large enough to be worth learning the DSL, and simultaneously doesn't require anything that Hugo makes difficult.
The amount of time that you have to spend just fiddling with Gatsby is insane and then in the end it only barely works.
Their marketing keeps touting fast but I doubt there is something slower when you actually use it for something of consequence.
It’s like you’re not heroic enough a developer if you haven’t wrestled three build pipelines and ten layers of abstraction before breakfast.
I largely agree with the findings described here: Gatsby felt quite slow and the GraphQL part feels like complexity overkill. For my use case I bypassed it using their createPage API, but by doing so you lose out on e.g. the ability to reload a page’s data on the fly (from what I could work out; I found the documentation a bit lacking in parts).
I was more impressed with NextJS, it feels generally more solid and easier to understand. It has a great preview feature where you can set a cookie and then the site will pull in data at request time rather than from the static site (perfect for CMS previews!) and build times seem alright - I haven’t had cause to look into speeding them up, which is a good sign.
Personally I’d go with NextJS in future based on my experience... that said, I’ve no idea if it would work well for their reasonably specialised use case!
I also just figured out how to trigger hot reloading for sources that normally don’t trigger it (markdown files which I was manually processing).
Basically, I discovered that gatsby will hot-reload when it sees a change in code files (.tsx). So I created an essentially empty _rebuild-trigger.tsx file and imported that in my main page template.
Now, to change the page content and trigger a hot-reload:
- I have an observable subscribed in createPagesStateful that calls deletePage and createPage
- Then write to the _rebuild-trigger.tsx file
I need to make a blog post because it works great (and I could tie it to any change source - like file system watcher - during development).
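Until a proper write-up exists, here is a rough sketch of the "write to the trigger file" step only. The file name comes from the comment above; the `src/` directory, the exported `rebuiltAt` constant and both function names are assumptions. The writer is injected so the sketch runs without touching disk; in a real project Node's `fs.writeFileSync` fits the same signature.

```typescript
// Signature compatible with Node's fs.writeFileSync(path, data).
type WriteFile = (path: string, contents: string) => void;

// File name taken from the comment above; the directory is hypothetical.
const TRIGGER_PATH = "src/_rebuild-trigger.tsx";

// Source for the otherwise empty module. Changing the exported value
// changes the file, which is what the dev server is watching for.
function triggerSource(stamp: number): string {
  return (
    "// Auto-generated; exists only to trigger a hot reload.\n" +
    `export const rebuiltAt = ${stamp};\n`
  );
}

// Call this after the page has been deleted and recreated.
function touchRebuildTrigger(write: WriteFile, stamp: number = Date.now()): void {
  write(TRIGGER_PATH, triggerSource(stamp));
}

// Usage with an in-memory writer standing in for the file system.
const written = new Map<string, string>();
touchRebuildTrigger((path, contents) => written.set(path, contents), 1);
const firstWrite = written.get(TRIGGER_PATH);
touchRebuildTrigger((path, contents) => written.set(path, contents), 2);
const secondWrite = written.get(TRIGGER_PATH);
console.log(secondWrite);
```

Writing a fresh value each time matters: rewriting identical contents may not register as a change for every watcher.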
I’ll try to put together a blog post when I can.
Thanks a lot for sharing Rick!
Simple is better IMO. Just calculate the ROI and time spent on it and in next 10 years how often you'll be changing stuff in terms of template or links (Neither should change - Design is not fashion and links should be relative).
I also have complete 100% control over the page. I adapt it for a particular content if its got more pictures and less text for example. It's so easy and beautiful. There is a kind of zen-like aspect to it.
However, scale that out to 3000 pages with a bunch of dynamic features, with many other developers and authors. You might find it a bit harder to manage then.
<Alert level="warning" title="Note">
PUT/DELETE methods only apply to updating/deleting issues.
Events in sentry are immutable and can only be deleted by deleting the whole issue.
And goes on to show their ugly hack to make it work.
Well, markdown in custom MDX components is rendered. You only need to leave a line break to signal to the parser that this needs further processing. Like this:
<Alert level="warning" title="Note">
**PUT/DELETE** methods only apply to updating/deleting issues.
Which I really don't find to be a deal breaker.
I stopped reading after that because I was hoping for a somewhat more informed opinion.
From my experience, build time/build complexity is Gatsby's Achilles' heel. But I've built some stuff that, looking back, I don't think I could have done without Gatsby, as I came to it being new to React.
It's simple enough on the surface, and as you dig deeper there are plenty of knobs to turn.
My one wish is if they'd let you bypass image processing in the development environment.
Apart from that, I can't recommend it enough.
If my content requires two extra line breaks that do not serve any purpose beyond dealing with the vagaries of the generator (where there is absolutely no reason it should work that way), that does not bode well for my further experiences.
Satisfying hydration constraints is by far the hardest part of doing React server-side rendering. The basic requirement is: whatever HTML you rendered on the server, the initial render of the browser app MUST output that exact same DOM structure. Otherwise, it won't be able to correlate them correctly, and will get confused about which DOM nodes to update. You can seriously mangle your entire page this way – but usually, it looks like nearby content stuck in the wrong place, or incorrect updates.
Some things that lead to hydration mismatches:
- Time-based rendering. Time passes between when the server rendered the HTML and the browser initializes the app. So any component using `new Date` to make decisions can potentially have a hydration mismatch.
- Randomization. Let's say you wanna choose a random promo image to show. The `Math.random()` result is going to be different on the server and client.
- Anything involving browser APIs, like the browser window size, or checking `typeof window`, etc. The server has no access to this info, so it either needs to skip rendering that content, or fallback to a default.
Once in the browser, here's the important part: you need the app's components to make all the same rendering decisions that they made on the server, on the first pass specifically (when hydration occurs). Then, using the component lifecycle, you can make them update to take the latest client-side info into account.
In other words, it's not enough to simply detect `typeof window` and render different content – you're only "allowed" to render that different content after the app has done its initial mount.
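Here is a framework-free model of that rule, in case it helps. A "render" is just a function returning an HTML string, and every name in it (`Env`, `renderNaive`, `renderTwoPass`, the 1024px breakpoint) is made up for illustration. In an actual React app the `hasMounted` flag would typically be a piece of state flipped inside `useEffect`, which does not run during server rendering.

```typescript
// What a component can know about where it is running.
interface Env {
  isBrowser: boolean;
  viewportWidth?: number; // only known in the browser
}

// Mismatch-prone: branches on the environment during the first render,
// so the server and the browser's first pass can disagree.
function renderNaive(env: Env): string {
  const wide = env.isBrowser && (env.viewportWidth ?? 0) >= 1024;
  return wide ? '<nav class="wide"></nav>' : '<nav class="compact"></nav>';
}

// Two-pass: until hasMounted is true, ignore the environment and emit
// the same default the server emitted. Only later renders may differ.
function renderTwoPass(env: Env, hasMounted: boolean): string {
  const wide = hasMounted && env.isBrowser && (env.viewportWidth ?? 0) >= 1024;
  return wide ? '<nav class="wide"></nav>' : '<nav class="compact"></nav>';
}

const server: Env = { isBrowser: false };
const browser: Env = { isBrowser: true, viewportWidth: 1440 };

// Does the browser's first pass reproduce the server's HTML?
const naiveMatches = renderNaive(server) === renderNaive(browser);
const firstPassMatches =
  renderTwoPass(server, false) === renderTwoPass(browser, false);
const afterMount = renderTwoPass(browser, true);
console.log({ naiveMatches, firstPassMatches, afterMount });
```

The cost of the two-pass version is one extra render and a brief moment where the default content is shown, which is usually a fair trade for a DOM that hydrates correctly.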
The somewhat-reassuring aspect of all this is that it's almost certainly your components at fault, not Gatsby or React, so it can always be fixed without too much effort. But it's an annoying foot-gun to have to worry about nonetheless.
React could do better in both dev and prod though, IMO. It doesn't tell you where/which component had the issue, just the type of DOM node with a mismatch.
I understand that because the API gets so complicated it's better to use something flexible like GraphQL. But the magic breaks down too easily in places and then you end up trying to fix some really peculiar bug (I had one where you had to put blank .eslintrc to the root to prevent the linter from linting local linked npm libraries).
I think parts of Gatsby core should probably be rewritten or done in different language to make it faster. The compilation shouldn't take that long. But I'm not sure what are the remedies for the other problems. Maybe if they focused really hard to simplify the library into its very basics, following the Unix philosophy, it could help with the creeping complexity headache that Gatsby brings.
Our devs complained that even on local, building took forever, and because it had to refetch all of our WordPress CMS media library each time, sometimes it would just crash during build.
We also came to need good SEO support for dynamically generated pages (of which we have thousands), and we had to write our own prerendering lambda function to handle bots (and embed tools) that couldn't run JS on their own.
After Next announced their support for static site generation and the release of next-serverless, we took the next three weeks to get our site ported over and haven't looked back since.
Has it been a contender for the rewrite of their documentation?
Shameless plug: I'm in the middle of a rewrite of my company's website (https://abilian.com/) using Lektor myself. So far so good, for someone used to Python and Flask it's the obvious choice for a static website generator.
I wouldn't for example know if something is even possible to achieve with a project unless someone had a repository showcasing how it is done, which has made me wary of betting on anything too immature. I just don't have the skills to bridge the gap myself.
Personally, I've stuck with Jekyll up to last year for these reasons, but right when COVID hit I started to delve into 11ty. I can say that it has been nothing short of a delight and the community is pretty stellar as well. I'm currently experiencing build times measured in milliseconds instead of minutes or so. I've been impressed: the project is actively developed and each update brings very concrete upgrades. Currently, I have the honor of sitting on top of the performance leaderboard for 11ty. Over half of the 400 sites there score full marks in Google Lighthouse, which should tell you something.
Gatsby's plugin system is nice. I like being able to handle things like robots.txt, sitemaps, favicons in a well defined, easily managed part of my app. The downside being you gotta accept what the plugin gives you. For example the favicon plugin generates favicons that are over 30kb. In my case, the favicon is often larger than the entire rest of the page combined. So I still find even here Gatsby is a good idea with questionable execution at times.
Gatsby has strict caching requirements -- https://www.gatsbyjs.com/docs/caching/ -- and if you can't meet them, your site will have subtle bugs. This means Gatsby is not compatible with GitHub Pages, which is a major downside to Gatsby.
This article spent a long time complaining about MDX. I agree, I think MDX is quite bad. But of the 4 complex Gatsby sites I have built, none of them used MDX. I don't think it's fair to conflate MDX's shortcomings with Gatsby.
I'm currently playing with Svelte's Sapper. It might possibly become what I always wished Gatsby was.
It also works well with async import for runtime loading of components.
Other than that, I don’t use any of its features. (I found a markdown React component on npm, and I manually move the images to the public folder and use normal image tags.)
I’m quite happy with the runtime results, and develop re-render works pretty quickly. However, the production build takes about 30 secs which is pretty slow and I also ran into “out of heap memory“ on node during build and had to increase node’s heap size.
So I’m sure I’ll end up needing to build my own SSR eventually, but it should be easy to do.
Gatsby does, however, suffer from huge scale problems, I think mostly due to the data type inference it needs for GraphQL. Once you start ramping up the number of models and records, it will churn forever ingesting them. There are hacks and ways to partially mitigate some of this, but Gatsby still continues to have problems once you're working with a medium amount of content.
At runtime it’s excellent - perfect scores for performance - and using the website is perfect, so I’m thankful that gatsby let me get started quickly and see how good runtime performance can be.
I’m planning to replace it with React’s built in ReactDOMServer (and I can do a simple change detection and rebuild only changed content easily).
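For what it's worth, the change-detection half of that plan can be quite small. The sketch below is only one way to do it, with assumptions throughout: page sources are plain strings keyed by path, the manifest of hashes would be persisted between builds, and the hash is a simple FNV-1a variant picked to keep the example dependency-free. The rendering step itself (for example ReactDOMServer's `renderToStaticMarkup`) is left out.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units; any stable content
// hash would do here.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

type Manifest = Record<string, string>; // page path -> content hash

interface BuildPlan {
  changed: string[]; // new or modified pages to re-render
  removed: string[]; // pages whose source disappeared
  manifest: Manifest; // hashes to persist for the next build
}

// Compare the current sources against the hashes from the previous build.
function planBuild(previous: Manifest, sources: Record<string, string>): BuildPlan {
  const manifest: Manifest = {};
  const changed: string[] = [];
  for (const [path, content] of Object.entries(sources)) {
    manifest[path] = fnv1a(content);
    if (previous[path] !== manifest[path]) changed.push(path);
  }
  const removed = Object.keys(previous).filter((path) => !(path in sources));
  return { changed, removed, manifest };
}

// First build renders everything; the second only what was edited.
const firstBuild = planBuild({}, { "/a/": "# A", "/b/": "# B" });
const secondBuild = planBuild(firstBuild.manifest, { "/a/": "# A", "/b/": "# B, edited" });
console.log(secondBuild.changed, secondBuild.removed);
```

This only tracks per-page content; a change to a shared template or layout would still need to invalidate every page that uses it.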
Honestly, it will probably be easier than all the hacking I had to do to get Gatsby to work the way I wanted.
It's not this - we only used createPages and ran into these issues. It's the ingesting content from contentful or any database that'll continue to grow where gatsby lets you down.
I wish the article mentioned that.
Last thing I heard from them, is that when faced with the need to update news on the site more quickly (not on realtime, but "sooner"), the development team had to make a separate mechanism to query the news from the backend when the site was running on the browser, instead of doing it at build time, using the builtin GraphQL database. This defeats the central idea that everything on Gatsby comes from a central data source.
I literally thought this was an item about the Australian stand-up comic (but that's Gadsby with a D).
The downside of this is that the lighthouse profiler currently does not score this setup properly and the blurry placeholder results in a lower score. There was an open bug to fix this last I checked.
The other major issue I faced was setting up Netlify CMS for my own hosting. This ecosystem seems to very forcefully nudge you to use their infrastructure like Netlify or Gatsby Cloud. I didn't want that for a number of reasons. The documentation for setting it up on your own server is in an absolutely pathetic state. All the steps described are cryptic and I ran across multiple questions and issues filed by confused users. I finally found a blog post which covered a large number of gotchas and I was able to set it up.
My customer manages the content on their site on their own. They were happy with the performance upgrade, but the prospect of waiting a few minutes for the site to recompile every time they make an update felt like a big step back to them.
I was not happy with seeing enormous json files being generated for what is essentially a website with a homepage and multiple categories with large photo galleries in them.
The other thing I was not prepared for is the hardware required to build the site. Small servers were running out of memory during the build, so I had to set up CircleCI to do the builds for me and then copy everything back to my server.
Another weird scenario I ran into was that the development version and build version had different outputs. I eventually solved this by noticing an issue with the DOM which didn't get picked up by anything in the build tools. Took me a day or two to finally figure it out.
In hindsight I am now wondering if I should just roll my own image resizing scripts, switch to a crud web app and cache all the generated htmls on first load and delete the cache when a change is made and prime it. This has been my approach prior to Gatsby, but I was sold on their marketing about how I get all the speed benefits for free and not have to worry about sizing images, setting up webpack for performance and splitting for routes, setting up metatags for prefetching/preloading etc. Turns out you're better off doing all that on your own.
Broadly I probably wouldn't use this framework again unless it was for a simple blog or something. The experience has left me quite annoyed. I am probably not going to follow this JAM stack philosophy in the future either. I don't really see the benefit. Are people really that afraid of managing servers?
I am also considering redoing the entire site in Sinatra the next time I add features to it.
I do not like it when people use these images. I think you get a better result when you use the now supported loading=lazy attribute on images https://web.dev/browser-level-image-lazy-loading/
In the third world we have to be especially sensitive to these things because mobile data is often incredibly slow, even on 4G.
I use Jekyll for a few sites that benefit from gitflow around markdown collaboration and it’s pretty solid. I read about other generators and if something is easier or better I’m interested.
But for me, it seems Jekyll solved this problem years ago. And it has good docs, active development, etc.
The few people I know who switched off of it did so for what seems like arbitrary reasons to me. A group I collaborate with switched to gatsby because they said some of their developers couldn’t run Ruby on their workstations. And I guess switching their generator to gatsby was easier than figuring out that problem :)
Mr. Nolan, please invert me, I have to fix an awful mistake called "frontend frameworks" back in time.
I found this surprising. I once worked somewhere that used Sentry heavily (liked it a great deal, it was pretty important to us). We had about a hundred projects and were encountering quota problems, the worst of which were usually from misbehaving queue workers on staging or perf that would consume our entire event budget if left unchecked. If we didn't shut them off in time we'd end up losing meaningful prod events.
Our solution was to switch to a YAML based config for rate limits and per project quotas. This would ensure every project was capped, central prod services were allocated more events like they deserved, etc. Sentry staff sent us a helpful starter script for utilizing their project configuration REST API. It was written in Ruby.
All to deploy static websites, something you can do for free with near zero complexity by simply choosing another option. The JS world has become a marketing machine.
That said, the entire goal was to allow a non-engineer to more easily enable our end-users with documentation. We didn't anticipate the amount of engineering hours it'd take to actually get this functioning well given its adoption and ecosystem. Even the investment into MDX though was to service the core goal: create clean abstractions for the complexity where possible.