
Static Sites with Elasticsearch - amitaibu
https://www.gizra.com/content/drupal-static-elasticsearch/
======
simonw
I've been experimenting with SQLite FTS as a way of adding search to an
otherwise static site.

The big advantage of SQLite FTS is that it's really cheap to run. The index is
a single static file on disk, then you add a Python process (I'm using
[https://github.com/simonw/datasette](https://github.com/simonw/datasette) )
to run queries against it. Much less resource intensive than running Solr or
Elasticsearch.
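
For anyone who hasn't used it, here is a minimal sketch of what an FTS query looks like from Python's built-in `sqlite3` module. It assumes your SQLite build has the FTS5 extension compiled in (most do, but not all), and the table and column names are made up for illustration. This is not how Datasette does it internally.

```python
import sqlite3


def build_index(conn, docs):
    """Create an FTS5 table and load (title, body) pairs into it."""
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", docs)
    conn.commit()


def search(conn, query, limit=10):
    """Return matching titles, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]


# An in-memory database keeps the example self-contained; a real site
# would open the .db file shipped with the deployment instead.
conn = sqlite3.connect(":memory:")
build_index(conn, [
    ("Static sites", "Serving plain files straight from disk"),
    ("Search with SQLite", "Full-text search using an FTS index"),
])
```

Calling `search(conn, "index")` returns only the second title. Note that the query string is parsed as FTS5 query syntax, so raw user input containing quotes or operators may need escaping before it is passed to `MATCH`.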

It also works surprisingly well - I've run FTS queries against tables that are
up to around 10GB on disk and performance is great.

It's nowhere near as featureful as Lucene, but for small to medium-sized
projects it's easily good enough.

As for deployment: if the SQLite .db index file is small enough you can bundle
it up as part of a static deployment, e.g. bundled in a Docker container. I've
done this using Heroku, Google Cloud Run, [https://fly.io/](https://fly.io/)
and Zeit Now (aka Vercel).

If the content lives in a git repository you can hook up CI (or a GitHub
Action) to build and publish a new copy of the SQLite index on every change.
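
A rough sketch of the kind of build step a CI job could run, assuming the content is a directory of Markdown files and that FTS5 is available. The function name, table name, and paths are hypothetical, not taken from any of the repos mentioned here.

```python
import sqlite3
from pathlib import Path


def build_search_db(content_dir, db_path):
    """Rebuild the search database from scratch from every *.md file."""
    db_file = Path(db_path)
    if db_file.exists():
        db_file.unlink()  # start clean so deleted pages drop out of the index
    conn = sqlite3.connect(db_file)
    # The path is stored for display only; UNINDEXED keeps it out of matching.
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body)")
    count = 0
    for md in sorted(Path(content_dir).rglob("*.md")):
        rel = md.relative_to(content_dir).as_posix()
        conn.execute(
            "INSERT INTO pages (path, body) VALUES (?, ?)",
            (rel, md.read_text(encoding="utf-8")),
        )
        count += 1
    conn.commit()
    conn.close()
    return count
```

A CI job would call something like `build_search_db("content", "search.db")` after checkout and then publish the resulting file with the rest of the deployment. Rebuilding from scratch each time is the simplest option and is usually fast enough for a small site.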

I've started thinking of this pattern as a kind of static-dynamic site:
there's dynamic server-side code but it's running in read-only containers, so
you can scale it up by running more copies and if anything goes wrong you just
restart the container.

[https://til.simonwillison.net/](https://til.simonwillison.net/) is my most
recent site to use this pattern, see
[https://github.com/simonw/til](https://github.com/simonw/til) for how it
works.

I also wrote this tutorial describing the pattern a while ago:
[https://24ways.org/2018/fast-autocomplete-search-for-your-website/](https://24ways.org/2018/fast-autocomplete-search-for-your-website/)

~~~
tootie
It's mentioned in this post but lunr.js is a clever option if you don't have
much content. The idea is that your search page has to create a json object
with all your content and lunr will build an index out of it client-side. This
sounds like terrible architecture but you could stuff about 100 blog posts
into an object the size of one big jpeg. For a few dozen articles it's pretty
snappy.
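
To make the idea concrete, here is a toy version of it in Python rather than JavaScript. This is not lunr's API, just an illustration of "ship all the content as JSON, build the index where the search runs". The document fields and sample posts are invented.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc in docs:
        for term in tokenize(doc["title"] + " " + doc["body"]):
            index[term].add(doc["id"])
    return index


def search(index, query):
    """Return sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


# The payload a static build would write out as a single JSON file.
payload = json.dumps([
    {"id": 1, "title": "Hello world", "body": "First post on the blog"},
    {"id": 2, "title": "Static search", "body": "Searching a blog without a server"},
])
index = build_index(json.loads(payload))
```

A real client-side library adds stemming, scoring, and fuzzy matching on top of this, but the cost model is the same: the whole payload has to be downloaded before the first query can run.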

~~~
techntoke
A few dozen articles? I've seen examples with Fuse.js and Hugo searching over
10,000 articles and it is fast. No server-side components required.

~~~
chopraaa
Great way to botch user experience since the user has to download the index.

~~~
techntoke
There are ways to optimize this per section, alphabetically, etc. Otherwise
Xapian is very easy to set up and would be my go-to over Elasticsearch.
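
One possible reading of the alphabetical split, sketched in Python: cut a term index into one file per first letter, so a client only fetches the shards its query needs. The shard layout and function names are assumptions for illustration, not something Fuse.js or Hugo provides out of the box.

```python
import json
from collections import defaultdict


def shard_index(index):
    """Split a term -> ids mapping into one JSON document per first letter."""
    shards = defaultdict(dict)
    for term, ids in index.items():
        shards[term[0]][term] = sorted(ids)
    return {
        letter: json.dumps(terms, sort_keys=True)
        for letter, terms in shards.items()
    }


def shards_needed(query):
    """First letters of the query terms, i.e. which shard files to fetch."""
    return sorted({word[0] for word in query.lower().split()})


# A tiny hand-written index standing in for a real one.
index = {"static": {1, 2}, "search": {2}, "blog": {1}}
shards = shard_index(index)
```

With this layout a query for "static blog" touches two small files instead of the full index. The trade-off is more requests per query and no easy way to do fuzzy matching across shards.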

------
arkadiyt
Philosophically building website search on top of Elasticsearch seems fine,
but don't call it a static site then - you're deploying a backend.

"Static site search" to me is something like adding a
`<form action="https://duckduckgo.com" method="get">` text box to your site.

~~~
amitaibu
@arkadiyt Thanks - I see your point. However, I believe you can also think
about it as a service - similar to how Disqus can be added to your static
site. That is, the site is static, but the results for the search are handled
with a service. In this case the "service" - Elasticsearch - is tightly
coupled to your static site's revision.

~~~
turnipla
You’re confusing static with serverless.

Your content is indeed static but your search is not: It’s handled by a
service that parses your requests and produces output.

Static content on the other hand are files being served straight from the
filesystem.

As for Disqus, it's not static either; it's just “a service for static
websites”.

~~~
amitaibu
@turnipla Indeed, Elasticsearch is a typical request-response service.
However, the point of the post was to show how we can keep search fully in
sync with the static site, even if we, for example, roll back a deploy. That
is, even if we roll back to a revision with less content than the "default"
index has, search will not show the extra content.
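
As a rough illustration of tying an index to a revision, here is a sketch that derives an index name from a git commit hash, so each deployed revision queries its own index. The naming scheme and the `site` prefix are hypothetical; the thread doesn't spell out the exact scheme the post uses.

```python
import re


def index_name_for(revision, prefix="site"):
    """Build a per-revision index name from a git commit hash."""
    rev = revision.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{7,40}", rev):
        raise ValueError(f"not a git revision hash: {revision!r}")
    # Elasticsearch index names must be lowercase; a short hash keeps them readable.
    return f"{prefix}-{rev[:7]}"
```

The static build for a given commit would bake the matching index name into its search page, so rolling the site back also rolls back which index gets queried.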

------
karterk
If you're looking to do the same on something that's easier to run and manage
than ES, consider using Typesense:
[https://github.com/typesense/typesense](https://github.com/typesense/typesense)

The primary benefits are simplicity, typo tolerance and ability to expose the
search engine directly to the front end without having to put it behind an ELB
as described in this post for Elasticsearch.

P.S: I work on this.

~~~
kvz
Typesense looks nice for this, but I feel it could really use an officially
supported browser integration so folks can onboard more easily:
[https://github.com/typesense/typesense/issues/85](https://github.com/typesense/typesense/issues/85)

~~~
karterk
Agree 100%, very close to launching that (within 2 weeks!).

~~~
kvz
Looking forward! <3

