
Show HN: I've built a Serverless search feature for my blog - gunnarmorling
https://www.morling.dev/blog/how-i-built-a-serverless-search-for-my-blog/
======
scott31
You have 13 posts on your blog; you could have shipped all the content and done
the search client-side, and that would be an actual serverless search
implementation.

~~~
gunnarmorling
Well, of course I hope to have many posts to come on my blog, rendering this
solution less and less practical ;)

Besides that, I didn't feel like re-implementing all the things I get for free
from a mature library like Lucene, such as word stemming, result highlighting,
etc. This alternative is discussed briefly in the post.

------
catchmeifyoucan
I built a similar kind of engine with lunr.js:

[https://github.com/rlingineni/Lambda-Serverless-Search](https://github.com/rlingineni/Lambda-Serverless-Search)

It loads the entire index into memory, pushes articles to S3, and builds the
index over time. You can see how performance degrades as the index grows. It
also just returns the id of an article, not the entire article. I think this
approach with a precompiled index might be genius.

I attached the performance charts, and logs in the readme. At 18000 records,
the slowest part is pulling the index from S3.

~~~
catchmeifyoucan
Also, what was your function's memory limit? There's also a helper function
here if you're interested in testing how performance scales with more records:

[https://github.com/rlingineni/Lambda-Serverless-Search/blob/...](https://github.com/rlingineni/Lambda-Serverless-Search/blob/master/scale-test/main.js)

It'll tell you the time to upload a record, and then the time to query it as
it pushes more and more records. Not sure if you had plans to make it dynamic.
Maybe with GitHub Actions. Pretty cool nonetheless.

~~~
gunnarmorling
Ah, very nice to see others exploring that area, too!

I think baking the index as an immutable object into the deployment package
works great for use cases like personal blogs, which get updated only every so
often, so you can afford to rebuild the search service when doing so. My main
motivation for that is security: that way, my entire service is read-only and
immutable.

I definitely want to automate the deployment. My blog sources are on GitHub
too (and already auto-published to GitHub Pages when pushing a change), so it
shouldn't be too difficult to have another GitHub Action which rebuilds and
deploys the search service.
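Something along these lines should do, I think (a hypothetical workflow sketch; the job layout, build script, and function name are all assumptions, not my actual setup):

```yaml
# Hypothetical sketch: rebuild and redeploy the search service
# whenever the blog sources change on the main branch.
name: rebuild-search
on:
  push:
    branches: [master]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Rebuild search index and deployment package
        run: ./build-search.sh   # hypothetical build script
      - name: Deploy to AWS Lambda
        run: >
          aws lambda update-function-code
          --function-name blog-search
          --zip-file fileb://search.zip
```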

Re memory, I run with 512 MB. The app is fine with much less; 128 MB works,
too. But Lambda allocates CPU shares proportionally to the assigned RAM, and
below 512 MB it's just a bit too slow. As I'll probably never leave the free
tier with this service, "wasting" memory that way doesn't really cost me
anything, although I feel the RAM/CPU correlation isn't ideal in general: it
seems you end up paying for superfluous RAM if you're actually only after
lower request latency by means of more CPU cycles.

------
dabbit
Wow, that's so fast!

~~~
gunnarmorling
Thanks, happy to hear that!

