Gemma 2 on AWS Lambda with Llamafile (unremarkable.ai)
10 points by metaskills 4 months ago | 3 comments



A small experiment to see whether we're there yet with highly virtualized CPU compute and Small Language Models (SLMs). The answer is a resounding maybe, but most likely not. Huge thanks to Justine for her work on Llamafile, supported by Mozilla. Hope folks find this R&D useful.
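For anyone curious what the wiring roughly looks like, here's a minimal sketch (not taken from the article) of a Python Lambda handler talking to a llamafile started as a local server inside the execution environment. The binary path, port, flags, and model name are assumptions; llamafile's server mode exposes an OpenAI-compatible chat completions endpoint.

```python
# Minimal sketch, assuming a Gemma 2 llamafile is baked into the image at /opt
# and started once per execution environment (cold start), not per invocation.
import json
import subprocess
import time
import urllib.request

LLAMAFILE = "/opt/gemma-2-2b-it.llamafile"  # hypothetical path baked into the image
ENDPOINT = "http://127.0.0.1:8080/v1/chat/completions"

# Launch the llamafile as a local server; flags are assumptions, check the docs.
server = subprocess.Popen(["sh", LLAMAFILE, "--nobrowser", "--port", "8080"])
time.sleep(5)  # crude wait for the weights to load; poll a health endpoint in real code

def handler(event, context):
    # Forward the prompt to the local OpenAI-compatible endpoint.
    body = json.dumps({
        "model": "gemma-2-2b-it",
        "messages": [{"role": "user", "content": event.get("prompt", "Hello")}],
        "max_tokens": 128,
    }).encode()
    req = urllib.request.Request(
        ENDPOINT, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        completion = json.loads(resp.read())
    return {
        "statusCode": 200,
        "body": completion["choices"][0]["message"]["content"],
    }
```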


Can you expand a bit on why not?

Does it produce bad results? Is it slow to respond? Slow to load?

I've been wanting to play around with llamafile-based edge functions, but storing even small models in GitHub (for automated deploys) is a terrible, often impossible, experience.
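One workaround (a sketch, not something from the thread) is to keep only the handler code in the repo and fetch the llamafile during the build/deploy step. The Hugging Face path below is a placeholder assumption; point it at whatever model you actually deploy.

```python
# Minimal sketch: download the model at build time instead of committing it to git.
import os
import urllib.request

MODEL_URL = (
    "https://huggingface.co/Mozilla/gemma-2-2b-it-llamafile/"
    "resolve/main/gemma-2-2b-it.Q4_K_M.llamafile"  # hypothetical path
)
DEST = "build/gemma-2-2b-it.llamafile"

os.makedirs(os.path.dirname(DEST), exist_ok=True)
if not os.path.exists(DEST):  # reuse a cached copy between CI runs if possible
    urllib.request.urlretrieve(MODEL_URL, DEST)
os.chmod(DEST, 0o755)  # llamafiles are self-contained executables
print(f"fetched {DEST} ({os.path.getsize(DEST) / 1e9:.2f} GB)")
```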


This is great work. Has anyone used it enough to compare the Lambda costs with the cost of running a comparable model on, say, OpenAI?



