
simonw on Nov 30, 2023 | on: Llamafile lets you distribute and run LLMs with a ...

Sounds like you should download the 4.45MB llamafile-server-0.1 executable from https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.1 and then run it against your existing gguf model files like this:

`./llamafile-server-0.1 -m llama-2-13b.Q8_0.gguf`

See here: https://simonwillison.net/2023/Nov/29/llamafile/#llamafile-t...
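
For anyone scripting this, here is a minimal sketch of the same invocation wrapped in a shell function. It assumes you have already downloaded the executable from the release page by hand and are on Linux or macOS, where the file needs its executable bit set first. The function name `run_llamafile` is hypothetical; the only flag used is the `-m` shown in the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of llamafile): checks that both files exist,
# marks the server binary executable, then runs it with -m <model>.
run_llamafile() {
    server="$1"   # e.g. ./llamafile-server-0.1 (downloaded manually)
    model="$2"    # e.g. llama-2-13b.Q8_0.gguf (your existing gguf file)

    if [ ! -f "$server" ]; then
        echo "missing server binary: $server" >&2
        return 1
    fi
    if [ ! -f "$model" ]; then
        echo "missing model file: $model" >&2
        return 1
    fi

    # A freshly downloaded file usually lacks the executable bit.
    chmod +x "$server" || return 1

    # Same invocation as in the comment: the server binary plus -m <model>.
    "$server" -m "$model"
}
```

Usage would be something like `run_llamafile ./llamafile-server-0.1 llama-2-13b.Q8_0.gguf`, with both paths adjusted to wherever your files actually live.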