Hacker News
Using pip to install a Large Language Model that's under 100MB (simonwillison.net)
24 points by sahli 5 days ago | 5 comments





Not that convinced by this model:

    import llm
    model = llm.get_model("SmolLM2")
    model.prompt("Write me a poem about computers").text()
I ran the above to test it, and got this:

    In the realm of computation, where magic is a thing of the past,
    The digital realm, where the boundaries of what is thought,
    The digital realm, where the boundaries of what is thought,
And it repeats like that.

Oh yeah, it's complete rubbish - the challenge now is to figure out if there's anything useful a <100MB LLM can do.

Someone suggested using it to build the world's worst calculator.


Maybe I can do one better...

For the create_chat_completion call on this line: https://github.com/simonw/llm-smollm2/blob/afed0963b3a8bcca1...

I changed it to:

    completion = model.create_chat_completion(
      messages=messages,
      temperature=0.2,
      top_p=0.9,
      presence_penalty=0.06,
      frequency_penalty=0.5,
      repeat_penalty=1.0,
    )
I took as many parameters as I could from their documentation: https://github.com/huggingface/smollm/tree/main/text#transfo...

I then added a few more to reduce repetition. Using the prompt "Write a 10 line poem on computers." (note the full stop), I got:

    In the realm of computation, where magic and science intertwine,
    The art of programming and the art of computing.
    Computers are machines that can think, they are computers.
    They can do what humans cannot – to create.
    Computers are machines that can think, they are computers.
    Computers are machines that think, they are computers.
    Computers are machines that think, they are computers.
    Computers are machines that think, they are computers.
    Computers are machines that think, they are computers.
    Computers are machines that think, they are computers.
    Computers are machines that think, they are computers.
    Computers can think, they can think.
    Computers can be thought of as a machine that thinks.
    Computers can be thought of as a machine that thinks.
Still not perfect, but the parameters seem to make a massive difference. I think it's entirely possible to get something reasonable out of this model.
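
For anyone wondering what those penalty parameters are doing, here is a rough toy sketch of the general idea, not the actual llama.cpp code. It assumes the usual description of these knobs: repeat_penalty scales the score of any token already seen, frequency_penalty subtracts an amount per prior occurrence, and presence_penalty subtracts a flat amount once a token has appeared at all. The apply_penalties function, the token names, and the scores are all made up for illustration.

```python
from collections import Counter


def apply_penalties(logits, history, presence_penalty=0.0,
                    frequency_penalty=0.0, repeat_penalty=1.0):
    """Return a copy of `logits` (token -> score) penalised for tokens in `history`.

    Hypothetical helper for illustration only; real samplers work on
    token ids and tensors, and usually only look at a recent window.
    """
    counts = Counter(history)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token not in adjusted:
            continue
        score = adjusted[token]
        # Multiplicative penalty: shrink positive scores, push negative
        # ones further down. A value of 1.0 leaves the score unchanged.
        score = score / repeat_penalty if score > 0 else score * repeat_penalty
        # Additive penalties: frequency grows with each repeat,
        # presence is a flat one-off cost for having appeared at all.
        score -= count * frequency_penalty + presence_penalty
        adjusted[token] = score
    return adjusted


# Made-up scores: "computers" is the favourite before any penalty.
logits = {"computers": 2.0, "think": 1.5, "dream": 1.0}
history = ["computers", "think", "computers", "computers"]

adjusted = apply_penalties(
    logits, history,
    presence_penalty=0.06, frequency_penalty=0.5, repeat_penalty=1.0,
)
best_before = max(logits, key=logits.get)
best_after = max(adjusted, key=adjusted.get)
print(best_before, "->", best_after)
```

With the values from my change above, a token that has already shown up three times loses about 1.56 from its score, which in this toy example is enough for an unused token to overtake it. Note that repeat_penalty=1.0 is a no-op here, so all of the effect comes from the frequency and presence terms.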

If it can call tools, then it could be used to download a larger, more useful model. Or send API requests to another model.

I know this isn’t useful or aligned with the rules, but you are on fire these days, Simon!


