Hacker News new | past | comments | ask | show | jobs | submit login
Ask HN: What do you use local LLMs for?
15 points by sabrina_ramonov 7 months ago | hide | past | favorite | 18 comments
Me: finetuning Llama3 for SAT test questions



- Basic internet search (I start ollama CLI faster than I can start a browser - https://ollama.com)

- Formatting/changing text

- Troubleshooting code, esp. new frameworks/libs

- Recipes

- Data entry

- Organizing thoughts: High-level lists, comparison, classification, synonyms, jargon & nomenclature

- Learning esp. by analogy and example

RAG for:

- Website assistants (https://github.com/bennyschmidt/ragdoll-studio/tree/master/e...)

- Game NPCs (https://github.com/bennyschmidt/ragdoll-studio/tree/master/e...)

- Discord/Slack/forum bots (https://github.com/bennyschmidt/ragdoll-studio/tree/master/e...)

- Character-driven storytelling and video game concept art (https://github.com/bennyschmidt/ragdoll-studio/tree/master/r...)


I use it as a daily driver (built https://recurse.chat/).

Local RAG and chat with PDF is handy. Some of our users are using it to format transcripts (example: https://talk.macpowerusers.com/t/recursechat-little-app-to-u...).


Amazing work! This looks great.

Question: "no config setup" You mean the user doesn't have to pre-install Ollama or build llama.cpp or anything?

If you don't mind me asking - how are you accomplishing that in an offline desktop app?


Thank you! Yes the app has bundled built-in llama.cpp binary with it. The app just launches the executable - nothing fancy. We are a little llama.cpp wrapper :)

It doesn't require Ollama, but if you have existing Ollama, it works with Ollama's API compatible layer https://recurse.chat/faq/#openai-chat-completion-models.


Very cool you covered all bases. I'm going through this exact thing in an app, just moved it over from relying on ollama to using node-llama-cpp internally, but maybe I'll copy you and still support ollama too :D


have fun :) your users will probably let you know which way they want, or both


Is there something similar to RecurseChat but for Windows?


Probably not but we could build it.


I use the MLCChat iOS app to run Mixtral on my iPhone pretty much exclusively when on flights and am curious about something. Maybe it’s a question about my destination, a fact I’ve forgotten, or if I’m working and haven’t bought WiFi it’s good for quick documentation sythesis.

It’s genuinely useful and fun in a way ChatGPT isn’t- something about the novelty. However outside of that environment I haven’t found a good use.

Also fun to feel the phone heat up as it generates tokens.

Edit: 7B-Instruct


The thought of building an interview assistant for coding interviews and selling it for few $k A pop has crossed my mind on more than one occasion. Run audio capture and OCR, along with some finetuning on prompt engineering of how to solve the problem in steps with explanation of each one.


I'm looking to implement a local RAG for my business to read, pdfs, spreadsheets, FAQ, and scan parent company websites.

Not sure where to start but people projects here are giving me a good direction.

Eventually I'd want the chat on my business site to start a sale process.


You can start by defining the scope and the tools you want to use.

Both coding or nocode tools can help. The simplest option would be to take one of those "privacy-friendly" options out there for a spin on a free tier and as you work you'll see what limitations you encounter or features you'd want.

And then you can customize it further using coding tools.


Using llm for converting LinkedIn post to Google doc


why did you decide to use a local LLM for that use case?


SBERT for classification, ranking, etc.


I feel like SBERT is no longer considered Large :)


I use the 420MB mpnet model. They finally have a T5 that gains a point of performance but is almost 10GB. The SBERT folks already think most people would be happier with a model smaller than mpnet but faster.

I have tried fine-tuning other BERTs and not had one that was really worth using. One of these days I want to train a T5 to do something kinda generative like putting Mastodon tags on articles.


What do you classify / rank ?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: