xyc's comments

If anyone is interested in trying local AI, you can give https://recurse.chat/ a spin.

It lets you use local llama.cpp without setup, chat with PDFs offline, and organize your chat history in nested folders, and it can handle thousands of conversations. You can also import your ChatGPT history and continue those chats with local AI.


I don't see any indication that it runs on Linux; I'll stick with Jan, which is free.

Privacy could be one reason. There are a lot of cases where people do not want to send data to a cloud service.

Cloudflare has it https://developers.cloudflare.com/workers-ai/models/llava-1....

Locally it's actually quite easy to set up. I've made an app, https://recurse.chat/, which supports LLaVA 1.6. It takes a zero-config approach: you can just start chatting, and the app downloads the model for you.


Just realized I read your blog post about the LLaVA llamafile. It got me interested in local AI and led me to build the app :)

What's your reservation about running it locally?


The VS Code vs. Xcode situation is an exact counterexample to this: a non-optimized native app can be that much slower than a well-optimized Electron app.


That's not an apt comparison, because the difference could come from the non-UI portion. VS Code would be even faster if it were a native app.


Not really. VS Code does have some performance optimizations for which even the browser's machinery wouldn't suffice; for example, it implements its own scroll bar instead of using the native web scroll bar. But for the most part, the browser's rendering optimizations are the crucial factor. After years of optimization, a web browser isn't easy to beat.

A native app is just another set of abstraction layers. For comparison, SwiftUI doesn't render 500 list items quickly enough (https://www.reddit.com/r/swift/comments/18dlgv0/improving_pe...), and 500 is a tiny number for the web.


A user's personal data really doesn't have that much scale. Worst case, everything can be cached locally. I've imported thousands of chat sessions into a local AI chat app's database, and the total storage is under 30MB. Full-text search (with highlights and all) is almost instant.
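To give a sense of how little machinery this takes: here's a minimal sketch of local full-text search using SQLite's FTS5 extension from Python. The schema and sample rows are made up for illustration, and it assumes your SQLite build has FTS5 compiled in (most do):

    import sqlite3

    conn = sqlite3.connect("chats.db")

    # FTS5 virtual table; text is tokenized and indexed on insert.
    conn.execute("""
        CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts
        USING fts5(session_title, body)
    """)

    conn.executemany(
        "INSERT INTO messages_fts (session_title, body) VALUES (?, ?)",
        [("Trip planning", "What are good hikes near Kyoto?"),
         ("Trip planning", "How long is the train from Tokyo to Kyoto?")],
    )

    # MATCH hits the inverted index; highlight() wraps matches in the
    # given markers (column 1 = body).
    rows = conn.execute("""
        SELECT highlight(messages_fts, 1, '[', ']')
        FROM messages_fts
        WHERE messages_fts MATCH 'kyoto'
    """).fetchall()

    for (snippet,) in rows:
        print(snippet)

Because FTS5 maintains an inverted index, MATCH queries stay near-instant even across tens of thousands of rows, which is consistent with the numbers above.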


Check out https://recurse.chat (I'm the dev). You can import ChatGPT messages. It has almost instant full text search over thousands of chat sessions. Also supports llama.cpp, local embedding / RAG, and most recently bookmarks and nested folders.


You can run llama.cpp and get structured output with GBNF grammars. There are tools to convert a JSON schema to GBNF.
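As an illustration, the llama-cpp-python bindings expose this directly: LlamaGrammar.from_json_schema does the JSON-schema-to-GBNF conversion for you, and passing grammar= constrains sampling to strings the grammar accepts. A minimal sketch (the model path and schema are placeholders):

    import json
    from llama_cpp import Llama, LlamaGrammar

    # Example schema: force the model to emit {"city": ..., "temperature_c": ...}
    schema = json.dumps({
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "temperature_c": {"type": "number"},
        },
        "required": ["city", "temperature_c"],
    })

    # Converts the JSON schema into a GBNF grammar internally.
    grammar = LlamaGrammar.from_json_schema(schema)

    llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf")  # placeholder path

    out = llm(
        "Reply with JSON describing the weather in Paris.",
        grammar=grammar,   # sampling is constrained to grammar-valid output
        max_tokens=128,
    )
    print(out["choices"][0]["text"])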


I have been using local LLMs as a daily driver and built https://recurse.chat for it. I've mostly used Llama 3, WizardLM 2, and Mistral, and sometimes I just try out models from Hugging Face (recently added support for adding models directly from Hugging Face: https://x.com/recursechat/status/1794132295781322909).


Seems that no client-side changes are needed for gpt-4o chat completion.

Added a custom OpenAI endpoint to https://recurse.chat (I built it) and it just works: https://twitter.com/recursechat/status/1790074433610137995
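For reference, the drop-in behavior makes sense given the API shape; with the official openai Python client, nothing changes except the model name (the prompt here is just an example):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Same chat completions call as older models; only the model name changes.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)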


But does it do the full multimodal in-out capability shown in the app? :)


We'll see :) I heard the video capability is rolling out later.


API access is text/vision for now: https://x.com/mpopv/status/1790073021765505244


If you have suggestions for making RecurseChat more useful for you, especially for RAG, I'd love to hear them!

