Hey HN! After working on this for the last couple of months, we (Continue.dev) are finally releasing tab-autocomplete in beta. It is private, local, and secure, using Ollama, llama.cpp, or any other LLM provider. I'm even more excited about it being completely open-source, because it means I can share how we built it!
I've been sharing details on Twitter for the last month (summarized here: https://twitter.com/NateSesti/status/1763264142163808279), covering the following:
- A specific type of debouncing, and a strategy for reusing streamed requests
- How we use the Language Server Protocol to surgically include important context
- Using tree-sitter to calculate the "AST Path", an abstraction with many uses
- Truncation: how we decide to stop early, complete multiple lines, and generally avoid mistaken artifacts
- Jaccard similarity as a method of searching over and ranking recently edited ranges / files
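To make the last point concrete, here's a minimal sketch (not Continue's actual code) of ranking recently edited snippets by Jaccard similarity — the overlap of two token sets divided by their union — against the code around the cursor. Tokenization here is naive whitespace splitting, just for illustration:

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity of two strings: |A ∩ B| / |A ∪ B| over token sets."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def rank_snippets(query: str, snippets: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k recently edited snippets most similar to the query."""
    return sorted(snippets,
                  key=lambda s: jaccard_similarity(query, s),
                  reverse=True)[:top_k]
```

The appeal of this approach is that it needs no embedding model: it's fast enough to run on every keystroke over a small window of recently edited files.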
I will continue to share as we improve, so feel free to follow along. I'll also compile all of this into a better-formatted blog post in the future!
Inspired by the "Copilot Internals" post from a year ago (https://news.ycombinator.com/item?id=34032872), I’ll be sharing live updates, clear explanations, and non-obfuscated code.
---
...and one more thing: we've exposed a handful of options that let you customize your experience. You can alter:
- the model
- stop words
- all numerical parameters / thresholds used for retrieval and prompt construction
- the prompt template
- whether to complete multiple lines, or only one at a time
- and a bit more (https://github.com/continuedev/continue/blob/0445b489408f0e7...)
This lets each individual make their own preferred trade-off between speed, accuracy, and other factors.
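As a rough illustration of what such an options object might look like — the field names below are made up for this sketch, not the plugin's real schema (see the linked config file for that):

```python
# Hypothetical autocomplete options -- names are illustrative only.
options = {
    "model": "deepseek-coder:1.3b",    # any Ollama / llama.cpp model tag
    "stopWords": ["\n\n", "```"],      # strings that end a completion early
    "maxPromptTokens": 1024,           # token budget for retrieved context
    "debounceDelay": 350,              # ms to wait after the last keystroke
    "multilineCompletions": "auto",    # "always" | "never" | "auto"
    "template": "prompt template goes here",
}
```

Exposing these knobs means someone on a laptop CPU can trade accuracy for latency (smaller model, shorter prompt), while someone with a GPU can do the opposite.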
I tried:
- Continue.dev with local DeepSeek: couldn't get it to work with CUDA easily, and my Python experience is close to zero. After over an hour of debugging wild errors and issues, I ran into a library (auto-downloaded from the requirements file) that wasn't compiled with CUDA support and refused to work, so I gave up.
- Continue.dev with GPT-4 (the latest preview, which should be less lazy): Still very lazy and slow, and I got tired of having to copy/paste (well, there's a hotkey) everything into the chat window and then missing context. It also preferred giving recommendations over actual code, even when directly instructed to write something.
- Codeium.dev: This seemed very promising right out of the box, but then it started recommending Python code in the middle of a C# project, despite being context-aware. I wasn't able to get any useful C# out of it, but C# also isn't listed in the supported-languages section, so perhaps I should have known better.
- GitHub Copilot: Given that it's a paid premium product with all the hype surrounding it, I expected something better here, but it was even worse than Continue.dev with GPT-4. It was incredibly lazy: it gave textual answers instead of code, offered broad recommendations on how to fix something, or simply repeated what I asked ("please provide a method to handle both these cases to remove duplicate code" -> "oh yes, you can indeed write a generic method to handle both these cases"; no actual code, just confirmation that it was possible).
Either I did something horribly wrong, or C# is entirely out of scope for these projects, or all the hype is... people who got lucky? I have no idea. But I was heavily disappointed by my first experience with this.