
My problem with all AI code assistants is usually context. I'm not sure how Cursor fares in this regard, but I always struggle to feed the model enough of the project for it to be useful beyond line-by-line suggestions (which Copilot does anyway). I don't have experience with Cursor or Cody (another alternative) or with how they tackle this problem using embeddings (which I suppose have similar context limits).


All the SOTA LLM solutions like this have nearly the same problem. Sure, the context window is huge, but there is no guarantee the model understands what 100K tokens of code are trying to accomplish within the context of the full codebase, let alone the real world or the business. They are just not good enough yet to use in real projects. Try it: start a greenfield project with "just Cursor" like the AI influencers do and see how far you get before it's an unmanageable mess and the LLM is lost in the weeds.

Going the other direction in terms of model size, one tool I've found usable in these scenarios is Supermaven [0]. It's still just one- or multi-line suggestions a la GH Copilot, so it's not generating entire apps for you, but it's much better at pulling those one-liners from the rest of the codebase in a logical way. If you have a custom logging module that overloads the standard one with special functions, it will actually use those functions (rough sketch below). Pretty impressive. Also very fast.

[0] https://supermaven.com/
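
To illustrate what I mean (the module and helper names below are made up for the example, not anything from Supermaven itself):

  # myproject/logging_utils.py -- hypothetical project-specific wrapper
  import logging

  logger = logging.getLogger("myproject")

  def log_request(route: str, status: int) -> None:
      # Helper the stdlib doesn't have; a codebase-aware tool should reach for it.
      logger.info("request route=%s status=%d", route, status)

  # Elsewhere in the repo, after typing `log_`, a codebase-aware completion
  # tends to suggest the project helper:
  #     log_request("/api/users", 200)
  # while a generic model often falls back to the bare stdlib pattern:
  #     logging.info("GET /api/users 200")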


Cursor has a built-in embeddings/RAG solution to mitigate this problem.
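Roughly, the retrieval side looks something like this (a minimal sketch only; Cursor's actual chunking, embedding model, and index are not public, so the embed stub and chunk size here are assumptions):

  # Minimal codebase-RAG sketch: chunk files, embed chunks, retrieve the
  # nearest ones to the query and stuff them into the prompt.
  import numpy as np
  from pathlib import Path

  def embed(text: str) -> np.ndarray:
      # Stub standing in for a real code-embedding model/API.
      rng = np.random.default_rng(abs(hash(text)) % (2**32))
      v = rng.standard_normal(256)
      return v / np.linalg.norm(v)

  def chunk(source: str, lines_per_chunk: int = 40) -> list[str]:
      lines = source.splitlines()
      return ["\n".join(lines[i:i + lines_per_chunk])
              for i in range(0, len(lines), lines_per_chunk)]

  # Index every chunk of every file once.
  index = []  # (path, chunk_text, vector)
  for path in Path(".").rglob("*.py"):
      for c in chunk(path.read_text(errors="ignore")):
          index.append((str(path), c, embed(c)))

  def retrieve(query: str, k: int = 8) -> list[str]:
      # Cosine similarity; the vectors are already unit-length.
      q = embed(query)
      ranked = sorted(index, key=lambda t: float(q @ t[2]), reverse=True)
      return [c for _, c, _ in ranked[:k]]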


Embeddings/RAG don't address the problem I'm talking about. The issue is that you can stuff the entire context window full of code and the models will superficially leverage it, but will still violate existing conventions, inappropriately bring in dependencies, duplicate functionality, etc. They don't "grok" the context at the correct level.


Cursor has a 10K-token context window, which is quite low compared to the top LLMs.

https://forum.cursor.com/t/capped-at-10k-context-no-matter-a...


The main and best model according to many is Claude 3.5, which supports a maximum of 200K tokens [1]. I understand the cost-effectiveness concerns and other limitations of embeddings, but capping the context at just 5% of that maximum is probably too low by any standard (rough numbers below).

[1] https://support.anthropic.com/en/articles/7996856-what-is-th...
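
For scale, a back-of-the-envelope comparison (the ~4 characters-per-token heuristic is rough and real tokenizers vary):

  # How much of a repo fits in a 10K vs 200K token window, roughly.
  from pathlib import Path

  total_chars = sum(len(p.read_text(errors="ignore"))
                    for p in Path(".").rglob("*.py"))
  approx_tokens = total_chars / 4  # crude chars-per-token heuristic

  for window in (10_000, 200_000):
      share = min(1.0, window / approx_tokens) if approx_tokens else 1.0
      print(f"{window:>7,}-token window covers ~{share:.1%} of this repo")

  # The ratio in question: 10_000 / 200_000 == 0.05, i.e. 5%.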


You have to switch to long context chat to get the full 200k (I don’t think this works for Composer though).


Cursor is pretty unique and advanced in this regard. They talk a lot about this in the Lex Fridman podcast; very interesting.


It's the user's job to provide the context the LLM needs through plain-language instructions, not just to rely on the LLM to magically understand everything from the codebase.



