Hacker News

I agree that many AI coding tools have rushed to adopt naive RAG on code.

Have you done any quantitative evaluation of your wiki-style code summaries? My first impression is that they might be too wordy and not deliver valuable context in a token-efficient way.

Aider uses a repository map [0] to deliver code context. Relevant code is identified using a graph optimization on the repository's AST & call graph, not vector similarity as is typical with RAG. The repo map shows the selected code within its AST context.

Aider currently holds the 2nd highest score on the main SWE Bench [1], without doing any code RAG. So there is some evidence that the repo map is effective at helping the LLM understand large code bases.

[0] https://aider.chat/docs/repomap.html

[1] https://aider.chat/2024/06/02/main-swe-bench.html
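To make the graph-optimization idea above concrete, here is a minimal sketch of ranking files by importance in a reference graph using PageRank-style power iteration. This is purely illustrative, not Aider's actual implementation (which works with tree-sitter symbol definitions and references rather than whole files); all names here are hypothetical.

```python
# Hedged sketch: rank files by reference-graph importance, in the
# spirit of a repo map. Illustrative only -- not Aider's real code.

def rank_files(references, damping=0.85, iters=50):
    """references: dict mapping a file to the files it references.
    Returns a dict of file -> PageRank-style importance score."""
    nodes = set(references)
    for targets in references.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src, targets in references.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for t in targets:
                new[t] += share
        # redistribute mass from dangling nodes (no outgoing refs)
        dangling = sum(rank[n] for n in nodes if not references.get(n))
        for n in nodes:
            new[n] += damping * dangling / len(nodes)
        rank = new
    return rank

refs = {
    "app.py": ["utils.py", "db.py"],
    "cli.py": ["utils.py"],
    "db.py": ["utils.py"],
}
scores = rank_files(refs)
# utils.py is referenced by every other file, so it ranks highest
```

The point of using a graph walk rather than raw reference counts is that a reference from an important file should count for more than one from an unimportant file.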




I've been thinking about this a lot recently. So in Aider, it looks like "importance" is based on just the number of references to a particular file, is that right?

It seems like in a large repo, you'd want to have a summary of, say, each module, and what its main functions are, and allow the LLM to request repo maps of parts of the repo based on those summaries. e.g. in my website project, I have a documentation module, a client side module, a server side module, and a deployment module. It seems like it would be good for the AI to be able to determine that a particular request requires changes to the client and server parts, and just request those.


The repo map is computed dynamically, based on the current contents of the coding chat. So "importance" is relative to that, and will pull out the parts of each file which are most relevant to the task at hand.
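One way to make such a map dynamic is to score candidate snippets for relevance to the current chat and then pack the highest-scoring ones into a fixed token budget. The sketch below is an assumption about how that selection step could look, not Aider's actual code; the function and data shapes are hypothetical.

```python
# Hedged sketch: fit the most relevant snippets into a token budget.
# Illustrative only -- not Aider's actual selection logic.

def select_snippets(snippets, token_budget):
    """snippets: list of (score, token_count, text) tuples, where
    score reflects relevance to the current chat. Greedily keeps
    the highest-scoring snippets that fit within token_budget."""
    chosen, used = [], 0
    for score, tokens, text in sorted(snippets, reverse=True):
        if used + tokens <= token_budget:
            chosen.append(text)
            used += tokens
    return chosen

snips = [
    (0.9, 50, "def handle_request(...): ..."),
    (0.6, 80, "class Router: ..."),
    (0.3, 40, "def log(...): ..."),
]
picked = select_snippets(snips, token_budget=100)
```

Here the 80-token snippet is skipped because it would overflow the budget after the top-scoring one is taken, so the cheaper low-score snippet fills the remaining space instead.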


Interesting, how does Aider decide what’s relevant to the chat?


I had forgotten that Aider uses tree-sitter for syntactic analysis. Happy to find you've got the tree-sitter queries ready to retrieve code information from source. I had been researching how to write those queries myself, for exactly the same purpose as Aider.
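For anyone else looking into this, tag-style tree-sitter queries for pulling out definitions follow a standard pattern. The fragment below is an illustrative example against the tree-sitter Python grammar (Aider ships its own query files per language, which may use different capture names):

```scheme
; Example tags-style queries for the tree-sitter Python grammar:
; capture function and class definition names for listing in a map.

(function_definition
  name: (identifier) @name.definition.function) @definition.function

(class_definition
  name: (identifier) @name.definition.class) @definition.class
```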


I tried using Aider, but my codebase is a mix of Clojure, ClojureScript, and Java. I gave up on making it work for me, as it created more issues than it solved. What I really hated about Aider was that it made code changes without my approval.


You might be interested in my project Plandex[1]. It's similar to Aider in some ways, but one major difference is that proposed changes are accumulated in a version-controlled sandbox rather than being directly applied to project files.

1 - https://github.com/plandex-ai/plandex


You can give 16x Prompt a try. It's a GUI desktop app designed for AI coding workflows. It also doesn't automatically make code changes.

https://prompt.16x.engineer/


The recommended workflow is to just use /undo to revert any edits that you don’t like.



