[Feature]: Aider-inspired RepoMap #2185
Comments
@ryanhoangt will probably take a look at this |
Reference: #2248 Add Aider-inspired RepoMap |
See also: |
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
IMO this shouldn't be closed as stale |
repomap and graphrag are critical features IMO for coding agents. |
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
Is this the reason why OpenHands hangs when the GitHub repo is too large? Is there no systematic way of understanding or ignoring files and folders (including by extension)? |
Hey @TomLucidor, sorry about the trouble! If you have a large repo, OpenHands should not hang, although it might become confused. The hanging behavior is likely due to something else. We'd be happy to help diagnose this, but in order to do so we'll need to be able to reproduce the problem. The easiest way for us to do this is if you press the "feedback" button when you encounter this behavior, or share logs. If you can do that, it'd be great if you could open a new issue. I'd also be happy to discuss on Slack. |
Hey @ryanhoangt can you self-assign this one? |
Sure! |
As a little heads up, has anyone checked these repos out? (something along the lines of "codebase prompt") |
I found this kind of tool helpful, but not for feeding it as such to It's not great (Gemini is not great, at least pre-2.0), but I get something out of it. For now, I doubt it has the precision to be useful for the agent, considering the tradeoffs here. Have you tried to give it like that, in a message? |
@enyst what do you have in mind compared to the established tools made specifically for chatbots? Cus I am tempted but not even sure whether to put/generate it in a text file for the repo, OR to throw it directly in the dialogue box. Asking second opinions |
Just wanted to point out, you can import |
@rbren where is the PR that completed this? |
Just noting here that @xingyaoww discussed the reason for closing this issue in today's community meetup (Jan 31, 2025), so until more info is added here, it should be in the recording about 15 min in. |
I believe @ryanhoangt can explain more about this, but in summary, we've been trying something like this: All-Hands-AI/openhands-aci#41 for integrating an Aider-inspired repo map -- and it seems we didn't get it working on Claude after a few attempts (e.g., performance on SWE-Bench didn't improve when adding those file skeletons) -- Hoang, feel free to correct me if I'm wrong, and it would be nice if you could share some numbers too!
As Engel said, there's a significant trade-off here:
I would think things like RepoMap were initially created to build context for a "single-turn LLM-based system." Here are some baseline code localization performance numbers from a paper I've been working on with some folks (haven't released yet! I can share the preprint here when we get it online): You can see that even though we don't have any system like "RepoMap" to manually construct context within a large repo - whereas other systems like "agentless" and "moatless" all have RepoMap-style systems - they didn't really outperform OpenHands across the board (and were sometimes worse!). Based on these existing results, I personally feel it is probably less productive to keep pushing in this direction compared to other things, after spending a lot of effort here -- but I could be totally wrong, and would be open to re-opening this PR if people from the community are interested in taking over and actually making it work ❤ |
cc @jimwhite -- thanks for the great discussion in today's community meeting! 👆 hopefully this response above would better answer your question :) |
Hey @xingyaoww! You mentioned that agents often do better when left to explore versus being given a repo map. Could you share some specific examples where the agent's autonomous exploration led to better solutions than when provided with structured context? |
I am thinking something like |
The LLM is already supposed to get in the prompt, after each command, the current directory:
If that is not happening, maybe you could post a new issue, with some log with it missing or wrong? But I think it's also possible that we send it and sometimes the LLM ignores it anyway (depending on LLM). |
It might be an LLM issue then @enyst since I am heavily leaning on DeepSeek v3 atm, but it can't invoke the observation whenever it moves around working directories it seems? |
It's automatic. The LLM asks for a command to execute ('cd ...'), the environment performs it and responds with an observation which includes |
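A toy sketch of the loop described above, assuming a hypothetical `ToyShell` environment and `Observation` type (neither is OpenHands code, and the `[cwd: ...]` line format is invented for illustration): the environment tracks the working directory itself and attaches it to every observation, so the LLM never has to ask for it.

```python
import os
import subprocess
from dataclasses import dataclass


@dataclass
class Observation:
    """What the environment sends back to the LLM after each command."""
    output: str
    cwd: str

    def as_prompt(self) -> str:
        # Hypothetical rendering: the working directory is appended automatically.
        return f"{self.output}\n[cwd: {self.cwd}]"


class ToyShell:
    """Hypothetical stand-in for an agent runtime; not OpenHands' implementation."""

    def __init__(self, start_dir: str):
        self.cwd = os.path.realpath(start_dir)

    def run(self, command: str) -> Observation:
        if command.startswith("cd "):
            # Handle `cd` in-process so the new directory persists across commands.
            arg = command[3:].strip()
            target = os.path.realpath(os.path.join(self.cwd, arg))
            if os.path.isdir(target):
                self.cwd = target
                output = ""
            else:
                output = f"cd: no such directory: {arg}"
        else:
            # Any other command runs in the tracked working directory.
            proc = subprocess.run(
                command, shell=True, cwd=self.cwd, capture_output=True, text=True
            )
            output = proc.stdout + proc.stderr
        return Observation(output=output, cwd=self.cwd)
```

Whether a given model actually uses the reported directory is a separate question; the sketch only shows that the environment, not the model, is responsible for reporting it.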
Very interesting writeup @xingyaoww |
As promised! I'm sharing the official results above and our preprint here: https://arxiv.org/abs/2503.09089 @czlll, @tangxiangru, @Hydrapse and I have been working to figure out localization problems in coding agents using OpenHands. We are finally able to share the preprint:
@Zhaoling Chen will help integrate this LocAgent into OpenHands in the coming weeks so stay tuned 🙂 |
Looks like an interesting paper. Thanks for the somewhat oblique and generic citation to aider. Is there a reason you didn't directly cite aider's repo-map, discuss it in the related work section or include it in your experimental evaluations? At a minimum, your work seems quite strongly "inspired by" aider's repo-map. You've been participating in this GitHub issue titled "Aider-inspired RepoMap" for months, discussing your attempts to implement and evaluate a version of it in OpenHands. Your top-listed contribution sounds very familiar.
Aider's original repo-map code and blog article were published in May 2023. A follow up blog article was published in Oct 2023, describing the tree-sitter implementation of repo-map. |
Very exciting @xingyaoww! Thank you for sharing and for the follow up on this thread. |
@paul-gauthier Sorry for the oversight & appreciate the feedback! I didn't realize there's a specific blog directly discussing the "repo-map" - we will be sure to cite and discuss it in the next revision of the preprint :) (cc @czlll) I have to admit, even though we are discussing this under the thread of Aider-inspired RepoMap, the question we were trying to answer in this paper is more about which is better, "agentic-based information gathering" or "passively provided information" (e.g., via embedding, where the LLM has few options to control the information being retrieved). IIRC, the original AiderMap is designed primarily for the latter category, and it didn't immediately come to mind when reviewing the paper 😅 . But now, reflecting back, the indexing approach we took does look similar to what AiderMap was doing under the hood, and we will certainly discuss it and give credit. I'd say the differences between LocAgent and Aider are more about "how LLMs are using this graph-based indexing." My (maybe outdated) knowledge of Aider is that it automatically constructs a RepoMap for the LLM (based on previously mentioned keywords, etc.) to provide the context for the LLM to perform edits (more similar to the passively provided information); whereas LocAgent is more about allowing the LLM to use tools to explicitly traverse the graph (leaning more towards agentic-based information gathering - which is something we are hoping to continue to do in OpenHands). |
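To make the "passive" versus "agentic" distinction concrete, here is a minimal sketch over a toy code graph. Everything in it is hypothetical (the `CodeGraph` class, the relation names, and the `passive_context` helper); it does not reproduce LocAgent's or Aider's actual schema or tools, which this thread does not describe.

```python
from collections import defaultdict


class CodeGraph:
    """Toy index: nodes are files or symbols, edges carry a relation label."""

    def __init__(self):
        self.edges = defaultdict(set)  # node -> {(relation, target), ...}

    def add_edge(self, source, relation, target):
        self.edges[source].add((relation, target))
        self.edges.setdefault(target, set())  # make sure the target is a known node

    # The two methods below are the "tools" an agent could call step by step.
    def search(self, keyword):
        return sorted(n for n in self.edges if keyword.lower() in n.lower())

    def neighbors(self, node, relation=None):
        return sorted(
            t for r, t in self.edges.get(node, ())
            if relation is None or r == relation
        )


def passive_context(graph, keywords, limit=5):
    """Passive style: the system picks nodes up front and hands them to the LLM."""
    hits = []
    for kw in keywords:
        for node in graph.search(kw):
            if node not in hits:
                hits.append(node)
    return hits[:limit]


graph = CodeGraph()
graph.add_edge("app.py", "contains", "app.py::main")
graph.add_edge("app.py::main", "invokes", "db.py::connect")
graph.add_edge("db.py", "contains", "db.py::connect")

# Passive: one retrieval call, decided by the system before the LLM sees anything.
context = passive_context(graph, ["main"])

# Agentic: the LLM chooses each hop itself (simulated here by two explicit tool calls).
start = graph.search("main")[0]
callees = graph.neighbors(start, relation="invokes")
```

In the passive case the LLM only sees `context`; in the agentic case it decides which node to expand next, so the same index can serve both styles.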
@paul-gauthier We will cite aider's repo-map in our next version. Thanks. |
What problem or use case are you trying to solve?
Aider has a functionality to create a RepoMap, which is a concise description of the repo in text format, with the most relevant/important parts highlighted.
Describe the UX of the solution you'd like
It would be nice to have a RepoMap class within OpenDevin that can be used by any agent to pull in a description of the repo.
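As a rough illustration of what such a class could look like, here is a minimal sketch that uses only Python's standard-library `ast` module and handles only Python files. The class shape, method names, and `ignore_dirs` default are assumptions made for this example; Aider's repo map uses tree-sitter and ranks symbols by relevance, which this sketch does not attempt.

```python
import ast
from pathlib import Path


class RepoMap:
    """Toy repo map: lists top-level classes and functions for each Python file."""

    def __init__(self, root, ignore_dirs=(".git", "node_modules", "__pycache__")):
        self.root = Path(root)
        self.ignore_dirs = set(ignore_dirs)

    def _python_files(self):
        for path in sorted(self.root.rglob("*.py")):
            rel = path.relative_to(self.root)
            # Skip any file that lives under an ignored directory.
            if not self.ignore_dirs.intersection(rel.parts):
                yield path

    @staticmethod
    def _signature(node):
        args = ", ".join(a.arg for a in node.args.args)
        return f"def {node.name}({args})"

    def _skeleton(self, path):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            return []  # unparsable files are simply left out of the map
        funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
        lines = []
        for node in tree.body:
            if isinstance(node, funcs):
                lines.append(self._signature(node))
            elif isinstance(node, ast.ClassDef):
                lines.append(f"class {node.name}")
                lines.extend(
                    f"    {self._signature(item)}"
                    for item in node.body
                    if isinstance(item, funcs)
                )
        return lines

    def render(self):
        """Return the map as plain text, ready to paste into a prompt."""
        out = []
        for path in self._python_files():
            skeleton = self._skeleton(path)
            if skeleton:
                out.append(f"{path.relative_to(self.root).as_posix()}:")
                out.extend(f"    {line}" for line in skeleton)
        return "\n".join(out)
```

An agent could call `RepoMap(repo_root).render()` once and include the resulting text in its context; a version closer to Aider's would also need a ranking step and a token budget.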
Do you have thoughts on the technical implementation?
indexing folder here, or in the memory folder.
Describe alternatives you've considered
We could also implement this from scratch, or create improved code search functionality.
Additional context