I was getting tired of copy/pasting reams of code into GPT-4 to give it context before I asked it to help me, so I started this small tool. In a nutshell, gpt-repository-loader will spit out file paths and file contents in a prompt-friendly format. You can also use .gptignore to ignore files/folders that are irrelevant to your prompt.
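To make the idea concrete, here is a minimal sketch of what "file paths and file contents in a prompt-friendly format" could look like. It is not the actual implementation in gpt_repository_loader.py. The `----` separator, the glob-style matching of ignore patterns via `fnmatch`, and the function names `load_ignore_patterns`, `is_ignored`, and `render_repository` are all assumptions made for illustration; only the closing `--END--` line is taken from this README.

```python
import fnmatch
import os

SEPARATOR = "----"      # assumed delimiter between files, for illustration only
END_MARKER = "--END--"  # closing line mentioned in this README


def load_ignore_patterns(ignore_file):
    """Read glob patterns from an ignore file, skipping blank lines and # comments."""
    if not os.path.exists(ignore_file):
        return []
    with open(ignore_file, encoding="utf-8") as handle:
        lines = [line.strip() for line in handle]
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(relative_path, patterns):
    """Return True when the path matches any pattern (assumed glob semantics)."""
    return any(fnmatch.fnmatch(relative_path, pattern) for pattern in patterns)


def render_repository(root, patterns):
    """Return one string holding each kept file's path followed by its contents."""
    chunks = []
    for directory, subdirs, files in os.walk(root):
        subdirs.sort()  # sort in place so the walk order is deterministic
        for name in sorted(files):
            full_path = os.path.join(directory, name)
            relative_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            if is_ignored(relative_path, patterns):
                continue
            with open(full_path, encoding="utf-8", errors="ignore") as handle:
                contents = handle.read()
            chunks.append(f"{SEPARATOR}\n{relative_path}\n{contents}\n")
    chunks.append(END_MARKER)
    return "".join(chunks)
```

With a `.gptignore` containing `logs/*`, a call like `render_repository(".", load_ignore_patterns(".gptignore"))` would skip everything under `logs/` and return the remaining files as one block of text ready to paste into a prompt.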
gpt-repository-loader as-is works pretty well in helping me achieve better responses. Eventually, I thought it would be cute to load itself into GPT-4 and have GPT-4 improve it. I was honestly surprised by PR#17. GPT-4 was able to write a valid example repo and an expected output, and throw in a small curveball by adjusting .gptignore. I did tell GPT the output file format in two places: 1.) in the preamble when I prompted it to make a PR for issue #16 and 2.) as a string in gpt_repository_loader.py, both of which are indirect ways to infer how to build a functional test. However, I don't think I explained to GPT in English anywhere how .gptignore works at all!
I wonder how far GPT-4 can take this repo. Here is the process I'm following for developing:
- Open an issue describing the improvement to make
- Construct a prompt - start by running gpt_repository_loader.py on this repo to generate the repository context, then append the text of the opened issue after the --END-- line.
- Try not to edit any code GPT-4 generates. If there is something wrong, continue to prompt GPT to fix whatever it is.
- Create a feature branch for the issue and create a pull request based on GPT's response.
- Have a maintainer review, approve, and merge.
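The prompt-construction step above can be sketched roughly as follows. This is an illustration under assumptions, not code from this repo: `build_prompt` is a hypothetical helper, and it assumes the repository context is already available as a string that ends with the `--END--` line.

```python
END_MARKER = "--END--"  # closing line of the repository context, per the steps above


def build_prompt(repository_context, issue_title, issue_body):
    """Append the issue text after the --END-- line of the repository context."""
    context = repository_context.rstrip()
    if not context.endswith(END_MARKER):
        # Guard against a truncated or hand-edited context.
        raise ValueError("repository context must end with the --END-- line")
    issue_text = f"{issue_title.strip()}\n\n{issue_body.strip()}"
    return f"{context}\n{issue_text}\n"
```

For example, `build_prompt(context, "Add a functional test", "Describe the expected output ...")` would give a single string to paste into GPT-4: the repository first, the issue last.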
I am going to try to automate the steps above as much as possible. Really curious how tight the feedback loop will eventually get before something breaks!
The initial prompt would be, "person wants to do x, here is the file list of this repo: ...., give me a list of files that you'd want to edit, create or delete" -> take the list, try to fit their contents into 32k tokens, and re-prompt with "user is trying to achieve x, here are the most relevant files with their contents:..., give me a git commit in the style of git patch/diff output". From playing around with it today, I think this approach would work rather well and could be a huge step up from AI line autocompletion.
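Here is a rough sketch of that two-stage flow, to show where the 32k budget comes into play. Everything in it is hypothetical: the function names, the `reserved` headroom for instructions and the reply, and especially `estimate_tokens`, which uses a crude "about 4 characters per token" guess instead of a real tokenizer. The actual call to the model is left out on purpose.

```python
TOKEN_BUDGET = 32_000  # the 32k context size mentioned above


def estimate_tokens(text):
    """Crude guess of about 4 characters per token; an assumption, not a tokenizer."""
    return len(text) // 4 + 1


def file_list_prompt(goal, paths):
    """Stage 1: ask which files the model would want to touch."""
    listing = "\n".join(paths)
    return (
        f"Person wants to do: {goal}\n"
        f"Here is the file list of this repo:\n{listing}\n"
        "Give me a list of files that you'd want to edit, create or delete."
    )


def fit_files(requested_paths, contents_by_path, budget=TOKEN_BUDGET, reserved=2_000):
    """Keep requested files, in order, while the estimated total stays within budget."""
    remaining = budget - reserved
    kept = []
    for path in requested_paths:
        text = contents_by_path.get(path)
        if text is None:
            continue  # a file the model wants to create has no contents yet
        cost = estimate_tokens(path) + estimate_tokens(text)
        if cost > remaining:
            continue  # too large for what is left; try the next, smaller file
        kept.append(path)
        remaining -= cost
    return kept


def patch_prompt(goal, kept_paths, contents_by_path):
    """Stage 2: re-prompt with the selected files and ask for a patch."""
    sections = "\n".join(f"{path}\n{contents_by_path[path]}" for path in kept_paths)
    return (
        f"User is trying to achieve: {goal}\n"
        f"Here are the most relevant files with their contents:\n{sections}\n"
        "Give me a git commit in the style of git patch/diff output."
    )
```

Skipping an oversized file and moving on to the next one is just one possible policy; truncating large files or asking the model to rank its own list would be equally reasonable choices.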