I've used this pattern on two separate codebases. One was a ~500k LOC Apache Airflow monolith repo (I am a data engineer). The other was a greenfield Flutter side project (I don't know Dart, Flutter, or really much of anything regarding mobile development).
All I know is that it works. On the greenfield project the code is simple enough to mostly just run `/create_plan` and skip research altogether. You still get the benefit of the agents and everything.
The key is really truly reviewing the documents that the AI spits out. Ask yourself if it covered the edge cases that you're worried about or if it truly picked the right tech for the job. For instance, did it break out of your SQLite pattern and suggest using Postgres or something like that? These are very simple checks that you can spot in an instant. Usually chatting with the agent after the plan is created is enough to REPL-edit the plan directly with Claude Code while it's got it all in context.
At my day job I've got to use GitHub Copilot, so I had to tweak the prompts a bit, but the intentional compaction between steps still happens, just not quite as efficiently because Copilot doesn't support sub-agents in the same way as Claude Code. However, I am still able to keep productivity up.
-------
A personal aside.
Immediately before AI-assisted coding really took off, I had started to feel really depressed about how boring my job was becoming. Everything just felt like such a chore. The death by a million paper cuts is real in a large codebase with the interplay and idiosyncrasies of multiple repos, teams, personalities, etc. The main benefit of AI-assisted coding for me personally seems to be smoothing over those paper cuts.
I derive pleasure from building things that work. Every little thing standing between me and that goal was sucking the pleasure out of the activity I spend most of my day doing. I am much happier now, having impressed myself with what I can build if I stick to it.
I appreciate the share. Yes, as I said, it was pretty dang uncomfortable to transition to this new way of working, but now that it's settled we're never going back.
I am not the author of this post. The exploration of the Scheme-based sandbox permissions DSL was interesting to me. It's a classic issue of a custom parser with bad input validation.
Semi-related, but I have the same experience with the Gemini mobile app on Android. ChatGPT and Claude are both great user experiences, and the best word to describe how the Gemini app feels is flaky.
This is something fine tuning should be able to improve. The main caveats are the same as always: dataset collection, labeling, and training/experimentation time.
Improve, yes; fix, no. I was actually already using a model fine-tuned for Skyrim.
I didn't go into it in detail, but it isn't even that I got the NPCs to start babbling about Taylor Swift. What it was was just that they knew she was a musician and, as such, might be at the tavern. That's very hard to remove.
I'm trying (and failing; it's pretty hard to steer) to generate prompts for nb by extracting info from books. The idea is to generate images of settings and characters that remain consistent as the story unfolds.
I'm doing this as part of a read-along style app for kids/YA fiction.
I tried the Colab notebook that they link to and couldn't replicate the quality, for whatever reason. I just swapped out the text and let it run on the opening paragraph of The Metamorphosis by Franz Kafka, and it seemingly could not handle the intricacies.
This is the first post in a series I'm hoping to write on applied AI engineering fundamentals. Heavy emphasis on the fundamentals... for many it will be an obvious review, but for others I hope it provides the confidence to build more interesting things.
Next up is probably an article on applications of tool use and/or Anthropic's Model Context Protocol. I've also planned an article on OpenAI's real-time voice APIs, as used in those AI call-screening solutions that funnel leads down into your CRM.
This is insane to read as a data engineer who actually builds software. To be perfectly honest, these sound like amateurs, not experienced data engineers.
There are plenty of us out here with many repos, dozens of contributors, and thousands of lines of Terraform, Python, custom GitHub Actions, k8s deployments running Airflow and internal full-stack web apps that we're building, EMR Spark clusters, etc., all living in our own Snowflake/AWS accounts that we manage ourselves.
The data scientists that we service use notebooks extensively, but it's my team's job to clean their work up and make it testable and efficient. You can't develop real software in a notebook; it sounds like they need to upskill into a real orchestration platform like Airflow and run everything through it.
Unit-test the utility functions and helpers; data-quality-test the data flowing in and out. Build diff reports for understanding big swings in the data before signing off on changes.
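To make the diff-report idea concrete, it can be as simple as something like this (a rough pandas sketch, not code from our actual pipeline; the `order_id` key and 10% threshold are made up):

```python
# Rough sketch: a tiny diff report between the previous and current load of a
# table, so a human can review and sign off on big swings in the data.
import pandas as pd

def row_count_swing(prev: pd.DataFrame, curr: pd.DataFrame) -> float:
    """Relative change in row count between the previous and current load."""
    if len(prev) == 0:
        return float("inf")
    return (len(curr) - len(prev)) / len(prev)

def diff_report(prev: pd.DataFrame, curr: pd.DataFrame, key: str) -> dict:
    """Summarize the changes that usually matter for sign-off."""
    prev_keys, curr_keys = set(prev[key]), set(curr[key])
    return {
        "row_count_swing": row_count_swing(prev, curr),
        "keys_added": len(curr_keys - prev_keys),
        "keys_removed": len(prev_keys - curr_keys),
        "null_rate_by_column": curr.isna().mean().to_dict(),
    }

# Flag anything beyond a 10% row-count swing for manual review:
# report = diff_report(prev_df, curr_df, key="order_id")
# assert abs(report["row_count_swing"]) < 0.10, report
```

The unit tests cover the helpers; the diff report is what a human looks at before the change ships.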
My email is in my profile; I'm happy to discuss further! :-)
to my `~/.claude/CLAUDE.md` file. This has, in my experience, been enough to get atomic commits after every minor change. I'm probably not the target audience for this, though.
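For illustration, an instruction along these lines (a hypothetical example of that kind of `CLAUDE.md` entry, not the exact snippet referenced above):

```markdown
## Commits
- After each small, self-contained change, make a git commit with a short,
  descriptive message before moving on to the next change.
```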