I don't doubt that some people are mistaken or dishonest in their self-reports as the article asserts, but my personal experience at least is a firm counterexample.
I've been heavily leaning on AI for an engagement that would otherwise have been impossible for me to deliver to the same parameters and under the same constraints. Without AI, I simply wouldn't have been able to fit the project into my schedule, and would have turned it down. Instead, not only did I accept and fit it into my schedule, I was able to deliver on all stretch goals, put in much more polish and automated testing than originally planned, and accommodate a reasonable amount of scope creep. With AI, I'm now finding myself evaluating other projects to fit into my schedule going forward that I couldn't have considered otherwise.
I'm not going to specifically claim that I'm an "AI 10x engineer", because I don't have hard metrics to back that up, but I'd guesstimate that I've experienced a ballpark 10x speedup for the first 80% of the project and maybe 3 - 5x+ thereafter depending on the specific task. That being said, there was one instance where I realized halfway through typing a short prompt that it would have been faster to make those particular changes by hand, so I also understand where some people's skepticism is coming from if their impression is shaped by experiences like that.
I believe the discrepancy we're seeing across the industry is that prompt-based engineering and traditional software engineering are overlapping but distinct skill sets. Speaking for myself, prompt-based engineering has come naturally due to strong written communication skills (e.g. experience drafting/editing/reviewing legal docs), strong code review skills (e.g. participating in security audits), and otherwise being what I'd describe as a strong "jack of all trades, master of some" in software development across the stack. On the other hand, for example, I could easily see someone who's super 1337 at programming high-performance algorithms and mid at most everything else finding that AI insufficiently enhances their core competency while also being difficult to effectively manage for anything outside of that.
As to how I actually approach this:
* Gemini Pro is essentially my senior engineer. I use Gemini to perform codebase-wide analyses, write documentation, and prepare detailed sprint plans with granular todo lists. Particularly for early stages of the project or major new features, I'll spend several hours at a time meta-prompting and meta-meta-prompting with Gemini just to get a collection of prompts, documents, and JSON todo lists that encapsulate all of my technical requirements and feedback loops. This is actually harder than manual programming because I don't get the "break" of typing out all the trivial and boilerplate parts of coding; my prompts here are much more information-dense than code.
* Claude Sonnet is my coding agent. For Gemini-assisted sprints, I'll fire Claude off with a series of pre-programmed prompts (there's a rough sketch of how I script this after this list) and let it run for hours overnight. For smaller things, I'll pair program with Claude directly and multitask while it codes, or, if I really need a rest, I'll take breaks between prompts.
* More recently, Grok 4 through the Grok chat service is my Stack Overflow. I can't rave enough about it. Asking it questions and/or pasting in code diffs for feedback gets incredible results. Sometimes I'll just act as a middleman pasting things back and forth between Grok and Claude/Gemini while multitasking on other things, and find that they've collaboratively resolved the issue. Occasionally, I've landed on the correct solution on my own within the 2 - 3 minutes it took for Grok to respond, but even then the second opinion was useful validation. o3 is good at this too, but Grok 4 has been on another level in my experience; its information is usually up to date, and its answers are usually either correct or at least on the right track.
* I've heard from other comments here (possibly from you, Simon, though I'm not sure) that o3 is great at calling out anti-patterns in Claude output, e.g. its obnoxious tendency to default to keeping old internal APIs and marking them as "legacy" or "for backwards compatibility" instead of just removing them and fixing the resulting build errors. I'll be giving this a shot during tech debt cleanup.
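For the curious, here's roughly what my overnight sprint runner looks like, boiled way down. The JSON task schema is hypothetical and the non-interactive "claude -p" invocation is just how I happen to drive the agent; treat this as an illustrative sketch and substitute whatever CLI and task format you use:

    # run_sprint.py - simplified sketch of an overnight prompt runner.
    # The task schema and CLI invocation are placeholders, not a spec.
    import json, subprocess, sys, datetime

    def run_sprint(todo_path: str, log_path: str = "sprint.log") -> None:
        """Feed each Gemini-prepared task prompt to the coding agent in order."""
        with open(todo_path) as f:
            tasks = json.load(f)  # e.g. [{"id": "T1", "prompt": "...", "done": false}, ...]

        with open(log_path, "a") as log:
            for task in tasks:
                if task.get("done"):
                    continue
                log.write(f"\n=== {task['id']} @ {datetime.datetime.now().isoformat()} ===\n")
                # "claude -p" runs a single non-interactive prompt; swap in your own agent CLI.
                result = subprocess.run(
                    ["claude", "-p", task["prompt"]],
                    capture_output=True, text=True,
                )
                log.write(result.stdout)
                if result.returncode != 0:
                    log.write(result.stderr)
                    break  # stop the sprint so one bad step doesn't cascade overnight
                task["done"] = True
                with open(todo_path, "w") as f:
                    json.dump(tasks, f, indent=2)  # checkpoint progress after each task

    if __name__ == "__main__":
        run_sprint(sys.argv[1])

The checkpointing matters more than it looks: if something goes sideways at 3am, you want the run to stop and preserve state rather than push a bad change through the remaining tasks.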
As you can see, my process is very different from vibe coding. Vibe coding is fine for prototyping, or for non-engineers with no other options, but it's not how I would advise anyone to build a serious product for critical use cases.
One neat thing I was able to do, with a couple days' notice, was add a script to generate a super polished product walkthrough slide deck with a total of like 80 pages of screenshots and captions covering different user stories, with each story getting its own zoomed-out overview diagram of thumbnails linking to the actual slides. It looked way better than any other product overview deck I've put together by hand in the past, with the bonus that we've regenerated it on demand any time an up-to-date deck showing the latest iteration of the product was needed. This honestly could be a pretty useful product in itself. Without AI, we would've been stuck putting together a much worse deck by hand, and it would've gotten stale immediately. (I've been in the position of having to give disclaimers about product materials being outdated when sharing them, and it's not fun.)
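To give a sense of the shape of that deck script, here's a bare-bones sketch. The stories.json layout and file names are hypothetical, and the real thing also handles screenshot capture, styling, and the per-story diagrams:

    # make_deck.py - minimal illustration of the walkthrough-deck generator's shape.
    import json, html
    from pathlib import Path

    def build_deck(stories_path: str, out_dir: str = "deck") -> None:
        stories = json.loads(Path(stories_path).read_text())
        # expected shape: [{"title": "...", "slides": [{"image": "a.png", "caption": "..."}]}]
        # image paths are assumed to be relative to out_dir
        out = Path(out_dir)
        out.mkdir(exist_ok=True)

        for s_idx, story in enumerate(stories):
            # Overview page per user story: a grid of thumbnails linking to the full slides.
            thumbs = "".join(
                f'<a href="slide_{s_idx}_{i}.html"><img src="{html.escape(sl["image"])}" width="160"></a>'
                for i, sl in enumerate(story["slides"])
            )
            (out / f"story_{s_idx}.html").write_text(
                f"<h1>{html.escape(story['title'])}</h1><div>{thumbs}</div>"
            )
            # One page per slide: full-size screenshot plus its caption.
            for i, sl in enumerate(story["slides"]):
                (out / f"slide_{s_idx}_{i}.html").write_text(
                    f'<img src="{html.escape(sl["image"])}" width="100%">'
                    f"<p>{html.escape(sl['caption'])}</p>"
                    f'<p><a href="story_{s_idx}.html">back to overview</a></p>'
                )

    if __name__ == "__main__":
        build_deck("stories.json")

Because the deck is generated from the project's own screenshots and a captions file, regenerating an up-to-date version is a one-command affair, which is what kept it from going stale.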
Anyway, I don't know if any of this will convince anyone to take my word for it, but hopefully some of my techniques can at least be helpful to someone. The only real metric I have to share offhand is that the project has over 4000 (largely non-trivial) commits made substantially solo across 2.5 months on a part-time schedule juggled with other commitments, two vacations, and time spent on aspects of the engagement other than development. I realize that's a bit vague, but I promise that it's a fairly complex project which I feel pretty confident I wouldn't have been capable of delivering in the same form on the same schedule without AI. The founders and other stakeholders have been extremely satisfied with the end result. I'd post it here for you all to judge, but unfortunately it's currently in a soft launch status that we don't want a lot of attention on just yet.