Some believe, though they hesitate to put it in writing where it could be openly challenged, that regulations are a dysfunctional aspect of governance and that solutions to systemic issues lie at the individual level.
I do think that's often the case. Regulations in practice are frequently (but not always) worse than the individual solution and crowd the latter out in the process. Drug laws are a fantastic example. I doubt outlawing meth or opiates has been a net benefit on the whole if you truly look at the bigger picture. But if a mega corporation is peddling meth laced sodas to children outside the local elementary school I don't think "lol parent harder" is a sensible rebuttal (or even on topic) to people demanding some minimal amount of regulation. Banning meth in sodas is a perfectly reasonable solution there - go purchase it separately at the pharmacy!
In March 2025, Anthropic was claiming that 90% of code would be written by LLMs in three to six months, and "essentially all" code within twelve months. This was one week after closing a Series E round for $3.5 billion, when they began working on their Series F round for $13 billion. You shouldn't need more than that to understand what's going on here.
The Claude Code leak revealed that Anthropic runs Claude-operated bots on the internet. One should be very cautious in getting swept up in the fund-raising process if they are not seeing first-hand the fruition of all of the flattering claims being presented by strangers on the internet.
>March 2025, Anthropic was claiming that 90% of code would be written by LLMs in three to six months, and "essentially all" code within twelve months.
There's a pretty big difference between "We predict in X time frame our model will be capable of Y" and "Our model did Y."
This is like watching someone measure the size of an object and saying "I don't believe you because you guessed it was X before you pulled out your tape measure."
The irony here is that even if one is extracting legitimate value from LLMs because they are that much smarter than their peers, the process of using LLMs to perform all of their skilled labor makes them less intelligent.
"
Most people's mental model of Claude Code is that "it's just a TUI" but it should really be closer to "a small game engine".
For each frame our pipeline constructs a scene graph with React then
-> layouts elements
-> rasterizes them to a 2d screen
-> diffs that against the previous screen
-> finally uses the diff to generate ANSI sequences to draw
We have a ~16ms frame budget so we have roughly ~5ms to go from the React scene graph to ANSI written.
"
60fps is pathetic for a TUI when most terminals worth their salt are GPU accelerated and displays can run at 240 Hz or even more. But let’s be real: if I can play Quake at >500 fps, they have no excuse.
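For anyone curious what the last two steps of the quoted pipeline (diff against the previous screen, then generate ANSI sequences) look like in miniature, here is a rough Python sketch. It is not Claude Code's actual implementation: the `diff_to_ansi` name, the list-of-strings screen representation, and the assumption that both screens have identical dimensions are all made up for illustration. The only non-hypothetical part is the standard cursor-position escape sequence (`ESC [ row ; col H`), which is 1-based.

```python
from typing import List

CSI = "\x1b["  # Control Sequence Introducer


def diff_to_ansi(prev: List[str], curr: List[str]) -> str:
    """Return ANSI output that repaints only the cells that changed.

    Assumes (hypothetically) that a screen is a list of equal-length rows
    and that prev and curr have the same dimensions.
    """
    out = []
    for row, (old, new) in enumerate(zip(prev, curr)):
        col = 0
        width = len(new)
        while col < width:
            if old[col] == new[col]:
                col += 1
                continue
            # Collect a run of consecutive changed cells so we move the
            # cursor once per run instead of once per character.
            start = col
            while col < width and old[col] != new[col]:
                col += 1
            # Cursor position is 1-based: ESC [ row ; col H
            out.append(f"{CSI}{row + 1};{start + 1}H{new[start:col]}")
    return "".join(out)


if __name__ == "__main__":
    before = ["hello", "world"]
    after = ["hellp", "world"]
    # repr() so the escape bytes are visible instead of being interpreted
    print(repr(diff_to_ansi(before, after)))
```

The point of the diff step is that an unchanged screen costs zero bytes of output, and a one-character change costs one cursor move plus one character, regardless of how big the terminal is.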
To be fair, Claude Code is vibe-coded. It's a terrible piece of software from an engineering (and often usability) standpoint, and the problems run deeper than just the choice of JavaScript. But it is good enough for people to get what they want out of it.
But also, based on what I have heard of their headcount, they are not necessarily saving any money by vibecoding it - it seems like their productivity per programmer is still well within the historical range.
That isn’t necessarily a hit against them - they make an LLM coding tool and they should absolutely be dogfooding it as hard as they can. They need to be the ones to figure out how to achieve this sought-after productivity boost. But so far it seems to me like AI coding is more similar to past trends in industry practice (OOP, Scrum, TDD, whatever) than it is different in the only way that’s ever been particularly noteworthy to me: it massively changes where people spend their time, without necessarily living up to the hype about how much gets done in that time.
I want to get my car's transmission fixed, the mechanic shop is 50 meters away. Would it be better to walk or to drive to a location 50 meters away from my front door?
-
Opus 4.6:
<thinking>
The user is asking whether to walk or drive 50 meters to a mechanic shop, but there's a key nuance here - their transmission needs fixing, so driving might not be wise.
Done
</thinking>
Walk, absolutely. Driving a car with a compromised transmission even 50 meters risks making the problem worse and turning a repair into a more expensive one. Plus, 50 meters is barely a minute on foot.
No, what the other commenter described is narrowly scoped delegation to LLMs paired with manual review (which sounds dreadfully soul-sucking to me), not wholesale "write feature X, write the unit tests, and review the implementation for me". The latter is vibe-coding.
Reviewing a quick translation of a test to a benchmark (or other menial coding tasks) is way less soul-sucking than doing the menial coding by yourself. Boring soul-sucking tasks are an important thankless part of OSS maintenance.
I concur it is different from what you call vibecoding.
Sidenote, i do that frequently. I also do varying levels of review, ie more/less vibe[1]. It is soul sucking to me.
Despite being soul sucking, I do it because A: It lets me achieve goals despite lacking energy/time for projects that don't require the level of commitment or care that i provide professionally. B: it reduces how much RSI i experience. Typing is a serious concern for me these days.
To mitigate the soul sucking i've been side projecting better review tools. Which frankly i could use for work anyway, as reviewing PRs from humans could be better too. Also in line with review tools, i think a lot of soul sucking is having to provide specificity, so i hope to be able to integrate LLMs into the review tool and speak more naturally to it. Eg i believe some IDEs (vscode? no idea) can let Claude/etc see the cursor, so you can say "this code looks incorrect" without needing to be extremely specific. A suite of tooling that improves this code sharing to Claude/etc would also reduce the inane specificity that seems to be required to make LLMs even remotely reliable for me.
[1]: though we don't seem to have a term for varying amounts of vibe. Some people consider vibe to be 100% complete ignorance of the architecture/code being built. In which case imo nothing i do is vibe, which is absurd to me but i digress.
Even on codebases no more than half a year old, these LLMs often produce nasty (read: ungodly verbose) implementations that become a maintainability nightmare, even for the LLMs that wrote it all in the first place. I know this because we've had a steady trickle of clients and prospects expressing "challenges around maintainability and scalability" as they move toward "production readiness". Of course, they then ask if we can implement "better performing coding agents", as if improved harnessing or similar guardrails could solve what is, in my view, a deeper problem.
The practical and opportunistic response is to tell them "Tough cookies" and watch the problems steadily compound into more lucrative revenue opportunities for us. I really have no remorse for these people, because half of them were explicitly warned against this approach upfront but were psychologically incapable of adjusting expectations or delaying LLM deployment until the technology proved itself. If you've ever had your professional opinion dismissed by the same people who regard you as the SME, you understand my pain.
I suppose I'm just venting now. While we are now extracting money from the dumbassery, the client entitlement and management of their emotions that often comes with putting out these fires never makes for a good time.
This is exactly why enforcement needs to be architectural. The "challenges around maintainability and scalability" your clients hit exist because their AI workflows had zero structural constraints. The output quality problem isn't the model, it's the lack of workflow infrastructure around it.
No, the suite of linters, test suite and documentation in your codebase cannot be equated to “a better prompt” except in the sense that all feedback of any kind is part of what the model uses to make decisions about how to act.
A properly set up and maintained codebase is the core duty of a software engineer. Sounds like the great-grandparent comment’s client needed a software engineer.
What if LLMs are, at the end of the day, machines, and so for now generally dumber than humans, and the best they can provide is at most a statistically median implementation (and if 80% of code out there is crap, the median will be low)?
Now that's a scary thought that basically goes against "1 trillion dollars can't be wrong".
Now, LLMs are probably great range extenders, but they're not wonder weapons.
Also who is to say what is actually crap? Writing great code is completely dependent on context. An AI could exclusively be trained on the most beautiful and clean code in the world, yet if it chooses the wrong paradigm in the wrong context, it doesn't matter how beautiful that code is - it's still gonna be totally broken code.