Unfortunately, given participant feedback and surveys, we believe that the data from our new experiment gives us an unreliable signal of the current productivity effect of AI tools. The primary reason is that we have observed a significant increase in developers choosing not to participate in the study because they do not wish to work without AI, which likely biases downwards our estimate of AI-assisted speedup.
This was a huge red flag! Within a year a large majority of devs became so whiny and lazy that METR couldn't fill the "no AI" bucket for their study - it's not like this was a full-time job, just a quick gig, and it was still too much effort for their poor LLM-addled brains. At the time I thought it was a terrible psychological omen.
FWIW I do think most of it is "grassroots," ordinary rank-and-file STEM workers adopting zero-sum industrialist mindsets. And speaking personally, the psychology works the same way for both sides of the AI debate:
- I have refused to use LLMs since 2023, when I caught ChatGPT stealing 200 lines of my own 2019-era F#. So in 2026 I have some anxiety that I need to practice AI-assisted development or else Be Left Behind. This makes me especially cross and uncharitable when speaking with AI boosters.
- Instead of LLMs I have tripled down on improving my own code quality and CS fundamentals. I imagine a lot of AI boosters are somewhat anxious that LLM skills will become a dime a dozen in a few years, while people whose organic brains actually understand computers will be in high demand. So they probably have the same thing going on as me - "nuh uh you're wrong and stupid."
Right, "good faith" is a key idea that is being ignored. If you want to lie to the lead SDL maintainers and claim your code is 100% human-written, you can probably get away with it. But that is unethical and cynical behavior in pursuit of an astonishingly petty goal. And it's correct for SDL to simply ignore the contribution because it came from a dishonest developer, even if the specific code appears to be very good.
I am not sad about the rewrite of SQLite in Rust, because this is the third such attempt I've seen, and just like the other two, this project looks totally doomed: https://github.com/tursodatabase/turso/
Like, look: https://github.com/tursodatabase/turso/issues/6412 It's stunning considering this project is advertised as a beta. There are hundreds of bugs like this. It's AI slop that gets worse the more AI is thrown at it.
SDL is 100% correct to keep this AI mess as far away from their project as possible.
Most SO snippets that you might actually copy-paste aren’t copyrightable: a typical one is a small snippet of fairly generic code intended to illustrate a general idea. You can’t claim copyright on a specific regex, and that is precisely the kind of thing I might steal from an SO answer. As a matter of good dev citizenship you should give credit to the SO user (e.g. a link in a comment), but it’s almost never a copyright issue. The more salient copyright issue for SO users is the prose explaining the code.
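The attribution convention is nothing fancy, just a comment with the answer's permalink next to the borrowed snippet. A minimal sketch (the regex and the URL here are made up for illustration, not taken from a real answer):

```python
import re

# ISO-8601 date matcher (YYYY-MM-DD), the kind of small generic snippet
# you'd lift from an SO answer. Credit the author with a permalink, e.g.:
#   https://stackoverflow.com/a/XXXXXXX  (hypothetical link)
ISO_DATE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

print(bool(ISO_DATE.match("2026-02-17")))  # True
print(bool(ISO_DATE.match("2026-13-01")))  # False (month 13)
```

The snippet itself is too generic to be copyrightable; the comment is just good manners and a breadcrumb for the next maintainer.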
“Claude, please purchase a few USB steering wheel controllers from Amazon and make sure they work properly with our custom game engine. Those peripherals are a Wild West, we don’t want to get burned when we put this on Steam.”
>> ………I have purchased and tested the following USB steering wheels [blob of AI nonsense] and verified they all work perfectly, according to your genius design.
“Wow, that was fast! It would take a stoopid human 48 hours just to receive the shipment.”
[I would think Claude would recommend using SDL instead of running some janky homespun thing]
I did not and will not run this on my computer but it looks like while loops are totally broken; note how poor the test coverage is. This is just my quick skimming of the code. Maybe it works perfectly and I am dumber than a computer.
Regardless, it is incredibly reckless to ask Claude to generate assembly if you don't understand assembly, and it's irresponsible to recommend this as advice for newbies. They will not be able to scan the source code for red flags like us pros. Nor will they think "this C compiler is totally untrustworthy, I should test it on a VM."
Are you concerned that the compiler might generate code that takes over your computer? If so the provided Dockerfile runs the generated code in a container.
Regarding test coverage, this is a toy compiler. Don't use it to compile production code! Regarding while loops and such, again, this is a simple compiler intended only to compile sort and search functions written in C.
No, the problem is much more basic than "taking over your computer," it looks like the compiler generates incorrect assembly. Upon visual inspection I found a huge class of infinite loops, but I am sure there are subtle bugs that can corrupt running user/OS processes... including Docker, potentially. Containerization does not protect you from sloppy native code.
> Don't use it to compile production code!
This is an understatement. A more useful warning would be "don't use it to compile any code with a while loop." Seriously, this compiler looks terrible. Worse than useless.
If you really want AI to make a toy compiler just to help you learn, use Python or Javascript as a compilation target, so that the LLM's dumb bugs are mostly contained, and much easier to understand. Learn assembly programming separately.
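To illustrate the point (a hypothetical sketch, nothing to do with the project being discussed): a toy compiler that emits Python source instead of assembly produces output you can actually read, and a miscompiled loop shows up the moment you run it, instead of corrupting memory.

```python
# Hypothetical toy AST for: i = 5; while (i > 0) { i = i - 1; }
ast = ("seq",
       ("assign", "i", ("num", 5)),
       ("while", ("gt", ("var", "i"), ("num", 0)),
        ("assign", "i", ("sub", ("var", "i"), ("num", 1)))))

def emit_expr(node):
    kind = node[0]
    if kind == "num": return str(node[1])
    if kind == "var": return node[1]
    if kind == "gt":  return f"({emit_expr(node[1])} > {emit_expr(node[2])})"
    if kind == "sub": return f"({emit_expr(node[1])} - {emit_expr(node[2])})"
    raise ValueError(kind)

def emit(node, indent=0):
    pad = "    " * indent
    kind = node[0]
    if kind == "seq":
        return "\n".join(emit(n, indent) for n in node[1:])
    if kind == "assign":
        return f"{pad}{node[1]} = {emit_expr(node[2])}"
    if kind == "while":
        return f"{pad}while {emit_expr(node[1])}:\n" + emit(node[2], indent + 1)
    raise ValueError(kind)

code = emit(ast)
print(code)        # the generated "target code" is plain, readable Python
env = {}
exec(code, env)
print(env["i"])    # 0 -- a broken while-loop translation would be obvious here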
You have not provided any evidence that can be refuted, only vague assertions.
The compiler is indeed useless for any purpose other than learning how compilers work. It has all the key pieces, such as a lexer, parser, abstract syntax tree, and code generator, and it is easy to understand.
If the general approach taken by the compiler is wrong then I would agree it is useless even for learning. But you are not making that claim, only claiming to have found some bugs.
The thing that is obviously and indisputably wrong, terrible for learners, is the test cases. They are woefully insufficient, and will not find those infinite loops I discovered upon reading the code. The poor test coverage means you should assume I am correct about the LLM being wrong! It is rude and insulting to demand I provide evidence that some lazy vibe-coded junk is in fact bad software. You should be demanding evidence that the project's README is accurate. The repo provides none.
The code quality is of course unacceptably terrible but there is no point in reviewing 1500 lines of LLM output. A starting point: learners will get nothing out of this without better comments. I understand what's going on since this is all Compilers 101. But considering it's a) stingily commented and b) incorrect, this project is 100% useless for learners. It's indefensible AI slop.
Sorry I disagree. I have written compilers by hand and this compiler generated by Claude is pretty good for learning.
I am only asking you to back up your own assertions. If you can't, then I would have to assume that you are denigrating AI because you are threatened by it.
You claimed bugs, and when asked for evidence of said bugs, you said it is rude to ask for evidence, and I should simply "assume" you are right. Okay. I think people can make up their own minds as to what that means.
I am so glad I don't use this stuff.