I think the issue was with incomplete context. Even before the original METR stu...

SpicyLemonZest · 2026-02-25T07:41:44 1772005304

I tried the "don't look too closely" thing for the first time last week. I got immediately humiliated when a reviewer asked why my commit was trying to replace the correct, elegant usage of an API the class was named after with a 4-line franken-command using a different API with incorrect semantics. It's not like I'm not trying the new stuff. On a subjective level I think AI coding is really neat, but I just can't ever figure out how to map what I get to the stories I hear.

ben_w · 2026-02-25T10:35:43 1772015743

It depends what you're measuring.

Don't get me wrong, my experiments with true vibe-coding (i.e. don't even look at the code) match yours: the result is somewhat mediocre*.

In some cases (and I do try to push beyond the limits of what LLMs can do, in order to find those limits), they suck. I'd describe the output as like that of an overenthusiastic junior who reinvents the wheel badly rather than using standard approaches, even when told to.

In other cases, I know that mediocre code is actually good enough: well before LLMs happened, I saw mediocre code that still resulted in the app itself being given meaningful public accolades.

* Though, as per a previous comment of mine, I can't help but notice that the mediocrity is doing more and more of my previous career: https://news.ycombinator.com/item?id=46989102

pudsbuds · 2026-02-25T08:14:10 1772007250

You just have to give up and drink the koolaid...

But for real... My company started tracking commits per hour as a metric so I just commit as many times as I can. I don't get the luxury of even looking at my work now. They say it's faster but I've never seen so much tech debt delivered so quickly in my life.

It's going to be an interesting few years...

mewpmewp2 · 2026-02-25T08:34:47 1772008487

Definitely need to stop squashing commits if that is the case! But no, seriously, tracking git commit counts is absolutely ridiculous. Maybe you can have AI autonomously work on useless documentation that no one will read, with 1 commit per 100 lines of markdown?