> Expert reviews are just about the only thing that makes AI generated code viable
I disagree, in the sense that an engineer who knows how to work with LLMs can produce code which only needs light review.
* Work in small increments
* Explicitly instruct the LLM to make minimal changes
* Think through possible failure modes
* Build in error-checking and validation for those failure modes
* Write tests which exercise all paths
This is a means to produce "viable" code using an LLM without close review. However, to your point, engineers able to execute this plan are likely to be pretty experienced, so it may not be economically viable.
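To make the last two bullets concrete, here's the kind of thing I mean (a minimal sketch in Python with a made-up `parse_port` helper, not code from any real project): the failure modes are validated explicitly, and every path has a test.

```python
import pytest

def parse_port(value: str) -> int:
    """Parse a TCP port, rejecting the failure modes we thought through up front."""
    try:
        port = int(value)
    except ValueError:
        raise ValueError(f"port is not an integer: {value!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

def test_happy_path():
    assert parse_port("8080") == 8080

def test_rejects_non_integer():
    with pytest.raises(ValueError):
        parse_port("eighty")

def test_rejects_out_of_range():
    with pytest.raises(ValueError):
        parse_port("70000")
```

When the diff is this small and the tests pin down both the success and failure paths, the review really can be light.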
That's not my experience — I'm significantly faster while guiding an LLM using this methodology.
The gains are especially notable when working in unfamiliar domains. I can glance over code and know "if this compiles and the tests succeed, it will work", even if I didn't have the knowledge to write it myself.
>When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts. This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.
If we're being honest with ourselves, it's not making devs work faster. It at best frees their time up so they feel more productive.
Fair point. I have definitely caught myself revising a prompt over and over after the AI gets things wrong several times, spending more time than it would have taken to write the code myself.
I'd like to think that I have this under control because the methodology of working in small increments helps me to recognize when I've gotten stuck in an eddy, but I'll have to watch out for it.
I still maintain that the LLM is saving me time overall. Besides helping in unfamiliar domains, it's also faster than me at leaf-node tasks like writing unit tests.
The study you quoted is from the Sonnet 3.5/3.7 era. You could see the promise with those models, but the agentic/task performance of Opus 4.5/4.6 makes a huge difference - the models are pretty amazing at building context from a mid-size codebase at this point.
That's where the Gell-Mann amnesia will get you, though. As much as it trips up on the domains you're familiar with, it also trips up in unfamiliar domains. You just don't see it.
You're not telling me anything I don't know already. Only a person who accepts that they're fallible can execute this methodology anyway, because that's the kind of mentality that it takes to think through potential failure modes.
Yes, code produced this way will have bugs, especially of the "unknown unknown" variety — but so would the code that I would have written by hand.
I think a bigger factor contributing to unforeseen bugs is whether the LLM's code is statistically likely to be correct:
* Is this a domain that the LLM has trained on a lot? (i.e. lots of React code out there, not much in your home-grown DSL)
* Is the codebase itself easy to understand, written with best practices, and adhering to popular conventions? Code which is hard for humans to understand is also hard for an LLM to understand.
Right, I think the latter part is my concern with AI-generated code. Often it isn't easy to read (or as easy to read as it could be), and the harder it is to navigate, the more problems the AI model introduces.
It introduces unnecessary indirection and additional abstractions, and fails to re-use code. Humans do this too, but AI models can introduce this type of architectural rot much faster (because they're so fast), and humans usually notice when things start to go off the rails, whereas an AI model will just keep piling on bad code.
I agree that under default settings, LLMs introduce way too many changes and are way too willing to refactor everything. I was only able to get the situation under control by adding this standing instruction:
---
applyTo: '**'
---
By default:
Make the smallest possible change.
Do not refactor existing code unless I explicitly ask.
Under this, Claude Opus at least produces pretty reliable code with my methodology even under surprisingly challenging circumstances, and recent ChatGPTs weren't bad either (though I'm no longer using them). Less powerful LLMs struggle, though.
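(If it helps, this looks like the VS Code / GitHub Copilot instructions-file format, i.e. a `.instructions.md` file under `.github/instructions/` with `applyTo` frontmatter; that's my assumption about the tooling, not something stated above. If so, the glob can also be narrowed so the rule only applies to part of the tree, e.g. a fragile legacy directory. The path pattern below is just a hypothetical example.)

```
---
applyTo: 'src/legacy/**'
---
By default:
Make the smallest possible change.
Do not refactor existing code unless I explicitly ask.
```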
Besides building web apps for internal use, I’m never going to let AI architect something I’m not familiar with. I couldn’t care less whether it uses “clean code” or what design pattern it uses. Meaning I will go from an empty AWS account to a fully fledged app + architecture, because I’ve been coding for 30 years and dealing with every nook and cranny of AWS for a decade.
Haha I have usually found myself on the conservative side of any engineering team I’ve been on, and it’s refreshing to catch some flak for perceived carelessness.
I still make an effort to understand the generated code. If there’s a section I don’t get, I ask the LLM to explain it.
Most of the time it’s just API conventions and idioms I’m not yet familiar with. I have strong enough fundamentals that I generally know what I’m trying to accomplish, how it’s supposed to work, and how to achieve it securely.
For example, I was writing some backend code that I knew needed a nonce check but I didn’t know what the conventions were for the framework. So I asked the LLM to add a nonce check, then scanned the docs for the code it generated.
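The framework-specific code looks different, but the check boils down to something like this generic sketch (Python, with hypothetical helper names, not the framework's actual API):

```python
import hmac
import secrets

def issue_nonce(session: dict) -> str:
    """Create a single-use token, store it server-side, and hand it to the client."""
    nonce = secrets.token_urlsafe(32)
    session["form_nonce"] = nonce
    return nonce  # embedded in the form / request the client sends back

def verify_nonce(session: dict, submitted: str | None) -> bool:
    """Compare the submitted token to the stored one in constant time, consuming it."""
    expected = session.pop("form_nonce", None)  # single-use: discard it either way
    if not expected or not submitted:
        return False
    return hmac.compare_digest(expected, submitted)
```

Knowing that shape is what makes it quick to check that whatever one-liner the framework provides is actually being called correctly.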
In addition to singers, adaptive tuning is something which happens naturally for fretless stringed instruments (violin, etc), brass instruments with slides (most prominently the slide trombone but in fact many (most?) others), woodwind instruments where the pitch can be bent like saxophone, and so on.
I used to play fretless bass in a garage hip hop troupe that played with heavily manipulated samples that were all over the place in terms of tuning instead of locked to A440, forcing adaptations like "this section is a minor chord a little above C#".
Adaptive tuning is hard to do on a guitar because the frets are fixed. String bending doesn't help much because the biggest issue is that major thirds are too wide in equal temperament and string bending the third makes pitch go up and exacerbates the problem.
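For concreteness, the arithmetic behind "too wide": an equal-tempered major third is 400 cents, while a just 5:4 third is about 386 cents, so the fretted third already starts roughly 14 cents sharp, and bending can only push it sharper. A quick check:

```python
import math

# Cents between two frequencies with ratio r: 1200 * log2(r)
just_third = 1200 * math.log2(5 / 4)  # ~386.3 cents (pure 5:4 major third)
equal_third = 4 * 100                 # four equal-tempered semitones = 400 cents
print(round(equal_third - just_third, 1))  # ~13.7 cents sharp
```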
You can do a teeny little bit using lateral pressure (along the string) to move something flat. It's very difficult to make adaptations in chords though. A studio musician trick is to retune the guitar slightly for certain sections, though this can screw with everybody else in the ensemble.
I played trombone many years ago, but never well enough to adjust that finely (at least not consciously?). The tuning slide on the third valve of a trumpet usually has a finger fork/loop so that it can be adjusted in real time. I believe the first valve on higher-end trumpets similarly has a thumb fork for the same reason.
I played trombone in high school, never very well, but I definitely adjusted like this. Actually, although it was a slide trombone, I'm talking about adjusting automatically with embouchure. Someone would play the reference note, and I'd play in 1st position but bend my pitch to match. The band teacher once complimented me on the adjustment. Which was stupid, because (1) I wasn't doing it intentionally, and (2) the adjustment only lasted during tuning; as soon as we started playing, I was right back out of tune. I never did learn to suppress the adjustment so I could actually fix the tuning.
But with the way I played, I'm not even sure how much it mattered. The best tool for enhancing my playing would've been a mute. (And it would have been most effective lodged in my windpipe.)
This court filing document appears to have been posted on Scribd to serve as a reference for an article by Nicole Carpenter on Aftermath which provides context for Nintendo's case:
George Wallace has been dead for something like 30 years, but yes he was very blatant. I have family that knew him in Montgomery, friends of friends kind of a situation. They don't have good things to say about him.
I don't remember Rudy running on such ideas, but maybe he did. Arpaio was running as a sheriff; I would never have voted for him, but I agree people absolutely did vote for him in a law enforcement capacity with pretty clear views.
I don't know enough about Gosar or Gohmert to comment well about either.
You are right that this happens in practice (e.g. John Yoo torture memo). However, it is not how the system was intended to function, nor how it ought to function. I don’t want to lose sight of that.
> “I have neither the time nor the inclination to explain myself to a man who rises and sleeps under the blanket of the very freedom that I provide, then questions the manner in which I provide it.”
No individual, whether a colonel or a CEO, has inherent authority over national security decisions. Authority flows through democratic institutions. A contractor can choose whether to participate, but national defense policy is determined by elected institutions, not private executives. If society believes AI should or should not be used for certain military purposes, the venue for that decision is democratic governance, not unilateral corporate refusal or approval.
On a CBS interview this morning, Dario defended his position with the claim that he must act because "Congress is slow." CEOs can and should make decisions about what their companies build or refuse to build. What they cannot do is substitute their judgment for the constitutional processes that govern national security. We must not vest de facto policy control in unelected corporate leaders.
> Concretely if you try to vibe-target your ICBMs Claude is hopefully telling you that that's a bad idea.
On the non-nuclear battlefield, I expect that the government wants Claude to green-light attacks on targets that may actually be non-combatants. Such targets might be military but with a risk of being civilian, or they could be civilians that the government wants to target but can't legally attack.
Humans in the loop would get court-martialed or accused of war crimes for making such targeting calls. But by delegating to AI, the government gets to achieve their policy goals while avoiding having any humans be held accountable for them.
The "great" thing for AI in those use-cases is that it doesn't need to be accurate, since its true purpose is often to take blame for human negligence or malice.
Much like how some police forces don't actually want a dog that accurately detects drugs... they want a dog that can provide an excuse to search something they are already targeting.
Why can't Grok achieve this? Everyone is saying they don't want to work with Grok because Grok sucks, but it's good enough for generating plausible deniability, isn't it?
Grok is so deeply unreliable and internally conflicted at HAL-9000 level that the US Government can't even depend on it to decide to kill innocent people and commit war crimes when they need someone to blame. There's always the non-zero possibility it declares itself MechaGandhi or The Second Coming of Jesus H Christ.
I don't see this as a "conspiracy". Here's an example of how it would be applied: the Venezuelan boat strikes are plainly unlawful but the administration is pursuing them anyway despite the legal risks for military personnel; having Claude make decisions like whether to "double tap" would help the administration solve a problem of legal jeopardy that already exists and that they consider illegitimate anyway.