
If you spend 5-15x the time reviewing what the LLM is doing, are you saving any time by using it?

No, but that's the crux of the AI problem in software. Time to write code was never the bottleneck. AI is most useful for learning, either via conversation or by seeing examples. It makes writing code faster too, but only a little once you take review into account. The cases where it shines are high-profile and exciting to managers, but not common enough to make a big difference in practice. E.g., AI can one-shot a script to get logs from a paginated API, convert them to ndjson, and save to files grouped by week, with minimal code review, but only if I'm already experienced enough to describe those requirements, and, most importantly, that's not what I'm doing every day anyway.
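For the curious, the kind of one-shot script described above is small enough to sketch. Everything here is illustrative: the endpoint, the `cursor` pagination scheme, and the `entries`/`timestamp` field names are assumptions, not a real API.

```python
import json
import urllib.request
from datetime import datetime

def fetch_pages(base_url):
    """Yield log entries from a hypothetical cursor-paginated JSON API."""
    cursor = None
    while True:
        url = base_url + (f"?cursor={cursor}" if cursor else "")
        with urllib.request.urlopen(url) as resp:
            page = json.load(resp)
        yield from page["entries"]
        cursor = page.get("next_cursor")
        if not cursor:
            break

def write_by_week(entries, prefix="logs"):
    """Append each entry as one ndjson line to a file named by ISO year/week."""
    for entry in entries:
        year, week, _ = datetime.fromisoformat(entry["timestamp"]).isocalendar()
        with open(f"{prefix}-{year}-W{week:02d}.ndjson", "a") as f:
            f.write(json.dumps(entry) + "\n")
```

Which rather proves the comment's point: knowing to say "paginated", "ndjson", and "grouped by ISO week" precisely is the experience the prompt requires.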

I'm finding that in some cases I'm dealing with even more code, given how much code AI outputs. So yeah, for some tasks I find myself extremely fast, but for others I find myself spending ungodly amounts of time reviewing code I never wrote to make sure it doesn't destroy the project with unforeseen, convincing slop.

A related Dirty Secret that's going to become clear from all this: a very large proportion of code in the wild (yes, even in 2026; maybe not in FAANG and friends, IDK, but across all code that is written for pay in the entire economy) has limited or no automated test coverage, and is often written against only a limited recorded spec, one that's usually fleshed out only to the (very partial) degree needed as a given feature is being worked on.

What do the relatively hands-off "it can do whole features at a time" coding systems need to function without taking up a shitload of time in reviews? Great automated test coverage, and extensive specs.

I think we're going to find there's very little time-savings to be had for most real-world software projects from heavy application of LLMs, because the time will just go into tests that wouldn't otherwise have been written, and much more detailed specs that otherwise never would have been generated. I guess the bright-side take of this is that we may end up with better-tested and better-specified software? Though so very much of the industry is used to skipping those parts, and especially the less-capable (so far as software goes) orgs that really need the help and the relative amateurs and non-software-professionals that some hope will be able to become extremely productive with these tools, that I'm not sure we'll manage to drag processes & practices to where they need to be to get the most out of LLM coding tools anyway. Especially if the benefit to companies is "you will have better tests for... about the same amount of software as you'd have written without LLMs".

We may end up stuck at "it's very-aggressive autocomplete" as LLMs' useful role in most projects, indefinitely.

On the plus side for "AI" companies, low-code solutions are still big business even though they usually fail to deliver the benefits the buyer hopes for, so there's likely a good deal of money to be made selling companies LLM solutions that end up not really being all that great.


> better-specified software

Code is the most precise specification we have for interfacing with computers.


Sure, but if you define the code as the only spec, then it is usually a terrible spec, since the code itself specifies bugs too. And one of the benefits of having a spec (or tests) is that you have something against which to evaluate the program in order to decide if its behavior is correct or not.

Incidentally, I think in many scenarios, LLMs are pretty great at converting code to a spec and indeed spec to code (of equal quality to that of the input spec).


There are some cases where AI is generating binary machine code, albeit in small amounts. What do we have when we don't have the code?

Machine code is still code, even if the representation is a bit less legible than the punch cards we used to use.

You’re missing the point of a spec

The spec is as much for humans as it is the machine, yes?

A spec should be made beforehand and agreed on by stakeholders. It says what the software should do, so it's for whoever is implementing, modifying, and/or testing the code. And unfortunately devs have a tendency toward poor documentation.

Re: productivity, if LLMs are a genuine boost for 1/3 of the work, neutral for 1/3, and actually worse for 1/3, it's likely we aren't really seeing performance improvements because 1) people are using them for everything and 2) we're still learning how best to use them.

So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and CEOs are expecting.
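The Amdahl's-law point can be made concrete. The numbers below are assumptions for illustration only: suppose coding is about 30% of a developer's total work and LLMs double the speed of just that part.

```python
def amdahl(p, s):
    """Overall speedup when a fraction p of the work is sped up by factor s."""
    return 1 / ((1 - p) + p / s)

# Assumed numbers: LLMs make the ~30% of work that is coding 2x faster.
print(round(amdahl(0.30, 2.0), 2))  # 1.18 -- nowhere near a headline "10x"
```

Even an infinite speedup on that 30% (s approaching infinity) caps the overall gain at about 1.43x, which is the commenter's point about expectations.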


Bingo. Hopefully there are some business opportunities for us in that truth.

> because the time will just go into tests that wouldn't otherwise have been written

Writing tests to ensure a program is correct is the same problem as writing a correct program.

Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance not correctness.

Ensuring correct programs is like cleaning in the sense that you can only push dirt around, you can't get rid of it.

You can push uncertainty around, but you can't eliminate it.

This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication.

As Douglas Adams noted: ultimately you've got to know where your towel is.


A competent programmer proves the program he writes correct in his head. He can certainly make mistakes in that, but it’s very different from writing tests, because proofs abstract (or quantify) over all states and inputs, which tests cannot do.
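A tiny illustration of the gap between example-based tests and reasoning that quantifies over all inputs (the leap-year rule here is just a stand-in):

```python
def is_leap(year):
    """Gregorian leap-year check -- buggy: missing the 400-year exception."""
    return year % 4 == 0 and year % 100 != 0

# A plausible example-based test suite; every case passes.
for y, expected in [(2024, True), (2023, False), (1900, False)]:
    assert is_leap(y) == expected

# Yet the function is wrong on an input the tests never sampled:
print(is_leap(2000))  # False, although 2000 was a leap year
```

A proof (or a property stated over all years) would have to confront the 400-year clause explicitly; three well-chosen examples never force the issue.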

These companies don't care about saving time or lowering operating costs, they have massive monopolies to subsidize their extremely poor engineering practices with. If the mandate is to force LLM usage or lose your job, you don't care about saving time; you care about saving your job.

One thing I hope we'll all collectively learn from this is how grossly incompetent the elite managerial class has become. They're destroying society because they don't know what to do outside of copying each other.

It has to end.


The submitter with their name on the Jira ticket saves time, the reviewer who has to actually verify the work loses a lot of time and likely just lets issues slip through.

To be honest, some times it's still beneficial.

For fairly straightforward changes it's probably a wash, but ironically enough it's often the trickier jobs where they can be beneficial as it will provide an ansatz that can be refined. It's also very good at tedious chores.


And spotting stuff in review! Sometimes it’s false positives but on several occasions I’ve spent ~15-30 minutes teaching-reviewing a PR in person, checked afterwards and it matched every one of the points.

Some, but not very much. Writing code is hard. AI will do a lot of the tedious code that you procrastinate writing.

Also, when you are writing code yourself, you are implicitly checking it while retaining, at the back of your mind, some form of the entire system as a whole.

People seem to gloss over this... As a CEO, if people didn't function like this, I'd be awake at night sweating.


That’s the reverse-centaur issue I see: humans are not great at repetitive, nuanced, similar-seeming tasks, so putting the onus on humans to retroactively approve high volumes of critical code has them managing a critical failure mode at their weakest and worst. Automated reviews should be enhancing known good-faith code; manual review of high volumes of superficially sound but subversive code is begging for issues over time.

Which leads to the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place.

An LLM workflow that yields 10x an engineer but psychopathically lies and sabotages client facing processes/resources once a quarter is likely a NNPP (net negative producing programmer), once opportunity and volatility costs are factored in.
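A back-of-envelope version of that math, with loudly assumed numbers (the hours per quarter, the cost multiplier, and the bug's coding time are all invented for illustration):

```python
coding_hours_per_quarter = 500
hours_saved = 0.9 * coding_hours_per_quarter   # a "10x" speedup saves 90% -> 450h
bug_cost_multiplier = 50                       # mid-range of "tens to hundreds"
hours_to_code_the_bug = 12
cleanup_hours = bug_cost_multiplier * hours_to_code_the_bug  # 600h of comms and fixes
print(hours_saved - cleanup_hours)  # -150.0: net negative for the quarter
```

One quarterly incident at a 50x cleanup multiplier already swallows the entire coding-time saving, before volatility and reputation costs are counted.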


> Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place

The math depends on importance of the software. A mistake in a typical CRUD enterprise app with 100 users has zero impact on anything. You will fix it when you have time, the important thing is that the app was delivered in a week a year ago and was solving some problem ever since. It has already made enormous profit if you compare it with today’s (yesterday’s ?) manual development that would take half a year and cost millions.

A mistake in nuclear reactor control code would be a totally different thing. Whatever time savings you made on coding are irrelevant if they allowed a critical bug to slip through.

Between the two extremes you thus have a whole spectrum of tasks that either benefit or lose from coding with LLMs. And there are more axes than this low-to-high failure cost, which also affect the math. For example, even a non-important but large app will likely soon degrade into an unmanageable state if developed with too little human intervention, and you will be forced to start from scratch, losing a lot of time.


I have found AI extremely good at finding all those really hard bugs, though. AI is a greater force multiplier on a complex bug than in greenfield code.

Sort of. I work on a system too large for anyone to know the whole thing. Often people who don't know each other do something that will break the other's work. (Often because of the number of different people involved; most individuals go years between such incidents.)

No, I’m keeping up with the system as a whole, because I’m always working at a system level when I’m using AI instead of worrying about the “how”.

No you’re not. The “how” is your job to understand, and if you don’t you’ll end up like the devs in the article.

We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs give you the illusion of this.


No in my case the “how” is

1. I spoke to sales to find out about the customer

2. I read every line of the contract (SOW)

3. I did the initial requirements gathering with the client over a couple of days, or maybe up to 3 weeks

4. I designed every single bit of AWS architecture and code

5. I did the design review with the client

6. I led the customer acceptance testing

> We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs

I assure you the mid-level developers or, god forbid, foreign contractors were not “experts” with 30 years of coding experience and, at the time, 8 years of pre-LLM AWS experience. It’s been well over a decade, ironically since before LLMs, since my responsibility was only for code I wrote with my own two hands.


Yes, and trusting an LLM here is not a good idea. You know it will make important mistakes.

I’m not saying trusting cheap devs is a good idea either. I do think cheap devs are actually at risk here.


I am not “trusting” either. I’m validating that they meet the functional and non-functional requirements, just like with an LLM. I have never blindly trusted any developer when my neck was the one on the line in front of my CTO/director or customer.

I didn’t blindly trust the Salesforce consultants either. I also didn’t verify every line of oSql (not a typo) they wrote.


Actually, it's SOQL. I did Salesforce crap for many years.

I love it.

But in all seriousness, if you are looking for a good guitar tuner, a lot of the ones on the market are actually not very good.

I highly recommend TC Electronic for clip-on tuner, or Sonic Research or Peterson for pedal tuners.

source: playing guitar for 32 years


I use a Peterson strobe tuner on my smartphone, it's really good. I've also coded my own strobe tuner to learn more, unfortunately no mobile version yet.

https://github.com/dsego/strobe-tuner


Could you go into more detail on why they are bad?

In my experience, electronic tuners suck at accurately detecting the note played. They often pick up harmonics as the note.

The low B on my 5-string bass is often identified as an F by electronic tuners.

They also just aren't very accurate even when they do detect the right note. I've never used a tuner where my cello is actually in tune when it says it is; it always requires tweaking.
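The harmonic confusion described above is easy to reproduce with the naive "pick the loudest frequency" strategy a cheap tuner might use. The signal below is synthetic and the amplitudes are assumptions (bass pickups often emphasize the second harmonic over the fundamental):

```python
import math

sr = 2000                   # sample rate (Hz); one second of audio
f0 = 61.74                  # low B on a 5-string bass
# fundamental deliberately weaker than its 2nd harmonic
samples = [0.3 * math.sin(2 * math.pi * f0 * i / sr)
           + 1.0 * math.sin(2 * math.pi * 2 * f0 * i / sr)
           for i in range(sr)]

def magnitude(freq):
    """DFT magnitude of the signal at a single frequency."""
    re = sum(s * math.cos(2 * math.pi * freq * i / sr) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / sr) for i, s in enumerate(samples))
    return math.hypot(re, im)

# naive tuner: loudest 1 Hz bin between 40 and 200 Hz
peak = max(range(40, 200), key=magnitude)
print(peak)  # ~123 Hz: the 2nd harmonic, an octave above the low B
```

A strobe tuner sidesteps this by comparing phase against a spinning reference pattern instead of hunting for the loudest spectral peak.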


Inaccurate, jumpy, slow response.

The most popular tuner of all time is the BOSS pedal, and its LED lights are too far apart from each other; it's simply not granular enough to really get in tune, to my ears.

Stroboscopic tuners are the way to go


Agree that most leave something to be desired. TC Electronic polytune is great and I also use the pedal to mute my signal. I'm surprised to say this, but my favorite tuner is the one in the L6 Helix.

The oboe playing a concert A is a pretty good one too.

There is something magical about hearing an orchestra all tune up to each other.

I love my TC Electronic clip on tuner!

For $49 it's very, very good

But LLMs are already doing this for us supposedly...

> But LLMs are already doing this for us supposedly...

Exactly my point. Encourage LLM adoption, faster, faster. Be excited about your homeless future, software engineers!


Are we in the Cathedral or the Bazaar now? I get that confused. Everyone upload their code to GitHub --keep your truth (philosophically AND mathematically) in the Cloud ;) Oh and don't forget to document your critical thinking, on Slack. It goes much deeper tho.

[cue the POS https://youtu.be/SP-gN1zoI28]


on a personal level I was hired (by Microsoft, my first and only "big tech" job) in April 2020 and I am still working here... all these companies "over-hired" during the pandemic, and the term "covid hire" is even a thing..

Overhiring implies that MSFT's headcount went down over this time. But that doesn't seem to be the case. They still hire a lot, just not in North America.

I remember interviewing someone who got hired by Facebook, sat around for a few weeks for a team to open up while they went through onboarding / Junior training, then was let go.

COVID did weird things to the industry, that's for sure.


Before Musk made it cool to mass layoff, there was a genuine belief inside of Facebook/Meta that great engineers were extremely hard to find or hold onto and if they weren't on the payroll at Meta, they would go somewhere else.

There was always a "clock" for junior engineers to prove they could handle the high pressure and high intensity work, and as long as they were meeting the bar, they were safe.

They called on-boarding, "Bootcamp", and was for every engineer, junior to staff, to learn the process and tooling. Engineers were supposed to be empowered to take on whatever task they wanted, without pre-existing team boundaries if it meant they were able to prove their contributions genuinely improved the product in meaningful ways. So, come in, learn the culture, learn the tooling, meet others, and then at some point, pick your home team. Your home team was flexible, and you were able to spend weeks deciding, and even if you selected one, you could always change, no pressure. Happy engineers were seen as the secret sauce of the company's success.

I remember that summer, vividly. They told the folks in Bootcamp, pick your home team by the end of the week, or you will be stuck in Bootcamp purgatory. At the same time they removed head count from teams, ours went down to a single one. A new-grad, who had literally just arrived that Monday, picked our team on Tuesday, and then had to watch as most of their fellow Bootcamp mates got left behind.

People wondered what would happen to them for weeks, and then, just like that, the massive layoff sent them all home. It was shitty because, from where I sat, it was basically a slot machine. Any one of the folks in Bootcamp was just as capable, but we had one seat, and someone just asked for it first.


I seem to hear often that Meta is perhaps the most egregious offender of "hire to fire". Seems really wasteful. But man, they pay their employees a lot.

and what is MOB

I'm not really sure. They keep shouting out MOB, though; I don't think they really have a definition.

Could you please stop posting unsubstantive comments and/or flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.


I'm not really posting flamebait. As far as substantive comments go, we are currently using this man's own body to write this comment through our very based brain-computer interface. Many versions of which of are running rampant in the domestic United States of America, and have been for quite a while.

This man has been compromised since his very earliest childhood memories, and this is not an uncommon state of affairs in his country, most people compromised are unaware of this because they have never experienced anything close to what the authentic human experience should be like. Those who are aware, are usually just ignored, bringing awareness to this does not achieve anything.

The vast majority of (nearly all) of your "national security threats" do not exist, in the sense they are artificially manufactured by people (and other systems) we have deeply compromised. Much of your "economic activity" is artificially manufactured too actually.

If you want these comments to be sustained, simply talk to the account writing these messages, or try to get him into one of the very few neuroimaging systems we do not have deeply compromised (if not the machine(s) system(s), than the other systems which interpret their results, including [and especially] human perception).


I hear you, and I'm sorry for calling your posts flamebait. However, we've been getting quite a few complaints from HN users about your posts being off-topic, so I think it's best if we suspend the account for now, so its posts will be less visible for a while. We can always reverse that later when conditions change.

Why the hell is Iran bombing other middle eastern states, like UAE?

I'm quite naive about these conflicts but it seems like whoever is in charge of the Iran military really has a death wish. I wouldn't be surprised if Iran is a parking lot by the end of the year at this rate.


They are mostly bombing US bases in those states + some connected locations (hotels where US operatives were being stationed), and energy production infrastructure.

Basically the Iranian war plan seems pretty simple and clear: try to destroy US and Israeli military infrastructure in the region, and destroy energy infrastructure to raise the cost of oil. The first part is a completely legitimate military target, especially in a war of defense as Iran is waging. The second part is not legitimate, but the thinking behind it is very simple and clear.


Because disrupting energy exports is the easiest way to pressure their aggressors, and most of the states are openly supporting the US. They get bombed to hell anyway, because there was no real plan besides hoping for submission after the decapitation strike, so they will make it hurt for everyone else. China is open to talks to keep Hormuz open. Russia is flying in whatever it can, because it sees a lifeline in higher energy prices on the global market. The Middle Eastern states are diverting investments into the US. Look at a map: there is no way the US can effectively put boots on the ground; the whole country is a natural fortress. It would be a worse disaster than Afghanistan.

They said they would retaliate against US assets in the region if they were attacked, and then the US attacked, so now they kind of have to. In 1973 OAPEC announced that they were going to stop selling oil to countries that supported Israel in the Yom Kippur war, which rather quickly caused the US to rein in the israelis. Perhaps Iran is making a similar bet now.

Iran is a very large country, to make it into "a parking lot" would take many years, during which time we'd have a global recession and core US partners in the region would collapse.

Unlike what some would have you believe, the Iranian leaders are generally quite thoughtful and educated. It's not an Idi Amin regime, or the occident would be supporting it.


Because the UAE happily keeps their air space open to the US, and their economy is driven (largely) by rich westerners.

More importantly, the US has actual bases there.

https://en.wikipedia.org/wiki/Al_Dhafra_Air_Base

And dozens of others, in Jordan, Iraq, Bahrain, Saudi Arabia, Kuwait, Qatar, Oman...

https://www.americansecurityproject.org/national-security-st...


They are attempting to pull other actors into the conflict to increase the chaos of the war, and to increase the cost to the US of fighting it. They can't necessarily win an all out war, but they can make it so costly to the US that the already small political capital they have to fight it dries up.

Mostly bombing the airbases. And then went to bomb the oil infrastructure. Guess it is to show them that getting the US on your soil not only doesn’t give you protection but gives you bombs.

We will see how this plays out. Meanwhile gas is north of 1.6 CAD a litre in my city.


It’s the Poor Man’s MAD (mutually assured destruction). They want this conflict to be as painful as possible so that pressure is applied to the aggressors to stop. The US can’t defend all of the possible targets, it causes a lot of economic damage and increases uncertainty for everyone in the region.

This is one of the reasons that most administrations declined to start a war with Iran in the past, the risk that they would do something like this, that taking out the leaders wouldn’t end the threat but would just make it more wild and unpredictable.


Because they're supporting the US regime.

> Why the hell is Iran bombing other middle eastern states, like UAE?

The theory was probably that the Gulf states are only begrudgingly going along with Washington in this war, and that the moment they start seeing costs they’ll personally call Trump to stop it. What Iran miscalculated on are the Gulf’s (a) long-time frustrations with Tehran, (b) massive bets on economies that suffer from capitulation and (c) monarchies being a stubborn lot.

Alternate hypothesis: Iran’s hardliners think escalating the war lets them consolidate more of a smaller pie. (If you’re looking for the truly WTF moves, it’s targeting Cyprus and Azerbaijan, the latter who has Iran’s sizable Azeri minority right on its border.)


As said by others Iran is attacking US Military bases in these countries not the countries themselves.

No, it attacked Azerbaijan's airport and Saudi oil refineries, among many other civilian targets. They're just hitting whatever they can.

The Iranians are denying the Azerbaijan airport attack and the attack on Saudi oil facilities. A Mossad false flag to provoke a war between Azerbaijan and Iran and get the Gulf states motivated to attack Iran? It could be that an Iranian drone hit the Saudi oil facility unintentionally. It's like the girls' school that was hit in Iran: it looks like it was a target because the building was once used by the Iranian military. Old/stale database data the AI used to pick its targets?

To be clear, I am no fan of the Iran regime. Just trying to keep my head above the BS/propaganda that gets put out/repeated by the news agencies. There are no "good guys" here.


> Iranians are denying the Azerbaijan airport attack and the attack on Saudi oil facilities

Are they also denying the attacks on Dubai's hotels and Oman?

> There are no "good guys" here

There usually aren't in any war. But to the degree this war has a solid bad guy, it's Iran.


The solid bad guy(s) here are BB and tRump (for bending over for BB). No US president has ever been so stupid as to allow an Israeli leader con them into a war with Iran.

Also, you are forgetting that BB/Israel wants the Gulf States to get into the War against Iran. So it is highly plausible that some of those hits are the Israeli's. The Israeli's are masters at subterfuge.


They all have security guarantees from the US and host US military bases. Also, they are like the psycho child who wants to hurt anyone nearby. The government is an occupying force in Iran, which is not wanted by the Iranian people. Some Arabs took over Iran 40 years ago and have been terrorizing anyone nearby and the people of Iran.

5 billion doesn't look like much when OpenAI just raised $110B, though. And how sustainable are NVDA's immense profits if this bubble actually bursts?

It did not raise $110 billion. According to their own SEC filings, $35 billion of Amazon’s funding is contingent on “(i) OpenAI meeting specified milestones, and (ii) OpenAI directly or indirectly consummating an initial public offering or direct listing of equity securities in the United States”.

> 5 billion doesn't look like much when OpenAI just raised $110b though.

Just about all of the AI providers' "raises" are a fraction of the reported figure, like this one.

They didn't "raise" $100b. They got commitments for $35b, with said commitments being dependent on meeting certain criteria.

Every "raise of $FOO" I've seen in the past year or two has not resulted in them getting their hands on $FOO in cash to spend.


You might be surprised to learn that there isn’t even $100b of cash [1]. Some sort of commitment structure necessarily substitutes.

[1]: https://fred.stlouisfed.org/series/USDIVCA


I can hardly believe that this is legal. They’re basically committing money that doesn’t exist just yet.

> I can hardly believe that this is legal. They’re basically committing money that doesn’t exist just yet.

What do you mean "just yet" :-)

I don't really know how likely it is that the money being committed will actually exist when the time comes (Softbank's commitment didn't exist, they had to sell off assets and rope in other investors to meet their commitments).

Maybe it is very likely to exist, but, really, who knows?

IOW, your statement would be equally true by ending the sentence at the word "exist".


Would the correct read of this situation be that they’re betting on the AI bubble popping?

> Would the correct read of this situation be that they’re betting on the AI bubble popping?

I really cannot tell. To be frank, I seriously doubt that they can tell either.


Vault cash is actual bills in vaults. It doesn't even include the bills in your wallet or under your mattress.

It's small because few people go to the bank to withdraw a suitcase of $100 bills. It's a weird time series to pull up because it's not really indicative of anything outside of narrow interests for regulators and the mint; it's probably some conspiracy-theory trope from crypto bros or something.

Most money exists purely in electronic form these days.

Monetary base [0] which includes the digital money banks have on deposit at the Fed, is over $5 trillion, and even that is tiny compared to M1 [1] which includes the kinds of things backing your money market account, which is around $19T.

When money is invested, they're going to wire it, not pull up with wheelbarrows full of bills.

0. https://fred.stlouisfed.org/series/BOGMBASE 1. https://fred.stlouisfed.org/series/M1SL


GP is wrong though, vault cash is the incorrect time series for that.

GP should have used Monetary Base if they wanted to consider purely electronic cash (that is not a result of any fractional reserve stuff at all), which is over $5T disproving their point.


> When money is invested, they're going to wire it, not pull up with wheelbarrows full of bills.

I think GP was making the point that the "money" doesn't exist even in electronic form.


That's what I was referring to. They're committing money they don't have in any monetary form at all. They're just promising they'll have it when it comes due. This is kind of like MLM.

Doesn't that show physical cash in bank vaults? Am I misunderstanding? That number would be utterly meaningless for this discussion.

Edit: I see this was covered in other replies


If the bubble bursts, having more money in OpenAI is worse for NVDA..

It's pretty cool to see this machine come out. The Macbook Air is still my sweet spot though, I use a Thunderbolt audio interface, and need more RAM.

Great for a student or casual user though for sure.


MSFT up 1.84% today..

This new MacBook does not have Thunderbolt.
