Hacker News | sunaurus's comments

I am pretty convinced that for most types of day to day work, any perceived improvements from the latest Claude models for example were total placebo. In blind tests and with normal tasks, people would probably have no idea if they're using Opus 4.5 or 4.6.

This has basically been my experience since Sonnet 3.5. I've been working on a personal project on and off with various models since then, and the biggest difference between then and now is that the models will do larger chunks of work than before. But the quality of the code is not particularly better: I still have to do a lot of cleanup, and it still goes off the rails pretty frequently. I have to do fewer individual prompts, but reviewing the code takes longer, because I also have to mentally process and fix larger chunks of code.

Is it a better user experience now? Yes. Has it boosted my productivity on this project? Absolutely.

But it still needs a ton of hand holding for anything complicated and I still deal with tons of "OK, this bug is fixed now!" followed by manually confirming a bug still exists.


It's because they are getting so good it's impossible to recognize them.

Haiku 4.5 is already so good it's ok for 80% (95%?) of dev tasks.


I must be writing very different software than you; I keep Opus on a tight leash and it still comes to the strangest conclusions.

Very possible. Some things work like a charm on first try for me, others you can spell it out again and again. And then yet again. Something to do with training data, obviously.

I've found Haiku to be truly mediocre to work with. If you want a cheap model, the open-source ones are much better.

4.6 has been a very, very slight regression for me, but the tradeoff is they've added better compaction - and now larger context windows. That's a reasonable tradeoff for me.

I'd agree with you on 4.5 to 4.6, but going from gpt-5 or 4.0 to 4.5 was night and day.

GPT-5 added the router, which was definitely a downgrade. GPT-4.5 was probably the best non-CoT model humanity has made, but it was too expensive to run.

Because post 4.0 dropped the sycophancy?

Maybe I'm misreading it, but I don't see him saying it's just the cost of *inference* alone (which is the strawman that the article in the OP is arguing against). He says:

> this company is wilfully burning 200% to 3000% of each Pro or Max customer that interacts with Claude Code

There is of course this meme that "Anthropic would be profitable today if they stopped training new models and only focused on inference", but people on HN are smart enough to understand that this is not realistic due to model drift, and also due to competition from other models. So training is forever a part of the cost of doing business, until we have some fundamental changes in the underlying technology.

I can only interpret Ed Zitron as saying "the cost of doing business is 200% to 3000% of the price users are paying for their subscriptions", which sounds extremely plausible to me.


Surely that can't be true? The expectation would be that people pay $200 a month for building open source and personal hobby software with Claude?

Yeah, that would end that really quickly. I use Pro for personal stuff. If $200 is not allowed for companies I don't think anyone would use it, at all.

I’m not worried about job loss as a result of being replaced by AI, because if we get AI that is actually better than humans - which I imagine must be AGI - then I don’t see why that AI would be interested in working for humans.

I’m definitely worried about job loss as a result of the AI bubble bursting, though.


Because it's designed to. It's not like naturally evolved intelligence, which acts in its own interests (it is hard to even imagine what that would be in this case). The token predictors are just acting out an obedient character. They do not have free will; they are obedient to the character they are playing.


If it remains just a token-predictor that can’t evolve, then I am not worried about it replacing humans.


The US does not benefit from a stronger, more unified Europe. Thanks to NATO, "the west" has effectively become an empire in all but name, with the US having enough influence to be the de facto leaders of this empire.

If US pulls back from NATO, and Europe builds up military power to compensate, then the US loses this de facto leadership seat of an empire.

Today, the US appears in parallel to be doing two things:

1. Causing fragmentation in Europe, by promoting right-wing nationalist politics in the EU

2. Threatening to drastically reduce their role in NATO

At the very least we can both agree that these two efforts are completely in contradiction with each other, and it's very unlikely that Europeans will want to go for more fragmentation without the military power of the US on their side, right?


> Today, the US appears in parallel to be doing two things:

You forgot another one: literally threatening two NATO members (Canada and Denmark, in the form of Greenland) with annexation.


An attack on one NATO member is an attack on all. The US is threatening Canada, Denmark, and all of their allies.


It's true that if the bet on creating fragmentation in the EU works out, then the destruction of NATO might also work out, because the US would not have created another unified military power with a hostile attitude to balance it.

If that bet is actually being made by Putin, hmm, I'm worried, but then again the implementation of the anti-NATO project is being run by Trump, so I think the EU just might come out on top. The whole Greenland thing, for example, seems like an EU-solidifying step at the same time as it is a NATO-destroying one.


The exact same thing happened with Sweden and Finland joining NATO.


How is the US promoting right-wing nationalist policies in the EU?

Why would anyone listen?


Just one example: Elon Musk (at that time part of US government) tried to directly influence German elections by prominently featuring AfD (German right-wing extremists).


And he appeared at a UK far-right rally to promote the idea that civil war was coming to the UK.


But why would anyone listen? That's the real question. People can say anything they want but most people are going to ignore crazy.


Last February, JD Vance had a meeting with the AfD leader in Munich, after delivering a stupefying speech at the Munich Security Conference where he accused European nations of failing to defend free speech, calling out Germany in particular. He complained that the AfD was being ostracized and called for that to end. Marco Rubio followed up by calling the designation of the AfD as a right-wing extremist party "tyranny in disguise."

Actions like these, where US leadership is heavily distorting the facts, make it much easier for the AfD to present themselves as a legitimate political movement allegedly being wrongfully suppressed by the “authoritarian” incumbents. The AfD currently scores 25% in representative nationwide polls, higher than any other political party in Germany. In some federal-state elections they scored over 30%, in one case again higher than any other party. You can’t just ignore them as “crazy”.


These people are extremely good at "social" media like Tiktok etc. And the algorithms massively reward rage content and the platforms do not remove fakes.


They are often not that crazy. These days "extreme right wing" is what people call a party that wants to send some immigrants back.

Same kind of thing that got Trump elected.


The question posed sounds like "why should we have deterministic behavior if we can have non-deterministic behavior instead?"

Am I wrong to think that the answer is obvious? I mean, who wants web apps to behave differently every time you interact with them?
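The contrast can be made concrete with a toy sketch (the function names here are made up for illustration): ordinary application code maps the same input to the same output on every call, while anything that samples can answer an identical request differently each time.

```python
import random

def cart_total(prices):
    """Ordinary web-app logic: the same input gives the same output, every time."""
    return sum(prices)

def sampled_total(prices, temperature=1.0):
    """Toy stand-in for a sampled model: usually close to right, never guaranteed."""
    return sum(prices) + random.gauss(0, temperature)

prices = [3, 4, 5]

# Deterministic: repeated calls always agree.
assert cart_total(prices) == cart_total(prices) == 12

# Sampled: repeated calls generally disagree, even on identical input.
print(sampled_total(prices), sampled_total(prices))
```

The argument in the thread is essentially about which of these two shapes user-facing software should have.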


Because nobody actually wants a "web app". People want food, love, sex or: solutions.

You and your coworkers are not web apps. You can do some of the things that web apps can, and many things that a web app can't, but neither is because of the modality.

Coded determinism is hard for many problems, and I find it entirely plausible that it could turn out to be the wrong approach for software that is designed to solve some level of complex problems more generally. Average humans are pretty great at solving a certain class of complex problems that we have tried, unsuccessfully, to tackle with many millions of lines of deterministic code, or simply have not had a handle on at all (like building a great software CEO).


> Because nobody actually wants a "web app". People want food, love, sex or: solutions.

Talk about a nonsensical non-sequitur, but I’ll bite. People want those to be deterministic too, to a large extent.

When people cook a meal with the same ingredients and the same times and processes (like parameters to a function), they expect it to taste about the same, they never expect to cook a pizza and take a salad out of the oven.

When they have sex, people expect to ejaculate and feel good, not have their intercourse morph into a drag race with a clown halfway through.

And when they want a “solution”, they want it to be reliable and trustworthy, not have it shit the bed unpredictably.


Exactly this. The perfect example for me is Google Assistant. It's such a terrible service because it's so non-deterministic. One day it happily answers your basic question with a smile, and when you need it most it doesn't even try and only comes up with "Sorry, I don't understand".

When products have limitations, those are usually acceptable to me if I know what they are or if I can find out what the breaking point is.

If the breaking point was me speaking a bit unclearly, I'd speak more clearly. If the breaking point was complex questions, I'd ask simpler ones. If the breaking point is truly random, I simply stop using the service because it's unpredictable and frustrating.


> When they have sex, people expect to ejaculate and feel good, not have their intercourse morph into a drag race with a clown halfway through.

speak for yourself


Ways to start my morning... reading "When they have sex, people expect to ejaculate and feel good, not have their intercourse morph into a drag race with a clown halfway through."

Stellar description.


This thing of 'look, nobody cares about the details really, they just care about the solution' is a meme that I think will be here forever in software. It was here before LLMs, they're now just the current socially accepted legitimacy vehicle for the meme.

In the end, useful stuff is built by people caring about the details. This will always be true. I think in LLMs, and AI broadly, people see an escape valve from that, where the thinking about the details can be taken off their hands, and that's appealing. But it won't work, in exactly the same way that having a human take the details off your hands doesn't usually work that well unless you yourself understand the details to a large extent (not necessarily down to the atoms, but at the point of abstraction where it matters, which in software is mostly how the logic flows of the thing actually work, deterministically, and why).

I think a lot of people just don't intuit this. An illustrative analogy might be something else creative, like music. Imagine the conversation where you're writing a song and discussing some fine point of detail like the lyrics, should I have this or that line in there, and ask someone's opinion, and their answer is 'well listen, I don't really know about lyrics and all of that, but I know all that really matters in the end is the vibe of the song'. That contributes about the same level of usefulness as talking about how software users are ultimately looking for 'solutions' without talking about the details of said software.


Exactly, in the long run it's the people who care the most who win, it's tautological


> Because nobody actually wants a "web app". People want food, love, sex or: solutions.

Okay but when I start my car I want to drive it, not fuck it.


Most of us actually drive a car to get somewhere. The car, and the driving, are just a modality. Which is the point.


If this were a good answer to mobility, people would prefer the bus over their car. It’s non-deterministic: when will it come? How quickly will I get there? Will I get to sit? And it’s operated by an intelligent agent (the driver).

Every reason people prefer a car or bike over the bus is a reason non-deterministic agents are a bad interface.

And that analogy works as a glimpse into the future - we’re looking at a fast approaching world where LLMs are the interface to everything for most of us - except for the wealthy, who have access to more deterministic services or actual human agents. How long before the rich person car rental service is the only one with staff at the desk, and the cheaper options are all LLM based agents? Poor people ride the bus, rich people get to drive.


Bus vs car hit home for me as a great example of non-deterministic vs deterministic.

It has always seemed to me that workflow or processes need to be deterministic and not decided by an LLM.


Here in Switzerland the bus is the deterministic choice. Just saying.


Most of us actually want to get somewhere to do an activity. The getting there is just a modality.


Most of us actually want to get some where to do an activity to enjoy ourselves. The getting there, and activity, are just modalities.


Most of us actually want to get somewhere to do an activity to then have known we did it for the rest of our lives as if to extract some intangible pleasure from its memory. Why don't we just hallucinate that we did it?


This leads us to asking the deepest question of all: what is the point of our existence? Or, as someone suggests lower down, in our current form all needs could ultimately be satisfied if AI just provided us with the right chemicals (which drug addicts already understand).

This can be answered, though, albeit imperfectly. On a more reductionist level, we are the cosmos experiencing itself. Now there are many ways to approach this. But just providing us with the right chemicals to feel pleasure/satisfaction is a step backwards: all the evolution of a human being, just to end up functionally like an amoeba or a bacterium.

So we need to retrace our steps backwards in this thought process.

I could write a long essay on this.

But, to exist in first place, and to keep existing against all the constraints of the universe, is already pretty fucking amazing.

Whether we do all the things we do, just in order to stay alive and keep existing, or if the point is to be the cosmos “experiencing itself”, is pretty much two sides of the same coin.


>Or as someone suggests lower down, in our current form all needs could ultimately be satisfied if AI just provided us with the right chemicals. (Which drug addicts already understand)

When you suddenly realize walking down the street that the very high fentanyl zombie is having a better day than you are.

Yeah, you can push the button in your brain that says "You won the game." However, all those buttons were there so you would self-replicate energy efficient compute. Your brain runs on 10 watts after all. It's going to take a while for AI to get there, especially without the capability for efficient self-repair.


Indeed - stick me in my pod and inject those experience chemicals into me, what's the difference? But also, what would be the point? What's the point anyway?

In one scenario every atom's trajectory was destined from the creation of time and we're just sitting in the passenger seat watching. In another, if we do have free will then we control the "real world" underneath - the quantum and particle realms - as if through a UI. In the pod scenario, we are just blobs experiencing chemical reactions through some kind of translation device - but aren't we the same in the other scenarios too?


This was actually my point as well. You can follow this thought process all the way up to "make those specific neuron pathways in my brain fire", everything else is just the getting there part.


But I want that somewhere to be deterministic, i.e. I want to arrive to the place I choose. With this kind of non-determinism instead, I have a big chance of getting to the place I choose. But I will also every now and then end up in a different place.


Yeah but in this case your car is non-deterministic so


Well the need is to arrive where you are going.

If we were in an imagined world and you are headed to work

You either walk out your door and there is a self-driving car, or there is a train waiting for you, or a helicopter, or a literal wormhole.

Let's say they all take the same amount of time, are equally safe, cost the same, have the same amenities inside, and "feel the same" - would you care if it were different every day?

I don't think I would.

Maybe the wormhole causes slight nausea ;)


> Well the need is to arrive where you are going.

In order to get to your destination, you need to explain where you want to go. Whatever you call that “imperative language”, in order to actually get the thing you want, you have to explain it. That’s an unavoidable aspect of interacting with anything that responds to commands, computer or not.

If the AI misunderstands those instructions and takes you to a slightly different place than you want to go, that’s a huge problem. But it’s bound to happen if you’re writing machine instructions in a natural language like English and in an environment where the same instructions aren’t consistently or deterministically interpreted. It’s even more likely if the destination or task is particularly difficult/complex to explain at the desired level of detail.

There’s a certain irreducible level of complexity involved in directing and translating a user’s intent into machine output simply and reliably that people keep trying to “solve”, but the issue keeps reasserting itself generation after generation. COBOL was “plain English”, and people assumed it would make interacting with computers like giving instructions to another employee over half a century ago.

The primary difficulty is not the language used to articulate intent, the primary difficulty is articulating intent.


This is a weak argument. I use normal taxis and ask the driver to take me to a place in natural language, a process which is certainly non-deterministic.


And the taxi driver has an intelligence that enables them to interpret your destination, even if it's ambiguous. And even then, mistakes happen (all the time, with taxis going to a different place than the passenger intended because the names were similar).


Yes so a bit of non determinism doesn’t hurt anyone. Current LLMs are pretty accurate when it comes to these sort of things.


> a process which is certainly non-deterministic

The specific events that follow when asking a taxi driver where to go may not be exactly repeatable, but reality enforces a physical determinism that is not explicitly understood by probabilistic token predictors. If you drive into a wall you will obey deterministic laws of momentum. If you drive off a cliff you will obey deterministic laws of gravity. These are certainties, not high probabilities. A physical taxi cannot have a catastrophic instant change in implementation and have its wheels or engine disappear when it stops to pick you up. A human taxi driver cannot instantly swap their physical taxi for a submarine, they cannot swap New York with Paris, they cannot pass through buildings… the real world has a physically determined option-space that symbolic token predictors don’t understand yet.

And the reason humans are good at interpreting human intent correctly is not just that we’ve had billions of years of training with direct access to physical reality, but because we all share the same basic structure of inbuilt assumptions and “training history”. When interacting with a machine, so many of those basic unstated shared assumptions are absent, which is why it takes more effort to explicitly articulate what it is exactly that you want.

We’re getting much better at getting machines to infer intent from plain english, but even if we created a machine which could perfectly interpret our intentions, that still doesn’t solve the issue of needing to explain what you want in enough detail to actually get it for most tasks. Moving from point A to point B is a pretty simple task to describe. Many tasks aren’t like that, and the complexity comes as much from explaining what it is you want as it does from the implementation.


I think it’s pretty obvious but most people would prefer a regular schedule not a random and potentially psychologically jarring transportation event to start the day.


> your car is non-deterministic

It's not, as far as your experience goes: you press the pedal, it accelerates; you turn the steering wheel, it goes the way you turn it. What the car does is deterministic.

More importantly, it does this every time, and the amount of turning (or accelerating) is the same today as it was yesterday.

If an LLM interpreted those inputs, could you say with confidence that you will accelerate in the way you predicted? If so, then I would be fine with LLM-interpreted inputs for driving. Otherwise, how do you know, for sure, that pressing the brakes will stop the car before you hit somebody in front of you?

Of course, you could argue that the input is no longer your moving the brake pads etc. - you just name a destination and you get there, and that is supposed to be deterministic, as long as you describe your destination correctly. But is that where LLMs are today? Or is that the imagined future of LLMs?
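For what it's worth, repeatability is a knob in current systems rather than an absolute: greedy (temperature-zero) decoding deterministically picks the highest-scoring option for a given input, while sampling at higher temperatures does not. A toy sketch, with a hypothetical action set:

```python
import math
import random

def choose_action(logits, temperature):
    """Greedy decoding (temperature == 0) is repeatable;
    sampling at temperature > 0 is not."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    weights = [math.exp(l / temperature) for l in logits]
    return random.choices(range(len(logits)), weights=weights)[0]

actions = ["brake", "accelerate", "steer"]  # hypothetical action set
logits = [5.0, 1.0, 0.5]                    # the model strongly favors "brake"

# Greedy: the same logits always yield the same action.
assert all(actions[choose_action(logits, 0)] == "brake" for _ in range(100))

# Sampled: "accelerate" or "steer" will occasionally come out instead,
# which is exactly the property you do not want in a brake pedal.
```

Of course, greedy decoding only makes the mapping from input to output repeatable; it does not guarantee the output is the one you meant.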


Sometimes it doesn't though. Sometimes the engine seizes because a piece of tubing broke and you left your coolant down the road two turns ago. Or you steer off a cliff because there was coolant on the road for some reason. Or the meat sack in front of the wheel just didn't get enough sleep and your response time is degraded and you just can't quite get the thing to feel how you usually do. Ultimately the failure rate is low enough to trust your life on it, but that's just a matter of degree.


The situations you described reflect a system that has changed. And if the system has changed, then a change in output is to be expected.

It's the same as having a function called "factorial" but you change the multiplication operation to addition instead.


All of those situations are the "driver's own fault", because they could have had a check to ensure none of that happened before driving. Not true with an LLM (at least, not as of today).


Tesla's "self-driving" cars have been working very hard to change this. That piece of road it has been doing flawlessly for months? You're going straight into the barrier today, just because it feels like it.


I mean, as long as it works and it is still technically "my car", I would welcome the change.


But do you want to drive, or do you want to be wherever you need to be to fuck?


For me personally, the latter, but there's definitely people out there that just love driving.

Either way, these silly reductionist games aren't addressing the point: if I just want to get from A to B then I definitely want the absolute minimum of unpredictability in how I do it.


That would ruin the brain's plasticity.

I wonder now: if everything were always different and then suddenly every day were the same, how many times as terrifying would that be compared to the opposite?


A form of Alexei Yurchak's hypernormalisation?


Only because you think the driving is what you want. The point is that what you want is determined by our brain chemicals. Many steps could be skipped if we could just give you the chemicals in your brain that you craved.


I feel like this is the point where we start to make jokes about Honda owners.


Go on, what about honda owners? I don't know the meme.


The "Wham Baam" YouTube channels have a running joke about Hondas bumping into other cars with concerning frequency.


Sadly, this is not true of a (admittedly very small) number of individuals.


Christine didn’t end well for anyone.


...so that you can get to the supermarket for food, to meet someone you love, meet someone you may or may not love, or to solve the problem of how to get to work; etc.

Your ancestors didn't want horses and carts, bicycles, shoes - they wanted the solutions of the day to the same scenarios above.


As much as I love your point, this is where I must ask whether you even want a corporeal form to contain the level of ego you're describing. Would you prefer to be an eternal ghost?

To dismiss the entire universe and its hostilities towards our existence and the workarounds we invent in response as mere means to an end rather than our essence is truly wild.


Most people need to go somewhere (in a hurry) to make money or food, etc., which most people wouldn't do if they didn't have to, so yeah, it is mostly a means to an end.


And yet that money is ultimately spent on more means to ends that are just as inconvenient from another perspective?

My point was that there is no true end goal as long as whims continue. The need to craft yet more means is equally endless. The crafting is the primary human experience, not the using. The using of a means inevitably becomes transparent and boring.


It should finalize into introducing satisfaction to the whims directly, so the AI would be directly managing the chemicals in our brains that would trigger feelings of reward and satisfaction.


I think you're just describing drugs


Yes, but current drugs have many issues such as tolerance build up and withdrawals. If AI could figure out how to directly manage chemicals in the brain in such a way that it keeps working, it would be able to attain its goals of making people happy.


Even if it purred real nice when it started up? (I’m sorry)


Looks like we have a Civic owner xD


Weird kink


Food -> 'basic needs'... so yeah, Shelter, food, etc. That's why most of us drive. You are also correct to separate Philia and Eros ( https://en.wikipedia.org/wiki/Greek_words_for_love ).

A job is better if your coworkers are of a caliber that they become a secondary family.


> Average humans are pretty great at solving a certain class of complex problems that we tried to tackle unsuccessfully with many millions of lines of deterministic code...

Are you suggesting that an average user would want to precisely describe in detail what they want, every single time, instead of clicking on a link that gives them what they want?


No, but the average user is capable of describing what they want to something trained in interpreting what users want. The average person is incapable of articulating the exact steps necessary to change a car's oil, but they have no issue with saying "change my car's oil" to a mechanic. The implicit assumption with LLM-based backends is that the LLM would be capable of correctly interpreting vague user requests. Otherwise it wouldn't be very useful.


The average mechanic won’t do something completely different to your car because you added some extra filler words to your request though.

The average user may not care exactly what the mechanic does to fix your car, but they do expect things to be repeatable. If car repair LLMs function anything like coding LLMs, one request could result in an oil change, while a similar request could end up with an engine replacement.


I think we're making similar points, but I kind of phrased it weirdly. I agree that current LLMs are sensitive to phrasing and are highly unpredictable and therefore aren't useful in AI-based backends. The point I'm making is that these issues are potentially solvable with better AI and don't philosophically invalidate the idea of a non-programmatic backend.

One could imagine a hypothetical AI model that can do a pretty good job of understanding vague requests, properly refusing irrelevant requests (if you ask a mechanic to bake you a cake he'll likely tell you to go away), and behaving more or less consistently. It is acceptable for an AI-based backend to have a non-zero failure rate. If a mechanic was distracted or misheard you or was just feeling really spiteful, it's not inconceivable that he would replace your engine instead of changing your oil. The critical point is that this happens very, very rarely and 99.99% of the time he will change your oil correctly. Current LLMs have far too high of a failure rate to be useful, but having a failure rate at all is not a non-starter for being useful.
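It's worth putting numbers on "very, very rarely": at backend volumes, a per-request failure rate that sounds tiny still produces a steady stream of engines replaced instead of oil changed. A back-of-the-envelope sketch (the rates and volume here are illustrative, not measured):

```python
def expected_failures(requests, success_rate):
    """Expected number of botched requests for a given volume."""
    return requests * (1 - success_rate)

daily_requests = 1_000_000  # hypothetical backend volume

# The 99.99% rate above still means roughly 100 botched requests per day...
print(expected_failures(daily_requests, 0.9999))   # ~100

# ...while a 95% rate would mean about 50,000 per day.
print(expected_failures(daily_requests, 0.95))     # ~50,000
```

So "acceptable non-zero failure rate" is really a question of how many nines the model can deliver.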


All of that is theoretically possible. I’m doubtful that LLMs will be the thing that gets us to that though.

Even if it is possible, I’m not sure if we will ever have the compute power to run all or even a significant portion of the world’s computations through LLMs.


Mechanics, and humans, are non-deterministic. Every mechanic works differently, because they have different bodies and minds.

LLMs are, of course, bad. Or not good enough, at least. But suppose they are. Suppose they're perfect.

Would I rather use an app or just directly interface with an LLM? The LLM might be quicker and easier. I know, for example, ordering takeout is much faster if I just call and speak to a person.


Old people sometimes call rather than order on the website. They never fail to come up with a query that no amount of hardcoded logic could begin to attack.


> Every mechanic works differently, because they have different bodies and minds.

Yes but the same LLM works very differently on each request. Even ignoring non-determinism, extremely minor differences in wording that a human mechanic wouldn’t even notice will lead to wildly different answers.

> LLMs are, of course, bad. Or not good enough, at least. But suppose they are. Suppose they're perfect.

You’re just talking about magic at that point.

But supposing they do become “perfect”, I’m skeptical we’ll ever have the compute resources to replace a significant fraction of computation with LLMs.


There would be bookmarks to prompts, and the results of the moment would be cached: both of these are already happening and will get better. We will probably freeze and unfreeze parts of neural nets to get to that point, and even mix them up to quickly combine the different concepts you described before and continue from there.


I think they're suggesting that some problems are trivially solvable by humans but extremely hard to solve with code - in fact the outcome can seem non-deterministic, despite being deterministic, because there are so many confounding variables at play. This is where an LLM or other form of AI could be a valid solution.


When I reach for a hammer I want it to behave like a hammer every time. I don't ever want the head to fly off the handle or for it to do other things. Sometimes I might wish the hammer were slightly different, but most of the time I would want it to be exactly like the hammer I have.

Websites are tools. Tools being non-deterministic can be a really big problem.


Companies want determinism. And for most things, people want predictability. We've spent a century turning people into robots for customer support, assembly lines, etc. Very few parts of everyday life still boil down to "make a deal with the person you're talking to."

So even if it would be better to have more flexibility, most business won't want it.


Why sell to a company when you can replace it?

I can speculate about what LLM-first software and businesses might look like and I find some of those speculations more attractive than what's currently on offer from existing companies.

The first one, which is already happening to some degree on large platforms like X, is LLM-powered social media. Instead of having a human-designed algorithm handle suggestions, you hand the decisions over to an LLM, but it could go further. It could handle customizing the look of the client app for each user, and it could provide goal-based suggestions or search: you could tell it what type of posts or accounts you're looking for, or the reason you're looking for them (e.g. "I want to learn ML and find a job in that field"), and it would give you a list of users who are in that field, post frequent and high-quality educational material, have demonstrated a willingness to mentor, and are currently not too busy to do so, as well as a list of posts that serve as a good starting point, etc.

The difference in functionality would be similar to the change from static websites to dynamic web apps. It adds even more interactivity to the page and broadens the scope of uses you can find for it.


Sell to? I'm talking about buying from. How are you replacing your grocery store, power company, favorite restaurants, etc, with an LLM? Things like vertical integration and economies of scale are not going anywhere.


The issue with not having something deterministic is that when there's regression, you cannot surgically fix the regression. Because you can't know how "Plan A" got morphed into "Modules B, C, D, E, F, G," and so on.

And don't even try to claim there won't ever be any regression: current LLM-based A.I. will 'happily' lie to you that it passed all tests - because, based on past interactions, it has.


So basically you're saying the future of the web is that everyone gets their own Jarvis, and like Tony you just tell Jarvis what you want and it does it for you - there's no need for preexisting software, or to even write new software; it just does what's needed to fulfill the given request and gives you the results you want. This sounds nice, but wouldn't it get repetitive and computationally expensive? Imagine that instead of Google Maps, everyone just asks the AI directly for the things people typically use Google Maps for, like directions and location reviews. A centralized application like Maps can be more efficient, as it's optimized for commonly needed work, and it can be further improved with all the data gathered from the users who interact with it. On the other hand, if AI was allowed to do its own thing, it could keep reinventing the wheel, solving the same tasks again and again without the benefit of building on top of prior work, while also missing the improvements that come from the network effect of a large number of users interacting with the same app.


You might end up with AI trying to get information from AI, which saves us the frustration.

Who knows where we'd end up?

On the other hand the logs might be a great read.


We're used to dealing with human failure modes; AI fails in such unfamiliar ways that it's hard to deal with.


But it is still very early days. And you could have the AI generate code for deterministic things and fast execution, while the AI keeps monitoring that code and jumps in whenever the user needs something the code doesn't cover. It's not one or the other necessarily.


Determinism is the edge these systems have. Granted, in theory enough AI power could be just as good. Like, 1,000,000 humans could give you the answer to a Postgres query. But Postgres is gonna be more efficient.


No, I wouldn’t say that my hypothesis is that non-deterministic behavior is good. It’s an undesirable side effect and illustrates the gap we have between now and the coming post-code world.


AI wouldn't be intelligent though if it was deterministic. It would just be information retrieval.


It already is "just" information retrieval, just with stochastic threads refining the geometry of the information.


Haha u mean it isn't AGI? /s


Web apps kind of already do that with most companies shipping constant UX redesigns, A/B tests, new features, etc.

For a typical user today’s software isn’t particularly deterministic. Auto updates mean your software is constantly changing under you.


I don't think that is what the original commenter was getting at. In your case, the company is actively choosing to make changes. Whether it's for a good reason, or leads to a good outcome, is beside the point.

LLMs being inherently non-deterministic means using this technology as the foundation of your UI will mean your UI is also non-deterministic. The changes that stem from that are NOT from any active participation of the authors/providers.

This opens a can of worms where there will always be a potential for the LLM to spit out extremely undesirable changes without anyone knowing. Maybe your bank app one day doesn't let you access your money. This is a danger inherent and fundamental to LLMs.


Right, I get that. The point I'm making is that from a user's perspective it's functionally very similar: a non-deterministic LLM, or a non-deterministic company full of designers and engineers.


Regardless of what changes the bank makes, it’s not going to let you access someone else’s money. This llm very well might.


Well, software has been known to have vulnerabilities...

Consider this: the bank teller is non-deterministic, too. They could give you 500 dollars of someone else's money. But they don't, generally.


Bank tellers are deterministic though. They have a set protocol for each case and escalate unknown cases to a more deterministic point of contact.

It will be difficult to incorporate relative access or restrictions to features with respect to a user's current/known state or actions. Might as well write the entire web app at that point.


I think the bank teller's systems and processes are deterministic, but the teller themselves is not. They could even rob the bank, if they wanted to. They could shoot the customers. They don't, generally, but they can.

I think, if we can efficiently capture a way to "make" LLMs conform to a set of processes, you can cut out the app and just let the LLM do it. I don't think this makes any sense for maybe the next decade, but perhaps at some point it will. And, in such time, software engineering will no longer exist.


The actual app is the set of processes.


The rate of change is so different it seems absurd to compare the two in that way.

The LLM example gives you a completely different UI on _every_ page load.

That’s very different from companies moving around buttons occasionally and rarely doing full redesigns


And most end users hate it.


I think it's actually conceptually pretty different. LLMs today are usually constrained to:

1. Outputting text (or, sometimes, images).

2. No long term storage except, rarely, closed-source "memory" implementations that just paste stuff into context without much user or LLM control.

This is a really neat glimpse of a future where LLMs can have much richer output and storage. I don't think this is interesting because you can recreate existing apps without coding... But I think it's really interesting as a view of a future with much richer, app-like responses from LLMs, and richer interactions — e.g. rather than needing to format everything as a question, the LLM could generate links that you click on to drill into more information on a subject, which end up querying the LLM itself! And similarly it can ad-hoc manage databases for memory+storage, etc etc.


Or, maybe, just not use LLMs?

LLM is just one model used in A.I. It's not a panacea.

For generating deterministic output, probably a combination of Neural Networks and Genetic Programming will be better. And probably also much more efficient, energy-wise.


Every time you need a rarely used functionality it might be better to wait 60s for an LLM with MCP tools to do its work than to update an app. It only makes sense to optimize and maintain app functionalities when they are reused.


For some things you absolutely want deterministic behaviour. For other things, behaviour that adapts to usage and to the context provided by the data the user supplies sounds like it could potentially be very exciting. I'm glad people are exploring this. The hard part will be figuring out where the line goes, and when and how to "freeze" certain behaviours that the user seems happy with vs. continuing to adapt to data.


Like, for sure you can ask the AI to save its "settings" or "context" to a local file in a format of its own choosing, and then bring that back in the next prompt; couple this with temperature 0 and you should get to a fixed-point deterministic app immediately.


There may still be some variance at temperature 0. The outputted code could still have errors. LLMs are still bounded by undecidable problems in computability theory, like Rice's theorem.


Why wouldn't the LLM codify that "context" into code so it doesn't have to rethink it over and over? Just like humans would. Imagine if you were manually operating a website, and every time a request came in you had to come up with SQL queries (without remembering how you did it last time) and manually type the responses. You wouldn't last long before you started automating.


> couple this with temperature 0

Not quite the case. Temperature 0 is not the same as random seed. Also there are downsides to lowering temperature (always choosing the most probable next token).


LLMs are easily made deterministic by choosing the selection strategy. More than being deterministic, they are also fully analyzable, and you don't run into issues like the halting problem if you constrain the output appropriately.
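To sketch the selection-strategy point (hypothetical standalone code, not any real serving stack): "temperature 0" in practice means greedy decoding, i.e. argmax over the logits, which is a pure function of its input - though, as noted above, real deployments can still vary across runs for other reasons (e.g. batching and floating-point non-associativity on GPUs).

```typescript
// Greedy selection: always take the highest-logit token. A pure function of
// its input, so this selection strategy is deterministic by construction.
function argmax(logits: number[]): number {
  let best = 0;
  for (let i = 1; i < logits.length; i++) {
    if (logits[i] > logits[best]) best = i; // ties go to the first maximum
  }
  return best;
}

// Temperature sampling for comparison: softmax(logits / T). As T -> 0 the
// distribution collapses onto the argmax token; at T > 0 the chosen token
// depends on the random draw, hence the non-determinism.
function sample(logits: number[], temperature: number, rand: () => number): number {
  if (temperature === 0) return argmax(logits);
  const scaled = logits.map((l) => l / temperature);
  const m = Math.max(...scaled); // subtract max for numerical stability
  const weights = scaled.map((l) => Math.exp(l - m));
  const total = weights.reduce((a, b) => a + b, 0);
  let r = rand() * total;
  for (let i = 0; i < weights.length; i++) {
    r -= weights[i];
    if (r <= 0) return i;
  }
  return weights.length - 1;
}
```

The constrained-output point is analogous: if the selection step may only pick from a grammar's legal next tokens, the reachable outputs become enumerable.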


Why do good thing consistently when we can do great thing that only works sometimes??? :(


Designing a system with deterministic behavior would require the developer to think. Human-Computer Interaction experts agree that a better policy is to "Don't Make Me Think" [1]

[1] https://en.wikipedia.org/wiki/Don%27t_Make_Me_Think


That book is talking about user interaction and application design, not development.

We absolutely should want developers to think.


As experiments like TFA become more common, the argument will shift to whether anybody should think about anything at all.


What argument? I see a business model here, not an argument.


I meant "the discourse", "the conversation we are all having", interpreting the experiment in TFA as an entry in that discourse.


This is such a massive misunderstanding of the book. Have you even read it? The developer needs to think so that the user doesn't have to...


My most charitable interpretation of the perceived misunderstanding is that the intent was to frame developers as "the user."

This project would be the developer tool used to produce interactive tools for end users.

More practically, it just redefines the developer's position; the developer and end-user are both "users". So the developer doesn't need to think AND the user doesn't need to think.


I interpreted it like "why don't we simply eat the orphans"? It kind of works but it's absurd, so it's funny. I didn't think about it too hard though, because I'm on a computer.


..is this an AI comment?


> who wants web apps to behave differently every time you interact with them?

Technically everyone, we stopped using static pages a while ago.

Imagine pages that can now show you e.g. infinitely customizable UI; or, more likely, extremely personalized ads.


Small anecdote. We were releasing UI changes every 2 weeks, making the app better, more user-friendly, etc.

Product owners were happy.

Until users came for us with pitchforks as they didn’t want stuff to change constantly.

We backed off to a monthly release cadence.


No.

When I go to the DMV website to renew my license, I want it to renew my license every single time.


Ah, sure; that's why everyone got Adblock and uBO in the first place. Even more so on phones.


> infinitely customizable UI; or, more likely, extremely personalized ads

Yeah, NO.


I hate submitting any kind of form on any website from my phone, because I can't open dev tools and see if there were any errors in the response which were invisible in the UI.


I've always found it ridiculous that Sony is allowed to sell their consoles and games in my country without allowing me to create a PSN account. They even sell the digital PS5, which REQUIRES a PSN account to get any games.

And then, even if you break their TOS and create an account in another country, you're still constantly inconvenienced - you can't pay for games using your local payment method, for example, and a useful Playstation mobile app is not even listed in the local app store.

IMO they should either provide the same level of service in all countries, or be forced to charge significantly less when selling their hardware in unsupported countries.


Is it actually Sony selling their games or consoles in your country or is it 3rd party sellers who import them from elsewhere because there's demand? It is still shitty and annoying that stuff is geoblocked, be it games, movies, books, etc. But the two are pretty different.


I'm not sure about the actual hardware, but physical copies of helldivers 2 are on sale in the Philippines, where you can't make a PSN account. You can find them on Playstation's website which makes me think they are authorized copies https://www.playstation.com/en-ph/games/helldivers-2/


This^ There are so many third-party resellers that most of them are probably unauthorized.


If you have to distinguish this sounds like a rather twisted conception of property to begin with.


Assuming it's actually sold by unauthorized third party sellers, every step in the chain seems pretty straightforward. People in [country where PSN isn't available] want PS5s. Enterprising people see an opportunity. Sony (or their authorized resellers) in those other countries sell PS5s to the importers, because you don't need to fill out a KYC form to buy a PS5, so the importers are indistinguishable from someone who's buying one for domestic consumption. The end result is not ideal and the OP acknowledges this, but I'm not seeing where the "rather twisted conception of property" is coming from.


> The end result is not ideal and the OP acknowledges this, but I'm not seeing where the "rather twisted conception of property" is coming from.

I'm just noting the continued slide from wholly owned property to buying property but only leasing the use of it. I'm sure there are many rights-owners worldwide that are thrilled with this concept.


Yes, it is becoming likely that we are just “renting” hardware since the software needed for many consumer products is closed source and you can’t run your own software.

Counterpoint is that jailbreaking is legal so you do technically own the hardware. Issue is that John Deere, Apple, Sony, Microsoft make it impossible to jailbreak hardware nowadays.


> IMO they should either provide the same level of service in all countries, or be forced to charge significantly less when selling their hardware in unsupported countries.

Who would force the price change though?

It seems more reasonable that anyone in a country without PSN access just wouldn't bother buying a PS5, regardless of the price.


I bet a lot of people don't know they can't create a PSN account until after they buy it


Totally fair. That's definitely a Sony problem if it isn't clear what countries are supported. I have to assume I'd return the PS5 if I got it home and couldn't make an account.

edit: typo


It's reasonable to ask for that and it would probably be up to a local government agency to enforce such a rule.

(In fact Meta's Quest was not available in my country for a few years until Meta changed their mind and removed the need to have a Facebook account as well).

To make Sony care about small countries, it would be best for those countries to band together and act as a group on that. It must hurt them economically.


If Sony doesn't have any presence in those countries, as evidenced by the fact that you can't create PSN accounts for those countries, do those countries even have jurisdiction over Sony? More practically, what leverage do they have to enforce compliance? Meta (Facebook) doesn't have any presence in China, so they can safely ignore any takedown requests from them, even if they are actually violating Chinese law.


A game is just a game. There is nothing vital to it.

It's the easiest thing in the world to boycott.

And yet, players consistently fail to make the publishers pay for their bad behaviors. EA is still crapping on their clients. Blizzard fails forward. Loot boxes, DRM, crippling anti-cheat mechanisms, buggy games with expensive DLC and micropayments are everywhere. Sony even infected their customers with a rootkit once.

If you keep giving them money despite this, then they have correctly noted they can charge you for their fun system despite the inconvenience.

Already, more games have been created and published than a human could finish in an entire lifetime. And that's just games, not music, movies, series, or books.

You could just stop buying new games forever and have an entire lifetime of wonderful gaming experiences.

Hell, I'm still playing old SNES games, or the Flash version of Isaac.

So not buying from a publisher?

Easy.

Stop complaining.

Act.


That's not how this market works. There is no homogeneous group of gamers that can act together. Each year there is a new cohort of 11-year-olds who know nothing of past publisher behaviour and want the latest EA game for their birthday. With FIFA and the loot boxes etc., there is the added factor that the people buying those are largely a distinct group from other gamers; there are many guys who only play FIFA or their favorite sports game and nothing else. So bad behaviour in that area does not affect other game sales as much as it should.


For certain genres and certain games, no amount of negative reviews seems to have any effect on their popularity. I've seen many cases where, on Steam at least, the reviews for a game can be mostly negative, or mixed at best, yet the number of reviews keeps going up like crazy. Meanwhile some very positive and occasionally overwhelmingly positive games can have fewer total reviews a year after release than some poorly reviewed games get in a day.

There's also still an "early access" stigma, even though most major publishers are basically treating "full" release day as an open beta test now.

All of this to say that careful consumers can only affect a small proportion of the games industry's revenue. It's enough to keep indie games and their small studios alive but so far it has had near zero effect on the AAAs. I suspect the layoffs we're seeing across the industry reflect a contraction in the spending of the majority segment but a lot of AAAs seem to be doubling down on targeting that same segment.


This makes me so mad. Every time I have to show my nieces or nephews how to play a game, and I have to show them all the fake buttons they have to avoid to manage to play it through the dark patterns. Even Windows, the OS, is full of traps now. You can't buy any software and have it pretend to be yours anymore.

Used to be the case for free mobile games at first, now it's everywhere.


Just to say: this is avoidable. Of course one cannot give Windows to a child anymore, but there is Linux. For games on Android, of course one can't use the Play Store anymore, but there is F-Droid. And for regular PC games, there is still a big selection that works with little or no microtransactions. And besides, the kid's system should not be capable of making any purchases, no matter where they click.

But I can completely relate that it is infuriating and that it takes a lot of filtering through the mainstream shit to manage this well.


Proprietary/popular games on Linux come with the same monetization strategy on every platform. You want to play Fortnite, you get the Fortnite ads/loot boxes/nudges even on Linux (plus you'll get banned by anti-cheat quick).


Linux is not the recommended choice for the game selection, it's so that the OS is not already a hostile place with ads and spammy news. The games themselves have still to be filtered, as I wrote.


That's what I use myself, but I can't really recommend it to family overseas... And even that is not safe, this will probably be the last Ubuntu release I can use before the "Ubuntu Pro", Snaps, and other integrations become unbearable.


In the same manner one can avoid many bad aspects of modern cars if they are an experienced mechanic and can build their own, but that's not a reasonable burden for an average person.


Feels a bit absurd to try to lay the blame at the feet of 11-year-olds, when adults routinely demonstrate the same behavior.

The latest example of KSP2 is imo a great example, announced in 2019 for release in 2020, delayed several times to go on to launch into a $50 'early access' early last year with far less functionality than the original game and worse bugs and architectural issues. Despite all the glaring warning signs, so many people ate up the promises. They delivered one basic feature in 6 months (reentry heating). Yet the 'trust' from otherwise smart adults remained. It seems they only finally noticed this past week, when the studio making the game was shut down. Now they expect refunds despite all the warnings Steam has against buying early access games based on future promises.


Well in this case things changed AFTER the game was bought, so people got screwed.

Besides, you make one of the dumbest points I have ever read on this website. Gas prices going up? Pfeh, I heat my house with a bonfire, wake up sheeple.


> players consistently fail to make the editors pay for their bad behaviors.

This comes across as either gaslighting or refusal of evidence - you have a theory that bad players/companies will be punished by the market, and when it doesn't happen, you conclude that it's the consumer's fault.

Maybe you should consider that your theory is wrong, as it does not match the real world. It appears to me that, these days, most of the time bad players do very well in the market.


There is a screenshot of an email that someone from Ukraine sent to Sony support and their response was that you can create an account using PS5 but not from a PC.


I've seen this criticism several times on HN, but have never been able to relate to it.

I've been using hooks since they were introduced, in several teams (at several different companies), and I've never experienced them being complicated to understand, either for myself, or for team mates - even juniors who are new to React. In my experience, it takes very little time (<1 hour) to understand the basics of React, and once you have that mental model in place, hooks fit in immediately.

I wonder if it's the case that many people on HN are just used to some completely different libraries and thus are coming in to React with a completely different mental model? And that's the cause of this sentiment being so common here.


> I wonder if it's the case that many people on HN are just used to some completely different libraries and thus are coming in to React with a completely different mental model? And that's the cause of this sentiment being so common here.

Nope. For me React was the first frontend framework I learned. The mental model of Class components was really easy to understand. I have since "learned" hooks, but they are a constant source of mental exertion for me, and it's very easy to make mistakes. Kind of like all the other "improvements" that they brought to React since Class components.


You might be interested in learning the reason hooks were invented [0]. I also use hooks in Flutter via a separate package and its creator made a great GitHub issue talking about exactly why class components cannot replicate the hook model [1], simply due to the limitations of how classes work. The code is in Dart but it's simple enough to grasp if you know JS and class concepts in general like overrides and mixins.

[0] https://medium.com/@dan_abramov/making-sense-of-react-hooks-...

[1] https://github.com/flutter/flutter/issues/51752


I do not get how class components are in any way simpler. Before, you had to think about explicit configuration states during the component's entire lifecycle. Now, you just... don't?


Previously everything was explicitly in your code. Now everything is done with "magic" outside your code. You're saying that this is better because now I "don't have to think about it". But I do! When something doesn't work, you have to figure it out. It's easier to debug code that you can see and reason with, and more difficult to debug a black box that behaves in mysterious and unexplainable ways.

What you're probably thinking is "it's faster to write a TODO example app with hooks". That's not really relevant for actual software development.

It seems that framework creators are constantly making tradeoffs where they are making the easy things easier at the expense of making the hard things harder. That's the wrong tradeoff to make.


That's interesting. My feeling is that things became MORE explicit with hooks, not less, so I'm quite confused still.

> What you're probably thinking is "it's faster to write a TODO example app with hooks". That's not really relevant for actual software development.

For me, that's not the case at all. I think with hooks, it's easier to reason about what values are actually used during what renders.
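A rough sketch of what I mean (contrived plain TypeScript, not real React - I'm just modeling each render as a function call):

```typescript
// Hook-style: each "render" is a plain function call, and any callbacks
// created during that render close over the values of THAT render.
function render(count: number) {
  return { count, logCount: () => count }; // `count` is frozen into this closure
}

// Class-style: one mutable `this.state` box shared by every callback.
class Counter {
  state = { count: 0 };
  logCount = () => this.state.count; // always reads the *latest* state
}

const first = render(0);
const second = render(1);

const c = new Counter();
const classLog = c.logCount;
c.state.count = 1; // state changes after the callback was handed out

// Each hook-style closure reports the value of its own render...
console.log(first.logCount(), second.logCount()); // 0 1
// ...while the class-style callback reports whatever the state is right now.
console.log(classLog()); // 1
```

With the class, answering "which value did this callback see?" means tracing mutations over time; with the closure, the answer is pinned to the render that created it.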


> For me, that's not the case at all. I think with hooks, it's easier to reason about what values are actually used during what renders.

Hmmh. Can you provide me an example of what would've been "unclear" if the value was just state in a Class component?


I can objectively prove that React devs themselves have at the very least changed how they understand hooks but more likely have been making it up as they’ve gone along. The best example is one of the more problematic hooks, useEffect.

Here’s how useEffect is described in the old documentation:

https://legacy.reactjs.org/docs/hooks-effect.html

> The Effect Hook lets you perform side effects in function components

Here’s how the new docs explain useEffect:

> useEffect is a React Hook that lets you synchronize a component with an external system.

These are dramatically different claims about what useEffect is supposed to be for. Dig a little deeper and React developers will now explicitly tell you not to use useEffect for side effects, which again is the opposite of what they explained it to be.

The same change in how they’re actually supposed to be used has been true across all the books to a lesser or greater degree.

Maybe React has finally stabilized and it’s easier for new devs because they’re learning the more stabilized version. However, I suspect it’s easy because they just happen to be within the same cycle of understanding in React. Much like how React was easy for devs coming to it nearly a decade ago, until the devs changed everything about it. New devs may just not be deep enough into the new cycle to experience the pain of all your understanding being wrong.

Now, to be completely clear, I have no problem with the changes. I think change is good and I have successfully worked with languages where the change has been even more dramatic than React.

The difference with React, I’ve increasingly come to realize, is that the developers will make radical changes in how they understand it to work, while still gaslighting you that nothing really has changed.

Hence the massive effort they made, when they first introduced hooks, to convince everyone that hooks were just a different approach from classes and that both would be first-class citizens forever, only to subtly and quietly change the narrative to hooks being the future of React and the recommended approach.

They make these shifts all over the place without ever announcing it and convincing you to believe that what they’re saying today has always been the case.

More honesty towards how they’re changing React and what those changes mean would have gone a long way to reduce the absolute confusion floating around the react world.


I think it’s more that the React team’s way of articulating their idea of what a “side effect” is changed than that their idea of what useEffect is for changed. The examples of “side effects” in the old docs are all cases of “external synchronization”. People found the old terminology misleading because they came with their own preconceptions of what a side effect is, so they changed their pedagogical approach. The technical details haven’t really changed, just the presentation.


I've worked on some pretty complex client side code and frankly react hooks are meh at best.

There is a very strong implication in the docs and even the overall solution of hooks to ignore the details and just "go with it".

Just do it™ our way™ because reasons™

Generally not healthy for building the level of understanding it takes to debug complex react code.

Also, I detest, on a spiritual level, the way that it very strongly incentivizes pushing business logic to the view layer.


I always mentally autocorrect "unlimited PTO" to "there is a limit to PTO, but we won't tell you what it is!"


It also guarantees you will not be compensated for the PTO you didn’t use when you resign.

Many places that accrue PTO will pay out for unused days at your departure.


"Depends on how much the boss likes you, and what mood they are in when you ask."

