Hacker News | jsnider3's comments

No.

Very few customers pick the model based on cost; for many, ChatGPT is the only one they know of.

> Very few customers pick the model based on cost.

What? 3 out of 4 companies I consulted for that started using AI for coding marked cost as an important criteria. The 4th one has virtually infinite funding so they just don't care.


> 3 out of 4 companies I consulted for that started using AI for coding marked cost as an important criteria.

And those aren't average customers.


Interesting. I know the author thinks asking an LLM to make Atari games is cheating, but did he consider just randomly sampling from the assembly code of Atari games?

I'm not going to take security advice from someone whose website I can't open in https.

We try things, sometimes they don't work.

AI safety is a genuinely hard problem.

Indeed.

I just can't wrap my head around what the actual product/service is, let alone something that could be sold for billions.

"Safe AI" is very ambiguous in terms of product.


If you have a Safe AI, then becoming a billionaire is being an underachiever.

Sure, but again, define "Safe AI" in terms of a product.

What exactly am I buying? How much am I paying for it?

That's the thing I don't see.

Is it a model? `gpt-3.5-turbo-safe`?


Wouldn't all the money go to the unsafe AI, since it does more?

If someone invents an unsafe AI capable of making a billion dollars, then we will probably all die, which is why we should make safe AI instead.

The basic results are interesting, but what really surprised me is that asking them to double-check didn't work. Falling for an "optical illusion" is one thing, but being unable to see the truth once you know the illusion is there is much worse.

I'm not particularly convinced that asking an LLM to "double check" carries much semantic meaning. It seems more like a way to get it to re-roll the dice. If you ask it to "double-check" something that it is in fact correct about, it'll quite often talk itself into changing to something wrong. If it's going to be wrong every time, it'll be wrong every time it double-checks too.

You can test this claim by asking it to double-check itself when you think it is correct. If you always stop when it gets it right you're risking Clever-Hans-ing yourself: https://en.wikipedia.org/wiki/Clever_Hans (And be sure to do it a couple of times. In situations of sufficient confidence it isn't easy to talk it out of a claim, but it's those borderline ones you want to worry about.)


Because it isn’t thinking. Asking it to “double check” is like pressing the equals button on a calculator a second time. It just runs the same calculation again.

This just isn't true. If it took the energy of a small town, why would they sell it for $20/month?

Because if they sold it at cost, nobody would buy it.

It's the drug dealer model. Trying to get them hooked for cheap, then you turn the thumbscrews.

I like it, but giving Claude a "Deep Research" mode would be better.


Have not used it myself, but Claude has Research mode in beta.


It has Research, which works well with Web Search. Saves a lot of time compared to googling and trying to synthesise knowledge yourself.


What a coincidence! I was just added to the "Deep Research" beta.

Netflix's value comes from being convenient and compatible with the copyright system in a way sharing videos P2P definitely isn't.


I'm not advocating for P2P, but rather drawing attention to the word "value" and what it means to create it. For example, would Netflix as a piece of software hold any value if the company were to suddenly lose all its copyrights and IP licenses? Whereas something like an operating system or Excel has standalone utility, Netflix is only as valuable as its IP. The software isn't designed to create value, but instead to fully utilize the value of a piece of property. It's an important distinction to keep in mind, especially when designing such software. Now consider that in the streaming world there isn't just Netflix, but Prime, Hulu, HBO, etc.

The parent comment was complaining about certain employees' contributions to "real value" or lack thereof. My question is, how do you ascertain the value of work in this context where the software isn't what's valuable but the IP is, and further, how do you justify working on a product that's already a solved problem and still refer to it as "creating 'real' value"?


And their increasingly restrictive usage policies are basically testing how important the 'convenient' piece is.

