> Very few customers pick the model based on cost.
What? 3 ou of 4 companies I consulted for that started using AI for coding marked cost as an important criteria. The 4th one has virtually infinite funding so they just don't care.
Interesting. I know the author thinks asking an LLM to make Atari games is cheating, but did he consider just randomly sampling from the assembly code of Atari games?
The basic results are interesting, but what really surprised me is that asking them to double-check didn't work. Falling for an "optical illusion" is one thing, but being unable to see the truth once you know the illusion there is much worse.
I'm not particularly convinced asking an LLM to "double check" has much significant semantic meaning. It seems more like a way to get it to re-roll the dice. If you ask it to "double-check" something that it is in fact correct about it'll quite often talk itself into changing to something wrong. If it's going to be wrong every time, it'll be wrong every time it double-checks too.
You can test this claim by asking it to double-check itself when you think it is correct. If you always stop when it gets it right you're risking Clever-Hans-ing yourself: https://en.wikipedia.org/wiki/Clever_Hans (And be sure to do it a couple of times. In situations of sufficient confidence it isn't easy to talk it out of a claim, but it's those borderline ones you want to worry about.)
Because it isn’t thinking. Asking it to “double check” is like pressing the equals button on a calculator a second time. It just runs the same calculation again.
I'm not advocating for p2p, but rather drawing attention to the word "value" and what it means to create it. For example, would netflix as a piece of software hold any value if the company were to suddenly lose all its copyrights and IP licenses? Whereas something like an operating system or excel has standalone utility, netflix is only as valuable as its IP. The software isn't designed to create value, but instead to fully utilize the value of a piece of property. It's an important distinction to keep in mind especially when designing such software. Now consider that in the streaming world there isn't just netflix, but prime, Hulu, HBO, etc. Etc.
The parent comment was complaining about certain employees contributions to "real value" or lack thereof. My question is, how do you ascertain the value of work in this context where the software isn't what's valuable but the IP is, and further how do justify working on a product thats already a solved problem and still refer to it as "creating 'real' value"?
reply