Hacker News
AI Hype is completely out of control – especially since ChatGPT-4o [video] (youtube.com)
63 points by grugagag 16 days ago | 111 comments



A lot of people appear to struggle with the likely possibility that there is both a lot of AI hype to go around and, at the same time, a very disruptive technology that will improve a lot over relatively short timelines.


I remember in 1999 or so near the top of the dot-com bubble, there was similar hype, and I was like "Sure, over 20 years this thing is going to be huge. Over the next 2, many people are going to get screwed." Same thing for crypto in 2014 and then again every time a new crypto bubble happens.

The problem is that capital isn't that patient. People are sinking billions into LLM integrations, at discount rates of 5%+. If it takes 20 years for a tech to pay off at a 5% discount rate, and you've sunk a billion dollars into it, it needs to earn back $2.65B. Moreover, at some point during that 20-year period, somebody's going to ask "Where's my billion dollars?" and pull the plug on the project.
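For anyone checking the arithmetic, here's the back-of-the-envelope future-value calculation in Python (the $1B, 5%, and 20 years are the hypothetical figures from above, not real project numbers):

    # Money sunk today must earn back its future value at the discount rate
    # just to break even.
    principal = 1_000_000_000   # $1B sunk into LLM integrations
    rate = 0.05                 # 5% discount rate
    years = 20                  # time until the tech pays off
    breakeven = principal * (1 + rate) ** years
    print(f"${breakeven / 1e9:.2f}B")  # -> $2.65B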

I think the tech behind LLMs may eventually be game-changing, but that tech is going to change ownership and get reinvented several times, and we won't actually see profitable, sustainable benefits until industries get refactored into cost structures that make sense for what LLMs can actually do. The web needed its Netscape and Yahoo and Geocities and Apache, but it also needed Google and Rails and Django and GitHub and Stripe and Facebook and Webkit and nginx and MySQL to really become what it did.


> what LLMs can actually do

You should see them as a stepping stone - something under study and development today, on the way to something more complete tomorrow.


I would agree, except the gap between what's realistically cool about LLMs (making tons of traditional NLP tasks easier/better) and what people are hyping them up to be (potentially world-ending sentient beings) is so tremendously large that it essentially guarantees an AI winter, despite the fact that LLMs represent a major advancement in the practical SotA.

It also concerns me that basically nobody is building a real product around the current state of "AI"; instead, they're hinging their success on what they believe it will be in the near future.


> what's realistically cool about LLMs

This morning I came across a newspaper article: "What will be the results of the European elections? Let us ask ChatGPT".


Newspapers have a long-standing tradition of exactly the same disconnection from reality that LLMs get flak for.

https://benwheatley.github.io/blog/2024/03/19-14.53.05.html


Very nice article, Ben.

But see, given that we know from experience how bad humans can be at judgement, reasoning, professionalism, output... Why did we strive for superhuman judgement, reasoning, professionalism, output? (The same way we strove for superhuman strength.)


Thanks.

> Why did we strive for superhuman judgement, reasoning, professionalism, output?

Lots of possible reasons; there are many ways for it to be valuable.

(And it's not like the newspapers are deliberately writing fiction, with notable exceptions like The Onion).

If it's a rhetorical question, I'd be interested to know what you had in mind :)


Although, looking at the article, it seems it said there would be significant gains for right-wing parties, so it did end up being right on that one!


If ML froze for the next ten years, we'd still spend them integrating everything we have today. The current state of the art has already crossed a minimum threshold of quality.


100% - the current SOTA with the right implementation can already replace a lot of contact center work, give us a real-life J.A.R.V.I.S. and make video games where your decisions actually influence the story in a wildly meaningful way.


I've noticed that some Korean restaurants are already running really excellent customer ordering agents over the phone.

I tend to agree, with the caveat that we are experiencing exponentially diminishing marginal returns on investment (energy, FLOPS, currency). Yes, it's going to get a lot better and smoother to work with. Agents will fade into the background, and someday very soon no one will want to hype that they are using the tech. But we will need another revolution to get the kind of phase change we experienced with the introduction of LLMs. IMO the biggest thing that will lead to improvement now is open models that will get reused and remixed in creative ways.

edit: to be clear, I'm saying that we are running into scaling "walls" that make hard extrapolation based on increased investment senseless.


Gartner put generative AI at just about its peak hype back in August of 2023 [1], with a 2-5 year time frame to go through the "trough of disillusionment" and then reach the "plateau of productivity", which seems pretty fair to me.

[1]https://www.gartner.com/en/articles/what-s-new-in-the-2023-g...


The standard disruptive-technology trend: "the market is going to be huge" to "there is no market" to "there is a market, and it's growing".


I suspect that the success of AI will also vary a lot between different applications. But most people will evaluate AI on the topics they're most familiar with.

It will likely be disruptive in some areas short term, but will take much longer in other areas.


The problem is that there has been little, merely incremental progress in the last year or so since the big ChatGPT boom to justify the hype, technically speaking. Most of the "progress" has basically been marketing: making the models respond in ways humans like, or making them more useful in certain practical applications. The basic, fundamental issues and limitations remain unanswered and unaddressed. As products, they have improved a lot and will most probably improve more. But if we are talking about moving towards AGI or more complex applications, I see no evidence of that, except as toys.


I see it being used a lot now. AGI is a mirage, and I think LLMs have made this clearer. We will never get there because the goalposts will always change. We will eventually get to the point where the prenatal experience is viewed as critical to AGI.

What complex applications are you speaking of?


It is always interesting to see these takes.

The past year has brought us both model improvements and drastic cost reductions. It's been a pretty magical year, IMO.

Is it AGI? Not even close but we have been utilizing the tooling improvements to build products internally.


Hardware improvements alone provide a straightforward path to a substantial improvement over what we have now, and we should see software improvements and more data to use over time as well. I'm not quite at "singularity" level of hype but this tech will just get used more and more and more.

It's one of those "overhyped in the short term, underestimated in the long term" things.


The main reason LLMs may appear disruptive at this point is because the real cost has yet to be fully factored in.

Not too many applications can justify spending huge amounts of energy generating answers that may just be nice sounding BS.


The main cost is training rather than inference, and compute is still getting cheaper, so I don't expect this to limit the use of LLMs.

We may finally see people realising that just because they can use a hammer on a screw, doesn't make it the best choice.


cheaper, but still crazy damn expensive

it makes sense because big money is betting it's worth the investment -- but it may not be


I'm not so certain. You can already run a GPT-4-quality model locally on a decent desktop, and GPT-3-quality models on low-powered chips - plus data centers will benefit from scale. A lot of third-party services are using paid APIs that (based on cases like Mistral where some models are publicly available) appear to more than cover the inference costs.

There are also plenty of uses for LLMs beyond generating hopefully-accurate answers, such as for fictional content or use as foundation models for tasks like translation. Though we are definitely in the "throw things at the wall and see what sticks" stage currently.


Yup. Humans are ironically (alazṓn-ically?) very binary when it comes to classification of things.


Even more people seem to struggle with the concept that it may be underhyped.


I've been working with them for a while now, and the areas I see them disrupting are:

* User interfaces. It does provide a genuinely new way to interface seamlessly with existing software.

* Translation.

* Generating bullshit. Yes, there's demand for this even if perhaps there shouldn't be.

* Not much else. That includes using it as a specialized autocomplete. I think it falls down pretty badly at that.


It is (or was, see my other comment) pretty good at generating standardized boilerplate code, which was quite useful.

Not so good at anything that's (for want of a better term) "creative".


I've noticed that too. I think it will lead to an excess of boilerplate and less refactoring to remove boilerplate. I'd consider these things to be at best neutral and maybe net negative.


RAG?


I filed that under "new kind of UI".


I'm using chatgpt daily, but cannot imagine it doing any real software development work for multiple reasons.

It's great as a non-offensive Stack Overflow replacement, but just look at the aider benchmarks (amazing work, by the way): the most capable models really struggle to make basic changes to a real code base.

Does anybody actually using it in practice believe the hype? I thought the hype was just another theatre for investors.

EDIT: just some points taken from: https://aider.chat/docs/leaderboards/#code-editing-leaderboa...

- The metric "Percent completed correctly" maxes out at 72.9% with GPT-4o, which at the same time produces correctly formatted output only 96.2% of the time.

- The benchmark suite is based on https://github.com/exercism/python, which is very likely already part of the training material! In a real code base, an LLM would have no such advantage, since the code is new or proprietary.


It's useful for small trivia answers - the kind of stuff you might ask Google about and read a short paragraph on, as long as the answer seems reasonable enough.

For coding I have long given up on it. I only see it as useful for people who want to write small scripts in languages they are not familiar with, or toy examples of web/mobile apps.

For art I can sort of see a workflow in adding details to textures and using existing images and 3D assets to guide the generation of images. This is all just my speculation, so I could be completely wrong - just like the person who knows little to no programming and thinks they will be able to code whatever they want with it.


I've built multiple MVP-scale Node web apps, Python scripts, and two native iOS apps with it, not having known the languages beforehand.

The only skill I brought to the table was app architecture.

It's not just hype.


It's a great and very useful tool, but the hype around it just feels significantly blown out of proportion.

I also built multiple things with it, and I always came to a point where it just couldn't handle anything slightly larger than an MVP, or an unguided change requiring edits across multiple files at once.


In my experience this is just too big of a prompt to give it.

The best way to use it is to make everything as atomic as possible. Ask it for one function at a time, rather than "make my app handle user auth"
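Roughly what that looks like in practice - a minimal sketch assuming the current OpenAI Python SDK, with the model name and the prompt as placeholders:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Too broad: "make my app handle user auth" forces the model to invent
    # your architecture. Atomic: one function, explicit inputs and outputs.
    prompt = (
        "Write a Python function verify_password(plain: str, hashed: str) -> bool "
        "that checks a password against a bcrypt hash using the bcrypt library. "
        "Return only the function."
    )

    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whichever model you have access to
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp.choices[0].message.content)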


Of course one can dumb down the requirements to something it will handle, but what about "once basic auth is added, check which endpoints should require it and for which clients"? Any real work is out of scope currently.


This is why Prompt Engineering is a legitimate profession.

In the future, you could prompt GPT10 with "give me a marketing plan" and its output would be just as terrible as GPT3's.

Leveling up one's prompting skill from zero-shot to few-shot to agentic is how you get usable results.
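For what it's worth, the zero-shot vs. few-shot distinction is mostly about structure, not magic. A sketch in the chat-message format most APIs accept (the examples are made up):

    # Zero-shot: just the task.
    zero_shot = [
        {"role": "user", "content": "Classify the sentiment: 'The update broke my workflow.'"},
    ]

    # Few-shot: a couple of worked examples first, so the model imitates the pattern.
    few_shot = [
        {"role": "user", "content": "Classify the sentiment: 'Great release, setup took two minutes.'"},
        {"role": "assistant", "content": "positive"},
        {"role": "user", "content": "Classify the sentiment: 'Docs are outdated and the CLI crashes.'"},
        {"role": "assistant", "content": "negative"},
        {"role": "user", "content": "Classify the sentiment: 'The update broke my workflow.'"},
    ]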


I would love to see those code bases and the commit histories. We had code scaffolding and code generators well before LLMs. Just as we had autocomplete before LLMs.


This is probably one of the most significant impacts LLMs will have on SW development. Programming languages, frameworks, APIs, and runtimes will become less relevant to humans, and will probably be optimized for LLM use. DX is moving up the stack.


Ok, so you're a library developer and create a greenfield API.

What do you do in order for ChatGPT to be able to pick up your library and its patterns? The obstacles I see in this scenario:

* Base models take months and millions to train

* RLHF can supposedly add knowledge, but it's argued that it mostly just "changes style"

* What incentive will OpenAI have to include your particular library's documentation?

I imagine that if the library starts being really popular, a lot of other code will include examples of how to use it. What about before that?

Including new knowledge always lags by a few months (are there two GPT updates per year? maybe up to four, but not really significantly more), so what about a fast-moving, agile greenfield project? It could cause frustration for LLM users (I know I have already been bitten a lot by Python library changes).

It seems that it's just another tool in the box for humans to use. In the far, far future maybe, when we somehow get around those millions of dollars for fine-tuning (doubtful) and/or libraries simply stop changing.

But still, put any code base that isn't really small into a 120k-token context and see how easily both GPT and Claude Opus trip up on themselves. It's amazing when it works, but currently it's still a roll of the dice.


Did you use ChatGPT, Copilot, or any open-source model?


Vanilla ChatGPT, not even Cursor


> "What I know is that there is no evidence that we are going to get human level artificial intelligence any time soon"

Disagree with this statement - it's too broad. There are plenty of tasks where AIs can fully outperform humans, and that number seems to increase every week.

Then it depends what we mean by "human level" - the median human is different to the smartest person in the room.


A calculator can also outperform a human, and there are many other devices that have outperformed humans for a very long time. Does that mean it is human-level intelligence?


True, but that's also kind of the point - we have grown computers from adding numbers (1960s calculator capability), to beating humans at chess (1997), to beating humans at Jeopardy (2013), to breaking the Turing test (2020+), to LLMs that can write code (and test it themselves!), and even create art and music.

So IMO if we are saying that "there's no evidence that computers will have human level intelligence" we need to take a step back and define what that means - because if you were sitting in the 1960's and you defined it, a modern LLM may very well have already met that definition.


You can't define intelligence by how people in any one point in time would have responded to it. A calculator is AI to people of the 19th century, but not to us. It's only in the passing of time, if something that is thought to be intelligent still seems intelligent long after it is introduced, that we might have a contender.


A moving definition of intelligence isn't particularly useful though if we are trying to establish if something classifies as intelligent or not.

But I think we have at least moved from a state of computers that are less 'intelligent' than a fish to more 'intelligent' than a dog (assuming by intelligent we mean 'ability to solve problems, and to apply knowledge to novel situations' - i.e. while GPT-4 will make illegal chess moves, it would make fewer illegal moves than a well-trained dog).

We have moved that quickly from fish to dog, so we need to be careful to not let hubris make us think we are that much more special! Dog-intelligence to human-intelligence is a big leap, but maybe not as much of a leap as fish to dog? Or at least dog to ape and ape to human?


> to breaking the Turing test (2020+)

You mean 1966, right? https://en.wikipedia.org/wiki/ELIZA

> However, many early users were convinced of ELIZA's intelligence and understanding, despite Weizenbaum's insistence to the contrary.


"Does X look like Y" is always a continum. In test environments, players are judged human with these rates:

* Real humans 66%

* GPT-4: 49.7%

* ELIZA: 22%

* GPT-3.5: 20%

https://arxiv.org/pdf/2310.20216

(I'm rather surprised by ELIZA beating 3.5, as were the researchers).

Turing's original formulation of the test set the bar at a 70% chance of spotting the AI after 5 minutes.


The thing with AI though is that it is being trained to use tools, not to be a tool.


When they can outperform human infants in learning - e.g. in the data required to learn, and in versatility - we can talk business.

Not all world is "big data".


I suggest playing around with Gemini's 1- or 2-million-token context window. You'd be surprised how much you can get from an LLM when you can drop a few textbooks on the topic into the context window before asking it anything.

Mind you, this is also all early gen stuff.


On the flip side, AIs have a superhuman ability to consume knowledge that humans can't compete with.

This just seems like putting a human-constraint on AI systems so that we can still classify ourselves as special.


The number of tasks where AI (or any machine) can outperform humans is not necessarily proof of human-level intelligence. I could easily imagine machines replacing the majority of current human occupations well before reaching human-level intelligence.

“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” ― Edsger W. Dijkstra


The machine outperformed John Henry; does that make the machine human-level intelligent? Or, in the case of AI, are there simply enough human tasks that are easily accomplished by following a protocol? A lot of people say things like "it's so easy a baby could do it". Nothing in AI means humans aren't intelligent, or that machines now have human intelligence.


The proof is in the pudding: you cannot let AI do anything more complex than a relatively short prompt, and the back and forth is bad and full of potholes you need to check. It's like working with a person with multiple levels of knowledge/intelligence - sometimes 9/10, sometimes almost 0, and everywhere in between.


100%.

I was using it with my daughter to help her with math review for high school finals, and was presented by ChatGPT 4o with an impossible triangle as an example problem for her to solve.

We "told" ChatGPT the problem couldn't be solved with this kind of triangle and asked it to come up with other examples NOT using impossible triangles and it couldn't NOT come up with workable examples.


This is always the dividing line in the debate and the place where folks talk past each other. Under normal circumstances people expect “human level” intelligence to be coupled with human level autonomy; but task focused AI often produces super human performance on that specific task with very little autonomy.

Moreover, ML has historically been very brittle compared to the generalization humans provide. A machine that reads addresses might be 10x better than a person, but it may not be worth it if you can’t do things like say “we don’t even sort the blue envelopes—there’s a specific rule for those” like you could with a human. (The specific case here is irrelevant—humans can handle arbitrary variation on the rules infinitely better than a bare CNN.)

It’s this ability to cope with human-level ambiguity that has so much potential in LLMs. But even they don’t work like people—giving it a new rule could make it worse at all other tasks with no explanation as to why.


For me, it depends what is meant by "intelligence".

Even ignoring the people who insist it comes with baggage of consciousness or qualia.

AI can train on millions to billions of examples in a matter of months, as transistors outpace synapses by the degree to which marathon runners outpace continental drift… but AI also needs to do that well just to get up to the current level.

If AI could learn from as few examples as we require, Tesla would have achieved level 5 driving years ago — but as is, they've got millions of vehicles with years of real world experience and it's still merely "ok", not even "expert".

If we never learn how to make machines learn faster, then we'll have an economy of people whose jobs are essentially "give the machines examples to learn from"… and for some of those jobs, they could spend years providing those examples and still no machine will do as well.


The point I’ve seen raised that seems to escape many is that AI doesn’t need to be better than the best human to be profoundly disruptive - it only needs to be as good as a mediocre human carrying out a rote task to absolutely change the world.


ATM's killed bank tellers for the most part, people changed how they work in banks to move up the value chain. The same thing will happen.

Also, expect a backlash to AI taking human service jobs - like having to scan a QR code at a restaurant. It's a bad experience that shows you're enraptured by a technology and don't care about the customer.

I don't disagree that AI will be disruptive. I just think it's going to find a place in technical fields and more people will want to remain talking and interacting with people like we've done for the last 100K years.


> ATM's killed bank tellers for the most part,

Has it really? I haven't noticed any drastic decline over the years. Maybe some, but I wouldn't describe it as "killed".

What universal ATMs and cards have really killed off is personal checks, IMO.


There was something like an order of magnitude decline in the number of teller positions when banks rolled out ATM's. They've been around for so long that most people don't remember when they were uncommon.

But ask yourself this: when did you last go into a bank? I've done a home loan application on a smartphone.

Anyway, stuff is changing. There are always going to be jobs about. People are flexible and can learn things; AI ain't all that.


> it only needs to be as good as a mediocre human carrying out a rote task to absolutely change the world.

And "absolutely chang[ing] the world" is often assumed to be good, but that's not necessarily so. There are a lot of mediocre humans in the world, only capable of rote work. If we push automation to the maximum, eventually we'll reach a point were those people have nothing economically viable to do.

And then you reach a decision point: support them on welfare, forever? Immiserate them and hope they die off (what our current social framework will invariably choose)?


Capitalism requires consumers. Ergo, to keep the wheels turning, either everyone is going to have to become an artisan, or UBI is going to have to be a thing.


> Capitalism requires consumers.

I disagree with that premise. For certain businesses it's fundamentally true, and right now it's broadly true because of the limitations of our technology, but I don't think it's fundamentally true for capitalism.

If the AGI hype pans out, I think we'd see a transition into an economy with a lot fewer consumers. The capitalists would eventually amass all the valuable resources for themselves, and the positive externality that currently drives the consumer economy (the need for labor to make use of capital) would run down and eventually die. Then you'd have a small cadre of elites who own the automation and the resources and use them chiefly for vanity projects and maybe some B2B activity between themselves (e.g. electricity sales), a somewhat larger cadre of "middle class" providing luxury/vanity goods to the core elite (e.g. high-end prostitutes, Mars colonists), and a great mass of economically useless people living at the margins.

> Ergo, to keep the wheels turning, either everyone is going to have to become an artisan, or UBI is going to have to be a thing.

So the former won't work because artists can't feed themselves by trading art among themselves, and the latter's not necessary because an Elon Musk focused on personal vanity projects doesn't need consumers; he just needs resources and an AGI bureaucracy with AGI robots to command around.


Yes, but the likelihood of it having similar capabilities to a human just seems really low. If AI develops "cognitive" capabilities, it won't do so in any sort of human-like way, nor will it reach equilibrium at "mediocre human" levels of performance. It will rapidly evolve into something we can barely recognize because it's not constrained by evolutionary boundaries in the same way we are.


> It will rapidly evolve into something we can barely recognize because it’s not constrained by evolutionary boundaries in the same way we are

It is constrained by physical infrastructure like compute and memory, just like the brain which had to become physically bigger to get better.


“Human-level” means we can drop the bullshit qualifiers.

Human-level intelligence should adapt to many uses.

It may require teachers, but not prompt engineers.

The video's author is on very valid ground in the stance that this is far from intelligence.

Yes yes, you wouldn’t criticize a fish’s intelligence for not being able to climb a chessboard.

But these things are stupid as hell. They are not even close.


I'm fine with everything that can run locally, but honestly any product that adds AI on the server side will eventually want some sort of subscription from you, because it is not cheap by any means yet... So we will have every app on phone or desktop begging for money pretty soon, once the hype starts to shift the wrong way.


Does this guy really believe that GPT-4's answers are of the same quality as GPT-3.5? That's ridiculous.


I agree that GPT-4 is better, but the number of times it still hallucinates is way too high for it to be useful for any real work. LLMs at present are super impressive but warrant very, very low trust. (This is true for all models: Claude, Opus, etc.)


More than 60% of programmers are already using it and finding it useful. Boston Consulting Group conducted a trial and concluded that it increases the quality of their consultants' work while decreasing the time it takes to finish it. So no, you are not correct that hallucination makes it impossible to use.


> More than 60% of programmers are already using it

Firstly, there is absolutely no way anyone could know this. Secondly, no they aren't.


It's what surveys are showing. Why would they lie?


Hallucination, in my opinion, is mostly down to bad prompting.

Just as one can say, in response to a perfectly correct answer, "Explain what's wrong with your answer" and it will find something, one can either create a vacuum it feels compelled to fill with hallucinations or ask in a way that keeps it grounded.


It's pretty dependent on the questions you ask. Obviously when you push them to their limits, GPT-4 goes a lot further. But what about your average queries? I don't have the data to make a proper claim about that.


> It's pretty dependent on the questions you ask.

That’s the description of a house of cards rather than some revolutionary new technology.


That's what we have LMSYS for. And spoiler alert: it performs better than 3.5 no matter the complexity of the question.


And are you sure people ask the chatbot arena the same caliber of questions they ask a single model? Wouldn't you tend to ask more difficult questions to find where one fails but the other doesn't? I'm not convinced that the average query will get a significantly better result from GPT 4. I only know that is true for more difficult queries.


Do you have access to ChatGPT? You can try it yourself (you can query GPT-4o and GPT-3.5 afterwards). The difference is large enough to be noticeable at least 80% of the time in my experience.


Recently I got free 30 minute trial access to GPT4 (and was prompted to upgrade if I liked it).

I used the opportunity to send some queries which GPT-3.5 was bad at, and while GPT4 was better and closer to the correct answers, it still failed to produce them. When I pointed out the problems, it got even closer, but still ended up insisting that its (wrong) answer was correct in the end.

So, yeah, GPT4 is much better than GPT3.5, but it's got a long way to go before it becomes real AI.

Currently, I still use it for 3 cases:

1. When exploring something completely new. It saves time reading and learning because you can ask questions with "layman words" and get to learn the specific terms used in the domain quickly.

2. When Google search is swamped by SEO spam, or Google insists on ignoring some of the keywords.

3. When I need some boilerplate code that has been written a thousand times. It saves time which would otherwise get spent on concatenating different code segments from 5 different Stackoverflow threads and reading the docs afterwards to ensure it's correct. Now I just copy/paste from ChatGPT and read the docs afterwards to fix any places where it messed up.


You're doing it wrong if you are not feeding Chat the documentation first.

It's not as great as GPT-4o, but with Gemini and 1M tokens (2M now also available) you can dump in a small library of "writing JavaScript for X" books before asking it to produce code for X. It dramatically improves output.

ChatGPT can also be front-loaded with documents, but the context is much smaller (32K in the web app, I believe?).
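A minimal sketch of the front-loading idea, assuming a chat-style API and that the documents fit within the model's context window (the file name and helper are illustrative):

    # Prepend reference material to the conversation before asking for code,
    # so the model grounds its answer in the docs rather than its training data.
    def build_messages(doc_texts, question):
        context = "\n\n".join(doc_texts)
        return [
            {"role": "system", "content": "Answer using only the reference material provided."},
            {"role": "user", "content": "Reference material:\n" + context},
            {"role": "user", "content": question},
        ]

    messages = build_messages(
        doc_texts=[open("library_docs.md").read()],  # whatever fits in the context window
        question="Write a minimal example that initializes the client and fetches a record.",
    )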


> So, yeah, GPT4 is much better than GPT3.5, but it's got a long way to go before it becomes real AI.

I don't disagree. Just wanted to highlight the poor intuition of the guy doing the video.


Ironically - and I'm sure this will draw ire if Google takes it out of beta - I "watched" this video by asking the AI overview to summarize it, then asked it specific questions about viewpoints and opinions expressed in the video (I pay for premium and opted into the beta test of this last week).

So rather than watching the video for 22 minutes, I felt I was able to get the gist of it in 1 minute...and denied the creator a full view. Was it accurate? I skimmed the video too in order to see, and it seemed correct, but I am aware of the inaccuracy these models tend to have.

I don't think the hype is necessarily accurate in its specific predictions, but I do think it is accurate in the "things are going to change dramatically" way.


Generative AI is progressing at an extremely rapid rate. What many are experiencing though, is that every company wants to be perceived as being on the cutting edge. Being on the cutting edge gets you mindshare, free publicity, investment, stock price bumps, etc. This incentivizes all this bluster and false promises. If you just take a shallow look at everything, it would seem like an unending wave of this, and it's easy to get jaded. However right behind that is a not so slow march of people doing actual work, building non-trivial products, or using the technology to create tons of small, sometimes invisible, but cumulatively transformational, improvements to their products.


This weekend I caught up with some old friends. One of them works as a Registered Nurse (RN) and the other has been working in marketing for a small local clothing company for over a decade.

The RN was describing a new AI service that her practice is using. It is a plugin to their tele-medicine software that listens to an ongoing call and uses speech-to-text plus whatever AI magic sauce the product has to fill out a "call sheet". I don't know precisely what a call sheet is, but the RN said it was the worst part of her job and she was very happy that it would automate the paper-work that takes up a large part of her day.

The marketing manager laughed and said she used GPT multiple times a day. As in, she doesn't even send emails now without running the text through GPT. She then relayed an anecdote about how she had taken a picture of her fridge and asked GPT to identify the food available in there and to make a weekly meal plan for her family.

These anecdotes were just brought up spontaneously. That is, the RN was just excited and musing about a new huge time-saver that was making her work-life better. And the marketing manager, who has already completely integrated GPT into her work, was showing how she was now integrating GPT into her family life.

Even though I understand the "facts-based reasoning" advocated in this video, my personal experience is seeing excitement driven not by hype, but by non-tech people's current uses of the technology. The trend I see in the people I am talking to is that AI, even the weak LLM version we are seeing now, is being integrated into work life and personal life at an astonishing pace.


The model suggested adding glue to her meal plan then everyone clapped.

https://www.theonion.com/guy-who-sucks-at-being-a-person-see...


Am I the only one who thinks that ChatGPT-4o is worse in pretty much every respect than previous versions, and that the previous versions themselves seem to have been dramatically dumbed-down?

I'm on the verge of canceling my subscription altogether, as trying to get the damned thing to do what I ask is starting to take more time than just doing it by hand.

If anyone from OpenAI is reading this: I (and I suspect many others) would rather pay more money than use a shitty model.


Yes, I cancelled my subscription a while ago. I can't tell if it's getting objectively worse or if I just ran up against its limitations for software development. I spent so much time correcting code it had written, getting it to rewrite code, or trying to get concrete answers out of it that, after the novelty wore off, I didn't feel like I was saving as much time as I initially thought. I will try Claude now that it's available in my region. Overall it's still a very useful tool and I still use it, but I have stopped paying.


I found the part about the voice and image stuff not being amazing weird - I thought they still haven't released the new voice multimodal model to the apps. It seems to miss the point of the change: going from a pipeline where the task is delegated to a voice-to-text model, then the LLM, then a text-to-voice model, to a single voice/text multimodal model that can encode nuance in the audio into semantic space and let it influence token generation - and since the tokens are in the voice domain directly, it can encode generative nuances in the voice as well. This can include picking up on and generating inflection in the voice, etc. Maybe it doesn't address software engineering by AI, but IMO that's the absolute least interesting use case for AI on earth, despite being manically obsessed over by software engineers.


how could the hype be in control?

we have a lot of people with a lot of investments that benefit greatly from hyping the products, an investment gold rush

we simultaneously have an industry of people leveraging better jobs, job functions, job titles, by showing off their skills with this new technology making real things very quickly, and causing companies to have fomo and react without planning by overhiring and overpromoting

and then finally we have a third class of people just talking about this new technology for the benefits of proximity. if there's a rumor George Clooney is crashing a party, half of people will make sure they're the loudest voice for the perceived benefits of that proximity. and even if it's not accurate, it's effective

all of it's happening at the same time


I don't know. I struggle to recall anything as transformational as the recent advancements in AI. Talking to a computer and getting mostly correct answers is literally Star Trek level. I suspect we are getting at least single digit increases in global GDP now due to LLMs.


I think, the key word here is "mostly". If it doesn't do better, the transformation is really that we have managed to engineer reliability and repeatability out of machines, which have been their defining characteristic in the past.

So, will they do better? I personally think that knowledge is about implications. Notably, things that are similar do not necessarily have identical implications. And I can't see LLMs tackling this issue anytime soon. (I even doubt that LLMs provide a suitable approach to this at all.) So we'll probably end up with "we managed to get rid of reliability", which may be just the opposite of the role the computer plays in Star Trek scripts.

*) To provide a bit of context to reliability and repeatability: These are really externalization of the major generalizations of Husserl's life-world (Lebenswelt), as in "and so on" and "I can do this again". We are generally fine with machines, because they embody these principles in confined parameters. However, there's also a certain danger in this, as we tend to generalize and idealize along these lines, as this is essential to how we construct and navigate the world, we live in. We'll also apply this to things that are just "mostly", and are apt to be fooled by this, as may be the case with LLMs. (And in the past, machines were a great way out of this trap, as long – and as soon – as we were able to verify the parameters and boundaries.)


Remember when Cliff Stoll wrote a whole book about how the internet was over-hyped, Silicon Snake Oil? In 1995. We are in the same stage with AI.


What is human intelligence? Are all humans human-intelligent? How do I know a machine has gotten there? Seems like a pretty poorly defined goal.


Yeah, this is the question I always find myself asking after anything like this. Without establishing a definition of "intelligence", anything that follows is meaningless.

And "intelligence" can be defined. People just think of different things when you use the word unless you define it. Here is how I like to define intelligence: the intersection between knowledge and reason. An example of a system that is highly knowledgeable but lacks intelligence due to its inability to reason is Wikipedia. A system that is highly reasonable but lacks intelligence due to its lack of knowledge is a calculator.

It's obvious that language models encode more knowledge than humans do. Even a few GB model that I can download and run on my computer can spit out facts about random things. Way more than any human could. Sometimes they are wrong about facts, but so is any knowledge system, including humans.

An LLM's reasoning capability is a bit less clear-cut. When you have any understanding of how LLMs work, it seems like they should have no ability to do novel reasoning. Yet when you push them, they are often surprisingly good. I think that people who say the best LLMs have worse-than-human reasoning capabilities probably overestimate the average adult's reasoning capability. Plus, it is not hard at all to hook up a language model as a component in a greater intelligence system, by adding other components that specialize in reason - for example, an environment to run code in.
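A rough sketch of that kind of hybrid system - the LLM proposes code, and a plain interpreter (which is reliable in exactly the way the model is not) does the actual computation. ask_llm is a hypothetical stand-in for whatever model API you use:

    import subprocess
    import sys

    def ask_llm(prompt: str) -> str:
        # Hypothetical stand-in for a call to an LLM API.
        raise NotImplementedError

    def solve_with_code(question: str) -> str:
        # Ask the model to express its reasoning as a runnable program...
        code = ask_llm(f"Write a Python script that prints the answer to: {question}")
        # ...then delegate the computation to the interpreter.
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=10,
        )
        return result.stdout.strip()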


I will say that LLM reasoning is good because it's using information that has already been reasoned through. Whether that reasoned information is fact or fiction is another case altogether; since LLMs are good at regurgitating information, they are good at regurgitating reasoned information as well. I would expect reasoned regurgitation to inform me at best; I wouldn't expect something regurgitating to necessarily be able to reason about information it can't already regurgitate.


Wikipedia is using information that has already been reasoned also. That is just knowledge. I specifically called out "novel reasoning" because you can use examples from your personal life after the model was trained, or completely fabricate scenarios, and it will still be able to produce interesting and useful reasoning about it.

Again, it's not the best tool available for training and logic applications, but the fact that it really can demonstrate novel reasoning along with its obvious knowledge means that it clearly has some non-zero level of intelligence.


I know some dogs that don’t quite understand capitalism but can definitely reason in a novel way.

I think intelligence might not just be reasoning but also the ability to seek knowledge and reason. Some form of will to attain higher intelligence. Machines, in that regard, only have the human interest in furthering knowledge.


> I know some dogs that don’t quite understand capitalism but can definitely reason in a novel way

I would just describe this as intelligent but less than average human intelligence.

> I think intelligence might not just be reasoning but also the ability to seek knowledge and reason. Some form of will to attain higher intelligence.

This is a circular definition of intelligence that doesn't really make sense. How can intelligence be the drive for more intelligence? Those have to be two different things.

Consider a college educated person working in some field that requires lots of critical thinking. They have been doing their job for 30+ years, and they are very good at it. They are stubborn and set in their ways; not interested in changing anything. Now consider a 3 year old who is curious about everything. Who is more intelligent? The answer seems to obviously be the adult, even though they refuse to learn anything more.

I really don't think the willingness or ability to learn is a component of intelligence. A system can be statically intelligent or it can be dynamically intelligent. Dynamically intelligent systems are more exciting and novel, but statically intelligent systems can still be extremely useful.

Ability to learn is more interesting in the context of general intelligence (also note general intelligence implies the existence of non-general or domain specific intelligence). In order to be able to act intelligently in any situation (such as: what if my computer suddenly grows legs) it must be able to learn through experimentation.


So, what I gather from what you're saying is that a will to learn has nothing to do with intelligence. But then you go on to state that it revolves around general or domain-specific intelligence. Then what is general intelligence? You're moving the goalposts back into undefined terms.

I am not stating that intelligence is the drive for more intelligence. I am saying that intelligence is the act of collecting more knowledge. Setting bars like:

> I would just describe this as intelligent but less than average human intelligence.

I know some people who appear dumber than dogs because they believe after schooling is complete they have everything they need to operate in life. Are these people below human intelligence?

Intelligence is restricted by your exposure to knowledge. All knowledge can be aggregated into a domain. You could state that "general intelligence" is a primary school education, but not all primary schools deliver the same education. So what is it?


Come on, we all know “human intelligence” is defined as “qualifying for a boat loan”.

https://morbotron.com/caption/S02E14/267783


It’s ridiculous. Everyone is investing in Newcomen’s Engine for the Raising of Water by Fire - but it’s just a flash in the pan, and nothing can be accomplished by steam that can’t be accomplished by a good horse.


instead, invest in our AH (Artificial Horse), which requires ingesting the behaviors of 100,000,000,000 real horses to approximate how they function! Ignore the mutant hooves and walking backwards, we can fix this if we just 10x the horse ingestion every year. It approximates a horse better every time!


Nonsense! Just look at how successful tulip bulbs and cryptocurrency were!


tl;dw

1. He reminds viewers of deliberate attempts to take advantage of known human gullibility toward so-called "AI" ("Eliza Effect"), using dark patterns.^FN1

2. He cites various instances where so-called "tech" companies have lied about "AI" products and/or faked "AI" demos.

3. The creator of the video said he has included some sixty links to sources. He asserts that comments about "AI" that do not cite sources are unpersuasive. (That will not stop HN commenters from sharing their unsupported opinions.)

FN1. He tells viewers that "cute" is a dark pattern, e.g., a giggling "AI" voice. Long have I wondered about all the silly non-descriptive or misleading names chosen for contemporary software and for so-called "tech" companies, as well as the silly graphics and mascots. What is their purpose? For example, is the reason for the silly company names chosen by so-called "tech" companies more than just avoiding trademark disputes? (Yes.) If software developers and so-called "tech" companies intentionally use these tactics to trick people into thinking or doing something that is against those people's interests, then arguably these could be dark patterns. Obvious example of a misleading name: "OpenAI" is not open, and that may have serious consequences, but it's unlikely the company will be changing its name any time soon. The video creator mentions that the SEC has stated it is cracking down on "AI washing", where companies add "AI" to names or otherwise use terminology to trick people into believing "AI" is used when in fact it is not.

In sum, the video is about the Silicon Valley "culture of lying". Perhaps ironic that the "AI" being pitched by SillyCon Valley today has no concept of a "lie". Despite the endless anthropomorphism, a computer running "AI" has no concepts whatsoever. Concepts come from people, not computers.


Everyone with cash to spare (CSPs and consumer-internet monopolies) and fomo will have their GPUs soon.

Who is left ?


It’s because we don’t need Artificial we need Apple Intelligence


* red alert *

narrative shift triggered. needs to be combated with more hype. VC money is at stake. co-pilots are threatened.

sora-hype is exhausted. 4o-hype is tired.

call-to-action: "thought-leaders" like a16z to put out giant "architecture of ai" articles to keep the fomo alive.



