
> LLMs currently statistically regurgitate existing data.

NO! They do not.

Deep learning models are "universal approximators". A two-layer neural network with enough parameters, data and training is a universal approximator. That means it can learn ANY relationship to arbitrary accuracy.

Going beyond two layers, with deeper stacks, architectures structured for the problem domain, and recurrent connections, they become far more efficient and effective.
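
To make the "universal approximator" point concrete, here is a minimal sketch (not from the thread; the names and hyperparameters are my own illustrative choices): a one-hidden-layer tanh network in plain numpy, trained only on samples of sin(x), ends up predicting the function at inputs it never saw rather than replaying the training points.

    import numpy as np

    # toy training data: samples of the relationship y = sin(x)
    rng = np.random.default_rng(0)
    x_train = rng.uniform(-np.pi, np.pi, size=(256, 1))
    y_train = np.sin(x_train)

    # one hidden layer of tanh units (a "two-layer" network)
    hidden = 64
    W1 = rng.normal(0.0, 1.0, (1, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, 1)); b2 = np.zeros(1)

    lr = 0.1
    for step in range(5000):
        h = np.tanh(x_train @ W1 + b1)                    # hidden activations
        pred = h @ W2 + b2                                 # network output
        grad_pred = 2 * (pred - y_train) / len(x_train)    # d(MSE)/d(pred)
        # backpropagate the mean-squared-error gradient
        gW2 = h.T @ grad_pred;   gb2 = grad_pred.sum(axis=0)
        grad_h = (grad_pred @ W2.T) * (1 - h ** 2)
        gW1 = x_train.T @ grad_h; gb1 = grad_h.sum(axis=0)
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

    # evaluate on inputs that were not in the training set
    x_test = np.linspace(-np.pi, np.pi, 7).reshape(-1, 1)
    h = np.tanh(x_test @ W1 + b1)
    print(np.hstack([x_test, h @ W2 + b2, np.sin(x_test)]))

The point of the toy example: the fitted weights encode the shape of the relationship, not a lookup table of the training samples.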

So yes, they learn associations, correlations, stochastic models, statistics.

But they also learn to model functional relationships. Which is why they are able to generalize relationships to new situations, and combine previously unrelated relationships in reasonable and surprising ways.

A large part of creativity is putting together previously unrelated concepts and then letting the obvious logic of those relationships combine to result in something new and unexpected.

Note that both combining normally unrelated things, and combining the concepts in some way more or less consistent with what those concepts normally mean, are well within the grasp of current models.

They haven't outclassed our best thinkers, or even our best individual thinking, yet. They are still very limited on problems that require many steps to think through.

But they are definitely, within their limits, being creative.

And they are far, far, FAR from just being statistical parrots.




> and combine previously unrelated relationships in reasonable and surprising ways.

We've yet to see those surprising ways despite all the claims.

Note: what they do already is amazing and surprising in itself (such as "write me a song about quantum physics suitable for a 5 year old"). It's still very much shy of "hey there's this new thing previously unthought of".


> We've yet to see those surprising ways despite all the claims.

This is precisely the reason everyone is finding them fascinating.

Perhaps you find them boring. Rote. Or something. But the reason non-technical people, as well as technical people, are enjoying and learning by interacting with chat and other models is how often the results are interesting.

I asked ChatGPT-4 to create a Dr. Seuss story about the Cat in the Hat and my green conure parrot Teansy that involved sewing and Italy. It produced a wonderful story of how they met in Italy, became friends, encountered a homeless child with a threadbare blanket, and helped the child. They then began helping others and ended up creating a fashion design studio.

All written in Dr. Seuss prose that made for a perfect children's book.

Pretty creative.

I then asked GPT to continue the story, but as a James Bond novel where one of Teansy's mysterious clients was actually a criminal using the fashion industry to hide his nefarious practices, and that Teansy should help James Bond solve the case.

For that I got another great story, completely consistent with James Bond tropes. It came up with a story line where the fashion industry was used to launder blood diamonds, which I thought was brilliant. A perfectly good rationale for a James Bond villain. The story was great.

Throughout, Chat threw in funny, suitable mentions of Teansy's fashion focus, including feather-lined wear, etc.

And all this creativity in a first draft written as fast as I could read it.

A year ago, nothing on the planet but a whimsical human with too much time on their hands (more time than it took Chat) could do this.

--

Obviously, we are discovering Chat can perform far more complex behaviors.

Act as any agent we describe, including computer systems or the internet. Respond quickly to feedback. Form plans. Learn and summarize the grammar of small artificial languages fairly well just from examples, ...

Without interacting with these models we would never have declared these were expected behaviors.

So I don't know on what basis the emergence of these behaviors isn't surprising. Hoped for, envisioned, sure. But hardly an expression of obviously predetermined, designed-in capabilities.


This is all interpolation between existing concepts. It is not a counterexample.


> This is all interpolation between existing concepts.

Interpolating sounds like a simple task.

But whether it is depends entirely on the data. Simple data will result in a simple interpolating model.

But complex data requires complex relationships to be learned.

Calling a complex model just an interpolator is like saying human beings are just another bag of atoms doing what atoms do. Technically correct, but missing the significance of humans.


It also isn’t really clear to me that humans aren’t also interpolating between complex existing concepts when we come up with novel thoughts or ideas. Our minds are complex, our pre-existing knowledge base is complex. It’s impossible to know if our unique thoughts aren’t really some complex amalgamation of other thoughts we already have in there somewhere, perhaps a mashup of seemingly unrelated thoughts that just happen to lie closely in the multidimensional space of ideas to the thing we are thinking about. Sounds potentially similar to a complex LLM then, really.



