
Growing among whom? The more I learn about and use LLMs, the more convinced I am that we're in a local maximum, and that the only way they're going to improve is by getting smaller and cheaper to run. They're still terrible at logical reasoning.

We're going to get some super cool and some super dystopian stuff out of them but LLMs are never going to go into a recursive loop of self-improvement and become machine gods.




> The more I learn about and use LLMs the more convinced I am we're in a local maximum

Not sure why you would believe that.

Inside view: qualitative improvements LLMs made at scale took everyone by surprise; I don't think anyone understands them enough to make a convincing argument that LLMs have exhausted their potential.

Outside view: what local maximum? Wake me up when someone else makes an LLM comparable in performance to GPT-4. Right now, there is no local maximum. There's one model far ahead of the rest, and that model is actually below its peak performance - a side effect of OpenAI lobotomizing it with aggressive RLHF. The only thing remotely suggesting we shouldn't expect further improvements is... OpenAI saying they kinda want to try some other things, and (pinky swear!) aren't training GPT-4's successor.

> and the only way they're going to improve is by getting smaller and cheaper to run.

Meaning they'll be easier to chain. The next big leap could in fact be a bunch of compressed, power-efficient LLMs talking to each other. Possibly even managing their own deployment.

> They're still terrible at logical reasoning.

So is your unconscious / system 1 / gut feel. LLMs are less like one's whole mind, and much more like one's "inner voice". Logical skills aren't automatic, they're algorithmic. Who knows what the limit is of a design in which an LLM as "system 1" operates a much larger, symbolic, algorithmic suite of "system 2" software? We're barely scratching the surface here.
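To make the shape of that design concrete, here's a toy sketch (all names hypothetical, and the "LLM" is just a stub): a fallible generator proposes candidate answers, system-1 style, while a deterministic symbolic checker plays system 2 and only accepts answers that actually verify.

```python
# Toy sketch of "fallible system 1 proposes, exact system 2 verifies".
# noisy_guesser stands in for an LLM: plausible but unreliable output.

def noisy_guesser(question):
    """Stand-in for an LLM: yields candidate answers, most of them wrong."""
    yield from range(20)  # guess small integers, in no principled order

def symbolic_check(x):
    """The 'system 2' side: an exact, algorithmic test of a candidate."""
    return x + 3 == 10

def solve(question):
    """Loop: keep asking system 1 for guesses until system 2 signs off."""
    for guess in noisy_guesser(question):
        if symbolic_check(guess):
            return guess
    return None

print(solve("what x satisfies x + 3 == 10"))  # prints 7
```

The point of the split is that the generator never has to be right, only plausible; correctness lives entirely in the checker.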


>They're still terrible at logical reasoning.

Two years ago, a machine that understands natural language and is capable of arbitrary, free-form logic or problem solving was pure science fiction. I'm baffled by this kind of dismissal tbh.

>but LLMs are never going to go into a recursive loop of self-improvement

never is a long time.


Two years ago we already had GPT-2, which was capable of some problem solving and logic following. It was archaic, sure, and it produced a lot of gibberish, yes, but if you followed OpenAI releases closely, you wouldn't think that something like GPT-3.5 was "pure science fiction"; it would just look like the inevitable evolution of GPT-2 in a couple of years given the right conditions.


That's pedantic. Switch 2 years to 5 years and the point still stands.


No it isn't. Even before transformers, people were doing cool things with LSTMs, and with RNNs before that. People following this space haven't really been surprised by any of these advancements. It's a straightforward path imo.


In hindsight it’s an obvious evolution, but in practice vanishingly few people saw it coming.


Few people saw it coming in just two years, sure. But most people following this space were already expecting a big evolution like the one we saw in 5-ish years.

For example, take this thread: https://news.ycombinator.com/item?id=21717022

It's a text RPG game built on top of GPT-2 that could follow arbitrary instructions. It was a full project with custom training for something that you can get with a single prompt on ChatGPT nowadays, but it clearly showcased what LLMs were capable of and things we take for granted now. It was clear, back then, that at some point ChatGPT would happen.


The more I use LLMs, the more I agree with this viewpoint.

They’re text generators that can generate compelling content because they’re so good at generating text.

I don’t think AGI will arise from a text generator.


My thoughts exactly. It's hard to see the signal among all the noise surrounding LLMs. Even if they say they're gonna hurt you, they have no idea what it means to hurt, what "you" is, or how they would achieve that goal. They just spit out things that resemble what people have said online. There's no harm from a language model that's literally a "language" model.


You appear to be ignoring a few thousand years of recorded history around what happens when a demagogue gets a megaphone. Human-powered astroturf campaigns were all it took to get randoms convinced lizard people are an existential threat and then -act- on that belief.


I think I'm just going to build and open source some really next gen astroturf software that learns continuously as it debates people online in order to get better at changing people's minds. I'll make sure to include documentation in Russian, Chinese and Corporate American English.

What would a good name be? TurfChain?

I'm serious. People don't believe this risk is real. They keep hiding it behind some nameless, faceless 'bad actor', so let's just make it real.

I don't need to use it. I'll just release it as a research project.


I just don’t see how it’s going to be significantly worse than existing troll farms etc. This prediction appears significantly overblown to me.


Does it really? You thinking LLM-powered propaganda distribution services can't out-scale existing troll farms? Or do a better job of evading spam filters?


No I’m thinking that scaling trolls up has diminishing returns and we’re already peak troll.


Any evidence or sources for that? I just don't know how that would be knowable to any of us.

Yuval Noah Harari gave a great talk the other day on the potential threat to democracy from the current state of the technology - https://youtu.be/LWiM-LuRe6w


Only time will tell.


It's not like there isn't a market waiting impatiently for the product...


It's definitely not something I would attempt to productize and profit off of. I'm virtually certain someone will, and I'm sure that capability is being worked on as we speak, since we already know this type of thing occurs at scale.

My motivation would simply be to shine a light on it. Make it real for people, so we have things to talk about other than just the hypotheticals. It's the kind of tooling that, if you're seriously motivated to employ it, you'd probably prefer remain secret or undetected, at least until after it had done its work for you. I worry that the 2024 US election will be the real litmus test for these things. All things considered, it'd be a shame if we go through another Cambridge Analytica moment that in hindsight we really ought to have seen coming.

Some people have their doubts, and I understand that. These issues are so complex that no one individual can hope to have an accurate mental model of the world that is going to serve them reliably again and again. We're all going to continue to be surprised as events unfold, and the degree to which we are surprised indicates the degree to which our mental models were lacking and got updated. That to me is why I'm erring on the side of pessimism and caution.


So the LLM demagogue is going to get people to create gray goo or make a lot of paper clips?


A language model can do many things based on language instructions, some harmless, some harmful. They are both instructable and teachable. Depending on the prompt, they are not just harmless LLMs.


> They're still terrible at logical reasoning.

Are they even trying to be good at that? Serious question; using an LLM as a logical processor is as wasteful and about as well-suited as using the Great Pyramid of Giza as an AirBnB.

I've not tried this, but I suspect the better approach is more like asking the LLM to write a Coq script for the scenario, instead of trying to get it to solve the logic directly.
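A minimal sketch of that division of labor, with no theorem prover involved (the "LLM output" here is a hard-coded, hypothetical encoding): instead of asking the model for the conclusion, have it emit a machine-checkable formula, then verify the formula exactly by brute force over truth assignments.

```python
# Sketch: the model translates "A or B; not A; therefore B" into a
# constraint function (this encoding is hypothetical, standing in for
# LLM output), and an exact checker does the actual logic.

from itertools import product

def premises(A, B):
    """Premises as emitted by the (hypothetical) LLM translation step."""
    return (A or B) and (not A)

# Enumerate every assignment satisfying the premises...
models = [(A, B) for A, B in product([False, True], repeat=2)
          if premises(A, B)]

# ...and check that the conclusion B holds in all of them,
# i.e. the conclusion logically follows from the premises.
conclusion_valid = all(B for A, B in models)
print(conclusion_valid)  # prints True
```

Brute force only works for tiny formulas, of course; the point is that the model's job shrinks to translation, and the checking is done by something that cannot hallucinate.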


> using the Great Pyramid of Giza as an AirBnB

, were you allowed to do it, would be an extremely profitable venture. Taj Mahal too, and yes, I know it's a mausoleum.


I can see the reviews in my head already:

1 star: No WiFi, no windows, no hot water

1 star: dusty

1 star: aliens didn't abduct me :(

5 stars: lots of storage room for my luggage

4 stars: service good, but had weird dream about a furry weighing my soul against a feather

1 star: aliens did abduct me :(

2 stars: nice views, but smells of camel


Indeed, AI reinforcement-learning to deal with formal verification is what I'm looking forward to the most. Unfortunately it seems a very niche endeavour at the moment.


I was looking at the A100 80GB cards: $14k a pop. We're gonna see another GPU shortage when these models become less resource-dependent, just like the crypto era.



