OpenAI, Google DeepMind's current and former employees warn about AI risks (reuters.com)
20 points by yaya_yazi 3 months ago | 6 comments



Personally, I think the biggest risk of AI is that it doesn’t live up to expectations (in the short term).


I feel that it's really hard to get an intuition on where all this is going. On the one hand you've got sci-fi-like OpenAI demos, and their tool is definitely a productivity booster which I use often; on the other hand it still fails mysteriously at some elementary tasks.

And this has been a recurring observation about neural networks for the past 40 years, which makes me think there are some inherent blockers in the fundamentals of the tech that we haven't solved yet.


I think there's a chance that AI capabilities have already plateaued. Why else would OpenAI make their latest chatbot free? They need more data to train it! The same goes for the big tech companies making agreements with newspapers and book publishers. None of this is a good sign, in my opinion. I recently found this old article which speaks to what you're saying: why can't a neural network learn the rules of the Game of Life? The input is clean (pixels), so why isn't the output clean? https://bdtechtalks.com/2020/09/16/deep-learning-game-of-lif...
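
For what it's worth, the function that article asks a network to learn is tiny, which makes the failure all the more puzzling. A rough sketch of one Game of Life step (assuming numpy and scipy are available):

    # One Game of Life step: a neighbor-count convolution plus a threshold rule.
    import numpy as np
    from scipy.signal import convolve2d

    KERNEL = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])

    def life_step(grid):
        """Advance a binary grid one Game of Life step."""
        neighbors = convolve2d(grid, KERNEL, mode="same", boundary="wrap")
        return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(int)

    # Example: a glider on a 6x6 board
    board = np.zeros((6, 6), dtype=int)
    board[1, 2] = board[2, 3] = board[3, 1] = board[3, 2] = board[3, 3] = 1
    print(life_step(board))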


It’s actually not that mysterious. Deep learning is curve-fitting. The whole premise of it is to approximate data and provide a function to interpolate between the sampled points. This is a very static end product, nothing like the dynamism of actual intelligence.

If your input is sufficiently similar to enough training data, then your output is going to be good. If it isn’t, then it’s a crap shoot.
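
A toy illustration of that point (a sketch assuming scikit-learn and numpy; the target function and layer sizes are arbitrary):

    # Fit a small MLP on sin(x) sampled from [-pi, pi]. Inside the training
    # range it interpolates well; outside it, errors typically blow up.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    x_train = rng.uniform(-np.pi, np.pi, size=(500, 1))
    y_train = np.sin(x_train).ravel()

    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    model.fit(x_train, y_train)

    x_inside = np.array([[0.5], [1.0], [-2.0]])    # similar to the training data
    x_outside = np.array([[5.0], [8.0], [-6.0]])   # outside the training range

    print("inside :", np.abs(model.predict(x_inside) - np.sin(x_inside).ravel()))
    print("outside:", np.abs(model.predict(x_outside) - np.sin(x_outside).ravel()))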


There are a lot more failure modes specific to LLMs that derive from their auto-regressive nature. "Enough training data" isn't enough; the training data also needs to include lots of direction on when and how to hedge outputs so the model doesn't dig itself into a hole.

Example query: "list 5 songs where the lyrics start with 'hey' but the title doesn't"

It will confidently hallucinate answers where the lyrics do start with "hey", but so does the song title. But if you tell it to first output the lyric and then the song title, it will correctly check that both conditions hold before claiming a match. "Sufficiently similar training data" wouldn't help in this case, or at least not without making the training data so exhaustive as to be impractical.

This is essentially another kind of CoT prompting, which helps with these failure modes. It seems difficult to train the models themselves to recognize when they need a strategy like this to work around such issues (as opposed to prompting them to use one).
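
Roughly what that restructuring looks like in code (a sketch assuming the openai Python SDK, v1+; the model name is just a placeholder):

    from openai import OpenAI

    client = OpenAI()

    naive_prompt = 'List 5 songs where the lyrics start with "hey" but the title doesn\'t.'

    # Forcing the model to emit the evidence (the opening lyric) before the
    # claim (the title) acts as a lightweight chain-of-thought check.
    structured_prompt = (
        'List 5 songs where the lyrics start with "hey" but the title doesn\'t. '
        'For each candidate, first quote the opening lyric, then give the title, '
        'and only keep it if the lyric starts with "hey" and the title does not.'
    )

    for prompt in (naive_prompt, structured_prompt):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        print(resp.choices[0].message.content, "\n---")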


"Regulatory capturists drum up support for regulatory capture"




