An understanding of AI’s limitations is starting to sink in

mattlondon · on June 14, 2020

The trouble is that people have been sold this idea that ML/AI can do amazing things, without properly being told that really the things it can do are quite narrowly-scoped. They've been sold the Star Trek computer idea.

For example, years ago I was working on a prototype/proof-of-concept thing for instrumenting industrial machinery with stick-on small computers. Simple stuff - attach accelerators, temperature, humidity etc sensors to existing machines and collect the data and send it back.

The management thought we'd be able to apply machine learning on the data to get "business insights" from the all-powerful machine. They didn't know what these insights might be, just that ML/AI would generate them and therefore make the business a fuck-ton of money because AI generated novel new "business insights" that no one had thought of before and so transform the business. They thought it was just a magic box that would generate unbounded magic answers for their needs by passing in just some temperature and humidity readings or whatever, and then it would tell them they need to make more brown bread and less bagels in the North East region etc.

In reality, as I understand it, currently ML/AI requires us to know what the possible answers can be before we even begin training the network. So the classic example is it needs to know that the possible MNIST digits are 0-9, or that you are looking for one of 100 image classes etc.

You cant train a network with the MNIST digits, and then have that network tell you what shares to buy or sell.

Sure you can lop off the final layer and repurpose some of the middle layers, but you still need to train it to classify the inputs into categories you define up front. It won't give you a novel answer that you have not trained it for.

... at least that is how I understand it. Things may have changed over the past 5 years or so.

That said, I do agree there have been some cool things lately like machine vision etc. I don't think it will be that huge an industry though - it feels like a lot of it is largely just commoditised now (which is good) and it will be just like any other library you pick - like picking a UI framework for a web app. Just pick up a pre-trained network from modelzoo and get on with your real business requirements for 99% of people using ML, while the other 1% (at FAANGs et al) and academia churn out new models.

TrackerFF · on June 15, 2020

That's actually something ML is incredibly useful at, when it comes to machines with sensors - failure prediction / anomaly detection, etc.

In the industry, (preventive) maintenance takes up a pretty huge chunk of resources. It's something techs need to do often, and it's often a laborious task, but it's obviously done to reduce downtime.

So the business insight, as they like to call it, is to reduce costs tied up to repairs and maintenance.

All critical applications have multiple levels of redundancy, so that a complete breakdown is very unlikely, but it's still a very expensive process if you're dealing with contractors. If you can get techs to swap out parts before the whole unit goes to sh!t, then that's often going to be a much cheaper alternative.

But, in the end, it comes down to the quality of data, and the models being built. A lot of industrial businesses hire ML / AI engineers for this task alone, but expect some magic black-box that will warn x days / hours / minutes ahead that a machine/part is about to break down, and it's time to get it fixed. And they unfortunately expect a near-perfect accuracy, because someone in sales assured them that this is the future, and the future is now.

nmyk · on June 15, 2020

Yep, you and the user you're replying to are both right in different ways. One thing's for sure - machines don't generate "insights" on their own.

Let's define an "insight" as "new meaningful knowledge", just for fun. We could talk about what comprises "new" and "meaningful" but it would be beside the point I'm making.

In a supervised learning problem, the range of possible outputs is already known, meaning the model output will never be categorically different from what was in the training data. The knowledge obtained is meaningful as long as the training labels are meaningful, but it can never be new.

Unsupervised learning doesn't have a notion of "training data" but that means an unsupervised model's output requires additional interpretation in order to be meaningful. It is possible to uncover new structures and identify anomalies in new ways, but this knowledge isn't meaningful until someone comes in and interprets it.

Applied to the specific example where sensor data is used to try to generate insights about machine functionality: Either you can only predict the types of failures you've already seen, or you can identify states you've never seen but you wouldn't know whether they mean the system is likely to fail soon or not.

It's the Roth/401(k) tradeoff. For model output to be useful, someone must pay an interpretation tax. The only choice is whether it is paid upon insight deposit or withdrawal.

TrackerFF · on June 15, 2020

> Either you can only predict the types of failures you've already seen, or you can identify states you've never seen but you wouldn't know whether they mean the system is likely to fail soon or not.

Yup, this is something I've seen from both sides. First you mention is basically the standard, while the last is part of the deep learning voodoo black magic that executives and sales love.

I've had people approach me with proposals like "What if we just churn [ALL OF] our data through this or that model, and let's see if it comes up with some patterns we've never seen or thought about"

And that's not just for industrial applications. It's everywhere.

What is concerning to me is that this mentality will surely induce more unrealistic expectations. Before you know it, business execs are starting to ask why we need business analysts at all, because surely those fancy deep neural networks can extract all kinds of features - "only need data scientists to figure out those things".

So yeah, that's my fear. That businesses will blindly start to discard domain knowledge, and just feed black-box models their data, and let the data scientists wrestle with the results.

antipaul · on June 15, 2020

It’s not only outputs/labels that provide “insights”.

Knowing how the outputs relate to the inputs is where most new insights could come from.

For example, what feature (input) is driving the “failure” of the machine (output/prediction)?

This is where ML explainability comes in.

Veedrac · on June 15, 2020

This is demonstrably false; AlphaGo made significant new discoveries, for example.

nmyk · on June 15, 2020

Yeah this is where it would have helped if I had discussed what I meant by "new".

AlphaGo is a supervised learner that outputs optimal Go moves given opposing play. It yields new discoveries in the same sense that a model designed to predict mechanical failures from labeled sensor data would: I didn't know what the model was going to predict until it predicted it, and now I know.

But what the factory owners want is a machine that can take raw, unlabeled sensor data and predict mechanical failures from that. They want insights. "Why not just feed all our data into the model and just see what comes out?" they ask. "I don't see why we need to hire at all if we have this neural net."

The reason you need a human somewhere in the system if you want insights is because someone needed to program AlphaGo specifically to try to win at Go. At the factory, someone needs to tell the machine what a mechanical failure is, in terms of the data, before it can successfully predict them.

Then, neither "winning at Go" nor "mechanical failure" are states that the system hasn't already been programmed to recognize. That's what I mean when I say a supervised learner cannot generate "new" output.

shadowmint · on June 15, 2020

If the business wanted to track the rate of failures and create predictive models about when things fail, or detect anomalous behaviour, that's what they would have set out with as the goal, and, perhaps, some ML model might have helped, but probably, it would've been too unreliable and any number of standard predictive models with well known characteristics would have been used instead.

That's not what they wanted.

What people are being sold is AI/ML as a magic bullet that will do something useful regardless of the situation, and it lets business people avoid making decisions about what they actually want, because AI/ML can be anything, so they just signup for it and expect to get 20 things they didn't know they wanted handed to them on a plate.

Turn out, it's not enough to just collect a bunch of data and wave your magic wand at it. It wasn't with web analytics 10 years ago, it's still not.

What you actually need is someone who has a bunch of tricks up their sleeve, and has done this before, and can suggest a bunch of Business Insights the business might need before they start building anything, people that actually decide what to do, and actions taken to investigate, and solve those problems.

I mean, to some degree you're right; perhaps ML models could be useful for tracking hardware failures, but that's not what the parent post is talking about. The previous post was talking about just collecting the data and expecting the predictive failure models to just jump out magically.

That doesn't happen; it needs a person to have the insight that the data could be used for such a thing, and that needs to happen before you go and randomly collect all the wrong frigging metrics.

...but hiring experts is expensive, and making decisions is hard. So ML/AI is sold like snake-oil to managers who want to avoid both of those things. :)

Retric · on June 15, 2020

Projects rarely end up doing what was planned when they started. As long as ML is solving real problems in practice, upper management will keep treating it as magic fairy dust to sprinkle around aimlessly.

It's all about how you package things. ML connected to an audio sensor could predict failure modes that are dificult to detect otherwise. Now that might not be was was asked for, but a win is a win.

bumby · on June 15, 2020

I think both your post and the one you are responding to are correct.

I’ve experienced what the OP was alluding to...namely, it helps tremendously to start out with an understanding of the problem you’re trying to solve, more so in supervised learning. It’s incredibly frustrating to ask managers what business problem they are trying to solve only to be met with “We don’t know, that’s what we want the software to tell us.”

On the other hand, if they say “we want to predict machine failures” or “reduce maintenance downtime” now we have a lens in which to view the data.

If AI could do the magic as those managers hoped, they would be out of a job

sevensor · on June 15, 2020

I work in this space, and you've called out the thing that drives me absolutely bonkers. No matter how accurately I can diagnose the state of the equipment, what everybody wants is to know how long until it explodes, so that they can take it down for maintenance ten minutes before that.

YeGoblynQueenne · on June 15, 2020

This is a good explanation of how things work and things haven't changed much in the past 5 years. What changes the most is the amount of data and computing power ("compute") that are used in training neural nets.

Architectures are also tweaked slightly although they continue to be largely based on the two architectures that started the current deep neural net boom, Long-Short Term Memory Networks (LSTMs) and Convolutional Neural Networks (CNNs), which are largely used for natural language processing and machine vision respectively.

Every once in a while there is a bit of excitement about results that come out of adding a new parameter which is then given a fancy name like "attention" and generates much breathless copy about how neural nets will soon become as intelligent as a dog/ your three year old/ your personal assitant etc. Usually though it's not clear whether these new parameters or architectural tweaks are really responsible for the success of the proposed techniques and when one looks at the field as a whole it becomes more and more apparent that the most successful work is backed by the most data and the most computing power.

To slightly er tweak your prediction, I think what'll happen is that, indeed, like you say, there will be libraries and APIs and so on (well, there already are) but all these will be controlled by large companies that can afford the ground work. So, I don't expect small outfits, research teams at universities or tiny startups, to make a big difference. Academia is fast losing the ability to produce work that beats the state-of-the-art anyway (that's from personal comms with other researches). So in terms of academia perhaps we should expect a shift away from neural nets, to something that's easier to do research on.

rpastuszak · on June 15, 2020

> but all these will be controlled by large companies that can afford the ground work. So, I don't expect small outfits, research teams at universities or tiny startups, to make a big difference. Academia is fast losing the ability to produce work that beats the state-of-the-art anyway (that's from personal comms with other researches).

That makes me particularly sad. In a utilitarian sense, the ideal situation would involve new approaches being democratised. Instead, we're ending up with walled gardens.

The current situation isn't dissimilar to what has happened to the Web Platform during past 5-10 years. And, similar predictions are made wrt VR and AR (both seem to attract different types of investors).

dr_dshiv · on June 15, 2020

To look at a beautiful reframing of AI as a tool and not as a quasi-other agent, please enjoy this page: https://nooscope.ai/

HarHarVeryFunny · on June 15, 2020

The best way to think of today's AI capability is: automation.

You train it with a lots of examples of "given this input, this is the output i want", and hopefully it learns to get the "correct" (similar input => similar output) output for new inputs that you feed it. i.e. you've now automated the process of figuring the correct output for a given input.

There is also the "reinforcement learning" AI paradigm where the trained AI is choosing actions (from a given repertoire) in order to maximize an action outcome score based on some scoring function you provided. This is appropriate in a situation where you want the "AI" to do something more than just select the correct output, but again no magic - you're having to anticipate and score the potential outcomes.

dumb1224 · on June 15, 2020

In my opinion this type of tasks is the typical AI tasks that get most exposed to the general public. The AI/ML model is set to attempt a human task. The aspiration is to get as good as a human (or faster/better more precision etc). Classic statistical inference/ models usually don't perform very well in these scenarios(or do they?).

The typical AI methods train a model that utilises features that don't make much sense to humans (pixels/texture/word tokens etc). In contrast with traditional statistical modelling where each feature were given 'meaning' and their importance explored via investigating the observations (data) using properties of well studied mathematical models, the AI methods that often made the press don't put a lot of emphasis on these properties. The advancement is a push to utilise all available information/data to outstanding 'performance'. The explanation of the inner features within that math space is usually secondary (though I don't mean that authors of novel methods don't care about mathematical modelling).

I might be too naive here but that's how I feel after trying out many methodologies in my field.

HarHarVeryFunny · on June 15, 2020

The "AI" capabilities that are making headlines nowadays are mostly based on deep (multi-layer) neural networks with millions, or billions, or parameters. These nets do self-organize into a hierarchy of self-defined feature detectors followed by classifiers, but as you suggest for the most part they are best regarded as black boxes. You can do sensitivity analysis to determine what specific interval values are reacting to, and maybe glean some understanding that way, but by-nature the inner workings are not intended/expected to be meaningful - they are just a byproduct of the brute force output-error minimization process by which these nets are trained.

Neural nets are mostly dominant in perceptual domains such as image or speech recognition, where the raw inputs represent a uniform sampling of data values (pixels, audio samples) over space and/or time. For classical business problems where the inputs are much richer and more varied, and already have individual meaning, then decision tree techniques such as random forests may be more appropriate and do provide explainability.

dumb1224 · on June 16, 2020

>Neural nets are mostly dominant in perceptual domains such as image or speech recognition, where the raw inputs represent a uniform sampling of data values (pixels, audio samples) over space and/or time.

I mostly agree with that, however there have been advancement in scientific areas such as chemistry and biology (DNA/RNA) where the data are definitely meaningful and a lot times categorical. So the methodologies can be applied in wider areas, they just need a lot of domain knowledge and experience.

a1369209993 · on June 15, 2020

> the possible MNIST digits are 0-9

Except - and this rather ties into your point - those are not the only possible digits; your network also has to deal with (ie reject) other possible digits such as "P", "E", "3̸̶", or "[Forlorn Sigil of Amon-Gül redacted]"[0], which look like, but are not, decimal digits.

0: https://www.youtube.com/watch?v=ajGX7odA87k

HarHarVeryFunny · on June 15, 2020

That depends on how you train it.

If you only train it on examples of 0-9 then those are the only outputs it's going to give. If you fed a "P" into such a net then the outputs would be the degree of similarity of that "P" to each of the (0-9) digits it was trained on. You could of course threshold the output and ignore any prediction with confidence less than, e.g., 90%.

If you wanted the net to do a better job of rejecting non-digits, or at least some specific ones, then you could include a bunch of non-digit examples in your training data (so now your net has 11 outputs: 0-9 and "non-digit"), then hopefully - but not necessarily - it's highest confidence prediction will be "non-digit" when presented with a non-digit input.

a1369209993 · on June 16, 2020

> If you only train it on examples of 0-9 then those are the only outputs it's going to give.

Exactly.

> 11 outputs: 0-9 and "non-digit"

IIRC, this doesn't work (or not well) because the net tries to find similarities between the various members of the set "everything except these ten specific things", but you could just require low confidence for all digits on non-digit inputs as part of the gradient descent function.

The problem is more that - if people with decision-making authority trust the AI to not be insane and evil by default - failure modes like this have to occur to you before the AI starts misbehaving in production.

01100011 · on June 15, 2020

> currently ML/AI requires us to know what the possible answers can be before we even begin training the network

Isn't that what unsupervised learning is for?

HarHarVeryFunny · on June 15, 2020

There's no such thing as truly unsupervised learning - any machine learning algorithm is defined by some guidelines you've provided (or built-in) for what output it has to generate.

For example, you might think of data clustering as an unsupervised problem, but in reality you're still providing supervision in terms of what similarity measure to use and some control over how many clusters it should generate; and, at the end of the day, the output is always going to be a set of clusters, not, say, a stock tip or wry comment on the nature of your data!

bcrosby95 · on June 15, 2020

I took it more to mean that, e.g., when solving for 1+1 it won't be allowed to answer "tomatoes".

Balgair · on June 15, 2020

So, uh, how do you talk to the C_Os about this and tell them that it's GIGO?

Asking for a friend.

HarHarVeryFunny · on June 16, 2020

Tell them ML is so smart it's almost as smart as a C_O.

So if they can solve the problem the ML probably can too ...

joejerryronnie · on June 16, 2020

You get a co-worker to do it.

p1esk · on June 14, 2020

Things may have changed over the past 5 years or so.

Things may have changed over the past 5 weeks or so with GPT-3.

Judgmentality · on June 14, 2020

Until GPT-3 can write something meaningful, it's really just a showcase of the technology and a gimmick of a product.

Sure it's cool, but what problem is it solving? As far as I can tell the only useful function it has is polluting the internet with pseudo-intellectual comments to promote some agenda (likely political). So now that I think about it, it actually would be incredibly valuable for things like subverting democracy.

I'm still waiting for AI to figure out self-driving cars, natural language processing, and other "holy grail" problems that researchers have been chipping away at for decades but we are still a long way from having "solved" them.

Also in the case of self-driving cars, there is actually a very serious problem of it being difficult to guarantee safety. You can't tie every input to every output to demonstrate how it will behave (it is way, way, wayyyyyyyy too big for that), so you have to show statistically by driving so many miles that it's better. Which is still sound from an engineering perspective! But it leads to things like Teslas driving into trucks parked sideways across the road because the engineers simply never predicted it would ever happen. And now it's happened multiple times!

wharfjumper · on June 15, 2020

I don't understand why the goal is to "guarantee safety". It seems to be generally accepted that human error causes approximately 90% of motor vehicle accidents [0]. Surely then the goal of any autonomous or semi-autonomous transport system should be to merely reduce that percentage? If all motor vehicles were "self-driving" and the total number of annual deaths was reduced by one, wouldn't that be a good thing? I would rather reduce my chance of dying early than eliminate the possibility of being killed by a bot.

0. http://cyberlaw.stanford.edu/blog/2013/12/human-error-cause-...

joejerryronnie · on June 16, 2020

Because humans are illogical creatures and will ignore objective facts if they believe control (or perceived control) is being taken away from them. The challenges of rolling out true autonomous vehicles are psychological as much as they are technical.

p1esk · on June 14, 2020

Until GPT-3 can write something meaningful

What is "meaningful"? Honest question. Isn't meaning assigned by a reader? If I'm reading poetry generated by GPT-3 and I like it just as much as poetry written by a human poet, does it make it meaningful? What if I finetune GPT-3 (or the bigger next gen version) on every scientific paper ever written, and as a result it generates a novel idea that turns out to be valid and useful, should I care if it happened by accident, or without "understanding"? Can the knowledge encoded in the model parameters be interpreted as some form of understanding? If no, why not? What's missing exactly? What is different from how human scientists operate?

wpietri · on June 15, 2020

In communication, meaning is a collaboration between writer and reader. The writer does their best to convey something; the reader does their best to understand.

There's also the kind of meaning that scientists and researchers talk about when they extract knowledge from data. That's pretty different from communication; it's more a process of internal generation of notions and explanations that could later be conveyed in communication.

And then there's the meaning that is even more internal. E.g., reading tarot cards or tea leaves, people generate meaning out of nothing. And then there's Pareidolia: https://en.wikipedia.org/wiki/Pareidolia

If you're saying machine-generated text is meaning in the sense of that last category then sure, it's something we loosely call meaning. But it's fundamentally the same as other sorts of cleromancy [1], just more elaborate.

[1] https://en.wikipedia.org/wiki/Cleromancy

p1esk · on June 15, 2020

a process of internal generation of notions

How is this "knowledge extraction from data" process different?

bregma · on June 15, 2020

Immanuel Kant would call the former "rendering a synthetic judgement" and David Hume termed the latter revealing the a priori knowledge. Plato would call both simply giving substance to the forms.

It seems none of this is something new. It's just that you no longer need a good education to learn about old ideas; you can come up with them on your own.

wpietri · on June 15, 2020

Different from what? Regular communication? Because the first involves trying to sync up two minds to have the same idea. The latter involves one mind, trying to generate an idea that ends up being useful. Useful in the George Box sense: "All models are wrong, some models are useful."

p1esk · on June 15, 2020

I meant how is knowledge extraction done by a language model different from knowledge extraction done by a human scientist?

wpietri · on June 16, 2020

Sorry, I don't understand the question.

p1esk · on June 16, 2020

GPT-3 does some form knowledge extraction, right?

You said above: "There's also the kind of meaning that scientists and researchers talk about when they extract knowledge from data."

So if we agree that in both cases some form of knowledge extraction is happening, I wonder how these two forms compare.

wpietri · on June 17, 2020

A search for "GPT-3 knowledge extraction" doesn't yield much, so I don't know what you're talking about. But as far as I'm concerned, knowledge exists in the heads of people, so I don't think we agree on that.

p1esk · on June 19, 2020

knowledge exists in the heads of people

How is this knowledge encoded in the heads of people? Perhaps through the choice of synapse strength and connectivity between neurons?

What is being encoded in GPT-3 weights as it reads through millions of pages of text? How is it different from tuning biological synapses?

Still don't know what I'm talking about? It's OK, neither do I :)

joe_the_user · on June 15, 2020

The thing is, if we could write a specific, closed-end, prescriptive definition of "meaningful" or "understanding" or whatever, then we'd be able to program it. And we can't, so we have to settle for something else, usually how a thing fails to be what we (indeed subjectively) consider meaningful. Still, it's not arbitrary.

The way that something like GPT-3 tends to fail basically is that you 2-3 paragraphs where paragraph 3 will containment statements that subtly contradict the semantics of paragraph 1. Things like seeing things inside closed drawers and other cues that the thing has no fixed world-model. That's what gives an impression of "meaningless" or "no understanding". (admittedly, I've only played with GPT-2 but this is a description of how texts written by these models at first seem plausible and then implausible as you read more).

It's not some philosophical objection akin to "nothing but a human brain can think".

p1esk · on June 15, 2020

the thing has no fixed world-model

What is "world-model"? What makes you think GPT-3 does not have some kind of a world model? It's clearly not a very good one, but at the same time it does not mean it can't get better. A 3 year old also does not have a very good world model, what's the difference between his world model and one of GPT-3? Again, clearly there's a big difference, I'm just not sure we know enough about what's going on inside 175B model to make any dismissive statements about it. I'm also not sure what will happen if you train a much bigger model on much bigger data. Things might start to emerge at some scale.

HarHarVeryFunny · on June 16, 2020

Well, GPT-3 isn't any kind of general intelligence - it's explicitly architected as a language model - something that learns to pay attention to prior context to predict what word comes next. The only kind of world model it has is a statistical model of what word is most likely to come next based on the corpus it has been trained on.

You could argue that general intelligence is also based on prediction, and that a human's world model therefore isn't so different in nature, but there are some very significant differences ...

1) GPT-3's model is based only on a corpus of text (facts, lies, errors, etc) it was fed... there is no grounding in reality.

2) GPT-3 is only a passive model - it's not an agent that can act or in any way attempt to validate or augment it's world model.

3) GPT-3 is architecturally a language model .. it can get better with better or more data, but it's never going to be more than a language model.

The difference between a 3-year old's brain and GPT-3 is that the 3-year old's brain is not a one-trick pony ... it's a complex cognitive architecture, honed by millions of years evolution, capable of performing a whole range of tasks, not just language modelling.

The 3-year old's brain also has the massive advantage of being embedded in a 3-year old autonomous agent able to explore and interact with the world it's world model is representing... It you tell GPT-3 pigs can fly then as far as it is concerned pigs can fly, whereas the 3-year old can go seek out pigs and see that, in fact, they can't.

wokwokwok · on June 15, 2020

> What if I finetune GPT-3 ...

> if you train a much bigger model on much bigger data...

Yep, no one is denying that the potential exists for something to be created that can do something useful.

However, the point being made is that GPT-3 doesn't do anything particularly useful.

The generated text from is not consistent, and apparently fine tuning can result in poor performance (ie. generates random crap). It scored better at a bunch of metrics, which is great, but of questionable practical value.

"world model" or not, it's currently interesting, amazing... probably not useful unless you're writing a spam bot.

What is unclear, is if bigger models will actually solve that or not.

inimino · on June 15, 2020

We know much more about the world-model of GPT-3 (which is none, but it has a language model) than we do about the three year old (which we can't build or reproduce in silico) but we know plenty well enough to say that no, things are not going to magically "emerge" at some scale. You need something fundamentally different from endless cloze deletion exercises to learn what a drawer is.

Judgmentality · on June 15, 2020

Well you can randomly string words together and occasionally get lucky and make a meaningful argument, but that doesn't mean you've created a good method for constructing new ideas.

It seems to me the simplest way to decide whether or not something is meaningful (in this context) is whether or not the author (which is the algorithm GPT-3) can respond to criticisms against its own argument in a coherent way. In which case it has to get lucky twice, so it's that much less likely to happen randomly. If the author cannot respond to comments in a comprehensible manner, it's hard to defend the author.

p1esk · on June 15, 2020

can respond to criticisms against its own argument in a coherent way

I'm not sure about GPT-3, but let's imagine GPT-4 next year will be able to do this. It just does not strike me as a particularly high bar to clear. Let's go further, and assume GPT-5 in 2022 will pass the Turing Test (you personally will not be able to tell). What would you say then?

inimino · on June 15, 2020

Please don't wildly speculate about technologies you clearly don't understand. "Does not strike me as a particularly high bar" means you're unfamiliar with how these systems work at a deep level (by which I mean, you personally cannot sit down at a terminal and build one). So rather than -ahem- making things up and asking "what then?" -- please ask for textbook recommendations on these topics if you'd like to know more.

p1esk · on June 19, 2020

I guess we'll just have to wait and see what happens next year :)

p.s. I built my first language model (LSTM based) back in 2014. Then I built a VAE based one. Then a GAN based. None of them were especially good, so I switched to music generation (this actually works pretty well). My most recent project is using sparse transformers for raw audio generation. Building novel DL models is literally in my job description.

Please don't wildly speculate about strangers you meet on HN.

inimino · on June 19, 2020

Why just wait when you can bet?

p1esk · on June 19, 2020

In order to bet, we would have to agree on evaluation criteria. A task like "respond to criticisms against its own argument in a coherent way" is difficult to evaluate. Turing test is also pretty vague, and some people already declared it passed many years ago: https://www.bbc.com/news/technology-27762088

inimino · on June 20, 2020

Yes, it would be difficult but not impossible to agree on clear criteria. We would also need an impartial third party to make a determination.

The Turing test hasn't been passed if I'm the judge. Supposing I were unable to tell the difference between any AI system and a human interlocutor defending the same argument, at any time before 2023, I'd admit my prediction was wholly incorrect.

I doubt very much I'll find a taker for this bet, however. The AI field has always been bigger on optimism than results, as we both know.

p1esk · on June 20, 2020

Yes, I agree TT has not been passed. But probably starting later this year we will be seeing more and more claims it has. At first it will be clear it's not. Then not so clear, and then the goalposts will be moved again, so that when GPT-5 is announced and it is clearly capable of keeping a conversation, the reaction on HN will be the same as the current reaction to GPT-3: "meh".

inimino · on June 21, 2020

That seems like a pretty testable prediction. If I can't tell the difference between GPT-5 or whatever it is and an adult human native English speaker by the end of 2022, you'll win the bet.

p1esk · on June 21, 2020

How many questions will you need to determine if it's a bot? How many sessions, and what percentage of correct guesses will determine the outcome of the experiment?

inimino · on June 21, 2020

If it wasn't a bot but a human, we'd typically have an unbounded conversation, except by the bounds of politeness, until I was satisfied one way or the other, so that seems a reasonable protocol here, perhaps with some reasonable upper bound on the time spent (an hour?) to avoid putting a potentially unbounded commitment on human participants. (Actually, "We've been chatting since four, I have somewhere else to be" is a pretty good signal of humanity... and should help you see why you're not going to win this bet.) I wouldn't expect it to take more than a few minutes, and I would expect to be wrong in (much) less than 1% of sessions.

p1esk · on June 21, 2020

The more time you spent chatting the better your chances are to guess correctly. I admit that in 2 years the models might not be good enough to fool you for hours. If you put enough thought into it, especially knowing what the bot was trained on, you could devise a set of tricky questions which would expose it.

However, I believe the models will be good enough to fool you during, say, a 20 question/response dialog. They will definitely be able to fool vast majority of unsuspecting humans. And they will definitely be able to keep track of conversation (remember what you said previously, and use it to construct responses to follow up questions).

inimino · on June 21, 2020

How will the bot answer "when and where were you born, and how do you know?" How will it answer "what color, besides red, best communicates the flavor of a strawberry, and why?". How will it answer "What historical figure does my communication style make you think of most, and why?", or "Which of your family members comes to your mind first?" or "What do you think the context was in which the following poem was written?". I don't need to know what it was trained on to win this bet, and 20 open-ended questions is more than enough.

Between the vast majority of unsuspecting humans and me there is a considerable gap. Mind the gap!

p1esk · on June 21, 2020

You are kidding, right? All these questions you provided are extremely simple to answer, compared to many other things clever human interrogators might say during TT. I'm starting to doubt your NLP expertise.

The TT ready model I'm envisioning will be trained on many billions of chat sessions. It will contain dozens of preconstructed graphs and will dynamically construct dozens more (personality graph, common sense knowledge graph, domain specific knowledge graphs, causality graph, dialog state graph, emotional state graph, etc), it will have a bunch of emotion detectors, humor detectors, inconsistency detectors, lie detectors, praise detectors, etc. It will have the ability to query external sources (e.g. google search --> web page parsing --> updating relevant graph). All these modules will filter, cooperate, and vote, providing input to higher level decision making blocks. These blocks will use those inputs to condition and constrain response generation process. This is finally where a language model comes in, and this until recently has been the hardest part - generating a coherent, grammatically correct, interesting text, directly addressing a specific prompt. This part has been solved. Until GPT-2 last year we simply could not generate high quality text. Now we can, and GPT-3 is even better at that. Sure, there are plenty of non-trivial problems left to solve, but I don't view them on the same level of difficulty - some of them have already been solved in the process of IBM Watson development, so I'm optimistic. The hardest remaining challenge is probably constructing common sense graphs. [1] looks promising.

p.s. your questions are so naive I'm not sure if you're trolling me. A human might answer them like this (and a bot built 50 years ago could easily imitate that):

"when and where were you born, and how do you know?"

- [personality - redneck] I was born on a farm in Oklahoma. How do I know what?

"what color, besides red, best communicates the flavor of a strawberry, and why?"

- Red is the right color for strawberries.

"What historical figure does my communication style make you think of most, and why?"

- You talk like one of them big city hipsters.

"Which of your family members comes to your mind first?"

- My little bro Jimmy, we just went fishing together on Tuesday.

"What do you think the context was in which the following poem was written?"

- [depends on the poem] I don't get this poem. What is it about?

[1] https://arxiv.org/abs/1906.05317

inimino · on June 22, 2020

You're betting that in the next roughly two-and-a-half years, the common sense problem will be solved well enough to fool me, despite not having been solved in the entire history of AI up until now. I'll take that bet. How confident are you?

p1esk · on June 22, 2020

To fool you for 20 questions, yes. I'm ~70% confident, so I'll bet you $100 :)

To clarify, the common sense problem is a hard one. It is similar to level 5 autonomy driving. That will take a while to solve. But what we are talking about here is kinda like Waymo cars which can drive themselves in ideal weather at slow speeds in Arizona. So in 2.5 years I think the best chatbots will be as far from having common sense as the current Waymo self driving cars are from level 5 autonomy. Which is to say they will be pretty good.

inimino · on June 22, 2020

You're on for $100. I'm > 99% confident that I won't be fooled by any AI before 2023, and would have bet any amount that I could afford to set aside.

Ideal weather at slow speeds, with a professional human driver throwing road-condition curveballs at you and challenging your responses? I like my chances.

imtringued · on June 15, 2020

>But it leads to things like Teslas driving into trucks parked sideways across the road because the engineers simply never predicted it would ever happen. And now it's happened multiple times!

This is a completely wrong reading of the situation. Tesla engineers clearly understood it is impossible to determine whether a stationary object is actually on the street or just an overhead sign with the equipment that is available on current Tesla cars and therefore they simply ignore all stationary objects. Because Autopilot is not a self driving technology this is not considered a problem. If a Tesla with enabled Autopilot ever runs into a stationary object it is clearly the driver's fault. It's extremely predictable and since it's easy to blame the driver there is no need to fix the problem by adding the required sensors to Tesla cars.

_pd19 · on June 15, 2020

There's a ton of problems that a quality language model like GPT-3 can solve, beyond just spamming text generation - stuff like translation quality, automatic post-editing of text, classification, etc.

perl4ever · on June 15, 2020

"Sure it's cool, but what problem is it solving?"

Isn't it good enough, or very nearly, to generate fake news? And if you can generate something that fools a large percentage of humans, even for a second or two, you can sell ads. It would be solving a problem for anyone who can profit from it. I thought the people who developed it stated that it was too dangerous to release widely? Dangerous = useful to bad people, no?

thu2111 · on June 15, 2020

That was just PR fluff designed to play to the ideological biases of their Valley employee base. They released GPT-2 anyway some months later because other people were going to replicate it anyway, and guess what, the river of fake news we're flooded with daily is still not being generated by AI. It's being generated by journalists with an agenda, same as ever.

Don't get me wrong. You can absolutely generate news articles with these text generation models. But if you look at examples of the "fake news" generated by the monster GPT-3 model, it's not really any different to junk churned out by supposedly respectable news organisations i.e. grammatically correct but filled with logical contradictions, false statements that can be checked in 30 seconds with a search engine and so on. Most people already learned which news sources are trustworthy and which are pushing an agenda. If those outlets replace their journalists with GPT-3 it won't make any difference. The readers who don't trust them will still not trust them, and the readers who just want partisan cheerleading will continue to be satisfied.

perl4ever · on June 15, 2020

The proportion of people who will spend 30 seconds with a search engine to check on something is practically infinitesimal. Even people who do it frequently don't do it most of the time. And it doesn't matter anyway, because it's too late - you only have to fool people for a second, or a fraction of a second.

thu2111 · on June 16, 2020

Great, so focus on solving that problem generally instead of worrying about AI.

What I see is actually quite different. If the people who could detect fake news was practically infinitesimal then you'd see the vast majority of people having super high trust in the media. In fact most people don't trust the media, lots of polls showing that. Sure they may only fact check something occasionally (often by reading about a topic they happen to understand), but people aren't stupid. After they notice or hear about a few mistakes and observe they're always in the same ideological direction, they get the picture.

jtjbdhsjjdnd · on June 15, 2020

In ODS.ai Russian slack-chat a Replika developer has showed the increase in metrics that was provided by GPT-3 in comparison to GPT-2. Here you can see it on Telegram (also in Russian, but the pic is in English) without the access to ODS-slack. https://t.me/dlinnlp/931

dreamcompiler · on June 15, 2020

Here we are in 1989 again.

The cycle keeps repeating. A new advancement in computing power, networking, or algorithms means there's a new batch of low-hanging fruit for AI to pick, so we pick it. Investors say "What about the high-hanging fruit?" and we say "No problem. We just need a slightly longer ladder."

Two years later everybody finally realizes the high-hanging fruit is on the moon.

tgv · on June 15, 2020

My first AI teacher (even before '89) compared solving AI with neural nets to teaching pigs to fly by throwing them from a tower. Improvements come from building higher towers.

There's a recent NLP model that was trained on a trillion words. It would take us 10,000 years to read or listen (no breaks, no sleep) to that many words. Problems like attention, and the relation between memory and sequential thinking haven't been cleared up at all. Even semantics, i.e. basic understanding of a utterance or a scene, is in its infancy.

Large neural nets can help with interesting problems, but it's not going to mimic our style of thinking any time soon.

mjburgess · on June 15, 2020

It isn't ever going to model our style of thinking. A "neural network" is just high-dimension linear regression; the idea it has anything to do with the brain is metaphorical nonesense.

No algorithm running on digital hardware can emulate the biological process of animal intelligence.

machiaweliczny · on June 15, 2020

Why not?

mjburgess · on June 15, 2020

What algorithm running on a digital computer would make the computer transparent?

snarf21 · on June 15, 2020

It seems clear that were limited by access to data and computer power. That has largely changed. However, the hard part has always been around deciding what question to ask and how to score the result. This requires domain knowledge and a scientific process (conjecture, measure, analyze, repeat). There is no magic bullet just better tools. Someone still has to define the problem and solution. Just look at IBM Watson and what a failure that is. They did okay on Jeopardy because they spent years optimizing for just that. Jeopardy could have changed the way they wrote clues by including irony and other language constructs and the humans would have done fine and Watson would have bombed.

gkolli · on June 15, 2020

thank you for this amazing analogy!

unreal6 · on June 14, 2020

"The result is an artificial idiot savant that can excel at well-bounded tasks, but can get things very wrong if faced with unexpected input."

I think this gets to the core of what is still a limitation of current technologies. Venturing into the unknown is still a deeply relevant task that seems unlikely to be replaced by computers anytime soon.

SomewhatLikely · on June 14, 2020

This is changing. In natural language processing just in the past couple weeks OpenAI wrote about their GPT3 model which can learn some tasks remarkably quickly, after only 1 or 2 examples. That model has extreme compute requirements, but it shows strong progress on performing never before seen tasks. There is still some steam left in the current deep learning boom. https://arxiv.org/abs/2005.14165

ausbah · on June 15, 2020

I heard someone theorize that GPT3 can perform so well simply because it can overfit to every domain

zby · on June 14, 2020

This is the same thing people have been writing about computers since 1950s. At that level of abstraction the progress is invisible - and yet: https://slatestarcodex.com/2020/06/10/the-obligatory-gpt-3-p... https://slatestarcodex.com/2020/01/06/a-very-unlikely-chess-...

londons_explore · on June 14, 2020

> can get things very wrong if faced with unexpected input

Alice: Bob, Can you translate "Eat my shorts" into latin for me?

Bob: No. I don't speak latin.

Alice: Go on - try anyway.

Bob: "Eatus mine shortus"

Alice: Wrong! The answer is "Vescere bracis meis". You're totally wrong Bob! I was expecting more of you!

Radim · on June 14, 2020

You missed the preface:

Bob's manager: Here's Bob. You paid a gazillion dollars for him, and he can translate whatever you want. Any language, whenever – the true revolution everyone's been talking about. Your life and our economy will never be the same.

Alice: Bob, Can you translate "Eat my shorts" into latin for me?

…

amayne · on June 15, 2020

For fun I put this into OpenAI's new API:

Alice: Bob, Can you translate "Eat my shorts" into latin for me?

This is what it replied without any training:

Bob: As it happens, yes, I can.

Alice: Great.

Bob: Ego sum vestrum braccas comedisse shortus.

Alice: Exactly what I was looking for.

Google translate says "Ego sum vestrum braccas comedisse shortus" means " I have eaten your pants shorts."

Olreich · on June 15, 2020

If models actually were able to tell you what they know and don’t know, then sure. But instead they just give you an output for any input you give them, whether they have a clue or not.

0xBABAD00C · on June 15, 2020

This is such a simplistic view of a vast and evolving field of scientific research, it's too cartoonish of an argument to even warrant a real response. Which is why the conversation around ML gets negatively selected against actual researchers and practitioners, who get headaches from opinions like the one above.

memexy · on June 15, 2020

What's wrong with their view. They're right that such models will always give an answer whether it makes sense or not. Has anyone tried asking Bob to solve sudoku? What happens? Does Bob fill in the blank spots according to sudoku rules and is the final solution an actual solution for the given board or does it end up being a random list of numbers?

imtringued · on June 15, 2020

Honestly last time I tried to get into machine learning the huge limitations made me lose interest. The idea that we can just pretend that these limitations don't exist is absurd. ML lets us do a lot of very interesting things but it's just a stepping stone. It's something we are stuck with rather than something that we want to keep.

blackbear_ · on June 14, 2020

This is a good article. I don't understand why people find it questionable and get so defensive about it. It does not deny the fast progress nor the potential of current/future techniques. It is simply an explanation for the layman of things that should be obvious to any practitioner in the field.

Barrin92 · on June 14, 2020

> I don't understand why people find it questionable and get so defensive about it

because people in the technology sector have a distorted view of their own importance. Despite the extremely meagre impact of the "computer revolution" on both economic growth and anything outside the world of bits, people in 'tech' fashion themselves to be sort of world-changing figures. AI on top of it is a sort of ersatz religion, the rapture of the nerds as Charlie Stross put it.

code4tee · on June 15, 2020

In probably 99% of AI/ML use cases the AI/ML is basically just a commodity item and the real “expertise” comes from getting and preparing good datasets for analysis, and having a clear problem to solve. The strategy behind something like AWS SageMaker is based entirely around this idea.

The problem is that too many companies believed it was the opposite so they built and hired all these AI/ML “experts” that just wanted to “built models” but didn’t want to focus in the messy hard stuff like finding and cleaning data. Nearly all of these AI/ML “experts” inside companies were also broadly just applying off the shelf tools and algorithms, perhaps with a bit of ensembling, rather than actually building new AI/ML approaches.

As a result, the big investments inside most companies produced a flash and puff of smoke that got people briefly excited followed by a lot of money spent with little business value returned.

I’m a big believer in ML approaches, but in most cases companies need to be focusing in their data first against clear business problem and just use off the shelf tools for the rest. That’s good enough for nearly all needs.

There’s a big bubble at the moment with all these “AI/ML” teams that’s going to crash hard as businesses realize the above and reset to focus on stuff that works and generates tangible value for the business.

cjhanks · on June 14, 2020

We have also been watching these machine learning models for 6 months:

- increase the volatility in virtually every financial market they touched

- be exploited by adversarial learning networks to amplify funded propaganda as news

- use poorly contrived sentiment analysis to generate incomprehensibly meaningless news headlines

These non-linear "function approximators" have absolutely unpredictable and insane non-linear behavior where learned information was non-existent or sparse.

God help us all if one of these artificial intelligence devices is driving the road and sees a red stop sign that is a square, rather than a hexagon.

ipiz0618 · on June 15, 2020

"AI" is a very vague term. What you described aren't entirely "machine learning", but a combination of existing linguistic techniques and machine (deep) learning.

People confuse what AI can do, and what is AI all the time. It also doesn't help when there are so many inexperienced data scientist making promises that they can't achieve.

In your example, I'd argue that a human is not necessarily a better driver than a machine. An attentive and careful driver is certainly better than a machine right now, but there are many who drive carelessly. While a person is unlikely to mistake a square stop sign as something else, there are so many drivers that would simply ignore the sign, and traffic lights in general. They'd also drive dangerously because of road rage, and inattentiveness. And the majority of traffic accidents are caused by these drivers. A machine is unlikely to do these.

That said, until we figure out how to run all these deep learning models without a crazily expensive and power-consuming GPU, it is unlikely AI would be used as general purpose programs.

dreamcompiler · on June 15, 2020

Whether humans or AI are "better" drivers is completely beside the point. The point is that we can characterize human drivers. We know where they succeed and where they fail, both in a statistical sense and in an individual sense based on their age, attention, vision, chemical impairment, etc. But we cannot characterize ML networks. We take it on faith that they work and then we find (because somebody dies) that they run right into an overturned truck or a pedestrian or under a truck crossing the road.

Until we can characterize the behavior of these systems, they must not be put in control of life-critical processes like driving.

the_omegist · on June 15, 2020

Just playing the devil's advocate but : when you take a taxi, what do you know about the driver? You can vaguely see if he's sober and that's all. You EXPECT him to have a driver's license, to have a good eyesight, etc but you KNOW nothing about it. If he has an heart attack while he's driving on the highway, could it have been predicted (by you or the company)? No.

So i don't see why this distinction between AI and humans is made : both are black boxes. Perhaps humans have less "edge cases" but as long as the error level of AI is the same or lower than the one of humans, I don't care if the car crashed because the human driver looked at a sexy woman on an ad on a billboard or because a variable was poorly set in the car's code.

ipiz0618 · on June 15, 2020

I also agree on this. I think in terms of liability humans who one can sue when they make a mistake is more valuable than a machine.

That's why in life critical applications companies who are capable of taking the risk are scarce, because when accidents happen, the company has to take responsibility. It cannot be resolved by just firing employees.

henearkr · on June 15, 2020

You can fix a software, but you can only punish a human driver, hoping it will fix itself. Also, both can be forced to train, but you can reproducibly test only the software, no guarantee that your retrained human driver will not succomb to the same road rage in the near future.

redis_mlc · on June 16, 2020

Nonsense.

You can't fix a model to handle unknowns, and you can't test that.

We've seen with Tesla's autopilot software that things like obsolete road markers and overturned trucks are meaningless to software.

henearkr · on June 16, 2020

Of course you can!!

Even in something not very well defined as a neural network, you can try to retrain it, or also to modify its architecture, or its postprocessing, and verify reproductibly on test cases that it behaves better.

Also, to address your critics, you can add test cases (just like in any sotware. But actually they also do exactly that for hardware too).

Kaibeezy · on June 14, 2020

Where are stop signs hexagons?

Am I being a pedantic numpty or am I illustrating a point about the many ways errors creep in, regardless of the natural- or artificial-ness of the intelligence?

joe_the_user · on June 15, 2020

You're being a pedantic. Human beings are tremendously better at driving than machines despite sometimes saying hexagonal rather than octagonal. Humans and current AIs both make mistakes but humans manage a kind of robustness, ability to deal gracefully with unexpected situations, that current AIs don't seem to be progressing towards.

lopmotr · on June 15, 2020

What happened to sensor fusion? There's no reason self-driving AI has to be as unreliable as toy or research AI. People made these same FUD arguments about computer in cars decades ago. Home computers were unreliable so cars will surely crash if their brakes or throttles are controlled by computers too.

bufferoverflow · on June 15, 2020

> Human beings are tremendously better at driving than machines

Human drivers: 1 death per 88 million miles traveled (in the US) [1]

Tesla Autopilot: 5 deaths per 3 billion miles [2]

[1] https://www.iihs.org/topics/fatality-statistics/detail/state...

[2] https://electrek.co/2020/04/22/tesla-autopilot-data-3-billio... and https://en.wikipedia.org/wiki/List_of_self-driving_car_fatal...

imtringued · on June 15, 2020

Tesla autopilot is just a very fancy form of cruise control. Without corrections by human drivers it will happily run into stationary obstacles.

shaklee3 · on June 15, 2020

Autopilot isn't used in the same conditions.

legolas2412 · on June 15, 2020

Yeah. Totally makes sense to compare a human driver driving an average 7 year old 30k$ car with not so good safety ratings driven by average person in snowy, rainy pothole ridden roads to a newish 70k$ luxury car with good airbags/crumple zones driven in mostly sunny California roads by mostly young drivers with a driver assistance system.

And then for you to argue that the driver assistance system is actually better than a human driver if given the car alone!

Wow

perl4ever · on June 15, 2020

What humans are good at that machines can't do yet is bullshit justifications for screwing up.

dragosmocrii · on June 15, 2020

North America. Where are stop signs not hexagonal?

dragosmocrii · on June 15, 2020

Oh, I checked, they're octagonal xD

somberi · on June 15, 2020

https://www.youtube.com/watch?v=C99SZRbw50E

parineum · on June 15, 2020

When obscured by overgrowth.

EE84M3i · on June 14, 2020

Serious (and likely ignorant) question - what does linearity have to do with anything here? linear over what and why does non-linearity make something 'unpredictable'?

YeGoblynQueenne · on June 15, 2020

Linear models have more bias, so they represent current data less well and are more predictive of future, unseen data (think of a straight line through a point cloud).

Non-linear models have more variance so they represent current data better and are less predictive of future, unseen data (think of a line snaking around a point cloud).

An added complication is that deep neural net models are, in practice, vectors (or, well, tensors) of numbers so they are difficult to interpret. This and their extreme variance makes it hard to know how they will behave in the future.

phreeza · on June 15, 2020

The bias/variance trade-off is not really related to extrapolation. Think of a point cloud following a quadratic shape. A linear model will extrapolate terribly.

YeGoblynQueenne · on June 15, 2020

Well, "more predictive" doesn't mean it's a perfect fit. Every model has error. A line through a point cloud curving upwards will still represent some of the points in the cloud. So it will have high error, but it's still a representation of the data.

And yes, the bias-variance tradeoff is about generalisation (i.e. the ability to extrapolate to unseen data). But this is more related to the fact that in the real world, problem spaces don't have nice, friendly, regular shapes nor do their shapes stay put after we've trained a model.

phreeza · on June 15, 2020

My understanding is that generally, the error when extrapolating to areas not covered by the training data distribution would be considered to be part of the "bias" part of the bias-variance tradeoff.

The way I see it, the variance is the part of the error that you can reduce by collecting more data from your distribution and increasing model complexity if needed.

The bias part is what will not get better no matter how much you sample your distribution, and extrapolation problems fall into that category.

YeGoblynQueenne · on June 17, 2020

>> The way I see it, the variance is the part of the error that you can reduce by collecting more data from your distribution and increasing model complexity if needed.

Ah, apologies, I see what you mean. That is true, but this "error" is in-sample error, so increasing your model's variance will increase its ability to interpolate but not extrapolate to out-of-sample data, as I explain in my longer comment.

"In-sample" means all the data you've collected to train and test with. It includes training/validation/test splits. At the end of k-fold cross-validation, your model has "seen" all the data in your sample and the model that performs best is the model that best represents that data.

But, because the data was sampled from a distribution that is most likely not the true distribution of the data (since that distribution is unknown), the sampling error (i.e. the differences between the true and sample distributions) will be reflected in the model. A high-variance model will suffer more from this than a high-bias one.

Sorry I didn't understand immediately what you meant. The longer comment above is correct but probably doesn't help answer your question directly.

phreeza · on June 18, 2020

Thanks for taking the time to write the detailed reponses. Definitely led me to think more closely about these vaguely held intuitions about bias and variance! I think you are exactly right that the crucial aspect is the variance when looking at out-of-sample predictions, not just across several samplings from the original training distribution (a la k-fold crossvalidation).

YeGoblynQueenne · on June 16, 2020

Bias and variance are characteristics of the model, not components of its error as I think you're saying. In the most simple sense, bias and variance refer to the shape of the function represented by the model (let's say "the shape of the model" for simplicity). A model with a more "rigid" shape (approaching a straight line) has more bias and one with a more "relaxed" shape (further from a straight line) has more variance.

The extent to which a model can extrapolate to out-of-sample data depends on how well the shape of the model follows the true distribution of the data. This is true regardless of the bias and variance of the model. It just happens that most of the time, in interesting, real-world problems, the true distribution of the data is more or less different than the sampling distribution of the training data- i.e. there's always some amount of "sampling error".

Sampling error can't be reduced by collecting more training data- you just have more data with the same sampling error. Increasing model complexity increases variance, so if you start with high sampling error, you wil get a high error on out-of-sample data because your model matches the "off" distribution of the training data too closely. What training with more data and with a more complex model can do is increase the ability of the trained model to interpolate, i.e. to accurately represent (new) data points that are in the same region of "instance space" as the training data points.

A high-bias model can extrapolate well if the sampling error is not too high and the shape of the true distribution is not too irregular. However, a high-bias model will also not interpolate as well as a high-variance model. Its rigid structure will "miss" many data points. Like you say, this will not change if you train with more data. Anyway, that's the tradeoff.

Now, the reason why deep neural nets, which are extremely high-variance models, are trained with large amounts of data, is that they can interpolate very well but can't extrapolate very well. If a model doesn't extrapolate very well but its training sample is a large enough chunk of instance space, it can still be very useful, because it's still representing a large number of instances.

How to put it? Mabye your high-variance model has seen examples of white dogs and black dogs in training, but no green dogs. Your model will not be able to generalise to green dogs, but if green dogs are rare, it will still be able to represent most dogs, so it's still useful.

Of course, looking at the output of a trained model (its behaviour) doesn't tell you anything about what it was trained on. So a model that has very high accuracy on a large number of tasks will look impressive, even if it can't generalise at all.

perl4ever · on June 15, 2020

I'm not good at math, but I'm confused by the association of AI with non-linear stuff, setting aside the association of non-linear with "bad". I thought ML involved linear algebra or something (says xkcd!) which would presumably be...linear?

dreamcompiler · on June 15, 2020

The inner activation function (AF) of neurons is inherently nonlinear; it has to be in order to solve any problem that is not linearly decomposable (which is basically all of the interesting problems). Often the AF nonlinearity shows up as a thresholding operation following a linear weighted sum, but that's not the only mechanism.

And yet neurons are not "pure" binary thresholders the way logic gates are because you can't take the derivative of a binary function, and you can only do backpropagation on differentiable functions. The compromise neurons make is a "smoothed threshold" or sigmoidal curve which is differentiable but still very nonlinear.

YeGoblynQueenne · on June 15, 2020

I'm not sure where the "linear" in "linear algebra" comes from. You hear about linear algebra in relation with machine learning a lot because training a neural net (with the backpropagation algorithm and friends) requires some matrix arithmetic. Inputs to neural nets are vectors or matrices, their weights are (arrayed in) vectors or matrices, their outputs are - well, usually scalars but can also be vectors or matrices.

Also, the use of linear/ nonlinear in machine learning is a bit misleading. A "line" is not necessarily a "straight line", but usually when we say "linear" we mean "straight" and so when we want to say "not straight" we use "nonlinear".

In any case, when we say "line" in machine learning we mean a function, the function of a line. So a "nonlinear" function is a function that curves and turns, e.g. a sigmoid, whereas a "linear" function is straight as a rod.

Why a line? Classifiers er classify by drawing a line through space. "Space" means a Cartesian space where our training examples are represented as points (hence, "data points"). Data points are located in Cartesian space according to coordinates that represent their attributes, or features (these coordinates are the "feature vectors" that are input to neural nets). We classify data points by drawing a line between those that belong to one class and those that belong to other classes. More to the point, when we train a classifier, we find the parameters of a function of a line that separates the points of separate classes and when we want to classify a new point, we look at where it falls with relation to that line.

So that's where all that stuff about lines and "linear" and "nonlinear" models comes from. A "linear model" or "linear classifier" can only draw straight lines. A "nonlinear model" can go twirling around madly.

Finally, "non-linear" doesn't mean "bad". There are tradeoffs- in particular, the "bias variance tradeoff" that I hint at in my earlier comment. A linear model is more limited in what it can represent, but a nonlinear model is less likely to represent data that it hasn't seen in training.

fizixer · on June 15, 2020

- "linear" in "linear algebra" comes from "system of linear equations"

- NN can absolutely represent non-linear functions, and they are based on solving system of linear equations.

- The non-linear function here has nothing to do with the linearity of the system of linear equations used to construct it.

- The two main sources of non-linearity are, (a) the inputs (e.g., an image, or a series of images varying a non-linear fashion), and (b) the activation functions.

klipt · on June 15, 2020

The underlying derivatives are linear (like all derivatives) but neural networks' ability to approximate arbitrary non linear functions is one of their biggest strengths.

perl4ever · on June 15, 2020

Yes, so I'm left wondering, when making the association of the math to the badness, how do you decide if the linearity or the non-linearity is the salient part?

coldtea · on June 15, 2020

Mathematically, you can think of "linear" AI problems as "easy to solve", and non-linear as "difficult". That's part of what the parent means.

Some function being linear means it's easier to guess. If a real world phenomenon is tied to a linear function, then it's easy for AI to guess/approximate.

cjhanks · on June 15, 2020

If you have ever opened up Excel or a similar program. One of the more useful options is to generate a regression line-fit on your data points.

One option is to specify a polynomial function, you can specify how many coefficients you want. One of the measurements is the mean-squared-error between the line-fit and the points.

You can add as many polynomial coefficients as you want, and you will be able to decrease the mean squared error. But the more polynomial's you choose, two things will be true:

1. The line-fit will be far more likely to go through the points.

2. At points in the line where there was no data, the line will less approximate the underlying physical reality.

That same mathematical property is what is relevant here. There is nothing inherently evil about non-linearity, when the non-linear math model properly maps to the physical reality. But when you over fit a line, many of the functional solutions may be completely wrong.

EE84M3i · on June 15, 2020

I'm confused. I agree that overfitting can lead to very bad models.

But, what I don't understand is that I thought that "linear" in ML contexts was normally used in the sense of 'linear transformations', which is a sense of linear that 'line-fit' from excel isn't -- it's affine.

Is a linear model with thousands/millions of weights/parameters (like deep learning models) really substantially simpler to understand? Can it do anything useful?

[1]: https://en.wikipedia.org/wiki/Linear_map

cjhanks · on June 15, 2020

I suppose from the perspective of someone implementing these models, yeah - it is linear, but it is not bijective. In a system with only one layer, that manifests as an alias (assuming the output dimensions are smaller). In a system with multiple layers of either `N->M` or `M->N`, those aliases tend to manifest as apparent "non-linearities".

So, I guess looking from the bottom up the system may look non-continuous and linear. But if you look from the top down, it would look continuous and non-linear.

Really, I am not sure which one is "true".

tylerhou · on June 15, 2020

I assume they are using non-linear to mean non-continuous, which implies that there can be large, hard-to-understand changes in behavior when the input is changed only a small amount.

kccqzy · on June 15, 2020

Polynomials with large degrees are continuous. It's just that they can still change by a large amount (i.e. having a large derivative) when the input is changed by a small amount.

I invite you to construct the Lagrange polynomial (i.e. interpolating polynomial) for points on a nice, simple curve with some noise. They will, by definition, pass through every point given, and yet it will likely behave very badly outside the range of the given points.

tylerhou · on June 15, 2020

There is nothing wrong with using a non-linear model, though; x^2 or x^3 regressions make sense on many datasets.

Non-continuous is also not the perfect terminology, but I argue that it is more precise than non-linear: the chief idea being that the model "changes unpredictably."

kccqzy · on June 15, 2020

Sure you can argue things however you want, if you also decide to ignore hundreds of years of mathematical terminology.

yellowstuff · on June 15, 2020

How have machine learning algorithms negatively affected financial markets in the last 6 months? Markets have been volatile because information about the real world has been volatile. I don't think markets in an earlier era would have handled a global pandemic any more robustly than they did in 2020.

faitswulff · on June 14, 2020

But it has also generated prodigious amounts of erotic fiction so that balances out some of those points, right?

neal_jones · on June 14, 2020

I honestly don’t know if this is a joke or if there is a bunch of erotic fiction I’ve been missing

pas · on June 15, 2020

Deepfakes (faceswap but for porn). Decensored hentai. And of course the question came up again: how ethical are generated pictures depicting illegal content?

faitswulff · on June 15, 2020

I was referring to AI Dungeon, actually, and yes it was a joke. But 100% true.

abetusk · on June 14, 2020

Intelligence is the amalgamation of many smaller problems working together and building on top of each other.

* Facial recognition/detection

* Facial synthesis (deepfakes)

* Speech synthesis, including mimickry

* Speech recognition

* Natural language processing

* Gait/walking algorithms

* Motion planning

* etc.

Complexity arises from simple units working together in parallel. We're working on the smaller, specialized problems that will, in the next generation, be put together to build more complex and complete systems.

I'm no fan of the 'black box' nature of neural networks but it's clear they're getting results. As they become more accessible to the lay person, we'll see a profusion of use cases that are both anticipated and surprising.

I'm always flabbergasted by the doom prediction. The path we're on seems apparent.

cjhanks · on June 14, 2020

I agree with the notion that artificial intelligence is a graph of smaller problems, as is human perception.

The problem is a question of informational density. Biological systems are computationally very dense. Far more dense than the 4nm transistor fabrication available today, and with a far larger volume of size.

Consequentially, the computational capability of most AI systems is far lower than its biological equivalent. And as you find in most information finite discretization problems - the lower density information system will alias against the higher information system.

So, that means you will have a hierarchy/pipeline of computational stages - each aliasing reality. Eventually, you will find that your parameterization of each perceptual stage has a strange property. The size of each subsequent layer is important... but the relative computational space of each subsequent stage is even more important. Because mismatched stages results in nothing but numerical interference and noise.

And I think that is where we are today. The IQ of a krill shrimp.

natmaka · on June 15, 2020

Isn't "the amalgamation of many smaller problems working together and building on top of each other" a fair description of the theorical Unix system?

Aren't your criteria for "intelligence" human-centric, implying that there is no other form of "intelligence"?

Aren't your criteria of the "black box" type, given that AFAIK no human can really completely explain how he recognizes faces/does NLP/walks/...?

abetusk · on June 15, 2020

> Isn't "the amalgamation of many smaller problems working together and building on top of each other" a fair description of the theorical Unix system?

Yes. Note the success of Unix and the ability to scale, do work and provide an environment to be productive in.

> Aren't your criteria for "intelligence" human-centric, implying that there is no other form of "intelligence"?

Your use of 'human-centric' is odd. I would have thought the traditional 'human-centric' theory of the mind is something monolithic and indivisible. Suggesting that it's many small processes communicating with each other is basically taken straight out of nature, from ants, schools of fish, birds flocking, etc.

Whether there are other forms of intelligence or no, it's clear that incremental progress in individual processes that can then be composed together is a productive way to traverse the energy landscape. This is why (imo) we see so many symbiotic relationships from cells on up to higher level animals.

> Aren't your criteria of the "black box" type, given that AFAIK no human can really completely explain how he recognizes faces/does NLP/walks/...?

I'm not quite sure what your point is here. If you're critiquing me about neural networks being black boxes and not giving us real insight into the underlying system, that's fair and the reason why I said I didn't like the black box aspect of neural networks. I will say that if there is a black box model that can be easily manipulated, this will probably lead to deeper models much quicker.

If you're saying that human cognition is not describable by any human and, I guess, implying that it's indescribable, I would point out that one doesn't follow from the other. Not having a good model right now doesn't imply we won't understand it at some future date and, in my opinion, this is precisely what's happening. Having no human be able to describe the underlying computation (of face recognition, nlp, walking etc.) doesn't mean it's indescribable, it means it's not describable by anyone right now.

At one point we didn't know how birds flew. We still might not know, to your satisfaction, but we have a basic understanding of how to make things fly, both in practical and theoretical terms. Planes fly and we understand how even though they don't flap their wings. I have no doubt we'll figure out how to do complex human-level computation even if we don't have a deep model of the specifics of human thought.

natmaka · on June 18, 2020

>> Aren't your criteria for "intelligence" human-centric, implying that there is no other form of "intelligence"?

> Your use of 'human-centric' is odd. I would have thought the traditional 'human-centric' theory of the mind is something monolithic and indivisible.

Sorry, my answer wasn't clear. It was not tied to the "small processes communicating..." approach but to your very list of "problems" (facial recognition/detection, facial synthesis (deepfakes), speech synthesis, speech recognition..) which seems to me expressed in a way tied to human activities, while at least part if the underlying "intelligence" underlying many of them may also exists in other forms of life (other mammals, birds, fishes...).

> Suggesting that it's many small processes communicating with each other is basically taken straight out of nature, from ants, schools of fish, birds flocking, etc.

Exactly. My point is that analyzing the ways "the smaller, specialized problems" are tackled by non-human living beings seems pertinent as self-analysis (as humans analyzing human intelligence) is difficult, and as various species may apply various solutions, some more easy to grok. Focusing on "problems" too specific to the human being may be a sort of "framing" detrimental to the quest. Moreover the famous Dijkstra quote ("The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.") may be pertinent.

> incremental progress in individual processes that can then be composed together is a productive way to traverse the energy landscape. This is why (imo) we see so many symbiotic relationships from cells on up to higher level animals.

I agree. My point is about _how_ we consider the system(s) (our "point of view"): framing it to human characteristics, globally or locally (dualism)... It seems to me that the very organization of a system may be neglected when we consider it a stack of "small processes communicating...". Pirsig's "Metaphysics of Quality" may be pertinent.

>> Aren't your criteria of the "black box" type, given that AFAIK no human can really completely explain how he recognizes faces/does NLP/walks/...?

> I'm not quite sure what your point is here. If you're critiquing me about neural networks being black boxes and not giving us real insight into the underlying system, that's fair and the reason why I said I didn't like the black box aspect of neural networks.

This was my point and I agree with you.

> if there is a black box model that can be easily manipulated, this will probably lead to deeper models much quicker.

I'm less optimistic, as it is only 'probable', and AFAIK won't give us more real insight into the underlying system.

> Having no human be able to describe the underlying computation (of face recognition, nlp, walking etc.) doesn't mean it's indescribable, it means it's not describable by anyone right now. > I have no doubt we'll figure out how to do complex human-level computation even if we don't have a deep model of the specifics of human thought.

I agree, we will enhance ways to "approximate" (tricks leading us to a solution to each "local" problem) up to the point of being able to solve real-world problems. However it may reach some hard limit (as far as I understand this is the point of the article), and using a powerful tool/method insufficiently understood may be dangerous.

giardini · on June 15, 2020

You're right: a human driver will stop even at a hexagonal stop sign, even though most are octagonal. Much safer behavior!8-))

Veedrac · on June 14, 2020

https://youtu.be/hx7BXih7zx8?t=513

kordlessagain · on June 15, 2020

> complicated when you get to the long tail of it

Well, there's the problem right there.

lokedhs · on June 15, 2020

Watching this presentation did not make me any more confident about going in a self-driving car.

Based on the way it was presented, I got the feeling that they are just essentially manually identifying cases and addressing them as they see them. Is that solution really helping to make the system more robust when encountering an unexpected situation?

phreeza · on June 15, 2020

I'd argue that the main driver of volatility over the last few months was the Coronavirus, and not AI...

pgwhalen · on June 15, 2020

I’m curious if you have a source that ML has increased financial markets volatility.

irrational · on June 14, 2020

So Skynet will be insane?

cjhanks · on June 14, 2020

Most definitely insane and probably well dressed.

eanzenberg · on June 14, 2020

This is so strange. If you use facebook, google, netflix, apple, microsoft, amazon or a whole host of other services you are interfacing with AI all the time. To think there’s no value there is asinine. Comes up a lot on HN. Seems like people set in their ways who don’t want to progress forward.

fock · on June 14, 2020

oh yeah, magnificient AI at Google search. Picked up my ebook-reader again, wanted to know about the state of linux there. so do a search: "<model of ebookreader> linux ssh" (since a good shell is the point, where you can start developing). Turns out, the first 3 pages want to sell me the same thing I already own, with one outlier selling nutritional supplements. Oh well done AI!

bronco21016 · on June 14, 2020

It’s doing exactly as it’s trained. Nudge the useds to buy more trinkets.

Now just imagine how good it could be if it was being trained to actually give good search results instead of selling.

YeGoblynQueenne · on June 15, 2020

Unfortunately, Google Search (and Amazon and MS search and all three companies' assistants) doesn't work that well when it comes to selling you what you want to buy either:

https://github.com/elsamuko/Shirt-without-Stripes

But it sure can identify stripes.

juped · on June 15, 2020

idk, the thing Google has been doing lately where they suddenly render a block of ads under your mouse as you're clicking on a result seems like the kind of thing an AI would do to increase ad clicks

perl4ever · on June 15, 2020

There's a nearly unimaginable amount of money and computing power going to minimize the net value of the service + ads. If you get significant value, it's going to be optimized out in a few minutes. The ads on Youtube recently ramped up to where even something short is unwatchable for me, and it doesn't work at all with an adblocker.

enchiridion · on June 14, 2020

Isn't copy and pasting within a thread generally discouraged?

aljgz · on June 14, 2020

Economist editor to writer: This year, X is on the down slope of hype cycle and no one is talking about it, gimme something ASAP.

Economist editor: opens their "pessimist template"

"X has over-promised and under-delivered, A, and B have not commercialized yet, and may never be. X cannot do C yet. The challenges of X are D, E and F"

Replaces A...F with the most prominent examples they can find, Boom, we have an article.

I read through the article, hoping for a bit of important info. I wish we had a vote in HN: "Isn't worth your time, don't read it"

LetThereBeNick · on June 15, 2020

It sounds like you are ready to program an NLP to generate these articles

gwbas1c · on June 15, 2020

My general guidelines about AI (Machine learning, programming):

1: Computers can't read minds! Your algorithm might know that I like the Beatles because I listen to them a lot, but it can't predict that I woke up today craving to listen to some music from my childhood.

2: You don't know what you don't know! Your algorithm might make 24 frames per second film look smoother at 60 fps, but if something like wheel spokes move backwards at 24fps, it'll have a tough time getting the wheel to move the right way at 60fps.

3: Just because you have the information, doesn't mean you know how to write a program that can extract the knowledge you're looking for.

Which really means we need a super advanced AI with a worldview and context in order to automate certain kinds of information processing. I don't expect this anytime soon.

g_airborne · on June 14, 2020

Like others are saying the progress towards AGI isn’t great but each individual subdomain is seeing great advances all the time. Object detection, facial recognition and NLP with GPT are much better than they were a few years ago. Each of these can provide business value to a certain degree, but I would agree that something resembling AGI holds the most business value. For this to happen all of the pieces have to be put together somehow - right now research focuses on specific subdomains and improves the SOTA on them. Once someone figures out how to make everything work together, it could mean a second, much larger wave of AI. So the question is, when will that happen?

asutekku · on June 14, 2020

Apart from a true GAI, the research has progressed really fast. We’ve seen a huge leaps in e.g. image generation and speech synthesis during the last few years that I’m really interested on what the future will bring.

randcraw · on June 15, 2020

For a rather complete view of what deep learning still does not do, I recommend the work of Gary Marcus and Ernest Davis. While seeming solely critical, I think they make very good points about the limits inherent in deep learning as we know it now, and how it needs to grow to overcome those deficits.

"Rebooting AI" https://www.penguinrandomhouse.com/books/603982/rebooting-ai...

And a few articles, for audiences both popular and technical:

Deep Learning: A Critical Appraisal https://arxiv.org/abs/1801.00631

The Next Decade in AI https://arxiv.org/pdf/2002.06177.pdf

How to Build AI We Can Trust https://www.nytimes.com/2019/09/06/opinion/ai-explainability...

And a HBR podcast ("Beyond Deep Learning") https://hbr.org/podcast/2019/10/beyond-deep-learning-with-ga...

mD5pPxMcS6fVWKE · on June 14, 2020

Maybe it's because everyone talks/chats every day with "virtual assistants" at banks and every other organization, and never ever finds them useful. Their main purpose is to frustrate you enough so you give up trying to connect to a real person.

ksaj · on June 15, 2020

Judging by the number of times and different places I've heard the sentence "Please listen closely, as all our options have changed" even when they haven't changed in years suggests that this is new whine in old bottles.

somewhereoutth · on June 15, 2020

Previously, when "The computer says no" it was possible that an engineer somewhere might know why the computer said no.

With AI, nobody knows why the computer said no.

neonate · on June 15, 2020

https://archive.vn/USQ6M

ur-whale · on June 14, 2020

http://archive.is/RZD7w

cageface · on June 15, 2020

I think there's an opportunity right now for new, human curated indexes in the style of the old Yahoo index. AI content generation is getting too good at fooling AI curation and I'm getting less and less value in broad searches on Google. I would pay a monthly fee for hand-vetted lists of the best content on topics I'm interested in.

renewiltord · on June 14, 2020

It's limited but so effective! The other day a friend asked for photos of her sister (also a friend) that I had because it's her birthday and she wanted to make a collage. I just searched on Google Photos by her name and it found a bunch because of the face classification. That's some good shit.

reportgunner · on June 15, 2020

How do you know you haven't missed any good pictures of your sister ?

renewiltord · on June 15, 2020

You don’t but that’s okay. I have thousands of photos. I don’t need exhaustiveness. I need filtering.

andbberger · on June 14, 2020

works great until I add epsilon of adversarial noise and turn your sister into an ostrich

webmaven · on June 15, 2020

The hard part of that would probably be getting unauthorized access to the photos in order to modify them in place, rather than the adversarial image perturbation per-se.

XCSme · on June 14, 2020

I think that the average user doesn't like ostriches that much.

renewiltord · on June 15, 2020

Fortunately, my camera is not my adversary so I'm safe.

msla · on June 14, 2020

These days, you can translate text by pointing your phone at it and taking a picture. Thirty years ago, this would have been unambiguously AI, because it would have been not only impossible, but stupid impossible like something out of a soft SF novel where little self-flying robots deliver stuff to your house, or you can ask a computer a question in a natural voice and reasonably expect a civil, natural-language answer sourced from global databases.

Ah, but all that works. If AI is perpetually defined to be "that which does not work" then it's perpetually potential, perpetually postulated, perpetually possible perfection. It never has to be compared to a clunky translation, or a drone that gets shot down. Unwritten novels never have story problems in the third act, unwritten programs never give ludicrous output.

lambdatronics · on June 14, 2020

Yes, but you're missing the point. In 1950's sci-fi, those marvels were possible because there was imagined to be something like a general artificial intelligence behind the technology. We've achieved narrow AI, but the perception is that in order to get it, we would already have general AI, which is why people are disappointed.

robbrown451 · on June 14, 2020

Whether 50s sci-fi imagined or implied other things is irrelevant to the question of whether or not the current capability described (point camera at sign, get translation) qualifies as AI.

The point is we have current things that are quite amazing, and would at one time have been considered to be the sort of thing that only an AI would be able to do, and yet we keep moving the goalposts. As if AI is defined as "that which humans can do but machines can't, done by machines".

jpttsn · on June 15, 2020

But was it the “capability” that qualified as AI in the 50s? Or was the capability just one example of what the AI could do?

Suppose we said we’ve invented Jesus because we’ve invented ways to walk on water and turn it into wine.

robbrown451 · on June 15, 2020

I don't get your Jesus analogy, I mean Jesus is a proper noun for an individual.

A technology is different. Its capabilities pretty much define it. Unless you are going to try to get all philosophical about it and say it doesn't count if it doesn't experience qualia or something (which is nonsense) Or, unless you are defining it in ways that specifically call out the implementation details. A helium balloon isn't a hot air balloon, not because it doesn't have capabilities, but because you've specifically said in the name that it must use hot air.

jpttsn · on June 17, 2020

If a client/muggle asks me to build an “AI” I’ll be wary that their “spec” is sometimes just examples of what the “AI” should be able to do: play chess, write a poem.

In their mental model, the AI is far from defined by these capabilities. They won’t be happy unless there’s an actual AGI whose capabilities happen to overlap with the spec.

So my point is historically we have cheated our way out of defining “intelligence” and instead given necessary but insufficient examples. I think this is the mechanism behind the goalpost-shifting in “AI”.

That’s the metaphor: it would clearly be absurd for engineers to define Jesus by some examples of capabilities. People who are waiting for the second coming won’t be satisfied.

In the lab we maybe should define AI by some set of capabilities. But clients, journalists etc. picture the Hollywood version, and narrowly fulfilling their spec won’t actually satisfy them.

0x262d · on June 14, 2020

this is also one-sided though - the translation is cool but only kind of works and the quality varies a lot based on language, becoming unusably bad for many languages. so there's some amount of dismissal by AI boosters of its failures that people have grown accustomed to.

robocat · on June 15, 2020

> becoming unusably bad for many languages

I presume that is not due to a failure of the ML, but instead is often due to the lack of a good corpus of text, or the lack of appropriate manual training/nudging/correcting of the model for unprofitable languages.

austinjp · on June 14, 2020

a.k.a. "the AI effect", Larry Tesler (mis)quoted by Douglas Hofstadter.

zby · on June 14, 2020

This sounds like what was writtein in 2001 about Internet. I don't know - videoconferencing is still a problem, but there is progress.

seibelj · on June 14, 2020

I’m a longtime AI skeptic who has been arguing passionately against the doom-sayers, many in my own family and in casual conversations with laymen, for several years. I stand by this article I wrote which summarizes my views https://medium.com/@seibelj/the-artificial-intelligence-scam...

The hype on AI was absolutely astonishing. I’m glad it’s finally coming back to reality.

dgb23 · on June 14, 2020

The more I understand AI technology the less magic and powerful it seems.

More traditional AI is often logic or algorithm based. This is powerful and expressive, but the "intelligence" part is as flexible as the specific implementation. This is intelligent in the sense of a design being intelligent, which can be attributed to the author of said design.

Modern AI is often statistical. We have more data, so we can use that to generate a de-facto decision making database with statistically sound heuristics.

These techniques are all useful and can be very impactful. But we have to understand them as tools.

Your article agrees with this sentiment in a very entertaining way.

fxtentacle · on June 14, 2020

That article is well argued and I am inclined to agree. But then I noticed that we seem to strongly disagree on other topics.

I find it fascinating that you can both see AI as what it truly is (a rebranding to solicit investment) yet still cheer for cryptocurrencies, which I personally believe to be not much more than a pyramid scheme to dupe those investors that buy in late. In my opinion, both AI and crypto are quite similar in the ways that they mislead investors by promising a golden future.

seibelj · on June 14, 2020

I am a longtime crypto guy, and totally 100% understand how you see the links. I have a very nuanced opinion on this, but it's difficult to explain briefly here. There are a lot of scams in crypto, but if you are a believer in Austrian economics and are not a fan of Keynesian theories, fiat currency, and the Federal Reserve, then Bitcoin / crypto is very attractive.

But that's a separate discussion, unrelated to my opinion of AI and is primarily a philosophical issue. Cryptocurrency / blockchain is fascinating but we are not promising robots cleaning your house.

fxtentacle · on June 15, 2020

I agree that AI has been way worse with the overpromising.

But when I hear IPFS = interplanetary file system and then wee how poorly it performs in practice and that it's mostly used for illegal content, I cannot help but think that the crypto side also likes to oversell their practical utility.

I believe I have yet to see an application where the Blockchain is truly a critical component. In most cases, it seems that people end up caching its data in sql to speed things up, meaning that they're working on their own private copy now.

Vinceo · on June 15, 2020

Why do you believe that cryptocurrencies are not much more than a pyramid scheme?

fxtentacle · on June 15, 2020

With the current transfers fees, it is difficult to do much apart from rarely moving large payments. For old-school currencies, that would be called speculation.

If you now consider that both the mining difficulty grows over time and the mining reward drops, then you clearly have a system with a built-in advantage for early adopters.

So people buy in, wait a bit, then exit at a higher price. But that only works as long as you have a large enough stream of newcomers.

I'd be willing to consider it an investment if there was some sort of inherent value on the other side, so if buying cryptocurrency gave you a claim of ownership on something real. But the critical fundament of cryptocurrency is that it's unrelated to the real world and only controlled by its members. In other words, the value of a cryptocurrency is determined exclusively by what people believe it should be.

mathgenius · on June 14, 2020

I agree with your skepticism, while also acknowledging that A.I. in all its incarnations is a powerful tool. But the argument against AI, apart from "this is obviously limited", is much more subtle than what you are claiming in your article. (IMO, it comes down to quantum physics... Hopefully one day this will become obvious to everyone when we all have quantum computers.)

ilaksh · on June 14, 2020

People are going to complain that AI is lame right up until the point that it gets general enough to make them all irrelevant in terms of work productivity. Then rather than modifying society to distribute the gains, they will leave the outdated structures in place and try (too late) to suppress it.

At no point (until it's too late) will there be be effective legislation discouraging the creation of fully general and autonomous digital persons that compete with humans.

LetThereBeNick · on June 15, 2020

With that attitude I would like to pledge my undying allegiance to you

logicslave · on June 14, 2020

Amazon retail is currently in the process of baking advanced deep learning models into every area of their retail process. Its all becoming automated. Just because its not talked about openly doesnt mean its not happening.

These people at the economist are completely out of the loop.

rb808 · on June 15, 2020

I'm still not sure what successful AI implementations there have been. Stuff like Amazon/Spotify recommendations seem sensible. Is there anything else out there that is impressive?

seibelj · on June 15, 2020

Autocomplete on emails which allows me to tab over for common statements. But I agree - the places where AI has improved business can be counted on with one hand, and each finger is extremely thin.