
Artificial Intelligence Software Is Booming, But Why Now? - retupmoc01
http://www.nytimes.com/2016/09/19/technology/artificial-intelligence-software-is-booming-but-why-now.html?ref=technology
======
Houshalter

        Date            Approximate cost per GFLOPS (inflation-adjusted to 2013 US dollars)
        ---------------------------------------------------------------------------------
        1961            $8.3 trillion
        1984            $42,780,000
        1997            $42,000
        April 2000      $1,300
        May 2000        $836
        August 2003     $100
        August 2007     $52
        March 2011      $1.80
        August 2012     $0.73
        June 2013       $0.22
        November 2013   $0.16
        December 2013   $0.12
        January 2015    $0.06

~~~
NhanH
I believe the NN/deep learning renaissance started around 2011 -- IIRC
AlexNet was released in 2012, and Google's famous paper on detecting cats in
YouTube videos came out around that time too.

It seems like the drop in costs from 2007 to 2011 was steeper than in the
previous period. What happened in 2007? Or was it just a slowdown during the
early 2000s?

~~~
Houshalter
The drop in 2011 was due to GPUs taking over. (See my source:
[https://en.wikipedia.org/wiki/FLOPS#Cost_of_computing](https://en.wikipedia.org/wiki/FLOPS#Cost_of_computing)
) I believe the gaming industry has (indirectly) heavily subsidized AI's
progress.

Nevertheless, prices still fell by a factor of 16 between just 2000 and 2007,
so the 2000s weren't stagnant.

------
thr0waway1239
TLDR from the article, which supplements eli_gottlieb's comment:

"Much of today’s A.I. boom goes back to 2006, when Amazon started selling
cheap computing over the internet. Those measures built the public clouds of
Amazon, Google, IBM and Microsoft, among others. That same year, Google and
Yahoo released statistical methods for dealing with the unruly data of human
behavior. In 2007, Apple released the first iPhone, a device that began a boom
in unruly-data collection everywhere."

The combination of smartphone + cloud created a virtuous (for AI, that is)
cycle of unending data collection, which fed improvements in theoretical
research models that needed larger-scale data to be validated.

~~~
vonnik
AI is moving fast. Faster than it has in a long time. In research circles,
"supervised learning", or data classification problems, are considered solved.
Consider the magnitude of that statement: given sufficient labeled data, we
can accurately predict those labels on data the model has never been exposed
to: fraud, faces, diseases, you name it.
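
To make that contract concrete, here is a minimal sketch in scikit-learn (the
dataset and classifier are arbitrary placeholders, not anything from the
article):

    # Fit on labeled data, then predict labels on data the model has never seen.
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)                # labeled examples
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
    print(model.score(X_test, y_test))                 # accuracy on held-out data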

AI is moving fast for three reasons, which Andrew Ng summarizes neatly: 1) We
have more data than ever before (some of which has been organized in fabulous
datasets like ImageNet), much of which is being generated online or by
sensors. 2) We have more powerful hardware than ever before -- this is the
combination of distributed computing with GPUs. You could say that AI advances
at the speed of hardware, or at least is limited by hardware advances. Brute
force AI like deep learning is the main beneficiary. 3) We have seen a steady
stream of algorithmic advances for years. Specialists despair of keeping up
with the literature, research in AI is so feverish.

Others have pointed out additional factors: open-source projects that lower
the barrier to entry for the algorithms; cloud computing services that open up
access to the hardware. The chokepoint right now, and what's slowing wider
adoption, is skills: many companies don't have the teams to implement AI well.

Some of those companies are also the noisiest about selling AI. I have deep
doubts about some vendors' ability to deliver on the hype they're encouraging.
And I believe that will lead, not to an AI winter, but to a poisoning of the
well that the AI sector will have to address for years to come.

But overselling and hype are inevitable symptoms of real advances in tech, and
in some ways, the tech is advancing faster than the hype can keep up with,
precisely because the people hyping it, whether salesmen or journalists, don't
understand the true extent of what's going on.

AI is just math and code. In a sense, you could say we've entered the age of
Big Math, which is the next stage after Big Data. The math is necessary to
process the data and determine its meaning. The math takes the form of massive
matrix operations that can be processed on the parallel calculators known as
GPUs. To call it statistics, as the reporter of the piece does, is only
partially true. The math involved in AI comes from probability, calculus,
linear algebra and signal processing. It's more than fancy linear regression.
And it's definitely more than just making predictions about customer behavior,
however attractive that is for industry.
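
To make "massive matrix operations" concrete, here is a toy numpy sketch of
one neural-network layer (sizes made up for illustration); a GPU runs the same
multiply across thousands of cores at once:

    import numpy as np

    batch, n_in, n_out = 256, 1024, 512      # illustrative sizes
    X = np.random.randn(batch, n_in)         # a batch of inputs
    W = np.random.randn(n_in, n_out)         # learned weights
    b = np.zeros(n_out)                      # learned bias

    # One layer is a matrix multiply plus a bias, then an elementwise
    # nonlinearity -- exactly the workload GPUs parallelize well.
    hidden = np.maximum(0, X @ W + b)        # ReLU(XW + b)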

Asking why now about AI software is like asking why now about cars after the
Model A came out. Because it's there, it's faster than horses, and it makes
you look cool. Like cars or any other powerful technology, AI is part of a
race, and that race is taking place between nations and companies. You can
decide not to adopt AI, the same way newspapers decided to ignore the
Internet, or the way the French decided to fight German panzers with mounted
cavalry in WWI. There really is no choice about whether to adopt AI-driven
software. It's a question of when, not if. And for many companies, the when is
now, because later will be too late.

To give one example of how fast it's moving: Deep learning has been widely
thought to be uninterpretable, or without explanatory power, but that is
changing with cool projects like LIME:
[https://homes.cs.washington.edu/~marcotcr/blog/lime/](https://homes.cs.washington.edu/~marcotcr/blog/lime/)
Which is to say, for some problem sets, we'll be able to combine the
impressive accuracy of DNNs with the reasons why they reached a given decision
about the data.
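
For the curious, usage looks roughly like the sketch below, assuming a tabular
classifier is already trained (`model`, `X_train`, `X_test`, `feature_names`,
and `class_names` are placeholders, not part of LIME itself):

    # pip install lime
    from lime.lime_tabular import LimeTabularExplainer

    explainer = LimeTabularExplainer(X_train,
                                     feature_names=feature_names,
                                     class_names=class_names,
                                     mode='classification')

    # Perturb one instance, fit a local linear model around it, and report
    # which features pushed the prediction which way.
    exp = explainer.explain_instance(X_test[0], model.predict_proba,
                                     num_features=5)
    print(exp.as_list())   # [(feature description, weight), ...]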

On the hardware front, NVIDIA and Intel are racing to build faster and faster
chips, even as startups like Wave Computing or Cerebras come out with their
own, possibly faster chips, and Google works on TPUs for inference.

~~~
apsec112
"the way the French decided to fight German panzers with mounted cavalry in
WWI"

Quick note: I think this is historically inaccurate. In World War One, the
Germans had essentially no tanks, unlike the British and French. In World War
II, there's a popular rumor that Polish (not French) cavalry forces charged
German tanks at Krojanty, but this is itself a myth (the cavalry attack was
against infantry, tanks arrived afterwards).

~~~
YeGoblynQueenne
>> In World War II, there's a popular rumor that Polish (not French) cavalry
forces charged German tanks at Krojanty, but this is itself a myth (the
cavalry attack was against infantry, tanks arrived afterwards).

Not just any cavalry- Polish winged hussars:

[https://en.wikipedia.org/wiki/Polish_hussars](https://en.wikipedia.org/wiki/Polish_hussars)

Obviously, they could totally take a tank single-handedly, but they have now
ascended into heaven to party with the Valkyries.

(It is known.)

------
eli_gottlieb
Because it's been just about 10 years since they figured out how to:

1) Use convolutional layers, ReLUs, and a few other tricks to ameliorate the
vanishing gradient problem (see the toy illustration after this list),

2) Perform continuous, high-dimensional stochastic gradient descent on
graphics cards, and

3) Apply these things via stochastic grad-student descent to sufficiently
massive datasets that even the most brute-force models and training methods
(backpropagation of errors on a loss functional) can actually work.
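
A numbers-only toy illustration of point 1, with a made-up depth: backprop
multiplies one local derivative per layer, sigmoid's derivative is capped at
0.25, and ReLU's is exactly 1 on any active path:

    depth = 30
    sigmoid_grad_max = 0.25   # max of s'(x) = s(x) * (1 - s(x))
    relu_grad = 1.0           # ReLU derivative wherever the unit is active

    # Backpropagation multiplies one such factor per layer:
    print(sigmoid_grad_max ** depth)   # ~8.7e-19 -- the gradient vanishes
    print(relu_grad ** depth)          # 1.0 -- the gradient survives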

In the meantime, the hardware for doing it has become commoditized and the
software has consolidated and become standardized. So now it's A Thing in
industry.

There are lots of "smarter" algorithms that almost definitely come closer to
human cognition, for instance probabilistic program induction. But _those_
aren't _fast_, and don't always neatly separate training from prediction:
you're just not gonna be able to train those models ahead-of-time on a 10,000
image corpus inside a single week with today's hardware and software.

We need to find ways to make machine learning fast even when it's not just a
bunch of matrix multiplies. Otherwise, every time we make our algorithms more
interesting, we cripple ourselves computationally.

~~~
rspeer
I'm concerned about something similar, where a lot of AI techniques seem to be
rushing toward a local maximum:

* AI researchers do things that get good results faster out of Nvidia cards

* Nvidia makes their cards faster at the things AI researchers are doing

It's getting good results. We sure are going up this gradient quickly. But I
don't think it's going to get us to a global maximum.

~~~
dave_sullivan
I agree w/ you about moving up a gradient quickly w/ the "GPU manufacturing
<-> deep learning research" feedback loop. I think it could last a while
though. One really important area of research is figuring out how to take
better advantage of greater capacity. Also, how to do more with fewer training
samples (zero-shot, one-shot, etc. learning). Then there's reducing the
precision of the units you're using to increase capacity through software.
Applying these
algos to video, audio, media generation, and others will eat up all the
resources you can throw at it; the algos today could take advantage of larger
capacity when applied to time-series. There's so much going on that I don't
see it slowing down for at least 3-5 years.
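
On the reduced-precision point, a minimal numpy sketch (sizes arbitrary):
halving the width of each number halves the memory and bandwidth it needs, so
the same card holds roughly twice the model:

    import numpy as np

    weights32 = np.random.randn(4096, 4096).astype(np.float32)
    weights16 = weights32.astype(np.float16)   # same values, half the bits

    print(weights32.nbytes // 2**20)   # 64 (MB)
    print(weights16.nbytes // 2**20)   # 32 (MB) -- double the capacity per GB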

Also, I'd like to point out that we've seen some big breakthroughs in the past
10 years. But for the past 10 years, the whole field of deep learning has been
looked at with skepticism and has been very niche. Over the past couple of
years, a lot of money and resources have been put in place to pursue this area
of research. More money doesn't necessarily mean more results, but there are
many many more people working on these problems than ever before. A lot of
them are legitimately brilliant researchers in the prime of their careers. I
think there's still more to see.

I am concerned about an Nvidia monopoly around deep learning hardware. They
give away tons of free cards to deep learning research groups, but at some
point they'll want people to start buying. I assume they expect that will be
the enterprise set, but if suddenly they manage to move all their capable deep
learning cards to Teslas only (which have a huge markup), it will put the
hobbyist deep learning developer at a big disadvantage. The only check and
balance on that is the fact that Nvidia makes cards for gamers too, and gaming
card competition is still somewhat robust, so any technology that gives
enterprise a big boost will probably make its way to their flagship gaming
cards quickly. Nvidia's only real competition, AMD, is so far behind that they
might as well not be in the business. As someone who usually roots for the
underdog, it pains me to see AMD fumble so badly in this whole area.

Quantum computing could be a new hill, but I think that's a ways out and I
don't know enough about the topic to speak with any real confidence.

~~~
dragandj
As someone who does (non-NN) ML on AMD, I might ask why you think that AMD is
so far behind. In my experience, their GPU hardware is excellent, maybe even
better than Nvidia, especially if you factor the price in. Where they ARE
lagging heavily is:

1. GPU as a service: while all major providers (AWS, Azure) offer Teslas on
their servers, there is no AMD in the cloud (that I know of).

2. Key libraries: Nvidia ships matrix libraries and cuDNN out of the box,
while for AMD there are only open-source offerings that are a bit difficult
to manage.

But if you write your own software or rely on open source, AMD is quite
performant and affordable. The problem is that it is really obscure. So, yes,
they are really bad at marketing, and if you are looking at them as a user
instead of as a developer, they are invisible.

~~~
dharma1
The key libraries part has kept me wondering. How much would it cost for AMD
to assign a handful of heavy-duty engineers to this task (writing
AMD-optimised kernels for convolution etc.)?

Their management has been fast asleep for at least two years.

~~~
dragandj
That is part of the problem: they assigned people to the task, and produced
_open-source_ libraries for matrices, FFT, maybe even something for DNNs. But
those are not very polished, and you have to hunt them down and install them
yourself. And they do a really bad job of marketing them.

On the other hand, finding and installing those libraries is nothing compared
to actually developing GPU computing software, so, as I said, if you want to
program GPUs, even the scattered state of the AMD platform is not that much
worse than Nvidia's. Because with Nvidia, you install CUDA and you get
everything set up. And then - what do you do? You still have to learn the
not-so-easy black art of optimizing kernels for the actual hardware.

~~~
dharma1
I haven't seen cuDNN equivalents (in terms of performance) for common machine
learning frameworks from AMD. I don't think they exist; if they did, people
would shift to using AMD.

For Nvidia I have seen some kernels faster than the ones Nvidia itself
supplies -
[https://github.com/NervanaSystems/neon](https://github.com/NervanaSystems/neon)
- though cuDNN introduced Winograd kernels too in its last update.
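
For reference, the Winograd idea is to trade multiplies for adds by sharing
transformed terms across neighboring outputs. A hedged numpy sketch of the
smallest 1-D case, F(2,3) (two outputs of a 3-tap filter in 4 multiplies
instead of 6); the 2-D kernels in cuDNN and neon build on the same identity:

    import numpy as np

    def winograd_f23(d, g):
        # Two outputs of a 3-tap filter over d[0..3], using 4 multiplies.
        m1 = (d[0] - d[2]) * g[0]
        m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
        m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
        m4 = (d[1] - d[3]) * g[2]
        return np.array([m1 + m2 + m3, m2 - m3 - m4])

    d, g = np.random.randn(4), np.random.randn(3)    # input tile, filter taps
    direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                       d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
    print(np.allclose(winograd_f23(d, g), direct))   # True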

~~~
dragandj
I do not know about the quality of this since I do not use NNs, but there is
[https://github.com/hughperkins/DeepCL](https://github.com/hughperkins/DeepCL)
and I think I have seen others.

------
intrasight
I'm still not comfortable calling the thing that is booming "artificial
intelligence". It is mostly pattern recognition and classification.
Intelligence is something else.

~~~
eternalban
Yes, "signs of intelligence" include _creativity_ , _pedagogy_ , ..., and of
course, getting bitten by the bug of existential angst.

Accept no subsititute.

~~~
mbrock
It seems pretty likely that we will start to see "General AI" discovering
problematic things like the unsolvable nature of ethics questions, the
ungroundedness of truth claims, the immense silliness of religions and
ideologies, etc.

Which might be a good thing. Terrifying visions of AI are all about certainty
and the authoritarianism it creates.

If existential depression is a mental friction from the lack of certainty
about what to do, then some measure of it is probably necessary...

~~~
eli_gottlieb
>Which might be a good thing. Terrifying visions of AI are all about certainty
and the authoritarianism it creates.

Indeed, may the gods protect us all from some things actually being true and
other things actually being false. That would be terrible!

~~~
eternalban
(I sense/assume a missing /s in your post, Eli.)

The objection here would be that entertaining that we can assert T/F of _all_
propositions, given results of _halting problem_ [computation],
_incompleteness_ [formalism], and _uncertainty_ [physics], is unreasonable.

~~~
eli_gottlieb
>The objection here would be that entertaining that we can assert T/F of all
propositions, given results of halting problem [computation], incompleteness
[formalism], and uncertainty [physics], is unreasonable.

This displays a radical misunderstanding of the phenomena mentioned. The
Halting Problem and logical incompleteness are the same thing underneath, and
while they do hold in all formal systems, this never actually matters for non-
meta-level mathematics. Basically any theorem about a structure we actually
care about will be sub-Turing-complete, and with modern type theories, we can
tear down inductive types and rebuild them with stronger axioms when we need
to. As a result, some self-referencing theorems are true-but-unprovable, but
we never actually need those theorems.

Uncertainty in physics is either just probabilistic (in the case of typical
experiments), or, in the special case of Heisenberg uncertainty... no
actually, that's just probabilistic imprecision too. That's what Heisenberg's
inequalities _actually say_: "the product of the standard deviations of these
measurements must always be at least this much."
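
Concretely, for position and momentum that inequality reads:

    \sigma_x \, \sigma_p \;\ge\; \frac{\hbar}{2}

i.e. a bound on the spread of repeated measurements, not a ban on knowing
things.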

Nowhere are we encountering the kind of _radical, existential_ "we can't know
anything and had better give up" uncertainty without which /u/mbrock seems to
think we will all fall into political authoritarianism. He's been reading too
many liberal philosophers of World War II, or maybe just watched that BBC
thing "Dangerous Knowledge" and took it seriously.

~~~
eternalban
It seems _irrelevant_ that, as of now, "any problem we care about" is sub-
Turing-complete. We're still taking baby steps as a species, and the relevant
timelines in the context of this thread are futuristic. Agreed?

The point is that maximal insistence on "we can't know anything" and on "we
can know everything" are both equally unreasonable positions.

> Nowhere are we encountering the kind of radical, existential "we can't know
> anything and had better give up" uncertainty without which /u/mbrock seems
> to think we will all fall into political authoritarianism. He's been reading
> too many liberal philosophers of World War II, or maybe just watched that
> BBC thing "Dangerous Knowledge" and took it seriously.

S/he can address your concerns -- that was not my intent, but hopefully you
agree that unreasonable insistence on maximalist positions inherently carries
the danger of "political" authoritarianism.

Your _practical_ position reminds me of a purported exchange between
Wittgenstein and Turing, per Hewitt [ref]. I am sympathetic to it. But a
reminder again that my initial comments here were in the context of
intelligent machines. In fact, as I am writing this little note, I am
entertaining the thought that a veritable intelligent machine may in fact
review the three (methodological) constraints noted, and shrug them off as
practically unimportant. Or it may become alarmed. :)

[edit/p.s. ref: [http://lambda-the-ultimate.org/node/4302](http://lambda-the-ultimate.org/node/4302)]

~~~
eli_gottlieb
>It seems irrelevant that, as of now, "any problem we care about" is sub-
Turing-complete.

It's not irrelevant, it's a question of how you view the Church-Turing-Deutsch
thesis: can hypercomputation occur in the physical world? If it can, then why
and how are we somehow blocked from utilizing its physical manifestation to
reason about the relevant questions of real-world events? If it can't, then
why aren't systems with finite, if large or even growing, Kolmogorov
complexity thus sufficient for all reasoning about the real world?

>S/he can address your concerns -- that was not my intent, but hopefully you
agree that unreasonable insistence on maximalist positions inherently carries
the danger of "political" authoritarianism.

My objection is precisely that insisting on radical ignorance has led to an
obnoxiously enforced liberalism.

>Your practical position reminds of a purported exchange between Wittgenstein
and Turing, per Hewitt [ref].

I think Turing is just plain wrong here. Real mathematics did not originate by
receiving ZFC with first-order logic as a revelation at Sinai and
extrapolating theorems and structures from there! It began by formalizing ways
to reason about the real world around us. When those original informal methods
became insufficient in the foundational crisis of the late 19th and early 20th
centuries, _then_ mathematicians started inventing foundations to unify
everything without paradox.

Notably, someone in the comments thread then mentions Curry's Paradox, which I
looked up and found _surprisingly_ underwhelming. Curry's Paradox is a
_perfect_ example of what Wittgenstein called a meaningless language game!
Classical implication X -> Y isn't always equivalent to the existence of a
causal path "X causes Y", but natural-language implicature _mostly_ talks
about causal paths, so conflating the two in symbolic logic derives a
"paradox".

>In fact as I am writing this little note, I am entertaining the thought that
a veritable intelligent machine may in fact review the 3 (methodological)
constraints noted, and shrug it off as practically unimportant. Or it may
become alarmed. :)

I don't think that a machine limited to reasoning within a single mathematical
structure or foundation can really qualify as "intelligent" in the human
sense. Logical foundationalism is the wrong kind of thing to constitute a
capacity to think.

------
Animats
We don't marvel over the ATM reading a handwritten paper check correctly.
That's a considerable achievement.

~~~
SapphireSun
From the first time it started doing that, I've been amazed that that works so
well in a production system. Every. Time.

~~~
Animats
The first time I saw it, I thought they had people in some call center doing
the reading. But they don't, at least not often.

The US Postal Service used to have 55 centers where humans tried to read
envelopes that the machines couldn't. They're now down to one. "We get the
worst of the worst. It used to be that we’d get letters that were somewhat
legible but the machines weren’t good enough to read them. Now we get letters
and packages with the most awful handwriting you can imagine."

[1] [http://www.nytimes.com/2013/05/04/us/where-mail-with-illegib...](http://www.nytimes.com/2013/05/04/us/where-mail-with-illegible-addresses-goes-to-be-read.html?_r=0)

~~~
singularity2001
Very interesting.

>> equipment that can read nearly 98 percent of all hand-addressed

That number must be pretty outdated.

------
nkozyra
1. Availability and _accessibility_ of large amounts of training data.
Without this, training and validation are expensive if not impossible. Now if
you don't have the data, you can acquire it yourself. Leading to ...

2. Computational speed & storage upgrades. This applies largely to physical,
time-critical things like automated driving. The self-driving car could have
had all the data it needed in 1980 to do its thing, but required fast
computers and lots of data storage to do it safely in real time in a feasible
commercial product.

3. Advancement of algorithms. Fervor and excitement around AI/ML have been on
a slow but perhaps exponential burn. This has led to the refinement of
algorithms that largely sat dormant from the late 80s (and earlier) until
fairly recently. This also means lots of open source libraries for people who
wish to implement without caring about the underlying mechanisms behind the
algorithms.

These things are leading more people to dabble recreationally and
commercially.

------
benhamner
Ready access to high-quality use cases and training data, along with shared
knowledge of the methods that work well on them, helps:
[https://www.kaggle.com/competitions?sortBy=numberOfTeams&gro...](https://www.kaggle.com/competitions?sortBy=numberOfTeams&group=all&page=1&site=main)

(disclaimer: I work at Kaggle)

------
erikb
I thought we were already over that cliff of becoming forgotten. Yes, we
didn't call it AI two years ago. But has there been such a big technological
change since then at the Googles, Facebooks, and Twitters of this world? I
think the problem is that it is actually so transparent that you really can't
see it when it is applied to you.

------
bozoUser
"Democratizing AI" is the word being thrown around a lot by CEOs, PR, et al.
But only time will tell whether it's true democracy.

~~~
marvin
As far as I know, the concept of "democratizing AI" comes specifically from
the OpenAI initiative, which intends to make new developments in AI technology
broadly available and not tied to any single company or entity, in order to
ensure that no single entity has control over how this technology can be
used.

Given that Google, Apple, Facebook and others have vastly more data than any
independent project, and therefore have stronger AI (currently limited to e.g.
speech recognition, image recognition and other low-impact applications), the
state of democratization of AI by this measure is poor today.

------
azinman2
Somehow machine learning got rebranded, and now it's important... and major
products are putting prediction into the UI more.

------
rsrsrs86
Such vacuity

------
matk
So, the influx is due to a plethora of things, which are not all mutually
exclusive:

1) The Internet has indeed produced large data sets which allowed statistical
AI approaches to flourish opposed to logical AI approaches.

2) Somewhat better algorithms. However, I'd say that algorithmic development
hasn't made as much progress as the comments suggest. We've only had a few
notable innovations, like NMT, in the past ~20 years.

3) Computational power to run the algorithms, so we can perform more
experiments on large data sets, run more computationally intensive algos, and
induce better hypotheses.

4) Libraries like scikit-learn and keras have "democratized AI". Grad students
used to implement algorithms themselves in the 2000s; now middle-schoolers are
doing ML with the tooling already available.

That's basically it. I think (2) could even be dropped, because again:
learning algorithms have barely changed, IMO.

