
This Startup Has Developed a New Artificial Intelligence That Can Beat Google - aaron_p
http://www.forbes.com/sites/aarontilley/2017/02/14/gamalon-artificial-intelligence-bayesian/#17edb182b78c
======
rahimnathwani
" If you want to train the system to recognize a cat, for example, show it
10,000 pictures over and over again of every possible variation of what a cat
looks like. But don't show it a dog, because then the AI will get really
confused."

Umm, that's not how machine learning (or human learning) works. You can't
learn to recognise something after seeing only positive examples.

~~~
candiodari
Humans seem to be pretty good at it, so why not?

If I show a kid 100 elephants and one dolphin, it will in fact be aware that
a dolphin is not an elephant. I bet it would still be aware of that even if I
included the dolphin in the sequence of elephants and claimed it was an
elephant. I was curious, and I have a kid, so I tried it: I was pretty sure
they didn't know the difference between a bee and a praying mantis, so I used
those (Google Images makes that really quick). "That's a different kind of
bee, daddy" is what came out. Now of course, you can say kids have already
seen lots of examples of animals and therefore have plenty of negative
training data "stored in their memory". That's not how the mind works, though:
it doesn't go away and retrain itself on all of that.

And it's easy to devise an algorithm that does that too. Instead of using the
positive examples to draw a decision boundary directly, you use them to train
both positive and negative features.

You can have independent features, positive features and negative features.
Deep learning as it's commonly used gives you positive features, but really
backprop just moves features away from being independent of the input data.
Learning the opposite question is just as easy. So combine the two: train half
of your features as positive and half as negative. That way you can train,
even with deep learning, on just a few examples, because backprop will convert
independent features into either positive or negative ones.

So now let's say I do that with 1000 9x9 convolutions and I train half of them
positive and half of them negative. Would it be able to tell me a 64x64
dolphin is not an elephant? Interesting question. I need to try, but I have
little doubt that this will succeed and get a decent success rate at
identifying dolphins as non-elephants.
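
Something like this is what I have in mind (an untested sketch; all the names
and shapes are made up, and the training signal is left out since that's the
contested part):

    import torch
    import torch.nn as nn

    class HalfPosHalfNeg(nn.Module):
        # Hypothetical sketch: 1000 9x9 convolutions, the first 500 treated
        # as "positive" (elephant) features, the last 500 as "negative" ones.
        def __init__(self, n_filters=1000):
            super().__init__()
            self.conv = nn.Conv2d(3, n_filters, kernel_size=9)
            self.half = n_filters // 2

        def forward(self, x):                # x: (batch, 3, 64, 64)
            acts = torch.relu(self.conv(x))  # (batch, 1000, 56, 56)
            pooled = acts.mean(dim=(2, 3))   # one activation per filter
            pos = pooled[:, :self.half].mean(dim=1)
            neg = pooled[:, self.half:].mean(dim=1)
            return pos - neg                 # > 0 leans "elephant", < 0 leans "not"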

It is even possible to do deep online learning, especially if you don't care
about different images having unequal weights. And there are plenty of
algorithms that can do online learning by themselves as well. In fact, the
quintessential reinforcement learning algorithm is pretty much the answer to
the question "how can I update my probability distribution exactly, given this
extra piece of data?"
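
As a toy example of what I mean by updating a distribution exactly as each new
piece of data arrives (a standard conjugate Beta-Bernoulli update, nothing to
do with the article's system):

    # Uniform prior over p("the next thing I see is an elephant"),
    # folded together with observations one at a time.
    alpha, beta = 1.0, 1.0
    for obs in [1, 1, 0, 1]:           # 1 = elephant, 0 = not an elephant
        alpha += obs                   # exact posterior update, no batch needed
        beta += 1 - obs
        print(alpha / (alpha + beta))  # current posterior mean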

~~~
majewsky
> If I show a kid 100 elephants, and one dolphin, it will in fact be aware
> that a dolphin is not an elephant.

Yes, because it has learned to distinguish the class "elephant" from the
thousands of other things that it has already seen. If all the kid had _ever_
seen were 100 elephants, I'm pretty sure it would have a darn hard time
telling dolphins and elephants apart. You dismiss this line of thinking, but
consider the following:

Imagine having an entirely untrained sense. For example, let's say that I
convert the image data of the elephant images into sound patterns and play
them to you. After hearing all 100 elephants, it would be impossible for you
to recognize the sound of a dolphin image, because there is no negative sample
to tell elephant sounds apart from. All you could tell is that it's definitely
an image rendered as sound, as opposed to all the other sensory inputs that
you've seen examples of.

~~~
candiodari
Why wouldn't I be able to train negative classifiers? Note that I always have
"negative" training data available: background noise, for one, and I'm sure I
can generate 10 other kinds, but I don't think that's even necessary.

Backprop moves things away from equilibrium. It can do so in both directions.
It can make a network more accurate, or more "inversely" accurate.

I guess what you mean is that seeing only positive examples will lead to a
network that always answers "yes". And that is indeed a concern. So perhaps I
should adjust my answer to hedge for that:

If the network trains a good positive feature for elephants, it will be able
to train a good negative feature as well.

~~~
rahimnathwani
"Note that I do always have "negative" training data available"

Yes, _you_ do, but the model being trained with _only_ cat pictures doesn't.

"If the network trains a good positive feature for elephants, it will be able
to train a good negative feature as well."

I'm not sure what you mean here. Let's say a positive feature for an elephant
is a patch of grey. What's a good negative feature? And how would the model
learn that the presence of that feature should make it less likely that what
it's seeing is an elephant?

~~~
candiodari
Think of the surface a neural net optimizes over during training, and assume
it's a 3-dimensional surface. That surface is going to have peaks going up and
valleys going down.

Backprop gives you the gradient with respect to the error. You can use that to
lower the classification error, but there's nothing preventing you from going
the other way. You would normally descend into the valleys, but you can just
as well climb the mountain to find features that are extremely unlikely to be
found on an elephant, and with more training you can find stuff that's
commonly found on non-elephant things and train those features on that.
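
To make the "both directions" point concrete, here is a rough sketch (the tiny
net and the score are just placeholders, not anything from the article):

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
    x = torch.randn(8, 10)            # stand-in batch of inputs
    loss = -net(x).mean()             # pretend this is the "is an elephant" error
    loss.backward()                   # backprop gives one gradient...

    lr = 0.01
    with torch.no_grad():
        for p in net.parameters():
            p_down = p - lr * p.grad  # ...descend it to reduce the error, or
            p_up = p + lr * p.grad    # ...ascend it to push the features the other way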

What I mean is that if it only sees positive examples, it might conclude that
anything at all is an elephant, regardless of what it actually is. That's why
I have to qualify it by saying a "good" feature.

Furthermore, there are ways to generate good negative examples from nothing
but the positive input data. For instance, you can use random static as an
input. Given that static (correctly generated) does not have any information
content, there's nothing in there that's an elephant, and you can train the
network away from it. Another obvious way to generate more data is reverse
training: determine what input data would make the network say "that is
absolutely not an elephant" after seeing a few elephants. Assume it gets that
right, that it isn't an elephant. Insert some static to generate more data.
Take that input, flip it, rotate it, zoom it, add static, change the color
distribution, and train those as non-elephants.
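
As a rough sketch of just the random-static part (everything here is made up,
and the reverse-training and flip/rotate steps are left out):

    import numpy as np

    rng = np.random.default_rng(0)
    elephants = rng.random((100, 64, 64, 3))  # stand-in for the real elephant pictures
    static = rng.random((100, 64, 64, 3))     # pure noise: nothing in there is an elephant
    x_train = np.concatenate([elephants, static])
    y_train = np.concatenate([np.ones(100), np.zeros(100)])  # 1 = elephant, 0 = not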

Likewise, take the input pictures. There should be more in them than just
elephants. So figure out which values trigger the network (not all pixels
matter for the end result; in fact, most won't). Find the ones that don't
matter across all pictures and collage them together. Train those as
non-elephants. Do the same in reverse (find the elephant pixels, and use those
to paste more elephants into other pictures).
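
The "which pixels matter" part could be sketched with an input gradient,
something like this (made-up names; a real version would need far more care):

    import torch

    def unimportant_pixels(net, image, threshold=1e-3):
        # image: (3, H, W); net maps a batch of images to an "elephant" score.
        image = image.clone().requires_grad_(True)
        net(image.unsqueeze(0)).sum().backward()
        saliency = image.grad.abs().sum(dim=0)  # per-pixel influence on the score
        return saliency < threshold             # True where the pixel doesn't matter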

You need to do this to keep your data balanced: there should be an equal
number of generated positive and negative training items.

I'm not sure how you could get the adversarial network approach to work for
this.

At this point, you have still not given it any negative example, but there's
lots to train on.

~~~
rahimnathwani
"Find the ones that don't matter for all pictures, and collage them together.
Train as non-elephants."

You're just creating negative examples manually, and then feeding them to the
model.

------
staticelf
How do you measure intelligence? What do they mean by "100 times more
efficient" than TensorFlow? Isn't TensorFlow just a framework that you create
AIs with?

From my understanding we don't even have a good definition of human
intelligence, so how can we measure it, let alone artificial intelligence?

If someone has answers for me, who has zero experience in the AI field, I
would be happy.

~~~
miltondts
Here is an interesting definition of intelligence, and a way to measure it:
[http://www.vetta.org/documents/AIQ.pdf](http://www.vetta.org/documents/AIQ.pdf)

An informal definition from the paper: “intelligence measures an agent’s
ability to achieve goals in a wide range of environments”

There is also a formal definition and a method of measuring it.
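
If I remember the paper correctly, the formal measure is Legg and Hutter's
universal intelligence, roughly:

    \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi

where E is a set of computable environments, K(mu) is the Kolmogorov
complexity of environment mu, and V is the expected reward the agent pi
achieves in mu. AIQ approximates this by sampling environments on a simple
reference machine and running the agent in them.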

EDIT: a blog with a summary of the paper and source code
[http://www.vetta.org/2011/11/aiq/](http://www.vetta.org/2011/11/aiq/)

------
majewsky
When I visit that site, it blurs out everything and shows me a Quote of the
Day that I don't care about. How can I actually read the article?

~~~
matart
There's a "Continue to article" button in the top right corner.

------
mchahn
> Maybe down the line it will consider offering its Bayesian machine learning
> technique to outsiders, but it would be a hard sell as the tech industry has
> adopted deep learning systems fanatically.

I call BS. Build a better mousetrap and ...

------
amai
Seems to be related to
[https://news.ycombinator.com/item?id=13644441](https://news.ycombinator.com/item?id=13644441)

