
Meet the algorithm that can learn “everything about anything” - ossama
http://gigaom.com/2014/05/23/meet-the-algorithm-that-can-learn-everything-about-anything/
======
mikkom
Here is the actual paper for those interested

[http://levan.cs.washington.edu/ngrams/objectNgrams_cvpr14.pd...](http://levan.cs.washington.edu/ngrams/objectNgrams_cvpr14.pdf)

~~~
_flag
Summary of the paper for those who don't want to read it:

So basically there are two categories of "learning" involved in this sort of
research, supervised and unsupervised. In supervised learning, someone gives
the computer a long list of concepts and their attributes ("frog", "green
frog", "jumping frog") and a set of pictures to go with each item, and feeds
them into a visual-recognition algorithm. In unsupervised learning, the
computer is given a concept like "frog" but then has to discover all the
variations itself and get its own visual data to match.

The claim in this paper is that they have made the unsupervised learning as
strong as the supervised learning. That is, they give the computer a concept
("frog"), it goes and searches through Google Books for common variations
("green frog", "jumping frog") and then uses Google image search to fetch
images for each of those queries. They can then remove the obvious false
positives (they test to see which images seem to screw up their learning
algorithm and leave those out), and the result they get is on par with the
supervised learning methods.
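
In rough pseudocode, the loop they describe looks something like this (a minimal sketch; the helper functions are hypothetical placeholders for the ngram lookup, image search, and detector-training steps, not the authors' actual MATLAB code):

    # Sketch of the webly-supervised loop summarized above (hypothetical helpers).
    def learn_concept(concept):
        # 1. Mine common variations of the concept from book ngrams,
        #    e.g. "frog" -> ["green frog", "jumping frog", ...]
        variations = search_ngrams(concept)

        # 2. Pull candidate training images for each variation from web image search.
        candidates = {v: image_search(v) for v in variations}

        # 3. Prune obvious false positives: drop images that hurt the detector
        #    when held out, then train on what survives.
        detectors = {}
        for variation, images in candidates.items():
            kept = [img for img in images if not hurts_validation(img, images)]
            detectors[variation] = train_detector(kept)
        return detectors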

\----------------------

In my opinion, this is only mildly interesting because Google Image Search
functions based on human input anyway -- Google knows the difference between a
"frog" and a "jumping frog" or even a "camel" simply because people on the
internet caption such images and Google can make associations between images
and their captions. Essentially, what the researchers have managed to do is
outsource the work of some grad student to millions of people around the world
through Google.

Of course, it could be argued that there is some sort of parallel with what
humans actually do (we know what things are called because we hear other
people call them that), but even if I didn't know the name of an animal I
could still tell you when the same animal is in different pictures, and I can
also tell you when it's jumping and what colour it is. I don't need to have
someone caption the image for me to understand the broad range of situations
to which the caption "jump" applies.

~~~
xerophtye
>I don't need to have someone caption the image for me to understand the broad
range of situations to which the caption "jump" applies.

I wonder if this has anything to do with the fact that we can jump too. That
we can translate the frog's position into something we do as well.

Of course, one can argue that we can do the same for non-anthropomorphic things
as well. What I think is that we don't directly relate pictures, the way the
software is taught to. What we do is translate that 2D picture into something
we'd see in the 3D world. And that 3D "vision" isn't just another image. It
represents an object in our world: something that has shape and existence,
something we can observe with our other senses as well. For us a picture
doesn't always represent an abstract thing, an arbitrary pattern of colours. It
usually represents something concrete, something about which we have tons of
other pieces of knowledge as well.

So we relate pictures by checking if they map to the same real-world object.
And here that "object" is a sort of nexus of many pieces of information we
have on it which is a product of many direct and indirect human experiences.

So I don't really think that we are in a position to teach a computer to do
anything like that.

------
cedias
(Disclaimer: I am not a native English speaker.) The title of the article and
of the algorithm ("learn everything about anything") is a bit misleading. You
might believe that it learns everything, period. Actually it's more about
finding every variation of a "concept". I quote their website: "a fully
automated method that given any concept, e.g., horse, discovers an exhaustive
vocabulary for it that explains all variations (i.e., actions, interactions,
attributes, etc) that modify its appearance."

~~~
Shizka
I wonder: if they have an exhaustive vocabulary, would it be possible to
generate a picture of what the system believes an object looks like? I know
that there is something called a generative model in machine learning and my
guess is that it could be applied here.

~~~
Houshalter
It's possible, but generally generative models have to be trained in a
specific way. If not, you could do something like this: for every layer of the
neural net, train another NN which can "predict" the layer below it, i.e. its
input. Then you can work your way down each layer to try to find an input
which would produce that output.

Another way is to use some kind of optimization to find an input which
produces that pattern (e.g. backprop to the pixels themselves.) This will give
you the image that _most strongly_ triggers that output. Not necessarily a
typical example.
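
To make the "optimize the input" approach concrete, here is a toy numpy sketch: gradient ascent on the pixels of a tiny, random one-layer net (a real network would substitute its own forward pass and gradients):

    import numpy as np

    # Gradient ascent on the input ("backprop to the pixels") to find an input
    # that most strongly drives one unit of a single tanh layer.
    rng = np.random.RandomState(0)
    W = rng.randn(64, 10) * 0.1        # 64 "pixels" -> 10 units
    b = np.zeros(10)

    def forward(x):
        return np.tanh(x @ W + b)

    target_unit = 3
    x = rng.randn(64) * 0.01           # start from a near-blank image

    for _ in range(200):
        h = forward(x)
        # d tanh(z_t)/dx = (1 - tanh(z_t)^2) * W[:, t], by the chain rule
        grad = (1.0 - h[target_unit] ** 2) * W[:, target_unit]
        x += 0.1 * grad                # ascend, not descend

    print(forward(x)[target_unit])     # approaches 1: the unit's "preferred" input

As said above, what you get is the input that maximally excites the unit, which tends to be a caricature rather than a typical example.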

------
cyborgx7
The problem with all approaches to machine learning I see today is that they
only focus on grouping and separating concepts based on certain
characteristics. They all seem to be fundamentally statistical.

None of them seem to work towards a fundamental understanding of what the
concepts mean. I'm not sure how that could be accomplished though.

Is there even a meaningful distinction to be made between being able to
identify a concept and understanding what a concept actually means?

~~~
baddox
I don't think there is a meaningful distinction. Do chess computers have a
"fundamental understanding" of chess, which humans traditionally considered a
benchmark of human intelligence/strategy?

Analogous to the philosophical zombie thought experiment, I think that "real
intelligence/understanding" is indistinguishable from simply being able to
perform actions to accomplish the same tasks that humans traditionally
consider to require intelligence.

Of course, that's one of the PR problems that AI has always had: once a
computer can outperform a human at some task, that task is no longer
considered to be something that requires "true intelligence." Most people
would consider someone who can multiply numbers together to be intelligent,
but when computers do that (incomprehensibly faster and more reliably), few
people consider even for a moment that it's AI. Same with more advanced
mathematics, like computer algebra and automated theorem proving. Same with
facial and voice recognition. And I'm sure it will be the same with self-
driving cars.

~~~
thret
If fictional media is anything to go by, the single defining aspect of human
intelligence is love. This is the last bastion of human understanding that is
incomprehensible to evil, machines and aliens.

~~~
aaronem
Depends on your choice of fictional media; try reading Peter Watts, some time
when you're already not in a good mood.

~~~
thret
Thanks, I'll place Blindsight by Peter Watts in my queue.

~~~
aaronem
That's the place to start; the Behemoth trilogy isn't bad, per se, but it is a
ramshackle thing by comparison, and I think only partly because it has a
larger story to tell.

------
sparky_z
This strongly reminds me of the first chapter of Greg Egan's Diaspora [1]
(which I can't recommend enough) in which newly-formed AIs bootstrap their way
to consciousness in part by connecting randomly to an online library and using
the various data streams to build up an associative model of the world.

[1]
[http://gregegan.customer.netspace.net.au/DIASPORA/01/Orphano...](http://gregegan.customer.netspace.net.au/DIASPORA/01/Orphanogenesis.html)

~~~
resdirector
Seconded. Diaspora is one of the best sci-fi books I've read. Highly recommend
it (and all of Egan's work) to the HN community.

~~~
wfn
Thirded. Before Diaspora, I first read "Wang's Carpets"[1], which is a short
story of his. Then I found out this story had later been incorporated as a
chapter into the book. I remember basically immediately ordering said book
that night.

fwiw, that "Webly-Supervised Visual Concept Learning" reminds me of the stuff
that Hinton et al. do re: unsupervised (concept, etc.) learning (using
restricted Boltzmann machines, and so on.) Good talk on the subject (of deep
learning, etc.):
[https://www.youtube.com/watch?v=AyzOUbkUf3M](https://www.youtube.com/watch?v=AyzOUbkUf3M)

[1]: read online here:
[http://bookre.org/reader?file=222997](http://bookre.org/reader?file=222997)

~~~
MachineElf
Umm.. fourthed? I just couldn't help but jump in and also recommend Greg
Egan's "Permutation City". That book is just wonderful... think simulation,
cellular automata as a model for computation, artificial life and all that
other good stuff :).

Also, about the LEVAN thing... given the amount of data available online, in
both structured and unstructured formats, don't be surprised if deep learning
yields better and better results moving forward. To me, though, these results
mostly seem evolutionary rather than revolutionary. I mean, if you look back at
the AI field in the days before the "AI winter" came, huge amounts of data are
one thing researchers back then didn't have available. This is not to say that
there haven't been advances in learning algorithms recently.

~~~
arethuza
As well as adding my own strong recommendations for Egan's "Permutation City"
and "Diaspora", I would also recommend "Quarantine" - which has a rather
splendid idea for mobile apps: "neural mods" that actually augment the brain's
own cognitive capabilities (including augmenting sensory data for the ultimate
in VR).

And there is what one group chooses to do with a very special neural mod...

------
cscurmudgeon
One of the hallmarks of bad science is overly grand claims paired with
aggressive marketing. Bad times are coming for AI again.

~~~
zwieback
Judging by the number of "deep learning" submissions to HN, bad times for AI
are already here, although maybe they never left.

In defense of the LEVAN thing, though, I didn't see any claims that this is
science at all, more like an exploratory application illustrating an
algorithm.

~~~
agibsonccc
Disclaimer: I have a vested interest in Deep Learning, having built a
distributed deep learning framework[1] and a business around it.

Deep Learning is actually worth the hype though. It has 2 main merits that are
interesting.

1\. Auto Trend Discovery

2\. Plays very well with parallelism

The main problem, which I'm hoping to fix, is feasibility and ease of use.
Neural nets to the untrained eye can be a black box that takes a really long
time to train with little to no reward.

The hype isn't all for naught either. I'll elaborate if asked, but won't bore
you guys otherwise.

The results coming out from different tasks are currently blowing away many of
the old school algorithms in tasks like sentiment analysis, speech to text,
object recognition, among others.

[1] [http://deeplearning4j.org/](http://deeplearning4j.org/)

~~~
throwawaydl
I think it's worth clarifying a few points here - this is a fairly naive view
of deep learning being put forward.

In particular -

1) what does 'automatic trend discovery' mean? We've been able to do change-
point detection, linear regression, etc. for hundreds of years. If you're
talking about automatically learning a feature representation, then there are
other algorithms that can do this, in a much simpler way. If you're arguing
that it produces better representations, then make that argument.

2) This is almost _completely_ false, and indicates a substantial lack of
experience of ML beyond deep learning. Other machine learning algorithms
(SVMs, LR, even some decision tree algorithms) are _much_ easier to train in
parallel - this is (partially) because your objective function has certain
nice properties that allow you to combine partial solutions together that are
produced in parallel (convexity, separability). When you're using gradient-
based methods on an incredibly ugly non-convex function from a multi-layer
neural network, you're in a completely different world.

Granted, there have been techniques coming out for training in multiple
address spaces, but these are _hacks_ to get around the ugly structure of the
problem, not the principled approaches that exist for other algorithms.
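
One toy way to see why combining partial solutions breaks down for neural nets (a sketch, not any particular distributed scheme): hidden units have no fixed identity, so two nets can compute the identical function with their units permuted, and averaging their parameters, which is a safe recombination for convex/separable objectives, gives a different function altogether.

    import numpy as np

    # Two nets that compute the *same* function, differing only by a permutation
    # of their hidden units. Their parameter average computes something else.
    rng = np.random.RandomState(0)
    W1 = rng.randn(4, 8); b1 = rng.randn(8)
    W2 = rng.randn(8, 1); b2 = rng.randn(1)

    def net(x, W1, b1, W2, b2):
        return np.tanh(x @ W1 + b1) @ W2 + b2

    perm = rng.permutation(8)                 # net B = net A with hidden units shuffled
    W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm]

    x = rng.randn(100, 4)
    out_a = net(x, W1, b1, W2, b2)
    out_b = net(x, W1p, b1p, W2p, b2)
    out_avg = net(x, (W1 + W1p) / 2, (b1 + b1p) / 2, (W2 + W2p) / 2, b2)

    print(np.allclose(out_a, out_b))          # True: identical function
    print(np.allclose(out_a, out_avg))        # False: the "combined" net is not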

I don't have any perspective on your deeplearning4j library, but I'm skeptical
of its utility given the existence of well-tested deep learning libraries
written/contributed to by renowned experts in the field (e.g. cuda-
convnet, caffe, torch). This stuff is a) very tricky to get right, b) very
tricky to debug, and c) very performance sensitive. Just a quick pass through
shows zero references to CUDA/GPGPU programming, so I'd suspect performance is
going to be significantly worse than the aforementioned libraries.

~~~
agibsonccc
1\. In this case, we are talking about the pretraining part of neural nets.
Automatic trend discovery comes down to doing feature extraction for the user.
That being said, if I were inexperienced I wouldn't be teaching this stuff[1].
Am I the best machine learning practitioner out there? No. A lot of us aren't.
I am all about making other people's jobs practical though.

Yes, I am talking about learning better representations. See Hinton's deep
autoencoder work, comparing PCA to RBM-based methods for topic detection, as a
prime example[2].
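
For a toy flavour of the idea (just a flat linear autoencoder fit by gradient descent, nowhere near Hinton's stacked, RBM-pretrained version, but it shows "learn a representation by reconstructing the input" and recovers the same subspace PCA finds):

    import numpy as np

    # A linear autoencoder trained on reconstruction error ends up spanning the
    # same subspace as PCA. (Hinton's version stacks nonlinear, pretrained layers.)
    rng = np.random.RandomState(0)
    X = rng.randn(500, 2) @ rng.randn(2, 8)      # data lying on a 2-D subspace of R^8
    X -= X.mean(axis=0)

    # PCA reconstruction onto the top 2 components
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    pca_recon = X @ Vt[:2].T @ Vt[:2]

    # Linear autoencoder: 8 -> 2 -> 8, trained to reproduce its input
    W_enc = rng.randn(8, 2) * 0.1
    W_dec = rng.randn(2, 8) * 0.1
    lr = 0.01
    for _ in range(2000):
        H = X @ W_enc                             # codes
        E = H @ W_dec - X                         # reconstruction error
        g_dec = H.T @ E / len(X)
        g_enc = X.T @ (E @ W_dec.T) / len(X)
        W_dec -= lr * g_dec
        W_enc -= lr * g_enc

    ae_recon = X @ W_enc @ W_dec
    print(np.mean((X - pca_recon) ** 2))          # ~0: the data really is 2-D
    print(np.mean((X - ae_recon) ** 2))           # small: the AE found roughly the same subspace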

2\. Google and people way smarter than I am seem to be doing just fine with
this[4]. That being said, I didn't say that random forests (with whole
companies built on their parallelism[3]) or any of the other algorithms WEREN'T
parallel-friendly. I would say one of the main appeals of deep learning is the
scale of data it can benefit from.

Feel free to be skeptical all you want; if the researchers want to take the
time to write a full-stack distributed framework, I welcome others into the
game. The problem with the packages out there right now (being MATLAB, Python)
is training times and integrating into an actual ecosystem. I'm addressing
this at 2 different talks this year[5][6].

Replying to your last point: I use BLAS underneath for all of the matrix
calculations, I will be adding GPUs later this year, and yes, you're right,
this stuff is hard to make. I also wouldn't be publicizing it if I wasn't
already using it in production applications. Frankly, right now I use CPU
matrices, because I can fire this up on AWS (without the limit of GPU RAM),
and it's practical for Hadoop deployments. Honestly, whether we like it or
not, GPUs take a lot to get right. NVIDIA[7] and AMD[8] are going to make my
job pretty easy though.

To end, if I was afraid of every little obstacle, why do anything in the first
place? While you're hiding behind a throwaway account, I'm actually trying to
put this in the hands of people who don't have the time to learn every little
thing about neural networks. At the end of the day, I follow the papers very
closely and enjoy what I do. I also work on all sorts of different techniques
for different problems combining different machine learning algorithms for
different tasks (just like anyone else would). This framework is my way of
getting this out to everyone else. If you have a deep learning framework, I'd
love to see it; maybe I could learn a thing or 2.

[1]: [http://zipfianacademy.com/](http://zipfianacademy.com/)

[2]: [http://www.cs.toronto.edu/~fritz/absps/esann-deep-
final.pdf](http://www.cs.toronto.edu/~fritz/absps/esann-deep-final.pdf)

[3]: [https://bigml.com/](https://bigml.com/)

[4]:
[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en/us/archive/large_deep_networks_nips2012.pdf)

[5]: [http://hadoopsummit.org/san-jose/schedule/](http://hadoopsummit.org/san-jose/schedule/)

[6]:
[http://www.oscon.com/oscon2014/public/schedule/detail/33709](http://www.oscon.com/oscon2014/public/schedule/detail/33709)

[7]:
[http://www.jcuda.org/jcuda/jcublas/JCublas.html](http://www.jcuda.org/jcuda/jcublas/JCublas.html)
[8]: [http://developer.amd.com/tools-and-sdks/heterogeneous-comput...](http://developer.amd.com/tools-and-sdks/heterogeneous-computing/aparapi/)

------
graycat
If we define 'learning' in an appropriate way, then any good research library
already knows "everything about anything".

Okay, let the thing 'learn' about the Kuhn-Tucker conditions by searching on
Google and reading, say, Wikipedia or some books at Google or Amazon. Then
have the thing show that for problems in functional form the Zangwill and
Kuhn-Tucker constraint qualifications are independent. Do that and I will
start to believe that the terminology 'deep learning' is appropriate. I'm not
holding my breath.

Yes, it may be that in some rough sense the kind of 'learning' it is doing is
roughly like some of the learning of a child of, say, 2 as it is starting to
learn about language and things. Yes, it may be that such 'learning' is a
significant part of the intelligence of, say, a child of 3-5. Maybe. Big, huge
maybe.

When I was working in AI, I noticed the terminology had been cooked up to
imply much more than was being accomplished. Now, as I understand it, there is
a specific definition for the current AI term 'deep learning', and it has to do
with the 'depth' at which parameters are adjusted in a neural network, not how
'deep' the 'learning' is about the subject in question. Cute terminology.

------
nawitus
AIXI-mc can also learn everything about anything. And you can download the
source code online[1].

1\.
[http://jveness.info/software/default.html](http://jveness.info/software/default.html)

------
ZeroFries
This doesn't really help the symbol grounding problem: it uses data pre-sorted
by humans (who use their own ability to match symbols and meaning) to form
its associative network. So, it's using human consciousness as part of the
input to form its own consciousness. You could argue that humans use other
humans' consciousness to develop their own, but now you have the infinite
regress which seems to be the fate of all symbol grounding contemplations.
Surely there has to be a starting point, some "axioms" you initially accept
about the world to start the process. Maybe these are embedded in our DNA and
have evolved to be a practical start (e.g. sharp sensory input from nerves on
your skin is automatically linked with pain, which we automatically avoid).

------
motyar
Where is the code? Is it open source? And which programming language does it use?

~~~
cedias
You have everything here:
[http://levan.cs.washington.edu/?state=show_about](http://levan.cs.washington.edu/?state=show_about)

I quote the readme:

> _This is an implementation of the "Learning Everything about Anything"
> system. The system is implemented in MATLAB, with various helper functions
> written in Shell, Python, MEX C++ for efficiency reasons. For details about
> the method, please see [1].

> This readme contains instructions on using the code, as well as
> accessing/using already trained models for various concepts.
>
> For questions concerning the code please contact Santosh Divvala
> ([http://homes.cs.washington.edu/~santosh](http://homes.cs.washington.edu/~santosh))
> at santosh@cs.washington.edu.
>
> The software has been tested on Linux using MATLAB versions R2011a. There may
> be compatibility issues with older versions of MATLAB. At least 4GB of memory
> (plus an additional 0.75GB for each parallel matlab worker) is assumed._

------
bra-ket
I'm surprised the NELL learner is not cited, as it's closely related:
[http://rtw.ml.cmu.edu/rtw/](http://rtw.ml.cmu.edu/rtw/)

------
clubhi
// learn everything
wget google.com?search=random_string() >> everything.txt

------
dang
This is such a dumb title. Can any of you suggest a better one?

