I was going to make the same request.
The singularity should be discussed where relevant, not added to everything.
This paper produces high-level features from noisy data in an unsupervised fashion -- a human still needs to indicate the task the features should be targeted at, and a human still needs to provide labelled training data for these high-level features to be of use.
This work is interesting enough to warrant detailed discussion on the topic at hand, large scale machine learning, rather than just rehashing discussions of the singularity.
Added: As I can't reply to the comment below, I'll do it here =] The network provides learned representations that are discriminative.
The aim of the network is to learn high-level features representative of the content.
One of the many features it produced was one which accurately indicated the presence of a face in the image.
Note that they said train a face detector and not classify.
For example, from the same network there was a feature which accurately detected cats, yet they didn't explicitly train a cat detector either (see the section "Cat and human body detectors").
As the network represents the content as generic features it is clear that, if it reaches a high enough level, those features are essentially classifications themselves.
tldr; High-level features generated by this unsupervised network are so high-level that one of them aligns with "has a face in the image", others with "has cat in image", etc., but these features cannot be used without labelled training.
Actually, what's significant about this work is that labeled training data was not required:
"Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not."
I replied by adding to my comment above, as it wouldn't let me reply at the time. See that comment.
tldr; High-level features generated by this unsupervised network are so high-level that one of them aligns with "has a face in the image", another with "has cat in image" (see the section "Cat and human body detectors") and so on.
Note however that they select the "best neuron" for face classification -- the only way they can do that is by using labelled data and testing all the neurons (where each neuron's activation is a feature).
Thus, these features cannot be used without labelled training.
But the difference is that you can show it 1 billion unclassified images, then show it 1000 images you know to be faces, analyze how its neurons respond to the known inputs, and use that to classify the rest of the images.
Strictly speaking, you do need to have some labeled data at the end in order to determine how the neural net views faces, but I think that obscures what's notable about this system.
The amount of human participation involved in training is potentially six or more orders of magnitude less. That's a breakthrough, and a change in kind, not just degree.
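To make that concrete, here is a minimal sketch (in Python) of the neuron-selection step being described, on synthetic data. Everything in it is hypothetical: extract_features is a stand-in random projection rather than the paper's network, and the labels are random, so the reported accuracy is meaningless -- only the selection procedure itself is the point.

    import numpy as np

    rng = np.random.default_rng(0)

    def extract_features(images):
        # Stand-in for the unsupervised network: maps each image to a
        # vector of neuron activations. A fixed random projection with
        # a ReLU keeps the sketch self-contained and runnable.
        proj = np.random.default_rng(42).standard_normal((images.shape[1], 64))
        return np.maximum(images @ proj, 0.0)

    # A small labelled probe set, standing in for "1000 images you
    # know to be faces (or not)" shown to an already-trained network.
    probe_images = rng.standard_normal((1000, 256))
    probe_labels = rng.integers(0, 2, size=1000)   # 1 = face, 0 = no face

    activations = extract_features(probe_images)   # shape (1000, 64)

    # "Testing all the neurons": score each one by how well a simple
    # threshold on its activation separates the two labels.
    best_neuron, best_acc = None, 0.0
    for n in range(activations.shape[1]):
        preds = (activations[:, n] > np.median(activations[:, n])).astype(int)
        acc = max((preds == probe_labels).mean(),
                  ((1 - preds) == probe_labels).mean())  # either polarity
        if acc > best_acc:
            best_neuron, best_acc = n, acc

    print(f"best neuron: {best_neuron}, probe accuracy: {best_acc:.2f}")

Note that the probe set only evaluates neurons; it never updates the network's weights, which is why the feature learning itself still counts as unsupervised.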
In a more general response: I don't think what I stated obscures what's notable about the system; I feel I stated exactly what was notable, and I specifically avoided overstating it.
Overhyping when it comes to machine learning and AI seems to be the norm and has already hurt AI/ML severely in the past[1].
More specifically: I didn't disagree with anything you've stated, I simply pointed out that labeled training data is necessary, in response to the statement that it wasn't. The high-level feature extraction the paper discusses is unsupervised, but the classifiers it produces are semi-supervised. It's an important distinction.
Having a bachelor's emphasis in AI, I think you described it perfectly. I was wondering too, from their abstract, how they were recognizing "faces" entirely without labels; this makes it clearer. As you said, unsupervised they can find extremely high-level categories. That is pretty impressive.
How does this work? I thought neural nets only learned when they got some kind of feedback that let them know whether their classification was right (back propagation).
The neural network in this paper, an autoencoder, doesn't require labelled data.
Autoencoders take high-dimensional input, map it to a lower-dimensional space, and then try to recreate the original high-dimensional input as closely as possible.
The idea is to learn a compressed representation of the data and hope that this compressed representation works as a high-level feature set.
As the model is just trying to represent the original input, no labelled data is required for the initial part. Labelled data is later introduced when the high-level features are used for classification.
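As a rough illustration of that idea, here is a minimal sketch of a purely linear autoencoder trained by gradient descent on synthetic data. This is an assumption-laden toy, not the paper's method: their model is a far larger and deeper sparse autoencoder, and all the data and dimensions below are invented for the example.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic "high dimensional" input: 200 samples of 20-dim data
    # that secretly lies near a 5-dim subspace, so compression can work.
    hidden = rng.standard_normal((200, 5))
    mixing = rng.standard_normal((5, 20))
    data = hidden @ mixing + 0.05 * rng.standard_normal((200, 20))
    data /= data.std()

    # Encoder W compresses 20 dims to 5; decoder V reconstructs 20 dims.
    W = 0.1 * rng.standard_normal((20, 5))
    V = 0.1 * rng.standard_normal((5, 20))

    lr = 0.1
    for step in range(3000):
        code = data @ W                  # low dimensional representation
        recon = code @ V                 # attempted reconstruction
        err = recon - data
        loss = (err ** 2).mean()
        # Gradients of the mean squared reconstruction error.
        grad_V = code.T @ err * (2 / err.size)
        grad_W = data.T @ (err @ V.T) * (2 / err.size)
        V -= lr * grad_V
        W -= lr * grad_W

    print(f"reconstruction MSE after training: {loss:.4f}")
    # No labels were used anywhere above: the columns of W are the
    # learned features.

With a linear encoder and decoder under squared error this learns essentially the same subspace as PCA; stacking layers and adding nonlinearities is what lets learned features become as abstract as "contains a face".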
What's most interesting about this paper is that one of the features learned by the model maps quite well to "image contains a face" without any prompting by the researchers.
Did we? A lot of the early work in AI was ... /overstated/.[1] While a lot of the concepts were created way back when, a lot of the results weren't actually achieved. It's extremely valuable for someone to actually go and do a thing, now that we can, even if someone had the idea for the thing eons ago.
The early work in AI with regards to unsupervised learning was in the 1940s and 1950s. As early as 1949, Claude Shannon demonstrated a chess learning system which taught itself, by playing against him, to defeat him in under two weeks.
No, they weren't overstated. They were hyped by a clueless press. There's a pretty critical difference. It's a bit like how the early web pioneers didn't say that the web was going to revolutionize the delivery of dog food; it was a journalist who said that.
"It's extremely valuable for someone to actually go and do a thing, now that we can"
Self-organized unsupervised learning was in use for optical classification of potatoes in the feeding of Frito Lay's automated processing plants in the late 1970s.
Please distinguish between not actually having looked for earlier examples and imagining that none exist. Thanks.
I find both of your comments extremely condescending, both toward saalweachter and the authors of this article.
1. The fact that Claude Shannon succeeded in training a chess system has virtually no impact on saalweachter's claim that many AI results were overstated.
2. Certainly the press overstated them, which supports saalweachter's premise rather than weakening it. Even if the _implied_ claim was that _researchers_ overstated results, your argument does nothing to weaken this claim.
3. Frito Lay solved a problem several orders of magnitude easier than that of face recognition in natural images, which is still very much an open problem in computer vision.
4. Similar to 1., the Frito Lay example contributes nothing to your goal of weakening saalweachter's claim that this is valuable research--a claim which is exceedingly innocuous.
I understand that you've probably got a bone to pick against the many AI naysayers, and saalweachter's comments conjured a few common misrepresentations (i.e. (a) that the "AI revolution" burnt out because its researchers were somehow naive and (b) that neural networks are something new invented by computer vision researchers). You'd be justified in arguing against these claims, and I'm sure your father (respected AI researcher of the same name) would make them too, if saalweachter had tried to make them (which he didn't). But even if you were justified in making the argument, I would expect a less condescending one that made better use of evidence than the argument you've made here.
"I find both of your comments extremely condescending"
When a comment opens with a tone like this, I usually don't bother to respond, but I'll give you a chance, because you seem to have done a lot of honest misreading.
That said, it may be of value for you to inspect your own tone, if you find public condescension inappropriate.
.
"1. The fact that Claude Shannon succeeded in training a chess system has virtually no impact on sallweachter's claim that many AI results were overstated."
It wasn't meant to. Sallweatcher's claim was silly. Who cares if many things were overstated? That has zero bearing on that valid work was, in fact, being done.
The purpose of that statement was to remind us that as early as the 1940s, machine learning was able to defeat its own creator at what remains today regarded as a highly intellectual pursuit. My goal was to ignore the FUD of "some people got it wrong" as an attempt to suggest that there was nothing right.
Some people always get some of everything wrong. His claim is tautological and disinteresting. I was politely declining to shame him for it, but since you've presented me as having false goals, I now have no choice but to clarify.
It is generally inappropriate, for reasons like these, to chastize strangers over imagined motivations. Frequently, you don't know strangers' motivations as well as you might imagine from a simple read of a few paragraphs.
.
"2. Certainly the press overstated them, which supports saalweachter's premise"
You are now repeating something I said back to me. From that, you are deriving the false conclusion that because a journalist somewhere said something wrong, an important thing has been discovered.
What I'd like to point out is that the net result of observing that journalists made mistakes is still "so what?"
"Even if the _implied_ claim was that _researchers_ overstated results"
It isn't.
"your argument does nothing to weaken this claim."
You have not correctly identified what I was speaking to. This is akin to telling someone discussing environmental damage that some farmer is talking about crop yield and the speaker hasn't weakened their claim.
Again: so what? I never argued that there are journalists who got things wrong. I'm the one who brought it up.
What does that have to do with my original discussion?
.
"Frito Lay solved a problem several orders of magnitude easier that of face recognition in natural images"
Discovering defects in potatoes moving at 45 miles an hour inside a water sluice from a single blurry image from a single angle in hard realtime using 1970s hardware is not several orders of magnitude easier than locating things on a face in slow time on modern hardware.
It's actually quite a bit more difficult even in fair conditions. Potato defects are under the surface, and have to be located by subtle color variation. It is not hard to find the characteristic shape and shadow of the nose.
With respect, sir, it's quite clear that this is not something you've done. You're claiming that easy things are more difficult than hard things, and you're forgetting the 40-year technology gap in between in your rush to show that a 2012 project is more impressive than a 1973 project.
To be clear, Babbage's mechanical calculator is also more impressive than an algebra-solving system made in Prolog. Why? Because it's more work and it's more difficult.
Your claim of several orders of magnitude simpler suggests that you are inventing data for the sake of feeling correct in an argument, and that you do not actually have the experience to make correct guesses in this field. That, combined with a tone suggesting that you feel it appropriate to rebuke strangers in public, suggests that I don't really want to talk to you much anymore.
.
"Frito Lay example contributes nothing to your goal of weakening saalweachter's claim that this is valuable research"
Again, you've misidentified my goal, and the way by which you've done that is to drop a critical piece of his actual claim.
I don't know why you feel that it's okay to guess at people's goals, then tell people how morally wrong your guesses are. I really don't.
My actual goal was to point out the jarring unfamiliarity with the field that both he and you evidence:
"It's extremely valuable for someone to actually go and do a thing, now that we can, even if someone had the idea for the thing eons ago"
The thing I was focusing on was to show him that this thing he's applauding someone for doing for the first time in 2012, now that it's practical, even though it isn't being used in industry, was actually outclassed 40 years ago -- on a much more difficult problem, on much more limited hardware, in realtime -- by a company that nobody would think of as a technology giant.
The goal was to display just how far out of touch saalweachter was with the state of the industry.
Please don't speak to my goals anymore. For someone who'd like to speak about condescension (when I think you actually mean arrogance), for you to tell me what I meant and what I was getting at - incorrectly - then lambast me for it in a tone far more severe than that which you're criticizing is, I admit, difficult to swallow politely.
.
"I'm sure your father (respected AI researcher of the same name) would make them too"
Do not speak for, or involve, my recently deceased father in your attempt to be correct, sir. Especially not while you're telling someone else they're being rude.
"I would a less condescending one that made better use of evidence than the argument you've made here."
Unfortunately, though you suggest this, a brief look through your comment shows that it is not in fact correct. You have been radically uglier than that which you are criticizing, involving personal attacks, false claims about other people's intent, false claims about other people's goals, and the repeated invocation of a recently deceased relative.
I would prefer not to hear from you again. Thanks.
Also, this paper is about 20,000 object categories, not just one (faces). And the neural network is not the standard type but of the deep learning variety, which has only existed since 2006 (invented by Geoff Hinton, who was also big in neural net circles in the 80s, so he's not some newcomer who hasn't done his literature search). One of the co-authors of the paper is Andrew Ng, head of the Stanford AI Lab, so he's pretty legit.
I have no interest in your presenting your unwillingness to do basic research as if it was a valid form of skepticism.
Whether or not you believe me, everyone else just went ahead and took a quick look, and learned something.
Frankly, given your seeming inability to take part in this conversation politely, and your seeming unwillingness to leave it even after being asked to, I would be happier if you went on believing I'm wrong and kept "calling people on this," so that everyone gets early warning of just how much you actually know about this field, instead of having to wait to hear you speak.
"Shannon was a much better chess player than any program available in 1949."
On a technicality, this is correct: he started his work on December 29, and it wasn't until five days later, January 2 of 1950, that it was able to beat him.
All the same, you have no idea what you're talking about, and are asserting your beliefs as fact.
The correct way to handle "that doesn't sound right" is a search engine, not putting your hands on your hips and telling someone they're wrong in public.