"You are facing north in the center of the sidewalk, toward the intersection of Main and 1st Streets. Standing 40 feet in front of you is a brown dog, and 72 feet in front of you a woman is standing on the other side of the road. Twenty-seven feet ahead is the front door of McDonald's. Would you like to hear the latest Google reviews of that restaurant? Your apartment building entrance is 157 feet ahead. Thirty-four feet ahead and to your left, a graffiti artist has tagged a brick wall with a QR code pointing to the goatse website, and there is a billboard advertising the latest Star Trek movie 50 feet up and to your right. Also, it is dark and you are likely to be eaten by a grue."
Google 'acquired' Professor Hinton along with the two co-authors of this paper earlier this year.
This is the paper that did unsupervised training of a deep net on frames from YouTube videos, and found it had autonomously developed detectors for, among other things, human faces and cats. Jeff Dean is a coauthor.
- Google+ queues images for recognition. Results improved steadily over 72 hours.
- Google+ does not use OCR of text in the images. That surprised me. But perhaps it's a privacy issue.
- Google+ does use information gleaned from elsewhere on the web. Words that were associated with the same images on Flickr would turn up those very pictures on Google+.
- Oddly, Google+ does not use information associated with those images on Twitter.
- Google probably uses EXIF data married to a database of location names.
- The much-vaunted feature recognition is impressive, better than any other system, but for me did not achieve creepy levels of intuition.
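On the EXIF point above: cameras store GPS coordinates as degrees/minutes/seconds plus a hemisphere reference, so matching a photo against a database of location names first needs those converted to decimal degrees. A minimal sketch of that conversion, with an invented example coordinate (nothing here comes from Google's actual pipeline):

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South and West hemispheres are negative in decimal notation.
    return -value if ref in ("S", "W") else value

# Hypothetical photo tagged 37 deg 25' 19.07" N, 122 deg 5' 6.24" W
lat = dms_to_decimal(37, 25, 19.07, "N")
lon = dms_to_decimal(122, 5, 6.24, "W")
print(round(lat, 4), round(lon, 4))  # → 37.422 -122.0851
```

Once you have decimal coordinates, "married to a database of location names" is just a nearest-place lookup.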
Historically, error rates of around 20-25% won competitions and set records. A year or two ago, though, some researchers and professors from the University of Toronto absolutely smashed those records, getting around a 16% error rate. They spun their tech out into a startup, which Google acquired a few months ago.
I think that this is going to be the first of a long line of Google products integrating this sort of deep neural network technology. I wouldn't be shocked if Google in 10 years was known for something besides search, at this rate.
If I'm flipping through my album of dog photos, or looking especially closely at dogs via Google Glass, maybe Google will show me an ad for dog food?
But seriously, how paranoid can people be? If anyone really wants to get your data, do you really think it's safe on your server or on your local machine?
No one is suggesting that Google is going to hack into your machine to get your data, nor that what they want to do is out-and-out unpleasant; it's that what they want to do is in their interest, either instead of or as well as yours.
Until we work out which it is (instead of, or as well as), I think a healthy questioning of what might be happening is reasonable.
The Google computer has been reading about these concepts for years; now we know it can see them in pictures (and maybe even in live video). That excites me to the point where it becomes a little bit scary. When will that computer learn the concept of "self"?
Update: Google actually seems to understand the concept of "golden retriever". I searched my photos with that word and, yes, Google at least knows what golden retrievers look like.
The data is available for querying, and is licensed such that you can take it and build your own commercial database with it (requiring only attribution).
"I'm sorry Eric, I'm afraid I can't do that."
The technology discussed in that article is about deducing the existence of a common feature, in this instance a cat, from a large collection of unlabelled images.
Mostly unlabelled then, which means you can learn to generalise over a huge number of images but learn labels on a smaller set.
"We applied the feature learning method to the task of recognizing objects in the ImageNet dataset (Deng et al., 2009). After unsupervised training on YouTube and ImageNet images, we added one-versus-all logistic classifiers on top of the highest layer. We first trained the logistic classifiers and then fine-tuned the network. Regularization was not employed in the logistic classifiers. The entire training was carried out on 2,000 machines for one week."
Basically you learn features in unlabeled data, then identify the features your trained net is recognizing with labeled data. When you run over g+ images, you then only tag with features you're sure of past some threshold of certainty.
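The tagging step described above can be sketched as a confidence gate: some upstream network produces a raw one-versus-all score per label, a sigmoid turns each score into a confidence, and a photo is tagged only with labels that clear a threshold. The scores and labels below are invented for illustration; this is not Google's actual classifier.

```python
import math

def sigmoid(x):
    """Map a raw classifier score to a confidence in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def confident_tags(label_scores, threshold=0.9):
    """Keep only the labels the classifier is sure about."""
    return sorted(
        label for label, score in label_scores.items()
        if sigmoid(score) >= threshold
    )

# One-versus-all scores for a single photo (made up).
scores = {"cat": 3.2, "dog": -1.5, "golden retriever": 2.6, "car": 0.4}
print(confident_tags(scores))  # → ['cat', 'golden retriever']
```

A high threshold trades recall for precision, which is exactly what you want when a wrong tag on someone's photo is embarrassing and a missing tag is invisible.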
"Abstract: We consider the problem of building high-level, class-specific feature detectors from only unlabeled data."
There are tons of photos of these places online, many of them tagged ("breaking news from the dome of the rock", or "here's me and Sam at the western wall"). Collect enough of these and you can attach knowledge to images. Then, you just have to know two images look similar, and you have your classification.
Neither of the above is easy - nay, it's very hard. But once you have those two building blocks, this technology is viable. And it's very exciting!
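The two building blocks above can be sketched together: a corpus of tagged web photos, each reduced to a feature vector by some upstream network (the vectors and labels here are invented), and a cosine-similarity lookup that labels a new image by its closest tagged neighbour. This is an assumption-laden toy, not the real system.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_label(query, corpus):
    """corpus: list of (feature_vector, label) pairs from tagged photos."""
    return max(corpus, key=lambda item: cosine(query, item[0]))[1]

# Feature vectors for two landmark photos, labelled from web captions (made up).
corpus = [
    ([0.9, 0.1, 0.0], "western wall"),
    ([0.1, 0.8, 0.2], "dome of the rock"),
]
print(nearest_label([0.2, 0.7, 0.3], corpus))  # → dome of the rock
```

The hard parts, of course, are the two things this toy assumes for free: harvesting reliable labels from noisy captions, and learning feature vectors under which "similar" actually means "same place".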
It's very impressive.
They wrote an algorithm that takes that data and recognises new images with it. As long as there is a way for us to flag inaccurate matches, it should be able to continue to learn. I imagine any flagged matches are reviewed carefully.
How did they do that?
Make search sound like Scotty explaining a warp core on Star Trek.
(In other words, I consider Google a technology company, and Apple a marketing one.)