A lot of people took this to mean the API was going to keep running.
This isn't strictly a lie, because the person who says it genuinely believes it. But the problem is that he doesn't stick around to follow through. Maybe he gets promoted. Maybe he's the "acquisitions guy" and he's moved on to the next juicy startup to tell them the same thing.
The guy you actually end up working for doesn't care about your startup. To him, you're just employees of the firm who got hired a slightly different way than the Graduate Trainee Fast Track Programme he came in on. It's not that he resents you as outsiders or anything; he just doesn't see why you shouldn't be treated, and act, the same as everyone else. And since you work for him now, you either go along with that or vote with your feet.
I'll be checking out the alternatives and resources you guys are posting though, HN is great!
Fuck you facebook...
Often the secret sauce is not the algorithms themselves, but the availability of (massive/broad) training data.
All we need is someone to auth the app allowing the "view photos" and "view friends photos" permissions and then crawl all of them for photos with faces in them. The return gives us a pixel location for the center of each face, which you then compare against what your detector returns. Keep iterating on that over the millions of photos Facebook grants you and you should get to Face.com-level accuracy fairly quickly!
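Very roughly, the comparison loop could look something like this (everything here is a placeholder, and I'm going from memory that the Graph API reports tag x/y as percentage offsets, so double-check that):

    # Sketch of the tag-vs-detector comparison described above.
    # Assumes you've already pulled each photo's tag positions and run
    # your own detector to get candidate face boxes.

    def tag_center_px(tag, img_w, img_h):
        # Facebook photo tags store x/y as percentages (0-100), I believe.
        return (tag["x"] / 100.0 * img_w, tag["y"] / 100.0 * img_h)

    def detection_hits_tag(box, center):
        # box = (x, y, w, h) from your detector; center = (cx, cy) of a tag.
        x, y, w, h = box
        cx, cy = center
        return x <= cx <= x + w and y <= cy <= y + h

    def score_photo(tags, detections, img_w, img_h):
        # Fraction of tagged faces covered by at least one detection.
        centers = [tag_center_px(t, img_w, img_h) for t in tags]
        if not centers:
            return None
        hits = sum(any(detection_hits_tag(d, c) for d in detections) for c in centers)
        return hits / float(len(centers))

Aggregate that score over lots of photos and you have a cheap proxy for recall to tune against.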
Does anyone have any recommendations about which algorithms I should use? Would I be sampling Haar-like features? (I've only worked with feature training a few times.)
It would be cool to train for other objects and features as well. Cars, body parts, animals...
You could stratify it and charge for categories, pre-built features, etc.
The dataset needed for useful training on video, images, and audio will easily range into the tens of gigabytes or more.
This service would probably be best modeled on the mechanical-turk approach.
Worth noting is that YouTube is kind of a source already.
The LabelMe dataset: http://labelme.csail.mit.edu/instructions.html This is close to what you were thinking (and what I liked) -- having the annotation done by the public.
A list: http://www.computervisiononline.com/datasets
PASCAL VOC: http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2011/
CALTECH 256, and several others. Many papers presented in CVPR 2012 (http://www.cvpr2012.org/) were indeed on very large datasets.
Do note that some of these datasets are fairly large (may or may not be as broad though).
The algorithms you use will depend on the complexity you can handle. OpenCV, IPP, and CCV (which was on Hacker News a few days back) could provide good options for algorithms, paired with a training dataset you choose/create.
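To the Haar-features question above: the quickest way to get a feel for it is OpenCV's stock Viola-Jones frontal-face cascade. A minimal sketch (the cascade file ships with OpenCV, so the path depends on your install):

    import cv2

    # Load the pre-trained Haar cascade for frontal faces.
    cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

    img = cv2.imread("test.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # scaleFactor and minNeighbors are the usual knobs for trading
    # recall against false positives.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("out.jpg", img)

Training your own cascade for other object classes is the same idea, just via opencv_traincascade and a lot more labeled data.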
ImageNet is another big one: they have images labeled according to the WordNet ontology, and the dataset is still growing. They also have more detailed annotations on subsets of the data for various things (objects, attributes, etc.).
If you're interested, results on the Pascal VOC and ImageNet Large Scale Visual Recognition Challenge are considered to be the state of the art in computer vision (and therefore the most promising). If you look at the actual performance of the best contenders, you'll find that reported accuracies are often in the teens or twenties. These numbers are likely over-optimistic. So for most practical purposes, general object recognition doesn't work. At all.
However, there are some domain-specific approaches that work to some degree. Faces are one example. Another is pedestrian detection, which is getting somewhat mature at this point. A third is plant recognition, as in my previous project, Leafsnap.
My personal feeling is that there isn't going to be "one algorithm to rule them all", but rather a collection of dozens of algorithms to deal with the most common classes of objects (faces, people, cars, buildings, animals, text, etc.), and then a set of other algorithms for recognizing all other classes, depending on each class' characteristics (shape, material, configurations, variability, etc.)
Finally, a word of caution. Many young PhD students in vision start out with dreams of building such a website/service, and they have all had their hopes crushed. Different classes often need to be dealt with quite differently, and it's not even clear what kind of API would cover all the different cases. So if you're actually serious about this, I would recommend limiting your focus to a few specific domains, and also considering some actual use-cases so that you have something concrete to aim for (i.e., "build an API so that application X is possible").
And only a 30-day notice? For shame. Are there any worthy alternatives? Maybe even some I can pay for, so they might stick around longer?
My idea is to start with a similar API for face detection, using a fork of OpenCV's Viola-Jones algorithm (available on GitHub), and work on reducing the false positive/negative rates.
NO business model, only donations to cover the AWS service cost.
"Nature makes its way"
Our society/economy is currently not advanced enough to support free services, nor are there many ways to hide/divert costs enough that they don't matter. Eventually any free service must come to an end.
Thankfully, we're making an -- eventually open-source -- alternative to Face.com's REST API:
For my purposes, though, the REST API is more useful, since my problem is not connectivity or bandwidth but processing power. Right now my software is running on a Raspberry Pi, which can just barely run motion for capturing the pictures; I would need a real computer to run a full OpenCV install. That is why the face.com API suited my needs so well.
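For what it's worth, the offloading pattern on the Pi side is tiny -- just ship whatever frame motion saved to a remote detection endpoint and keep the JSON answer. The URL and response shape below are placeholders for whatever service ends up replacing face.com:

    import requests

    def detect_remote(frame_path, endpoint="http://example.com/faces/detect"):
        # Post the raw JPEG bytes and return the list of face boxes.
        with open(frame_path, "rb") as f:
            resp = requests.post(endpoint, data=f.read(),
                                 headers={"Content-Type": "image/jpeg"})
        resp.raise_for_status()
        return resp.json().get("faces", [])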
Are you planning on offering facial recognition along with the face detection or just focusing on face detection for now?
A startup testing some sort of hypothesis about a product or a market should get it out in front of people as soon as possible, and that should include using things like this. If they shut down, they shut down.
By that point, if you find some sort of product/market fit, you can always build your own face recognition, or switch to an alternate provider. If your product is a bust, then who cares if the API is gone?
Either way, there's no sense in a startup spending time building anything until it becomes the very highest-value thing they can do. There should be too many novel, risky hypotheses to test to worry about testing the one like "our team can build an adequate facial recognition service."
They provided a good service for free, and made it way easier than other ways of doing CV.
If you were actually interested in this project space, I'm not sure why you would use anything other than face.com's API.
If you're interested in face recognition, I suggest checking us out. We offer an SDK rather than an API, usually with a two-month free trial. More hassle, but less worry that an API will suddenly shut down with no warning.
Meanwhile, I'm already working on the first iteration of a REST API for face detection using OpenCV.
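A rough sketch of what a first endpoint could look like with Flask on top of OpenCV's Haar cascade -- the endpoint name and JSON shape are just my guesses, not anything face.com actually exposed:

    import cv2
    import numpy as np
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

    @app.route("/faces/detect", methods=["POST"])
    def detect():
        # Expect raw image bytes in the request body.
        data = np.frombuffer(request.get_data(), dtype=np.uint8)
        img = cv2.imdecode(data, cv2.IMREAD_GRAYSCALE)
        if img is None:
            return jsonify(error="could not decode image"), 400
        faces = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=5)
        return jsonify(faces=[{"x": int(x), "y": int(y), "w": int(w), "h": int(h)}
                              for (x, y, w, h) in faces])

    if __name__ == "__main__":
        app.run()

Reducing the false positives/negatives beyond what the stock cascade gives you is the actual hard part, of course.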
This work was just presented at the Conference on Computer Vision and Pattern Recognition (CVPR), the leading conference in vision. This paper was an oral presentation (< 5% of submissions are), and having worked in this space, I can say this is the real deal -- their results are incredible, and I've already heard that they are indeed reproducible.
Normally, face algorithms operate independently, in a sort of pipeline: face detection ("where in the image are the faces") -> pose detection ("which direction are the faces looking?") -> fiducial detection ("where are the eyes, nose, mouth, etc. on each face?") -> alignment ("warp the faces to make them have a more similar pose") -> recognition ("what is the identity of each face?").
This method does the first three steps in an integrated way (face, pose, and fiducial detection), and achieves really impressive performance with many fewer training examples than most commercial systems (including face.com and Google's Picasa). This is a huge deal, because normally more training data = better performance, and also because by doing all three steps together, the numbers reported are much less optimistic than they normally are in such papers (which always assume that things "upstream" happen perfectly).
The need for less data matters especially for open systems, like many in the comments here are suggesting building, because sharing images of faces runs into copyright and privacy issues. As a company, you can collect a large dataset of images, and if you never share it with anyone, then it's okay to use for training your algorithms. But if you're building an open consortium/system, then almost by definition the training images have to be shared, which is a big problem because now you're limited to the very small set of available data that is cleared for such use.
As far as code goes, there is Matlab code available on the linked page, but it's not clear what its license is. By default, I would assume it's "for research purposes only", but the paper goes into a fair amount of detail on the method, which would allow people to reproduce it from scratch if they're worried. The approach itself is quite similar to the traditional "flexible part model" that is the basis for most top-performing object recognition methods (co-invented by one of the authors of this paper, btw), and the modifications to deal with faces are not very complicated.
Face recognition is still very much an unsolved problem, and while the face.com guys had some interesting approaches, it is not clear that they were necessarily the right way to go. And a large part of getting recognition right is getting all the previous steps right, so building on this work would be a good place to start.
Also, one way of building a recognition system ("whose face is this?") is using verification ("are these two faces of the same person?") as the key inner loop. If you take this approach, then you should pay close attention to the results on the Labeled Faces in the Wild (LFW) benchmark, which is the current de-facto standard that vision researchers evaluate on: http://vis-www.cs.umass.edu/lfw/results.html
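To make the verification-as-inner-loop idea concrete, identification over a gallery is basically an argmax over pairwise scores. A minimal sketch, where verify is whatever pairwise scorer you have and the threshold is a made-up number you'd tune on LFW-style pairs:

    def identify(query_face, gallery, verify, threshold=0.5):
        # gallery: list of (person_id, face_crop) pairs.
        best_id, best_score = None, float("-inf")
        for person_id, face in gallery:
            score = verify(query_face, face)
            if score > best_score:
                best_id, best_score = person_id, score
        # Below the threshold, call the face unknown rather than force a match.
        return best_id if best_score >= threshold else None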
Since this is my area of expertise, I'm happy to answer any other questions that people might have.
I hope it'll be back on the Graph API; too good to let go...