At Facebook, we are using a similar technique to crop images. The difference is that instead of using the center of all the faces, we find the viewport that displays the largest number of faces. http://blog.vjeux.com/2012/image/best-cropping-position.html
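To make the "viewport with the most faces" idea concrete, here is a brute-force sketch for a crop that slides along the horizontal axis. It is not the implementation from the linked post; the function name, the `(x, width)` face format, and the tie-break toward a centered crop are all assumptions made for illustration.

```python
def best_crop_offset(image_width, crop_width, faces):
    """Return the left offset of the crop window that fully contains the most faces.

    faces is a list of (x, width) horizontal extents in pixels (hypothetical format).
    """
    if crop_width >= image_width:
        return 0
    centered = (image_width - crop_width) // 2
    best_offset, best_count = centered, -1
    for left in range(image_width - crop_width + 1):
        right = left + crop_width
        # Count only faces that fit entirely inside the window.
        count = sum(1 for x, w in faces if x >= left and x + w <= right)
        # Prefer more faces; on ties, prefer the offset closest to a centered crop.
        if count > best_count or (
            count == best_count and abs(left - centered) < abs(best_offset - centered)
        ):
            best_offset, best_count = left, count
    return best_offset
```

With no faces at all this falls back to a plain centered crop, which is probably what you want as a default anyway.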
That's going to be the next step for these, as requests come in for images with a low aspect ratio, which generally just get resized smaller instead of cropped around a face.
Thumbor (https://github.com/globocom/thumbor) is a great Python library for cropping. It uses OpenCV for face detection (as well as other algorithms for finding "interesting" parts of photos).
Thanks for posting that. I haven't seen that one in a while but have built a few. If anyone uses it, I'd recommend not exposing it directly to the internet. Once someone malicious sees all the sizes in the URL, they may try hitting 1..N x 1..M, which is a bit hard on the servers.
We used specifically named sizes to avoid that problem although whitelisting in the edge servers would also work.
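For anyone wondering what named sizes look like in practice, a minimal sketch follows. The size names and dimensions are made up for the example; the point is only that the URL carries a name, and anything not in the table is rejected before any resizing work happens.

```python
# Hypothetical whitelist: the URL carries a name, never raw dimensions.
NAMED_SIZES = {
    "thumb": (100, 100),
    "card": (320, 180),
    "hero": (1200, 400),
}


def resolve_size(name):
    """Map a size name from the URL to (width, height), or None if not whitelisted."""
    return NAMED_SIZES.get(name)
```

A request for `thumb` resolves to a known size, while something like `9999x9999` returns `None` and can be answered with a 404 without touching the image pipeline.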
I implemented something like this with OpenCV late last year using a node.js wrapper. We were looking to do something similar for image cropping. What I found was that OpenCV was kind of unreliable. A large percentage of the time, it would mark knees as faces, or even wrinkles in pictures of women's handbags. It was a side project and we didn't pursue it, so I never figured out whether there was a way to train it over time. If so, did you do this to improve your accuracy?
I also used PIL a few years back to do image generation from text. I kept running into memory fragmentation bugs, weird artifacts creeping into images, etc. I haven't been able to recommend PIL for any serious work since then. Has this gotten better?
We haven't trained our model at all and just use the included XML file. We have adjusted the default neighbors threshold, which helps mitigate the knee factor. Faces tend to be in the 40+ neighbors range from my trials, and the default is 3.
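Roughly, raising that threshold looks like the sketch below. The wrapper function and its name are hypothetical, and the classifier is passed in so the snippet does not depend on how it was loaded; `detectMultiScale`, `scaleFactor`, and `minNeighbors` are the real OpenCV names. The value 40 comes from the trials mentioned above and will likely need tuning per image set.

```python
def detect_faces(classifier, gray_image, min_neighbors=40):
    """Run a cascade classifier with a stricter neighbors threshold.

    classifier is expected to behave like cv2.CascadeClassifier.
    Returns a list of (x, y, w, h) tuples, empty if nothing passes.
    """
    boxes = classifier.detectMultiScale(
        gray_image,
        scaleFactor=1.1,
        minNeighbors=min_neighbors,  # OpenCV's default is 3
    )
    # Normalise to plain tuples; OpenCV returns an array, or an empty tuple on no hits.
    return [tuple(int(v) for v in box) for box in boxes]
```

With OpenCV installed, the classifier would typically be built with `cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")` and the image loaded with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)`.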
We use ImageMagick for cropping and resizing; PIL was just easier for the example. I haven't noticed any issues with PIL, but I also haven't used it for any serious image processing.
I built something like this to automate cropping headshots for players in major sports. I figured it would work flawlessly after I ran a few test images through, but it turns out that it failed to detect a face or gave the wrong coordinates about 20% of the time.
I'm curious to know what the success rate of SeatGeek's process is.
Were you using your own test data or a provided XML file?
Sports shots in our case don't get a great hit rate. Adjusting the `minNeighbors` parameter can help with that, depending on how many false positives you can accept. Misses on musical artists are in the single-digit percentages, although shadows and strange backgrounds can produce some additional faces that we don't really care for.
When collecting images we are now searching for those with more direct faces visible to make the detection easier. After that though we just try to get the face in the direct center and fall back to hoping the face is in that spot if we can't detect any.
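The center-on-the-face-or-fall-back logic can be sketched in a few lines. This is an illustration only, not SeatGeek's code: the function name and the `(x, y, w, h)` face format are assumptions, and it handles a single face.

```python
def crop_box(image_size, crop_size, face=None):
    """Return (left, top, right, bottom) for a crop centered on the face.

    face is (x, y, w, h) or None; with no face, the crop centers on the image.
    """
    img_w, img_h = image_size
    crop_w, crop_h = crop_size
    if face is None:
        cx, cy = img_w / 2, img_h / 2
    else:
        x, y, w, h = face
        cx, cy = x + w / 2, y + h / 2
    # Clamp so the crop stays inside the image
    # (assumes the crop is no larger than the image).
    left = int(min(max(cx - crop_w / 2, 0), max(img_w - crop_w, 0)))
    top = int(min(max(cy - crop_h / 2, 0), max(img_h - crop_h, 0)))
    return (left, top, left + crop_w, top + crop_h)
```

The returned tuple is in the order PIL's `Image.crop` expects, so it can be passed straight through.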
At some point I want to try checking for partial face matches as well, which should help in major sports since we tend not to use headshots.
I hadn't touched the app in about a month. I just updated the library to the most recent version, and I'm now getting a 100% hit rate. I guess the library was a bit buggy.
Also, these headshots are the ones the players take before the season starts, not action shots. In theory there should be a high hit rate.
Beards and sunglasses also tend to cause problems. It's also nice to allow manually setting where the faces are in some web editor, to override what OpenCV might find.