I did an internship where one of my tasks was to automatically remove the background in X-ray images. I spent basically half a year studying image processing and segmentation, reading papers, skimming image-processing books, and so on, and never came across the word "matting", which was exactly what I was doing.
Somebody should've told me that earlier; it would've probably saved me a month.
2. The network may rely on differences in lighting, etc., and fail to generalize.
The state of the art in natural image matting already handles fine details, as does work on image segmentation and clustering. Reimplementing papers from 10-12 years ago would give much better results than what he shows here.
Insert whatever background you like (as long as the result looks natural). That way you automatically get a good pixel map of the subject, and with a video camera at 30 fps you could capture thousands of training images in just a few minutes.
Of course, you might run into overfitting (the network might learn what you as an individual look like and fail to generalize to other people), but as long as you green-screen a reasonably large number of people this shouldn't be an issue.
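The pipeline above is just chroma keying plus compositing. A minimal sketch in NumPy, assuming 8-bit RGB frames and a near-pure green screen (the `threshold` value and both function names are made up for illustration; real keyers work in a chroma space and produce soft alpha edges):

```python
import numpy as np

def alpha_from_green(frame, threshold=80):
    """Crude binary alpha: 1 where a pixel is NOT dominated by green.

    `threshold` is an assumed tuning parameter, not from the comment.
    """
    r = frame[..., 0].astype(int)
    g = frame[..., 1].astype(int)
    b = frame[..., 2].astype(int)
    greenness = g - np.maximum(r, b)
    return (greenness < threshold).astype(np.float32)

def composite(frame, background, alpha):
    """Paste the keyed subject over an arbitrary background image."""
    a = alpha[..., None]  # broadcast alpha over the color channels
    return (frame * a + background * (1.0 - a)).astype(np.uint8)
```

Each `(composite(frame, bg, alpha), alpha)` pair is one training example: the composited image is the network input and the alpha map is the target.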
Edit: darn. Looks like I wasn’t the only person to think of this idea!
It would be interesting to take the output of this and use the alpha mask as the starting point for the grabcut mask.
The end result is that we preferred to use the blank white wall that was behind us.
Thank you for the comments :)
This is currently just a side project in alpha; we are trying to secure funding to take it further.
Glad people are having fun with it though :)