For my Master's thesis, I built a crawler that did similar fingerprinting (although less generic). It wasn't anything breathtakingly novel, but all in all it was a somewhat successful project.
It detected more than 100 CMSs, plus additional features like ad networks, social embeds, CDNs, industry, company size, etc. In the end, you could run a search and get the results as an Excel sheet (because apparently that's what people like).
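For anyone curious what this kind of fingerprinting looks like, here is a minimal sketch of signature matching against a page's HTML. It is not the thesis code: the `SIGNATURES` table and the `detect` function are hypothetical names made up for illustration, and a real crawler would need far more patterns plus checks on headers, cookies and scripts.

```python
import re

# Hypothetical signature table: technology name -> patterns that hint at it.
# These few patterns are illustrative assumptions, not a complete rule set.
SIGNATURES = {
    "WordPress": [
        re.compile(r"/wp-content/", re.I),
        re.compile(r'<meta name="generator" content="WordPress', re.I),
    ],
    "Drupal": [
        re.compile(r"/sites/default/files/", re.I),
        re.compile(r"Drupal\.settings", re.I),
    ],
    "Joomla": [
        re.compile(r'<meta name="generator" content="Joomla', re.I),
    ],
}


def detect(html):
    """Return the sorted names of all technologies with at least one matching pattern."""
    return sorted(
        name
        for name, patterns in SIGNATURES.items()
        if any(p.search(html) for p in patterns)
    )
```

Calling `detect(page_html)` on a fetched page returns a list of matched names, which could then be stored per domain and exported however you like.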
The whole thing took about 6 months and ended up with > 100 million domains on a single (mediocre) machine humming away at around 100 domains/s. The sales/marketing folks loved it.
Since I was just finishing university, my skills were still pretty raw, so I'd assume that an experienced engineer would be able to do this a lot faster.
From what I can tell, there was a lot of demand out there, and sites like BuiltWith sold their somewhat limited reports (at least at the time) for a good amount of money.
Previous discussion: https://news.ycombinator.com/item?id=2022192