Requiring even a modest at-cost fee for a web service does wonders to discourage all sorts of misuse, from wanton large-scale data mining, to blatant repackaging and resale, to worse. (Heck, simply requiring a valid credit card alone helps.)
And sadly no, simply having low quotas for free access doesn't entirely suffice. If there's material value to be extracted from a free service, you'd be amazed at the lengths people will go to in order to create large numbers of low-volume scrapers. Most of these are obvious and easy to detect and defeat, but continually doing so adds up in cost, and it takes engineers away from providing better services to legitimate customers.
In short, most people on the outside don't appreciate just how difficult the handful of bad guys make it for companies to do something good for the other 99%. So I'm sympathetic to Microsoft here, I really am.
Imagine what the world would be like if there were a cost-per-unit to sending email - our inboxes would be a much saner, friendlier place. Fewer marketing emails, virtually no spam (it would no longer be economical).
For all the money the Bing unit keeps pouring out, I feel that buying DDG and leaving them the heck alone could be a reasonable long-term bet with relatively little risk.
The challenge would be to keep the founders/team motivated. They could spin it off as a completely different company on an IPO track and give the team significant equity. But then what if Google wants to buy them out? And of course, the Bing team may have a problem with MSFT creating internal competition, though that may just push 'em to do better.
More likely, if DuckDuckGo gets into an acquisition bidding war, I'd put my money on Gabe passing on acquisition in favor of raising a huge funding round that lets him take some money off the table.
For all its flaws (enough that I don't use it), it remains a rare search engine start-up that has its heart in the right place: to actually serve consumers rather than build some technology or team and get acquired (looking at you, Powerset).
You mention a 'bidding war'. Who would buy them? Any 'features' they provide can be copied by MSFT/Google if they feel threatened. They don't have their own backend search technology, so partners who might want to do search with them (ask.com, search.com, etc.) have no reason to work with them as opposed to using the BOSS API themselves, or even Bing. I'm just as excited as you are about innovation in search and competitors to Google, but DDG needs to be re-architected on the back-end before these sorts of pronouncements make sense.
I must admit they do a good job in the PR area of making themselves seem big and sexy. But they're really not innovating search, or providing any thick value in improving search. At least Blekko was trying to create a better algorithm.
Why the past tense? Blekko is still alive and kicking. (And I hope they find success, because they're doing amazing technical work. From a purely technical perspective, they deserve ten times the press that DDG is getting.)
But I think the mistake Blekko made was perhaps being too close to a Google-style (i.e. traditional) search engine. On this path, it may take them at least another 3 years before people start taking them seriously. It will be a hard and grueling road.
But the thing I like about them is that they did not make the mistake of Cuil, and are being conservative in making promises.
Overall, I suspect they may be feeling a bit out of sorts, since when they started out (2007) the social network thing was still in its infancy. And now people are talking about Facebook coming into search and so on, which, if it happens, may be a totally different approach to search than what Blekko did, which was trying to emulate and outdo Google.
As an aside, I still use blekko every day, and they obliterate Google and Bing in certain intent queries. If you search for some intent query in these categories: travel, jobs, real estate, cars, finance, legal, medical, services, and merchandise sales, AND one of their blekko/user-curated slashtags fires/autofires, blekko destroys (e.g. cure for headaches).
I can't tell how serious you are with that statement.
If you want access to a big crawl to grep through it for interesting data, then Common Crawl is awesome and inexpensive and I don't think you can get anything like it for the price, unless your query is simple enough to run as a blekko webgrep (https://blekko.com/webgrep).
If you want to build a search engine, Common Crawl isn't so useful. Search engines want _directed_ crawling of the pages that they think are good. Crawling is only a small fraction of the total work done in a search engine. Search engines generally aren't on AWS, because Amazon doesn't rent the right configuration of machine -- serving queries needs SSDs or more RAM and less CPU than what Amazon offers. So, what Common Crawl offers a search engine is higher costs and mostly bad data.
love the idea but search results are balls (ATM)
I'm sure they still make heavy use of the Bing API, but have since expanded their range of sources to soften the blow somewhat. They're now getting results from Blekko (who run their own index), and I'm sure they've been building out their own index. There's a full list of sources here - http://help.duckduckgo.com/customer/portal/articles/216399-s....
We also don't have any pricing details for higher volume usage of Bing, and DDG are in a much better position to negotiate a better deal these days.
By charging for their search API, I would just say that Microsoft is beginning to take their API seriously. It seems pretty clear that minimal resources, if any, were dedicated to the free version.
I'm bummed, because I've had relative success using their news search API (particularly for the article aggregation component of http://www.congressionalprimaries.org/), and now we'll have to look into alternatives, but if this means actually providing a decent product, I think this is a good move for Microsoft.