I'd love to see someone pony up and purchase a first-generation search engine (e.g. AltaVista, Lycos, HotBot) that still has some nostalgic branding left. AOL, Yahoo, and MSN have all been around since that generation, but they've all gone through failed re-branding efforts in a struggle to keep up with Google.
Pair it with a solid technology backend (DuckDuckGo) and you might actually have a chance to pick up the biggest US internet user demographic (Gen-Y) that happened to grow up with those sites.
How does a search engine make money then?
 Search: https://www.google.com/search?btnG=1&pws=0&q=HD+moni...
 Hat tip: http://01100111011001010110010101101011.co.uk/2012/04/matt-c...
Bing isn't too much better for this search phrase, but kudos to DDG for the Super User post atop the search results.
Back in the days of something like the yellow pages, it was also an advertising supported model (or pay-for-placement, something that might damage the credibility of a search engine). What I also found very interesting is the term 'yellow pages' is in the top 5 highest revenue generating search terms. 
There is some broken cultural obsession with making this stuff free, and with the idea that the Next Big Search must have automatic "web scale" appeal to the entire planet and cater to all needs and use cases simultaneously.
I'd happily pay for a curated, spam-free index if that index satisfied 90% of my needs, keeping the privacy-invading, up-selling free search for the remaining 10%. About 50% of my search traffic is already in the form of keyword searches pointed at IMDB/Wikipedia/eBay/Amazon site-specific search, so why not just wrap this up for me.
Hugely complex automation, web scale, index freshness, instant search, deep web blah blah: all of that is costly noise I care a good deal less about. Sometimes thinking small doesn't hurt that much.
And if you look at what I'm saying from the right angle, you might even see a thousand untapped niches for industry/lifestyle-specific companies collecting money from consumers in their segment and passing it to retailers who charge for access to their catalogues (otherwise hidden behind proprietary apps). I've no idea why this isn't more common already; it's as if the entire industry has been frightened into believing that using computers to find things we already know exist is an insanely difficult task.
"About 50% of my search traffic is already in the form of keyword searches pointed at IMDB/Wikipedia/eBay/Amazon site-specific search, so why not just wrap this up for me."
Do you like how DuckDuckGo handles this?
As for DDG, kinda but not really. There is a fundamental disconnect between my everyday needs on a computer and the kind of needs that Big Search apparently must inexorably serve (needs fashionably heralded by the industry in forums like HN), and that DDG attempts to emulate.
Most of the time I want to access my trusted vendors (bank, shop, eBay, ...) and a search engine is supposed to be my entry page to achieve that. But say, as I just searched on DDG now for "The Art Of Computer Programming", I get (from my perspective) trusted results from Wikipedia and Amazon, mixed in with crappy spam from freebookzone.com and so on.
My point is that I don't care about that freebookzone.com link, it's a (difficult) problem my ideal search engine would not be attempting to solve.
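To make that concrete, here is a rough sketch of the kind of filtering I mean: keep only results whose host is on a list of vendors I trust and drop the rest. The `TRUSTED` set and the function names are made up for illustration, and the result URLs are placeholders, not real DDG output.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist a user might supply; not taken from any real service.
TRUSTED = {"wikipedia.org", "amazon.co.uk", "amazon.com"}

def is_trusted(url, trusted=TRUSTED):
    """Return True if the URL's host is a trusted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in trusted)

def filter_results(urls, trusted=TRUSTED):
    """Keep only results from trusted domains, preserving the original order."""
    return [u for u in urls if is_trusted(u, trusted)]

if __name__ == "__main__":
    results = [
        "https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming",
        "http://freebookzone.com/some-page",
        "https://www.amazon.co.uk/dp/example",
    ]
    for url in filter_results(results):
        print(url)
```

The suffix check requires a leading dot, so a lookalike host such as `notwikipedia.org` is not treated as trusted.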
I am DMW. I pay you $50/year because you don't try to spell-correct programming language identifiers, and for the privilege of you knowing that 90% of the time when I search for a book I want you to send me to Amazon UK, even if I'm connected from Estonia, and of you applying these rules because I told you them somehow (say, during signup). If your results are incomplete, then provide me with access to some separate deep web mining/ranking service, which I'd expect to be paid a cut of my subscription in return for that access.
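A minimal sketch of what such signup rules could look like, assuming the engine has already decided what category a query falls into (that classification is the hard part and is not shown). The `RULES` table, the category names, and the `route` function are all hypothetical, and the URL templates are only illustrative of each site's search page.

```python
from urllib.parse import quote_plus

# Hypothetical per-user rules collected at signup: category -> search URL template.
RULES = {
    "book": "https://www.amazon.co.uk/s?k={q}",
    "film": "https://www.imdb.com/find?q={q}",
    "reference": "https://en.wikipedia.org/w/index.php?search={q}",
}

def route(category, query, rules=RULES):
    """Return the user's preferred vendor URL for this category, or None.

    The user's network location is deliberately not an input: the rule
    applies whether they connect from the UK or from Estonia.
    """
    template = rules.get(category)
    if template is None:
        return None  # no rule: fall back to the general (or paid deep web) index
    return template.format(q=quote_plus(query))

if __name__ == "__main__":
    print(route("book", "The Art Of Computer Programming"))
    print(route("recipe", "borscht"))
```

A `None` result is where the separate deep web mining/ranking service would take over.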
1) paying for fine-grained index controls, e.g. you publish something new, head over to the search engine and tell it to spider your site, or you tell it to spider it between 2am and 3am, whatever. You could also use this to test updates you're making ... imagine being able to do a dry run on your new version and see whether it is going to cripple your SE traffic. Or get your new article analyzed before you publish it.
2) dress listings up à la eBay ... not sure if they still do it but they used to do cheesy crap that let you make your listing stand out more than the other guys'; if there's a tasteful way to do that on a CPM basis it would print money
3) charging for an API like Bing etc. are doing
4) charge for reports on phrases, websites, industries
5) charge for telling you why your competitors are outranking you
6) charge sites a subscription ... AOL probably gets most of their traffic straight from Google, and they probably deserve almost none of it, so make them pay for that traffic. The large, eyeball-driven sites could easily be discriminated against.
7) charge low quality sites to un-penalize them. This is not a pardon, it's just a reset and it'll eat into their margins but whatever, they need your traffic.
This all revolves around two things: tax garbage sites, and provide tools for legitimate sites. These feel like low-hanging fruit to me; there must be much more interesting ways to monetize it than these.
The only hard part really is getting people to give a shit that you made / have / are a search engine.
1. You can get a lot of that data and those tools for free already. Webmaster Tools and Google Analytics.
2. Many of the things you'd charge for are things people don't understand anyway. In our little hacker bubble we know how valuable this stuff is and see a fair amount of companies use it but the vast majority of websites are operated by mom and pop shops and mom and pop can barely figure out how to turn on their computer. Expecting them to have any interest in getting or interpreting those reports is like trying to get them to learn quantum theory. You'll end up with a very limited customer base.
These paid options create an unfair advantage. Avoiding that is the exact reason why Google was so successful: Google is trusted and popular because it isn't a pay-to-play system. People will quickly figure out that the rankings are biased and quit using the engine. This is a step backwards in search.
Saying that charging to unpenalize a site isn't a pardon but a "reset" is disingenuous. Call it whatever you'd like but in the end it really is a pardon. The idea is to discourage sites from gaming the system and your whole idea is to encourage them to. What we'll end up with in the end is that what you call "garbage sites" are just sites without a lot of money and "legit sites" are those with money.
I'm sorry but your plan just takes us back to the pre-Google dark ages of search.
2) There's a whole SEO industry that operates on a hazy interpretation of what Google is supposed to be doing these days ... lots of companies know what SEO is, they know what it does, they know why they need it, and they pay out the arse for it. This brings clarity to that industry and those companies instead of letting them reverse engineer the changes you make and speculate on what matters. If they're willing to pay $100s/hr for SEO they'll surely pay $1000s for a roadmap straight from the source. That's like a printing press for money because that data expires when you act on it.
Money creates an unfair advantage right now. Pay people to spam backlinks to your website and you'll rate higher. Pay people to write summaries of blog posts and eventually you'll rate higher than those blogs you're sourcing your content from just because you can afford to generate more content faster. Pay people to submit and vote on digg, reddit, bla bla bla. Pay people to write about your product and create content. Pay people to market your site by writing content tailored for social media communities and get 1000s of backlinks. Pay people to do viral marketing stuff. Pay people to link to you. Pay Google to feature you above the search results.
The 'pardoning' is a little scammy and would be difficult to implement but the goal isn't to encourage them to take advantage of the system, the goal is to get your share because they're going to take advantage of it regardless. Google does this already via AdSense.
Furthermore I don't think there is a way to get data that is any less vague than it is now. Each site is so unique that this solution can't scale and you'd have to settle for analyzing the data yourself. Also, Analytics does have to do with what you described when it comes to seeing what's working as far as SEO goes and yes, AdWords would be more appropriate as an example when talking about competitor and keyword research. For some reason I thought those tools were in GA.
Generally though this really seems like a return to the bad old days, except instead of keyword stuffing your meta tags you pay to play. Your whole plan would lead to the end of truly organic results. Yeah, the system already gets gamed now but at least everyone has an equal shot at gaming it. All you need is the knowledge. The current paid techniques of gaming the system would simply shift from third parties to the search engines themselves. I also feel like what you describe is closer to a paid directory with search functionality than a search engine. I mean, even if it worked like search does today plus those paid features it wouldn't be long before the true search functionality became irrelevant and we'd be left with a directory where whoever paid the most came out on top.
To your credit, I agree that it would be nice to get some more data, better data, and data presented in a more human-friendly/layperson-friendly way but you lose me as soon as you get into a lot of these paid features that help you rank higher.
There is nobody out there who knows exactly what is going on or whether your redesign is going to help or harm or whether your content is the best it can be. But they will charge you lots of money to apply what they've observed to work before or to automate processes and monitoring and performance.
All of this happens today without any specific clarity into how Google works. I don't think it would worsen the situation if the guesswork was taken out of the equation: sites with no SEO still won't matter, sites with SEO still will, and bad people/sites will still be an ongoing game of whack-a-mole.
I experimented with this idea about a month ago. You can grab the source here (https://github.com/SeditiousTech/Avina) and visit the index here (avina.apphb.com). It's not a real search engine per se, but it is/was a pretty cool experiment. One of the problems is that people will forget to turn the extension off when accessing personal information (banking, porn etc).
Why is it that when we talk about making search better, many (seriously, like gobs of people) talk like the only way to improve it is to overthrow Google? Improvements in search can come from anywhere. If Google can deliver on what the author talks about then that's great. If someone else can then that's great too. The point is to improve search, not overthrow Google's dominance, right?
I'm all for the ideas in this article but I was totally turned off by the subtext that implied a need to take down Google. We don't really need a next Google. Google can be the next Google. It doesn't matter so long as the hidden web is indexed.
Why is it that as soon as a company is no longer the underdog we immediately throw them under the bus? Microsoft made PCs the norm in US households and now we love to tear them down (rightly so in many cases, admittedly). Facebook used to be the coolest thing ever and now we love to hate them too. Same with Google and Apple. Why do we hate incumbents so badly?
In any case, yes, let's index that hidden web. But let's focus on the indexing itself rather than who does it. If Google succeeds at doing this will it not count and will we still call for someone else to "disrupt" the new hidden web indexing industry?
For instance, a whole city rioting to break new looms because they were putting "honest weavers" out of business.
Or the publishing world rioting against anything that smells of sharing ... since forever.
Google surprisingly doesn't act like an incumbent at all. And that's good. We shouldn't hate on them because they are incumbents since they're doing a damn good job at it.
edit: Also the whole idea that "When a market is dominated by a single player. That market is ripe for disruption."
And your examples of losing email or AdSense accounts aren't so solid. Those are really edge cases, and it's a problem endemic to creating applications that need to catch abuse, especially when the user base is so enormous. We know computers aren't people and can't exactly think, so considering the amount of data Google has to filter through, and knowing you'll never write code that's one-size-fits-all, I think they're doing a good job. I'd presume any competitor would have similar problems once they grow to a certain size.
I expect someone to call me out for saying you can never write one-size-fits-all code -- to them I'd say it's true; as long as humans continue to be fallible then so will the systems we create. There will always be an edge case and it'll take every last one to pop up before we can even conceive of trying to catch them all. But that's off track so I'll end it here.
As for the 'who' ... doesn't really matter whether it's Google or someone else that fixes whatever problems but historically it's not in their DNA to care about individual users so it feels quite natural to assume they'd be replaced rather than repaired.