
I find both its birth and death interesting. The birth, because the fact that you could mark sites as spam was one of the early talking points at Blekko; the back-end architecture of our engine includes a 'selector' mechanism (slashtags), and maintaining a personal 'spam' slashtag came along for free. It's one of the features I continue to use.

And then it showed up as a feature in Google's results, which I found interesting because, having worked at Google and taken the 'deeper than non-Googlers, not as deep as someone in the search group' classes on how the two services that made up Google search at the time worked, I had a feel for how much lubricant it would take to squeeze that feature into the existing pipeline. It made me wonder if Google was following us :-)

The death of it is also interesting, because having it in the browser as a plug-in rather than in the results means two things: you can't offer it as a service to your partners, and you can't know a priori whether you're sending junk. If Blekko's partners say "We'd like to use your index but we don't want any results that include x, y, or z," we can do that, but it happens at the API level, with results coming right out of the index filtered by a 'negative' slashtag. It's unclear whether anyone can (or does) use Google's index in that way (unlike BOSS, for example). On the browser side, since you don't know what the plug-in is going to kill, how do you select the 10 documents to send? It makes me wonder: if you're doing a search on a highly contested term (like 'weight loss') and Google can't know that the 10 blue links it is about to send you are all spam (and on really contested keywords this is not uncommon), are you left with just sponsored links and no organic results?
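To make that concrete, here is a rough sketch of what filtering by a 'negative' slashtag at the API level looks like; the function and field names are mine for illustration, not Blekko's actual API.

    # Illustrative sketch only; not Blekko's actual API. The point is that the
    # exclusion list is applied to results coming straight out of the index,
    # *before* the ten blue links are chosen, which a browser plug-in cannot do.
    from urllib.parse import urlparse

    def _host(domain_or_url):
        # Normalize a bare domain or a full URL to a host, dropping 'www.'.
        host = urlparse(domain_or_url).netloc or domain_or_url
        host = host.lower()
        return host[4:] if host.startswith("www.") else host

    def filter_with_negative_slashtag(results, negative_slashtag):
        # Drop any result whose host appears in the partner's exclusion list.
        banned = {_host(d) for d in negative_slashtag}
        return [r for r in results if _host(r["url"]) not in banned]

    # A partner says "no results from x, y, or z":
    partner_negative = {"x.example.com", "y.example.com", "z.example.com"}
    index_results = [
        {"url": "http://x.example.com/miracle-cure", "title": "Miracle cure"},
        {"url": "http://library.example.org/reference", "title": "Reference"},
    ]
    print(filter_with_negative_slashtag(index_results, partner_negative))
    # Only the library.example.org result survives.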

'Panda' update all you want; even Google now admits they are adding staff to curate results (Microsoft/Bing had already gone public with their 'editor's choice' announcement). This is a good thing, but it only covers half the situation. At Blekko we got flamed by a user for not having any 'alternative' medicine sites in our 'health' slashtag, which started a conversation with that user about creating their own slashtag with all of those web sites that were unfairly penalized by the medical establishment just because their methods and claims weren't the product of some 'big pharma' company. But they could (and I believe they did) create a slashtag of all those 'hidden' sites, and they got great search results from it. That reinforced for me and others that 'spam' can be relative, and that it really has two sides: user and index.

So pondering this move on Google's part makes for interesting reading.




I think it'd be illogical for Google not to compete, in some ways, with alternatives like Blekko in this case. Google's core business is search, and in the early days it was this very naive behavior that allowed Google to sneak right under places like Inktomi. Just because a search engine starts small doesn't mean it can't gain traction or market share, and if it has compelling enough features, it'll happen in a heartbeat.

I suggest reading "I'm Feeling Lucky" by Douglas Edwards - the early chapters describe a lot of Google's anxiety about being killed by other search engines for one reason or another. It helps to explain why they'd be quick to adopt new strategies to keep results fresh.


I'm not sure Google is even clear about what its core business is these days. It really seems that they've abandoned much of their approach and focus on being the information finder and are now a social-network wannabe.

It's as if something flipped 180 degrees - they went from being the company that wanted to help you find out about everything to being the company that wanted to find everything about you.


Google's core business is advertising.


Anyone who disagrees with this other Jacques is invited to take a squiz at Google's annual reports.

Any of them since the introduction of AdWords and AdSense. It doesn't matter which. They all basically read the same.


Indeed. Perhaps I was wrong to say that they don't currently know what business they're in; rather, they haven't known, and recently they've re-aligned around a singular strategy focused on that.

The problem is that for so long, they built their public facing brand on a completely different premise and image. Therefore, as they've re-aligned, they're facing a bit of cognitive dissonance in the minds of their products (err, the public).


Might the leadership change have anything to do with that?


If Google is still researching AI, it is likely because the war with the spammers can't be won any other way. Sooner or later you end up in a situation where the ratio of 'spam' to 'ham' is such that, no matter how good your algorithms and how good your computing infrastructure, you'll end up spitting out a lot of spam.

Human curation is a stop-gap solution: computers can generate spam faster than humans can filter it. You could argue that the humans will only have to look at the good stuff, but unfortunately, in any practical setup, they'll have to look at all of it to make a decision.

Google has stepped up the arms race against spam, and for a while their algorithms gave them an edge; now we have reached the point where the spammers have the edge again, and it will take another quantum leap before the good guys regain it.

It would be funny if we end up with AI mostly because of the spammers :)


It would indeed be an interesting situation if the spammers forced the creation of AI (either by building it themselves to make better spam, or by someone like Google building it to deny them).

One of the interesting things I have experienced in my time at Blekko is that "growth" on the Web isn't really growing all that much. Sure, there are trillions of pages being created, but there are only so many things the few billion people in and around the Internet care about. There are 'hard information' places, which are things like libraries where reference searches are common; there are 'entity' places, be they shops or service providers or SOMA startups; and there are 'transient' places where information is current and then stale, to be stored and later reconstructed, like coral, into a 'dead' (in terms of change) but 'useful' base. Seeing the web from the point of view of a web-scale crawler and indexer, it starts to be clear that the mantra "Organize all the world's information" is getting tantalizingly close to a dynamically stable froth.

I have to believe that Google has figured this out; some of the smartest engineers I've worked with are at Blekko, but Google has its share as well. When you trawl through the fishery and all you get are trash fish, you start to wonder, "hmm, did we actually catch all the fish there are?" So too it goes with "the Web".

I started doing some speculation [1] on how you could value information that is discoverable on the Internet. One of the schemes I came up with is to count how many people would find that information "useful", where useful really means they would have some reason to seek it out, and then to scale that by the value to them of having it. So, for example, if my genome were online, there are maybe a dozen people who would find it "useful", and of that dozen probably only the insurance actuaries and perhaps the occasional researcher would find it "valuable".

Now you take that unit of applicability/value and scale it again by the "cost" to acquire it (find it, index it, etc.). From that you can compute, or at least estimate, the total size of the Web. So far my upper bound (based primarily on the fact that world population is stabilizing) is about 72 trillion documents at any given time.
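As a rough sketch of that arithmetic (the numbers here are made up; only the shape of the model comes from the above):

    # Toy version of the valuation model above. All numbers are invented;
    # the point is the shape of the estimate, not the 72-trillion figure.

    def document_value(seekers, value_per_seeker):
        # Applicability x value: what the document is worth in aggregate.
        return seekers * value_per_seeker

    def worth_indexing(doc_value, acquisition_cost):
        # A document counts toward "the Web" only if its aggregate value
        # exceeds the cost of finding, crawling, and indexing it.
        return doc_value > acquisition_cost

    # Example: my genome online. A dozen people might ever seek it out,
    # and only a couple (actuaries, a researcher) would pay anything for it.
    genome = document_value(seekers=12, value_per_seeker=2.0)
    print(worth_indexing(genome, acquisition_cost=50.0))   # False

    # Summing the documents that clear this threshold, over a population
    # that is stabilizing, is what bounds the estimate at roughly 7.2e13
    # documents at any given time.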

When you look at it that way you can see that ultimately the spammers lose. They lose because, over time, the actual information that rises above the usefulness-vs-cost threshold is identified and classified, the legitimate channels that provide dynamic information and the legitimate archival sources that provide distilled information are, for the most part, known, and 99.99% of your users can find everything they want. As a spammer you are no longer given free rein to "appear and be indexed"; you have to ask for admittance through some form or another. And the rate at which new credible sources are created is inherently a function of the number of people that exist, and the number of people that exist is stabilizing.

When Yahoo started with its human curation it was vastly better than anything anyone else could do, and then it was overwhelmed by a combination of growth and algorithms that could much more rapidly infer curation from the social signals of bookmark pages and article references. Curation has come back into favor, and combined with machine-learning algorithms it will create what is essentially a stable corpus of documents[2] known as "the web".

Wikipedia is a great analog for what is happening worldwide. All the reference articles they want to put in are nearly done, the number of editors required has gone down, not up, and the future of web search is, in my opinion of course, similar. The only new ground on the web is social networking, and Google seems to understand that. It's fortunate for them that the only deep-pocketed entity that is possibly a near-term threat there is Microsoft, and since to date Microsoft is trying to do exactly what they did, they benefit from having already been through that part and knowing exactly what Microsoft will have to do next.

[1] My side hobby is attempting to discern the economics of information.

[2] Documents are just that, pieces of information. I hardly count every page rendered by Angry Birds in a browser as a separate document; in fact, applications are themselves a single "document" in the sense that some number of people will seek them out and connect to/consume them over what we think of as the "web" interface.



