Google: we're having scaling issues, please stop distributing your FF extension (splitbrain.org)
99 points by gaika on June 26, 2008 | 19 comments



All things considered, 1. I'm not surprised a plugin like that put heavy load on Google -- Bookmark every page? But the bookmarks being linked to searching is a small surprise. 2. It's quite impressive how smoothly they handled things. Don't be evil, indeed. 3. Does anybody look before commenting? The shirt is long gone...


Yeah. The great part of this story is that Google approached this developer and asked him nicely.


>3. Does anybody look before commenting? The shirt is long gone...

It's the kind of thing where you don't look because in the extra 10 seconds required to read the comments someone who didn't look could get the first post in.


Well, yes, but the fact that it even has comments should be a giveaway. People are asking for the shirt two days later.


It surprises me that Google gave up so easily! This is a pretty similar service to what WebMynd provides - we're handling ~2.5m pages per day and although scaling is difficult, it's doable.

We out-engineered Google!!


Don't flatter yourself.

The problem Google is having isn't with storing the links, it's with the effect of the links on the personalization of your search results. That is a problem WebMynd isn't facing.

They can't precompute the eigenvectors because they change every time you visit a web page, and calculating them live for each search request can be expensive when the user has thousands of personal pages.

I would have just disabled the effect of the personalization on search of bookmarks not added manually.
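
To make the eigenvector point concrete, here is a minimal sketch that assumes the personalization behaves like personalized PageRank. That is an assumption for illustration, not something the article confirms about Google's system. The `links` graph, the `visited` set and the function name are all hypothetical.

```python
def personalized_pagerank(links, visited, damping=0.85, iterations=50):
    """Power iteration for a per-user rank vector (illustrative only).

    links:   dict mapping a page to the list of pages it links to
    visited: set of pages this user has bookmarked or visited
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    # Teleport distribution: uniform over the user's pages,
    # falling back to all pages if the user has none.
    seeds = [p for p in pages if p in visited] or pages
    teleport = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                # Split this page's rank evenly across its outbound links.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    nxt[q] += share
            else:
                # Dangling page: hand its rank back to the teleport set.
                for q in pages:
                    nxt[q] += damping * rank[p] * teleport[q]
        rank = nxt
    return rank
```

The result depends on `visited`, so every new bookmark invalidates that user's vector, and a full recomputation costs roughly `iterations` passes over the link graph. That is the expense being described, at toy scale.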


Obviously, it was a slightly tongue-in-cheek comment.

However, I would point out that WebMynd IS doing personalised search (totally personalised, in fact). As you point out, updating search indices every time a user visits a page is difficult. The fact that Google have chosen not to tackle this challenge, and we have, is a symptom of their existing methods and stated goal of organising the world's information: not yours.

As for your last point, I actually agree with the OP and others that Google's soft touch on this matter was commendable; disabling the effect or blocking the extension outright would have been heavy handed and sent a pretty bad message.


I'm not entirely sure disabling the personalization effect would have been the wrong thing to do from a user experience point of view. Suppose I let a friend browse on my computer: do I really want my search results to be personalized with stuff that interests him/her? What about pages I mis-clicked on, or pages whose link was misleading?

I think that requiring a deliberate, conscious vote of confidence from the user for a web-page before actually letting it affect my personalized search results is a very good thing.

But from a public relations POV you're probably right, that was the best approach.


The downmodders should really explain why they are modding down a perfectly sensible comment here.


What WebMynd is doing is incredibly daunting - they're not just storing a bookmark for every page, they're doing full text indexing on the content as well - in real time. Seriously hard scaling issues, and they're coping with them well.

I hear they're hiring also, if you want to help them out-scale Google!


It's surprising how easy it is for me and for other people to forget that Google is not a monolithic entity. It's a collection of people, like any other big company. There's going to be good people and bad people in there. The press tends to refer to it as a single entity, however.


"Whenever you use the web search, it checks it against your Google bookmarks. You can easily imagine what problems can come up when you have a several 10 or even 100 thousands of bookmarks…"

Am I mistaken to believe that, unless they are using a poor algorithm, 100k bookmarks to add to their index shouldn't be a problem? What am I missing here?


Probably the bookmarks aren't stored the same way their search index is.


I don't know, but whether they do or not shouldn't matter. They should design it so it can scale; 100k bookmarks is not really that big of a dataset for anyone, let alone G.

I don't see why they can't just fix this instead of asking him to revoke his plugin, especially if people are using it. It's needlessly removing a use case which may gain traction.

Of course they may know something I don't and which isn't mentioned in the article which makes doing any of this infeasible or undesirable.


Maybe they should just include the bookmarking in their search indices.


The bookmarks are already indexed. The pages that are bookmarked have an edge against PageRank for that specific user.
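
As a rough illustration of what an "edge" for bookmarked pages could look like, here is a minimal sketch that multiplies the base score of any bookmarked result by a fixed factor. The `rerank` function, the `boost` value and the score format are hypothetical, and nothing in the article says Google ranks this way.

```python
def rerank(results, bookmarks, boost=1.5):
    """Re-order search results, favouring the user's bookmarked URLs.

    results:   list of (url, base_score) pairs from the global ranking
    bookmarks: set of URLs bookmarked by this user
    boost:     multiplier applied to bookmarked results (made-up value)
    """
    scored = [
        (url, score * boost if url in bookmarks else score)
        for url, score in results
    ]
    # Highest adjusted score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With `bookmarks` held as a set, each membership check is constant time on average, so a lookup-style boost like this one stays cheap even at 100k bookmarks. The expensive case is when bookmarks feed back into the ranking computation itself.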


If I was a spammer or SEO, I would be figuring out every possible way - whether shady or not - to get someone to bookmark my page in Google Bookmarks.


Does anyone care about the poor captcha system on http://www.splitbrain.org/blog/ ?


Finally a sensible response from a software company.



