
Webspam is a really big problem, yes. It's very unlikely that you'd be able to catch up or keep up in that regard without Google's resources.

Building the index itself is relatively easy. There are some subtleties that most people don't think about (e.g. dupe detection and redirects are surprisingly complicated, and CJK segmentation is a prerequisite for tokenizing), but things like tokenizing, building posting lists, and finding backlinks are trivial: a competent programmer could get basic English-only implementations of all three running in a day.
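To make "basic English-only implementations of all three" concrete, here is a rough Python sketch. It is a toy under stated assumptions, not how any real engine does it: pages are already fetched into a dict of URL to HTML, links are pulled out with a naive regex instead of a real HTML parser, and there is no dupe detection, redirect handling, or CJK segmentation. The names `tokenize`, `build_index`, and the example.com-style URLs are made up for illustration.

```python
import re
from collections import defaultdict

# Naive patterns; a real crawler would use a proper HTML parser.
TOKEN_RE = re.compile(r"[a-z0-9]+")
LINK_RE = re.compile(r'href="([^"]+)"')
TAG_RE = re.compile(r"<[^>]+>")


def tokenize(text):
    """Lowercase the text and split it into ASCII alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(pages):
    """Build posting lists and a backlink map from {url: html}.

    postings:  term -> list of (url, [token positions]), in sorted URL order
    backlinks: target url -> set of urls that link to it
    """
    postings = defaultdict(list)
    backlinks = defaultdict(set)
    for url in sorted(pages):
        html = pages[url]
        # Backlinks: record every outgoing link against its target.
        for target in LINK_RE.findall(html):
            if target != url:
                backlinks[target].add(url)
        # Posting lists: strip tags, then record each term's positions.
        positions = defaultdict(list)
        for pos, term in enumerate(tokenize(TAG_RE.sub(" ", html))):
            positions[term].append(pos)
        for term, pos_list in positions.items():
            postings[term].append((url, pos_list))
    return postings, backlinks


# Hypothetical two-page "web" that links to itself.
pages = {
    "http://a.example/": '<p>Search engines fight spam.</p> <a href="http://b.example/">b</a>',
    "http://b.example/": '<p>Spam is hard.</p> <a href="http://a.example/">a</a>',
}
postings, backlinks = build_index(pages)
print(postings["spam"])
print(backlinks["http://b.example/"])
```

Everything that makes this hard in practice (scale, spam, ranking, freshness) is exactly what the sketch leaves out.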




I am not even that good of a programmer, and I also agree with you that the index is relatively trivial. Other major issues, besides fighting spam:

- Hardware infrastructure and data center presence for extremely fast search from anywhere in the world.
- Near real-time search suggestions.
- Personalized search results based on past searches + geolocation.
- Instant results on the search page, without having to go to a website.

Just to name a few. Google Search is the gold standard of a search engine, not because it's Google or because they have been around for a long time and the brand name sticks (I am sure that helps too), but for the simple fact that no search engine is even remotely close to being as good as Google. I have tried them all, more or less, and given them a shot. They are just not good at all.

I also don't understand the hate towards Google being in charge of so many products that so many people use, e.g. Mail, Maps, Chrome, Android, Docs (to name a few). It's simply because they are damn good at it. If it's a crime to make a product so good that people continue to use it, then I don't know what else people are supposed to do. As if we are asking Google to make shit products. I just don't understand the reasoning.


It has nothing to do with the number of products, it’s what they do with their influence over the market. See AMP and incompatibilities between Gmail & IMAP, for example.


You're concentrating on the literal interpretation of the phrase “give access to the index”. This is a non-technical article which didn’t go into details; just read it as “give access to the index & ranking”.



