Hacker News

>2) 20 years of experience fighting SEO spam.

Tangential - but does anyone else feel that google results are useless a lot of the time? If you search for something, you will get 100% SEO-optimized shitty ad-ridden blog/commercial pages giving surface-level info about what you searched for. I find for programming/IT topics it's pretty good, but for other topics it is horrible. Unless you are very specific with your searches, "good" resources don't really percolate to the top. There isn't nearly enough filtering of "trash".

Yes, I feel like Google search results have very gradually become more irrelevant and spammy over the past decade or so.

There are 2 issues, I think.

Firstly, the SE-optimised spam, which has become very good at masquerading as genuine content.

Secondly, Google has dumbed search syntax down a bit, and often seems to outright ignore double-quoted phrases, presumably thinking it knows better than I do what I want.

As a dev, I do accept I may be an outlier though - with the incredible wealth of search history and location data that Google holds, it seems likely things have actually improved for typical users.

is there a way to turn this " ignore thing off? drives me nuts

Seeing as google has my search history for the past 14 years, they should be able to KNOW that I'm a slightly more technical user and can take advantage of power user features instead of treating me like an idiot

Google signed an armistice in the Great Spamsite War some time around '08 or '09, to the effect that spam can have all the search results aside from those pointing at a few top, trusted sites, so long as they provide any content at all. Bad content is fine. Farmed content is fine. Content that was probably machine-generated is fine. Just content. Play the game, make sure your markov chain article generator or mechanical turks post every day, throw some Google ads on your page, and G will happily put your spamsite garbage at result #3.

There’s a reason for this; click through rate on ads is higher on pages that don’t achieve the user goal.

I suspect that the AI models powering the search results develop a sort of symbiotic relationship with the spam - if the user actually finds what they are looking for by clicking through an ad on an otherwise spammy page, everyone “wins”; the user found what they were looking for with minimum effort, google got their ad revenue, and the spammy page got a little cut for generating content that best approximates the local minimum that links the user's keywords to actual intent...

“Farmed content is fine”. I thought that was one of the major (intentional) victims of the Panda update. https://moz.com/learn/seo/google-panda

There are a few widespread scaled publishing operations like IAC which seems to be doing well with the split up of About.com & relaunching it as vertically focused branded sites, but the content farm business model died with the Panda update.

Some of the sites that were hit like Suite101.com went offline. eHow is still off well over 90%. ArticlesBase sold on Flippa for like $10k or some such. One of the few wins hiding in all the rubble was HubPages, but even they had to rebrand and split out sites & merged into a company with a market cap of about $26 million ... and the CEO of Hubpages is brilliant.

Even with IAC on some sites they are suggesting ad revenues won't be enough http://www.tearsheet.co/culture-and-talent/investopedia-laun... "As Investopedia charts its course as a media brand, it’s coming up against the roadblock all publishers eventually hit — the reality that display revenue alone won’t be enough. ... Siegel said he expects course revenue to exceed what’s generated from the site’s free content. While he wouldn’t say what the company’s annual revenue was, Siegel said it grew an average of around 30 percent for each of the last three years."

There are also other factors that parallel the Panda update and further diminish the quick-n-thin rehash publishing business model:

- Google's featured snippets & knowledge graph pulling content into the SERPs, so there is no outbound click on many searches

- programmatic advertising redirecting advertiser ad spend away from content targeting to retargeting & other forms of behavioral targeting (an advertiser can use a URL as a custom audience for AdWords ad targeting even if that site does not carry any Google ads on it)

- mobile search results having a smaller screen space, where if there is any commercial intent whatsoever the ads push the organic results below the fold

I agree with this. Most searches give me almost a whole page of ads and stuff up top before the things I’m interested in start showing up way down at the bottom of the page, and even then the results are often spam.

I’ve been using DuckDuckGo and have found I have this problem less. I don’t always find what I want on DDG; as of now I’d say Google is still better if you’re not sure exactly what the thing you’re looking for is called, but if you know the keywords you need, DDG is often better.

Someone linked to an interesting site talking about how to make homemade hot sauce here on HN. I partly read it and thought it was a great clean site and something I wanted to try. Later, going back to find it again, I literally spent hours searching, even though I'm pretty sure I remembered some of the exact phrases. For some reason recipe-related search results are really really terrible on both Google and Bing.

Could you not find it again via the HN site search? https://hn.algolia.com/?query=%22hot%20sauce%22%20recipe&sor...

This is awesome and helped me find it again! Thank you!

Sometimes sites get dropped from the results because they are malware hosts. It’s more likely to happen to small independent sites. They are also more likely to just pack it up and shut down their sites.

Yeah, this is why I still use and like myactivity.google.com, as creepy as it is. It's helped me re-find so many interesting half-remembered sites and videos and songs I'd previously come across.

Why would you rely on google spying instead of your own browser history?

cross platform support, maybe?

100% agree. For technical queries, as long as a StackExchange comes up, Google is still okay.

But for increasingly more basic searches about a product I'm interested in or a medication or anything else non-complicated that would have gotten me a clean list of decent, non-paid results even 5 years ago, I'm now getting half a page of sponsored BS and then another half a page of 'created content' written by a bot or shyster explicitly for gaming Google's SEO.

Not only has Google lost almost all their good will (i.e. Don't be evil), but their products aren't even that good anymore, at least not so much better than alternatives where the negatives of using Google outweigh the difference in quality.

Yes, at least half the time I search about a particular topic, it seems the first few pages are written by some contractor in the Philippines probably getting paid $2 / hr who just spent the prior 30 minutes researching the topic.

I am not sure that this take is accurate.

I would agree that programming search results tend to be quite good, but I think this is likely in large part because the average person attracted to programming both has a high IQ and has experience building some part of the web stack. Thus the sites that are quite manipulative in nature would have a hard time trying to fake it until they make it in such a vertical where people are hard to monetize and are very good at distinguishing real from fake. And even if a fake site started to rank for a bit it would quickly fall off as discerning users gave it negative engagement signals.

This is also perhaps part of the reason sites like Stack Overflow monetize indirectly with employment related ads targeted to high value candidates versus say a set of contextually targeted ads on a typical forum page or teeth whitening gizmo ads on the Facebook ad feed.

The lack of filtering of "trash" probably comes from a bunch of different areas

- I think there was a quote that people are most alike in their base instincts and most refined in areas where they are unique. Some of the most common queries are related to celebrity gossip & such. There are also flaws in human nature where inferior experiences win based on those flaws. For example, try to buy flowers online and see how many layers of junk fees are pushed on top of the advertised upfront low price: shipping, handling, care, weekend delivery, holiday delivery, etc etc etc

- some efforts to filter trash based on folding in end user data may promote low quality stuff that people believe in. a neutral & objective political report is less appealing than one which confirms a person's political biases. and in many areas people are less likely to share or consider paying for something neutral versus something slanted toward their worldview.

- as the barrier to entry on the web has increased some of the companies that grew confident they had a dominant position in a market may have decided to buy out other smaller players in the vertical & then degrade the user experience as real competition faded. there was a Facebook exec email mentioning they were buying Instagram to eliminate a competitor. Facebook's ad load is now much higher than it was when they were smaller. But the same sort of behavior is true in other verticals too. Expedia & Booking own most of the top travel portals.

There has also been a ton of collateral damage in filtering all the trash. So many quirky niche blogs & tiny ecommerce businesses were essentially scrubbed from the web between Panda, Penguin & other related algo updates.

>does anyone else feel that google results are useless a lot of the time?

Google doesn't make money from you finding what you're looking for. Google makes money from you searching for what you're looking for.

It has gotten better over the years in some ways even if it feels like it also got worse. I recall pages of "ads and useful looking search result keywords" being more common in the past.

w3schools still outranks mdn a lot.

You're not alone. From my perspective, the value of google search results has been dropping for years. And the quality of their search results seems to be dropping in a way I suspect is profitable for google. Most of the results I get back from google these days are trying to sell me something I have no interest in buying.

For example, suppose I do a google image search for "pear", because I want images of pears obviously. The first result is indeed a pear, good job google! Except the first search result just happens to come from Amazon, and also happens to be a pretty shitty thumbnail quality photograph (355x336). It's a pear alright, but why is this particular image of a pear first? Google didn't try to give me the best image of a pear, they tried to give me the pear image they thought most likely to induce a financial transaction. Or alternatively, google let itself get cheaply manipulated by Amazon's SEO. Neither is a good look.

A much better pear image, 3758x3336 from wikipedia, is further down the search results. So it's not like google was unable to find good pictures of pears. And a non-image search for "pear" returns the wikipedia page first, so it's not like google failed to notice the relevancy of the wikipedia article about pears. Yet the shitty amazon thumbnail of a pear shows up higher in the image search results than a high resolution photograph of a pear from wikipedia.
