It's not SEO: something is fundamentally broken in Google Search
141 points by bsanr2 on Aug 9, 2021 | 53 comments
There is a general sentiment that Google's search capability has degraded over the years. Many disagree, pointing to metrics that suggest an increase in utility and successful searches for the average user over time. For the instances where Google users have difficulty finding relevant results, the on-going battle with SEO practitioners is cited, with the difficulty of distinguishing legitimate pages from ad farms and the like suggested as an answer to the question of why we see such failures and an increasing reliance on big-name sites as a trustworthiness heuristic.

I want to sidestep that debate somewhat by describing my experience with a more controlled environment: voice searching for YouTube videos to play on my Google Home Mini device. Many will remember how remarkably accurate searches were at initial release c. 2017; songs could be found by reciting lyrics, humming melodies, or vaguely describing the thematic or narrative thrust of the song. The picture is very different today. It's almost impossible to get the system to return even slightly obscure tracks, even if one opens YouTube and reads the title verbatim. Recently, I was trying to listen to a song from Neil Cicierega's Mouth Dreams album, Aerolong. Just that one song. No combination of terms would bring it up. Several times, my Mini tried to play the entire album as a playlist. But just as often, it would return something that was just off ("I Don't Want to Miss a Thing" by Aerosmith) or completely unrelated (Bohemian Rhapsody). We should be clear: this is an Alphabet-produced device and interface tapping into an Alphabet-built index with an Alphabet-developed search function and being absolutely incapable of returning the correct result. One may wonder if this might have something to do with the obscurity of the track in question, but Neil Cicierega is an artist of note, if not mainstream; and, besides, it found the correct album and, presumably, would have eventually gotten to the correct track while making its way through the playlist. But that's not what I was asking for as a user; if the system found the correct track but insisted on not going directly to it, it's making a decision for me that I did not ask it to. That would be a horrifying finding, if more evidence could be compiled (beyond this being a commonly-encountered scenario) to show that this, specifically, is what was happening. 
At this point, however, the only thing that can be definitively concluded is that Google searches are not returning results that they ought to; results that, by all accounts, they would have been able to return a few years ago; and results that must not have been influenced by efforts to eliminate the effects of abusive SEO practices. Something else is going on.



Amit Singhal, who was Head of Search at Google until 2016, has always emphasized that Google will not use artificial intelligence for ranking search results. The reason he gave was that AI algorithms work like a black box and that it is infeasible to improve them incrementally.

Then in 2016, John Giannandrea, an AI expert, took over. Since then, Google has increasingly relied on AI algorithms, which seem to work well enough for main-stream search queries. For highly specific search queries made by power users, however, these algorithms often fail to deliver useful results. My guess is that it is technically very difficult to adapt these new AI algorithms so that they also work well for that type of search query.

While the old guard in Google's leadership had a genuine interest in developing a technically superior product, the current leaders are primarily concerned with making money. A well-functioning ranking algorithm is only one small part of the whole. As long as the search engine works well enough for the (money-making) main-stream searches, no one in Google's leadership perceives a problem.

Naturally, this would be a good time for a competitor to capture market share. Problem is, the infrastructure behind a search engine like Google is gigantic. A competitor would first have to cover all of the basic features that Google users are used to before they would be able to compete on better ranking algorithms.


This heavily aligns with my experiences over the past few years.

Where Google once returned exact or close results for very specific or niche searches, I feel like it now struggles to even land in the same ballpark. Over the last year and a half I've been asking myself more and more whether the results have always been this bad, thinking I've been taking crazy pills. Unfortunately, the current competitors still perform even worse for the same queries, so it's not like there's enough of a reason to default elsewhere (yet, I hope).


In the past, Google felt more like a sophisticated professional tool. If you made the effort to click through the results beyond the first page, refine the search query increasingly and, on occasion, enclose individual terms in quotes, you sooner or later landed a hit.

Nowadays, the query you enter into the search field is answered by an AI algorithm that works more along the lines of "I know best what you're looking for". For users who are not very knowledgeable about search engines and tend to search rather superficially, this apparently works quite well. For professionals who want to dig out the really interesting hits in the deeper part of the web, an AI-powered search engine of this sort is quite frustrating.


Sometimes https://entireweb.com (no affiliation) gets better results than Google.


Google has a verbatim search, click on Tools/All Results/Verbatim, but it is no longer really verbatim. It often offers results that have only one of your keywords. DuckDuckGo seems to completely ignore any quotes I put around keywords.


I don't understand why DuckDuckGo ignores quotes.


> Naturally, this would be a good time for a competitor to capture market share. Problem is, the infrastructure behind a search engine like Google is gigantic. A competitor would first have to cover all of the basic features that Google users are used to before they would be able to compete on better ranking algorithms.

That is why the basic rule of dethroning an incumbent is to not go into competition head-on. If someone is successful in shaking Google, then it won't be a search engine company. That company will first build some different product that does not compete with Google at all, become a leader in that, and then, if the business model supports it, build a better search engine. It is a long process, but that is how it usually works. Google is a classic example of this way of getting at Microsoft. After search, it tried to eat Microsoft's lunch by developing products to compete with MS-Office, and also ventured into mobile and laptop OS.


To this point - it is already happening to a certain extent.

Amazon's share of product search is larger than Google's [1]. And earlier, Facebook started getting a large chunk of ad money that would otherwise have gone to Google.

[1] https://www.wsj.com/articles/amazon-surpasses-10-of-u-s-digi...


I’ve been surprised to find that search.brave.com is on par with Google for almost all my queries.


I wonder if those newer algorithms are the reason I sometimes get search results that seem to have nothing at all to do with what I asked for.

Of course the older search algorithms are also AI algorithms; just not the black-box machine learning algorithms that are so popular in recent years.

My theory is that Google's algorithms care too much about what the average person wants, and not enough about what I want. I'm not the average person. And I thought Google would know enough about what I want by now.

What I really want is more personalised search where I control how the algorithm is tweaked. What would also be useful, and I really can't believe Google hasn't done this yet, is voting on search results. Those content farms would quickly be voted into oblivion.


Well, we know Google "salts" ML processing to get the results they want for censorship of all types. That seems likely to screw up objectively valid or honest search results pretty easily.


The real question should be is altavista still working?

I'm too scared to look.

Independent webmaster days used to be the best times... Now hackers and corporations run the show. Many are complaining that SEO is getting trumped by outright payola... cough

Things like Net Neutrality and paid promotion have overrun search results (free and independent thought) and clogged them up with sales funnels, to the point that we're just going to get more and more irrelevant results until people have to turn to libraries again to find meaningful and succinct answers to questions. AI is in its infancy and, from what I can observe, usually geared towards profit for Google more so than towards human good, because especially during pandemic times, they have pretty big bills to cover.

The underlying problem is the monetization of information. As long as we keep driving this trend, truth and accuracy of information will suffer deeply. It's best to keep monetization to less critical resources like entertainment and physical products; they can afford to be sensationalized and monetized more than things presented as science, facts, and credible news.

This trend ruins the value the Internet once had and, interestingly, also steers intellectual value/power back to universities, where high tuition often gives learners access to more carefully planned presentations and usually more emphasis on accuracy. The future will be expensive for all of us.


> The real question should be is altavista still working?

It redirects to: https://search.yahoo.com/?fr=altavista


This really explains a lot


Your theory is that Giannandrea is intentionally using a technically inferior technology because it makes more money? How?


Say I search for a chicken piccata recipe. Which pays Google more:

- a static blog from 2003 without ads but an excellent recipe

- a YouTube video or promoted article with ads enabled who paid Google to feature them on this query and who will allow other advertisers to market to me on their site

You might say “well they already paid Google so Google is making money either way,” but people will only continue putting money into Google ads if there is a good return for them, i.e. more views or more money. So Google has a material interest in returning results for advertisers, more so than it does in pointing me to a better chicken piccata recipe.


Because it is optimized for revenue driving features rather than for niche informational purposes.


I think it's more complicated than that. Google's engineers are under pressure to constantly "improve" things, or in other words, to push out new ranking algorithms. Once you go down the AI path, it's probably hard to go back.


Holy cow! Just to expand on my previous comment below, but with a different use case: exact search, where you have to use quotes, seems to be completely busted as well. The company I work for builds websites; we have thousands, and they share a common phrase in the footer. I would use Google to help find example sites. Well, as of at least this morning, and perhaps since Google's July 1st update, exact match no longer works.

I used to get hundreds of great results. Now, I get fewer than 30 and over 70% are spam. You are correct that something is completely wrong. It is 100% broken in this case.

I just cannot believe how horrible this is. Just to emphasize how bad it is, I even tried a combination of the phrase in the footer plus the name of the company. It would not return any results for that company/website. Just spam results.

What is going on here?


Using double quotes has been busted (at least for some searches) for literally years. This isn't, in general, a new phenomenon. You were lucky it worked for you as long as it did.

Double quotes don't work. Putting '&tbs=li:1' at the end of the search string to get "verbatim" results no longer works. Most of the "hacks" from this 2003 O'Reilly book don't work: https://www.oreilly.com/library/view/google-hacks/0596004478...
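For anyone who has not seen these tricks, here is a minimal sketch of how such a URL gets put together. It assumes the `q` parameter plus the `tbs=li:1` suffix described above; `build_search_url` is a hypothetical helper name, and nothing here implies that Google still honors either hint.

```python
from urllib.parse import urlencode

def build_search_url(phrases, verbatim=True):
    """Build a Google search URL with each phrase wrapped in double quotes.

    Hypothetical helper for illustration only. Whether the engine respects
    the quotes or the verbatim flag is exactly what this thread disputes.
    """
    # Wrap every phrase in double quotes to request an exact match.
    query = " ".join(f'"{p}"' for p in phrases)
    params = {"q": query}
    if verbatim:
        # The "verbatim" switch mentioned in the comment above.
        params["tbs"] = "li:1"
    # urlencode percent-escapes the quotes and the colon for us.
    return "https://www.google.com/search?" + urlencode(params)

url = build_search_url(["exact footer phrase", "company name"])
print(url)
```

Comparing the result counts for the same phrases with and without `verbatim=True` is a quick way to check whether the flag changes anything for a given query.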

It's all turned to shit.

What is going on here?

Read the other posts for some other ideas of what's happening.

My opinion is that Google doesn't need to care. The company's market cap, as of today, is $1,831 billion. So Wall Street doesn't care either. Larry Page is hiding out in New Zealand, and he probably doesn't give a shit either. :-)


I noticed things started going sideways around the time Google+ was launched. Someone decided that + in the search engine had to be reused for the doomed platform. Sad.


I can’t speak to your voice experience; however, I used to regularly end up 20-90 pages deep on Google searches, ensuring I collected all the information I needed.

In the last year or so I have not progressed past page 3 without being presented with mostly fake results. This even happens when I use specific terms which I know used to return perfect results and when I know the content exists.

For example, the other day I wanted to find a presentation made at FOSDEM, so I searched for the FOSDEM keyword with filetype:pdf. I saw only a handful of genuine results despite knowing there are hundreds of matching documents.


That is a very interesting observation. I often travel back to those pages too as part of my job and now that you mention it, yes. That is exactly what I see too. It’s absolutely garbage. I will sometimes pull a SERP with 100 results looking to see if something is there. I will find garbage way before I find a result I think should be somewhere in that mix. Especially before SPAM related results.


I had noticed this too but assumed it's part of cost-cutting projects inside Google - like how they increased the number of unskippable YouTube ads from usually 0-1 to minimum 2 often 4 in a row, at every 5-10 minutes, or how they removed free photo storage at a time when disk pricing is at its lowest point, or how the subjective quality of experience with their services has degraded to unimaginable lows (e.g. Google Drive, which is so buggy it's not even funny, often failing to fetch files with hundreds of failed API requests and taking a few minutes to "process" downloading a file - often failing with 40x errors).

It may just be a part of the strategy to lower costs after achieving domination in most markets.


> like how they increased the number of unskippable YouTube ads from usually 0-1 to minimum 2 often 4 in a row, at every 5-10 minutes

This is the main reason I eventually coughed up the cash for YouTube Premium. I looked at how many videos I watched in a given day/week/month and found out that, based on my hourly rate as a dev, I was spending ~$50/month watching YouTube ads.


One specific and odd behavior I’ve seen repeatedly: if I search for something to do with “PureScript”, but I have a transposition typo, e.g. “pursecript concatenate strings” Google will often return links to TypeScript.

It seems to be putting more weight on the popularity of the subject matter rather than the actual content of my search query.

I’ve only used PureScript for a couple of years, and I’m curious how earlier iterations of Search would have performed.


A better search would be:

"purscript" "concatenate strings". In this case, 37 results came from Google search.


There is also the possibility that they properly adapted to their users. Power users as a proportion of the Internet's total user count probably followed an inverted Zipf distribution over time: at the beginning 100%, then 99%, 90%, 9%, and now less than one percent.

Assuming power users formulate searches in ways that are irreconcilable with those of the average user, and assuming Google adapted their models and metrics to the average user and retrained them at each step, then we are simply no longer a target market of Google.


Seems like an opportunity for a whippersnapper YC startup to fill the void.


I already mentioned that, and explained why it's not applicable to this case, suggesting that to some degree it's not a sufficient explanation.


Your OP is about a very specific music selection feature in a single-result voice search, no doubt influenced by music licensing law and contracts (are you paying for Google/YouTube Music or relying on free stuff?), not "Google Search".


>no doubt influenced by music licensing law and contracts

Yes. Well, I assume this must be part of it.


Universal standardization to the “average” or most common is severely detrimental to everyone but is the default.


It could be Google, it could also be websites doing everything they can to break Google’s search.

Take Reddit for example. Date ranges are absolutely useless because they’ve started hijacking the posted date and updating it every single time someone adds a new comment.


That only adds a float of six months at least. Reddit threads are locked after that point.


6 months is a LOT. Leads to garbage results.


It is a new page each update. Your cache is invalid.


Once you're the dominant search engine, crappy organic results plus highly targeted ads is the optimal revenue machine.

http://infolab.stanford.edu/~backrub/google.html <ctrl-f> motives


Makes sense to me. If the results are great, user spends t-minus 3 seconds on Google. If the results are crap, user spends t-minus however long it takes to weed through crap. Longer is better, therefore crap is better. That only works up to a point, the point at which everyone starts to notice, like they are noticing now.


Now it's very difficult to search for an exact match. It seems like the inverted index is not used at all, and instead just encyclopedia-like results are returned. Which is fine sometimes, but not when I'm looking for a phone or a function name. I would think that a neural ranker should also estimate the quality of its results and switch back to the old algorithm if they are worse than expected. But no.

At least academic Google still seems to use the old ranking function for now.
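To make the inverted-index point concrete, here is a toy sketch of a positional inverted index with an exact-phrase lookup. It is an illustration under simplifying assumptions (whitespace tokenization, lowercase matching, invented sample documents), not a description of how Google's index actually works.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(pos)
    return index

def phrase_search(index, phrase):
    """Return ids of documents that contain the phrase tokens consecutively."""
    tokens = phrase.lower().split()
    if not tokens or any(t not in index for t in tokens):
        return set()
    # Only documents that contain every token can possibly match.
    candidates = set(index[tokens[0]])
    for t in tokens[1:]:
        candidates &= set(index[t])
    hits = set()
    for doc_id in candidates:
        # Check each occurrence of the first token as a possible phrase start.
        for start in index[tokens[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(tokens)):
                hits.add(doc_id)
                break
    return hits

# Invented sample documents, for illustration only.
docs = {
    1: "how to concatenate strings in PureScript",
    2: "TypeScript strings and how to concatenate arrays",
}
index = build_index(docs)
print(phrase_search(index, "concatenate strings"))
```

Both documents contain both words, but only document 1 has them adjacent and in order, which is the distinction an exact-match query is supposed to make.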


I just tested a few exact matches myself and it seems like it's completely broken; see my comment above. I am just floored at how bad it is now. Something is truly broken here.


It's the same story with Siri and other voice assistants. I think companies realized how costly the cloud compute for these services is, and have been downplaying them and reducing datasets ever since.


I have a slightly different manifestation of the same problem. I have a Google Alert set up for the name of my website (to see when people write stuff about it). Over half of the notices that I get are from spam/malicious sites that copy a sentence or two and make a spammy page. I would have thought that Google would be filtering this spam out...


Definitely agree. I remember Google being unobtrusive and genuinely helpful several years ago. Then, gradually it seems to have gotten not only worse but more “consumer-oriented”. Pushing me towards businesses and products and sales pages rather than just the best “search engine” text match. It also definitely seems to favor bigger and bigger sites or more cookie-cutter sites that have likely “perfected” SEO, things like WordPress blogs, Medium, etc. Not to mention the walled gardens! Try searching for anything without getting the top hits to be Pinterest links, ads, YouTube videos, etc and not because these things are the best matches for your query but rather because these sites have paid the most to be featured or paid an engineer to SEO-ify everything for them. I’ve wanted to switch to DDG but I have a paranoid worry I’ll miss out on some good things then too.


Are there any "open" search index initiatives going on?

Like a curated Wikipedia/OpenStreetMap of internet search indexes (with ignore-lists like uBlock has for ads)? I fully realize the impossibility of a handmade search index for the entire web; the internet is just too impossibly large and ever changing. I'm thinking more of subsets covering various topics of interest. Different communities could pop up that make specialized crawling algorithms for very specific topics, and you could subscribe to those indexes for a very personalized search experience. A mix of automatic classification and indexing, combined with hand-tweaked tagging and weeding out of spam.

Over time you could have a pretty "broad" search engine - kinda like how OpenStreetMap was initially terrible (coverage wise) but is today extremely competitive with the big guys. With it I could subscribe to or possibly even aggregate/proxy other indexes into a personalized search system. That way I could make a super specialized search that only searched indexes made for information search on technical topics and have another index explicitly filter out unwanted spammy websites.

Some "pre-index" on the search front-end could intelligently decide which indexes to query based on your search string. If you are dissatisfied with the indexes it chose to query you could manually tweak it and redo the search.

At the very least it would be awesome if search engines had a way to submit an "ignore-list" - an index over known crap/spammy websites. It's not feasible for me to keep adding specific ignored urls to a search term. When I search for images online I would prefer to ignore Pinterest by default.

It really irks me that, when I'm searching for information, Google defaults to thinking I'm looking for a product to buy. Sometimes I would just prefer to exclude ALL websites that generally only act as a retail/webshop for goods. That fine-grained search method is just not possible today.
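The ignore-list idea above can be approximated client-side today. Below is a minimal sketch that filters a list of result URLs against a user-maintained set of domains; the function names, the sample URLs, and the list contents are all made up for illustration.

```python
from urllib.parse import urlparse

def is_ignored(url, ignore_list):
    """True if the URL's host equals, or is a subdomain of, an ignored domain."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ignore_list)

def filter_results(urls, ignore_list):
    """Drop every result whose host is on the ignore-list, keeping order."""
    return [u for u in urls if not is_ignored(u, ignore_list)]

# Hypothetical user-maintained ignore-list and sample results.
ignore_list = {"pinterest.com"}
results = [
    "https://www.pinterest.com/pin/123/",
    "https://example.org/photos/sunset.jpg",
    "https://notpinterest.com/page",
]
print(filter_results(results, ignore_list))
```

Matching on the dot-prefixed suffix rather than a plain substring keeps unrelated hosts such as `notpinterest.com` from being dropped by accident. A shared, subscribable list would then just be a text file of domains, much like the uBlock filter lists mentioned earlier.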


I just searched for VCF West content that weekend (text search) and the result was quite a mess. Fewer results than expected, clearly biased by "channel", interspersed with unrelated video suggestions and false positives.


Remember when Google Maps had real zoom? You could count someone's eyelashes from space. The fidelity was so good that it was subsequently reduced. I think it was military/gov objection. To those who were accustomed to the power of old zoom, new zoom was super suck and still is.

I think what we're seeing with search is a corollary: the results are blurry, not clear like they once were, like an intentional backing away from clarity. Why dirty the results? Except for teaching copyright holders a lesson*, I don't understand.

*$593 million fine in France v. Google


Anybody who uses voice search on Google TV or Android TV (whatever they are calling it this week) should try the voice search on Amazon fire TV. It will blow your mind.


I just typed "Neil Cicierega Aerolong" into the YouTube web search bar and what you were looking for is the first thing that comes up, just the isolated video with that track.

Interestingly, on music.youtube.com, only the fan extended version and an unofficial video come up, not the plain track that shows up on normal full YouTube.

Which of those does the Google Home Mini search?


Home Mini used to search YouTube first. At some point, they changed the preferred context to Play Music, and you would have to say, "Play x on YouTube" to get YouTube audio, even if a more correct result was on YouTube than was on Play Music. Obviously, Play Music is no longer available; it seems like it now completely ignores YouTube searches and will only use YouTube Music.


I started to notice I was getting fewer hits on searches related to .NET development for versions prior to .NET Core. I pessimistically assumed it was due to some concerted effort to reduce access to MS-related articles. I don't use Google for much more than that nowadays, so I wasn't aware it is a general lack of quality results across the board.


Jumping from Google Search to a specific thing on YouTube/Music is just that, a jump. These are complicated add-on systems that are evolving, and there's not necessarily a direct connection between the two. There are all kinds of contracts and ulterior motives involved when the music industry and YouTube Music product marketing/goals etc. are in play. The choices they've made as far as user experience (which is what your complaint is really about) are theirs. It's not horrifying. It is 'something else is going on' but not freaking deceptive must-get-to-the-bottom-of-the-conspiracy. This could be a blog post you could link to?


Yes, search is broken. When you report SEO spam sites to companies like Cloudflare or GoDaddy, they just ignore the reports. Meanwhile Google happily incorporates negative SEO in their rankings. These days I'm using duck.com to get some decent results.



