AI spam is winning the battle against search engine quality (theregister.com)
26 points by cebert 52 days ago | 21 comments



Let's not let Google off the hook by blaming AI sludge though. Whenever I use Google in a context where I don't have an ad blocker I'm always surprised when every single result above the fold is a paid placement that's clearly not what I'm looking for. And Google has become awfully comfortable just ignoring terms in my queries.

Not that LLM sludge isn't also an issue, of course.


Same way Google has looked the other way when it comes to fraudulent ads on YouTube.

It's obvious that Elon Musk doesn't want to sell me crypto. That an automated system made by one of the largest technology companies in the world hasn't detected the fraud is alarming, and Google also has the resources to pay for manual review.

They just don't care that their system is being used to ruin people's lives.

I saw an ad on YouTube once selling washing up tablets to kids to... eat.


If you search for something on Google it says it is returning "x of 82,000,000 results." at the top. But if you actually click through the results pages you'll find that it will only ever return 400 results max for any unchanging search string. If you have 10 results per page that's 39 pages. If you have 100 per page that's 4 pages. Of those 400 results at least half (near the start) are ads and SEO trash. So it is only possible to actually look at ~200 results per search with Google. And that is why search is so useless these days. It's 900 instead of 400 for Bing, but it's the same problem.

There are no free search engines anymore. AI spam doesn't really change the dynamic.


> There are no free search engines anymore

Search is now worse than it was in 1996. Trust me, I was there, kids :) What you have now is a pale shadow of a real, functioning internet.

Perhaps the way forward is radically different.

If a consortium of independent crawlers could release a massively compressed model periodically, say 12 times per year, then search could move to the client side. Anyone got sensible estimates on what the size of a model could be to give say about 80% of the capability of Google/Bing?

I think it would weigh in smaller than 1TB, and with clever differential code you'd only have to download the changes.

Isn't it time to move search of the web off the web?


Altavista, Yahoo, Lycos and the like weren't great in 1996, that's why everyone switched to Google in the first place.


I remember in elementary school around the year 2000, we had a special class where we all learned boolean operators and advanced search functions.

Absolutely none of that works on any modern search engine, except Kagi.


Remember the "I'm feeling lucky" button on Google? Their search was so good that you could, for a while at least, count on that button getting you where you wanted to go in a single click.


Yep. I remember when I used to be able to search for exact strings and engines would find them! Booleans even worked. It was a magical time for web search from ~1999 to 2015.


Here's something that should be shocking:

At universities all around the world you'll sit a course called "Research Methods". It's a bit of statistics, philosophy of science, hypothesis formulation, significance testing, understanding epistemology, quality, quantity, bias...

I've a fairly good overview of it, because I've taught it at least 10 semesters.

One of the things baked into every research methods course is search. Sometimes you learn the interfaces to specific tools for searching papers. But most of it is what you describe: boolean operators, regular expressions, sorting and filtering...

The students are still told to use Google and that this works with Google. But this information hasn't been useful for almost 5 years now.

For 5 years we've been training bachelors, masters and PhD students to go out there and use tools and techniques that are almost completely irrelevant. The primary official tool at the foundation of all academic research is broken.

Almost no research methods professors I know at any university have thought to train students to deal with advertising, spam, AI clutter, disinformation - to deal with the reality of Google as it actually is.

For 5 years we've been misleading students. Because we got so hung up on a monopoly, we've allowed a single corporation to fuck the whole of Western academic research - because year after year I definitely see poorer results.

One day I hope the world is able to look back at the colossal cost of BigTech to culture.


Come on, that's a ridiculous criticism. Nobody* actually clicks through all those pages. People rarely even click to the second page. Instead they adjust their search terms.

The reason Google is starting to suck is that the first page results are often trash.

*standard HN disclaimer


Exactly.

One may argue that only the first result of a search is important, but that is not true.

For any general topic I should be swamped with results.

I have my own link database for news. For the search term 'Elon Musk' alone I have 4 thousand links.

400 results for a search term is laughable.

It is hard to control a flow of thousands of links. It is easier to control information if you narrow people's scope of visibility to close to nothing.


AI spam has only been around a year or two, but the decline in search engine quality has been going on for much longer.


All the more reason it was so predictably unwise to have given the bad guys a new force multiplier.


It's not like this is new. Before LLM-generated spam there was content farm spam, and it has noticeably been on the increase for years now.


Does AI really make this much worse? Search engine quality was already low because of SEO spam and copycat scraped sites. I mostly Google using site:blabla.com to filter out that crap.


Yes, it does. Because AI can churn out content 24x7 while traditional SEO spammers at least had to sleep and take time off on weekends.


I hope AI spam will make adtech implode into itself.


More likely that it will bury everyone else who isn’t spamming.


I am sure Google will create special meta tags for sites to state that their content is AI-generated, and then perhaps use them to improve ranking: AI-generated content < "manual" content.


I bet AI content sites will be absolutely thrilled to use such tags and worsen their ranking.


There is also a conflict of interest. Google profits from directing you to content farms, as there will be more ads there.

Oh no, we earn more money by producing a worse product!

It would only change if OpenAI disrupted the search monopoly. Even then we would end up in another monopoly, and the enshittification cycle continues.



