It's crazy that so much content is now inaccessible due to subs going private. I don't know what to blame more: centralization of forums into one giant company like Reddit, or Google's algorithm that still shows those private Reddit pages.
Maybe this is a good time to mention the Web Archives browser extension [1], which offers links to various cache / archive providers for any page you visit from a toolbar button. There are many such extensions; this is the one I've been using occasionally. Simple but very useful.
I haven't tried it since the beginning of the Reddit strike though: I don't use Google and I only very occasionally run into Reddit pages. I know ArchiveTeam has asked for help archiving Reddit [2].
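For anyone who would rather not install an extension, here is a rough sketch of the same idea: build archive lookup links for a page by hand. This is not how the extension itself works. The function name `archive_links` is made up for illustration, and the two URL patterns are my assumption about how the Wayback Machine and archive.today accept lookups, so check them before relying on this.

```python
def archive_links(url: str) -> dict[str, str]:
    """Return candidate archive lookup links for a page URL.

    The URL patterns are assumptions about the public lookup formats
    of these services; they are not taken from the extension.
    """
    return {
        # Assumed pattern: Wayback Machine redirects to its latest capture.
        "wayback": "https://web.archive.org/web/" + url,
        # Assumed pattern: archive.today shows its newest snapshot.
        "archive_today": "https://archive.ph/newest/" + url,
    }


if __name__ == "__main__":
    for name, link in archive_links("https://example.com/some/page").items():
        print(name, link)
```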
They definitely did. I don't know if it's still there, maybe in some (hamburger) menu attached to the results, if any? In any case I think their cache is still readable, but their googlebot is fast and may be quick to get rid of the cache if a page disappears… Recently, I've had bad luck with search engine caches when trying to read a page that does not exist anymore.
You have to click on the kebab menu next to the result title - you should see a modal popup titled "About this result". Then you either have to expand the "More Options" collapsed at the top to see the "Cached" pill, or scroll to the bottom of the modal to see it - it seems like they have 2 versions they're A/B testing.
I'm not sure why but it seems like they're really trying to make it hard to find.
I've said it before, the web's demise is Google's fault.
SEO optimization has pushed content 'producers' to generate page after page. There are websites that have a page for every Microsoft KB<number> error out there pointing to their own product.
It even is starting to overtake YouTube.
All just to get higher in Google and push (malicious) ads to visitors.
I think this is EXACTLY the problem with algorithms and AIs to build an index. You don't have a human in the loop to go "ok, enough of this bullshit, if you keep doing this we're going to ban you from our service."
The whole "ladder climb to first place" ruined search results. There was nothing wrong with being on page 2 -- if page 2 contained USEFUL RESULTS TO BEGIN WITH. Usually you get half a page of local stuff, followed by half a page of useful stuff, followed by maybe 2-3 pages of gibberish and regurgitated stuff.
Like a reddit topic will be posted on a bunch of 'forums' verbatim with all the comments as replies, because there are that many 'fake forum' sites out there seeded with that content lol. Same with stack overflow, really. (maybe stack overflow clones are the worst).
My favorite are the info sites that try to get you to download malware to solve your problem. XD
What you're seeing on Google is the content that's most well adapted to Google's algorithm. There's a lot of stuff that is not as well adapted. For the most part it hasn't gone away, it's just not able to compete with the bullshit.
At the same time I don't think Google is exclusively at fault. I think their dominance in shaping web traffic is a far bigger factor.
I would expand on that to say that Google's algorithm no longer has anything to do with showing you the best results. It has been altered or adjusted to show you or get you to click on the most ads. It's the same thing Facebook did with its feed, except that Google isn't, generally, monetizing platform engagement (though YouTube certainly is).
So yes, it's some part SEO, but it's also Google biasing things that aren't actually the 'best' results for what you're searching for in order to show more ads.
I think there is a third cause: the low visibility of high quality non-Reddit forums on Google against the tide of low-quality results.
There's some kind of SEO exploit that Google is unable to counter. Apparently, by creating bot accounts on a semi-reputable social media platform like LiveJournal, Baidu or Reddit and having them post endless links to each other from website X, website X is inflated on Google even if no humans interact with any of the bot posts/accounts.
I don't work in SEO and have never bothered to investigate this; it's purely a topic that keeps coming up in the Twitter posts of one of the owners of LiveJournal spinoff Dreamwidth (their account is @Rahaeli).
Google loves to group results by domain too. Say you search for something related to your hobby. Somewhere on the page is going to be a hit from the forum for that hobby, but just below it in a smaller font will be like a half dozen sub hits from that same domain, and then that's all there is for that domain in the search results. Never mind that all the relevant knowledge online about this something might be contained in that forum, effectively below the fold until you expose it with the site:forum.com flag, but this requires a priori knowledge of forum.com being a good source to use for your something.
Reddit has tons of spam that I run into whenever I try to search for any kind of prior product information or advice through the API. As of June 30th, maybe that will no longer be my problem, but I don’t think the spam links will cease.
It's really disturbing just how bad Google search has become. For so many cases Google actually is only useful as a search interface for Reddit, StackOverflow, and other sites that accumulate (and actually garbage collect) knowledge. Without such a qualifier Google will just give you absolute SEO trash results for many types of searches.
I'd just like to note the irony of saying how bad Google is while at the same time saying that Google's general purpose search engine is still better than any of the specific search engines of these individual sites.
Perhaps the issue isn't that Google is bad, and the issue is that search is incredibly hard.
No I disagree, because this simply wasn't the case a few years ago. Back then I could just search something, and Google would give me results, and that was it. It wasn't too long ago when googling was a proper skill to be learned and it felt like that could get you anywhere on the web. Google was exceptional once.
Now Google won't even acknowledge "" (quoted exact-match terms) anymore, and having to hold its hand and guide it towards a single website which I already need to be aware of is pretty pathetic compared to what Google was once able to do. Also the fact that it gives back so much spam and even puts it at the top of results.
> Google's general purpose search engine is still better than any of the specific search engines of these individual sites
This is only partially true. Google's search engine is definitely better than Reddit, but that is really not hard (I need to emphasize this, as reddit's search is really bad, unless it is old.reddit, then it is at least somewhat OK), but for many other sites the reason to pick Google is just convenience.
> Perhaps the issue isn't that Google is bad, and the issue is that search is incredibly hard.
I think the issue is more Google deliberately allowing and pushing all that spam, because users that find what they seek will spend less time on the site. Otherwise I find it hard to explain this drastic drop in quality. Would also explain why they are taking away all the useful search and query tools.
Or the people responsible for working on it just don't have the skill anymore, who knows.
You hit it exactly. It never was this bad, and now it's all junk results. And even worse, every page is the same junk results. It just repeats. It's maddening.
The answer is somewhere in between pushing people to click ads, the zealous bias towards automation, the pivot towards AI (no, the one that happened in 2016), and (tinfoil hat time) the influence of entities that saw Google's previous adroitness at quickly connecting the average person with specific and accurate information as a threat.
It's not bad because they can't build a decent search engine. Building a decent search engine is a solved problem, which they solved.
It's bad because their incentives aren't aligned with their users. The shit results they are giving aren't because they can't give good ones, they are because they don't want to.
They are bad now, but they were exceptional a few years ago. Same thing with Gmail: now I get obvious spam in my inbox and real email in the spam folder. Looks like they gave up.
Before 2020, it was possible to google random phone numbers and figure out who was calling. Nowadays, googling random callers is useless, but Bing somehow still works.
I've wondered, with 200,000 employees, how close you could get to something useful by tasking them all with part-time wiki editing/building. Hand-build an internal ranking of web quality, keeping in mind that a few high-quality posts don't automatically make all posts at that host domain high quality. The surfacing of information could still be very automated.
Or have it crawl anything but be able to provide allow lists the way uBlock Origin allows picking block lists.
But I'm not sure I would want an allow list approach. How do you run into new interesting websites? How would people find your new website? A block list would be good though.
There's a really good example that someone found in the wild, where Google would omit exact string matches, but could then be coaxed into producing them indirectly.
It is disturbing because I thought this was a problem we had solved 20 years ago. If I could remember a few details about something that was indexed by Google, I used to be able to just find it.
I'll take this opportunity to mention Kagi. I've been paying for Kagi search for a bit over a month now. Not only are the raw search results far better than Google ever was, but the search engine is privacy focused, allows you to easily create your own block lists of sites you don't want to see, and has some fancy AI bells and whistles that allow you to summarize both individual pages and all of the results of your search.
I would love a competing SERP that simply listed the results on Wikipedia, Reddit, Stack Overflow, etc. Have a major section of the page that ALWAYS lists the Wikipedia result. Don't make me add "reddit" to every search term.
It would certainly be an interesting next turn of events if Google made an offer to buy out Reddit - just for the user-generated content and to stop the site from self-destructing, i.e. the golden goose offing itself.
It's unlikely given current capital interest rates, but at this point this drama has pretty much crossed the borders of plausible fiction anyway.
I don't know about ability but it is obvious they are not willing to even try. Why? If search result quality were anywhere on their radar, there would be a button next to every search result letting me block the domain from my personal search results. One day they might even figure out how to use the information about blocked domains to help rank results without being gamed.
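To show how little is needed on the client side, here is a minimal sketch of a personal domain block list applied to a list of result URLs. It is only an illustration, not anything Google offers: the function names `domain_of` and `filter_blocked` and the example domains are hypothetical, and it needs Python 3.9+ for `str.removeprefix`.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host.lower().removeprefix("www.")


def filter_blocked(results: list[str], blocked: set[str]) -> list[str]:
    """Drop results whose host is a blocked domain or a subdomain of one."""
    kept = []
    for url in results:
        host = domain_of(url)
        if any(host == b or host.endswith("." + b) for b in blocked):
            continue  # blocked by the user's personal list
        kept.append(url)
    return kept


if __name__ == "__main__":
    results = [
        "https://www.spam.example/fix-kb123",
        "https://forum.example/thread/42",
        "https://mirror.spam.example/thread/42",
    ]
    print(filter_blocked(results, {"spam.example"}))
```

The hard part the comment points at, feeding everyone's block lists back into ranking without it being gamed, is not addressed here at all.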
Some years ago there was the option to remove a domain from search results (I don’t remember if it was a default or an option).
Google removed it without giving a reason. I assume that it was not working as expected.
But the web has changed in these years. It would be interesting to have it again.
- Google already put ads on search, thus incentivized to show users the best results.
- But this must be balanced with other incentives, like upranking links that also display ads good for the bottom line. (And up/downranking stuff to maximize its "value" (eg data collection), at users' expense.)
- re "best software engineers in the world"; Firstly, Google (200k pax corp with national interests) has optimized itself to lowrisk recruiting, thus having a heavy "careerists" bias builtin by design ("best software engineer" is definitely nowhere near the top priority of its hiring rulebook). Which corresponds to a very low % of mentioned group. Also https://sockpuppet.org/blog/2015/03/06/the-hiring-post/
( You'd think ppl such as Satoshi will pass the Google interviews? What about Linus and etc? )
( Not to mention plenty of weirdos (like Ciro on SO) would be banned from interviews regardless of skill level; and plenty of best engineers are weirdos )
> Google is good for search results which generate money, e.g. when you intend to buy things.
I think how true that is depends on where you want to buy from. Google is good at referring you to certain other websites to buy from. If you're looking for broader search results than that, it's not really any better than for other sorts of searches (at least, that's how it was a couple of years ago).
I don't think so. One of the things that made me start avoiding google was that it had developed a really huge blind spot for great sites that aren't "name brand". They're still there, but google won't tell you about them.
Google should probably just buy Reddit and run it at a loss.
This may not be good for us the users, and might end up killing Reddit in the long run, but it's better for both Google and Reddit than the current situation.
What's the difference for users between Google buying Reddit and Reddit just shutting down now? One promo cycle? Less?
There's no "long run" any more for Google's touch of death, like there was for Google Groups; they've gotten quite good at speed-running value destruction.
It feels like it's more core to Google's search business than the projects normally shut down by Google. Maybe not quite on par with YouTube, but it could be close.
Deja News/early Google Groups provided great search results. Google, in usual form, has been working hard to destroy any property outside of Web Search/YouTube for a long, long time, Groups included. The internet is bigger now, so Reddit has an advantage there, but Groups could have been what Reddit became if it received some love instead.
I had the same thought but with OpenAI being the buyer. I think Gruber was right that a lot of the motivation for the recent Reddit API changes are OpenAI (and other AI startups) using Reddit data and becoming ultra-valuable: https://daringfireball.net/linked/2023/06/09/reddit-ipo
That data was helpful, but if Reddit is blacked-out, or people move to more walled-off forums it makes future iterations of ChatGPT and the like more difficult to train on recent developments. Why not have OpenAI run Reddit as a way to get lots of moderated data into its models?
Reddit's valuation is insane right now, they got in before the "correction" so it's at some absurd multiple.
Anyone trying to buy Reddit would need to basically lowball the hell out of them and make the case that an IPO will only be worse (which may or may not be true).
Maybe I am missing something about IPOs here, but if Reddit’s valuation is way off because it was before the correction, wouldn’t it be easy to argue your low ball offer is the real value?
Even if they IPO, wouldn’t the stock immediately crash if it’s overvalued?
It doesn't sound right. It last raised money at $10B in 2021, but Fidelity, who led that round, has since cut that valuation on its own books back to $6B.
I would not be surprised if that is a conservative number.
The market has moved and reddit hasn't been going in the right direction.
While that's still a lot of money, it is unprofitable, so a trade sale would make some sense. The real issue is that it comes with a lot of reputational baggage that a lot of public companies would not want.
I still think $6B falls into "insane", but that's my unqualified opinion, so it may not be correct.
That said, I also don't think they're willing to accept the idea that even $6B might be high, so I don't think anyone would be successful with a meaningfully lower offer.
Duckduckgo has somewhat recently added a feature where with some searches they'll spot you're trying to search for Reddit posts, and ask if you want more of them displayed. It's a useful addition for the habit I've picked up of adding "reddit" to searches to limit the blog spam.
It's a shame it came out now, when reddit is increasingly full of spam and now is blowing itself up.
It's funny, I've seen spam sites in my web crawls that have 'reddit' appended to the page title. I've never seen such pages appear in any search results, but points for effort.
Both DDG and Google have some default rules to limit the number of results from one domain. So those clear spam attempts may not make it, but I've seen many sites that include something like "this item is popular on Reddit for x reason, but they don't pay us so it's actually not that great."
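As a rough illustration of what a per-domain limit could look like (I have no idea how DDG or Google actually implement theirs), here is a sketch that walks a ranked list and keeps at most `limit` results per host. The function name `cap_per_domain` and the default of 2 are made up for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def cap_per_domain(urls: list[str], limit: int = 2) -> list[str]:
    """Keep at most `limit` results per host, preserving ranking order."""
    seen = Counter()
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if seen[host] < limit:
            kept.append(url)
            seen[host] += 1
    return kept


if __name__ == "__main__":
    ranked = [
        "https://forum.example/t/1",
        "https://forum.example/t/2",
        "https://blog.example/post",
        "https://forum.example/t/3",
    ]
    print(cap_per_domain(ranked, limit=2))
```

This also shows the downside mentioned elsewhere in the thread: if one forum holds most of the relevant knowledge, everything past the cap is hidden unless you search that site explicitly.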
I'm trying to write a book and researching some niche old stuff and trying to extract info from 90s websites (some of the websites are dead and I don't know how to search Internet Archives), and Google Search is becoming hopeless, so I've been switching to yandex.com to find stuff.
Even Google Image Search is terrible now and I use TinEye.com and yandex.com for that. I should probably try Bing more as well.
I haven't noticed this, but have seen the complaints. I don't tend to see many reddit links in my search results, and haven't noticed a change since that blackout.
I don't get a ton either which is odd because I OFTEN add "reddit" onto my search and get way better results for what I'm looking for.
Kinda odd how Google search hasn't learned yet that Reddit results are higher quality, at least for me? Almost like it's not optimizing for quality of results...
It's almost like Google's optimizing for revenue on results has created the Reddit situation. Can't find better results without filtering for Reddit explicitly, so people end up on Reddit and end up posting on Reddit and around and around it goes.
Yep! Here is a classic false consensus effect that's very prevalent on the internet: What temperature should you cook a chicken breast to?
If you search "what temperature should I cook a chicken breast to" I'm actually surprised that one of the top 10 results is from The Spruce Eats and gives good advice, but the rest are flooded with sites parroting USDA recommendations. A problem in general I personally face searching for anything food or health related is that top results are all sites regurgitating the same information with nothing new added.
Now search "what temperature should I cook a chicken breast to reddit". First hit has a post with some really good info in the top two comments. Second hit is "Stop Cooking Chicken Breasts to 160°F (71°C)". Excellent advice right in the title.
I'll give examples of other topics I get better info faster on via Reddit:
* Fixing just about anything; troubleshooting appliances, home repair, video card issues, linux laptop compat issues, etc
* Hobby related stuff; RC cars, electronics, etc
* Home improvement/repair related stuff; flooring, HVAC issues, and this could go on the first point but it's really its own world
I usually don’t see many Reddit results in my searches either. According to previous threads on HN a lot of people append site:Reddit to their Google searches so perhaps that is what they are referring to.
Same. It would be nice if someone would share some example queries. I use Google quite a bit but rarely end up on Reddit, nor do I feel Google search quality has been going down over time.
As a concrete example, I run a used retro games business.
Proper technical manuals don't exist for the repair work that I need to do; finding the solution to a problem I can't work out myself is a mix of YouTube, Reddit and a looooot of filtering out bullshit.
I've never worked in search, so I'm curious: what are the technical difficulties of implementing a YouTube-style "Don't Recommend This Channel" option? On the face of it, the YouTube recommendation can be an offline process, and much easier to scale.
Google pretty much has "F-you" level developer man-power to throw at this problem, so I'm somewhat surprised they haven't implemented it yet (the "bad result is good for Google" reasoning never quite made sense to me). I'm curious if the functionality is not worth it or if the technical challenge is insurmountable at Google's scale.
After so many years I no longer use Google at all, for anything. In particular with search, all those ads showing as organic results at the top drove me away. I switched to DDG and have been in general happy with it.
Today while I was searching it was a double whammy because Stackoverflow was down. So Reddit content was gone and StackOverflow too. That's basically the two resources for coding knowledge gone.
RTFM and #GetGud. That's what we did in my day. Sure only nerds could stand using the software we produced, and installing any software was a dice roll on corrupting the windows registry, but hot-damn if you couldn't be wrong with the confidence that nobody will know better!
Ironically, LLMs could be a game-changer in dealing with SEO noise, not at the search stage, but during the indexing of web pages. By understanding context and semantics, they could help create a more refined and relevant index, effectively cutting through the clutter of SEO-optimized content.
Between ArchiveTeam and Pushshift, I guess search engines should deliver those instead of the Reddit domain. A lot of the discussion topics have enduring usefulness and so searching a more stable source makes sense.
Moderation tools (Reddit has stated many times that they would retain free access to the API), blind users not being able to use the official clients (they are accessible besides a very few actions that are only used by mods, which are easy to fix and which Reddit has committed to fixing), and porn being removed from the API (porn in the 3rd party apps is the hill people wanna die on?).
This protest has been fabricated by the multi-millionaire 3rd party apps developers that want to keep the free money flowing.
(not affiliated to anything mentioned)
[1] https://addons.mozilla.org/en-US/firefox/addon/view-page-arc...
[2] https://news.ycombinator.com/item?id=36254172