The problems I’ve noticed with Stack Overflow are a few and hard for me to narro...

sznio · on July 25, 2023

Your first point seems to be most important.

> - google used to return really relevant results for SO, and it stopped doing so at some point a while ago

SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now? Google's current recommendation of going to w3schools or - even worse - geeks4geeks or any other content farm is and always will be worse than stackoverflow. I don't have a clue what their algorithm is doing but it's surely trying to kill Google search as fast as possible.

Another joke is the fact that searching for "[language] [symbol]" also brings me to these content farms instead of the documentation. You seriously can't find anything useful these days using Google.

technion · on July 25, 2023

This whole situation just shows as a lie everything we hear about SEO. Stack Overflow has the exact text, it loads incredibly fast (should be commended more for this), doesn't require ten megs of JavaScript to render, and as far as I know generally meets HTML standards.

These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them, if you open devtools you'll often see pages of warnings about deprecations and/or invalid HTML, and despite having the same scraped text always score higher on Google.

mountainb · on July 25, 2023

It has long been a mantra in SEO land that user generated content sites in general and forums in particular are to be aggressively down ranked. The reason for this is that industrial strength spam farms otherwise spin up tens of thousands of forum domains to pass link juice to what they are targeting. This naturally penalizes real forums, which often contain the best content for a query.

This is why Google has basically surrendered and why so many search result categories are now dominated by whatever sites Google has arbitrarily declared the winner through editorial decision making. In many search categories, we are effectively back where we started with Yahoo directories and hand-picked search rankings. What you see on the first page SERP is the best that they can do under the circumstances based on the fundamentals of how search works.

The web was a fun idea while it lasted, but if you are using it as a primary information resource, you are wasting your time.

wiz21c · on July 25, 2023

Funny, I just rented an introductory book about signal processing and was (re) amazed to see how well the information is explained, with tons of examples, and a real plan to guide you through the ton of knowledge you have to master.

I for one welcome back our new library overlords.

daveguy · on July 25, 2023

Come on now... Don't hold out... Citation?

wiz21c · on July 25, 2023

Understanding Digital Signal Processing, 3rd edition, by Richard G. Lyons

ajmurmann · on July 25, 2023

I think there are a number of things that could be done to improve this, but I'm sure Google won't do:

1. Heavily penalize the presence of ads. The content farms make their money with lots of shit ads. This will wreck their business model.

2. Just manually block or heavily penalize content farms and boost known good sites like SO and MDN.

Fanmade · on July 25, 2023

Well, those sites are often running ads provided by Google, so it's understandable why Google doesn't really have a good incentive to follow your first suggestion.

mountainb · on July 25, 2023

They do penalize heavy ads. There have been shenanigans on this front also (penalizing non-Google ad networks and favoring Google ad networks). They do penalize some content farms and favor others. The issue they are concerned about with SO and MDN among other user generated content sites is covert seeding at scale for the purpose of manipulating search results.

There's just a lot of fraud on the internet related to search and advertising manipulation, but it's under-policed in part because of the internationalized nature of it and because it is hard to bring fraud cases in the United States because of the particularized pleading standard. That should not stop the feds from bringing criminal cases, but generally the feds care about large dollar value frauds (as they probably should) rather than policing very large numbers of small dollar value frauds that have a major aggregate impact on the online economy. They like going after the guys who steal $100 million from deaf children with Lupus rather than doing 200 $500k fraud prosecutions.

Xeamek · on July 25, 2023

Except those ads are often provided by Google itself. So penalizing them would be self-harm for Google's revenue.

DeusExMachina · on July 25, 2023

Anecdotal, but my website lost a lot of search traffic after Google's core update in March, which seems to have affected SO as well looking at the first chart.

If I look at Google's guidelines, my articles follow all of them: in-depth, well-researched, demonstrating personal experience, better than other articles appearing in the search results. And yet, they were "penalized" by this update for who knows what reason.

I looked into it and some other websites benefited from the update, so who knows what changes they made and why.

johnny_reilly · on July 25, 2023

Again anecdotal, but I lost the majority of my SEO traffic late last year around the time of a core update. I've spent the best part of a year attempting to repair it, on the assumption I'd committed some heinous SEO crime. The more time that passes, I'm starting to think that the issue isn't mine so much as Google's. It's baffling. I wrote about it here:

https://johnnyreilly.com/how-i-ruined-my-seo

slashdev · on July 25, 2023

I don’t see what you did wrong, it must have been the algorithm change. My parents had a business that was killed by a Facebook algorithm change. My brother took a significant hit from an Amazon algorithm change. Building a business around any of the big tech companies seems very risky.

I think Google search has just declined a lot. I guess they’re losing the constant cat and mouse game with SEO. It seems worse than it has ever been, I’m relying more on ChatGPT and copilot now.

I can only imagine that LLMs will be the end of any content based search ranking. I don’t know how they’ll adapt to that.

DeusExMachina · on July 25, 2023

Looking at your post and the HN discussion, I am also of the same impression.

verteu · on July 25, 2023

Maybe Google (semi/permanently?) penalized your domain when spammers started using your GA4 tag?

johnny_reilly · on July 26, 2023

Gosh that would suck. Sounds plausible though

pjc50 · on July 25, 2023

> These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them

Only if you're not googlebot. The crawler sees a much nicer site.

gingerlime · on July 25, 2023

which should — in theory — get them penalized for cloaking. But obviously it doesn’t. Reinforcing GP’s point.

NavinF · on July 25, 2023

Google has gotten pretty lenient about that: https://developers.google.com/search/docs/essentials/spam-po...

"If you operate a paywall or a content-gating mechanism, we don't consider this to be cloaking if Google can see the full content of what's behind the paywall just like any person who has access to the gated material and if you follow our Flexible Sampling general guidance."

I wonder if they just gave up

post-it · on July 25, 2023

Hypothesis: Search, being Google's oldest product, is no longer prestigious to work in. It's in maintenance mode.

zerkten · on July 25, 2023

Does Google run other indexers for the purposes of catching cloaking? Are there other strategies that can be used? One of the problems of SO is that most of the valid content is out there and easily available without having to scrape the site which may mean penalizing for bad content is harder.

raverbashing · on July 25, 2023

And the fact that google is not detecting those is damning (to google)

XCSme · on July 25, 2023

Does it even make sense to serve different content to a bot than what a human would see? Isn't the search engine trying to rank content made for humans?

pjc50 · on July 25, 2023

It's an adversarial process. The search engine is, in theory, trying to rank by usefulness to the user, and the site owner is trying to maximize revenue by lying to the search engine. And the user.

AlexandrB · on July 25, 2023

I'm generally puzzled by Google's reluctance to do manual intervention in these cases. It's not like this is a secret. Just penalize the whole domain for 60 days every time a prominent site lies to the crawler.

sulam · on July 25, 2023

There are very many sites where the content you see as a non-logged-in user is different from what you see if you have in your possession an all-important user cookie.

gtirloni · on July 25, 2023

If Google's support is any indication, Google doesn't like to involve humans in their processes. There probably aren't enough humans to do this manual intervention you propose.

XCSme · on July 25, 2023

Then, maybe the "crawler" should be an actual PC navigating to the page in a browser, taking a screenshot (or live feed) of the page and processing it with AI.

pjc50 · on July 25, 2023

Eh, Google choose to be identifiable as googlebot and to obey robots.txt for other reasons of "good citizenship", because not everybody wants to be crawled.

Scarblac · on July 25, 2023

It makes sense if you know your content isn't nice for humans (e.g. full of ads and tracking stuff) but you want it to rank high anyway.

soco · on July 25, 2023

I wonder what I will see if I change my browser's user agent?
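
One rough way to check, as a sketch only: fetch the same URL twice with different User-Agent headers and compare the bodies. The helper names below are made up for illustration and the browser UA string is just an example. Sites that verify Googlebot by IP address or reverse DNS will not be fooled by a spoofed header, so a high similarity score proves nothing about what the real crawler sees.

```python
import difflib
import urllib.request

# Illustrative User-Agent strings (assumptions, not a complete or current list).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Download the page body as text (requires network access)."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def similarity(a: str, b: str) -> float:
    """Return a ratio between 0 and 1; 1.0 means the two texts are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()
```

Calling `similarity(fetch(url, BROWSER_UA), fetch(url, CRAWLER_UA))` for a page you are curious about gives a crude signal. Dynamic pages differ a little between any two requests, so only a large gap is interesting.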

Shrezzing · on July 25, 2023

Google is really failing hard in this regard, and I'm fairly sure it's intentional on their part. Searching "Typescript array" has obvious intent from the user, and an obvious "correct" first result. Google returns the documentation page in the 3rd result, but it's a link to a deprecated version of the page. The rest of the above-the-fold links are websites that contain Google ads.

Duck-Duck-Go returns the up-to-date documentation link 2nd, and the MDN result in 3, with W3Schools in 1. Bing returns actual content on the results page, describing exactly what you need to understand a TS Array.

Google have the incentive to push the poor sites, because they earn revenues from doing so. Bing and DDG don't have that incentive, and return much more relevant and useful links. That doesn't feel like a coincidence.

RyanHamilton · on July 25, 2023

I spent years learning a programming language well then further years delivering a training course, iterating and then providing sections of the course on the website free online. Both as advertising and to get new people started. Your "typescript array" returns one of the sites in the top 5 that basically copy-pasted via thesaurus many of my articles. I checked and it turns out they offer $50 for people to submit content for any language / technology. So you have someone in a cheap country paid to go copy content and reword it on that site. Then they rank higher than you, as they do this over many languages thus seeming more authoritative. Even more worryingly with ChatGPT, they won't even have to pay the $50 any more. So the whole internet may become like this. Leaving me little incentive to publish material except that which solely entertains myself, mmm facebook/twitter = not a good outcome.

withinboredom · on July 25, 2023

I have a friend that does something similar, but only does video with the text gated behind a paid-only site. He makes pretty good money, but the exact reasons you listed are why the site is paid-only. They have a much harder time stealing (as in posting as their own content) the video.

freediver · on July 25, 2023

Results on Kagi for comparison:

https://kagi.com/search?q=typescript+array&r=us&sh=Qa5cXHvwj...

We simply downrank sites which display lots of ads on them and also use community blacklists for dev site clones.

sotix · on July 25, 2023

I also notice and appreciate that Kagi returns older results while Google continues to push newer webpages. I have found so many useful results from perfectly fine content on older webpages. At this point, I’d be extra happy if Kagi had a Web 1.0 filter that focuses on basic html websites.

eitland · on July 25, 2023

For those who want this it exists on https://search.marginalia.nu

geysersam · on July 25, 2023

They should just add Google ads to all technical documentation pages. Problem solved.

coliveira · on July 25, 2023

Yes, Google search is nowadays, like everything else, run by AI. What nobody tells you is that the AI is trained to maximize Google's revenue. That's why they figured out it is better to put these ad sites on top.

NoMoreNicksLeft · on July 25, 2023

> Google is really failing hard in this regard,

Failing at what though? Is it anything they care about, that they want to do?

If not, then it's not so much failure as it is a change of plans on their part. They don't want to do that anymore, and there's no one else to pick up the slack.

geysersam · on July 25, 2023

There are browser extensions for blacklisting domains from your Google searches. I've been so incredibly happy using one of them. If I see one of those despicable content farms I just blacklist it and move on. Often when I search on Google for technical stuff I only get 2 visible results on the first page, 1 SO and 1 documentation. Soooo relaxing.

The business reasons why Google doesn't take steps to remove the bad content and make their product pleasant to use again is so far from my understanding it might well be aliens running the company for all I know.

lolc · on July 25, 2023

My understanding is that Google has an incentive to send people to content farms because those farms will show Google's ads. Stackoverflow doesn't. So they can increase ad exposure.

Thinking of it, it would be an interesting test to compare the ranking of two similar sites, one with google ads, another with ads from another provider. Might be good evidence for antitrust litigation. But what do you do if they just prefer sites with more ads? Because due to their market position, that benefits them, but it isn't anti-competitive against other ad-pushers.

geysersam · on July 25, 2023

Maybe you're correct. I've heard that explanation before but it just seems too incredible that they'd undermine their monopolistic global billion dollar business for a measly share of the revenue of geeks4geeks.

rightbyte · on July 25, 2023

Google is a self playing piano with clueless leadership. There is probably no plan involved.

Just managers doing what they get more money for or devs hunting promotions by increasing ad revenue by 0.01% in the short term one sting at a time.

permo-w · on July 25, 2023

the way you phrase it there makes it sound minuscule, but you scale that up to the size of the SEOified internet and the numbers are surely into the billions

martin_a · on July 25, 2023

I was thinking the same. Taking into consideration the vast amount of such SEO farms, there's surely a lot of ad money to be spent/earned if you prioritize the "right" sites.

marcosdumay · on July 25, 2023

Last time I saw, Google gets much more revenue from ads on search than from the entire 3rd party ecosystem.

permo-w · on July 26, 2023

a big number that's much smaller than a big number is still a big number

arp242 · on July 25, 2023

I don't think it's an intentional decision anyone has taken, or that they intentionally made the search engine the way it works now, but more of a "there's nothing wrong here from our perspective, so what's there to fix?" kind of thing.

dmje · on July 25, 2023

I've been using Kagi[0] for a while now and it's pretty great in general - but also has options to boost up / down / totally ignore certain domains. It also has "lenses" that let you set a context (example: I'm searching for code stuff so just include sites a,b,c).

It's really good and IMO more than worth the price.

[0] https://kagi.com

viraptor · on July 25, 2023

Yeah, my Kagi list of content farms / SO clones which are completely dropped from all results keeps growing. On the other hand, searching just SO from Kagi still seems to give decent results.

rightbyte · on July 25, 2023

Your experience matches mine. Spend two or three weeks blacklisting sites as you hit them and they disappear.

Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.

It takes time to build ranking.

The underlying reason is probably that the spam sites use Google Ads (revenue which is tied to 1000s of PMs and managers bonuses) and that Google as an org is deeply dysfunctional at this point.

ncruces · on July 25, 2023

Yeah, but then they're editorializing.

And not in a generic “this kind of content is bad for our users” but very specifically “site x.com is bad for our users.”

tivert · on July 25, 2023

>> Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.

> Yeah, but then they're editorializing.

> And not in a generic “this kind of content is bad for our users” but very specifically “site x.com is bad for our users.”

Except that shouldn't be a problem because I'm pretty sure Google already blacklists domains.

ncruces · on July 27, 2023

Surely, they do. But they reserve that for stuff that's really way beyond the line. For everything that might be legitimate they leave it to the ranking algorithm to sort out and it's a game of cat and mouse.

jazzabeanie · on July 25, 2023

This is good to know. I refuse to click on a geeks4geeks result even if it looks like it has exactly the answer I want.

coldpie · on July 25, 2023

Do you have a suggestion for a Firefox extension to do that?

gpgn · on July 25, 2023

I have noticed that Wikipedia is often pushed to the bottom of the page compared to a few years ago where it would always be at the very top.

makeitdouble · on July 25, 2023

Anecdotally Wikipedia is often the top result for me ... with the twist that it's the Google widget, with the side bar and related videos.

Only way below this block (which takes about 120% of my whole screen's height) come the "organic" results, that aren't great, but probably match what Google assumed I wanted to see.

stavros · on July 25, 2023

And it's always the result I want.

OJFord · on July 25, 2023

Then consider using DDG & '!w term', or some other method (searching Wikipedia, extension, I think Firefox has something engine-agnostic built-in) instead?

stavros · on July 25, 2023

Oh I do use DDG. I think it might suffer from the same problem, actually, since now I'm wondering why I see Wikipedia results very far down in Google when I don't really use Google.

prmoustache · on July 25, 2023

The comment above mentions using !w in your search to look only in Wikipedia since you say it is nearly always what you are looking for.

I would say at this point you could just use wikipedia as your default search engine.

stavros · on July 25, 2023

Yeah, I do do that for those things, but when I don't, I've noticed a decline in quality.

TRiG_Ireland · on July 25, 2023

You can add a keyword search in Firefox.

PaulHoule · on July 25, 2023

I never really liked StackOverflow. The only questions they seem to allow are “How do I get the length of a string in Python?”. Most of the problems where I am scratching my head and really need the benefit of somebody’s experience are software selection problems that aren’t allowed.

The competing answers paradigm is also fundamentally broken, I don’t want to see 15 answers to “How do I get the length of a string in Python?” I just need to see

   len(x)

Programming splogs do better than SO does in this respect. In fact, even the Q/A paradigm is bad because the average SO post requires scrolling past at least one extensive code example that does not work.

For more than 10 years I thought the world needed a search engine for programmers. You really ought to be able to upload your POM file or equivalent and have the system automatically search the correct version of the documentation. (Any attempt to look up things in the Java manual has to be written like “JDK17 javadoc {className}”; JavaScript libraries like reactstrap, react-router and such often have a few wildly incompatible versions and I don’t want to waste a millisecond with the wrong version doc, …)
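
To make the POM idea concrete, here is a minimal sketch, assuming a plain Maven `pom.xml` with literal versions declared directly under `<dependencies>`. The function names are hypothetical, and a real tool would also have to resolve properties, parent POMs and `dependencyManagement`, which this ignores.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by Maven POM 4.0.0 files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def dependency_versions(pom_xml: str) -> dict[str, str]:
    """Map artifactId -> version for dependencies that declare a literal version."""
    root = ET.fromstring(pom_xml)
    versions: dict[str, str] = {}
    for dep in root.findall("./m:dependencies/m:dependency", POM_NS):
        artifact = (dep.findtext("m:artifactId", namespaces=POM_NS) or "").strip()
        version = (dep.findtext("m:version", namespaces=POM_NS) or "").strip()
        # Skip entries with no version or a property placeholder like ${x.version}.
        if artifact and version and not version.startswith("${"):
            versions[artifact] = version
    return versions


def versioned_query(query: str, artifact: str, versions: dict[str, str]) -> str:
    """Prefix a documentation query with the artifact and, if known, its version."""
    version = versions.get(artifact)
    if version:
        return f"{artifact} {version} {query}"
    return f"{artifact} {query}"
```

The resulting string is only a better search query. Picking the right documentation site for each library is the hard part and is not attempted here.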

I wouldn’t mind searching answers from stackoverflow but I only want the best correct answer and I don’t want to read a long confused question, etc. As this would clearly save coders time maybe they’d pay for a subscription as they do for Jetbrains tools.

Cthulhu_ · on July 25, 2023

Years ago, Google announced they would crack down on content farms, and SEO advice was really like "you NEED to have this meta tag if you duplicate content from elsewhere, or else Google will fuck you over HARD!", but it seems they earn more money off of content farms than the sources.

This will hurt them long term I presume, but they won't care because they earned money.

soco · on July 25, 2023

Google's current recommendation is usually heaps of Pinterest randomness, and then they wonder why people start relying on ChatGPT "oh it's not a search engine" - sorry folks Google isn't one (anymore) either.

mschuster91 · on July 25, 2023

Google has gone down the drain. As I've written recently somewhere here, they could easily fix their search by hiring maybe a dozen people per country to moderate common search request results or to, hell, listen to users like here, and respond by booting the scammers.

The problem is, they won't, because active moderation beyond responding to legal (DMCA, right to be forgotten, anti-CSAM) demands would massively endanger their "we are an impartial search engine" defense.

ljm · on July 25, 2023

It's been over 10 years and it still endlessly frustrates me that searching for any Ruby or Rails documentation will send you to an APIDock page for Rails 3.2 and you basically have to goad Google into giving you the official documentation for either.

I suppose the real frustration is that Google became so pervasive that bookmarking a website and using its own search functionality is a total afterthought.

fouc · on July 25, 2023

Google search results have a filter for time, so could potentially improve the results by changing the date range back to 3 years ago.

masukomi · on July 25, 2023

try Kagi ( kagi.com/ ) . SO answers are almost always the first ones for my geeky questions (as they should be in most cases), and it also extracts and displays the official answer to the question that best matched your search.

shafyy · on July 25, 2023

Try out Kagi Search. You can manually increase website's weight and completely block others. E.g. I have increased Stack Overflow's weight and blocked those stupid content farms. Works great.

mglz · on July 25, 2023

Conspiracy theory: Bad initial search results force people to search more often, hence allowing Google to show more ads. Since few people switch away as a result, they continue doing this.

permo-w · on July 25, 2023

this is a bit like the cosmetics industry. there are very clearly probiotic solutions to body odour that could be developed with the coins down the back of P&G's sofa cushions, but if you fix everyone's body odour, then how are you gonna sell them anti-perspirant from now to the end of time?

now, in an ideal world competition would solve this problem, but the cosmetics companies heavily collude and anti-compete to prevent this

JacobSeated · on July 25, 2023

This is where I want to remind you, Stackoverflow is a Q/A site that sometimes contains stolen content from, as you put it, so-called "content farms" and the official resources.

Now, I do have a Stackoverflow user as well, but I actually prefer publishing my ideas on my own site rather than help build someone else's content farm for free. Stackoverflow is, itself, a content farm, and it can be very hard for new users to join the site. You can not even post comments without first earning enough points. For a very long time I would actually resist joining the site for that reason. I have only recently earned enough points to comment.

Now, I happen to own a so-called "content farm" too, and the choice can either be between creating a standard blog with very little traffic or try and cover everything you can possibly think of in order to compete with other "content farms" in your niche. It is very difficult if not near impossible for a single individual to create a valuable resource and maintain it, and it is simply not sustainable if you have paid authors working on it as well. There is no way you can monetize it decently. Stackoverflow probably found a way around this problem by simply leaning back and monetizing their users' content.

Once your site grows big enough, you also deal with a ton of spam- and hacking attempts. Everything combined just requires an inhumane amount of time to deal with.

Of course, authors are desperate because of how difficult it is, and perhaps especially authors from poor countries that might not have other sources of income. Their basic business model seems to be: create a content farm with ads, fill it with copy-written spam and hope Google indexes it. Often these sites even have multiple authors, which is quite baffling given the extra expense it must create for them. But I do not think they have actually thought the idea through – because it is just not profitable.

Weirdly it's often in the technology niche, which they are clearly not proficient in, and more or less containing stolen solutions with little original content added.

I have seen a few sites like this, rife with some of the most nasty grammar too. It's interesting they are able to rank simply based on their volume. Of course they must be using blackhat techniques, including linkbuilding if you analyze their link-profiles, because there is no way that something so poorly designed and maintained is getting that much attention compared with official sources or stackoverflow.

For those of us who own blogs, such sites are often easily outranked simply by writing a comprehensive article on whatever tiny topic they have posted about.

wahnfrieden · on July 25, 2023

Yes, if you cite a solution the mods there get angry when you don’t copy paste the third party site content instead of just link to it. The stated reason is to make sure the content isn’t lost. In other words to ensure the content is duplicated on SO.

I have no allegiance to SO ownership so when the fake SO sites show up in results instead of SO, usually reading them will just give me the answer more quickly than finding the actual SO source.

ceejayoz · on July 25, 2023

They want enough of an excerpt so the answer doesn't become useless years later when someone redesigns their blog URL schema or shuts it down. That's reasonable, and probably falls within fair use.

wahnfrieden · on July 25, 2023

That’s what I said

bobajeff · on July 25, 2023

>mods there get angry when you don’t copy paste the third party site content instead of just link to it...

There's a good reason for that. Sites come and go and as a result links to solutions die and you wish someone had just answered the question instead of just linked to it.

wahnfrieden · on July 25, 2023

That's what I said

EspressoGPT · on July 25, 2023

Absolutely. Google search results quality has declined and I often find myself prefixing search queries with "site:reddit.com".

Someone · on July 25, 2023

> it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?

Are they just fine today, too? To judge that, you have to look at the date of the question and its answers, make an educated guess at what OS/language/library versions they are about, judge whether that makes a difference for the version(s) you’re using, and only then evaluate whether the reply even was correct at the time (it may have had a thousand upvotes, but still be dated)

I think a really good Q/A resource would require posts to be tagged with version info. Most people think manual tagging isn’t fun, though, so it’s hard to get such a set from volunteers.

An alternative would be to require test cases that the site can run to check what version(s) replies are valid for, but writing such tests that do not break over time is hard, and, again, in general volunteers don’t like writing them.

That leaves generating tags or test cases. I don’t think we’re there, quality wise, to do that.
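
As a rough sketch of what version-tagged answers could look like (a hypothetical data model, not anything Stack Overflow actually offers), each answer could carry a validity range and lookups could filter on the version the reader is using:

```python
from dataclasses import dataclass
from typing import Optional

Version = tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version like '3.11' into a comparable tuple (3, 11)."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Answer:
    body: str
    valid_from: Version
    valid_until: Optional[Version] = None  # None means no known upper bound

    def applies_to(self, version: Version) -> bool:
        """True if the version falls inside this answer's tagged range."""
        if version < self.valid_from:
            return False
        return self.valid_until is None or version <= self.valid_until


def answers_for(answers: list[Answer], version_text: str) -> list[Answer]:
    """Keep only the answers tagged as valid for the given version."""
    version = parse_version(version_text)
    return [answer for answer in answers if answer.applies_to(version)]
```

This only handles purely numeric dotted versions and still depends on someone supplying the tags, which is the part volunteers dislike.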

avereveard · on July 25, 2023

Stack Overflow committed the cardinal sin of running their own ad network on their sites, not much of a mystery it's downplaced.

azherebtsov · on July 25, 2023

I was involved in SEO related projects some time ago. Not that I’m an expert. I’ve heard Google understands that the site is a search engine and does not index it. However it should be smarter, like do not index SO’s search pages, but do index question pages - because the original content is there. SO might have run out of crawl budget which Google assigns to each site, and/or Google prioritises fresh content. But I agree with the sentiment, what we know as SEO is nothing more than playing games with Google indexing algorithms, based on rumours about the recent changes in it, or improving page performance beyond reasonable boundaries. The other day I was looking at apple.com internals and spotted a few things which we were “fixing” on our pages. I asked SEO experts “what is the point of doing X, since there are examples of a well indexed page having that same problem?”. And the answer was like “when we will be as big as Apple…”

Modified3019 · on July 25, 2023

uBlacklist can help by culling the spam results. While it's mostly a manual thing, it's fast, easy, and a little effort goes a long way, I've found.

Unfortunately it still doesn't solve the issue that sometimes the good results are still buried pages away or simply don't come up at all due to Google's shitty algorithm.

I really need to look into SearXNG or something.

ajross · on July 25, 2023

Is that actually true though? So, I literally just went to Google this morning for a toddler python question (very much not my first language, heh).

"how to load a file all at once in python" returns a first hit pointing to a blog post answering the question correctly, a second pointing to a SO answer that is actually for a slightly different problem but contains the correct answer, answer #3 is a youtube video that probably answers the question correctly.

Geeks4geeks doesn't show up until #4, well below Stack Overflow. (FWIW, their answer was fine too).

> You seriously can't find anything useful these days using Google.

That really feels more like a meme than reality. Are there other subareas where the SEO is doing better than this one? It seems like a pretty representative question.

coliveira · on July 25, 2023

The answer is that the content farms are doing a better job of interacting with Google algorithm than SO. Of course it is a problem with Google search, but search was always hackable. The made-for-google sites know very well how to play the game.

bee_rider · on July 25, 2023

I wonder if Google should make their SEO prevention worse but simpler. Everyone has always wanted to SEO for Google, as long as Google has been around. It has seemed like only recently that good sites predictably lose.

Perhaps 10 years ago Stack Overflow was able to do some minimal SEO and then get by on content strength. Perhaps nowadays Google is doing a good job preventing basic stuff from working, so the only people to get good results are SEO-ologists that only know about exploiting SEO, and have nothing interesting to say on any other topic.

coliveira · on July 25, 2023

I think the answer is simpler. To rank well on Google you need to integrate with Google (search console, analytics and similar). I guess SO is not giving all their data to Google, so they cannot "optimize" for the site in the way that content farms are willing to.

m000 · on July 25, 2023

I think they need to bypass Google somehow to keep it going. Embracing LLMs could be a way out.

I already go to ChatGPT to cut through the SEO-optimized crap that Google offers me in the first couple of result pages. I would bet that a lot of the responses given by ChatGPT come from Stack Overflow.

Now, what if we had StackGPT which offered me similar functionality as ChatGPT, but better? E.g. respond with some code and an explanation, but also link to the sources (which are probably within their site, so they have prime access to them). Or offer as an explicit option to respond using sources other than their archive, but perhaps without citing sources.

highspeedbus · on July 25, 2023

My theory these days is that indexing services like Google are now too big to work properly. There's more and more noise added every time new information is indexed, to the point where strong bias is necessary for it to return relevant results to the average user.

Maybe there's a point where the internet, with decades-old information piling up, becomes unbearably big for indexing services to handle all of it in an efficient manner. Hence the recent "optimizations" that companies swear haven't worsened searchability.

madog · on July 25, 2023

This is what I want from a new search engine:

1. Respect exact match searches - this used to work enclosing the search terms in "" quotes, but no longer works. If there are no exact match results, return nothing.

2. Allow blacklisting or removing results from certain websites entirely e.g. I want to be able to configure geeks4geeks to never show up in any results ever

If someone could make this new search engine they would have a good shot at replacing Google :)
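
The two wishes above can be sketched as a filter over a result list. This is purely illustrative: the function name, the result dictionaries and the `.example` domains are all made up, and it says nothing about how any real search engine implements these features.

```python
from urllib.parse import urlparse


def filter_results(results, phrase, blocked_domains):
    """Keep results that contain the exact phrase and are not on a blocked domain."""
    kept = []
    for result in results:
        host = urlparse(result["url"]).hostname or ""
        # Block the domain itself and any of its subdomains.
        blocked = any(host == d or host.endswith("." + d) for d in blocked_domains)
        if not blocked and phrase in result["text"]:
            kept.append(result)
    return kept  # empty list when nothing matches exactly


# Hypothetical sample data.
results = [
    {"url": "https://docs.example/re.html",
     "text": "The re module provides regular expression matching operations"},
    {"url": "https://www.contentfarm.example/regex",
     "text": "regular expression matching tutorial with ads"},
    {"url": "https://forum.example/q/1",
     "text": "How do I match a regular pattern?"},
]

kept = filter_results(results, "regular expression matching", {"contentfarm.example"})
print([r["url"] for r in kept])
```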

freediver · on July 25, 2023

Both features exactly as described already exist in Kagi search [1] (founder here).

We are not trying to replace Google though, but offer an alternative to people who care so much for the quality of their search experience, that they are willing to pay for it.

[1] https://kagi.com

fho · on July 25, 2023

You won me over by summarizing listicles to a short list :-)

To be honest I think your pricing is too high. $25 for unlimited queries might be fine for somebody who needs a good search to work and earns appropriately.

But as a (former) PhD student I ran through the 100 free queries in 2 or 3 days and just would not have been able to afford 25€.

I would gladly pay 10€ (for unlimited searches) or 15€ (for an unlimited family option). But to me, 25€ just seems too high. That's 5 meals at my workplace's cantina right now (Germany, NRW).

(I assume you are aware of pricing issues, as pricing options have changed at least once while Kagi has been on my radar)

freediver · on July 25, 2023

Thanks for the kind words. There are many things like grouping listicles you can do to improve search experience, once the incentives are aligned.

Unlimited for $10 is something we are working towards.

fho · on July 30, 2023

Thanks for listening! At $10 per month unlimited searches I'll immediately switch.

Also thanks for creating Kagi. Kagi was the first "alternative" search that convinced me that there can be competition to Google. YaCy just does not work, most competitors (DDG, etc) just repackage the big engines. I use presearch as my daily driver right now, but am somewhat turned off by the NFT shenanigans behind that. Kagi looks like the only engine that stands on its own and is definitely something worth paying for.

eitland · on July 25, 2023

Can confirm.

And if you find a result that got included despite not being an exact match you can report it and see it get fixed in a few days.

christophilus · on July 25, 2023

I think Brave search has those features. (I haven’t tried it, though.)

DarkmSparks · on July 25, 2023

https://search.brave.com/

probably my new default search engine. Thnx

GMoromisato · on July 25, 2023

I'm sure everyone has thought of this, but is any search engine trying to add LLMs to the crawler pipeline? That might be more useful than at the user side (like Bing) where the index is already polluted.

adolph · on July 25, 2023

> but it still holds years worth of answers that were just fine a few years ago

The flip side of that is a large proportion of those are no longer fine or operant.

Calamitous · on July 25, 2023

That’s one of the nice things about Kagi: you can lower or block content farms, and elevate sites like stack overflow.

pydry · on July 25, 2023

>SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?

SO had an expert sexchange.

tzs · on July 25, 2023

> moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.

Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.

I actually need to do X, so those answers are useless to me.

Then I find another question on how to do X, and the mods close it as a duplicate of that earlier question. Even when the questioner specifically notes in their question that it is not a duplicate of that earlier question because they really need to do X the idiot moderators close it.

harimau777 · on July 25, 2023

I ran into that a lot when I was doing low level firmware programming. The answers to someone's question would be something like "that feature is only intended for ultra-specialized low level programmers". And it's like "in this case, I am an ultra-specialized low level programmer".

I'm thinking of things like assigning constants to pointers in C and/or manipulating pointers directly.

tivert · on July 25, 2023

> Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.

> I actually need to do X, so those answers are useless to me.

I know what you mean. Whenever I (rarely) ask a question on Stack Overflow, I always have to defensively load it up with language anticipating misinterpretations and instructing people to answer my question and not some other one.

Otherwise, internet-point-chasers will come out of the woodwork giving easy, worthless advice. Even with all the defensive language, a few always show up.

texuf · on July 25, 2023

Sounds like you’re good at prompt engineering

firesteelrain · on July 25, 2023

> defensively load it up

Agree. This isn't a problem on the partner/sister sites as much. They are very helpful too. But they don't show up on Google

peeters · on July 25, 2023

I've never really thought about how contributors trying to avoid the XY problem really stands at odds with StackOverflow's mission of being a repository of answers, rather than a helpdesk. Not all Ys present as X, and not all Xs are actually Ys. Sometimes it's an XZ problem.

The best you can hope for is some answer down the page that says something like "to answer the actual question..."

JuanPosadas · on July 25, 2023

Sometimes an X problem is just an X problem.

dkarl · on July 25, 2023

The mods are so ubiquitous and so busy on SO, I wish they'd spend some of their time silencing the "let's figure out what your real question is" pseudo-trolls.

I call them pseudo-trolls because I think they are well-meaning, but they function as trolls: overrunning a web site, hijacking discussions with repetitive and irrelevant content, and making most potential users feel that participating isn't worth the time and effort of interacting with them.

lostphilosopher · on July 25, 2023

Even if X isn't the right solution to my use case I still often want to know _why_ X (or my implementation of X) doesn't work. The answer to that might be a really valuable learning independent of the problem at hand.

gromneer · on July 25, 2023

The XY comic has done irreversible damage to stackoverflow.

LtdJorge · on July 25, 2023

I've seen this example thousands of times, it's as infuriating as it is frustrating.

JohnMakin · on July 25, 2023

here’s a classic SO thread:

OP: “how can I accomplish X doing Y?”

top voted response: “I wouldn’t do X by Y. Instead, I’d consider Z. It’ll do everything X by Y would do, plus these other things.”

buried sub comment to top voted response: “that’s not what he asked, to do X by Y (exact solution provided)”

then 10 comments blasting the guy who actually answered the question for not doing it the way the top voted answer did it.

newaccount74 · on July 25, 2023

The very popular Zalgo answer is a perfect example of this problem [1]

The user asks a question that can be answered quite easily, and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem.

Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people.

[1]: https://stackoverflow.com/questions/1732348/regex-match-open...

cowthulhu · on July 25, 2023

Hilariously, the accepted (I assume by default, not by the asker) answer is flagrantly breaking, like, half a dozen rules and guidelines… but because it’s cynically and unhelpfully crapping on a newbie, it stays up. Or maybe there’s a good reason for it to stay up, but at a glance it sure isn’t a good look.

I actually think SO is a great site and resource, but I also think a lot of that is despite the bitter old timers in the community, not because of them.

borland · on July 25, 2023

That answer is only there because it's really old, from the early days of S.O. where people were allowed to ask questions that weren't super serious binary yes/no style. It'd get moderated and deleted in a heartbeat today. A forlorn monument to the cool place that S.O. once was

umanwizard · on July 25, 2023

Maybe I’m just a fun-hating asshole but personally I find this kind of thing annoying, not cool. People are just trying to get work done, not see someone’s attempt at cringey “nerd culture” humor.

eitland · on July 25, 2023

That is not what most of us complain about I think.

I, and I think many others, are sad that S.O. removes many serious work related questions (have lost count of how many times I saw the perfect question with the perfect answer, with a note that this isn't what Stack Overflow is made for and these questions only exist for historical reasons).

the_af · on July 25, 2023

You agree with SO: those answers are no longer allowed, and haven't been for years. They are only kept as historical artifacts, and marked as such.

underdeserver · on July 25, 2023

Oh come on, it's not crapping on a newbie. It's a funny comment that serves as a reflection of the days and weeks this guy spent debugging these kinds of systems.

As someone who was a newbie at the time when it was posted, who was looking for a way to parse HTML, I took away that it's just really the wrong way to go about it. I didn't feel crapped on at all.

bazoom42 · on July 25, 2023

The question is about tokenizing XHTML, not parsing it into a tree structure like a DOM, which is a critical distinction. Regular expressions are a perfectly valid way to tokenize. This is why the snarky answers do not suggest a better solution - there isn't one!

If you scroll down long enough, you will see answers explaining that. But they aren't upvoted as much as the answers suggesting the questioner is an idiot.
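
To make the tokenizing point concrete, here is a minimal sketch of a regex that recognizes opening tags while skipping self-closing and closing ones. The pattern and the name `OPEN_TAG` are my own illustration, not taken from any of the answers, and it assumes attribute values never contain `<` or `>`; comments and CDATA sections are not handled.

```python
import re

# An opening tag such as <p> or <a href="foo">: a tag name, any run of
# characters other than angle brackets, then ">" not preceded by "/".
# The lookbehind is what rejects self-closing tags such as <br />.
OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*(?<!/)>")


def open_tags(text):
    """Return the names of all opening (non-self-closing) tags, in order."""
    return [m.group(1) for m in OPEN_TAG.finditer(text)]


sample = '<p>Hello <a href="foo">link</a><br /><hr class="foo" /></p>'
print(open_tags(sample))
```

This recognizes individual tokens only; it makes no attempt to pair opening tags with their closing tags, which is the part a regular expression cannot do in general.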

stavros · on July 25, 2023

The question isn't looking to parse HTML (or XML). Regexes are inappropriate for HTML because they can't adequately match the starting and ending tags, not because of black magic. The OP isn't looking to do that, so regexes look like a perfectly acceptable way to go.

underdeserver · on July 25, 2023

I agree completely.

But the asker is very clearly a newbie. The question does not contain further context. The asker's suggestion is wrong (I think). And we've all worked with junior engineers who try to use the wrong tool.

The answer is a whimsical way of making an appropriate suggestion in this inferred context.

Also, to be fair, I think it's not mathematically impossible to use dark regex magic (with look-behinds and such) to parse HTML, but that's a discussion for another day...

Etheryte · on July 25, 2023

The answer can only ever be accepted by the asker, not even mods can change that. It's actually not that rare that the accepted answer is not the one with most votes, in which case the accepted answer is somewhere further down, not the first one on top.

cowthulhu · on July 25, 2023

Oops, you’re right! I was totally misremembering.

pasc1878 · on July 25, 2023

No, accepting an answer can ONLY be done by the person asking the question.

Lutger · on July 25, 2023

The question explicitly invites the kind of witty reflection shown in the accepted answer, by adding: "and what do you think?"

As mentioned elsewhere, this is an old question and both the kind of question and answer wouldn't be allowed these days.

However, I also fundamentally disagree that questioning the assumptions in a question is unhelpful. You want to solve a problem, find an approach and want help because you have problems with the approach? What if the approach you took _is_ wrong? It's very helpful, especially for advanced beginners or at the intermediate level, to be given a different way of solving the problem even if that is not what you asked.

It depends on context if this is just pedantry or genuinely helpful. The best answers I found start with answering the question that was stated, but then proceed in showing how the problem behind the question can also or better be solved.

umanwizard · on July 25, 2023

“You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question, but it’s very unhelpful for me who finds the page years later actually wanting to do X.

_gabe_ · on July 25, 2023

> “You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question

Stack Overflow isn't a site for beginners, it's for "professionals". At least, that's what all the Stack Overflow defenders tell me every time I criticize the snarkiness, rudeness and patronizing manner of many answers/comments you receive on Stack Overflow.

waterproof · on July 25, 2023

I disagree- on SO you’ll usually get some literal answers as well as the more high-level, question-the-premise ones. Why not both?

umanwizard · on July 25, 2023

Because in many cases people don't bother giving the real answer once a question already has an answer, especially a highly upvoted one.

I'm fine with indirect, "question-the-premise" comments, but they should be posted as comments, not answers, because they are in fact not answers.

waterproof · on July 25, 2023

In this case, the question actually says “and more importantly, what do you think?”

So the answer is actually on topic.

It may be a bit… stylistically unusual, but I think I came away with a pretty good idea of what the answerer thinks.

diemechanist · on July 25, 2023

> "Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people"

"Some" is an understatement.

usrbinbash · on July 25, 2023

> and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem

Yes, and they are factually correct in doing so. The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."

HTML isn't parseable with regex. The various answers under the question explain in great detail why [1] that is the case.

SO isn't a help forum, it's a question archive. The purpose of an answer isn't to solve one guy's specific question, but to provide an answer that is useful to all people who ever stumble upon this question.

[1]: https://stackoverflow.com/a/1758162/19508364

firesteelrain · on July 25, 2023

Your response and the other responses are proving our point. It wasn't about context free grammars, level 2 or level 3, etc. It was a very limited subset of a problem. Answer should have been, "while I don't recommend doing it the way you want to do it, that should work for your limited subset".

usrbinbash · on July 26, 2023

> Answer should have been

Yes, and answers on that very page, with lots of upvotes, do exactly that. People looking for answers online, can reasonably be expected to scroll down a page with results.

BeetleB · on July 25, 2023

> HTML isn't parseable with regex.

Poster is not asking for this. He is asking how to parse a specific subset of HTML. And it is demonstrably parseable.

> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."

It is not the appropriate way to tighten a screw, but there likely is a correct way to do it with a hammer.

It is fine to point out that there are better ways to parse that HTML, but it is not wrong to do it with regexes.

Sorry to be blunt, but having coworkers like you makes the job really annoying. I'm not a newbie, but a seasoned programmer. If I'm asking a question and am in a domain with a fair amount of experience, don't give me patronizing answers.

usrbinbash · on July 26, 2023

> Poster is not asking for this.

Poster is not the one answers are for. Answers are for everyone who stumbles upon this question in the future, and the general topic of the question is very much about parsing some HTML with regex.

Again: SO != Help Forum

> but there likely is a correct way to do it with a hammer.

No, there isn't. Because the correct way is to use a screwdriver. There is certainly a way to do it with a hammer, same as there is a way to write a webserver in brainf__k. Doesn't mean that way is good or should be done.

> Sorry to be blunt, but having coworkers like you make the job really annoying.

Bluntness is fine. I will be blunt as well: Having to fix code full of hammers used to tighten screws is a lot more annoying than having colleagues who try to prevent a codebase full of hammers in the first place.

anter · on July 25, 2023

> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."

My car broke down in the middle of the desert because of a screw that came loose and all I have is a hammer. You have just condemned me to death because you assume you know better.

marcosdumay · on July 25, 2023

Nothing in that question makes me think the person asking it wants to parse HTML. Most HTML parsers will never give the result the question described. And unless you want to dig into tar structure, solving that question is an essential part of creating a parser.

So, no, the top 3 answers are all bullshit.

bazoom42 · on July 25, 2023

> HTML isn't parseable with regex.

The question isn’t about parsing, the question is about recognizing a token.

tlogan · on July 25, 2023

The funny thing is that the person who wrote the top voted answer is not smart at all. He might look smart to a newbie but the question is about tokenization and not about parsing.

So here is the reason: the top voted answers are wrong.

mihaaly · on July 25, 2023

Ironic, as answering another question ("What alternatives are there to do X?") instead of the original one does not indicate smartness at all.

Yajirobe · on July 25, 2023

> If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes.

Jesus this is cringe

RHSeeger · on July 25, 2023

To be fair, that question was written during a period when "how do I parse this html with regular expressions" was asked multiple times per day. And "regular expressions are not a reasonable tool to do that, use a parser" was the correct response to 99% of them. And, at some point, someone decided to throw out a more amusing version of that response. It _was_ funny at the time.

zeroonetwothree · on July 25, 2023

I thought it was amusing. Reminds me of a lot of 90s internet humor.

Eisenstein · on July 25, 2023

The thing is that oftentimes people who want to do 'X by Y' are actually asking how to accomplish Q. They think 'X by Y' is the solution and get hit by a roadblock, not knowing that it will not help them and they are wasting time.

This is called the XY problem and is extremely common on tech related forums and mailing lists.

* https://xyproblem.info/

jason-johnson · on July 25, 2023

Sure, but the issue is that SO was used largely for people working in companies with arcane rules. I can’t tell you how many times I’ve gotten one of these annoying “don’t do X, do Y” when I already know this. I have to do X for some reason, I don’t know how to do X because I do Y when given a choice and now no one will answer how to do X because someone killed interest in the question by apparently answering it. I use whatever points I get to downvote these answers.

The thing people don’t get is: when you answer on SO you’re not answering that poster. You’re answering anyone who will ever have this question. It’s quite arrogant to assume it will be an XY for every single person forever more.

The proper way to answer is to answer the question exactly as asked and then insert your “but you probably should be doing Y instead” at the end.

underdeserver · on July 25, 2023

Disagree.

Doing things the right way is BETTER.

If you can't, you should add a bit to your question saying "I know the standard way is to do Y, not X, but because of reason Z I can't do it."

jason-johnson · on July 25, 2023

Again, you’re not answering the person who asked but every person who ever will. Some of them will be asking because the “right way” is not an option in their situation.

ivanbakel · on July 25, 2023

And those people can look for questions where the "right way" is justifiably unusable, or pose those questions themselves (and find out if they really have to avoid it.)

Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way. If they want to know how to do something the wrong way, because they don't know what the right way is, an answer that simply tells them how is a bad resource.

It's not enough to tag caveats onto such dangerous answers, because people can't read. Instead, newbies should have to overcome a sufficient amount of opposition to filter out those who don't know why they're doing what they want to do, and the rest can make the little effort of being very explicit about why they want to do something the wrong way.

lelanthran · on July 25, 2023

> And those people can look for questions where the "right way" is justifiably unusable, or pose those questions themselves

Can't be done, will be marked as duplicate.

chimprich · on July 25, 2023

Exactly. I've seen precisely this "documentation antipattern" occur many times. "How do I do X with Y"? "You probably want to do Z instead". Upvoted, question answered, all other related questions of "no, really I do want to do Y" get closed as duplicates.

Then Googling for doing X with Y gets you a bunch of closed questions and a labyrinth of links all leading to a question that was answered 10 years ago on a different software version where Z possibly was the right way to do it but now isn't.

And of course there's no way to reopen the question because it has been closed by a level 15 Magister Templi moderator and a lowly level 3 apprentice moderator like yourself needs to either answer 146 more questions or moderate 192 other questions to clear enough arbitrary hurdles to achieve holy question reopening powers.

And there's possibly an appeals process but that involves recruiting 13 moderators who you have to convince to give this question special treatment and declare that one of their number of sacred moderators made a mistake.

underdeserver · on July 25, 2023

This is bad then. They are not duplicate questions.

paulmd · on July 25, 2023

Yes. StackOverflow mods frequently mark questions duplicate that are not. That is something that has been observed by many many people.

Some of it is that SO has gamified shitting on and suppressing the question/asker instead of gamified providing the answer, and built a culture of toxicity that tolerates the abuse of the tools in this fashion.

And when the CEO asked them to tone it down maybe 5 years ago they basically did a collective “am I so out of touch? no, it’s the askers who are wrong”. Extremely funny to read the meta responses to that at the time.

https://stackoverflow.blog/2018/04/26/stack-overflow-isnt-ve...

https://news.ycombinator.com/item?id=16934942

(admittedly "women and people who don't speak English well are particularly unlikely to adapt to the pedantic neckbeard culture we've built" is a spicy take for your average SO'er, or wikipedian, but it's also not actually a wrong one either. SO's culture problems probably do disproportionately chase away users with marginal engagement; nobody likes putting up with formalized neckbeard culture and those users have absolutely encountered it before and absolutely have an aversion/revulsion to entering yet another online neckbeard nest. I think this is a case of “he’s probably right but the medicine would have gone down better with the manchildren if he hadn’t mentioned women and minorities”, and he’s also right that those issues have continued to bury SO over the last 5 years.)

tivert · on July 25, 2023

> Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way.

Then you have to do two things in your answer:

1. Correctly answer the question as asked.

2. Add your opinion about the "right way" to do it.

If you only do #2, you are failing "every person who ever will ask."

ivanbakel · on July 26, 2023

Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?

[0]: https://www.joelonsoftware.com/2000/04/26/designing-for-peop...

tivert · on July 26, 2023

> Again, I don't think this enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?

You know that's not your responsibility. If some newbie makes a mistake, that's their responsibility (and a learning experience for them).

And frankly, I think you greatly overestimate how valuable and essential your non-responsive "you're asking the wrong question" answer is.

> https://www.joelonsoftware.com/2000/04/26/designing-for-peop...

That link is about users. You're misapplying its lesson if you're using it to justify not answering a developer's development question.

Quit coming up with excuses for not answering the question.

ivanbakel · on July 26, 2023

>That link is about users. You're misapplying its lesson if you're using it to justify not answering a developer's development question.

Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?

Put another way - when was the last time you used a development tool, or a library, or some other resource, and sat down to read the full documentation of it? I would posit that that's very rare as an activity, even for developers who need to develop a deep understanding of what they're using.

It's much more common to learn by doing, and the limit of that learning is very often what the developer can't do. Answers which easily enable developers to do something are overwhelmingly likely to lead to developers doing that thing - much in the same way that a long page of library documentation which gives an example is likely to lead to developers repeating that example, even if at the end of the docs, there's a little caveat saying that you shouldn't follow the example for so-and-so reason.

>If some newbie makes a mistake, that's their responsibility (and a learning experience for them).

But is it a good experience? Sure, maybe they'll learn that they always have to read the whole answer before they use any part of it. But we sensibly have abandoned this no-guardrails approach to teaching in almost every arena where it's been used, because it's not really suited to the way people do things in real life - and in real life, people often end up affecting others with their mistakes.

Does a junior developer who learns how to glue SQL strings together in their favourite programming language, and makes the "small mistake" of not learning anything about SQL injection in the process, benefit from the learning experience when they cause a data leak? Do their customers? Or should the learning resources they access maybe use the pedagogical tools available to make sure those kinds of mistakes are really hard to make, even if it occasionally inconveniences a seasoned pro?
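
To illustrate the kind of mistake meant here, a minimal sketch using Python's built-in `sqlite3` module. The table, the rows and the two helper functions are invented for the example; the contrast between splicing input into the SQL text and binding it as a parameter is the only point.

```python
import sqlite3

# In-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_glued(name):
    # Unsafe: the input becomes part of the SQL text itself.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_bound(name):
    # Safer: the input is bound as a value and never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"
print(len(find_glued(payload)))  # the glued query returns every row
print(len(find_bound(payload)))  # the bound query returns no rows
```

Both functions behave identically for ordinary names such as `"alice"`, which is why the glued version tends to survive testing.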

tivert · on July 28, 2023

> Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?

There are a lot of different kinds of "users," and I think the kind of thinking in that article is totally inappropriate when applied to a developer Q&A board.

To be perfectly blunt: the result of what you're advocating is to condescendingly treat experienced people as newbies so dumb that their question should not be answered, because you think they're so dumb the real answer might distract them from the lecture you want to condescendingly give them.

People like that are super annoying and almost always unhelpful.

Every single fucking question I ask on SO has some lazy condescending dude chiming in to answer the easy question he thinks I should have asked, after he totally failed to understand the constraints that made my question hard. Of course, lazy condescending dude always thinks he knows better.

pasc1878 · on July 25, 2023

Yes but the right way should be the answer unless it is explicitly stated why they can't use this.

Most readers will be able to use the "right way".

jason-johnson · on July 25, 2023

No, the best thing is to not assume you know better than anyone who will ever ask this. It’s good to mention what the right way is and why but your answer should always include the answer to the question exactly as asked at a minimum.

pasc1878 · on July 25, 2023

I agree but sometimes the answer exactly as asked leads to wrong things. So sometimes you don"t provide the answer to the exact question but include the reason why the exact question is not good. This gives the option for the questioner to comment why the exact answer is needed.

My experience with less experienced developers is that they ask the exact question as that is where they have stuck but they are ignorant of the better ways.

I do tend to answer differently depending on the questioner's reputation. If they have a higher rep then I can assume they know what they are doing.

floren · on July 25, 2023

You know what's at least as common on SO? "I don't understand the thing you're asking, so I'll pretend it's the XY problem and tell you about something I do understand"

ideamotor · on July 25, 2023

I’m not convinced anyone interacting on SO can diagnose something like this. The act of breaking down a problem to a tiny part so you can post it kinda guarantees this scenario.

But I think it will always be up to the user of SO (not the poster or answerers) to make the real judgement on what is useful.

Often I think SO is useful to use as a bunch of puzzles folks solved. You gotta decide if they are relevant.

SOLAR_FIELDS · on July 25, 2023

Just some random musings about Stackoverflow:

SO is at its best when it’s actual error debugging IMO. When you google some specific error, whoever else has had a similar error is right there. I feel like GitHub is replacing this more and more though - I often get the GitHub ranked specific error higher than Stackoverflow these days. Usually you get better discussions on the GitHub issues too, for a multitude of reasons. Two off the top of my head:

1. all of the people working on the stuff related to the issue are very close by

2. the moderation is not nearly as heavy handed as SO.

ChatGPT is also much better than SO as well if you can give it enough context and the thing you are working on wasn’t built on stuff released after 2021

I also really like Stackoverflow for current event type stuff, like black swan type events. One recent example is when google’s Paris data center was on fire and infra guys were helping each other out trying to get systems online.

All of this combined means that StackOverflow the forum is probably on its way out though. They made the mistake of taking VC money and the model hasn’t really proven profitable so they have really made some poor decisions to please the vc overlords.

I won’t miss Stackoverflow much other than nostalgia unfortunately - better alternatives have arrived. Seeing the decline of all of the other Stackexchange sites kind of sucks though. There aren’t better alternatives for many of those

yarekt · on July 25, 2023

Just out of curiosity, what are the alternatives? I still find the moderation approach well made, even if it looks heavy handed. It’s important to create information for the future, not just for now

pasc1878 · on July 25, 2023

ChatGPT is not in any way better than SO - no, see the current moderation strike.

Both sides can identify ChatGPT answers as being wrong. The question is how can they be deleted. The moderators say they can delete a lot by manual inspection. SO say that AI tools were deleting wrong ones.

ElFitz · on July 25, 2023

In my experience it has usually been right enough, and has the added benefit of adapting its answer to my issue.

Which is more than Google, or a previously answered StackOverflow question, can say.

But then, it’ll probably greatly vary depending on the problem you are facing.

fireflash38 · on July 25, 2023

My biggest problem with github issues is similar to the problems with SO:

Bots closing issues because someone doesn't spam the page. Closing as duplicate of (non related bug). A slew of random solutions that are only tangentially related and don't really solve the problem.

axblount · on July 25, 2023

The issue is that sometimes people just want 'X by Y'. To get a question answered, you shouldn't have to list every constraint and design decision that led you to that point.

Comes up all the time when people ask how to do things in bash/sh. I know there are better tools for the job, but this is the one I have.

techdragon · on July 25, 2023

Oh god that just reminded me how often people ignore the question asking for posix shell or “/bin/sh” or other specific shell scripting language… and proceed to answer the question using bash, zsh, Perl, python, or even the slightly less wrong option (because it is kinda weird to be shell scripting without a moderately normal unix environment) of using a bunch of Unix binary programs to do the requested job without actually solving the core problem of the question, because the tools made it easier…

And then to ice the cake you find the question because your question has been marked as a duplicate of the older question where they answered using unix binary tools… and you specifically asked about doing something in “pure shell script” or something similar to that phrase.

Stack overflow is fundamentally a system design that breaks down at scale due to misalignment of incentives that are necessary for it to work well at smaller scales (as can be seen in the successful operation of various smaller Stack Exchange sites for various topics such as Law, Aviation, Physics, etc)

pasc1878 · on July 25, 2023

The bash not sh issue is due to ignorance on the part of the answerers, not an XY problem. Also shell can't do everything - you are in a POSIX environment so use POSIX tools. The Unix environment is an environment of putting together many small tools and not just using one, so any shell script can call POSIX tools as a minimum. So a question that asks for a complex shell script rather than using the tools does need to explain why.

I must admit that I don't buy in to that philosophy and like using one tool, so for scripting I would do all in python, so I would not be answering that question.

Many Linux users think that their way is the only way, and that means bash as the shell and many other things like GNU coreutils, gcc etc. I am a macOS user and my professional career includes several non-Linux Unixes, so I know bash is not the only shell - try csh for fun - which is partly why I use python or previously perl, as they are the same on all machines.

Eisenstein · on July 25, 2023

Since I apparently did a terrible job explaining it (the link does a much better one) -- it is when someone has a problem that they are trying to solve in a way which will not solve their problem adequately or at all -- it is not when they are using the perceived wrong tool for the job.

lelanthran · on July 25, 2023

And what all the replies are telling you is that most XY problems are misdiagnoses.

Explaining what the XY problem is to people who are telling you about its high rate of false positive identification is, itself, an XY misdiagnosis.

Your reply is an example of what people are complaining about - you are addressing the issue you wished was asked, not addressing the issue you were presented with.

xyproto · on July 25, 2023

Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumptions.

lelanthran · on July 25, 2023

> Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumptions.

But those are correctly diagnosed XY problems. No one is complaining about those.

My parent was told the issue is too many incorrectly identified XY problems, and responded with an explanation of what the XY problem is.

That is the example of a misdiagnosed XY problem, which was kinda my point. This sort of behaviour makes the actual experts leave the site in droves.

If, when answering a question, one were to discard the answer the minute they write "Why would you want to do this?", you'd get far fewer incorrectly diagnosed XY problems.

As I said in a different thread, ChatGPT sometimes does this as well, but at least with ChatGPT, when it is answering a question that was never asked, it doesn't also act like a condescending jackass. There are no "Why would you want to do this?" type of questions.

IIsi50MHz · on July 31, 2023

Asking why does not have to be condescending. I agree that some responses can read that way, or seem in some other way hostile. In text, or with any reasonable spoken tone, I would not assume that a person asking me why is condescending.

But on second consideration, I suppose you would not, either, and I suppose you are specifically talking about responses which, each taken as a whole, are easily interpreted as some form of hostile.

yarekt · on July 25, 2023

I do see both of your points of view. There are some good answers on SO that capture both. They first explain why it’s infeasible, talk about a better approach, then lastly give pointers on how to achieve what is asked regardless using their best reasoning. Think it just depends on the quality

Eisenstein · on July 25, 2023

> And what all the replies are telling you is that most XY problems are misdiagnoses.

I responded to the person above me when there were literally two other comments in this thread.

> Your reply is an example of what people are complaining about

Defining something is not an example of an XY problem.

> - you are addressing the issue you wished was asked, not addressing the issue you were presented with.

As I am not much of a programmer but work on electronics and computer hardware I deal with different types of people than would be on SO, so I am not addressing anything but my own experiences.

pwdisswordfishc · on July 25, 2023

> I know there are better tools for the job, but this is the one I have.

You may know, but someone else asking the exact same question may not.

So sorry then, but listing every constraint and design decision it is.

lelanthran · on July 25, 2023

> So sorry then, but listing every constraint and design decision it is.

You don't need to do that, a simple "I know what the XY problem is, and this isn't it" prefixed to every question you ask should be enough to stop the race to tell you all about the XY problem.

I mean, at this point it's clear that more people know about the XY problem than people who don't.

usrbinbash · on July 25, 2023

> you shouldn't have to list every constraint and design decision that led you to that point

True, but that logic goes both ways: Unless told otherwise, whoever reads the question, isn't required to assume that there is a constraint.

If I get asked how to water plants with a sieve, without getting told why getting a watering can is impossible, "You don't, use a watering can", is a perfectly acceptable answer.

Especially when the question is asked in a question archive, rather than a help-forum.

If specific constraints apply to a question, then they should be a part of the question.

lelanthran · on July 25, 2023

The trouble with SO that I've seen is that there are more false positive identifications of the XY problem than false negatives.

IOW, any time you think you have spotted an XY problem, you're probably wrong.

And that's the problem with SO moderators and regulars. They classify everything as an XY problem because it allows them to answer the question they know the answer to rather than answer the question that was asked.

tunesmith · on July 25, 2023

Part of this is because problems (X) are complicated and people just as commonly demand the "simplest possible example" (Y) that demonstrates the problem. So people ask how to do Y, and then others ask why on earth they would do that.

One common example I've run into lately, as I've been reading about state machines, is people asking how to implement a simple react component as a state machine, and others objecting to the premise of the question since using a state machine for a simple react component is obviously a bad idea.

cwillu · on July 25, 2023

Telling people they're doing the wrong thing is extremely common. People actually wanting the wrong thing is common, but not _extremely_ common, and the mismatch is one of the more rage inducing things on the internet.

zulban · on July 25, 2023

Alternatively, people might have to use X and Y to accomplish Q because of their organization or team. If it's technically doable, there should be a solution and explanation for that X and Y problem somewhere.

_3u10 · on July 25, 2023

Like how to use a database as a queue, which generally works much better than any queuing system I’ve ever used, except ones that use redis as the database engine for the queue.

I’m sure if you’re twitter Kafka is actually a better solution, for everyone else, it isn’t.
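As a rough illustration of the "database as a queue" idea, here is a minimal sketch using Python's standard `sqlite3` module. The `jobs` table and the function names are hypothetical, and SQLite is used only so the example is self-contained. Claiming a job inside one write transaction is what stops two workers from taking the same row. On a server database such as PostgreSQL, the same step is commonly written with `SELECT ... FOR UPDATE SKIP LOCKED` instead.

```python
import sqlite3


def make_queue(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode so the
    # transactions below can be managed explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn, payload):
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_next(conn):
    """Atomically mark the oldest pending job as claimed and return it."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'claimed' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row  # (id, payload), or None when the queue is empty
    except Exception:
        conn.execute("ROLLBACK")
        raise


def complete(conn, job_id):
    # Remove a finished job so the table does not grow without bound.
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))


if __name__ == "__main__":
    q = make_queue()
    enqueue(q, "send welcome email")
    job = claim_next(q)
    print(job)
    complete(q, job[0])
```

This leaves out retries, visibility timeouts, and dead-letter handling, which a real queue built this way would still need.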

msla · on July 25, 2023

That can be infuriating.

Want to host videos on a laptop (which has a big SSD) and stream them to a Pi (which is attached to a big screen) over a LAN? Hey, here's a post about how to host videos on a Pi and stream them to a laptop! Upvote and share! My point is, you don't even have to be trying to do something all that strange for people to apply the XY Problem logic and refuse to help you.

(Solution: NFS mount and a patient understanding that the Pi cannot play certain kinds of video, so you'll need to transcode some of them first. See? Nothing bizarre, but surprisingly outside-the-box given what I could find online at the time.)

placesalt · on July 25, 2023

My assumption whenever I see that behaviour in a response is that the responder simply does not know the answer to the question asked.

It’s fine, I think, to answer a question and then suggest a better method. It’s presumptuous in the extreme to dismiss a question with some pseudoacademic neologism.

dspillett · on July 25, 2023

If doing X by Y is possible and will achieve Q, then the best way to respond to these questions is either of:

1. To do Q that way you would do <solution, or at least pointers to how to find the solution elsewhere> but you will likely find it far more efficient/easy/whatever to do <something else> instead.

2. Q can be achieved far more efficient/easy/whatever by doing <alternative>, but if you are stuck with using Y then try <solution or pointers as above>.

Of course this relies on you correctly deriving that they are trying to achieve Q, or them explicitly stating the fact. Maybe instead they are trying to get to K.

casey2 · on July 25, 2023

Everybody is aware and has read esr etc. The point is that people who are asking X by Y want to learn Y, with accomplishing X as a side goal at best. It's funny that you call wanting to learn Y a waste of time. Is it because you believe Z is a superior way of doing X? Why do you believe that? Experience, science or mathematical proof?

shriek · on July 25, 2023

Often the better answer is to just explain what's being asked by strongly encouraging them in the Z direction. Sometimes people just want to understand what's happening behind the scenes rather than just looking to solve for Y.

berkes · on July 25, 2023

But that's a different question. And should be asked differently.

"I'm trying to learn foo.js and attempt to do so by porting Tetris to it. I know foo.js is a terrible option for a game. So. How can I write to the canvas from foo.js?"

Is often answered properly, if only because to many it's a nice puzzle.

"How can I write to the canvas from foo.js?" is different in that it will attract a lot of people explaining that foo.js deliberately did not allow writing to the canvas, because Z.

dan-0 · on July 25, 2023

This is part of the problem. You know your constraints, other people do not, but like to assume they do. So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.

This may help less experienced engineers not understanding their problem set, but for a more experienced engineer it's absolutely obnoxious to think of all the ways to defend my question so I don't have to deal with a rush of "oh but you should do this instead" answers getting up voted that don't actually answer my question, then asking to accept the answer.

To one set of users it's possibly helpful, to another it's useless if not also condescending, often condescending to both sets.

The bottom line is now, SO is so filled with these types of responses, I can't expect to get a very specific question answered, which is really the only reason I'd ask a question in the first place, so why use it?

There are plenty of chat groups now via Slack and Discord in my field where I can get much more direct answers. People aren't worried about getting down voted for a bad question, people aren't giving low quality answers to boost their points. So for me, SO is practically dead except for the occasional obscure error message that I can query for there.

berkes · on July 27, 2023

> So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.

My experience is that it's not so much a defensive argument, but context. My example was poor in that it could be misread as a defensive argument, sorry about that.

I meant it to show how adding some context changes the question. Because, in programming, it is all about context. AKA that "It Depends" meme.

rerdavies · on July 25, 2023

That's unkind, and doesn't really address the nature of the problem at hand.

What's the name for the "I don't want to use a sledgehammer to solve a problem that should be solvable with a screwdriver" problem?

Or in this case, the "spinning up 300 lines of code to integrate an XML parser vs. a dozen lines of code based on regexes" problem. For reasons that are unclear to me, XML parser libraries tend to be painfully difficult to use (speaking from personal experience with 4 different XML parser libraries).

I don't think it's a surprise to anyone involved that an XML parser is going to solve the problem.