
The problems I’ve noticed with Stack Overflow are a few and hard for me to narrow down but basically:

- google used to return really relevant results for SO, and it stopped doing so at some point a while ago

- moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.

- because of the previous bullet, oftentimes the best answer is buried in comments and has very negative feedback despite answering the exact question

Due to a combination of these things, filtering against the noise for what I wanted became increasingly more difficult and often the solution to my problem was easier found searching github comments or random blogs.




Your first point seems to be most important.

> - google used to return really relevant results for SO, and it stopped doing so at some point a while ago

SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now? Google's current recommendation of going to w3schools or - even worse - geeks4geeks or any other content farm is and always will be worse than stackoverflow. I don't have a clue what their algorithm is doing but it's surely trying to kill Google search as fast as possible.

Another joke is the fact that searching for "[language] [symbol]" also brings me to these content farms instead of the documentation. You seriously can't find anything useful these days using Google.


This whole situation just shows as a lie everything we hear about SEO. Stack Overflow has the exact text, it loads incredibly fast (should be commended more for this), doesn't require ten megs of JavaScript to render, and as far as I know generally meets HTML standards.

These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them; if you open devtools you'll often see pages of warnings about deprecations and/or invalid HTML; and despite having the same scraped text, they always score higher on Google.


It has long been a mantra in SEO land that user generated content sites in general and forums in particular are to be aggressively down ranked. The reason for this is that industrial strength spam farms otherwise spin up tens of thousands of forum domains to pass link juice to what they are targeting. This naturally penalizes real forums, which often contain the best content for a query.

This is why Google has basically surrendered and why so many search result categories are now dominated by whatever sites Google has arbitrarily declared the winner through editorial decision making. In many search categories, we are effectively back where we started with Yahoo directories and hand-picked search rankings. What you see on the first page SERP is the best that they can do under the circumstances based on the fundamentals of how search works.

The web was a fun idea while it lasted, but if you are using it as a primary information resource, you are wasting your time.


Funny, I just rented an introductory book about signal processing and was (re)amazed to see how well the information is explained, with tons of examples and a real plan to guide you through the ton of knowledge you have to master.

I for one welcome back our new library overlords.


Come on now... Don't hold out... Citation?


Understanding Digital Signal Processing, 3rd edition

Published by Pearson (January 14, 2022) © 2021

    Richard G. Lyons


I think there are a number of things that could be done to improve this, but I'm sure Google won't do:

1. Heavily penalize the presence of ads. The content farms make their money with lots of shit ads. This will wreck their business model.

2. Just manually block or heavily penalize content farms and boost known-good sites like SO and MDN.


Well, those sites are often running ads provided by Google, so it's understandable why Google doesn't really have a good incentive to follow your first suggestion.


They do penalize heavy ads. There have been shenanigans on this front also (penalizing non-Google ad networks and favoring Google ad networks). They do penalize some content farms and favor others. The issue they are concerned about with SO and MDN among other user generated content sites is covert seeding at scale for the purpose of manipulating search results.

There's just a lot of fraud on the internet related to search and advertising manipulation, but it's under-policed, in part because of the internationalized nature of it and because it is hard to bring fraud cases in the United States because of the particularized pleading standard. That should not stop the feds from bringing criminal cases, but generally the feds care about large dollar value frauds (as they probably should) rather than policing very large numbers of small dollar value frauds that have a major aggregate impact on the online economy. They like going after the guys who steal $100 million from deaf children with Lupus rather than doing 200 $500k fraud prosecutions.


Except those ads are often provided by Google itself. So penalizing them would be self-harm for Google's revenues.


Anecdotal, but my website lost a lot of search traffic after Google's core update in March, which seems to have affected SO as well looking at the first chart.

If I look at Google's guidelines, my articles follow all of them: in-depth, well-researched, demonstrating personal experience, better than other articles appearing in the search results. And yet, they were "penalized" by this update for who knows what reason.

I looked into it and some other websites benefited from the update, so who knows what changes they made and why.


Again anecdotal, but I lost the majority of my SEO traffic late last year around the time of a core update. I've spent the best part of a year attempting to repair it, on the assumption I'd committed some heinous SEO crime. The more time that passes, I'm starting to think that the issue isn't mine so much as Google's. It's baffling. I wrote about it here:

https://johnnyreilly.com/how-i-ruined-my-seo


I don’t see what you did wrong, it must have been the algorithm change. My parents had a business that was killed by a Facebook algorithm change. My brother took a significant hit from an Amazon algorithm change. Building a business around any of the big tech companies seems very risky.

I think Google search has just declined a lot. I guess they’re losing the constant cat and mouse game with SEO. It seems worse than it has ever been, I’m relying more on ChatGPT and copilot now.

I can only imagine that LLMs will be the end of any content based search ranking. I don’t know how they’ll adapt to that.


Looking at your post and the HN discussion, I am also of the same impression.


Maybe Google (semi/permanently?) penalized your domain when spammers started using your GA4 tag?


Gosh that would suck. Sounds plausible though


> These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them

Only if you're not googlebot. The crawler sees a much nicer site.


which should — in theory — get them penalized for cloaking. But obviously it doesn’t. Reinforcing GP’s point.


Google has gotten pretty lenient about that: https://developers.google.com/search/docs/essentials/spam-po...

"If you operate a paywall or a content-gating mechanism, we don't consider this to be cloaking if Google can see the full content of what's behind the paywall just like any person who has access to the gated material and if you follow our Flexible Sampling general guidance."

I wonder if they just gave up


Hypothesis: Search, being Google's oldest product, is no longer prestigious to work in. It's in maintenance mode.


Does Google run other indexers for the purposes of catching cloaking? Are there other strategies that can be used? One of the problems of SO is that most of the valid content is out there and easily available without having to scrape the site which may mean penalizing for bad content is harder.


And the fact that google is not detecting those is damning (to google)


Does it even make sense to serve different content to a bot than what a human would see? Isn't the search engine trying to rank content made for humans?


It's an adversarial process. The search engine is, in theory, trying to rank by usefulness to the user, and the site owner is trying to maximize revenue by lying to the search engine. And the user.


I'm generally puzzled by Google's reluctance to do manual intervention in these cases. It's not like this is a secret. Just penalize the whole domain for 60 days every time a prominent site lies to the crawler.


There are very many sites where the content you see as a non-logged-in user is different from what you see if you have in your possession an all-important user cookie.


If Google's support is any indication, Google doesn't like to involve humans in their processes. There probably aren't enough humans to do this manual intervention you propose.


Then maybe the "crawler" should be an actual PC navigating to the page in a browser, taking a screenshot (or live feed) of the page and processing it with AI.


Eh, Google chooses to be identifiable as googlebot and to obey robots.txt for other reasons of "good citizenship", because not everybody wants to be crawled.


It makes sense if you know your content isn't nice for humans (e.g. full of ads and tracking stuff) but you want it to rank high anyway.


I wonder what I will see if I change my browser's user agent?
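One rough way to check is to request the same URL twice with different `User-Agent` headers and compare the bodies. The sketch below is only that: the browser user-agent string is made up, `build_request`, `fetch` and `similarity` are hypothetical helper names, and sites that verify crawlers by IP address rather than by header would not be fooled by it. Pages with rotating ads will also differ between any two fetches, so a low score is a hint, not proof of cloaking.

```python
import difflib
import urllib.request

# Illustrative user-agent strings (assumptions, not authoritative values).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Fetch a page as text while presenting the given user agent."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def similarity(a: str, b: str) -> float:
    """Return a ratio between 0 and 1 describing how alike two bodies are."""
    return difflib.SequenceMatcher(None, a, b).ratio()
```

Usage would look like `similarity(fetch(url, BROWSER_UA), fetch(url, CRAWLER_UA))` for a URL of your choosing.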


Google is really failing hard in this regard, and I'm fairly sure it's intentional on their part. Searching "Typescript array" has obvious intent from the user, and an obvious "correct" first result. Google returns the documentation page in the 3rd result, but it's a link to a deprecated version of the page. The rest of the above-the-fold links are websites that contain Google ads.

Duck-Duck-Go returns the up-to-date documentation link 2nd, and the MDN result in 3, with W3Schools in 1. Bing returns actual content on the results page, describing exactly what you need to understand a TS Array.

Google have the incentive to push the poor sites, because they earn revenues from doing so. Bing and DDG don't have that incentive, and return much more relevant and useful links. That doesn't feel like a coincidence.


I spent years learning a programming language well, then further years delivering a training course, iterating, and then providing sections of the course free online on the website, both as advertising and to get new people started. Your "typescript array" returns one of the sites in the top 5 that basically copy-pasted via thesaurus many of my articles. I checked and it turns out they offer $50 for people to submit content for any language / technology. So you have someone in a cheap country paid to go copy content and reword it on that site. Then they rank higher than you, as they do this over many languages, thus seeming more authoritative. Even more worryingly, with chatgpt they won't even have to pay the $50 any more. So the whole internet may become like this, leaving me little incentive to publish material except that which solely entertains myself, mmm facebook/twitter = not a good outcome.


I have a friend that does something similar, but only does video, with the text gated behind a paid-only site. He makes pretty good money, but the exact reasons you listed are why the site is paid-only. They have a much harder time stealing (as in posting as their own content) the video.


Results on Kagi for comparison:

https://kagi.com/search?q=typescript+array&r=us&sh=Qa5cXHvwj...

We simply downrank sites which display lots of ads on them and also use community blacklists for dev site clones.


I also notice and appreciate that Kagi returns older results while Google continues to push newer webpages. I have found so many useful results from perfectly fine content on older webpages. At this point, I’d be extra happy if Kagi had a Web 1.0 filter that focuses on basic html websites.


For those who want this it exists on https://search.marginalia.nu


They should just add Google ads to all technical documentation pages. Problem solved.


Yes, google search is nowadays, like everything else, run by AI. What nobody tells you is that the AI is trained to maximize Google's revenue. That's why they figured out it is better to put these ad sites on top.


> Google is really failing hard in this regard,

Failing at what though? Is it anything they care about, that they want to do?

If not, then it's not so much failure as it is a change of plans on their part. They don't want to do that anymore, and there's no one else to pick up the slack.


There are browser extensions for blacklisting domains from your Google searches. I've been so incredibly happy using one of them. If I see one of those despicable content farms I just blacklist it and move on. Often when I search on Google for technical stuff I only get 2 visible results on the first page, 1 SO and 1 documentation. Soooo relaxing.

The business reasons why Google doesn't take steps to remove the bad content and make their product pleasant to use again is so far from my understanding it might well be aliens running the company for all I know.


My understanding is that Google has an incentive to send people to content farms because those farms will show Google's ads. Stackoverflow doesn't. So they can increase ad exposure.

Thinking of it, it would be an interesting test to compare the ranking of two similar sites, one with google ads, another with ads from another provider. Might be good evidence for antitrust litigation. But what do you do if they just prefer sites with more ads? Because due to their market position, that benefits them, but it isn't anti-competitive against other ad-pushers.


Maybe you're correct. I've heard that explanation before but it just seems too incredible that they'd undermine their monopolistic global billion dollar business for a measly share of the revenue of geeks4geeks.


Google is a self playing piano with clueless leadership. There is probably no plan involved.

Just managers doing what they get more money for or devs hunting promotions by increasing ad revenue by 0.01% in the short term one sting at a time.


the way you phrase it there makes it sound minuscule, but you scale that up to the size of the SEOified internet and the numbers are surely into the billions


I was thinking the same. Taking into consideration the vast number of such SEO farms, there's surely a lot of ad money to be spent/earned if you prioritize the "right" sites.


Last time I saw, Google gets much more revenue from ads on search than from the entire 3rd party ecosystem.


a big number that's much smaller than a big number is still a big number


I don't think it's an intentional decision anyone has taken, or that they intentionally made the search engine the way it works now, but more of a "there's nothing wrong here from our perspective, so what's there to fix?" kind of thing.


I've been using Kagi[0] for a while now and it's pretty great in general - but also has options to boost up / down / totally ignore certain domains. It also has "lenses" that let you set a context (example: I'm searching for code stuff so just include sites a,b,c).

It's really good and IMO more than worth the price.

[0] https://kagi.com


Yeah, my Kagi list of content farms / SO clones which are completely dropped from all results keeps growing. On the other hand, searching just SO from Kagi still seems to give decent results.


Your experience matches mine. Spend two or three weeks blacklisting sites as you hit them and they disappear.

Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.

It takes time to build ranking.

The underlying reason is probably that the spam sites use Google Ads (revenue which is tied to 1000s of PMs and managers bonuses) and that Google as an org is deeply dysfunctional at this point.


Yeah, but then they're editorializing.

And not in a generic “this kind of content is bad for our users” but very specifically “site x.com is bad for our users.”


>> Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.

> Yeah, but then they're editorializing.

> And not in a generic “this kind of content is bad for our users” but very specifically “site x.com is bad for our users.”

Except that shouldn't be a problem because I'm pretty sure Google already blacklists domains.


Surely, they do. But they reserve that for stuff that's really way beyond the line. For everything that might be legitimate they leave it to the ranking algorithm to sort out and it's a game of cat and mouse.


This is good to know. I refuse to click on a geeks4geeks result even if it looks like it has exactly the answer I want.


Do you have a suggestion for a Firefox extension to do that?


I have noticed that Wikipedia is often pushed to the bottom of the page compared to a few years ago where it would always be at the very top.


Anecdotally Wikipedia is often the top result for me ... with the twist that it's the Google widget, with the side bar and related videos.

Only way below this block (which takes about 120% of my whole screen's height) come the "organic" results, that aren't great, but probably match what Google assumed I wanted to see.


And it's always the result I want.


Then consider using DDG & '!w term', or some other method (searching Wikipedia, extension, I think Firefox has something engine-agnostic built-in) instead?


Oh I do use DDG. I think it might suffer from the same problem, actually, since now I'm wondering why I see Wikipedia results very far down in Google when I don't really use Google.


The comment above mentions using !w in your search to look only in wikipedia, since you say it is nearly always what you are looking for.

I would say at this point you could just use wikipedia as your default search engine.


Yeah, I do do that for those things, but when I don't, I've noticed a decline in quality.


You can add a keyword search in Firefox.


I never really liked StackOverflow. The only questions they seem to allow are “How do I get the length of a string in Python?”. Most of the problems where I am scratching my head and really need the benefit of somebody’s experience are software selection problems that aren’t allowed.

The competing answers paradigm is also fundamentally broken, I don’t want to see 15 answers to “How do I get the length of a string in Python?” I just need to see

   len(x)

Programming splogs do better than SO does in this respect. In fact, even the Q/A paradigm is bad because the average SO post requires scrolling past at least one extensive code example that does not work.

For more than 10 years I thought the world needed a search engine for programmers. You really ought to be able to upload your POM file or equivalent and have the system automatically search the correct version of the documentation. (Any attempt to look up things in the Java manual has to be written like “JDK17 javadoc {className}”; Javascript libraries like reactstrap, react-router and such often have a few wildly incompatible versions and I don’t want to waste a millisecond with the wrong version doc, …)

I wouldn’t mind searching answers from stackoverflow but I only want the best correct answer and I don’t want to read a long confused question, etc. As this would clearly save coders time maybe they’d pay for a subscription as they do for Jetbrains tools.
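The POM idea above can be sketched in a few lines: read the pinned versions out of the build file and prepend them to every query. This is a toy illustration only. `EXAMPLE_POM`, `pinned_dependencies` and `versioned_query` are hypothetical names, the dependency in the sample is invented, and the sketch ignores `${property}` placeholders, parent POMs and `dependencyManagement`.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# A made-up POM used only to exercise the functions below.
EXAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>example-lib</artifactId>
      <version>2.3.1</version>
    </dependency>
  </dependencies>
</project>"""


def pinned_dependencies(pom_xml: str) -> dict[str, str]:
    """Map each directly declared artifactId to its literal version string."""
    root = ET.fromstring(pom_xml)
    deps: dict[str, str] = {}
    for dep in root.findall("./m:dependencies/m:dependency", POM_NS):
        artifact = dep.findtext("m:artifactId", namespaces=POM_NS)
        version = dep.findtext("m:version", namespaces=POM_NS)
        if artifact and version:
            deps[artifact] = version
    return deps


def versioned_query(term: str, artifact: str, deps: dict[str, str]) -> str:
    """Prefix a search term with the artifact and, when known, its version."""
    version = deps.get(artifact)
    if version:
        return f"{artifact} {version} {term}"
    return f"{artifact} {term}"
```

With the sample above, `versioned_query("javadoc Widget", "example-lib", pinned_dependencies(EXAMPLE_POM))` produces a query pinned to version 2.3.1 instead of whatever version the search engine happens to rank first.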


Years ago, Google announced they would crack down on content farms, and SEO advice was really like "you NEED to have this meta tag if you duplicate content from elsewhere, else google will fuck you over HARD!", but it seems they earn more money off of content farms than the sources.

This will hurt them long term I presume, but they won't care because they earned money.


Google's current recommendation is usually heaps of Pinterest randomness, and then they wonder why people start relying on ChatGPT "oh it's not a search engine" - sorry folks Google isn't one (anymore) either.


Google has gone down the drain. As I've written recently somewhere here, they could easily fix their search by hiring maybe a dozen people per country to moderate common search request results or to, hell, listen to users like here, and respond by booting the scammers.

The problem is, they won't, because active moderation beyond responding to legal (DMCA, right to be forgotten, anti-CSAM) demands would massively endanger their "we are an impartial search engine" defense.


It's been over 10 years and it still endlessly frustrates me that searching for any Ruby or Rails documentation will send you to an APIDock page for Rails 3.2 and you basically have to goad Google into giving you the official documentation for either.

I suppose the real frustration is that Google became so pervasive that bookmarking a website and using its own search functionality is a total afterthought.


Google search results have a filter for time, so you could potentially improve the results by changing the date range back to 3 years ago.


try Kagi ( kagi.com/ ) . SO answers are almost always the first ones for my geeky questions (as they should be in most cases), and it also extracts and displays the official answer to the question that best matched your search.


Try out Kagi Search. You can manually increase website's weight and completely block others. E.g. I have increased Stack Overflow's weight and blocked those stupid content farms. Works great.


Conspiracy theory: Bad initial search results force people to search more often, hence allowing google to show more ads. Since few people switch away as a result, they continue doing this.


this is a bit like the cosmetics industry. there are very clearly probiotic solutions to body odour that could be developed with the coins down the back of P&G's sofa cushions, but if you fix everyone's body odour, then how are you gonna sell them anti-perspirant from now to the end of time?

now, in an ideal world competition would solve this problem, but the cosmetics companies heavily collude and anti-compete to prevent this


This is where I want to remind you, Stackoverflow is a Q/A site that sometimes contains stolen content from, as you put it, so-called "content farms" and the official resources.

Now, I do have a Stackoverflow user as well, but I actually prefer publishing my ideas on my own site rather than help build someone else's content farm for free. Stackoverflow is, itself, a content farm, and it can be very hard for new users to join the site. You can not even post comments without first earning enough points. For a very long time I would actually resist joining the site for that reason. I have only recently earned enough points to comment.

Now, I happen to own a so-called "content farm" too, and the choice is between creating a standard blog with very little traffic or trying to cover everything you can possibly think of in order to compete with other "content farms" in your niche. It is very difficult if not near impossible for a single individual to create a valuable resource and maintain it, and it is simply not sustainable if you have paid authors working on it as well. There is no way you can monetize it decently. Stackoverflow probably found a way around this problem by simply leaning back and monetizing their users' content.

Once your site grows big enough, you also deal with a ton of spam and hacking attempts. Everything combined just requires an inhuman amount of time to deal with.

Of course, authors are desperate because of how difficult it is, and perhaps especially authors from poor countries that might not have other sources of income. Their basic business model seems to be: create a content farm with ads, fill it with copy-written spam and hope Google indexes it. Often these sites even have multiple authors, which is quite baffling given the extra expense it must create for them. But I do not think they have actually thought the idea through – because it is just not profitable.

Weirdly it's often in the technology niche, which they are clearly not proficient in, and more or less containing stolen solutions with little original content added.

I have seen a few sites like this, rife with some of the nastiest grammar too. It is interesting that they are able to rank simply based on their volume. Of course they must be using blackhat techniques, including linkbuilding if you analyze their link-profiles, because there is no way that something so poorly designed and maintained is getting that much attention compared with official sources or stackoverflow.

For those of us who own blogs, such sites are often easily outranked simply by writing a comprehensive article on whatever tiny topic they have posted about.


Yes, if you cite a solution the mods there get angry when you don’t copy paste the third party site content instead of just link to it. The stated reason is to make sure the content isn’t lost. In other words to ensure the content is duplicated on SO.

I have no allegiance to SO ownership so when the fake SO sites show up in results instead of SO, usually reading them will just give me the answer more quickly than finding the actual SO source.


They want enough of an excerpt so the answer doesn't become useless years later when someone redesigns their blog URL schema or shuts it down. That's reasonable, and probably falls within fair use.


That’s what I said


>mods there get angry when you don’t copy paste the third party site content instead of just link to it...

There's a good reason for that. Sites come and go and as a result links to solutions die and you wish someone had just answered the question instead of just linked to it.


That's what I said


Absolutely. Google search results quality has declined and I often find myself prefixing search queries with "site:reddit.com".


> it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?

Are they just fine today, too? To judge that, you have to look at the date of the question and its answers, make an educated guess at what OS/language/library versions they are about, judge whether that makes a difference for the version(s) you’re using, and only then evaluate whether the reply even was correct at the time (it may have had a thousand upvotes, but still be dated)

I think a really good Q/A resource would require posts to be tagged with version info. Most people think manual tagging isn’t fun, though, so it’s hard to get such a set from volunteers.

An alternative would be to require test cases that the site can run to check what version(s) replies are valid for, but writing such tests that do not break over time is hard, and, again, in general volunteers don’t like writing them.

That leaves generating tags or test cases. I don’t think we’re there, quality wise, to do that.


Stack Overflow committed the cardinal sin of running their own ad network on their sites, so it's not much of a mystery that it's been downranked.


I was involved in SEO-related projects some time ago. Not that I’m an expert. I’ve heard Google understands that the site is a search engine and does not index it. However, it should be smarter: do not index SO’s search pages, but do index question pages, because the original content is there. SO might have run out of the crawl budget which Google assigns to each site, and/or Google prioritises fresh content. But I agree with the sentiment: what we know as SEO is nothing more than playing games with Google's indexing algorithms, based on rumours about recent changes in them, or improving page performance beyond reasonable boundaries. The other day I was looking at apple.com internals and spotted a few things which we were “fixing” on our pages. I asked SEO experts “what is the point of doing X, since there are examples of a well indexed page having that same problem?”. And the answer was like “when we will be as big as Apple…”


uBlacklist can help by culling the spam results, while it's mostly a manual thing, it's fast, easy, and a little effort goes a long way I've found.

Unfortunately it still doesn't solve the issue that sometimes the good results are still buried pages away or simply don't come up at all due to google's shitty algorithm.

I really need to look into SearXNG or something.


Is that actually true though? So, I literally just went to Google this morning for a toddler python question (very much not my first language, heh).

"how to load a file all at once in python" returns a first hit pointing to a blog post answering the question correctly, a second pointing to a SO answer that is actually for a slightly different problem but contains the correct answer, answer #3 is a youtube video that probably answers the question correctly.

Geeks4geeks doesn't show up until #4, well below Stack Overflow. (FWIW, their answer was fine too).

> You seriously can't find anything useful these days using Google.

That really feels more like a meme than reality. Are there other subareas where the SEO is doing better than this one? It seems like a pretty representative question.
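For reference, the query used in that test ("how to load a file all at once in python") has a short answer. A minimal sketch, assuming a text file in UTF-8; `load_all` is a hypothetical helper name and the sample file is created only so the snippet runs on its own:

```python
import tempfile
from pathlib import Path


def load_all(path: str, encoding: str = "utf-8") -> str:
    """Read the whole file into one string in a single call."""
    return Path(path).read_text(encoding=encoding)


# Create a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("line one\nline two\n", encoding="utf-8")
    contents = load_all(str(sample))
```

For binary data, `Path.read_bytes()` does the same thing without decoding.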


The answer is that the content farms are doing a better job of interacting with Google's algorithm than SO. Of course it is a problem with Google search, but search was always hackable. The made-for-google sites know very well how to play the game.


I wonder if Google should make their SEO prevention worse but simpler. Everyone has always wanted to SEO for Google, as long as Google has been around. It has seemed like only recently that good sites predictably lose.

Perhaps 10 years ago Stack Overflow was able to do some minimal SEO and then get by on content strength. Perhaps nowadays Google is doing a good job preventing basic stuff from working, so the only people to get good results are SEO-ologists that only know about exploiting SEO, and have nothing interesting to say on any other topic.


I think the answer is simpler. To rank well on Google you need to integrate with Google (search console, analytics and similar). I guess SO is not giving all their data to Google, so they cannot "optimize" for the site in the way that content farms are willing to.


I think they need to bypass Google somehow to keep it going. Embracing LLMs could be a way out.

I already go to ChatGPT to cut through the SEO-optimized crap that Google offers me in the first couple of result pages. I would bet that a lot of the responses given by ChatGPT come from Stack Overflow.

Now, what if we had StackGPT which offered me similar functionality as ChatGPT, but better? E.g. respond with some code and an explanation, but also link to the sources (which are probably within their site, so they have prime access to them). Or offer as an explicit option to respond using sources other than their archive, but perhaps without citing sources.


My theory these days is that indexing services like google are now too big to work properly. There's more and more noise added every time new information is indexed, to the point where strong bias is necessary for it to return relevant results to the average user.

Maybe there's a point where the internet, with decades-old information piling up, becomes unbearably big for indexing services to handle all of it in an efficient manner. Hence the recent "optimizations" that companies swear haven't worsened searchability.


This is what I want from a new search engine:

1. Respect exact match searches - this used to work enclosing the search terms in "" quotes, but no longer works. If there are no exact match results, return nothing.

2. Allow blacklisting or removing results from certain websites entirely e.g. I want to be able to configure geeks4geeks to never show up in any results ever

If someone could make this new search engine they would have a good shot at replacing Google :)
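As a rough illustration of the two features above, here is a minimal Python sketch that post-filters a list of results. Everything in it is an assumption for illustration: the `filter_results` name, the `(url, text)` result shape, and the example domains are all hypothetical, and a real engine would do this inside the index rather than on a result list.

```python
from urllib.parse import urlparse


def filter_results(results, phrase, blocked_domains):
    """Keep only results that contain the exact phrase and are not blocked.

    results: iterable of (url, text) pairs (hypothetical shape).
    phrase: the exact phrase to look for, compared case-insensitively.
    blocked_domains: domains the user never wants to see.
    """
    blocked = {d.lower() for d in blocked_domains}
    kept = []
    for url, text in results:
        host = (urlparse(url).hostname or "").lower()
        # Drop the blocked domain itself and any of its subdomains.
        if any(host == d or host.endswith("." + d) for d in blocked):
            continue
        # Exact phrase match: no stemming, no synonyms, no reordering.
        if phrase.lower() in text.lower():
            kept.append((url, text))
    # If nothing matches exactly, this is simply an empty list.
    return kept


results = [
    ("https://www.contentfarm.example/a", "how to reverse a list in python"),
    ("https://docs.example/b", "How to reverse a list in Python"),
    ("https://docs.example/c", "reverse python list how"),
]
hits = filter_results(results, "reverse a list", ["contentfarm.example"])
```

Here `hits` keeps only the `docs.example/b` entry: the first result is on a blocked domain and the third has the same words but not the exact phrase.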


Both features exactly as described already exist in Kagi search [1] (founder here).

We are not trying to replace Google though, but offer an alternative to people who care so much for the quality of their search experience, that they are willing to pay for it.

[1] https://kagi.com


You won me over by summarizing listicles to a short list :-)

To be honest I think your pricing is too high. $25 for unlimited queries might be fine for somebody who needs a good search to work and earn appropriately.

But as a (former) PhD student I ran through the 100 free queries in 2 or 3 days and just would not have been able to afford 25€.

I would gladly pay 10€ (for unlimited searches) or 15€ (for an unlimited family option). But to me, 25€ just seems too high. That's 5 meals at my workplace's cantina right now (Germany, NRW).

(I assume you are aware of pricing issues as pricing options have changed at least once while Kagi has been on my radar)


Thanks for the kind words. There are many things like grouping listicles you can do to improve search experience, once the incentives are aligned.

Unlimited for $10 is something we are working towards.


Thanks for listening! At $10 per month unlimited searches I'll immediately switch.

Also thanks for creating Kagi. Kagi was the first "alternative" search that convinced me that there can be competition to Google. YaCy just does not work, most competitors (DDG, etc) just repackage the big engines. I use presearch as my daily driver right now, but am somewhat turned off by the NFT shenanigans behind that. Kagi looks like the only engine that stands on its own and is definitely something worth paying for.


Can confirm.

And if you find a result that got included despite not being an exact match you can report it and see it get fixed in a few days.


I think Brave search has those features. (I haven’t tried it, though.)


https://search.brave.com/

probably my new default search engine. Thnx


I'm sure everyone has thought of this, but is any search engine trying to add LLMs to the crawler pipeline? That might be more useful than at the user side (like Bing) where the index is already polluted.


> but it still holds years worth of answers that were just fine a few years ago

The flip side of that is a large proportion of those are no longer fine or operant.


That’s one of the nice things about Kagi: you can lower or block content farms, and elevate sites like Stack Overflow.


>SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?

SO had an expert sexchange.


> moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.

Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.

I actually need to do X, so those answers are useless to me.

Then I find another question on how to do X, and the mods close it as a duplicate of that earlier question. Even when the questioner specifically notes in their question that it is not a duplicate of that earlier question because they really need to do X, the idiot moderators close it.


I ran into that a lot when I was doing low level firmware programming. The answers to someone's question would be something like "that feature is only intended for ultra-specialized low level programmers". And it's like "in this case, I am an ultra-specialized low level programmer".

I'm thinking of things like assigning constants to pointers in C and/or manipulating pointers directly.


> Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.

> I actually need to do X, so those answers are useless to me.

I know what you mean. Whenever I (rarely) ask a question on Stack Overflow, I always have to defensively load it up with language anticipating misinterpretations and instructing people to answer my question and not some other one.

Otherwise, internet-point-chasers will come out of the woodwork giving easy, worthless advice. Even with all the defensive language, a few always show up.


Sounds like you’re good at prompt engineering


> defensively load it up

Agree. This isn't a problem on the partner/sister sites as much. They are very helpful too. But they don't show up on Google


I've never really thought about how contributors trying to avoid the XY problem really stands at odds with StackOverflow's mission of being a repository of answers, rather than a helpdesk. Not all Ys present as X, and not all Xs are actually Ys. Sometimes it's an XZ problem.

The best you can hope for is some answer down the page that says something like "to answer the actual question..."


Sometimes an X problem is just an X problem.


The mods are so ubiquitous and so busy on SO, I wish they'd spend some of their time silencing the "let's figure out what your real question is" pseudo-trolls.

I call them pseudo-trolls because I think they are well-meaning, but they function as trolls: overrunning a web site, hijacking discussions with repetitive and irrelevant content, and making most potential users feel that participating isn't worth the time and effort of interacting with them.


Even if X isn't the right solution to my use case I still often want to know _why_ X (or my implementation of X) doesn't work. The answer to that might be a really valuable learning independent of the problem at hand.


The XY comic has done irreversible damage to stackoverflow.


I've seen this example thousands of times, it's as infuriating as it is frustrating.


here’s a classic SO thread:

OP: “how can I accomplish X doing Y?”

top voted response: “I wouldn’t do X by Y. Instead, I’d consider Z. It’ll do everything X by Y would do, plus these other things.”

buried sub comment to top voted response: “that’s not what he asked, to do X by Y (exact solution provided)”

then 10 comments blasting the guy who actually answered the question for not doing it the way the top voted answer did it.


The very popular Zalgo answer is a perfect example of this problem [1]

The user asks a question that can be answered quite easily, and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem.

Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people.

[1]: https://stackoverflow.com/questions/1732348/regex-match-open...


Hilariously, the accepted (I assume by default, not by the asker) answer is flagrantly breaking, like, half a dozen rules and guidelines… but because it’s cynically and unhelpfully crapping on a newbie, it stays up. Or maybe there’s a good reason for it to stay up, but at a glance it sure isn’t a good look.

I actually think SO is a great site and resource, but I also think a lot of that is despite the bitter old timers in the community, not because of them.


That answer is only there because it's really old, from the early days of S.O. where people were allowed to ask questions that weren't super serious binary yes/no style. It'd get moderated and deleted in a heartbeat today. A forlorn monument to the cool place that S.O. once was


Maybe I’m just a fun-hating asshole but personally I find this kind of thing annoying, not cool. People are just trying to get work done, not see someone’s attempt at cringey “nerd culture” humor.


That is not what most of us complain about I think.

I, and I think many others, are sad that S.O. removes many serious work related questions (have lost count of how many times I saw the perfect question with the perfect answer, with a note that this isn't what Stack Overflow is made for and these questions only exist for historical reasons).


You agree with SO: those answers are no longer allowed, and haven't been for years. They are only kept as historical artifacts, and marked as such.


Oh come on, it's not crapping on a newbie. It's a funny comment that serves as a reflection of the days and weeks this guy spent debugging these kinds of systems.

As someone who was a newbie at the time when it was posted, who was looking for a way to parse HTML, I took away that it's just really the wrong way to go about it. I didn't feel crapped on at all.


The question is about tokenizing XHTML, not parsing it into a tree structure like a DOM, which is a critical distinction. Regular expressions are a perfectly valid way to tokenize. This is why the snarky answers do not suggest a better solution - there isn't one!

If you scroll down long enough, you will see answers explaining that. But they aren't upvoted as much as the answers suggesting the questioner is an idiot.
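To make the tokenizing point concrete, here is a minimal Python sketch, assuming the task is to pick out opening tags while skipping self-closing ones. The pattern and the `OPEN_TAG` name are illustrative assumptions, not the pattern from the linked question, and it will misfire on inputs such as attribute values that contain `>`.

```python
import re

# Match an opening tag such as <p> or <a href="foo">, but not </a> or <br />.
# - "<" followed by a tag name that starts with a letter
# - any run of non-">" characters (attributes)
# - a closing ">" that is NOT immediately preceded by "/" (negative lookbehind)
OPEN_TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*(?<!/)>")

sample = '<p><a href="foo">text</a><br /><hr class="x" /></p>'

# findall returns the captured tag names for each match.
open_tags = OPEN_TAG.findall(sample)
print(open_tags)  # ['p', 'a']
```

This only recognizes individual tokens. It makes no attempt to pair opening tags with closing tags, which is the part that actually needs a parser.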


The question isn't looking to parse HTML (or XML). Regexes are inappropriate for HTML because they can't adequately match the starting and ending tags, not because of black magic. The OP isn't looking to do that, so regexes look like a perfectly acceptable way to go.


I agree completely.

But the asker is very clearly a newbie. The question does not contain further context. The asker's suggestion is wrong (I think). And we've all worked with junior engineers who try to use the wrong tool.

The answer is a whimsical way of making an appropriate suggestion in this inferred context.

Also, to be fair, I think it's not mathematically impossible to use dark regex magic (with look-behinds and such) to parse HTML, but that's a discussion for another day...


The answer can only ever be accepted by the asker, not even mods can change that. It's actually not that rare that the accepted answer is not the one with most votes in which case the accepted answer is somewhere further down, not the first one on top.


Oops, you’re right! I was totally misremembering.


No, accepting an answer can ONLY be done by the person asking the question.


The question explicitly invites the kind of witty reflection shown in the accepted answer, by adding: "and what do you think?"

As mentioned elsewhere, this is an old question and both the kind of question and answer wouldn't be allowed these days.

However, I also fundamentally disagree that questioning the assumptions in a question is unhelpful. You want to solve a problem, find an approach and want help because you have problems with the approach? What if the approach you took _is_ wrong? It's very helpful, especially for advanced beginners or at the intermediate level, to be given a different way of solving the problem even if that is not what you asked.

It depends on context if this is just pedantry or genuinely helpful. The best answers I found start with answering the question that was stated, but then proceed in showing how the problem behind the question can also or better be solved.


“You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question, but it’s very unhelpful for me who finds the page years later actually wanting to do X.


> “You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question

Stack Overflow isn't a site for beginners, it's for "professionals". At least, that's what all the Stack Overflow defenders tell me every time I criticize the snarkiness, rudeness and patronizing manner of many answers/comments you receive on Stack Overflow.


I disagree- on SO you’ll usually get some literal answers as well as the more high-level, question-the-premise ones. Why not both?


Because in many cases people don't bother giving the real answer once a comment already has an answer, especially a highly upvoted one.

I'm fine with indirect, "question-the-premise" comments, but they should be posted as comments, not answers, because they are in fact not answers.


In this case, the question actually says “and more importantly, what do you think?”

So the answer is actually on topic.

It may be a bit… stylistically unusual, but I think I came away with a pretty good idea of what the answerer thinks.


> "Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people"

"Some" is an understatement.


> and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem

Yes, and they are factually correct in doing so. The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."

HTML isn't parseable with regex. The various answers under the question explain in great detail why [1] that is the case.

SO isn't a help forum, it's a question archive. The purpose of an answer isn't to solve one guy's specific question, but to provide an answer that is useful to all people who ever stumble upon this question.

[1]: https://stackoverflow.com/a/1758162/19508364


Your response and the other responses are proving our point. It wasn't about context free grammars, level 2 or level 3, etc. It was a very limited subset of a problem. Answer should have been, "while I don't recommend doing it the way you want to do it, that should work for your limited subset".


> Answer should have been

Yes, and answers on that very page, with lots of upvotes, do exactly that. People looking for answers online, can reasonably be expected to scroll down a page with results.


> HTML isn't parseable with regex.

Poster is not asking for this. He is asking how to parse a specific subset of HTML. And it is demonstrably parseable.

> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."

It is not the appropriate way to tighten a screw, but there likely is a correct way to do it with a hammer.

It is fine to point out that there are better ways to parse that HTML, but it is not wrong to do it with regexes.

Sorry to be blunt, but having coworkers like you make the job really annoying. I'm not a newbie, but a seasoned programmer. If I'm asking a question and am in a domain with a fair amount of experience, don't give me patronizing answers.


> Poster is not asking for this.

Poster is not the one answers are for. Answers are for everyone who stumbles upon this question in the future, and the general topic of the question is very much about parsing some HTML with regex.

Again: SO != Help Forum

> but there likely is a correct way to do it with a hammer.

No, there isn't. Because the correct way is to use a screwdriver. There is certainly a way to do it with a hammer, same as there is a way to write a webserver in brainf__k. Doesn't mean that way is good or should be done.

> Sorry to be blunt, but having coworkers like you make the job really annoying.

Bluntness is fine. I will be blunt as well: Having to fix code full of hammers used to tighten screws is a lot more annoying than having colleagues who try to prevent a codebase full of hammers in the first place.


> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."

My car broke down in the middle of the desert because of a screw that came loose and all I have is a hammer. You have just condemned me to death because you assume you know better.


Nothing in that question makes me think the person asking it wants to parse HTML. Most HTML parsers will never give the result the question described. And unless you want to dig into tar structure, solving that question is an essential part of creating a parser.

So, no, the top 3 answers are all bullshit.


> HTML isn't parseable with regex.

The question isn’t about parsing, the question is about recognizing a token.


The funny thing is that the person who wrote the top voted answer is not smart at all. He might look smart to a newbie but the question is about tokenization and not about parsing.

So here is the reason: the top voted answers are wrong.


Ironic, as answering another question ("What alternatives are there to do X?") instead of the original one is not indicating smartness at all.


> If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes.

Jesus this is cringe


To be fair, that question was written during a period when "how do I parse this html with regular expressions" was asked multiple times per day. And "regular expressions are not a reasonable tool to do that, use a parser" was the correct response to 99% of them. And, at some point, someone decided to throw out a more amusing version of that response. It _was_ funny at the time.


I thought it was amusing. Reminds me of a lot of 90s internet humor.


The thing is that oftentimes people who want to do 'X by Y' are actually asking how to accomplish Q. They think 'X by Y' is the solution and get hit by a roadblock, not knowing that it will not help them and they are wasting time.

This is called the XY problem and is extremely common on tech related forums and mailing lists.

* https://xyproblem.info/


Sure, but the issue is that SO was used largely for people working in companies with arcane rules. I can’t tell you how many times I’ve gotten one of these annoying “don’t do X, do Y” when I already know this. I have to do X for some reason, I don’t know how to do X because I do Y when given a choice and now no one will answer how to do X because someone killed interest in the question by apparently answering it. I use whatever points I get to downvote these answers.

The thing people don’t get is: when you answer on SO you’re not answering that poster. You’re answering anyone who will ever have this question. It’s quite arrogant to assume it will be an XY for every single person forever more.

The proper way to answer is to answer the question exactly as asked and then insert your “but you probably should be doing Y instead” at the end.


Disagree.

Doing things the right way is BETTER.

If you can't, you should add a bit to your question saying "I know the standard way is to do Y, not X, but because of reason Z I can't do it."


Again, you’re not answering the person who asked but every person who ever will. Some of them will be asking because the “right way” is not an option in their situation.


And those people can look for questions where the "right way" is justifiably unusable, or pose those questions themselves (and find out if they really have to avoid it.)

Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way. If they want to know how to do something the wrong way, because they don't know what the right way is, an answer that simply tells them how is a bad resource.

It's not enough to tag caveats onto such dangerous answers, because people can't read. Instead, newbies should have to overcome a sufficient amount of opposition to filter out those who don't know why they're doing what they want to do, and the rest can make the little effort of being very explicit about why they want to do something the wrong way.


> And those people can look for questions where the "right way" is justifiably unusable, or pose those questions themselves

Can't be done, will be marked as duplicate.


Exactly. I've seen precisely this "documentation antipattern" occur many times. "How do I do X with Y"? "You probably want to do Z instead". Upvoted, question answered, all other related questions of "no, really I do want to do Y" get closed as duplicates.

Then Googling for doing X with Y gets you a bunch of closed questions and a labyrinth of links all leading to a question that was answered 10 years ago on a different software version where Z possibly was the right way to do it but now isn't.

And of course there's no way to reopen the question because it has been closed by a level 15 Magister Templi moderator and a lowly level 3 apprentice moderator like yourself needs to either answer 146 more questions or moderate 192 other questions to clear enough arbitrary hurdles to achieve holy question reopening powers.

And there's possibly an appeals process but that involves recruiting 13 moderators who you have to convince to give this question special treatment and declare that one of their number of sacred moderators made a mistake.


This is bad then. They are not duplicate questions.


Yes. StackOverflow mods frequently mark questions duplicate that are not. That is something that has been observed by many many people.

Some of it is that SO has gamified shitting on and suppressing the question/asker instead of gamified providing the answer, and built a culture of toxicity that tolerates the abuse of the tools in this fashion.

And when the CEO asked them to tone it down maybe 5 years ago they basically did a collective “am I so out of touch? no, it’s the askers who are wrong”. Extremely funny to read the meta responses to that at the time.

https://stackoverflow.blog/2018/04/26/stack-overflow-isnt-ve...

https://news.ycombinator.com/item?id=16934942

(admittedly "women and people who don't speak english well are particularly unlikely to adapt to the pedantic neckbeard culture we've built" is a spicy take for your average SO'er, or wikipedian, but it's also not actually a wrong one either. SO's culture problems probably do disproportionately chase away users with marginal engagement, nobody likes putting up with formalized neckbeard culture and those users have absolutely encountered it before and absolutely have an aversion/revulsion to entering yet another online neckbeard nest. I think this is a case of “he’s probably right but the medicine would have gone down better with the manchildren if he hadn’t mentioned women and minorities”, and he’s also right that those issues have continued to bury SO over the last 5 years.)


> Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way.

Then you have to do two things in your answer:

1. Correctly answer the question as asked.

2. Add your opinion about the "right way" to do it.

If you only do #2, you are failing "every person who ever will ask."


Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?

[0]: https://www.joelonsoftware.com/2000/04/26/designing-for-peop...


> Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?

You know that's not your responsibility. If some newbie makes a mistake, that's their responsibility (and a learning experience for them).

And frankly, I think you greatly overestimate how valuable and essential your non-responsive "you're asking the wrong question" answer is.

> https://www.joelonsoftware.com/2000/04/26/designing-for-peop...

That link is about users. You're misapplying its lesson if you're using it to justify not answering a developer's development question.

Quit coming up with excuses for not answering the question.


>That link is about users. You're misapplying its lesson if you're using it to justify not answering a developer's development question.

Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?

Put another way - when was the last time you used a development tool, or a library, or some other resource, and sat down to read the full documentation of it? I would posit that that's very rare as an activity, even for developers who need to develop a deep understanding of what they're using.

It's much more common to learn by doing, and the limit of that learning is very often what the developer can't do. Answers which easily enable developers to do something are overwhelmingly likely to lead to developers doing that thing - much in the same way that a long page of library documentation which gives an example is likely to lead to developers repeating that example, even if at the end of the docs, there's a little caveat saying that you shouldn't follow the example for so-and-so reason.

>If some newbie makes a mistake, that's their responsibility (and a learning experience for them).

But is it a good experience? Sure, maybe they'll learn that they always have to read the whole answer before they use any part of it. But we sensibly have abandoned this no-guardrails approach to teaching in almost every arena where it's been used, because it's not really suited to the way people do things in real life - and in real life, people often end up affecting others with their mistakes.

Does a junior developer who learns how to glue SQL strings together in their favourite programming language, and makes the "small mistake" of not learning anything about SQL injection in the process, benefit from the learning experience when they cause a data leak? Do their customers? Or should the learning resources they access maybe use the pedagogical tools available to make sure those kinds of mistakes are really hard to make, even if it occasionally inconveniences a seasoned pro?
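For anyone who hasn't seen the difference, a minimal sketch using Python's standard `sqlite3` module shows it. The table, the column names, and the helper functions `find_glued` and `find_bound` are made up for illustration.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def find_glued(conn, name):
    # The "glue strings together" approach: user input becomes part of the SQL.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_bound(conn, name):
    # Parameter binding: the input is passed as data, never parsed as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"

leaked = find_glued(conn, payload)  # the WHERE clause is now always true
safe = find_bound(conn, payload)    # no user is literally named like that
```

With this payload the glued version returns every row in the table, while the bound version returns nothing.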


> Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?

There are a lot of different kinds of "users," and I think the kind of thinking in that article is totally inappropriate when applied to a developer Q&A board.

To be perfectly blunt: the result of what you're advocating is to condescendingly treat experienced people as newbies so dumb that their question should not be answered, because you think they're so dumb the real answer might distract them from the lecture you want to condescendingly give them.

People like that are super annoying and almost always unhelpful.

Every single fucking question I ask on SO has some lazy condescending dude chiming in to answer the easy question he thinks I should have asked, after he totally failed to understand the constraints that made my question hard. Of course, lazy condescending dude always thinks he knows better.


Yes but the right way should be the answer unless it is explicitly stated why they can't use this.

Most readers will be able to use the "right way".


No, the best thing is not assume you know better than anyone who will ever ask this. It’s good to mention what the right way is and why but your answer should always include the answer to the question exactly as asked at a minimum.


I agree but sometimes the answer exactly as asked leads to wrong things. So sometimes you don't provide the answer to the exact question but include the reason why the exact question is not good. This gives the option for the questioner to comment why the exact answer is needed.

My experience with less experienced developers is that they ask the exact question as that is where they have gotten stuck but they are ignorant of the better ways.

I do tend to answer differently depending on the questioner's reputation. If they have a higher rep then I can assume they know what they are doing.


You know what's at least as common on SO? "I don't understand the thing you're asking, so I'll pretend it's the XY problem and tell you about something I do understand"


I’m not convinced anyone interacting on SO can diagnose something like this. The act of breaking down a problem to a tiny part so you can post it kinda guarantees this scenario.

But I think it will always be up to the user of SO (not the poster or answerers) to make the real judgement on what is useful.

Often I think SO is useful to use as a bunch of puzzles folks solved. You gotta decide if they are relevant.


Just some random musings about Stackoverflow:

SO is at its best when it’s actual error debugging IMO. When you google some specific error, whoever else has had a similar error is right there. I feel like GitHub is replacing this more and more though - I often get the GitHub issue for a specific error ranked higher than Stackoverflow these days. Usually you get better discussions on the GitHub issues too, for a multitude of reasons. Two off the top of my head:

1. all of the people working on the stuff related to the issue are very close by

2. the moderation is not nearly as heavy handed as SO.

ChatGPT is also much better than SO as well if you can give it enough context and the thing you are working on wasn’t built on stuff released after 2021

I also really like Stackoverflow for current event type stuff, like black swan type events. One recent example is when google’s Paris data center was on fire and infra guys were helping each other out trying to get systems online.

All of this combined means that StackOverflow the forum is probably on its way out though. They made the mistake of taking VC money and the model hasn’t really proven profitable so they have really made some poor decisions to please the vc overlords.

I won’t miss Stackoverflow much other than nostalgia unfortunately - better alternatives have arrived. Seeing the decline of all of the other Stackexchange sites kind of sucks though. There aren’t better alternatives for many of those


Just out of curiosity, what are the alternatives? I still find the moderation approach well made, even if it looks heavy handed. It’s important to create information for the future, not just for now


ChatGPT is not in any way better than SO - see the current moderation strike.

Both sides can identify ChatGPT answers as being wrong. The question is how can they be deleted. The moderators say they can delete a lot by manual inspection. SO say that AI tools were deleting wrong ones.


In my experience it has usually been right enough, and has the added benefit of adapting its answer to my issue.

Which is more than Google, or a previously answered StackOverflow question, can say.

But then, it’ll probably greatly vary depending on the problem you are facing.


My biggest problem with github issues is similar to the problems with SO:

Bots closing issues because someone doesn't spam the page. Closing as duplicate of (non related bug). A slew of random solutions that are only tangentially related and don't really solve the problem.


The issue is that sometimes people just want 'X by Y'. To get a question answered, you shouldn't have to list every constraint and design decision that led you to that point.

Comes up all the time when people ask how to do things in bash/sh. I know there are better tools for the job, but this is the one I have.


Oh god that just reminded me how often people ignore the question asking for posix shell or “/bin/sh” or other specific shell scripting language… and proceed to answer the question using bash, zsh, Perl, python, or even the slightly less wrong option (because it is kinda weird to be shell scripting without a moderately normal unix environment) of using a bunch of Unix binary programs to do the requested job without actually solving the core problem of the question, because the tools made it easier…

And then to ice the cake you find the question because your question has been marked as a duplicate of the older question where they answered using unix binary tools… and you specifically asked about doing something in “pure shell script” or something similar to that phrase.

Stack overflow is fundamentally a system design that breaks down at scale due to misalignment of incentives that are necessary for it to work well at smaller scales (as can be seen in the successful operation of various smaller Stack Exchange sites for various topics such as Law, Aviation, Physics, etc)


The bash not sh issue is due to ignorance on the part of the answerers, not an XY problem. Also shell can't do everything - you are in a POSIX environment so use POSIX tools. The Unix environment is an environment of putting together many small tools and not just using one, so any shell script can call POSIX tools as a minimum. So anyone writing a complex shell script rather than using the tools does need to explain why.

I must admit that I don't buy in to that philosophy and like using one tool so for scripting I would do all in python, so I would not be answering that question.

Many Linux users think that their way is the only way, and that means bash as the shell and many other tools like GNU coreutils, gcc etc. I am a macOS user and my professional career includes several non-Linux Unixes so I know bash is not the only shell - try csh for fun - which is partly why I use python or previously perl as they are the same on all machines.


Since I apparently did a terrible job explaining it (the link does a much better one) -- it is when someone has a problem that they are trying to solve in a way which will not solve their problem adequately or at all -- it is not when they are using the perceived wrong tool for the job.


And what all the replies are telling you is that most XY problems are misdiagnoses.

Explaining what the XY problem is to people who are telling you about its high rate of false positive identification is, itself, an XY misdiagnosis.

Your reply is an example of what people are complaining about - you are addressing the issue you wished was asked, not addressing the issue you were presented with.


Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumptions.


> Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumptions.

But those are correctly diagnosed XY problems. No one is complaining about those.

My parent was told the issue is too many incorrectly identified XY problems, and responded with an explanation of what the XY problem is.

That is the example of a misdiagnosed XY problem, which was kinda my point. This sort of behaviour makes the actual experts leave the site in droves.

If, when answering a question, one were to discard the answer the minute they write "Why would you want to do this?", you'd get far fewer incorrectly diagnosed XY problems.

As I said in a different thread, ChatGPT sometimes does this as well, but at least with ChatGPT, when it is answering a question that was never asked, it doesn't also act like a condescending jackass. There are no "Why would you want to do this?" type of questions.


Asking why does not have to be condescending. I agree that some responses can read that way, or seem in some other way hostile. In text, or with any reasonable spoken tone, I would not assume that a person asking me why is condescending.

But on second consideration, I suppose you would not, either, and I suppose you are specifically talking about responses which, each taken as a whole, are easily interpreted as some form of hostile.


I do see both of your points of view. There are some good answers on SO that capture both. They first explain why it’s infeasible, talk about a better approach, then lastly give pointers on how to achieve what is asked regardless using their best reasoning. Think it just depends on the quality


> And what all the replies are telling you is that most XY problems are misdiagnoses.

I responded to the person above me when there were literally two other comments in this thread.

> Your reply is an example of what people are complaining about

Defining something is not an example of an XY problem.

> - you are addressing the issue you wished was asked, not addressing the issue you were presented with.

As I am not much of a programmer but work on electronics and computer hardware I deal with different types of people than would be on SO, so I am not addressing anything but my own experiences.


> I know there are better tools for the job, but this is the one I have.

You may know, but someone else asking the exact same question may not.

So sorry then, but listing every constraint and design decision it is.


> So sorry then, but listing every constraint and design decision it is.

You don't need to do that, a simple "I know what the XY problem is, and this isn't it" prefixed to every question you ask should be enough to stop the race to tell you all about the XY problem.

I mean, at this point it's clear that more people know about the XY problem than people who don't.


> you shouldn't have to list every constraint and design decision that led you to that point

True, but that logic goes both ways: Unless told otherwise, whoever reads the question, isn't required to assume that there is a constraint.

If I get asked how to water plants with a sieve, without getting told why getting a watering can is impossible, "You don't, use a watering can", is a perfectly acceptable answer.

Especially when the question is asked in a question archive, rather than a help-forum.

If specific constraints apply to a question, then they should be a part of the question.


The trouble with SO that I've seen is that there are more false positive identification of the XY problem than false negatives.

IOW, any time you think you have spotted an XY problem, you're probably wrong.

And that's the problem with SO moderators and regulars. They classify everything as an XY problem because it allows them to answer the question they know the answer to rather than answer the question that was asked.


Part of this is because problems (X) are complicated and people just as commonly demand the "simplest possible example" (Y) that demonstrates the problem. So people ask how to do Y, and then others ask why on earth they would do that.

One common example I've run into lately, as I've been reading about state machines, is people asking how to implement a simple react component as a state machine, and others objecting to the premise of the question since using a state machine for a simple react component is obviously a bad idea.


Telling people they're doing the wrong thing is extremely common. People actually wanting the wrong thing is common, but not _extremely_ common, and the mismatch is one of the more rage inducing things on the internet.


Alternatively, people might have to use X and Y to accomplish Q because of their organization or team. If it's technically doable, there should be a solution and explanation for that X and Y problem somewhere.


Like how to use a database as a queue, which generally works much better than any queuing system I’ve ever used, except ones that use Redis as the database engine for the queue.

I’m sure if you’re Twitter, Kafka is actually a better solution; for everyone else, it isn’t.
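
For anyone who hasn't seen the pattern, here is a minimal sketch of a database-backed queue, assuming SQLite via Python's standard library. The `jobs` table, its columns and the helper names are all invented for illustration; a multi-worker setup on a server database would need that database's own row-locking features, which this does not show.

```python
import sqlite3

def make_queue(path=":memory:"):
    # isolation_level=None means autocommit, so BEGIN/COMMIT are issued by hand.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn

def enqueue(conn, payload):
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))

def dequeue(conn):
    # BEGIN IMMEDIATE takes the write lock up front, so two workers
    # sharing the database file cannot claim the same row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row  # (id, payload), or None when the queue is empty
    except Exception:
        conn.execute("ROLLBACK")
        raise

def complete(conn, job_id):
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

Claiming a job and marking it as running happen in one transaction, which is the part most ad hoc versions get wrong.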


That can be infuriating.

Want to host videos on a laptop (which has a big SSD) and stream them to a Pi (which is attached to a big screen) over a LAN? Hey, here's a post about how to host videos on a Pi and stream them to a laptop! Upvote and share! My point is, you don't even have to be trying to do something all that strange for people to apply the XY Problem logic and refuse to help you.

(Solution: NFS mount and a patient understanding that the Pi cannot play certain kinds of video, so you'll need to transcode some of them first. See? Nothing bizarre, but surprisingly outside-the-box given what I could find online at the time.)


My assumption whenever I see that behaviour in a response is that the responder simply does not know the answer to the question asked.

It’s fine, I think, to answer a question and then suggest a better method. It’s presumptuous in the extreme to dismiss a question with some pseudoacademic neologism.


If doing X by Y is possible and will achieve Q, then the best way to respond to these questions is either of:

1. To do Q that way you would do <solution, or at least pointers to how to find the solution elsewhere> but you will likely find it far more efficient/easy/whatever to do <something else> instead.

2. Q can be achieved far more efficiently/easily/whatever by doing <alternative>, but if you are stuck with using Y then try <solution or pointers as above>.

Of course this relies on you correctly deriving that they are trying to achieve Q, or them explicitly stating the fact. Maybe instead they are trying to get to K.


Everybody is aware and has read ESR etc. The point is that people who are asking X by Y want to learn Y, with accomplishing X as a side goal at best. It's funny that you call wanting to learn Y a waste of time. Is it because you believe Z is a superior way of doing X? Why do you believe that? Experience, science or mathematical proof?


Often the better answer is to just explain what's being asked while strongly encouraging them in the Z direction. Sometimes people just want to understand what's happening behind the scenes rather than just looking to solve for Y.


But that's a different question. And should be asked different.

"I'm trying to learn foo.js and attempt to do so by porting Tetris to it. I know foo.js is a terrible option for a game. So. How can I write to the canvas from foo.js?"

Is often answered properly, if only because to many it's a nice puzzle.

"How can I write to the canvas from foo.js?" is different in that it will attract a lot of people explaining that "foo.js deliberately did not allow writing to the canvas, because Z".


This is part of the problem. You know your constraints, other people do not, but like to assume they do. So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.

This may help less experienced engineers not understanding their problem set, but for a more experienced engineer it's absolutely obnoxious to think of all the ways to defend my question so I don't have to deal with a rush of "oh but you should do this instead" answers getting up voted that don't actually answer my question, then asking to accept the answer.

To one set of users it's possibly helpful, to another it's useless if not also condescending, often condescending to both sets.

The bottom line is now, SO is so filled with these types of responses, I can't expect to get a very specific question answered, which is really the only reason I'd ask a question in the first place, so why use it?

There are plenty of chat groups now via Slack and Discord in my field where I can get much more direct answers. People aren't worried about getting down voted for a bad question, people aren't giving low quality answers to boost their points. So for me, SO is practically dead except for the occasional obscure error message that I can query for there.


> So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.

My experience is that it's not so much a defensive argument, but context. My example was poor in that it could be misread as a defensive argument, sorry about that.

I meant it to show how adding some context changes the question. Because, in programming, it is all about context. AKA that "It Depends" meme.


That's unkind, and doesn't really address the nature of the problem at hand.

What's the name for the "I don't want to use a sledgehammer to solve a problem that should be solvable with a screwdriver" problem?

Or in this case, the "spinning up 300 lines of code to integrate an XML parser vs. a dozen lines of code based on regexes" problem. For reasons that are unclear to me, XML parser libraries tend to be painfully difficult to use (speaking from personal experience with 4 different XML parser libraries).

I don't think it's a surprise to anyone involved that an XML parser is going to solve the problem.
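
To make the size of that trade-off concrete, here is a rough sketch in Python, which is not necessarily the language or the libraries from the original question. The sample document and tag names are invented. It does the same extraction twice: once with a regex, once with the standard library's `xml.etree.ElementTree`.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample input; note the escaped ampersand in the second item.
SAMPLE = """<items>
  <item id="1"><name>screwdriver</name></item>
  <item id="2"><name>sledge &amp; hammer</name></item>
</items>"""

def names_with_regex(text):
    # Short, but blind to entities, CDATA, comments and nested tags.
    return re.findall(r"<name>(.*?)</name>", text, flags=re.DOTALL)

def names_with_parser(text):
    # The parser decodes entities and follows the real document structure.
    root = ET.fromstring(text)
    return [el.text for el in root.iter("name")]
```

On this input the regex version returns the second name with `&amp;` still escaped, while the parser version returns it decoded. The regex is fine for as long as the input stays this regular, which is exactly the judgment call being argued about.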


LOL. It's not "called the XY problem" just because some Dunning-Krugerite decided to make a website on a budget TLD.

Here, I'll coin a name for a problem I see much more often, which is called the XX problem:

1. User has problem X, and asks for a solution for it.

2. People viewing the question decide that the original user actually has problem Y.

3. Those people tell the original user that they actually want to solve problem Y, condescendingly flame the original user for not asking about problem Y, and if they have the power to do so, edit the original user's question to be asking about problem Y.

4. Those people use poorly-thought-out pop-social-psychology to justify their shitty behavior.

5. The original user still doesn't have a solution to problem X, and they really needed a solution to problem X all along.

It is not without irony that examples of the XX problem are sometimes also examples of mansplaining.


I understand your frustration but honestly the tone and style of your comment is dismissive and condescending. It strikes me that you are complaining about people treating others high-handedly and without understanding by epitomizing that attitude in your own post.


"You're saying your problem is (X) people don't answer questions, but have you considered that your problem is actually (Y) that your tone is condescending and dismissive?"

You realize that it's dismissive and condescending to ignore the problem I'm describing and respond with an assumption that I'm unaware of the tone of my post, right? Pot, meet kettle.


I'm not ignoring anything, and this conversation has gotten strangely emotional for people responding to me. I joined this thread when it had a few comments and explained what an XY problem was. I don't use stackexchange and I am a little bewildered why people are accusing me of being condescending.


> I don't use stackexchange and I am a little bewildered why people are accusing me of being condescending.

Because you're propagating the idea that you (or anyone answering questions) know better what a person asking needs than the person asking does. Telling people you know what they need better than they do is pretty close to the definition of condescending.

People are emotional because nearly everyone who asks a question on the internet has to deal with people telling them "You don't actually want an answer to your question, you want an answer to this other question." By boosting the signal of the horrible XY problem idea, you're contributing to that problem.

I'm not saying this kind of miscommunication never happens, but the opposite is actually far more common.


I ask follow up questions and get at what the person really wants, confirm it, and help them with it. I'm sorry if I am making troubleshooting harder -- I think that the answerer of the questions should be obligated to solve the problem or they shouldn't be helping.

Unfortunately I had no idea about the perverse incentives for question answering and the terrible moderator practices on SO, so I walked into a minefield giving an answer here that I thought was helpful based on my experience as a hands on technician working directly with people -- but it turned out to be a lightning rod for people's frustrations on these issues.


Is it SO's purpose to make assumptions about the intention or context of questions? I would say no, and that it's even harmful, as it prevents the quest-giver from learning the flaws in their solutions on their own.


which you can discuss in the comments below the question, instead of polluting the answers.


You can't discuss in the comments unless you have enough reputation. This is IMHO the absolute stupidest feature of Stack Overflow.


I actually finally made an account this year to leave a comment on this question[0]. I'm not familiar with what or why curricula are what they are so I can't really answer the question, but as a math teacher it'd be helpful for the OP to know that calculus and linear algebra are deeply related; the derivative of a function is the best linear map (i.e. matrix) that approximates it! For practical purposes, calculus is a toolbag to turn (intractable) nonlinear problems into (tractable) linear ones.
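
A small numerical sketch of that claim, in Python. The function `f` is an arbitrary example picked for illustration and its Jacobian is worked out by hand, so treat all of it as assumptions rather than anything from the linked question. It compares f(p + h) with the linear approximation f(p) + J(p)h.

```python
import math

def f(x, y):
    # An arbitrary nonlinear map from R^2 to R^2, chosen only for illustration.
    return (x * x * y, x + math.sin(y))

def jacobian(x, y):
    # Matrix of partial derivatives of f at (x, y), computed by hand.
    return ((2 * x * y, x * x),
            (1.0, math.cos(y)))

def linear_approx(p, h):
    # f(p) + J(p) h: the matrix J applied to the displacement h.
    fx, fy = f(*p)
    (a, b), (c, d) = jacobian(*p)
    return (fx + a * h[0] + b * h[1], fy + c * h[0] + d * h[1])

def approx_error(p, h):
    # Distance between the true value f(p + h) and the linear approximation.
    exact = f(p[0] + h[0], p[1] + h[1])
    approx = linear_approx(p, h)
    return math.hypot(exact[0] - approx[0], exact[1] - approx[1])
```

Shrinking the step from (0.1, 0.1) to (0.01, 0.01) at p = (1.0, 2.0) should cut the error by roughly a factor of a hundred, not ten. That faster-than-linear decay is what "best linear approximation" means in practice.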

But apparently I can't comment as a new user so I guess the discussion will just be "because calculus classes use matrices in examples".

[0] https://matheducators.stackexchange.com/questions/26417/why-...


This is precisely what kept me away from asking anything on SO. It is so much easier and quicker to search or figure it out by myself than to prepare a question defensively and then answer all the questions about why I'm asking about that thing and the road that got me there.

When I find a straight and relevant answer that is great, I like browsing SO, but I wouldn't dream of posting a question just to find myself in a word duel with some very eager contributor steering me into his favourite domain. I want to get ahead with a task, not go on a quest for The Ultimate Solution or have a social life over interesting and related topics.


sprinkle in a 100K+ reputation user commenting "this is easily answered with `man Y`" — oblivious to the fact that Y's manpage is a 188 page monster, and manpages being awkward to search in.

Case in point: https://stackoverflow.com/q/19622198


Here's a question maybe someone can answer for me, or maybe I'm just stupid and other people don't need it, but...

What's up with the "man Y" aversion towards the usage of examples in the man pages? I don't expect some behemoth like ffmpeg to have a billion examples in the man files, but damn, every other reasonably sized CLI app would be so much easier to use, if you had just a dozen or two of examples at the end of the man file.


`man` is "a documentation". It turns out there's more than one type of documentation, but lots of man-page writers didn't get the memo.


tldr pages do exactly this.

https://tldr.sh


bless you - I had no idea this existed!


brew install tldr

tldr ffmpeg


I may be missing something, but the top answer is pretty good. And the comments even explain why searching for an answer on google might be difficult, and suggests how to.


> here’s a classic SO thread

My favorite example, from "TeX stackexchange" :

https://tex.stackexchange.com/questions/100574/really-wide-h...

Quote from the actual answer (which is not the bogus accepted one):

The question wasn't "should it be done?" But, for the same reason men climb mountains, "could it be done?" The answer, [...] is yes. Thus, we introduce [...]


This reminds me of my last ever SO question. It was many years ago, so I might be misremembering some details, but:

I wanted some information on figuring out what texture serialisation was supported on a given client for a WebGL app. I needed to know this because I was optimising the client and had to deal with very large textures (it was an AR / augmented reality context).

Cue a barrage of comments along the lines of "you shouldn't need to do this" "the abstraction means you can simply assume the GPU has infinite texture memory" (!) "just provide all the formats and let the GPU bridge figure it out". Then the question had a downvote and that was that, cast to the bottom of the pile.

It seemed to me everyone responding to my question had this assumption "questioners are morons". A moron asking about texture serialisation is a paradox, ergo the question must have a faulty premise.


Should get rid of the reward system -- or at least reform it drastically to discourage pettiness. An accepted solution should not result in a reward to anybody. Somebody just starting to learn a topic posts a basic question and gets an answer from an old timer - and they both get rewarded! For what? There seems to be an army of "moderators" who are ready to pounce on easy questions and they end the answer by reminding the OP that he should upvote or accept the answer! Did any of them add anything to the knowledge base? You can get excellent answers to basic questions from ChatGPT anyway. Also they won't allow new members (perhaps experts) to comment until they have picked up enough credits. The whole system is downright silly.


I'd like to add the "in jQuery you can do it MUCH easier like this: [...]" when you are searching for Vanilla JS solutions.


SO changed its policies a long time ago already... they don't want to answer your questions. They hate your questions (and by extension, they hate you!).

It's ok, you deserve the hate. After all, you're asking the wrong questions.

They want questions that are "textbook-worthy" or possibly "encyclopedia worthy". They're on record as having said this officially. If programming and technology is just messier than that, if it's more complicated than that, and you still have questions that don't fit because of it... well, fuck you. They don't exist to answer your questions, they exist to answer great questions that they can use to build up this pretty little site that now has some purpose other than whatever it was you thought you were using it for.

The Z guy, he's playing their game. He's awesome. You, you need to be punished until you comply.


My experience is that the "real" answer depends on the situation. Sometimes one really should use the alternative (e.g. cleaner, more general solution, most updated API), while other times they should address the original question as is (e.g. avoid additional dependency/third-party library).

This is one I ran into recently:

https://stackoverflow.com/questions/30196175/const-methods-i...

The question is very clearly formed. The accepted (and currently top) answer does a good job. But at one point the other answer which is badly worded and confusing was at the top. And I don't think it should ever get the top position of the answers.


Usually prefixed by a condescending "why would you want to Y?"


Or a befuddled asking of the same "Why?". Or an honestly curious asking.

Yes, there are plenty of rude replies, but sometimes it helps to assume what somebody really means is they failed to do a context switch. Exchanging more details can help turn a 'madness' into a method. Even if they were outright rude, doing this can lead to an answer that might not otherwise be given (even though that answer may be provided by someone else).


I'm assuming you are talking about the XY problem?

https://xyproblem.info/


They're talking about the XY problem-problem. The problem when other programmers mistakenly think you have an XY problem and ignore your words.

Sometimes, the original poster is not mistaken. Especially when an expert asks a question, there's a reason for it. Assuming the expert to be a beginner who hasn't tried easier solutions is degrading, and forces experts off the site.

Actively insulting your userbase, especially your expert-level userbase, is a bad idea. It leads to StackOverflow falling and collapsing over the years.


    the XY problem-problem
I never saw this term before, but it is perfect. So many times I have experienced it when asking questions in mature tech domains. It is so frustrating. Frequently, my question is asked poorly from the view of an expert, so they sweep it aside as not a real problem. Only after getting help (from comments) to provide more info or improve the writing, does the expert suddenly agree it is an issue.

This is also why I no longer waste my time raising bug tickets for open source projects. You just get shouted down and feel terrible about yourself. I raised many "WONTFIX" bugs in my career against open source projects. What a waste of my time and a harm to my self esteem.


I was active on SO years ago and my answers and questions typically were either upvoted to a large number, or downvoted to a large number despite my best attempt at explaining the context.

So I do know this problem (problem-problem) very well and I've since stopped asking or answering on SO.


I was learning Golang nearly a decade ago while participating in a performance competition. Asking questions about unsafe in Golang Nuts was nearly rage inducing.

IMHO Copilot Chat has picked up on these bad habits.


I've noticed a similar tendency in ChatGPT (answering a question that was never asked), but at least in ChatGPT it doesn't act all condescending towards you.

I literally cannot remember the last time SO was useful, due to the false positive identification of the XY problem.

At least with ChatGPT, it's much faster to get it to answer the question asked and not the question that was not asked.


That is incredibly annoying to find when you search for help for your problem. While maybe OP could/should do something else, you have great reasons to do X by Y. Of course if you make your own question about doing X by Y you get referred to the other question and have yours closed.


Which is the correct response to this question?

> How do I shoot myself in the foot? I’m pointing the gun at my foot and pulling the trigger, but nothing is happening.

Is it:

> A common reason for guns not firing is that they aren’t loaded. Try loading the gun and trying again.

…or is it:

> Woah, hang on a sec! What are you trying to achieve exactly? It seems like you are doing something very wrong here. I’m sure there’s a better way to do whatever it is you are trying to do.

There are many technical questions that give the very strong impression that somebody is asking how to shoot themselves in the foot. It’s not responsible to blindly answer the question regardless of the consequences. Yes, people sometimes overcorrect for this which can be annoying, but they are only trying to steer newbies away from shooting themselves in the foot.


When a user is describing exactly how they're aiming at their foot and pulling the trigger but nothing happens, I'd assume they actually want to shoot themselves in the foot...

Thus, the first response is the correct one.

On preventing people from doing what they intend to do...I think the main issue is the world is a large place, people face many situations, and most advice given in good faith takes only an extremely limited view of what people might need to do. And it's kinda awkward to ask for someone's whole life story to decide if they are right in wanting to straight rename a table column on their live production DB.

Something could be a bad idea 99% of the time. But that leaves millions of people in the 1% for whom it's the best course of action.


> There are many technical questions that give the very strong impression that somebody is asking how to shoot themselves in the foot.

Sure, but some of the stuff I've looked up in the past is answered with helpful comments like "You won't need to do this, your CA will do this for you" or "this is handled by your certificate verification stack, you don't need to be involved in this stuff".

Well, thanks, but I'm actually implementing both of those things right now and I'm having an annoying issue with figuring out how the API for the (very popular but IMHO poorly documented) library I'm using for part of it hangs together.

> Yes, people sometimes overcorrect for this which can be annoying, but they are only trying to steer newbies away from shooting themselves in the foot.

I think sometimes it's because comparatively the respondents are newbies, and are repeating received wisdom.


How I usually try to answer these kinds of questions is "Shooting yourself is generally not a good idea, for these reasons. If you really want to do it, you can try loading your gun and pulling the trigger again, but a better alternative might be to take off your shoe by untying your shoelaces and carefully pulling on your shoe, following the curve of your foot."

But it's hard! On the one hand, you don't want to teach people to do the wrong thing (especially not on SO, where answers are often copied wholesale and pasted into production code), but also not answering the question as asked usually doesn't help the OP at all.

If it's not clear what they're trying to achieve I usually leave a comment asking for clarification rather than answering, though.


The correct answer is obviously the first one. I come to StackOverflow because I want to learn some piece of information about some specific technology, not to be lectured by someone who thinks they know better than me about what I should or shouldn’t be doing (and who is virtually always wrong).


It seems like most of us all have the same issue with SO. However, is there a viable alternative?


ChatGPT has replaced most of my usage of StackOverflow/Google.

It probably won't last forever without people generating new answers somewhere else, but it answers a lot of things correctly, and the things it gets wrong are easy enough to verify.

I really hope ChatGPT causes StackOverflow to change.


> I really hope ChatGPT causes StackOverflow to change.

It already has, and not for the better. SO is currently awash with nonsensical answers that are clearly the result of feeding the question into ChatGPT. Not to mention questions like this: https://stackoverflow.com/questions/76748781/how-pythons-bui...


Well, perhaps that's what moderation should be focusing on blocking then, rather than driving away humans (as was mentioned a lot by other answers in this thread).


The impression I get from commenters is that somehow they think their careers are tied to StackOverflow points. Trying their best to downvote correct answers and promoting their own in order to further their agenda.


Downvoting costs points.


SO ran a careers site for a while, and as a part of that, provided a resume/profile page upon which you were encouraged to share your score and/or your own questions and answers. I wonder if there was actually something to that…


Yes, it's stupid af. Better to keep silent than to answer the question indirectly. Or, if you do want to suggest "another problem", make it an indirect suggestion (not the accepted answer).


For me personally it’s the rise of good documentation. 12 or 13 years ago I needed to build something for our SharePoint 2010 in a world where SharePoint 2013 was out.

I’m not a SharePoint developer at all; it was my first time with it. I am of the old school however, so figuring out how systems work by reading the manual or specification isn’t foreign to me. I’ve worked with the bitmap format, I’ve worked with solar inverters, I’ve worked with embedded construction software and so on, so SharePoint should’ve been easy. But I couldn’t have done it without StackOverflow.

Fast forward to 2023 and I need to build something for SharePoint again. I hadn’t touched it since, so I was sort of dreading it. Only this time the official documentation made it so easy I never needed anything else. I’m sure StackOverflow could have helped me, but I didn’t need it.

Those are very isolated examples, but really, that is my personal experience with almost everything I work on these days. Yes, I’ve also got 10 more years of experience under my belt, but I do think it’s because we as an industry have become much better at working through the official channels. I mean, when you have a problem with something today, do you go on StackOverflow or do you go to the GitHub issues (or whatever else they have)?


Out of interest, do you think your recent experiences with SharePoint were because Microsoft decided to unify pretty much all of their services behind the common Microsoft Graph API? SharePoint used to suffer from having multiple different overlapping styles of APIs depending on what API style was fashionable at the time that feature was developed - which could make development a nightmare... (OK, an even bigger nightmare).


I sort of like OData on the client end of things. It's an absolute nightmare on the server side, at least with .NET, but for clients it's usually fairly easy to consume. This is true for every part of the Microsoft Graph API that I've worked with except for the SharePoint part, which for whatever reason works differently when you query things. It's sort of hard for me to answer though. I've spent quite a lot of time in enterprise organisations, and I've integrated with basically everything, and it's almost all bad in one way or another. On that scale I think the Microsoft Graph API in general is an 8/10, maybe even a 9/10, but the SharePoint parts of it are at most a 5/10.

I'm not 100% sure whether this is the fault of the API, SharePoint's indexing, the 3rd party meta-data indexing that we buy for our document library, or how terrible our own architecture for the data flow is. But I've had to build a lot of redundancy and caching into the terrible piece of gaffa tape which is our integration, because SharePoint won't give me every document every time that I ask for them.

So a bad experience? But at least it's not FTP (yes, not SFTP) pulling different file formats from 9000 solar plant inverters bad.


This is odd as my view is that documentation was much better pre 2000 and SO was needed because official documentation got so bad.


Yes, and documentation that is actually usable. I don't enjoy the standard JavaDoc pages but I do enjoy the, for example, GoDoc pages. I think the programming community did a good job in establishing user-friendly documentation.


A typical pattern I've repeatedly seen in various shapes and forms:

  Q: [very explicit title about F in context X] I'm specifically not asking about F in context Y which I know about but is irrelevant; I'm specifically not asking about libT which may accept it in accordance with standard U; instead, in context X, is F working for libS?

  A1 [100+] [+50 bounty] [auto-accepted] [date d]: "it works" + long winded answer about decontextualised substring of F proposition working for libS in context Y

    Comment 1: this is not the correct answer, see A15

    Comment 2: complaint that A15 note about U is from a second hand site so answer is warrant of discredit [even though authoritative docs about U are not publicly available]

  A2 through A7 [30+]: "it works" (rehash of A1, posted within d+[1..365])

  A8 [30+]: Q is a dupe, see this answer [link to question and answer about libS in context Y]

  A9 through A14 [30+]: "it works" (gives example for libT, which behaves differently)

  A14 through A17 [30+]: "it works" (gives example of standard U, which libS does not comply with) 

  A15 [5] [d+5]: Q being explicit about F for libS in context X and explicitly _not_ about context Y, let's address it for libS in context X: *it does not work* because foo (link to libS doc || source code || example). Tangent for the sake of completeness, context X is irrelevant because bar is orthogonal to foo + context Y. standard U does not apply because libS is not compliant [quote or link about U]

     Comment 1: this is the correct answer

     Comment 2 [d+1095]: answer is invalid because libS vN+1 has just been released and adds F in context X

  A16+ [<0]: completely haphazard lottery answers


> A1 [100+] [+50 bounty] [auto-accepted]

I don't think SO "auto accepts" answers. The only one who can do that is the original asker, manually. See: https://stackoverflow.com/help/accepted-answer

On meta.so this has been asked repeatedly and the consensus is that no, it cannot be done and furthermore, "accepted" doesn't mean "the best", it just means whatever the original asker marked as helpful for their situation, so "auto-accepting" would be meaningless. Also see: https://meta.stackoverflow.com/a/262915/147346

As for upvotes: if a question is upvoted that means the community upvoted it. The "community" in SO is just regular people, programmers both knowledgeable and newbie. I don't know that any "open access" platform can solve the problem of people upvoting the "wrong" answers; it's not a tech problem. It happens here on HN, too!


I have two opinions about your post. One: I agree. Two: I disagree. My point: I see both.

I see "answer as comment" in the most mature tech subjects because people are afraid of downvotes. In my experience, the most mature tech subjects are carefully guarded by a small community of very unfriendly "fastest gun in the West" types. The C++ community is incredibly unwelcoming and negative towards most questions. Note: Comments can only receive upvotes. (In theory, comments can be flagged as off topic, or offensive, but assume -- for my commentary here -- they are not.) Examples of more mature tech subjects: Python, Ruby, ASP.NET, Java (language/foundation libraries), C# (language/foundation libraries), C (but not yet C++!), Win32, etc.

For less mature tech subjects (or faster moving), you see many more answers. Example of less mature tech subjects: Python AI/ML libraries, Java/C# open source libraries (Spring, etc.), C++, Qt, Gtk+, Zig, Swift, Golang.


> people are afraid of downvotes

People are afraid of downvotes because people will throw out downvotes without even reading your answer, and once you get the first downvote, everyone else will pile on, also without reading your answer.


It is interesting how a tech-oriented community still replicates monkey or chicken behavior against unpopular individuals.


So, regular humans, right? I am unsurprised.


I am not very surprised either, but it also indicates that we may be overvaluing education when it comes to behavior improvements.


Also answers becoming obsolete because newer versions of the technology came out.

E.g. Python 2 answers aren't relevant anymore, but still there.


Exactly. The top answer is often wrong and you have to look at an edit on the third answer to see "Since 2021, this is solved by doing x y z".


It's a problem, but once you know to look for the updated comment, it's not that bad.


If they exist. That Google is bad at indexing SO aggravates this.


This is a huge problem for Spring (since the correct way of doing things changes every 2 years..)


SO prioritizes more recent good answers now.


The worst part about xy problems is that even if the user could use a different approach, often the actual answer for the question would be very interesting. But because of SO, it's an xy problem, marked as duplicate, and there is no answer to the original question.


Owners of the SO sites are control freaks. Mods started ruining the site around 10 years ago. Now, when I do a fresh registration I can't do basic things like upvote an answer or write a helpful comment, because I need X reputation.

So I can't add relevant info/feedback to a topic I landed on via google, because I don't have reputation, so in essence the site strangles itself. Because in 2023 I'm not going to farm(!) 50 points just to share my info. All this because a few bad apples tried to game a system and someone came up with this lame idiocy of you must have X reputation...

But you can ask questions.

But if no one upvotes your answer because it's a hard one, and/or no one knows the reply to your problem, then you are stuck. So should you ask some blatant question, like, is the sky blue and why? But it will be a duplicate. No reps for you. So it's a totally braindead system, they deserve to die.

Why don't they introduce a system where you can say, I need this answer for 5 bucks. But that will not be implemented, because it would set a dangerous precedent where you could harness others' valuable knowledge in almost an instant.


> Why don't they introduce a system where you can say, I need this answer for 5 bucks.

I think the problem with that is how do you prove it's a good or bad answer? Say you offer a bounty and I generate a great answer. You look at the answer and use it and then flag the answer as bad so you don't have to pay.


Then you will have that next to your profile. Non paying customer...


Someone reading your profile will have to parse whether the not-paid answer actually answered the question, before they can determine whether it's a "non-paying customer" or just someone refusing to pay for an irrelevant non-answer.


You can show something akin to ebay's reliability value based on user feedback. People can establish payment conditions before doing any task, I don't think that would be a problem.


This issue of having the answer in the comments definitely wasn't there 5-10 years ago and makes the site much more difficult to skim and read hence provoking more hapless questions.

Maybe they should expire comments or remove them altogether.


That would remove a good percentage of correct answers. The obvious solution is to fix their weird and completely dysfunctional moderation.


<userFoo> comment-body [<a href=#permalink-id> This comment has been promoted to an answer.</a>] #timestamp

Or something vaguely like that?


"I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever."

Stackoverflow once set out - with the very best intentions - to be better than expertsexchange et al. I think that's why they are so strict with their moderation.

The issue, of course, is that this is a hard problem. To moderate a community of experts you need to be an expert yourself. With Stackoverflow I feel most moderators are just overwhelmed with the task, so they aggressively shut down everything they don't understand.

Many of them are probably aware of that and just don't know how to handle it better. Others are so clueless that they do not even recognize the good questions.


I just realized how much SO's role has changed for me as a dev. Like most devs, I used to use it as a common reference. However, I realized just now that it's been months since I've looked at SO.

That's not because I was intentionally avoiding it, but a natural side-effect of it becoming much less useful, largely because of the thing you're citing here. The quality of answers has fallen quite a lot, and SO seems to actively avoid hard questions (and if I'm searching a site like SO, it's because I have a hard question.)


Seconded re moderation.

Also, the site itself - the ethos of the owners; I used to have an account, became disenchanted with the mods, decided to delete my content, close the account.

I then discovered:

  1. You cannot delete answers which have been accepted.
  2. The mechanism for deleting your own content allows you to delete a max of something like five replies per day.

I then took the time to look at the T&C, and SO, as I remember it, simply makes all of your work their property.

I deleted everything I could, over the course of a significant number of days, and left.


I think not allowing deleting answers is ok. It's something like asking Wikipedia to delete content you contributed; at some point it isn't really your content and it's part of an article. If your answer is accepted, others are discouraged from posting another answer, so deleting content retroactively is really damaging. But also, none of the answers should contain personal information, so I don't really see a reason to delete it.


Read the T&C. The answers are all licenced under Creative Commons with attribution, so anyone can copy them. They are not owned by SO. This is why sites can copy all of SO to get better Google SEO, and also why ChatGPT etc. can build their answers based on SO.

So remember it was never your property.


> it simply make all of your work their property.

Do note that your posts on SO are published with a CC-BY-SA 4.0 license:

https://stackoverflow.com/help/licensing

so regardless of who owns them, they can be reproduced, and probably are. So in many senses there is no deleting them.

Also, I don't understand why you wanted to delete. Despite its problematic policies, SO is an important public resource and I assume so were your answers on it...


If you live in the EU, you could do a GDPR request to delete all your data. https://meta.stackexchange.com/legal/gdpr/request


I mean, they will remove your personal information from their databases.

Your answers will most likely stay, just anonymized.


The answers will stay; they are licensed through Creative Commons Attribution licences, so anyone can copy them.


Mmm! Thank you - I will give it a go.


> moderation on SO has gotten progressively more horrible.

Can't agree more. So many relevant questions shot down by an anal retentive mod you end up having to find the answer on another site.


Seeing these issues many years ago led me to predict the downfall of Stack Overflow then, and I see these identical issues with Wikipedia now.


On point 2, even worse is when you ask a question and the mods close it and point to another post that isn't the same. Or they mark the question as too open ended when it's not, or tell you to split up the questions into multiple posts even though that's dumb because they're dependent on each other's answers.


Google keeps returning SO answer threads that are way too old to be relevant. Like a question with a top answer from 2013.


Not all ecosystems disintegrate after a couple of years. Many times an answer from 10 years ago is still extremely relevant.


Not when the answer references libraries that haven't been maintained in five years or have fallen out of favor completely (like jQuery).


I understood what you meant. But first of all, many languages have libraries relevant for several decades; second, asking questions about legacy code is often valid (and often where you need the most help); third, some people even still use jquery! (Poor guys)


I understand all of that - I just question Google's algorithm that prioritizes answers from 2013 vs answers from 2021 for similar queries.


I recently was looking into why wget --quiet --content-on-error didn't output anything with a 403. Turns out it's an ancient bug they never fixed... it's not documented anywhere, except some crusty old thread on SO.


Yeah I noticed something happened late 2019 to SO Google results. I feel like today it is better than it was, but still a far cry from how it used to be. (https://news.ycombinator.com/item?id=21503049#21504485)


Out of curiosity: what programming language are you searching for?

I never had problems finding SO answers for C++.


I am old. I find what you wrote resonates with my SO experience, and, in my antiquity, I wonder, grumpily, why "RTFM" isn't faster than Stack Overflow, and I think the answer is that the quality of the manuals to 'freaking' read has decreased. I'm just old enough to realize the Unix manpages were excellent in proprietary Unixes, and then the Linux manpages got complicated by some emacs based reader. Anyway, Unix manpages used to be a quality go-to in order to answer questions, BSD and Sun (BSD and SysV variants) and NeXT and Sequent etc. I'm still inclined to do that, but it's not always adequate.


I responded to a question on SO once where I elaborated on the accepted answer - my solution was more functional as it provided assignment operator override.

Some hot shot user with an absurd amount of badges told me, in a condescending tone, that my solution did not answer the question and linked me to some TOS article or something about staying on topic. This was the first time I had ever answered a question on SO.

My takeaway from the situation is that SO is full of accounts that farm badges/rep. To what end, I do not know - perhaps they reference it on their resume or portfolios.

I called the guy out, seems he has since deleted the comment.


I honestly feel that Google and increasingly LLMs are the challenge here.

Behaviour of users on SO has always been, well, as all the complaints here suggest. It's never really bothered me that the top answer's someone going off on their favourite approach, most questions have multiple answers, and I've learned a lot about coding from reading the top few and trying to understand where the answers are coming from.

The discussion's as important as the answers sometimes, and that's why a potential collapse is a problem.


There is also a bizarre thing that I noticed only when I enabled RSS for some of the tags I wanted to watch. My RSS reader routinely had questions that were nowhere to be found on the site, even though they did not seem to be offtopic or have some other kind of structural problem.

Of course the users themselves could have deleted the questions, but regardless, I did not expect to see that.

It could be a nice experiment to enable RSS for some niche topics and check automatically after a number of days to see which are gone.


IME moderation was bad after a year or so of SO. Something about the moderation role in general seems to attract certain personalities that have a negative effect on the content they are supposed to be moderating. That said I find SO still very useful, you can always ignore the moderators and caustic comments. Often a question has a high quality answer that I can use even though the question is locked for being off topic or the comments are filled with irrelevant argument.


> google used to return really relevant results for SO, and it stopped doing so at some point a while ago

One thing I noticed in the first chart displaying visits is that there is a sharp drop in April, which is just after Google deployed its last core update: https://status.search.google.com/incidents/Cou8Tr74r7EXNthuE...


This. Stack overflow contains many diamonds. We just can't find them anymore because Google randomly decided to stop being a proper search engine at some point.


> moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.

Yup. And that's why I stopped using SO quite a while (years) back.


I'll add one more with moderation. I've seen far too many cases where person asks "how do I solve X" and it's closed as a duplicate for a question of "how do I do Y". 2 unrelated questions but they have some similar wordings so the moderator shuts it down.


I'm glad SO is going down, because it is a really nasty site. Every time I click on a page it asks again for cookie permission. Probably they want me to register to the web site, which I refuse to. So, no wonder people who are not already members think it is not a good search result.


> google used to return really relevant results for SO, and it stopped doing so at some point a while ago

Oh, so maybe DDG is ahead of Google in terms of quality alone in at least one domain now? Because it definitely still gives answers from SO at the top.


My least favorite thing is how many highly upvoted answers are telling the OP to "do something else" instead of actually answering the question. Sometimes requirements come from above and we don't have a choice!


The sweet spot for SO for me is workarounds for esoteric problems, not for more common issues because the solutions tend to be out of date (but correct for an older version of the library or whatever).


The last 2 reasons are why I stopped posting on SO 10 years ago ... I tried to explain something and some mod was picking on my English and on my being "off topic"

SO buried itself


> google used to return really relevant results for SO, and it stopped doing so at some point a while ago

I haven't noticed that at all. I also haven't noticed any degradation in the results on Google overall contrarily to what people claim.

What I have noticed though is that every time someone complains about search results and shares the way they search on Google, it becomes obvious that the problem is between the keyboard and the chair and not in Google.

For example my girlfriend always gets completely unrelated results when searching. But she searches an unintelligible mess that even a human wouldn't understand.


I’ve been using stackoverflow for 15 years. Search results were fine for at least 10 of those. But yea, I’m sure it’s me and all the other people agreeing search got worse are also dumb like your girlfriend, thanks for the condescending and helpful answer, mr google turfer


Ironically(?) I seem to have little trouble getting search hits for SO questions that have been closed with some vague explanation as to why...


I have seen similar. I wonder if a new set of people are old enough to use SO now, and their style is different than the original community.


Copilot and ChatGPT have taken a lot away from StackOverflow.


I have found that I cannot and do not trust either of those for anything.


Really? For programming they are immensely helpful! ChatGPT 3.5 even rewrote Rust code for me that compiled with borrowing issues, replacing iter() with iter_mut() and other things. I also had a question about specific configuration w/ an old version of Chart.JS that I could not find an answer to on Google, and ChatGPT figured it out.


I had it write an ansible job to enable a service with a specific name, and I got what I expected: something that looks correct at first glance, but with some subtle errors.


Much like half the answers on stack overflow.

Rather than ask it to write a full ansible job, try asking it the question you would ask on Stack Overflow (i.e. you probably wouldn't ask Stack Overflow to write the full job for you).


This is true for me.

I wanted good ways to deal with pointers to 2D arrays in C, which aren’t hard but you need to remember that [] has precedence over *, so (*pointer)[x][y] is needed to dereference. It’s not that hard to mess it up. Ultimately GPT had an OK answer but… it had that immediately without searching and I didn’t have to craft some well written example, it picked up my dirty explanation no problem.

I have the 2D array typedef’ed now, but it’s still confusing to read and hard to work with. I’ll search tomorrow on it.


Asking anything C or C++ on SO has become an invitation to a ritual hazing. If, after dealing with everyone's ego, you can successfully demonstrate that you already know enough and you're not asking to pass a CS test, the Wise Ones may - may - deign to bestow you with an answer. Which usually won't relate to your specific issue.

Also, in C++, you can set your clock to someone commenting that you should just use smart pointers. It doesn't matter if the question is entirely unrelated.

As someone who spent a decade helping people answering code questions on Flashkit and later SO, I find the SO community and moderation now to be so off-putting that I avoid asking anything there if I can. I still give answers sometimes, but I'm much less likely to be on the site at all.


very good insight. Integrating AI into stackoverflow as a question refiner is a very good idea


And as of today, here it is: https://stackoverflow.co/labs/


It is not about SO at all. This is about chatGPT.


yep, moderation is pretty horrible.


Not to mention that you have no rights over the content you create on SO. You can’t take your content off the site if it happens to be an answered question or an accepted answer. Their reasoning is “but the content would be gone and would stop helping people”. The way it completely dismisses your efforts and your emotional connection with your own creation is the greatest indictment of how SO lacks the human perspective. No other platform, not even post-Elon Twitter or Facebook does this. The only exception could be Wikipedia, and it communicates its content format and collective editing mechanics very clearly. SO just isn’t that but thinks it’s whatever works for them.


If you have such an emotional connection to your Q&A content that it would drive you to delete it, I think you're using the site wrong.


I do because I’m not a robot? I wrote answers that made it to HN’s frontpage (this one: https://news.ycombinator.com/item?id=5243389). Later, that answer was edited by automod to remove “Hello” line because somebody hated that people greeted each other on SO.

I wrote paragraphs only to be closed by mods as “opinion based” or whatever, so I had to turn them into blog posts, and they got picked up by popular publishing channels, like this one: https://medium.com/hackernoon/are-interfaces-code-smell-bd19...

Of course I care about how my work and my relationship with it is treated. Why don’t you?


Something long enough to be a blog post like that isn't what the site is for. I'm not saying you shouldn't care about things like that.

And I wasn't talking about the issue of mods deleting entire answered questions. I was talking about single contributions, and mods deleting things is the exact opposite of what my post was addressing!

On the other side, that compiler post is fun but it's not some big effortful thing. Caring about it some makes sense, but I wouldn't care about it that much.


> Something long enough to be a blog post like that isn't what the site is for

It wasn't a blog post on the site; I extended it to a blog post after mods closed it and flagged it for deletion. As a top 0.5% or whatever contributor of SO, believe me, I know what the site is for.

> but it's not some big effortful thing

It doesn't have to be. It's my thing. It sounds like me, it carries my spirit. Nobody else would have written it then. It reflects a period of my life, how I perceived things; and more importantly, its continued existence has an impact on me regardless how it's licensed. Say, if my writing style, or my choice of words, or my tone were associated with a traumatic event in my past, SO's insistence on keeping it up would be explicitly abusive, don't you agree?

Even if no such event had occurred, SO's or HN's insistence on keeping my content online against my wishes is also abusive; it deprives me of control over my thoughts and my words. Is it legal? Yes, 100%. But, is it ethical? No, I don't think so. And for what? For keeping the answer for "how can I simulate a click on a DOM element?" online, as if that problem immediately becomes "unsolveable" when that comment, heck, the whole SO web site goes down. What a pretentious excuse to keep your income stream steady.

> I wouldn't care about it that much

You would if you were me.


> Say, if my writing style, or my choice of words, or my tone were associated with a traumatic event in my past, SO's insistence on keeping it up would be explicitly abusive, don't you agree?

No, I'd say your trauma is causing you to make an unreasonable demand. The writing style in a short technical post existing somewhere should not be harmful.

> SO's or HN's insistence on keeping my content online against my wishes is also abusive; it deprives me of control over my thoughts and my words. Is it legal? Yes, 100%. But, is it ethical? No, I don't think so. And for what?

It's supposed to be a collaboration, especially SO, and keeping things intact is important for that.

Just yesterday I found a helpful guide on reddit where half the posts were "." It's pretty clear how a site where the primary purpose is guiding people would do a worse job if it worked that way.

> And for what? For keeping the answer for "how can I simulate a click on a DOM element?" online, as if that problem immediately becomes "unsolveable" when that comment, heck, the whole SO web site goes down. What a pretentious excuse to keep your income stream steady.

If it would affect the income stream, then it's something that makes the site bad for users. You can't argue both sides of that at the same time.

> As a top 0.5% or whatever contributor of SO, believe me, I know what the site is for.

Maybe? I don't think most contributors would be anywhere near as upset about an inability to delete posts.

SO wants text like you gave them, I don't think they want the level of emotional investment in those pieces of text. (They might want emotional investment into the site itself, but that's a different thing.)


> I don't think most contributors would be anywhere near as upset about an inability to delete posts

Most contributors of any platform wouldn't be upset about anything. Conformance of the majority doesn't imply the rightfulness of the action.

But many people are upset about it, rightfully so; see for yourself: 1. https://www.google.com/search?q=why+can%27t+i+delete+my+cont... 2. https://www.google.com/search?q=why+can%27t+i+delete+my+cont...

> I don't think they want the level of emotional investment in those pieces of text

Of course they don't. They want you to be a free ChatGPT as much as possible. The less human you are, the better for them. That doesn't mean that what they are doing is okay or harmless.


> No other platform, not even post-Elon Twitter or Facebook does this.

Doesn't hacker news also do this?


It doesn’t. You just have to send an email.


The question was about whether hacker news allows you to remove your content? It does not. You can send an email just to have your content anonymized, but it will remain forever.


That’s appalling. I thought they honored deletion requests. Terrible look on HN's side too.

Edit: It seems like there’s a wiggle room, but still terrible. https://news.ycombinator.com/item?id=28293237


I happen to have tried this multiple times. Every now and again I want to erase all trace of me on the internet. HN doesn't allow it. They tell you that you can give a spreadsheet with comment ids and how you want to rephrase things in case you've given TMI about yourself (in order to anonymize). I just don't have the time to go through my entire history to do this. So I attempt to post less, knowing I don't own my comments and I can be tracked

Reddit is actually a lot better in this regard


Perhaps if HN eventually improves in this area then we will have all the proof we need that it is in fact becoming more and more like Reddit. Because it seems like a lot of the discourse is already checking a lot of those other boxes.


The content is licenced under the Creative Commons license, which ensures that everyone can take it and make something useful out of it, for example in case Stack Overflow turns evil, adds a paywall or something like that.

This is a lot better than almost every other comparable site with user-created content. In most cases the company has the rights there and can do whatever they want with it, and users have no right to reuse the content of the site.


SO can still respect Creative Commons and the human at the same time by granting their wish to remove their content from SO when they request it.



