How often do people copy and paste from Stack Overflow? (stackoverflow.blog)
221 points by prakhargurunani 5 months ago | 230 comments

Interestingly, I don't remember when the last time I did so was. I seem to be relying on SO less and less. I do sometimes use it as a quick reference, usually because it's one of the first results on search engines, but it's typically just a quick "ah, that's the function for that" or "ah, that's how you do X" rather than actually copying code.

I'm trying to think what's changed. I guess I've just been writing stuff that either I'm super comfortable with or is niche enough that there's not that much on SO that's helpful. Looking in my browser history, the last thing I looked for that I got an answer to on SO was the difference between C++'s std::scoped_lock and std::unique_lock, which was a few days ago. I still use SO, just not as frequently as a few years ago.

I do: Yesterday when I was looking up how to declare and use arrays in bash for the 1000th time. Bash is probably the #1 reason I end up there through a search.

After 25 years professional experience (plus 12 years before that), there are a number of reasons why I search for "how to do X in Y" online at least once a week:

- The language is poorly designed and has a lot of unintuitive, unorthogonal, difficult-to-remember parts or nasty footguns, but people continue using this beast-that-cannot-be-slain out of industry inertia (like Bash or CMake).

- The language is in a constant state of flux, adding new features and best practices / idiomatic things, and the cruft-that-must-not-be-used buildup is so insane that even a few years out of the loop makes you feel like a complete beginner (like C++ or Python or JS).

- The language is complicated enough that unless it's your day job you won't remember how to do basic things after a few months away (like Rust or Haskell or Gradle).

- It's been 10 years since you touched it and how the hell do you concatenate a string again???

I think the only languages I can just slide right back into are Java, C#, Lua, and Go (and C, but probably only because I used it daily for decades).

man.. I literally just did a job interview where I was asked if I had bash scripting experience, which I do, and my number one thought was "please for the love of god don't ask me how to do anything with it other than pseudocode / general sys adm work".

man bash

woefully underused, but has everything and is extremely well written with no fluff.

> I'm trying to think what's changed

Among other possibilities:

SO suffers from hostile moderation and a generally unwelcoming culture, perhaps even worse than Wikipedia. This has a profound chilling effect on positive, substantial contributions, particularly from new contributors.

SO had a great strategy initially with relying on search engines to index everything, but it never seems to have solved the recency/relevance problem. In that respect, it has become its own worst enemy, with old answers about obsolete versions and practices often ranking highly in search results.

The first of those problems then exacerbates the second, because the same cultural issues get in the way of both updating answers to old questions and asking new questions that might be superficially similar to ones that already exist but actually need a different answer.

Meanwhile, in many areas of programming, documentation from other sources has become both better in quality and more readily located thanks to other well-known sites and high search engine rankings. Relatively speaking, SO simply isn't as useful if there is already primary documentation that answers questions correctly and comprehensively.

And finally, you personally may have grown as a developer over time, becoming both more capable of solving problems for yourself and more familiar with whatever tools you use regularly, so you might not need external help so often.

FWIW, I'm also in the "rarely visit SO any more" camp. I think I have a kind of banner blindness for SO hits on search results pages now, perhaps because I'm assuming that following a SO link is unlikely to provide a useful answer so I almost always check other plausible sources first. On those occasions when I do get as far as visiting SO, I'm usually reminded of why I tend to work this way now.

> SO suffers from hostile moderation and a generally unwelcoming culture, perhaps even worse than Wikipedia. This has a profound chilling effect on positive, substantial contributions, particularly from new contributors.

This rings very true to me. The GP's comment made me consider that I very rarely go to Stackoverflow anymore, no longer contribute and what stuff I do find there useful is from well before the cultural shift in moderation, back when questions could be a little more open and answers a little more freewheeling.

It's a wasteland now. Legitimate, perfectly answerable questions get hostile comments and close votes.

The other sites in the Stackexchange network aren't nearly as bad, fortunately. English Language Learners and Math/MathOverflow are very positive places. Software Engineering is pretty good, too.

There's also a lot of people karma farming (for employment reasons?). A couple of years ago I was searching for something and the same question was asked by someone else on SO and this came up repeatedly while I was looking, but after 4 years there was no answer so it was a very low quality result. A couple of days later I did find/discover an answer so I went back to SO and added my reply, miraculously after 4 years someone else added an answer about 12 minutes later, it was the same answer as mine with a little more formatting and marked themselves as correct.

So after that experience would I go to the effort of contributing back again? Maybe in this case, because the answer was shorter than this post, but the equation might change if it required more effort.

I actually deleted my account some time ago via GDPR request. So I haven’t contributed in a long time (but wasn’t a heavy user anyway). But I have noticed a major decline in my using it at all, even for just reading answers. Yes, I did look up that C++ thing a few days ago, and ended up finding the answer on SO, but before that, I have no idea when I used it last. Could be months.

There are weird people on SO who are both trolls and knowledgeable. There is one person who edits EVERY SINGLE question regarding a certain programming language. Then attempts to close every question as "already asked", even where it is somewhat questionable whether it had been asked already or not. He knows his stuff really well though so he ends up not getting banned.

I do not get it.

If you're talking about Flimzy, yeah, he's a tool.

ha, no somebody else. maybe this um, personality quirk is not uncommon.

> SO had a great strategy initially with relying on search engines to index everything, but it never seems to have solved the recency/relevance problem. In that respect, it has become its own worst enemy, with old answers about obsolete versions and practices often ranking highly in search results.

Oh god yes, this is a nightmare.

Also the moderation is often hidden. Last week I ran into one of my answers for a language I haven't used in a while. When I read the answer I thought it was odd because the English was very poor. Then I checked and found that a moderator had made a really poor edit of the original answer. Maybe this is reasonable on a site like SO but I really felt that somebody had essentially impersonated me and entirely changed the tone of some writing which was attributed to my name.

> SO had a great strategy initially with relying on search engines to index everything, but it never seems to have solved the recency/relevance problem.

I think the SO leadership is currently searching for a solution to this issue. Whether they deploy it and it works is another matter.

Regarding the moderation and unwelcoming culture, I don't really see it, but maybe it's because I follow different tags?

Once SO killed expertsexchange and similar sites, it became less useful to me!

I stopped relying on SO and started relying heavily on https://grep.app and GitHub search. Most of the time the code actually works and there are good references on how to structure things.

If you like grep, you'll probably really like Sourcegraph, which I believe has most of GitHub indexed and supports some useful operators: https://sourcegraph.com/search

It will also download repos for you if you point them at it.

I’m finding GitHub files and blog posts dedicated directly to my topic more and more, even relatively minor questions or function syntax.

StackOverflow is great when you don’t know where to start looking for a solution. Whenever I’m in a new domain I’ll spend a lot of time on SO, spending less time there as I get more comfortable.

Put another way, I spend 100% less time in high school than I used to, but HS is still a valuable resource.

I also use stack overflow very little but I think it’s much more easily explained by my personal habits than changes on the site.

When I used stack overflow more, I was doing stuff with web technologies and Unix command line tools. These days I’m much more experienced with Unix tools; I use more specialised tools that aren’t mentioned on SO, use a programming language which isn’t often mentioned on SO, and a large array of libraries which are even less frequently mentioned there. I’m also generally better at finding my own way around—digging through code, jumping to definitions, looking up errors. I think the reason I don’t use it so much is that it is less useful because it doesn’t have the answers I care about and because I have fewer questions for it to answer.

That does sound a bit like me too. I use tools that I’m either familiar with enough not to need SO, or that are too specialised or niche for SO to be useful. I have no problem reading documentation or just diving into the code if all else fails.

>I'm trying to think what's changed

SO used to have really good content for some really fundamental questions, which gave it some credibility. As it accumulates more content the quality is really going down - I actively avoid SO unless I can't find any other source.

GitHub - literally every problem I have nowadays I spend searching through GH issues, reading through the code upstream and reporting back. For example I was working with .NET Framework for a while (~5-6 years back now) and when dealing with closed source frameworks SO was a replacement for support channels. Now even as I'm coming back to a .NET Core project - everything is OSS and on GitHub.

Docs also got better. MDN is now defunded by Mozilla, but where I used to have to do a SO search for random JS behavior I can now look up decent references on MDN. I think Google got better at indexing documentation as well.

> SO used to have really good content for some really fundamental questions, which gave it some credibility.

The moderators' refusal to allow "opinion-based" questions has been very harmful to the usefulness of the site. Now you can only find best practices if someone happens to add them as a tangent. Or if you happen on an old one that got 1000+ upvotes before the mods decided theory was off topic.

> I think Google got better at indexing documentation as well

I think this is it. I also think documentation in general has gotten a lot better at answering questions rather than being just an autogen API doc site.

Sure it wasn’t just autocomplete? Autocomplete eliminates half my reasons to visit SO.

Previously you had to know the function you wanted to call. Now you can just scroll through the functions from typing the dot.

I don’t rely on autocomplete quite as much as some people, although it for sure helps.

I do think that a big part is that the two things I spend most of my programming time on, one Clojure project and one C++ project, I’m just pretty familiar with at this stage. I’ve been using Clojure for ten years and the language hasn’t changed all that much, so I don’t need to look things up often (besides the occasional reference material on clojuredocs). For libraries, I check their readme’s and just read the code. I’ve actually never really used SO for clojure, now that I think about it. The Clojure project I’m working on is also a little bit niche, so I don’t tend to find anything useful on SO for it.

For C++, I keep cppreference at hand and only really use SO when I’m trying to figure out weird things, or newer features I’ve not yet used much. Again, for the libraries I use, it’s a mixture of documentation and reading headers and it’s rare that SO helps there.

In my previous job, I was mainly working with AWS services, terraform and some python (using boto) and, again, I had everything I needed from their documentation. Before that, it was Java with some Clojure, but again, the challenging part was the domain, which was nothing that SO could have helped with. Everything else, we had under control.

Maybe I got better at reading documentation?

The quality of autocomplete that's possible has a lot to do with the type system.

Same. Sometimes I forget a function name or method call (especially when using a language I use rarely) and just use stack overflow for the example and as soon as I see it I'm like "ahhh". No need for copy/paste.

Most frequently I do it for PHP. I help a friend with a Wordpress site, PHP is easy to read but I'm hardly fluent in the ins-and-outs, and a simple Wordpress hook added to functions.php can fix so many issues. I always leave a comment with the full link for context.

That's the case for me as well.

I always leave a comment trying to explain to my future self where I learned to do stuff that was not intuitive.

Bash, Make, non-standard Python libraries, E-Lisp, and Fortran.

Also, it's been quite a while since I've touched JavaScript, so the newer features have been harder to grok.

IMO it's not even that we're copying people's logic, it's just that Stack Overflow acts as a weird sort of crowd-sourced centralized documentation for programming languages.

For example if I forget the name of a function for something in a particular language I don't even go to the docs, I just google something like "python reverse list" and click the first SO link.

Is it just me or is this also a symptom of Python's documentation being really strange to navigate and generally having a massive impedance mismatch with Google?

When I search on Google for "python reverse list", not a single link is to the official Python documentation. Not even if I search for "python reverse" does the documentation page show up. Searching for "python reverse documentation" gives, as the second link, the Built-in Functions page (https://docs.python.org/3/library/functions.html), which is what I "need".

Excuse the comparison, but "matlab reverse list" has the top three results pointing to the official documentation (all of them relevant, but with slightly different semantics). Why can't Python be better than that?

It's not you - Python docs are very comprehensive but organized in a very odd way and miss a lot of marks. If you have time to 'read everything' you're fine but it's somewhat less suited as a raw reference.

Although I don't mind the informal voice of the docs - there's clearly not much in the way of editorial oversight and pro technical writing. It makes one reconsider how much actual effort goes into putting together docs from corporate backed efforts.

That said, I don't think anyone does it perfectly well, and that we could do a lot better in terms of providing information - to the point wherein I would consider it a little bit of a failure that SE has to be consulted for smaller problems. One would hope that 'good docs' would provide clear and concise answers to such things, rather than have the crowd cobble it together and vote on it.

I actually don’t mind the python docs, but then, I learned python from them back in 2001 when I had infrequent internet access, so I had to rely on my offline copy!

It’s not just you. Python seems to suffer from Python-specific ”tutorial sites” being SEO’d above Python’s official docs. I don’t know what it is about the Python documentation that lowers its rank on Google search results. In general, not a big fan of Python docs.

JS has / had the same problem, which is why a common tip is to add "MDN" to the keywords to get a higher quality reference style result from MDN: https://developer.mozilla.org/

I wish there was an MDN for Python/Julia.

Python documentation is a static Sphinx site with no need for JS to render content. That's the easiest possible design for search engines to index. It's Google's fault that their algorithm can be gamed by low quality seo sites.

The only way it wouldn't be able to be gamed is if they kept making large changes to it for no other reason than to be a moving target. Like you can say "oh it's stupid that they prioritize X and not Y when Y is a much better indicator of meaningful content" but as soon as they switch Y to be a priority, sites will shift to maximizing the amount of Y.

Google can fix the problem. They just don't want to. They were able to kill the Wikipedia clones by demoting their rankings. They don't want to surface non-profit ad-free sites on the first page results because that cuts into revenue.

With how much money Google spends on ML it's really a wonder that they haven't been able to out-compete a bunch of consultants on SEO. This really seems like a blunder on Google IMO.

> Is it just me

no. I stored the entire python document set locally, yet I still search online for functions.

stuff like: is it str.beginswith() or str.startswith()?

Blast, how do I convert to/from unicode again???
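For what it's worth, both of those have short answers in the standard library. A minimal sketch in Python 3, with made-up sample strings, in case it saves someone a search:

```python
# str.startswith() is the real method; str has no beginswith().
name = "stackoverflow.blog"
starts = name.startswith("stack")
has_beginswith = hasattr(str, "beginswith")

# Text -> bytes is str.encode(); bytes -> text is bytes.decode().
text = "naïve café"
data = text.encode("utf-8")
round_trip = data.decode("utf-8")

print(starts, has_beginswith)
print(data, round_trip)
```

In Python 3 every str is already Unicode, so "converting to/from unicode" really means choosing an encoding such as UTF-8 when moving between str and bytes.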

I think what we need is some sort of mashup of rosetta code and the docs and an index.

> When I search on Google for "python reverse list", not a single link is to the official Python documentation. Not even if I search for "python reverse" does the documentation page show up. Searching for "python reverse documentation" gives, as the second link, the Built-in Functions page (https://docs.python.org/3/library/functions.html), which is what I "need".

But what you want is probably ls[::-1]

It's a shame that most manuals have evolved to web pages that you can't download as PDFs or eBooks. Now you usually have to go online to find that keyword, and as you're online already, might as well google for the whole answer.

One of the most productive times in my career was programming for the IBM i in the 90s. There were manuals online, but you could also download indexed PDF versions. IBM did a superb job with their documentation. There was hardly any need to look up anything on the Internet; whatever you needed was in the manuals, examples included.

Of course you still used the Internet at work, but mostly for mingling with your peers and having technical discussions, although too many people still blindly copied/pasted from the forums.

You may be interested in https://devdocs.io (offline-friendly documentation tool). And if you prefer a straight up desktop app - https://zealdocs.org

I also really enjoyed Dash when it was free to use: https://kapeli.com/dash

Why not pay for it or get your employer to pay for it, especially if you enjoyed it?

Seriously, it's a $30 one-time purchase. In an age where companies want to charge you $30 per month for software, that's a steal.

I was in school when they started charging for it, and then it kind of just fell off my radar until I remembered about it now.

Thank you. I've used it, although it adds a lot of complexity for the same features that a folder full of manuals would give.

I'm probably too set in my ways, but I prefer references that are always available offline. Devdocs.io once in a blue moon would forget about my choices, and I would hate to redownload them.

I once downloaded Python docs as an iPad app and a small Python IDE - I was working on a cruise ship and so offline documentation really helped.

It taught me to get better at reading the Python docs, but more importantly to cope better without being able to just search every issue I was having online.

My first programming experience was with mIRC scripting and it was ridiculously convenient to have everything in a nicely written, local, offline .hlp file


I remember spending the afternoon tinkering with it until late at night when dial-up "logic" meant I could finally go online without it costing an arm and a leg

That's interesting that you say that because I actually have the opposite reaction. When I go to look for documentation and find that it's a PDF, I want to die.

PDFs are horrible as eBooks, but for a manual I prefer the fidelity of a PDF to the reflowing of an ePUB, or the fickleness, slowness, and potential unavailability of a web page.

A well structured and correctly indexed PDF is a godsend, because you just open the table of contents, quickly locate what you're looking for, click and there is your answer. As I said, IBM excels at writing documentation. Every language has a Language Reference manual, and a Programmer's Guide manual. The first is for reference, the second is to learn how to use it, including examples.

Don't take my word for it, you can check out, for example, the COBOL manuals here: https://www.ibm.com/docs/en/i/7.1?topic=languages-cobol

I think it depends. For something well-structured like a cookbook I'd prefer PDF in most cases. But it really varies. For example I find most guidebooks do a decent job on Kindle these days, and I have actually switched over to a significant degree because it's really useful to have everything on my phone.

For this specific use case a single HTML file that you open in the browser might be the best option (at least PHP offers that option and I appreciated it).

The first PDF I had was a 'bonus' book that came on a CD included with a programming book. There was a compiler along with sample code from the book, and then something like SAMS Teach Yourself Visual C++ in 21 Days.

I absolutely hated it. My screen was only 640x480, so I could barely see a whole page, much less see my IDE at the same time. These were the days of single 14 or 15 inch monitors, not dual or triple 24" widescreens.

It was also a little on the slow side. Early Pentium, 8MB RAM, 4X CD-ROM...

These days I'm delighted to see a PDF. There's a table of contents and index, they're searchable...copy and paste usually stinks, though.

PDFs are forever. They print nicely. They don't get lost when the project upgrades a major version, you can still use them when there's a DNS issue with your ISP, you can use them when you don't have internet...

I have a hunch that this tendency to look up SO rather than actual language/API docs is prevalent more in some ecosystems than in others.

For eg, with Rust and Go projects I would invariably read the actual docs (which I find are very accessible to read) as compared to when I write Python or C++ where I'm happy to SO my way through my task.

However, what I've found is that reading the actual docs is better for multiple reasons: 1) it reinforces your learning / memory via spaced repetition 2) you tend to glean some extra, related useful info from the docs.

These days I try and put myself in a no SO straitjacket as far as possible, forcing myself to read the actual docs instead.

I would definitely agree that it depends on the strength/weakness of the docs/SO for that ecosystem.

For example, MDN has terrific docs for JavaScript and web APIs. Things are discretely segmented and thus easy to navigate. Terrific, up-to-date information about how to do things and why, as well as great information on the caveats of the API (what works where and what fails where).

The SO answers are often a hack on how to implement it in an outdated framework. And ten variations that are "faster".

But you might still end up there because you're not sure how to look it up directly in the docs. Maybe you do an SO layover.

.Net is similar, although the official documentation suffers from marketing speak pollution, where it's hard to find answers because Microsoft names their APIs by randomly permutating ASP, Entity, Core and Framework when making things.

Add to that all the trashy noob doc sites, codeproject, csharpcorner... that pollute search.

Both ecosystems suffer on SO because of SO's inability to stay up to date with a changing API. A great example is all the Python 2 answers.

Then you have ecosystems like Scala, which has terrific documentation in source code. As well as official docs that are reasonably easy to navigate. But it's not a huge ecosystem, so sometimes it's hard to find answers to questions that haven't been asked. Then SO provides a valuable platform to get those answers. Sometimes from the maintainers themselves.

All in all, SO often offers a good complement to official docs, ranging from excellent to subpar. Personally I wish it were better at providing up-to-date information.

I do this for <canvas> sometimes and get... myself in 2013 answering somebody

This happens to me too. It freaks me out, because it means that not only did I forget the answer, but I forgot that I ever knew the answer.

Also been there, however mine was me answering my own question. Several years later I helped myself out again with it.

> (...) acts as a weird sort of crowd-sourced centralized documentation for programming languages.

I see it more as an expert system where problems and their solutions are documented in a queriable way.

What stack overflow offers is more than your run of the mill documentation. It leaves a paper trail of weird corner cases and their workarounds.

Just in case this idea needed any more validation, a few years ago Stackoverflow themselves launched a 'Stackoverflow for Documentation' product [0], and eventually shut it down because, as was probably obvious to many going in and certainly in retrospect, this product was of course.. just Stackoverflow.

[0] https://meta.stackoverflow.com/questions/354217/sunsetting-d...

It’s a shame, I really liked where that product was going. I work best with concrete examples.

> I just google something like "python reverse list" and click the first SO link.

If that's your workflow it might be worthwhile adding a custom search to your browser so you can just do "SO python reverse list" and only get results from SO.

Chrome https://www.google.com/amp/s/www.techrepublic.com/google-amp...

Firefox https://support.mozilla.org/en-US/kb/assign-shortcuts-search...

Or use duckduckgo: it basically has this builtin, so it works the same way without configuration across browsers. Enter !so <query>. No configuration is always nice, plus this really helps when working on other machines than your own.

“Back in the good old days”, you’d get detailed documentation with for example DirectX. The whole visual c++ experience was so good. I never needed to look up anything. Browsing for documentation is such a zone-exiter...

Rails, some gems, and some Java projects are among the few properly documented projects out there.


Or is it reversed(list)? checks Stack Overflow

Search it to find out whether it's a destructive operation or returns a new list
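To settle the destructive-or-not question without a search, here is a small sketch of the three usual spellings in Python 3 (the list contents are just sample data):

```python
items = [1, 2, 3]

# reversed() returns a lazy iterator and leaves the original list alone.
new_from_reversed = list(reversed(items))

# Slicing with a step of -1 also builds a new list.
new_from_slice = items[::-1]

# list.reverse() reverses in place and returns None.
result = items.reverse()

print(new_from_reversed, new_from_slice, result, items)
```

So reversed(ls) and ls[::-1] give you a new sequence, while ls.reverse() is the destructive one, which is why assigning its return value is a common slip.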

That’s exactly the thought process lol

I once had a brief contract helping out a two-person Rails consultancy where pretty much all they did was follow RailsCasts.

They got very angry at me several times for not doing things "the Rails way". We were on Rails 4, which I already had loads of experience with. The RailsCasts they were following were written for Rails 2. They literally had no idea all of the ways Rails had moved on between those versions.

Their codebase was an absolute mess as well. This was an application that was supposed to contain medical records and it had broken routes that were leaking data everywhere unauthenticated. And they were mad that I spent two weeks cleaning all of that up and bulletproofing their application.

I was happy to move on from that one, but it taught me a lesson about just how valuable sales skills are. These two people were living a comfortable lifestyle off a single paying client (essentially getting money indirectly through DARPA) and were punching way above their weight technically.

Why are "government-funded grants", "medical software", "incompetence", "very profitable" and "data leaks" always magically joined together...

It gets much more interesting than that...

The field itself is a poorly understood area of medicine. By that I mean, it's only about 50 years old and no two doctors practice alike.

It's science, but nobody has consistent repeatable steps that seem to be conclusive beyond their own individual labs. The treatments are extraordinarily expensive.

I'll leave it at that because I'd prefer not to give identifying details (even though I basically have), but really the whole thing was eye-opening for me.

With the right connections you can have a very comfortable life with receiving grants from DARPA. I used to work for a startup founded by ex DARPA guys and it quickly showed that there was an insider club of current and former DARPA people that lived off grants.

I was told a long time ago that without being an "insider" I would never get direct access to NYSE or be able to make money by doing so. I proved them wrong. While there is an "insider club" at NYSE, to a degree there is the same "insider club" at DARPA or any other enterprise or organization.

I think the interesting question is: to what degree does an insider club introduce friction for newcomers with justified, novel proposals?

Good for you! Nobody is saying that it’s impossible for an outsider to get access to NYSE or to DARPA. I just said that there are a ton of insiders who like to hand contracts and grants to each other.

For example, trust plays a factor too from a PM's perspective at DARPA; it is not all conspiracy.

This is not about conspiracy. It’s just a comfy Insiders club like many others.

I 110% believe you :D

I was working at a company that was acquired a few years ago, and part of the deal was them doing a source-level audit of code, mainly looking at licenses and open-source. They used some software which could detect copy-pasted source code, and had Stack Overflow indexed.

The Stack Overflow stuff was a massive, massive pain, for several reasons.

For one, it's not clear how CC-BY-SA applies to code[1]. This involved many long meetings with copyright lawyers. They ended up asking for all this to be removed, which our C-level people eventually agreed to.

We then had to go through several hundred "flagged" things (in a codebase of hundreds of thousands of lines) and fix or justify everything.

Only a few cases were outright copy+paste of a big function or class, and those were at least straightforward: rewrite. Usually it was old code anyway, and replaced with newer style, much more concise code.

Tons were one- or two-line things, which involved lots of arguments along the lines of "that is the way to open a file in append mode in this language, and logFile is the appropriate variable name in this case!" This is really a failing of the tool they were using, but it still required answering.
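To make that argument concrete, here is a sketch of the append-mode idiom in Python. The original codebase's language isn't stated, so the language, the temp-file path, and the helper name are all assumptions for illustration:

```python
import os
import tempfile

# Hypothetical log location, placed in a temp dir so the sketch is harmless to run.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

def append_line(path, message):
    # "a" opens for appending and creates the file if it does not exist yet.
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(message + "\n")

append_line(log_path, "first")
append_line(log_path, "second")

with open(log_path, encoding="utf-8") as log_file:
    lines = log_file.read().splitlines()

print(lines)
```

There are very few reasonable ways to write those two lines, so independent authors will converge on nearly identical text, right down to a name like log_file.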

The funniest was a handful that were our answers: they were copy+paste our team had done from our code to SO! (Nothing proprietary, just a few lines of an algorithm or solving whatever random problem).

Anyway, this experience left me extremely cautious about introducing any external code (as opposed to libraries) into a codebase. OSS libraries with permissive licenses (MIT, BSD, etc) are easy to deal with: it's straightforward to comply and being a library means the code is nicely compartmentalized if you need to replace/remove it.

[1] https://opensource.stackexchange.com/q/1717

I feel like the experience should make you extremely cautious of using stupid tools to analyze your codebase, not the introduction of external code.

Yeah, they definitely should have bailed on the acquisition when the acquiring company asked for proof they weren't using code they didn't have a license to....

It gives the impression that they don't trust their other more standard testing and analysis tools, and also that they don't trust their developers to use discretion when selecting online resources.

And it seems the software itself reflects a misunderstanding about how programming code is actually manifested. It's very possible to read the official docs or use an IDE and end up writing the same lines of code in some Stack Overflow answer, as many here have said already.

Checking the origin of the code is a standard step in any acquisition process (I have been involved in several). It's reasonable that people do not want to pay for some code that they later discover has to be released as source. The general attitude is "trust, but verify".

As you say, there might be a case where two independent implementations came up with the exact same code, although the probability of this is reduced exponentially with the size of the exact match. Keep in mind, we do not talk about "i = 0;" here, but something longer. And we do not discuss examples matching the official documentation, because then you can point to it as origin, instead of SO -- and it would probably be under a better license.

The questions are: even if you wrote it yourself, do you justify the costs of potentially going to court to defend the case and are you sure you will convince a judge, who will be the person making the decision?

This. The CC-BY-SA default license on Stack Overflow makes whatever is posted there unsuitable to be copy-pasted into proprietary-licensed software. It takes a lot of effort to train people not to do that.

That tool sounds properly dodgy too. Wouldn't their indexing of S.O. answers be itself an IP issue?

Using SO data for purposes like that is allowed under the CC licence used. You don't even need to do any scraping, since you can download an official data dump of all SO answers at https://archive.org/details/stackexchange

It's great that they officially release a data dump to IA.

I have spent multiple hours digging into something, found my own Stack Overflow answer and pasted it back in.

I do this with documents and help articles I've published. "Here, let me show you how to do this." 6 mos later: "How the heck did I do that?... oh yeah. That makes sense, neat."

I had a similar thing happen after watching a YouTube video a couple weeks ago. I was going to leave a comment and looked through the comments first to see if anyone felt the same as I did. I read a comment that was nearly word for word what I was thinking only to realize it was mine.

I have got the "you can't upvote your own answer" warning several times over the past decade. It's a weird mix of delight and embarrassment.

What's worse is when you find your own answer and don't even remember writing it!

Worse still is finding entire old projects with your name on them and not remembering them at all.

Happens to me with my own code. "What the hell was I thinking?" isn't uncommon when I find I got a bit too clever on something. Over the years I've learned to be less clever and more literate in my coding style.

Why did you spend hours digging into something you already knew the answer to?

They didn't know it any longer. They are using StackOverflow as a second brain, offloading knowledge to it.


Because "knew" is past tense. There are a lot of things that I once "knew" but no longer "know", and it's nice when that past knowledge is documented somewhere.

Solving a problem once doesn't guarantee it's added to your knowledge. Learning needs repetition; if you solve that problem once a year, you'll learn it very slowly.

Not OP, but... more often than I care to admit, past me is perfectly aware of the answer to this problem, but present me has long since forgotten.

Knowing something at some point in the past does not mean you currently remember it!

I've found that the amount I use stackoverflow is inversely proportional to the quality of the documentation and the strength of the type-system of the language I'm using.

That's me with Python... For some reason I find the documentation extremely cumbersome to navigate and parse. It's just so wordy and dense. And there are barely any examples - which is what I'm usually looking for instead of having to read a whole paragraph.

The Elixir documentation on the other hand is succinct and always features basic examples for every function. It's a joy to work with.

As a slightly opposing take, Django has the highest-quality and most comprehensive documentation I've ever seen for a web framework... Or open source project, for that matter.

Not very related to the parent comment, but just an interesting note.

I agree, Django docs put Flask to shame! Flask in general is more intuitive I guess, but the docs are not as good as they could be.

+1 for Elixir’s documentation. José Valim knew what he was doing when he made docs so easy and pleasant to create and view.

I always love it when a programming language has their docs available off-line and easily accessible. I grew up on Perl without an internet connection (or a GUI for that matter) so I got very used to the perldoc command. That is now the standard I expect. Elixir is one of the few languages that exceeds this bar imo.

I've been playing with the ast module in Python lately, and the documentation actually says to go look at someone else's documentation because it's better, which is a refreshing take.

Official Documentation: https://docs.python.org/3/library/ast.html

Green Tree Snakes: https://greentreesnakes.readthedocs.io/

Definitely. Looking at my browser history for stackoverflow + PHP, it seems to be things to do with PHP, such as how to prevent a referrer from being sent when users click a link on my website. That's of course not actually part of the PHP language, so it would seem like I'm rarely looking up how to use PHP APIs on stackoverflow. I've always found PHP docs to be the best of any language that I've used.

Though even for PHP StackOverflow is still pretty useful e.g:


Note that php.net answer (official docs) is right below the stack overflow one.

I like the PHP manual. It's pretty simple and has good examples. They seem to allow user comments, which used to be really old. (This one seems more up to date.)

Related: http://xahlee.info/python/quality_docs.html | http://xahlee.info/comp/which_programing_language_has_best_d...

It's curious: despite not being an example of a great language, PHP has great documentation.

I tried Pharo, and a great functionality there I miss in other languages, is the search by example ability. Saves a lot of visits to stackoverflow. See for example:


This seems like a terrible idea!

A terrible, horrible, wonderful idea…

(Kind of like one of the Grinch.)

This seems like a terrific joke!

I'm enjoying the license it uses:

"This module is licensed under whatever license you want it to be as long as the license is compatible with the fact that I blatantly copied multiple lines of code from the Python standard library"

Thanks for sharing. This made my afternoon :)


Inspired by stacksort: https://xkcd.com/1185/

On a side note, use caution when doing a copy/paste from a website into a terminal. There are several things you can do to reduce the risk. Here [1] is a demo of one risk vector. The article links back to a discussion here on HN from 2013 on some things you can do to mitigate the risk.

[1] - https://thejh.net/misc/website-terminal-copy-paste

In recent Bash versions this seems to have been fixed (available in Debian Bullseye, currently the 'testing' branch): when you paste something, it'll never auto-execute, even if it contains newlines.

It's actually quite annoying as I'll often copy from my terminal itself, purposefully with the trailing newline, and it now refuses to execute. I need to move my hand from mouse to keyboard to hit enter and then (often) back to mouse.

this is called "paste bracketing" if you want to search for it!

Thanks for sharing this!

So I have copied this answer into probably 100 separate bash scripts over the years:


I've thought about saving it somewhere (sometimes I copy from my other existing scripts) - but it's just too convenient to google/copy directly out of the webpage.

That link goes to:

> a useful one-liner which will give you the full directory name of the script

So we don't all have to click to find out what this is about...

I think copy and pasting code is very useful for a junior developer and nothing to be ashamed of. You will learn by debugging and modifying it.

As a more senior person, I rarely copy whole snippets. I will look for general inspiration or to confirm my idea and check if there is a better solution. I won't blatantly copy, as it rarely fits into my architecture and code style, so it is faster to just directly write it how I need it.

It depends on how complicated the codebase is. Junior programmers usually work on simple programs, so they can get away with copy-pasting snippets.

I do it all the time for clever tricks (especially SQL things like tally tables), but like to leave the permalink to the answer that seems best (from the share link below the answers) as a comment in my code.

If it's something I had to google for once, it's likely needed in the future and sometimes it's easier to search my own code for all SO links than follow the thread through the labyrinth again...

Stack Overflow answers are licensed under a CC BY-SA license.[0] If you copy and paste a function without (at least) linking to the question, isn't that copyright infringement?

[0] https://meta.stackexchange.com/questions/333089/

There probably are programmers that paste entire functions, but for me it's mostly either concepts or minor snippets. In particular the DBA stackoverflow is great for all kinds of clever tricks in SQL to use a set-based rather than row-based approach (e.g. to things like string handling, and tricky aggregations). I'm not too worried about the copyright police for things like that.

It depends. The short of it is that a few lines are often not copyrightable at all, but otherwise you're right.

Hence, in my code I actually link to the answer when I think it might meet the bar for copyrightability. One example that comes to mind is a string->number hash function in javascript to assign random but consistent colors to items; I would use the resulting number as hue in CSS.
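For anyone wondering what that looks like, here is a minimal sketch of the idea in Python rather than JavaScript. The names `hue_for` and `css_color` and the 31-multiplier rolling hash are assumptions for illustration, not the code from the answer being referred to.

```python
def hue_for(label: str) -> int:
    """Map a string to a stable hue in [0, 360) (hypothetical helper)."""
    h = 0
    for ch in label:
        # Simple polynomial rolling hash, kept within 32 bits.
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h % 360


def css_color(label: str) -> str:
    """Build a CSS hsl() string using the hashed hue."""
    return f"hsl({hue_for(label)}, 70%, 50%)"


print(css_color("a"))  # hsl(97, 70%, 50%)
```

The same label always maps to the same hue, so an item keeps its color across page loads without storing anything.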

Problem is when “your code” isn’t yours but your employer's. Don’t get me wrong, this is not a new problem, but you should be aware that simply linking the answer in a comment (esp. in a closed-source code base) isn’t sufficient.

On first glance I think you're wrong, but if you think I am then please tell me more!

The stackoverflow license requires attribution and share-alike, so: (1) there is a comment for the attribution, and (2) anyone who sees the code knows they can re-use it because I mention the license for that part of the code. Some licenses also require the binary or derivative works (e.g. rendered webpages) to reproduce a notice to any users, is that a thing with creative commons? I thought CC was for non-code licenses and, as far as I know, the binary showing a notice is not a thing CC specifies. Looking up the summary page[1] on their website, it says I must provide "appropriate attribution" which is a pop-over that specifies it as:

> you must provide the name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice, and a link to the material

I wasn't aware of the required disclaimer notice so it's good that I double-checked, but I do meet the other requirements. Though it's also quite rare that I really copy something as substantial as a whole function, it's not like this is a weekly occurrence.

But I'd love to know if I'm doing it wrong, in case you still read this!

[1] http://creativecommons.org/licenses/by-sa/4.0/

If you put this in your software as John Doe committer/employee then your employer is now in violation of the CC license since it is not properly attributed. Just being in your source code is not enough, especially if you redistribute binaries or other obfuscated manners of packaged code. A lot of shops I have worked for in the past have compiled license attributions for all included works. Imagine if this CC license was excluded! Thus, copying and pasting code from Stack Overflow in your day-to-day coding as an employee opens your employer up to liability. This is just like looking at GPLv3 code, saying “yep that makes sense” and then copying and pasting that. You are copying the license with the text and tainting your future code base if you do not have the right to alter the existing license.

now with that said

It’s highly unlikely this would ever be pursued at all or if it is, then swiftly settled out of court. But never underestimate legal trolls with nothing better to do

Ps this was in the context of employee/employer for-hire development. If you have the ability to attribute code as required either because this is a personal project or you are founder of said business then please disregard. My example was for jr Suzy F DevMcFace working for BigSoullessEvilCorp(TM)

I think whatever feature in my browser lets websites determine when I copy text should be ripped out, along with the site's ability to snoop on my scroll position.

The sandbox has been broken.

I believe firefox lets you set dom.event.clipboardevents.enabled => false

This is an unpopular opinion but I think the answer to this may depend on the age of the engineer.

For example, I learned programming before Stack Overflow. I have most of the standard library syntax in my head and mostly look at spec documents. Once in a great while I will go on Stack Overflow if I can’t debug a problem, but I don’t post on there.

In the same way, I suspect some engineers like using video to learn things or debug things but it’s not for me.

> This is an unpopular opinion

[citation needed]

These kinds of comments just make those who agree be sure to upvote, they're part of this perceived in-crowd after all, and most of the time (here too) it's fairly obviously true so a lot of people will upvote. Of course a newb will copy more code than a veteran. They also have very different functions in organisations on average. It's neither an opinion nor unpopular.

That’s not exactly what I’m saying. Veteran isn’t really the same thing as “age”.

Also, I think it is pretty controversial for one to claim that some people don’t really use Stack Overflow

> Veteran isn’t really the same thing as “age”.

Of course not, but at the same time there are not that many 15-year-old gurus. It's not really the same thing, but on average people will gain more experience as they do a job for longer, and to do a job longer you need to age. It's not causal, but definitely correlated.

Sometimes you want a discussion of the pros and cons of the various ways to tackle a problem, in order to find the best approach for your application.

This kind of information is almost never in standard documentation.

Yep. There's nothing quite like the experience of going on Stack Overflow for a particular problem, and the "best answer" is almost what you need...

but you scroll down, and there's a different version that's exactly what you need! except it's for an older version of the library...

but there's a comment on that answer, from two years later, with a quick note on "if you're at version 7+, just do Y instead of X."

Absolutely wonderful.

I use Reddit for this

This is probably true. Having lived without the ability to search any random problem and have a snippet of code there, I'm used to searching for academic papers, RFCs, or reference documentation, then building up my understanding and making an implementation that way.

There are times that I've not done this though, for example a couple of years ago I was struggling to comprehend how to convert some math from a paper into actual code (the symbols were more than a little odd and I swear a bit of the equation was missing) and yet searching Github revealed a repo where someone had produced snippets of different obscure algorithms. Not quite a cut and paste, but definitely a "read the answer".

My secret move when I'm having trouble setting up a new library or dependency is to search for some common code from it on Github.

For example, how do I use the Python requests library (bad example because it has great documentation) would lead me to searching "import requests".

Possibly, but I don't know where the inflection point lies.

Before StackOverflow was a "thing," I recall noticing that it was easier to google spec documents on an electronic component even though I had the manufacturer's databook and Application Notes just across the room.

The ease of searching the pdf on google outweighed the ease of getting up and paging through the physical book. The main difference was in Application Notes: I could sit down and read some of the better ones like a novel (I'm looking at you, Analog Devices :-) and that was easier with the dead tree copy than the online one.

It's less about age and more about the stack you work with. I know several experienced engineers who have the entire Java library memorized, but have them work on a slightly different stack (or even a newer SDK version) and they will be blindly Googling for answers like anyone else.

"Kids these days"

Ha! Joke's on them. I type it out so it makes it look like I'm doing more work. (But in all seriousness, I normally do type it out, so it forces me to remember what I'm using better.)

The learning works better if you cut and paste it twice, forget to change something and then discover it after a few hours of debugging.

This, yes! Not necessarily the copy/paste part, but definitely the debug part!

I generally like deeper debug sessions, preferably without too much pressure. Getting a deep understanding of a problem and figuring out a solution were the most valuable learning experiences in my career. The satisfaction you get from finding a solution to a hard problem makes them more memorable. Even if the lessons learned afterward are not spectacular, at least you got a chance to sharpen the tools in your belt to pinpoint problems as they occur!

Right? Who is literally copy/pasting, that seems dangerous.

I'd describe it as learning the concept/tactic/technique from the SO answer, and if me then implementing that concept/tactic/technique in my own code just so happens to look the same as the provided example, that's fine.

Frequently it doesn't, either!

Same, I've always done this for anything like this (i.e. tutorials), just to make sure what I'm learning sticks with me a bit more.

Well, hello, Junior. That's thinking!

Ha! Nice username :)

I don't copy and paste from SO.

I sometimes type what is shown on SO, but not copy/paste. This just sits better with me: I get a deeper appreciation of what it's doing even if I've read the code and understood it well, and it lets me use names that fit with the program and context of what I'm doing.

Often, I use the example on SO not as the solution, but as inspiration for one, and so I type only a modified subset of it, and at that point, it's just a lot easier to type it yourself than to copy/paste/modify it anyway.

It's a great way to get individual functions for odd tasks instead of needing to pull in a whole library. Just last week I needed to implement POPCNT in C# since we're not using .NET 5 so BitOperations.PopCount is unavailable. The naive implementation is easy but this was needed in our core networking code so easy performance wins are good to go look for.

A quick Google search brought me to a question on Stack Overflow from someone looking for the exact same thing: a good implementation of POPCNT that works for a single byte[1]. I checked out the answers, found the one that looked the best, pasted it into LINQPad to test all the combinations, and then pasted it into our code with a comment that had the Stack Overflow answer's permalink. I don't see anything wrong with copy-pasting from SO that way. It's quite handy.

[1]: Yes, if I'm looking for maximum performance, working with just a byte at a time is not great. But for the data I'm working with, you need to know the population of a byte to find the next byte that needs to be counted.
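A rough sketch of the kind of single-byte popcount being described, in Python instead of C#. The lookup-table approach, the bit-trick variant, and the function names are assumptions for illustration; this is not the answer that was actually pasted.

```python
# Precompute the number of set bits for every possible byte value.
POPCOUNT_TABLE = [bin(value).count("1") for value in range(256)]


def popcount_byte(value: int) -> int:
    """Return the number of set bits in a single byte (0..255)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    return POPCOUNT_TABLE[value]


def popcount_byte_bits(value: int) -> int:
    """Same result via the classic parallel bit-summing trick, for comparison."""
    value = value - ((value >> 1) & 0x55)            # sums of adjacent bit pairs
    value = (value & 0x33) + ((value >> 2) & 0x33)   # sums of each nibble
    return (value + (value >> 4)) & 0x0F             # total for the byte


# Exhaustively compare both versions over all 256 inputs.
assert all(popcount_byte(v) == popcount_byte_bits(v) for v in range(256))
```

With only 256 possible inputs, testing every combination before pasting anything into production code is cheap, which is roughly what the LINQPad step above amounts to.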


Does anyone ever intentionally NOT copy and paste, even from their own code?

I usually copy/paste, but sometimes I'm inclined to type it out again because it's just... It's just really satisfying.

It's like a ballet move or a really graceful hockey play. I find the act of physically typing out some code to be graceful and elegant and just so darn satisfying.

I try not to copy & paste if the pasted version needs slight variations. Too often I miss one and waste time debugging. So I try to retype in those situations.

I wish they had shared with users how many of their answers/questions/comments were copied from. Views are aplenty but maybe one in a dozen, perhaps one in a hundred of the readers that found it useful actually upvote. I might even link an answer to friends, they express having found it interesting or useful, but no upvote comes in. Not enough karma is often a reason, but also most just don't habitually upvote useful posts and are not usually logged in.

I'd love to know how many people make use of my answers, and while there are per-page view counts, of course a view doesn't mean I really helped someone. Number of copies during these two weeks would have been very nice to get as a statistic!

Don't you have to be above a certain score or something to upvote?

I remember something about that when I tried once and haven't bothered logging back in since as I didn't have the clout to do anything of value

E: I am sorry, you mentioned that, but I dived into a reply so fast I missed it. Aye, the karma thing they have there really discourages participation (or at least it did for me). I'm working on the thing that brought me to SO in the first place; I'm not spending time building up karma on a dying website to upvote haha.

For what it's worth, if you click it but don't have enough karma, it is recorded and high-rep users can see how often people tried to up/downvote something and compare it to the actual votes. But the screen (I have enough rep to see it) seems to be more of a gimmick than actually used by anyone.

Personally, I try not to copy & paste things I don't understand—unless there's not enough information around the code to explain it, in which case I first copy & paste it, then tweak it and test it enough that I do understand it.

The quality of SO has deteriorated rapidly over the last few years.

I think the rot started when they brought in the careers section. People are more interested in gaming it to get a job than in helping each other out.

It's weird, nowadays I think the only times I use SO as a reference are when I'm hitting something that is very product-specific (i.e. a weird error message in Typescript that doesn't quite make sense). In those situations, there's usually some SO thread that explains the quirky behavior and workarounds in depth. Otherwise, I'm never there anymore, which certainly wasn't the case years ago.

Stack Overflow suffers from Internet Ossification. Thousands of posts in the search engine but most are for obsolete questions. So lots of chaff shows up.

Sometimes it's a melancholy thing - looking for a latest-version issue and some 12-versions-back question shows up from 2002. Still unanswered, and I think of that poor soul still hoping for some direction after all these years!

One of the interview questions I ask is, "if you run into a problem you can't solve, where do you go online to find answers?" Anyone that doesn't answer Stack Overflow is - in my opinion - either lying or very, very new to the industry. (This is for .Net/C# jobs so maybe it's different for other languages.)

1. web search (google/bing/ddg) - often that will lead to SO, but also turn up some other relevant forums.

2. docs on package X. If I'm reasonably certain a problem is with a specific package, I'll search for a forum or issue tracker (often GitHub) for that package.

3. language-specific community. there are some lang-spec sites/forums that help with the nitty-gritty sometimes that SO and similar sites don't always get (or, more often, SO is out of date but still marked 'best').

That SO often tends to be where you end up doesn't, imo, mean you should always start there. If the same SO links are at the top of many search engines, that's probably a very good indicator, but you almost always need to broaden out when researching.

and lord help you when you are on page 6 of a google search and not finding it... :(

I would take that as evidence that I'm not doing something common, which is soft evidence that I'm not doing it the right way. I haven't had to do anything really novel, mostly I just have to do various forms of plumbing.

Oh no disagreement there. I usually end up there because of some odd error code from some code someone else wrote. If I am lucky I have the source code (not always). But 99% of the time whatever you are doing wrong is in the first 5 or so posts returned. But you get desperate sometimes :)

to be honest, I use my search engine of choice, yes stack overflow often comes up in the results, but also GitHub issues of people having a similar issue, often for other projects that use the same tools/libraries I do.

so I go to my search engine for answers, not stack overflow directly.

Yeah. I'd raise an eyebrow at Stack Overflow if they didn't also include Google. Google will get me official API docs, blog posts diving in deep about it, Github issues, as you say, -as well as- anything relevant on Stack Overflow.

Maybe they meant to go -ask- questions? Which...I rarely ever do, despite being in this game for over a decade. The few times no one has asked the question I have, it usually has to do with a library or missing functionality, and that tends to get resolved (at least with an answer of "yeah, that isn't supported") via an email.

right, often if it is a particular error message, the GitHub issues for the library and version you're using are more useful.

I disagree. From your perspective I must be a liar, and that is fine by me.

I can't recall the last time I found anything helpful on that platform, that any of my questions were actually answered, and certainly not that I copied & pasted any code. I am not claiming that it never happened. It is just so rare that I wouldn't answer your interview question that way. To me that platform is an internet points farm combined with Groundhog Day of basic questions. I might be an extreme outlier of our craft. Just keep in mind that we exist and that not everybody answering the question "incorrectly" is a liar or an idiot. In fact, I would say that only those candidates who answer differently are truly interesting.

I've been in this industry around 20 years and I can't remember the last time I looked at SO to answer one of my own questions. Other people's, sure, but usually I look at the source code for answers to my questions.

The results are more consistent.

Now Github Issues on the other hand...

Why would I go to Stack Overflow? If it has any decent answers, they'll be right at the top of Google (or whatever) search results. If it doesn't, I just saved myself the need to do a second search.

I assume this is for more junior roles?

It's been a long time since I've needed to use SO for something in my day job. I mostly end up on the github/microsoft issues sites scrolling down to see whether the compiler/library bug I just reproduced has been fixed yet or at least has a workaround...

I think a better question would be "How do you go about finding a solution to a roadblock you've encountered while writing code?", that will test for people who go straight to colleagues for each little issue instead of spending 10 minutes looking up a solution themselves. It also gives you an idea of their problem solving process.

Search is the only correct answer. SO may be the first click often but limiting your problem solving to that is not something an experienced dev will do.

Not to mention that Stack Overflow's own search is absolutely horrible.

Searching online to find help solving a problem doesn't equal using Stack Overflow, at least in my experience.

It may be the type of problems, but I can't even remember the last time that DDG took me to SO and it was actually the answer that I was looking for.

I think I've found more answers searching for open issues in GH than anything else.

Eh depends on the language. For elixir I go to the elixirforum not stackoverflow

it really depends on the severity/depth/weirdness of the problem. I often run into things I don't know the names of or need documentation for how to do with a particular library/framework; those almost always end up being solved on Stack Overflow. If I actually have a problem that I cannot 'solve', Stack Overflow is almost never actually helpful (sometimes it helps to try to write out a question that is clear enough for Stack Overflow to accept, because then you might think of what you actually have to research)

> "One out of every four users who visits a Stack Overflow question copies something within five minutes of hitting the page."

Ha. Before reading the article I thought to myself "I probably copy about 1/4th of the time..." Usually just syntax I can't keep straight.

> UPDATE: There has been a lot of interest in purchasing a real life version of our prank. The good news is we anticipated this might happen and we’ve been working on something along these lines. Stay tuned for more!

Nice! I thought it was a cute idea and I like little useless hardware.

copy and paste find and replace

those are fine, but use a tab not a space

CTRL shift L or I'm in hell

but please, don't merge my rebase

I've done it pretty often. Usually I paste the code into a Jupyter notebook (where I do all of my trial and error stuff), modify to suit my need, then add a comment with a link to the original page. I do this with anything that I glean from the Web.

How about a GitHub Action which scans lines of code in a PR patch for exact matches on SO? As a reviewer it would be helpful to see 1) was copy pasta 2) what SO comments say about the code (eg has it broken)
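A very rough sketch of what the matching step of such a check might look like, in Python. Everything here is hypothetical: the function names, the `snippet_index` set (assumed to be built separately, for example from the public data dump), and the 30-character cutoff are all made up for illustration, and no real GitHub Action or Stack Overflow API is being described.

```python
def normalize(line: str) -> str:
    """Collapse whitespace so formatting differences do not hide a match."""
    return " ".join(line.split())


def added_lines(patch: str) -> list[str]:
    """Extract added lines from a unified diff, skipping '+++' file headers."""
    return [
        line[1:]
        for line in patch.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


def find_matches(patch: str, snippet_index: set[str], min_length: int = 30) -> list[str]:
    """Return added lines whose normalized form appears in snippet_index.

    Short lines are ignored because they match by coincidence far too often.
    """
    matches = []
    for line in added_lines(patch):
        norm = normalize(line)
        if len(norm) >= min_length and norm in snippet_index:
            matches.append(norm)
    return matches


# Hypothetical index containing one well-known bash one-liner.
index = {normalize('DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"')}
patch = '''\
+++ b/run.sh
+#!/bin/bash
+DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )"   &> /dev/null && pwd )"
+i=0
'''
print(find_matches(patch, index))
```

The minimum-length cutoff is there to avoid flagging every short, obvious line, which is the failure mode of the audit tool described earlier in the thread.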

Q: How often do people copy and paste from Stack Overflow?

A: Very rarely. And when I do, it really depends on the language.

If you find a solution for Clojure, chances are it's pretty high quality. The fact that the language has been essentially stable for 10 years helps. The language culture around minimizing complexity and a strong preference for pure functions also doesn't hurt.

For other more "churny" & mutable languages you gotta be careful. It could be a good solution, or it could be old, unwieldy or excessively complex.

It would be interesting to see cases where the most upvoted answer isn't actually the most copied answer. Stackoverflow could potentially rank highly copied answers higher.

I copy and paste into my project diary, which will also have the question that I'm trying to answer (and perhaps exactly what I googled), and of course the URLs of useful things I found. They are often SO. Rarely will I paste into source code. I find that copy-and-paste errors undo the benefit vs just typing it in and using my own code style.

I did work for a small company whose “director of software development” had copied and pasted verbatim basic details from SO enough for it to become a theme. He would copy long, drawn-out, language-level examples from SO rather than use the idiomatic, syntactic sugar provided by the framework and its docs. The duplication would drive me insane.

It's one thing when you need to remember how to make a GET request with a certain framework; you can copy and paste that answer. It's another thing when you need to integrate the request into your asynchronous queue and store the results in an ORM with JSON serialization. You can't copy and paste that.

I have copied the code one time in 2 years, but I ended up heavily modifying and then refactoring anyway ...

Way too often. One of the bugs I had the opportunity to debug was caused by a frontend developer who copy pasted invalid JSON from SO and pushed it all the way to production. He used different quotes: “ ".

When the data did not show up on the dashboard he just put the blame on the data team.
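
That failure mode is easy to reproduce. A minimal Python sketch, assuming the problem was typographic ("curly") quotes replacing straight ones (the payloads and the `try_parse` helper are made up for illustration):

```python
import json

# Hypothetical payloads; only the quote characters differ.
straight = '{"status": "ok"}'
# U+201C / U+201D are the typographic quotes you can pick up
# when copying from a formatted page.
curly = "{\u201cstatus\u201d: \u201cok\u201d}"


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


print(try_parse(straight))
print(try_parse(curly))
```

The JSON grammar only accepts the straight double quote as a string delimiter, so the second payload is rejected by the parser instead of silently producing odd keys.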

All the time, man. I cargo cult the fuck out of the following:

    # https://stackoverflow.com/a/246128
    DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
is a

I'd be curious at the length distribution. I tend to copy search terms (LongJavaClassNameThingy) to find some separate documentation.

I assume what's going on with the subset of high rep people that do a lot of copying is they're doing searches for duplicate posts.

Anytime I have a build config issue or something with webpack I've definitely been guilty of just pasting lines and hoping they work.

It's still super handy when updating something to a major version and figuring out what broke.

Every time I need to unpack or pack something with tar. Or when I need the email regex validator. Or when I need the 'less' command to scroll to the last line. My memory is weak :)

I've seen more than one Stack Overflow answer straight copy-pasted into a single function, much of it leading to dead code anyway, and not cleaned up before being pushed into production.

I closed the page in frustration when I got this popup to subscribe and haven't been using it as much since. I thought it was real. Or maybe it was and this is damage control?

I'm honestly surprised that people use Stack Overflow any more. In the early days, I gained a large amount of reputation both answering and asking questions. It felt like participation.

Now it is almost impossible to ask a valid question without it getting flagged for some reason. Questions are closed as duplicate, when they clearly aren't. If you follow the "new" feed, you can see valid questions getting closed before your eyes. Now I just stumble on it from google searches - and most of the time the answers are of poor quality, or to do with an outdated language or product feature.

It is sad, because it was an awesome Q&A site.

"We pretty much captured everything except the actual text being copied."

Didn't realize I need to browse Stack Overflow in incognito mode...

Don't paste from stackoverflow and do not hire people who paste from stackoverflow or anywhere online. This is literally how software is compromised. I worked with people at another company who had compromised a small part of Google and several banks this way. They contacted the security department later but they were surprised by the number of companies they were able to compromise just by posting instructions online.

> Don't paste from stackoverflow and do not hire people who paste from stackoverflow or anywhere online

Crucial corollary: "... that you don't understand".

Blindly copy/pasting what you don't understand is the problem. If it's too complex to analyze, it doesn't belong in Stack Overflow.

I would add - unless it's a very short snippet and you understand exactly what it does. My top SO answer is how to convert an IP address from string to integer representation in Python. It's one or two lines, it's perfectly fine to copy that sort of thing. I've actually copied my own code from SO on several occasions.
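
For reference, that kind of one-or-two-line conversion can be sketched with the standard library. This is not necessarily the code from the answer mentioned above, just one common way to do it for IPv4, and the function names are illustrative:

```python
import ipaddress


def ip_to_int(addr):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(addr))


def int_to_ip(value):
    """Convert a 32-bit integer back to a dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(value))


print(ip_to_int("192.168.0.1"))  # 3232235521
print(int_to_ip(3232235521))     # 192.168.0.1
```

`ipaddress.IPv4Address` raises `ipaddress.AddressValueError` on malformed input, so bad strings fail loudly, which matters even for a snippet this short.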

If terrorists really wanted to disrupt civilization, then S.O. would be a prime target. I'm just the messenger, shore it up, you've been warned. I'd estimate it solves at least 1/3 of the glitches I encounter from day to day.

That being said, I can't stand their all-or-nothing moderating. Let low-rated messages exist, but be hidden by default, similar to Slashdot and Reddit.

All Day, Everyday.

(p.s. SO Blog is really cool - they always post many good and informative articles there).

It's often faster to google Stack Overflow for Python code than to write it yourself.


I don't always copy code, but when I do it's always regex!

....I have lots of questions about this "homegrown web tracking tool" (assuming this isn't still part of a joke?). What??? Was this opt-in? Did they track across everyone? What was collected? This is troubling, to say the least.

Hey there - I work at SO. Understand your concern and wanted to share some details.

Browsers fire a copy event when you copy, just like a click event fires when you press on a button. We just added analytics to it like we would any other feature on the site.

We didn't track the content of your copy (browsers don't let you see the text content) but we did track the following:

Metadata about the post and its parent post, like the id, owner, score, tags, if it was a question/answer, if it was accepted

If your copy was from a code block or from text content

The Referer header

Standard analytics properties like the date/time, approximated location, account metadata

Here's our privacy policy on analytics: "Analytics information Stack Overflow uses data analytics to ensure site functionality and to optimize our Product and Service offerings to you. We use web browser and mobile analytics to allow us to understand Network and Apps functionality. In doing so, we record information including, for example how often you visit the Network, how often you contribute content, Network and Apps performance data, errors and debugging information, and the type of activity you engage in while on the Network or in your use of our Products and Services. We may on occasion share this information with third parties for research or product and services optimization."

> Browsers fire a copy event when you copy, just like a click event fires when you press on a button. We just added analytics to it like we would any other feature on the site.

This is a bit misleading.

More accurate would be: Stack Overflow is able to configure our web pages to track certain of your activities on the page, including Copy and Click. We do that.

This is a bit misleading; "onclick" and "oncopy" are part of the HTML standard that "must be supported by all HTML elements [...] and that must be supported by all Document objects"

[1] https://html.spec.whatwg.org/multipage/webappapis.html#handl...

[2] https://html.spec.whatwg.org/multipage/webappapis.html#handl...

That's right, but nothing mandates that the web page configures these event handlers, and sends these events to the tracking server operated by Stack Overflow.

My point is: "We can make this thing happen, and we do so and record the results" is more accurate than "this interesting thing just happens, and we observe" which is the tone of the original.

Most sites do not configure their pages to track copy events. SO does the unusual thing. They should be honest about that.

I appreciate the response and detail here, would have liked to see that in the original post.

Serious question: why is this troubling?

I don't see the big deal, at all. This is absolutely nothing compared to what big advertising & social media do. Taking a count of people who hit ctrl+c? Who cares, I can't see any possible scenario of how that data could be used in a bad way unless you think SO is going to email employers with a time spent & ctrl+c count or something?

Not OP, but, the troubling parts for me are 1) the tone of the post, and 2) the larger-than-SO issue of the gap in understanding copy+paste for the average user.

1) The post says "unfortunately" they cannot tie logged-out users to their logged-in account. In no moral way is this a reasonable perspective to me. It is extremely fortunate that SO does not build the tech to track you as a logged-in user when you are logged out. That's a bad precedent to set. Sure, some sites do that, but I think they shouldn't.

Other examples of the tone in the article abound. It's troubling for sure.

2) It is up to the web devs themselves to decide what goes in your clipboard. Many users don't know this. User expectations of privacy around copy+paste, as well as tracking, do not match reality, and sites exploit that gap. Stack Overflow is merely one of many players here, but the way they exploit this gap is mildly upsetting.

Something needs to be done about the copy+paste reality, either technically or through communication to users.

Thanks for the explanation. Honestly I still don't see the big deal though. It's just a little bit of fun with an interesting statistic, not everything needs to be so doom & gloom serious.

I wasn't trying to be all "doom and gloom," but as one concerned about tracking (yes, all the social media and stuff you mention, which I try to be very conscious about) the casual mention of adding in tracking to a particular event really could have used more context and explanation (especially given the technical audience). I love seeing the data analysis on SO, but there is a general need for transparency and opt-in on the internet. Not saying this was a huge deal, but the fact that it can be so casual speaks to the larger issues I'd say.

After 10 years of coding I visit stack overflow less and less. In fact when I first started there was no stack overflow, we had to rely on Google.

I find it useful to educate other engineers on best practices. There are good posts on MVC, SOLID, TL;DRs of popular books like Working Effectively with Legacy Code, etc.

I find it useful to see how different people have approached the same problem. I'll generally look at multiple questions and answers before deciding on my own approach (not necessarily the ones I saw on SO).

I never copy/paste directly but I try to understand what’s going on. It seems there are two kinds of devs: some that just want to get things done and others that want to understand. The people who just want to get things done will probably be happy copying some code verbatim.


I don’t think that link is what you meant to paste?
