
Please Do Not Steal My Code, Mock My Analysis, and Present My Ideas as Your Own - minimaxir
http://minimaxir.com/2015/10/code-steal/
======
madaxe_again
The folks doing the ripping off here probably don't even see it as such. While
it's easy to attribute something like this to malice, it's saner to attribute
it to staggering incompetence and lack of introspection. Pity the poor fools
who are so unable to do something original that they must stoop to claiming
someone else's originality as their own.

I've seen exactly this sort of thing before in all sorts of walks of life,
from school work to academia, from corporate environments to international
politics - a bunch of mediocre, confused, out of their depth but don't know it
types take someone (singular or plural) else's idea, barely repackage it,
claim it as their own, _wholeheartedly convince themselves that it is their
own_ , and run with it - far too often somehow winning the battle of hearts
and minds, as when the original author comes along going "Oi!", their
readership or allies go "sore loser".

I've learned by this point in life to take people using my work, even if they
do so spitefully or in ignorance, as a compliment. Fabricators usually get
shown up sooner or later.

~~~
stevesearer
This is true in my experience as well.

Originally there was the concept of reblogging where one site would publish
something original and others would publish a small excerpt with a clearly
defined link to them.

From there came the idea of the [via] link where another site published its
own article based on information from the original publisher and attributed a
small link.

And from there it seems like there is just so much republished content that
sites think of the information as just public information and glean from
wherever without any sort of attribution.

~~~
lexicalscope
I find this amusing because I'm generally suspicious of sites that don't
include some form of reference to wherever the information was retrieved -
perhaps a habit from when I did more stuff in academia. Things don't exist in
a vacuum, except things that are basically pure opinion articles - just about
everything else is based on something and should probably be showing it.

Maybe that's just me though.

------
Macsenour
I worked at Broderbund many years ago. In the interview my future boss asked
me if I had any game ideas. I mentioned I'd not seen a good game with Smokey
the Bear. When I started a couple of weeks later, as I was introducing
myself to the other programmers, one mentioned that he was working on a Smokey
the Bear pitch that came out of the blue from the boss. I mentioned it was my
idea from the interview, and this programmer insisted it couldn't have been
because the boss had brought it as his idea.

Fast forward a few months, I've presented 23 game concepts and they were all
turned down as "not good game ideas". Two weeks later I stumbled into a meeting
where one of those that turned down my ideas, an artist, was presenting 3 of my
ideas as his own. I complained to the boss, who said: "Ideas are free". My
retort was: "Sure, but the credit isn't".

That was many years ago and I still find those two people to be disgusting in
their attitude.

~~~
Overtonwindow
Maybe this is a lesson for the future for us all: Don't give away ideas unless
you've been hired, or someone has signed an NDA. Otherwise he's kinda right:
They stole your idea because you had no legal claim to it. Morally they're
dicks but hopefully you've learned from this.

~~~
Macsenour
I agree, no legal standing on my part. Morally they're both dicks. I did learn
my lesson. When I interviewed at another game company, iMagic, they asked for
game designs and I told them to buy the cow instead of trying to get milk for
free. They didn't hire me, and were later sued by someone else for stealing
his copyrighted game concept. They're long gone as a company.

~~~
Overtonwindow
Oh wow, dodged a bullet with that one! I work in government relations and have
a similar problem. Potential employers want to know how I'm going to solve
their government relations, lobbying, or PR issues. I've had ideas stolen many
times and it's very frustrating. Sometimes I suspect interviews are treated
like fishing expeditions by some employers.

------
thomasahle
> I saw no mention of Max Woolf, minimaxir, or any mention of the original
> visualization by the post author.

I agree that it is good practice to show attribution, and not directly lie
saying you wrote something yourself. However the MIT license doesn't require
that.

There are a lot of open source licenses that give more protections to the
original author. I have a theory that too many people choose MIT just because
it is simple, but don't think through what they actually want from a license.

~~~
minimaxir
That's why I preface the post title with "please." I know that attribution
isn't required and I can't enforce it, but _pretending the original analysis
doesn't exist_ is spiteful.

~~~
jordigh
I think it's even more spiteful when you write a blog post like this when
someone uses your code in the way you intended it to be used. There was no
slander against you, was there? If so, that's already covered by laws other
than copyright. And if you want more attribution (I'm not sure if the authors'
analysis counts as a "substantial" reproduction of your own work), then use a
3-clause BSD, which says,

    
    
        1. Redistributions of source code must retain the above 
        copyright notice, this list of conditions and the 
        following disclaimer.
    

That way, any reproduction of your work that wouldn't fall under fair use
requires attribution. I'm not sure if the "substantial" part of the MIT
license is the same scope as "fair use", but without that adjective, the BSD
license might be more explicit about requiring attribution.

I have had my own free software used in ways I don't like, but I don't write a
blog post about it. After all, I made it free. You want to insult me, mock me,
satirise me, print out the source code and wipe your butt with it; go ahead.
That's part of what free software must allow.

Point is, require with legalese what you want people to do. It's your
prerogative.

Above all, don't call copying code "stealing". This is not theft. It might not
even be copyright infringement. Calling it "stealing" is undue indignation
that indicates that they did something worse than what you explicitly allowed.

~~~
brazzledazzle
MIT already covers the copyright attribution:

    
    
        The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

~~~
jordigh
Is "substantial portion" the same scope as what you would need to do under
fair use?

~~~
brazzledazzle
Terms like substantial portion are for lawyers to decide. If you're not a
contract lawyer your best bet is to always play it safe and assume the worst
possible outcome for yourself. Huge lawsuits driven by huge corporations have
used less as ammunition.

------
mrow84
In meteorology and oceanography we call this type of plot a Hovmöller diagram
[0], and I'm sure it goes by many other names, because it's hardly a difficult
idea to arrive at. Perhaps you should credit one of those prior authors?

I would normally try to avoid the snark, but, setting the code-duplication
issue aside, the rest of your complaints seem a bit thin-skinned, put in the
light of the kinds of criticisms you find in normal scientific literature.

See aakilfernandes' comment here [1], for the appropriate way to look at this
other person's comments on your analysis.

[0]
[https://en.wikipedia.org/wiki/Hovm%C3%B6ller_diagram](https://en.wikipedia.org/wiki/Hovm%C3%B6ller_diagram)

[1]
[https://news.ycombinator.com/item?id=10459559](https://news.ycombinator.com/item?id=10459559)

~~~
rodionos
That was my thought as well. It's a very typical calendar chart. I'm sure
there are many implementations in d3 and ggplot. The author is being a bit too
dramatic about it.

------
apologizer
Hi guys,

Author of the medium post here. First I want to apologize to Max for not
linking back to his blog, as he said I referenced his work and image multiple
times and did not provide his name or blog in my post. I was in no way trying
to pass this off as a novel idea as I said multiple times in the write up.

As for what was perceived to be critiques of his work I was merely trying to
emphasize what I was doing differently in my analysis and project. His project
is great, I just used a slightly different approach and wanted to highlight
the differences. I definitely did not mean to come off as malicious or
mocking, but upon rereading I definitely see how it did come off that way.

I stumbled across the code to make the day of week and times here
[https://github.com/minimaxir/hn-
heatmaps/blob/master/hn_heat...](https://github.com/minimaxir/hn-
heatmaps/blob/master/hn_heat_map_points.R) after I had started the project. I
had posted on StackOverflow
[http://stackoverflow.com/questions/33263015/converting-
day-o...](http://stackoverflow.com/questions/33263015/converting-day-of-week-
from-integer-to-character-string-in-r) and had gotten an answer that seemed
close but not quite perfect. So I researched it a bit, and happened to find
that GitHub code. I figured that a couple of lines of code probably weren't
worth attributing, but I can see now that I should have been more thorough in
my attribution. I have since taken down the article.

~~~
jongraehl
Unsurprisingly, your version-2 story is as close as possible to version-1 and
the demonstrated proof of version-1's lies. This is why it's nice to hold in
reserve some additional evidence when you expose a liar - so you can catch
them lying in version-2 and completely destroy their public credibility.

I hope this doesn't happen to you, but in my experience version-2 is usually
only a partial confession.

Good luck recovering from this with a lesson learned: you don't look bad when
you share credit and honestly represent your contributions. People will
already associate you with the good thing you brought to their attention,
deployed and customized for them, etc.

~~~
apologizer
By "version-1" do you mean my Medium post?

------
mcguire
I've no intention to address the credit issue, but I think you're getting a
little worked up over not much, regarding the mockery part. For example, in
/u/dleybzon's comment,

" _Since this is a link to my blog post explaining how the visualization was
created, I'm not sure if it's necessary to include a comment, but here goes:_

" _I used BigQuery to query this dataset, and then made the visualization in R
using ggplot2. I normalized the data by dividing the total score for that hour
of the week by the number of posts posted in that hour. For more info check
out the article, or the commented code at the bottom._ "

I'm unable to see anything that calls for your comment, "As shown above, the
quotes from the article itself were written with just as much unnecessary
ego."
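
For reference, the normalization described in that quoted comment (total score for an hour of the week divided by the number of posts in that hour) can be sketched roughly as below. This is a minimal Python illustration, not the R code from either author; the function name `mean_score_by_hour_of_week` and the sample tuples are made up for the example.

```python
from collections import defaultdict


def mean_score_by_hour_of_week(posts):
    """Average score per (weekday, hour) slot.

    posts: iterable of (weekday, hour, score) tuples, weekday 0=Monday..6=Sunday.
    This tuple layout is an assumption for the sketch, not the real dataset schema.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for weekday, hour, score in posts:
        key = (weekday, hour)
        totals[key] += score   # total score posted in this slot
        counts[key] += 1       # number of posts in this slot
    # Dividing by the post count removes the "busy hours just have more posts" effect.
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical sample: two posts on Monday 09:00, one on Wednesday 14:00.
sample_posts = [(0, 9, 10), (0, 9, 30), (2, 14, 5)]
normalized = mean_score_by_hour_of_week(sample_posts)
```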

~~~
davesque
Yeah, I actually read that quote twice to be sure I wasn't missing anything
and didn't detect any of the alleged ego. I had trouble taking the article
seriously after that.

------
IanDrake
Did I miss the part of this article where Mr. Woolf contacted Mr. Leybzon and
asked him to properly attribute the original code?

Do we really need more public shaming without at least a civil discussion
between the belligerents? Remember Adria Richards anyone?

~~~
IkmoIkmo
I sort of agree, I hate public shaming and I could do without. But this isn't
a mistake you just resolve over the mail.

For example, I've used toooons of open-source code in my life, lots of
snippets, and I've not attributed every single thing although I usually do.
When I copy a pretty standard implementation of a sorting algorithm that is
used in a minor way to present the final results of some fancy code I
completely wrote myself, sure, contact me and ask me to attribute properly and
I will. My intention is to build something new, and use some code here and
there by others instead of reinventing the wheel, and sometimes you slip up
and don't track your attributions properly.

But in this case, nothing new seems to have been created, it seems to have
been almost entirely copied, variable names changed specifically for no real
reason (it's one page of code, on a big project I understand changing
variables to suit your project structure but on a one-pager...), and the work
is even referenced as being inferior (by pointing to a 2 year old version,
saying that is inferior, then copying the 1 month old version that already
solved the issues you're pointing out on the 2 year old work, and not
attributing anything). At that point, the notion someone is a cool dude
looking to build cool stuff and made an honest mistake goes out the window,
and the chances you're dealing with a guy who inflates his own ego while
dissing an author he's copying and goes out of his way to muddy the evidence
he's copying, pretty high. In this particular case I can really see where OP
is coming from in not resolving this by email.

------
cjensen
On an unrelated note, some comments on the visualization...

By visualizing the temporal position of posts matching a particular criterion,
I suspect the author has accidentally just visualized the temporal position of
all posts. There is no correlation between the criterion and the visualization.

It's a variant of the heatmap visualization error explained by xkcd[1]

[1] [https://xkcd.com/1138/](https://xkcd.com/1138/)

~~~
minimaxir
That is addressed with the discussion of the 3,000 threshold.

------
hharnisch
Just to throw another perspective in here. I once wrote a tool for a teacher
to detect copied code for CS assignments for some extra credit. Every step you
take to dig deeper just led to more similarity -- things like normalizing
variable names. At one point it looked like 80% of the class had turned in the
same assignment. Granted these were toy problems.
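
A rough idea of what "normalizing variable names" does to a comparison, as a minimal Python sketch. This is not the tool described above, just an illustration under assumptions: the submissions are Python source, every non-keyword name is replaced by the placeholder `ID`, and `difflib.SequenceMatcher` stands in for whatever similarity measure a real checker would use. The helper names and sample submissions are hypothetical.

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry only layout or comments, not program structure.
_SKIPPED = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def normalize_identifiers(source):
    """Return the token stream with every non-keyword name replaced by 'ID'."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")  # variable and function names all look alike now
        else:
            tokens.append(tok.string)
    return tokens


def similarity(source_a, source_b):
    """Ratio in [0, 1] between the two normalized token streams."""
    matcher = difflib.SequenceMatcher(
        None, normalize_identifiers(source_a), normalize_identifiers(source_b))
    return matcher.ratio()


# Two hypothetical submissions that differ only in names and a comment.
submission_a = "total = 0\nfor x in items:\n    total += x\n"
submission_b = "acc = 0\nfor value in data:\n    acc += value  # sum\n"
score = similarity(submission_a, submission_b)
```

On toy problems like this one, independently written solutions can normalize to the same token sequence, which is why the similarity numbers climb so quickly.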

The heatmap layout you're saying people copied has been done over and over
again by almost everyone who's done data visualization on time series data.
Isn't it possible you looked at the same StackOverflow post?

Even if they were copying you I'd be ecstatic - you've inspired a bunch of
people to pick up a dataset and go play around with it. That's a wonderful
thing, because the world needs more people doing this.

------
aakilfernandes
1) The Medium author fucked up and should have credited you. Even if not
legally required, it's the decent thing to do.

2) Complaining about your analysis being 'insulted' comes off as petty.
Ideas/analysis aren't sacred and they should be insulted if someone disagrees
with them.

~~~
minimaxir
> _2) Complaining about your analysis being 'insulted' comes off as petty.
> Ideas/analysis aren't sacred and they should be insulted if someone
> disagrees with them._

I have zero problem with criticizing my analysis with contradicting
information, but the _manner_ it comes across is relevant.

You can say something is wrong without calling it "a key mistake."

~~~
onli
It is pretty strange to me you see that as an insult. Maybe this is a language
thing? Claiming someone made a key mistake is nowhere near an insult in my
understanding of English.

I was also put off by that part of your article, even though I agree that he
should've credited you. But I could not discover any insult in any of your
citations.

------
lazzlazzlazz
Max Woolf is not the kind of person I would want to work with, and his
complaints are almost incomprehensible. He is very emotional over something
trivial, and his accusation that there was "ego" involved in this "incident"
has no basis.

I'm surprised and disappointed - Woolf's victim complex is truly something to
behold, and I hope it's not a sign of larger changes in our culture.

~~~
Gigablah
You're overreacting. There's no need to be hysterical over a silly blog post.
I hope you don't behave this way in your daily life.

There, I paraphrased your post and threw it back at you. How do you feel now?

------
lighthawk
It is tough to decide whether or not to call out someone when they blatantly
steal your code.

On one hand, you want to take the high road and not have to squeal. But on the
other hand, if you don't speak up, that person will just keep on stealing from
others, in theory.

Personally, I think the best thing to do is to call them out in a public but
quiet way that doesn't make it sound like you are offended, e.g. posting a
link to your work in the comments with a "Glad you could use my code: (link)".
That way they know that you know that they did it, and that anyone who looks
at the comments will see but maybe not make the connection that they stole it.
This way you get the point across without having to publicly beat them over
the head with it.

~~~
Nexxxeh
I think OP's actions were proportional to the offense. You slag someone (or
someone's work) off in public, and/or rip someone off publicly, you should
expect to be publicly flogged.

------
thieving_magpie
Think about the type of person motivated to do something like this. They
probably aren't happy, they probably lack self esteem. This is a way to
validate themselves. Not saying anything about this is right or moral, just
saying that maybe you don't put the full name of the guy out there for anyone
to blast. I really doubt this came from a malicious place.

edit: guess this is an unpopular opinion. I don't understand why HN lately is
so fond of public shaming, as seen in this thread.

------
falcolas
I have to be missing something here. Someone mind helping me out?

The code posted in the "infringing" article does not appear to be a copy of
the code in the OP's repository. Nor do the images (OP's are green, the
"infringing" article's are blue).

The OP's code also does not have the header in the file, which makes innocent
infringement a lot easier to do.

It really does look like the "infringing" author was inspired by the OP... but
there's nothing being stolen that I can see here.

------
iamwil
Gotta let this sort of stuff go when you license MIT.

~~~
lfender6445
I am with you. While attribution is nice, the time wasted on ranting and
fuming over the lack of credit feels a little whiny.

'For these reasons, all such projects are MIT-licensed, where anyone can
freely evaluate, transform, and edit the code.'

~~~
IkmoIkmo
Well don't stop there, the MIT license says some more:

'subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.'

And that copyright notice on the code written by OP says 'Copyright (c) 2015
Max Woolf' (OP). The Medium author didn't include that and posted his use of the code
in the article and on github without this copyright notice.

------
ikeboy
>This is off-topic, but wait, what? #1138 is a joke about subsets of
populations which look the same as the general population because there is no
statistically-significant difference between geographies. It’s not a
normalization joke. Normalizing per capita would still make the maps look
same.

Wait, wouldn't normalization make the map look uniform, and _not_ a population
map?

~~~
mrow84
Normalising "subscribers to Martha Stewart Living" and "consumers of furry
pornography" by population would give you per-capita values, showing you
places where those properties had high incidence per-person (to which you'd
have to apply significance tests). I'm pretty sure it is a normalisation joke.

------
michaelwww
This reminds me of the argument among comics about joke stealing. Stealing
code, ideas or jokes from working programmers or comics can negatively impact
the original author, sometimes even causing them to be accused of theft! For a
fascinating discussion about the psychology of a thief see Marc Maron's WTF
podcast interview with Carlos Mencia (part 1 and 2 - 2 is where it gets
interesting.) It's a pathological condition.

~~~
verisimilidude
A good friend of mine is a comic, and through him I've met other comics.
They'll often riff off of each other. Upon hearing something really funny,
you'll often hear, "Hey, that's good! Can I steal that?" This is totally
accepted and expected behavior among this group of comics. So stealing jokes,
per se, is not the problem.

The troublesome part is that there's not a good way for other people outside
this group of comics to join in and ask for permission. All they can do is
listen to the material live, and then kinda take it if they think they can get
away with it.

I'm not sure if this is a problem that needs to be solved, but it's easy to
see where these misunderstandings might arise. Everyone copies and steals.
Sometimes it's unclear or uncool to ask for permission.

This is all separate from the main article here, where I think it's clear the
copier is a huge asshole. I also need to listen to that podcast.

~~~
bbunix
Joke theft sucks (if you've ever tried to write standup material, it's hard).
"Can I steal that" is more akin to asking permission to use a joke than actual
theft. More here:
[https://en.wikipedia.org/wiki/Joke_theft](https://en.wikipedia.org/wiki/Joke_theft)

------
bachmeier
The author's original blog post doesn't mention Hadley Wickham at all. It
seems to be a judgement call when deciding whether to give attribution.

------
BinaryIdiot
Good read. This has made me realize I should probably reevaluate the license
of my open source work and instead of using MIT I think I may transition to
use Apache License 2.0 (which sounds like a better fit for the author...I
think; I am not a licensing expert).

~~~
mayoff
That won't stop people from copying your work without attribution. No license
can do that.

~~~
BinaryIdiot
> That won't stop people from copying your work without attribution. No
> license can do that.

I'm not sure what the point of your comment is. Of course you can't prevent
someone from copying your work without attribution especially if you're
publishing it openly.

Apache License 2.0 seems to require more inclusions in its usage over MIT
hence my thinking of switching to it but that doesn't mean people still won't.
It's just a license.

------
ohitsdom
The offending medium post has been taken down, and the "copy" author removed
the post from his facebook (or at least made the post private). Looks like he
may have deleted his reddit account too.

And from one comment on medium, the guy was also trying to get help covering
his tracks on Stack Overflow by cleaning up some code [0]. Seems like he
outsourced pretty much every part of this ripoff.

[0] [http://stackoverflow.com/questions/33263015/converting-
day-o...](http://stackoverflow.com/questions/33263015/converting-day-of-week-
from-integer-to-character-string-in-r)

------
slantedview
"But then I realized that whoever made this graphic made a key mistake"

This sort of conflation has happened with my work as well - where a deliberate
decision is taken by someone else as a "mistake" that must be fixed. Fine to
have a differing opinion, but the author is right that the insulting tone is
unnecessary.

Also unnecessary from the author's facebook post:

"I made a thing (again)!"

Wow, you made a thing! Again!? Good for you! This is so annoying, I can't help
it.

~~~
scott_s
I don't find the quoted claim to have an "insulting tone". It is perhaps
insensitive - being told in plain language that you made a mistake often
hurts, even if true - but not outright insulting.

------
wangii
It's a sad reality that in this world, marketing ability is far more important
than originality, unless the new idea is 10x better than the old ones.

------
dikaiosune
Why not use a BSD license? IIRC, that legally requires attribution.

Also, I think OP would have some legal recourse in that the MIT license didn't
accompany the code in the copycat Medium post -- doesn't MIT require that you
at least include the license in any modified or redistributed code?

~~~
jackmaney
> doesn't MIT require that you at least include the license in any modified or
> redistributed code?

Yep[1] (emphasis mine):

> Permission is hereby granted, free of charge, to any person obtaining a copy
> of this software and associated documentation files (the "Software"), to
> deal in the Software without restriction, including without limitation the
> rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> sell copies of the Software, and to permit persons to whom the Software is
> furnished to do so, _subject to the following conditions_ :

> _The above copyright notice and this permission notice shall be included in
> all copies or substantial portions of the Software._

[1]:
[http://choosealicense.com/licenses/mit/](http://choosealicense.com/licenses/mit/)

~~~
mintplant
And that copyright notice is:

> Copyright (c) 2015 Max Woolf

[https://github.com/minimaxir/reddit-
bigquery/blob/master/LIC...](https://github.com/minimaxir/reddit-
bigquery/blob/master/LICENSE)

So in this case, the MIT license would actually require attribution.

------
falsedan
The public calling out/shaming is a dick move. This belongs in a private
correspondence. If how your work is used is important to you, then consider if
releasing it as open-source suits your wishes.

Also, you will always have people violating your unwritten assumptions unless
you, y'know, tell them how you want them to use your stuff. Wishing people
weren't dicks -> avoiding a social problem.

I can't help but think of K. C. Green and his tumultuous relationship with one
of his early creations, dickbutt. He finally got sick of people stealing his
work and got real[0].

[0]
[https://twitter.com/kcgreenn/status/654044280286265345](https://twitter.com/kcgreenn/status/654044280286265345)

------
data_spy
While I think it'd be nice to be referenced as inspiration, what Max did was
probably done before and not that novel. These other authors added actual
analysis to the data compared to just plotting it.

------
jakejake
I agree it is a jerk move, but I also think it's a good lesson in choosing the
appropriate license for your work. MIT is clearly not the right license if you
feel that others should give you a courtesy credit. Regardless of how rude it
is to do so, the MIT license tells people that it's OK to use your work
without attribution. There are some CC licenses that would probably be more
appropriate.

~~~
ciupicri
As others have already mentioned several times, the MIT license requires
attribution.

~~~
jakejake
It requires the original license text to be included somewhere; nowhere does
it say you have to give a credit or thanks. The author says specifically in
his article that what the other person did was perfectly legit within the
terms of MIT, so I can only assume the license was there somewhere. I take the
author at his word since he's the one ranting about not getting credit.

------
herbig
His first blog post was also mostly lifted from here:

[https://community.smartthings.com/t/hack-the-amazon-dash-
but...](https://community.smartthings.com/t/hack-the-amazon-dash-button-to-
control-a-smartthings-switch/20427)

I found that by Googling the code snippet, which the original author actually
gives a shoutout for.

~~~
IkmoIkmo
He gives a shoutout, too, but less directly. (saying he was inspired by Ted
Benson's post here [0], which has the same code)

[0] [https://medium.com/@edwardbenson/how-i-hacked-
amazon-s-5-wif...](https://medium.com/@edwardbenson/how-i-hacked-
amazon-s-5-wifi-button-to-track-baby-data-794214b0bdd8#.bsbiwe6tk)

The kid is just really bad with appropriate attribution, but he does refer to
the people that inspire him loosely. The issue is that he's posted code in
various places without any direct reference, so anyone copying that code will
attribute him, rather than the original author. Which sucks because he's
literally copying large chunks of code 100%.

e.g.
[https://gist.github.com/theleybzon/4e721f17e5dfc642c738](https://gist.github.com/theleybzon/4e721f17e5dfc642c738)

is 100% similar to the source he copied it from (Benson).

[https://gist.github.com/eob/a8b5632f23e75b311df2](https://gist.github.com/eob/a8b5632f23e75b311df2)

He does refer to Benson in the article, but not the code or its repository.
Anyone looking at the code would now attribute him, not Benson, the 100%
original author. It's an issue but it's not out of malice or ego or anything
like that in this particular case, I think he's just messing up a lot rather
than trying to present other people's work as his own. (the OP's article does
have hints of that, though)

------
williamle8300
Ok

------
chejazi
People do it for the reputation. It seems the only solution is to have one's
identity linked across separate places on the Internet. Many are opposed to
that for reasons such as the negative exposure you're giving people in your
article.

~~~
CountSessine
In this case, the offender, Danny David Leybzon, used his meat-space name.

------
blainesch
I also only contribute to open source to get fake internet points. How dare
somebody take my fake internet points!

~~~
swalsh
Fake internet points can translate to real world points. A good blog post can
build someone's reputation, it's a sales tool. If you're looking for a better
job, or a consulting gig you may seriously consider trying to show your stuff
in blog form.

Someone "stealing" your tools without attribution hurts your internet
reputation which can have a real world impact.

~~~
solidpy
It doesn't hurt your internet reputation as much as it gives them fake
internet reputation. Which eventually will come back to bite them if they try
to use it.

~~~
s73v3r
It can hurt yours if people think you were the copier.

