Hello, I'm Vivek, founder/CEO of HackerRank. Our intention with this initiative is to take down plagiarized code snippets or solutions to company assessments. This was definitely an unintended consequence. We are looking into it ASAP and also going to do an RCA to ensure this doesn't happen again.
Thanks Vivek for responding. Speaking as one of the maintainers of SymPy (who received the DMCA takedown request by email) I would like to note some points:
1. If you do a web search for "HackerRank DMCA" you will find similar examples of this that have happened in the past (with exactly the same text).
2. The text of the request is not in any way designed to enable a reasonable response. No detail is given about what the infringing content is so there is no way to comply.
3. It is also clear from this and previous cases that whatever process is used to identify potential targets for DMCA yields false positives and no reasonable judgement seems to be applied in whether or not to issue a takedown request.
I also think that GitHub's processes are faulty here:
1. We were given one business day to reply. I replied within 2hrs to say that the request was obviously spam and then sought legal advice. The repo (and website) were shut down before we could get legal advice. One day is nowhere near enough time. The request came in on Thursday night (UK time) and in the UK both Friday and Monday are public holidays this weekend. I don't even know what one "business day" means in this situation.
2. GitHub gives some guidance for how to submit a counter claim but none of the information is applicable to a case like this where so little information is given that it is simultaneously impossible to comply with the request and impossible to dispute it. If GitHub makes it this easy to spam DMCA requests then they should also provide some guidance for how to deal with spam requests.
I wonder if the FSF or someone else can come up with a good template response to DMCA requests that just don't have enough information in them. If the request doesn't even explain how you could comply then it should be possible to respond to that in a blanket fashion.
Oscar, thank you for responding and I'm terribly sorry for the disruption this has caused. I should have taken this more seriously when it sprung up on HN the first time but for whatever reason we just fixed that problem and moved on. This is a serious one and we are working on a strategy that is sustainable.
Meanwhile, if we can make a monetary contribution for your effort, or if there is anything more we could do to promote the project, what would be the best way to do it? If you'd like to send me an email to discuss further, I'm available at vivek at hackerrank
If you don't want to utterly destroy your company's reputation and have every company that depends on Python or other critical open source infrastructure boycotting you, you need to stop. Just stop. Don't send out any DMCA notices at all, not unless you encounter a case of infringement so blatant that it is obvious to all. And then make the decision yourself, don't contract it out. The risk of damage to others and blowback to you is too great. You're risking the failure of your company if something like this happens another time after two high-profile cases; damage to your company could be that bad.
And here's the thing: if there's a match between the SymPy Docs and something you consider yours, the copying was quite likely in the reverse direction: the solution you claim copies something out of the SymPy documentation (which is legal, but it isn't yours).
> stop. Just stop. Don't send out any DMCA notices at all, not unless you encounter a case of infringement so blatant that it is obvious to all. And then make the decision yourself, don't contract it out.
?
Sounds great - did you make similar promises last time?
I'm looking at the page that was taken down[1] and it seems to be largely full of pretty basic code snippets demonstrating how to use the Solvers module.
Your DMCA request[2] includes the statements that say, in part, that Fair Use has been considered, that you own the copyright, and that the information in that statement is true and accurate.
Could you elaborate what on that page you consider to be your copyrighted content, and how you came to hold that copyright? Is it the code snippets, as per your comment? Did your company have an employee write that content?
When the DMCA takedown request was being drafted, did someone consider whether those snippets even qualify as an Original Work? I'm getting at whether they contain even a minimal level of creativity, as the US Supreme Court has said is a requirement for copyright[3].
Exactly, I have been sent on HackerRank by potential employers twice and every time I was given the usual coding challenge I had already seen on leetcode or similar.
Possible solutions to effect some justice and reclaim your image:
1. Fire WorthIT Solutions (no RCA necessary, do not automate or outsource DMCA, even if it's human based automation)
2. Apologise to SymPy
3. Discuss internally the cost-vs-benefit (more than monetary) of embarking on copyright policing as part of your company's future strategy
4. Blog results of #3
WRT #3, sustainability comes to mind - if dissemination of solutions is a serious threat to your current business model/implementation (something you might want to validate anyway) then consider adapting to mitigate it to the point of irrelevance. Perhaps you are already doing this, but DMCA policing is at best a distraction and at worst a PR nightmare and menace to FOSS, like today. It also doesn't scale well.
Because it is a horrifically flawed metric by which to hire engineers, and is teaching upcoming crops of engineers valueless skills to enter the industry
As someone who (last week) ok'ed a guy with 35 years experience to skip the first round of interviews, only to then watch him struggle to write syntactically-correct code in a language he's (allegedly) been using for longer than I've been alive, I disagree. HR isn't a be-all-end-all metric, and it certainly has its faults, but it's very useful as a first-round filter.
Did you give your guy an IDE that he would normally use? That your company normally uses? A real world problem to solve? Resources he would normally use? Resources people at your company use on a day to day basis?
Whiteboarding or using codepen isn't remotely close to evaluation for the actual job, and neither are data structure brainteasers. Curious what you and your candidate did.
You got one false positive and now want to use a system known for its false negatives.
If you're just using HackerRank as a first round filter to see if they can write syntactically-correct code in a language... you don't need HackerRank. Or to restate this, if your interview process can't filter out such a candidate without using HackerRank, your interview process itself is flawed.
Their interview process surely can handle "can this person code fizzbuzz", but it makes no sense to take time from one of your engineers to test this when it can be trivially handled by something like HackerRank.
I don't understand why people find this practice questionable, it saves time for both company and candidate.
But perhaps a different way of answering your question would be a more apt response?
Take this with a grain of salt, but in my experience of building organizations, it seems to me that in social systems, as opposed to engineering systems / code, the most powerful organizational forces are the second order and tertiary effects of the policies that are institutionally put in place. Not the first order effects.
The first order effect of using HR/LC easy questions might be that it weeds out people who really can't code in a language. And in that respect, using HR and LC in your interview process isn't a bad thing.
But the 2nd order effects of using HR/LC are often disastrous. Because once you start using HR/LC as your FizzBuzz question resource, you've made it easy for interviewers in your organization to start using HR/LC questions for more than just a FizzBuzz replacement. Maybe interviewers go to HR/LC for medium, hard questions. Maybe more?
HR/LC elevate what should be a small (though important!) part of the interview process to one of the most important, hardest, and most complex parts of the interview process. In most interviews I have witnessed in the last few years, this comes at the expense of other things that _should_ be more important parts of the interview process. And that's what is disastrous.
That's the issue with HackerRank and LeetCode. It is intentional from HR and LC's point of view. And it is an unignorable, indeed one of the most important, aspects of adopting a policy of using HR/LC in your organization.
I disagree. Kind of. There are many questions that are bullshit puzzle types but I do think HR and LC provide value in teaching when a data structure or algorithm is suitable for use. At the very least, it’s a far better means of understanding the flexible use that college textbooks don’t teach, while having test cases to see pitfalls of implementations. Having a community that shows ingenious ways of tackling problems is valuable too.
I agree that a good interview process should check that the applicants have their bases covered with algorithms and data structures. I agree that having a compendium of good questions to ask for this purpose is useful. And I agree that HackerRank and LC have compendiums of such questions.
However, I disagree that that makes HR or LC valuable or a positive in the interview process. Basic DS and Algo questions do not require the types, number, or complexity of questions one finds in HR or LC. Such questions are easy to come up with and test, and should be a notable, but minor part of the interview process.
HR and LC take what should be a minor (though important!) part of the interview process and blow it up to huge proportions, and thereby distort and harm the very process they're supposed to be helping. They do this in an attempt to present themselves as having a larger value add than they actually do.
fizzbuzz is the least of your concerns if you do these types of interviews. the problem is that you will be asked a question which requires several hours to produce an optimal solution, yet you are supposed to deliver this solution in under 5-10 minutes, and the only way to do that is to repeatedly grind the solution and memorize it and then quickly reproduce it in the interview. as such, the style of interviewing is broken, and optimizes for candidates willing to endlessly grind LC or HR, and memorize the solutions. in the end, everyone loses. coding interviews are there to exhaust the candidate so when the offer stage finally arrives, they take any offer. all it does is put everyone in a bad position.
and you are mistaken that it's up to the company to decide what happens. as an industry of professionals, it is our decision.
If you have a computer science degree, you are expected to know fundamental topics such as data structures and algorithms.
As a professional software engineer, you are expected to be proficient at those skills to an extent in which you can demonstrate them in practice.
The burden of verifying you acquired the skills associated with your education falls on the employer.
Fizzbuzz is trivial and should not be a problem for any developer at any level. Writing a binary search function should not be a problem either.
But I agree that the industry's obsession with competitive programming has gone too far in some cases, especially in cases where the skills being tested are not relevant for the job.
no, the burden falls on the college. that is the point, if the college is good, then you cannot graduate unless you mastered both theory and practice. and i am not talking about fizzbuzz, read my comment.
companies can keep a list of universities and decide to only screen those who do not have a degree from those universities. the list can be dynamically changed based on on-the-job feedback.
any university in western europe is pretty hard to graduate at without actually having these skills. so it is pointless to enforce coding interviews for those who graduated there. you can’t “buy” a degree there.
If you do that, you will be recruiting talent that is not necessarily good at programming and you will be excluding talent with exceptional programming skills.
If you build software, that sounds like a bad idea.
The most valuable companies in the world all converged into the same recruiting processes for a reason.
Thanks for showing up here and replying. I can understand the net being cast too wide for a real problem you are trying to solve. It does have real consequences for already under-resourced communities, though.
I appreciate the "fix-it-twice" attitude implied by the RCA promise (Root Cause Analysis for those who also had to look it up). Also, consider recognition and restitution for the unnecessary work you created for the NumFOCUS director, a NumFOCUS lawyer, and the SymPy maintainers.
A $25k donation to NumFOCUS would be a good start.
If you are willing to talk about what happened publicly, I'm starting a podcast/video series to discuss business and open-source. This would make an interesting conversation. Perhaps we can turn this into a positive to raise awareness and inspire better behavior in the industry? Ping me.
Great way to get $25k (and I'm with you, I don't mean that sarcastically or that you don't deserve it), the company needs to pay up to somewhat manage their reputation.
I'm sorry about triggering you. I can see your point. I did not mean to focus attention away from what happened and how to avoid it.
I am sincerely interested in seeing if the HackerRank leadership will reach out and discuss. I was not intending to promote anything I'm doing right now. I suspect they won't but I will talk to them respectfully if they do.
Let's promote the SymPy maintainers, though. Let's also promote NumFOCUS, because they do a valuable service to community-driven projects and could help SymPy respond to this sensibly. Let's also promote NumFOCUS because they efficiently and tirelessly work to help the projects they fiscally sponsor (like SymPy) -- providing legal support: https://numfocus.org/.
Perhaps something good can come out of this, still.
What are you doing about all the other bogus DMCA claims your company has sent (itself or via its external contractors)?
What are you doing to ensure this never happens again?
Why should we believe your answers, and how can we see concrete evidence that you've stopped doing this and made up for all the times you've done it in the past?
> What are you doing about all the other bogus DMCA claims you company has sent
Multiple commenters asked him this and he never replied.
Disappointing but not surprising. It's all just superficial hand waving. They are sorry but only for being caught.
Dear Vivek, thank you for the note. I am the original author of SymPy. Most people who work on SymPy, myself included, do it in our free time. So far this has cost many hours of my and many other people's free time to try to figure out what is going on and to get this resolved. I also personally reject your allegations against us.
I thought about this situation and how your company could make this right. If you would be willing to help improve the SymPy project, I think it would be very well received. If you are interested, please let us know! See Travis Oliphant's reply, you could for example donate some money to NumFOCUS. We are very good with using money that people or companies donate to fund development and we can really boost SymPy forward big time with a generous donation.
No you aren't. You are only sorry this made it to the front page of HN. Revoke the contract and stop with DMCA takedowns. Unless your company has invented brand new ways of solving common algorithms everything you are doing is most likely derivative work anyway.
This company hires fresh college graduates to regurgitate famous algo problems in different wordings and slight variations and then claim copyright. Just look at their problems.
Also they take pictures and videos of candidates while taking interviews. Without that permission they don’t allow the test.
They’re not part of the problem - they’re the problem.
Are you seriously claiming copyright over other people's solutions? You have *no* right, legally or ethically, to exercise any kind of control over "solutions to company assessments" under the DMCA.
At worst, you can try to exercise the controls in the EULA that your victims agreed to, but this sounds like abuse of the legal system.
The problem with the DMCA is that if the claimant disagrees with your counter notice, the content stays offline for at least enough time for a lawsuit to be filed over the contents of the post, if the claimant is actually willing to take the matter to court.
I don't think anyone will actually go through the expense of going to court over a code snippet on GitHub, unless they're particularly principled, wealthy, and retired, and I think HackerRank will definitely try to send in their legal team.
By just filing a counter notice, you're also doxing yourself to people who clearly have no interest in following the law.
GitHub seems to have processed the DMCA counter notice but the content is still offline. From that I gather that HackerRank filed a lawsuit or the counter notice was retracted somehow, or GitHub is violating the DMCA by keeping the content offline.
> The problem with the DMCA is that if the claimant disagrees with your counter notice, the content stays offline until a lawsuit is fought out over the contents of the post.
Not quite. The obligation of the service provider when they receive a counter-notice is to inform the person who filed the original notice, and give them 10-14 days to file a court order to block the content. If they don't get that court order in that time, the service provider must put the content back up or risk losing their DMCA safe harbor.
100% and that is ridiculous, it's going to be a cancer on the open source community. Basically patent troll level harassment, except you didn't need to go through any of the "work" or "effort" to acquire a patent, and you don't have to actually bring a suit in court for the damage to be done.
You can't copyright algorithmic code snippets. It's like trying to copyright math. You will have an insane number of false positives from every repository that has ever implemented a prefix sum or binary search. (Or anything autocompleted using GitHub Copilot or AlphaCode, which I believe is trained on competitive-programming-style submissions from Codeforces and AtCoder.)
The solution would be to build some sort of cheat detector and punish only the candidate. Don't take down and lose the goodwill of the entire programming community by abusing draconian laws.
This is a misunderstanding of the case, as rayiner’s comment in that thread attests. That code was not the only code that was copied, there were also tens of thousands of lines of interface code that were the same. The case rested on whether the interface code was copyrightable. Oracle used that smaller sample as evidence that Google had copied carelessly from copyrighted material.
I'm sorry. I should have clarified and I can't edit my original comment now. I meant plagiarized problem statements. Usually these sites have the entire problem statements from our website along with the solution (aka code snippets to solve.)
Most, if not all, of the problem statements won't even qualify for copyright; they are generic problems that people have been stating for decades or more and as such are extremely unlikely to be unique or original to HackerRank.
I am sure, however, that you have already assessed this and applied for the appropriate registration for these works that you are claiming to own the copyright to? Right?
A person driving 100 miles per hour down a residential street and crashing into something is an unintended consequence, but not unforeseen.
This may have been "unintended" but when you have a company file DMCA notices on your behalf without proper supervision (especially when they had filed DMCA notices like this against other open source libraries) it is not "unforeseen" for them to file a DMCA notice against a completely innocent project.
At the very minimum the open source library that was affected and their maintainers deserve monetary recompense for the time and stress that your actions have cost them.
It would actually probably count as a derivative work under US copyright law. (Don't misread me: I do not endorse, condone, or make excuses for US copyright law. I'm just describing it).
edit: Here's an example,
- "In so doing, the magistrate judge agreed with plaintiffs’ assertion that the solution manuals sold by defendant qualified as “derivative works” under the Copyright Act. As in Pavlica v. Behr and Addison-Wesley Publ’g Co. v. Brown, defendant’s manuals complemented plaintiffs’ 187 copyrighted textbooks, had no “independent economic value” and were “meaningless’ without the textbooks because they merely provided answers to questions posed in the textbooks."
IANAL but I believe the case is different for algorithmic solutions.
The solutions to a particular textbook are entirely useless outside the context of the textbook, but the solution to "efficiently find a substring within a string" is useful in day-to-day programming. The takedown isn't aimed at the result of "convert 700m to feet", but "describe the process of converting between metres and feet", which is plain ridiculous.
To ensure that they aren't liable to HackerRank for any infringing content, GitHub has to keep content down for at least 10 days after receiving a counter notification.
Do you plan to contact GitHub and give them a legally binding agreement to indemnify them for any infringement of HackerRank's copyright in the material they took down (I'm sure there is no such infringement, but by agreeing you won't sue GitHub for restoring the content, you effectively retract the notice and give them no reason to wait the 10 days)?
I think a sincere apology should include steps to urgently minimise the ongoing damage to the existing victims, not just analyse what went wrong to prevent hurting other victims.
It's not an either or. We need to do the takedown AND figure out a way where we can do things like randomizing questions but preserving the integrity of it to ensure a fair evaluation, etc.
Why do the takedown though? You’ll never get rid of solutions being online.
There are solutions to most questions on leetcode on GitHub as we speak.
I don’t get it. If your assessment is designed so that knowing the solution ruins it, then your assessment process is broken.
I’ve literally showed candidates the solution to leetcode questions and they still don’t get it. Trying to remove solutions seems like a poor use of effort imho.
Is your company's legal position seriously that you hold copyright on independently written solutions to your prompts? If so, I'd love to see you sued for this. This issue goes way beyond just one misfire.
Exactly. When you file a DMCA takedown notice you are asserting under penalty of perjury that you own the copyright. While few have suffered consequences for false notices, pissing enough people off might change that.
> figure out a way where we can do things like randomizing questions but preserving the integrity of it to ensure a fair evaluation, etc
How do any of these have anything to do with copyright infringement in the context of a DMCA takedown? Do you even own the copyright to the allegedly leaked solution?
literally everything you're assessing is programming 101. it's not something that you can or should be asserting any copyright on.
in fact I would challenge you to register the copyrights on your questions/answers and see exactly how far you get with that process before you even consider sending a single DMCA notice.
You’ve made a serious mistake somewhere in your reasoning. What you believe is the property of your relatively insignificant enterprise, the whole rest of the world knows damn well is not your property.
Besides, this is small potatoes. Why don’t you copyright Wikipedia, then DMCA Wikimedia, then put your copy behind a paywall?
Wait. Wait a minute. Just wait one little minute here...
Your company has issued a DMCA takedown notice for "plagiarized code snippets" against the library-in-question's own documentation? (The takedown was issued against docs.sympy.org, linked directly within SymPy's Git repo README.md: https://github.com/sympy/sympy)
And this is an "initiative" your organization has undertaken? Have you thought this through?
Where should engineers who want to learn about sympy go to attain that knowledge?
Where did your team learn about and generate the assessment questions?
Where do you stop with this initiative? Would you issue takedown notices to MDN? Stack Overflow? What about VSCode Copilot?
> Our intention with this initiative is to takedown plagiarized code snippets or solutions to company assessments.
How can this be legal? Plagiarized is moreover a very conveniently vague word.
By this very vague definition the overwhelming majority of your own content has been seen on various parts of the internet way before you ever started your business.
There can't be a copyright on such a thing and I hope someone fights you in court, this is nothing short of bullying.
I hope that all of your snippets and solutions continue to be spread, as coding interviews need to disappear and are the ultimate evil our industry has been dealing with now for several years. Besides, it's not your property anyway, it's the property of those taking the coding interviews.
Allow me to strongly suggest you stop DMCAing things without a human-with-a-clue in the loop before firing.
I get it, I'm an automate-first type of person, too. But it appears your outfit has made a habit of doing this, and you're messing with innocent people for no good reason. Doing that in a `for` loop is irresponsible as hell, and frankly you deserve the bad press you're getting. I highly recommend you reconsider.
Legacy media companies do this kind of thing, too. I hope you're more interested in your reputation than they are.
> is to takedown plagiarized code snippets or solutions to company assessments
So which is it? Plagiarized or solutions? Because unless your company actually owns the copyright for the code in question, you have no right to it.
If a Google assessment asks “how many manhole covers are there in New York” and someone decides to create a repo with “Manhole covers in New York: 23,000” there is no copyright claim.
Do HackerRank users affirmatively sign a non-disclosure agreement? I checked your terms and there is not an apparent non-disclosure agreement. If not, there isn’t any legal recourse if users decide to talk about or solve your problems off the platform.
Educational and not for profit use is often protected by Fair Use. I didn’t see the repo in question so I can’t comment on if the use of the material meets the Fair Use criteria.
However, if people are sharing their answers to your questions in a repo, especially for educational or non-profit value, there isn’t much you can legally do about it. For example, if I take the GMAT and I discuss a question on Twitter, that’s allowed by Fair Use. If I offer to sell a PDF of the entire test, that would not be protected.
If someone said “hey HackerRank’s Ruby test asked a question about finding the smallest value in an array” and then they posted a solution, that’s absolutely not violating a HackerRank copyright. People even have the right to describe their experience on HackerRank in great detail.
From your own policy: “Content that You own and post on or through HackerRank belongs to You..”
So if someone wants to post their solution to a question, your own terms allow that. It’s their solution, it belongs to the user.[1]
If someone else “plagiarizes” another user’s solution, HR doesn’t have standing for a DMCA takedown. The user that created the content does, but not HR.
However even if HackerRank did own the solutions, there is a high likelihood that Fair Use would be in effect.
I have a strong dislike of HackerRank type companies because it’s pretty rare that those ridiculously academic assessments predict real world performance. They are a screening tool for the lazy. A well designed, relevant code test relating to the work of the company or a solid technical interview is much more valuable in my experience. When I was interviewing with companies, I immediately passed on the job if HackerRank was part of the hiring process. I ended up at a FAANG for over 5 years, so it’s pretty clear that anti-HackerRank bias among potential employees like me has a detrimental effect on recruiting.
But it's working. Here is a DMCA the author tried to fight, saying HackerRank doesn't publish their solutions, and the solution is the sole work of the author:
Sorry, everyone!
EDIT: update here https://news.ycombinator.com/item?id=31092085