YouTube-dl is now part of GitHub/dmca.git (github.com/github)
1098 points by madars 9 months ago | 315 comments

Heh, I didn't expect to get much attention for this. I thought it would be funny to push a merge commit between the two repos' latest commits. As a result, the git history is accessible from the dmca repo if you know the commit hashes. Since I didn't rebase, all the commit hashes were preserved, along with their signatures. Another fun discovery is that deleting my fork of github/dmca didn't affect the PR like I thought it would, so it seems a mirror of youtube-dl's commits is stuck in the ether until GH deletes my PR and garbage collects the repo.
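To illustrate why a merge keeps the old hashes while a rebase would not, here is a minimal sketch of a content-addressed commit graph. It is a toy model, not git's real object format: `commit_id`, `add_commit`, `reachable` and the commit messages are all hypothetical names made up for this example. The only assumption carried over from git is that a commit's id depends on its content and its parent ids.

```python
import hashlib

def commit_id(message, parents):
    """Hash a commit from its message and parent ids (toy stand-in for git's object hash)."""
    payload = message + "\n" + "\n".join(parents)
    return hashlib.sha1(payload.encode()).hexdigest()

def add_commit(store, message, parents=()):
    """Store a commit and return its id."""
    cid = commit_id(message, list(parents))
    store[cid] = {"message": message, "parents": list(parents)}
    return cid

def reachable(store, tip):
    """Return every commit id reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(store[cid]["parents"])
    return seen

store = {}
# Two unrelated histories (hypothetical stand-ins for the two repos).
dmca_root = add_commit(store, "dmca: init")
dmca_tip = add_commit(store, "dmca: add notice", [dmca_root])
ytdl_root = add_commit(store, "ytdl: init")
ytdl_tip = add_commit(store, "ytdl: release", [ytdl_root])

# A merge commit only records both tips as parents; no existing commit is rewritten.
merge = add_commit(store, "merge unrelated histories", [dmca_tip, ytdl_tip])
history = reachable(store, merge)

# A rebase-style replay gives the same change a new parent, so its id changes.
replayed_root = add_commit(store, "ytdl: init", [dmca_tip])
```

In this model every original id is still reachable from the merge commit, whereas the replayed commit gets a different id. Anything that refers to the old id, such as a signature over the commit object, would no longer apply to the replayed one.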

This is hilarious, well done.

I realized something while :+1:-ing your PR: I was thinking about how Digg deleted my account over posting the AACS key, and I really couldn't care less if Microsoft deleted my account over this.

Very interesting considering that even just 2 years ago I never would have done this for fear of my account being deleted. All of my work and personal projects have moved to GitLab (the CI/Kubernetes/etc. integrations are just too good to pass up).

I know a sample size of 1 has an effectively 100% error rate, but I think Microsoft is losing mindshare with GitHub. Stuff like this doesn't help. I could see a small company like GitLab needing to toe the DMCA line, but Microsoft has the deep pockets and could have built some major community goodwill here by handling this better. Unfortunate that they didn't.

Anyway, fun hack, I wonder how long it will last, or will they merge it? It must be the most approved PR in GitHub history at this point!

Microsoft is losing the good will they've worked so hard to rebuild.

Microsoft employees, managers, Satya: do you see what you're doing? I love the direction you've taken over the past five years, but bowing to the RIAA and attempting to disrupt an innocuous tool are horrible decisions. It's a chilling note, and everyone involved in tech can hear it.

Are you telling us Github is not a safe place to develop software anymore? Because I'll believe it and go elsewhere.

Siding with the RIAA and bringing engineers' Github accounts into compliance will tell us who your real customers and allies are.

Don't side with regressive legal trolls that bring zero benefit to the world.

IANAL, but as far as I recall, a site receiving a DMCA notice must immediately remove the stated content. The person who put the content up may then file a counter-notice, which, if uncontested, may then be followed by reinstating the content. If contested, the process moves into the courtroom.

This is not a DMCA notice; a DMCA notice requires the identification of claimed copyrighted content, and this document does no such thing.

The document claims that youtube-dl is illegal for being able to download video and audio from YouTube but not for hosting copyrighted material in the repo or binary distributions. The RIAA is claiming youtube-dl is "circumventing DRM".

They do actually mention 3 specific copyrighted songs used as examples in the tests. Seems to me that was an oversight, and they should have only used open videos as tests in the repo. Similar to how Kodi can't use copyrighted material in app store screenshots or videos.

The tests are testing that youtube-dl can successfully see through three specific varieties of obfuscation as used by those videos. Testing open videos would not work, since Google doesn't obfuscate those in the same way. Uploading special videos with the right configuration wouldn't work either, since those particular obfuscation methods are only available to specific YouTube partners, not to the youtube-dl authors.

Isn't that evidence for the case though? youtube-dl has code to see through obfuscation methods which are only available to specific YouTube partners. If you're a tool for downloading open, not copyrighted videos, why do you need that?

youtube-dl is not, strictly speaking, a downloading tool. It's an access tool. You want to access YouTube videos through an alternate client, as is your legal right in the European Union? youtube-dl is a useful library for the job.

You can use youtube-dl to download videos, but the intended use here is clearly for watching them. Just because syringe needles are tested on (and advertised for) human skin and are also tested with poisonous chemicals, that doesn't mean you should attack syringe needle manufacturers for making murder weapons.

You're trying very hard to make a technical distinction but the problem is that this is a legal problem not a technical one and rightly or wrongly, legal definitions don't always align with technical ones.

For starters, intent is factored into the law. There's several different classifications of murder depending on intent. Likewise someone carrying a kitchen knife home, still packaged, from the shops is unlikely to be reprimanded compared to someone carrying a more decorative knife around. They're both knives but one instance clearly carries a different intent to another.

The problem with youtube-dl is that having tests which work against copyrighted content and having the README describe usages against copyrighted content, it's much harder to argue that the intent of this is purely for copyleft content. Your point about this being an access tool (which actually makes no difference in terms of circumventing DRM anyway -- which was the claim for the take down) also doesn't fly because this tool creates a file on your local disk so it's hard to argue that the intent is for that file to be temporary.

I'm not saying I agree with the take down notice (I don't) but something a great many techies on here miss is that not every argument can be won with science.

> ...intent is factored into the law.

What gets lost or missed is the underlying intent of the protocols and tools being used to share content. An http server serves files independent of the client/user-agent–that's how the web works. If a work is published this way then that's the expectation. If YouTube and the RIAA want it to work another way, then use a different protocol/medium and put the content behind a login and limit access.

I'm not saying people should be free then to republish/share copyrighted works. Just that we are free to use tools to retrieve files that have been served openly via the web.
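As a rough illustration of the client-agnostic point, the sketch below builds (but never sends) two HTTP GET requests for the same resource that differ only in the `User-Agent` header they announce. The URL and both client names are hypothetical. A real server is free to inspect that header and respond differently, so this only shows that the requests are otherwise the same.

```python
from urllib.request import Request

def build_request(url, user_agent):
    """Build (but do not send) a GET request that identifies itself with the given User-Agent."""
    return Request(url, headers={"User-Agent": user_agent}, method="GET")

# Hypothetical URL and client names, for illustration only.
url = "https://example.com/video.webm"
browser_req = build_request(url, "Mozilla/5.0 (X11; Linux x86_64)")
tool_req = build_request(url, "my-fetch-tool/0.1")

# Same method, same target; only the self-reported client name differs.
same_target = (browser_req.full_url == tool_req.full_url
               and browser_req.get_method() == tool_req.get_method())
```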

Right, but even when offering free resources to the public you still maintain the manner in which they are accessed and distributed.

It's not as clear as, let's say, access to a public park on private land. But the idea would carry weight that YouTube may control the manner in which their publicly accessible website may be accessed.

Let's say you can access via the YouTube app, which does not require an account, and now you reverse engineer that app to bypass the app altogether. It's a bit like cutting a hole in the fence around the playground.

I'm not a fan of what happened; it's just clear to me why it happened.

This was indeed bound to happen. The discussions we have are what's needed to decide how the internet/web moves forward. And there's a bit of cake eating and having it too from both sides.

There's an expectation from the publishing/serving side of the equation that the content is being served to a proprietary app or a web browser that works a particular way. With the browser being one of Chrome, Firefox, etc., along with Google's YouTube apps.

On the user side, especially among those who understand how the content is served, the expectation is that the browser is not the only abstraction allowed. Google themselves run bots to scour the internet, employing all sorts of tricks to access and index content. There's a fundamental way in which the HTTP protocol works and serves its content that is client agnostic. Everyone, Google Search most of all, has benefited from this.

Making tools other than browsers illegal will fundamentally break the internet in my opinion.

> An http server serves files independent of the client/user-agent–that's how the web works. If a work is published this way then that's the expectation. If YouTube and the RIAA want it to work another way, then use a different protocol/medium and put the content behind a login and limit access.

They don't use HTTP. HTTP is used to bootstrap the player, not to feed the content. The content itself is served over another protocol such as RTMP and that protocol is a streaming protocol (it sends chunked data) and it wasn't intended for downloading files and writing them to disk as a singular binary blob. Obviously it can be used that way but it's fair to say services like youtube-dl are using the protocol in ways it wasn't originally intended to be used rather than content owners serving content on a protocol that was always designed for distributing files.

It's a bit like recording something on VCR from an RF signal; there's nothing technically stopping you from doing that as recording a TV show is technically equivalent to watching it. But equally you can't blame TV networks for using RF to air their broadcasts knowing that risk is there.

The problem is any delivery system you can dream up for enabling consumers to view a recording will have some unintended method for copying said content. Even if it is as low tech as someone physically sat in a cinema with a handheld camcorder (how many movies have been leaked online that way?!)

This is why I keep coming back to the point that you can't use science to argue a legal issue; they ultimately serve different purposes. Science can prove something can be possible, the law is there to argue if something should be allowed to happen (putting aside for one moment the variety of differing opinions about morality et al). So if you have an issue with the youtube-dl take down then you need to treat it as a legal problem rather than a misunderstanding of a technical solution.

> They don't use HTTP. HTTP is used to bootstrap the player, not to feed the content. The content itself is served over another protocol such as RTMP and that protocol is a streaming protocol (it sends chunked data) and it wasn't intended for downloading files and writing them to disk as a singular binary blob. Obviously it can be used that way but it's fair to say services like youtube-dl are using the protocol in ways it wasn't originally intended to be used rather than content owners serving content on a protocol that was always designed for distributing files.

Not true at all, most YouTube videos are offered as plain webm files.

Also, keep in mind that recording TV's is legal.

> Not true at all, most YouTube videos are offered as plain webm files.

It's been a while since I've written a video streaming scraper but it used to be quite common for a file to be served over HTTP but that file was a small "shortcut" type file to an RTMP stream. So a webm file wasn't the content itself but instead a pointer to where to stream the content from.

I'd imagine the same would still be true for YouTube since, like most other video streaming services, YouTube can adaptively switch bitrate depending on the bandwidth available to the end user. That seamless switching can't be done with a HTTP GET of a singular video file but it can happen effectively with a chunked streaming protocol.
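The per-chunk bitrate switching described above can be sketched as a simple selection rule: before each chunk, pick the highest available bitrate that fits the currently measured bandwidth. This is a generic illustration, not YouTube's actual algorithm. The function name `pick_variant`, the bitrate ladder and the bandwidth samples are all made up.

```python
def pick_variant(variants_bps, bandwidth_bps):
    """Pick the highest bitrate that fits the measured bandwidth, else the lowest available."""
    fitting = [v for v in variants_bps if v <= bandwidth_bps]
    return max(fitting) if fitting else min(variants_bps)

# Hypothetical bitrate ladder (bits per second) and one bandwidth sample per chunk.
ladder = [400_000, 1_200_000, 3_000_000]
samples = [3_500_000, 1_000_000, 250_000, 2_000_000]

# The quality can change from one chunk to the next as bandwidth changes.
choices = [pick_variant(ladder, bw) for bw in samples]
# → [3000000, 400000, 400000, 1200000]
```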

> Also, keep in mind that recording TV's is legal.

Yes but with caveats, depending on the country.

Though it is worth noting the only reason America and UK law is so relaxed regarding VCR usage is because corporations making video recorders were taken to court by film studios and won their case. So once again it comes down to presenting a legal argument rather than a technical one.

This hasn't been the case in a long time; most streaming sites no longer do RTMP except in specialised cases, for ease of scaling. They're mostly HLS or equivalent now.

Ah yes, I'd forgotten about HLS (doh!). But even there HLS, while based on HTTP, is still very different to the kind of GET requests the GP (or however many posts down it was now) suggested when they talked about downloading a file.

HLS is not about downloading a file, it's about downloading chunked data. It wasn't intended (though it can't be prevented) that the chunks would be used to recreate a video in full, unlike with a stream of bytes from an HTTP GET, which are very much intended to be recreated in full at the receiver's end.

HLS really only uses HTTP transport as headers to circumvent many firewalls (and in fact you can do this with RTMP too, eg RTMPT) but aside from that it's a completely different beast to GET.
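To make the "chunked data" point concrete, here is a minimal sketch that reads an HLS media playlist and lists the segments it refers to. The playlist tags (`#EXTM3U`, `#EXTINF`, `#EXT-X-ENDLIST`) are standard HLS, but the playlist itself, the segment names and the base URL are invented for the example, and nothing is fetched over the network.

```python
from urllib.parse import urljoin

# A made-up media playlist; the tags are standard HLS, the URIs are hypothetical.
PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
seg0.ts
#EXTINF:6.0,
seg1.ts
#EXTINF:4.5,
seg2.ts
#EXT-X-ENDLIST
"""

def segment_urls(playlist_text, base_url):
    """Return absolute URLs of the media segments, in playback order."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # non-tag lines are segment URIs
            urls.append(urljoin(base_url, line))
    return urls

def total_duration(playlist_text):
    """Sum the #EXTINF durations, in seconds."""
    total = 0.0
    for line in playlist_text.splitlines():
        if line.startswith("#EXTINF:"):
            total += float(line[len("#EXTINF:"):].split(",")[0])
    return total

urls = segment_urls(PLAYLIST, "https://example.com/stream/index.m3u8")
```

A player would request each of these segment URLs in order as playback progresses. Whether the result is kept or discarded afterwards is up to the client.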

>> not every argument can be won with science.

Is this even a science argument though? The law and its enforcement may occasionally use science to illustrate specific cases (the implementation) but it's more the documentation of a giant, waterfall-based architecture project. Why are we surprised when we see specific cases that seem like failures despite agreeing with the fundamental premise?

Come off it, the other part of the name is "dl"

I use YouTube-dl as much as the next guy for lawful reasons but these HN comments saying it's not for YouTube and it's not for downloading is silly. It's literally in the name.

Please remember that to reverse this YouTube-dl needs to go through a court. Courts are not code that can be tricked by clever wording. If anything you'll be trying to describe what it is to some old grandma who hasn't got a clue what GitHub is never mind convincing them this software isn't for downloading YouTube videos.

By all means everyone go on with coming up with clever interpretations of the law but you're just farting into the wind

P.s downvoting my comments doesn't change the way the law works either, but if it helps you feel good have at it :)

> … these HN comments saying it's not for YouTube and it's not for downloading is silly. It's literally in the name.

Regardless of the name, the tool supports over a thousand different sites, not just YouTube, and is routinely used for streaming rather than downloading, with no permanent copy saved. If you run "mpv https://youtu.be/WhWc3b3KhnY" to simply play the Blender Open Movie "Spring" it relies on youtube-dl behind the scenes to stream the video. (Though of course the difference between streaming and downloading is a trivial one; getting the content to the end user's device is the hard part.)

YouTube itself is the entity responsible for duplicating and distributing the content, and they have a license to do so, ergo there is no copyright violation here. The most any user of youtube-dl might be liable for, assuming they don't save a permanent copy or further redistribute the data, would be a violation of YouTube's TOS. Which is no concern of the RIAA. To call youtube-dl a "circumvention tool" is laughable; obfuscation of a video's URL is not DRM.

> You want to access YouTube videos through an alternate client, as is your legal right in the European Union?

Do you have a source on this? I can't find anything about this through a quick Google search, but I'd love for this to be true.

To the best of my knowledge (and IANAL) a good example of this is the European Court of Justice Case C-355/12 "Nintendo v. PC Box", where Europe's highest court ruled that DRM must respect the principle of proportionality and that circumvention of technical protection is only illegal if done for unlawful purposes. If DRM prevents lawful purposes, then the DRM is not proportional to consumer rights (or those of other corporations). Whether circumventing DRM is illegal is a question of the specific case and must take into account the purpose of the tech used to do it and what people actually use it for.

This, at the core, means it is very hard to argue that some library written to circumvent DRM is "illegal tech" and have it taken down in this manner, because the DRM could be inappropriate and it is not the copyright owner who has to decide that.

The ECJ was asked by a Milan court for a preliminary ruling, so it gave instructions on how the Milan court should handle the matter and how the law is to be interpreted. The base case was about a mod chip sold by PC Box, who argued as defendant that circumventing Nintendo's DRM for the purpose of playing homebrew was OK and that Nintendo preventing that is inappropriate.

In the general case this was a huge win, because "circumventing DRM is illegal" is only true with a big IF, not as a blanket statement. And from that it follows that usage of alternative clients is well within the consumer's rights, but again, I am not a lawyer.

However, note that the Milan court then ruled in its case 12508/2015 that this particular mod chip is illegal. Nintendo gave a lot of evidence about the advantages of their DRM in terms of cost, ease of use, security, etc., comparing it to inferior implementations of their choice that would fail to protect the copyright holders' interests, as well as evidence of usage of the mod chip for piracy. On the other side, PC Box defaulted, filing no evidence showing that their users are a vibrant community of homebrew gamers and tech heads who circumvent DRM for purposes well within their rights, like running self-written software on the hardware. The Milan court also argued that the defendant has a burden of proof to show that a more proportionate DRM method was possible, which I strongly disagree with, and which seems to follow Nintendo's argument that their solution is appropriate even if more restrictive than strictly necessary. Note that Milan does not speak for the EU.

The EU directive is a bit different to the implementations I've seen (which are stronger), but you still have a right to create such an alternate client (if it's sufficiently different in its expression, which youtube-dl clearly is):

https://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:... article 6(1):

> 1. The authorisation of the rightholder shall not be required where reproduction of the code and translation of its form within the meaning of points (a) and (b) of Article 4(1) are indispensable to obtain the information necessary to achieve the interoperability of an independently created computer program with other programs, provided that the following conditions are met:

(a) you're allowed to use YouTube, (b) YouTube is undocumented, (c) this only applies to relevant parts of the code.

> 2. The provisions of paragraph 1 shall not permit the information obtained through its application:

to be (a) used for other stuff, (b) distributed, (c) used for cloning or copyright infringement.

> 3. In accordance with the provisions of the Berne Convention for the protection of Literary and Artistic Works, the provisions of this Article may not be interpreted in such a way as to allow its application to be used in a manner which unreasonably prejudices the rightholder's legitimate interests or conflicts with a normal exploitation of the computer program.

Creating youtube-dl isn't infringing on anyone's copyright, and the rightsholder here is Google, so it's allowed. There's wiggle room for arguing – it's not as cut and dry as most Big Bold Legal Statements I make, so iamnotalawyerandthisisnotlegaladvice – but I'm fairly sure this is sound.

In the UK, you have unequivocal rights to do this. https://www.legislation.gov.uk/ukpga/1988/48/section/50B, based on this directive, says:

> (3) In particular, the conditions in subsection (2) are not met if the lawful user—

> (a)has readily available to him the information necessary to achieve the permitted objective;

> (b)does not confine the decompiling to such acts as are necessary to achieve the permitted objective;

> (c)supplies the information obtained by the decompiling to any person to whom it is not necessary to supply it in order to achieve the permitted objective; or

> (d)uses the information to create a program which is substantially similar in its expression to the program decompiled or to do any act restricted by copyright.

The wording in (d), here, is clearer than the EU directive – unless youtube-dl's existence can somehow be shown to be a copyright violation (specifically, if its creation was an act restricted by copyright), it's permitted. Not sure whether this would help in an EU court, but if other countries' implementations have taken the obvious interpretation of the directive, then the other language versions of the directive are probably clear on the matter.

> youtube-dl is not, strictly speaking, a downloading tool. It's an access tool.

What you’re describing is a circumvention device. The DMCA explicitly outlaws these. 17 U.S. Code § 1201


So why can't we crowdsource some Youtube partners that are willing to allow their copyright to be used, while still utilizing the obfuscation?

Or are we talking about a sweetheart deal only available to specific partners? cough Vevo cough

That just proves that Youtube-dl is intended to circumvent copyright protection measures. Those obfuscation measures count as "effective" for the purposes of the DMCA. Trafficking in software to circumvent them is a felony. GitHub and Microsoft could find themselves criminally liable if they do not take it down.

Take down Chrome then; it's "intended to circumvent copyright protection measures", as you can clearly watch these YouTube videos with Chrome.

That's what confuses me. Isn't there a gray area here, where it's not illegal to download copyrighted material but it is illegal to reshare it, i.e. upload it?

So if that is true, then youtube-dl is not illegal and this DMCA is easily reverted.

RIAA is just trying to use their deep pockets against a nobody and Microsoft is basically saying "We side with you" knowing damn well their lawyers know this code is not illegal.

I would love to know the legal distinction between YouTube-dl and Chrome.

They are both obviously user agents used for accessing video content hosted on a public remote server using standard web protocols.

Then tell me how uBlock Origin is somehow Ok if YouTube-dl is not.

What is the legal distinction between DeCSS and the decryption code in the firmware of a DVD player?

One comes from the publisher and is intended to play back the content in an approved way.

The other is a reimplementation of the first, intended to circumvent the restrictions imposed by the first. That's illegal, per the DMCA. It's black-letter law in the US and elsewhere.

In the case of DVDs, the copyrighted code which decrypted DVDs was being licensed to DVD player companies. The code was not open source, and so stealing the key was itself copyright infringement, and more obviously circumvention of DRM.

In this case we have video which is streaming on the open web, which trivially provides for the user agent to download / cache a copy. The question is not a matter of whether that copy is being redistributed, but that the user agent can watch it in an ad-free space whenever they want. You could probably accomplish the same thing, or nearly so, with a browser plugin.

Which is why I make the comparison to uBlock. uBlock is similarly "playing back the content in an unapproved way" and perhaps you could say it is also a circumvention device.

The whole concept of "user agent" is that software on the user's machine -- that they control -- is rendering remote content in a form and fashion chosen by the user. The HTML provided by a remote server is not a legal contract for how that content must be displayed. It is a semantic description of the content, which the user agent can do with whatever it pleases for display to the user.

As long as no redistribution is occurring, the whole basis of the world-wide web is that a user agent can do whatever it wants to the content, including save a copy on the local machine.

So to me the most interesting question is exactly how youtube-dl becomes/became distinct from a user agent.

I missed the news story where the RIAA is going after chrome and the one where they filed a DMCA complaint against the uBlock Origin repo.

Anyone who read the justification in the original complaint would see how these are different, but it doesn't matter; the RIAA is not pursuing them so until they or someone else does it's irrelevant.

> Take down Chrome then; it's "intended to circumvent copyright protection measures", as you can clearly watch these YouTube videos with Chrome.

No-one out there thinks there's any problem watching them on Youtube, where royalties can be traced and paid.

streaming through your browser from the YT site is clearly an allowed and promoted use - it's explicitly in the T&S.

Regardless, not honoring a DMCA would mean GitHub is asserting that they think it's false and would need to defend their decision if RIAA were to sue them; it doesn't make sense to verify every single DMCA claim and increase liability by not honoring some.

This is likely a consideration. From my decidedly non-lawyer viewpoint the RIAA is using a terrible law, but it is still a law, and they have grounds under this law for their claims. GH and MS have deep pockets and if they tried to initially fight this I'm confident they would settle. If you're the RIAA, who would you rather go after: someone who will pay after a little bit of fighting, or a huge group who will immediately roll over but have nothing to give you?

DMCA covers "circumvention devices".

But the DMCA notice mechanism does not.

No, youtube-dl is illegal for including code to decrypt youtube video request signatures, which are scrambled specifically on most music videos, which a german court ruled is an effective technological protection measure.

>illegal for including code to decrypt youtube video request signatures

No it's not. In Switzerland, for example, it's legal to use tools to remove stuff like DRM, if you DON'T redistribute the decrypted stuff.

What a German court says is often not relevant outside Germany.

Switzerland, as well as almost every western country that ratified the relevant WIPO treaties, has broadly similar laws to the DMCA: https://www.admin.ch/opc/en/classified-compilation/19920251/...

Yes and? I just wrote that removing DRM is not illegal here, but distributing the de-drm material is. And tools to remove drm is also not illegal.

If you read the link you will notice that just like the DMCA, distributing tools capable of removing DRM is illegal in some cases.

>The ban on circumvention may not be enforced against those persons who undertake the circumvention exclusively for legally permitted uses.

And downloading a YouTube video is legal.

So is storing your DVDs in another format and ripping your CDs.

EDIT: BTW in the take-down notice you can read that the tool promotes or is/can be used to download Justin Timberlake (and that would be illegal if you redistribute it) so it's not that the code to decrypt is illegal (you know "hackertools" are illegal in germany too) but the potential intention of the tool.

"Exclusively" is the key word. Whether youtube-dl is designed to "exclusively" be used for legally permitted uses is a matter for a court to decide. The RIAA's claim listed several reasons why it believes this is not the case.

The DMCA as written is a little more nuanced, but on whole the laws serve the same purpose.


It's the user that is meant here, not the tool, and you know it.

It's legal to record radio, and to record/download YouTube videos, if you don't redistribute them (outside family and friends).

The distribution of youtube-dl itself is what's at issue, not the use of youtube-dl or the distribution of files downloaded with youtube-dl.

And why is that not an issue anymore? (Hint: it was):


The RIAA is specifically making the allegation that "the source code was designed and is marketed for the purpose of circumventing YouTube’s technological measures to enable unauthorized access to our member’s copyrighted works" [1], justified by the references to copyrighted works in the youtube-dl readme and source code. If it wasn't for that, it'd be much easier to argue that it's designed for non-infringing purposes only (or one of the other exceptions to the DMCA's TPM clauses), like you could plausibly argue for libdvdcss.

[1] https://github.com/github/dmca/blob/master/2020/10/2020-10-2...

Here's a video of a copyright lawyer agreeing with GP for 1h20min: https://youtu.be/wZITscblMBA

In short, what a german court says will absolutely influence what a US court says in the context of an international copyright treaty.

>german court says will absolutely influence what a US court

Not will but can. That's a difference... and I don't watch a 1h20min YT video about copyright when decrypting has nothing to do with copyright.

It's a video about the exact case in question, talks extensively about the decryption aspect and in general the circumvention of technological protection measures.

Watch it, don't watch it, it doesn't matter to anyone here, but please lower that horse back down to earth, it's a little high.

>but please lower that horse back down to earth, it's a little high

As if you're talking about yourself. Maybe you're a bit young, but we had the exact same problem in the past (decrypting DVDs); those are discs that store movies. And a Swiss court decided that it's legal to use dvcss (not sure about the name).

And you tell me to watch a YT from a US Lawyer.

If you'd bother watching the video you'd note that he specifically talks about the CSS DRM, and even the AACS encryption key.

So yeah, tone it down a bit.

>So yeah, tone it down a bit.

Speak for yourself please.

It is.

The DMCA makes circumventing "effective" protection measures illegal. "Effective" is a term defined in law and basically means any hoop you have to jump through to get to copyrighted material, no matter how flawed or trivially bypassed. The restrictions in place to allow YouTube videos to be only downloaded through YouTube count.

Youtube-dl is absolutely illegal software.

> Youtube-dl is absolutely illegal software.

Not in Italy

What US law says is often irrelevant outside of the US of A

No, the restrictions around circumventing effective technological protection measures are implemented in Italian law too [1] and more broadly are required to be implemented in every EU member state as a result of EU Directive 2001/29/EC [2]

[1] https://www.wipo.int/edocs/lexdocs/laws/en/it/it211en.pdf

[2] https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A...

That doesn't make youtube-dl "absolutely illegal".

You should know that "absolutely" means anytime, everywhere, which is obviously wrong.

You should also know that in Italy for something to be claimed illegal the court has to emit a verdict.

If nobody challenges the status of youtube-dl and obtains a guilty verdict, it is perfectly legal.

Last but not least, that law was created during fascism, so there's also that.

But in any case "fair use" (as you Americans call it) is always permitted in Italy; you don't need a license, and you are liable only if the entity holding the copyright obtains a restricting order from a judge.

So in Italy I can download any video I want from YouTube and I am permitted to watch them, It's not against the law.

What's against the law is the re-distribution of the content.

Moreover, you probably misread the law or skipped that passage: in Italy, internet publishing (such as file downloading) is a different right, called the "exclusive right of communication to the public of the work".

> it also includes the making available to the public of a work in such a way that members of the public may access it from a place and at a time individually chosen by them


You're getting down-voted because people don't agree with you, not because you're fundamentally wrong.

All the responses "not illegal in COUNTRY X"; how is this relevant? Is the RIAA using a law from <UTOPIA_COUNTRY> on a company based in <FANTASY_LAND>?

You can absolutely host this repo with a country & business that doesn't recognize or cooperate with US laws. Your choice is somewhat limited.

What was sent was not a DMCA notice in the standard sense; I encourage you to read it. It’s designed to look like one, but lacks critical elements (such as identifying an infringing URL).

The infringement described in their notice is hypothetical, not an actual instance of infringement, which is one of the required things in a DMCA notice.

GitHub can indeed leave the repo up, as this notice is insufficient to trigger the part of the DMCA that forces them to remove it.

More info from Parker Higgins in a tweet thread:


Correct. A company cannot not comply with a dmca takedown. When the law was written, the supposed safeguard against copyright holders filing tons of frivolous claims is that a counter claim can be pretty serious. Of course history has shown that this didn't work out as planned.

A company is under no obligation to comply with an invalid notice. Wikipedia ignores invalid DMCA requests all the time.

It is risky though, as the company opens itself up to a lot of liability if it's wrong.

Wikipedia has the advantage that they can deploy a banner informing people about stuff in an instant. From some googling, back in 2015 Wikipedia had half a billion monthly users; that is massive outreach capability.

Anyone daring to put up a fight with Wikipedia risks being flattened by outrage.

That's pretty much it though.

In all honesty, we need a GH alternative hosted outside the US, in any place where absurd laws like DMCA don't exist.

What we really need is a decentralised git forge using something like ActivityPub. We cannot keep relying on companies to host open content for us.

ActivityPub is federated like git is distributed. You can add several remotes from different servers (or peers in case of single user instances hosted at home)
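To make the "several remotes" point concrete, here is a minimal Python sketch that builds the git commands for registering a couple of extra remotes and pushing every branch to each. The host URLs and remote names are hypothetical placeholders; `git remote add` and `git push <remote> --all` are the standard commands. Nothing is executed unless you call `run_all` inside an existing clone.

```python
import subprocess

# Hypothetical hosts; substitute servers you actually control.
REMOTES = {
    "home": "ssh://git@home.example.org/me/project.git",
    "mirror": "https://git.example.net/me/project.git",
}


def mirror_commands(remotes):
    """Build the git invocations that register each remote and push all branches to it."""
    commands = []
    for name, url in remotes.items():
        commands.append(["git", "remote", "add", name, url])
        commands.append(["git", "push", name, "--all"])
    return commands


def run_all(commands, cwd="."):
    """Run the commands inside an existing clone; raises on the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_all(...) yourself to execute them.
    for argv in mirror_commands(REMOTES):
        print(" ".join(argv))
```

If one host disappears, the other remotes still hold the full history, which is the sense in which git itself is already distributed.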

Even though other sites like GitLab exist, the fact that GitHub remains a near-monopoly default doesn't bode well for attempting to migrate to a completely decentralized system as the default. There are decentralized extensions of git (like git-ssb).

Addendum: Git is also not well suited for a lot of decentralization due to how branches are modeled. Patch-theory based systems like darcs and pijul would need more adoption before code repositories could be decentralized.

I understand that but there is a difference between just git and the forge, which is GitHub in this case. If I fork a project on GitHub it has a fork relationship on the platform itself. This makes it very easy to interact with the original project. However, without an actual GitHub account I can't contribute back to the project at all.

If I have for example my own Gitlab server I have the ability to clone a project from GitHub and maintain my fork on my server. However, to contribute back, I will still need to bring my changes back to GitHub first.

What a federated implementation of such a "git hosting/collaboration platform" would allow me to do would be to fork a project from another instance to my self-hosted instance. It would benefit every open-source project that would be willing to host their own instance. I could then easily contribute back into the main project without needing an account on their instance.

It would also make it much harder to take down a project with a DMCA like this because forks of it would exist across instances, meaning claims would have to be sent to each maintainer of these instances. In this case, they were simply able to list all of the forks on GitHub and because GitHub is one website, all of them were taken down, seemingly without any further inspection of whether the claim covers these forks as well.

I realise that this is difficult to implement but I think mastodon is a great example of how such federation can work in practice.

What if... you kept github in the loop?

What if you could make a federated/decentralized git that doesn't care whether the git repo is made available by github or my-fancy-federated-git-host?

> What if... you kept github in the loop?

It is like wanting to keep Twitter in the loop when migrating to federated alternatives like the Fediverse. Twitter (or Github) won't federate in good faith and will actively attempt to capture as many users as possible.

> What if you could make a federated/decentralized git that doesn't care whether the git repo is made available by github or my-fancy-federated-git-host?

Git is distributed already; remotes work like that.

Yeah Twitter doesn't work because it's not simply hosting the posts that you interact with elsewhere.

Github is simply hosting the repos.

Git is decentralized already. That's the key difference between Twitter and Github, and why I made GAnarchy the way I did, and also why I make a point of hosting a GAnarchy instance on github pages and encouraging others to do the same. (Self-hosted is better of course, but.)

Yes, like https://forgefed.peers.community/ which is exactly that.

Probably a dumb question, but don't federated services eventually lead to centralization? for example SMTP is federated but most users are on gmail, IRC is decentralized but most users are on freenode server, etc. users tend to choose the services/instances with the most users, which is what makes github currently very valuable.

That could very well be argued. However, I still have the ability to run my own email servers and communicate with users who use gmail.

If gmail decided to ban a number of accounts due to something like a DMCA claim, I would be unaffected by that.

The same is not true on GitHub which is one Website. Every fork of this project on GitHub was affected by this claim...

Gmail killfiled a domain I was running on a small but reputable ISP I had used since 2001 (there were no spam or reputation issues in 17 years).

They did let one-in-20 or so emails through, but everything else got to gmail recipients’ Spam folder. I wasn’t able to troubleshoot this with Google tools, and there’s no one to talk to at google.

(Worst thing, you get no feedback - except realizing a month later that someone didn’t get an email you sent)

I gave up and replaced small ISP with fastmail.

My bottom line is that, no, you can’t really run your own SMTP server anymore unless google, Microsoft and fastmail let you, by virtue of hosting 90% of your recipients.

Sure, something like that can happen... but it is not like they have an explicit "allow-list" of domains that mail can be received from. These might be the biggest providers but if me and my friend who do not use their services want to communicate, we will always be able to do so without them.

I have many domains that I use to send and receive mails, and I have personally never had these issues.

Now if I did have a gmail account, I suppose I would have to check the spam folder regularly. But I don't have that problem with my server, it affects their users more so than me.

In the current world, you host your code on github and you can only submit PRs to github with a github account. Anyone without a github account cannot submit a PR. Github hosts all the code.

In a federated world, while github may still host all the code, you could potentially submit PRs from multiple different providers, and have metadata about a repository, such as issues, spread across them. Instead of needing a github account to contribute, you just need an account that github could federate with.

This would mean that if Github did start doing something you didn't like, you would be able to change code host without losing the metadata.

Potentially, anyway.

Federation doesn't prevent centralisation if the service is good enough; it just makes it less painful to decentralise if better competition exists. It also diversifies ownership of data, which is in general a good thing for consumers, and a bad thing for big tech companies that wish to make money off of analytics, which is why we will never see current social media platforms allow federation with other social networks, even if it would be better for the world and consumers.

IRC isn't federated in that sense, users on EFnet can't interact with users on freenode, so the network effect works in that case.

Gmail has an enormous market share (about 40%) but I don't think that's related to the nature of SMTP, more about their incredibly competitive free tier and the decidedly not federated groupware for their business offering.

The law says “expeditiously”, not immediately. This gives a degree of flexibility to judge the basic elements of the claim, which they clearly did not do here (as others have pointed out, this does not look like a valid request).

> notice must immediately

No, at most-- they lose a safe harbour if the notice was well formed and properly delivered, and they don't follow the procedure.

But the vast majority of notices are not well formed or properly delivered.

And loss of the safe harbour isn't particularly important if the complaint is bogus to begin with.

Sure, it does mean a slightly increased risk of legal costs-- though anyone can sue github at any time regardless-- but ultimately those sorts of risks are business decisions that have to be weighed against other business costs and benefits.

Github has historically been pretty unusual in its degree of following the DMCA takedown requirements, a lot of other places are a LOT more willing to ignore apparently spurious DMCA complaints than Github has been. I had hoped that this would change with the Microsoft acquisition, because maybe before their position was just that they couldn't afford any legal fights... but it doesn't seem to have been.

False in this case.

The 17 USC 512(c)(1)(C) safe-harbour protections apply only to hosting of infringing works, and neither youtube-dl nor its test suites infringe on any RIAA or member copyrights as averred in RIAA's notice.


The RIAA's letter does not claim infringement within the text of youtube-dl source or test suites, though it tries hard to appear so, but rather anti-circumvention of a "copyright protection mechanism", under §1201. That is also part of the DMCA, but falls outside the safe-harbour.

At best, youtube-dl's test suite may be infringing works when run, in which case infringement would accrue to the operator, presumably a tester or Github's CI/CD process. Even that argument is specious.

Given output is discarded, no permanent copy is retained, and the action is for research and development, and numerous Fair Use affirmative defence claims exist under §107, notably (1) and (4), test suite execution falls outside exclusive rights. Any one fair-use test is sufficient, or none at all. Test suite execution could be argued non-infringing under numerous theories, including reverse engineering, research, interoperability, all under §1201, or under general limitations on exclusive rights in §112, §117, or elsewhere.

This is where ... things get interesting....

- The "copyright protection scheme" in question, if it even is one, was written by and is provided by Google/Youtube, not the RIAA.

- It is not even clear to me the RIAA has standing to sue under §1203: "Any person injured by a violation of section 1201 or 1202 may bring a civil action in an appropriate United States district court for such violation." RIAA are not injured due to utilisation of a non-member's mechanism.

- Does not pass the 17 USC 1201 (a)(2)(B) test: "has only limited commercially significant purpose or use other than to circumvent a technological measure that effectively controls access to a work protected under this title".

- Yes, Microsoft / Github may have liability under 17 USC 1201 (a)(2), "offer to the public, provide, or otherwise traffic" the code, subject to the same test above. However there is no safe-harbour provision for such violations.

- Microsoft (owner of Github) is listed on the RIAA's members page. Neither Google LLC, its Youtube subsidiary, nor parent Alphabet Inc. are. The RIAA are threatening a member for a §1201 violation against a nonmember. That's ... weird. https://www.riaa.com/about-riaa/riaa-members/

- There's an exception in §1201(f)(2) "a person may develop and employ technological means to circumvent a technological measure, or to circumvent protection afforded by a technological measure, in order to enable the identification and analysis under paragraph (1), or for the purpose of enabling interoperability of an independently created computer program with other programs".

- Youtube-dl is executing code as a World Wide Web user agent, provided by Google/YouTube, and meant to be accessed and run by user agents in order to access YouTube content. That is, youtube-dl's operation is entirely within YouTube's technical design and intent.

- Any potential copyright infringement which might occur through use of youtube-dl is at the volition of users, not the software's authors, actions would properly be directed at such users for individual acts of infringement, and much of this is subject to the same and other defences listed above.

The remaining question is whether or not this claim should be contested. I argue that it should, on numerous grounds;

1. Though the claim is made under US law, similar anti-circ provisions exist in international law, which is highly standardised in large part thanks to the RIAA, MPAA (video), SIIA (software), WIPO, and other copyright monopoly cartels' special-interest deep-pockets lobbying. Offshore legal safe havens are limited and vulnerable. Defence within DMCA /anti-circ / WIPO / Berne regions is unfortunately necessary. Simply hosting the repository outside US jurisdiction is not sufficient, though a valid immediate response.

2. Such claims are specious at best, carry heavy chilling effects, may be entirely fraudulent, and should carry considerable risk. A countersuit against RIAA may help make this cartel, or others, think twice about repeating such attempts, as well as establish precedent against future such attempts.

3. The future of software, to which Microsoft claims to have harnessed its own wagon, is open, collaborative, Free Software. As such, the software and information services industry's interests diverge from those of regressive copyright maximalists.

TL;DR: This is not a 17 USC 512 infringement/safe-harbour, RIAA's standing is highly questionable, it is threatening a member for an averred nonmember's §1201 injury, any actual works duplication is not performed by youtube-dl's developers directly, nor is the work itself or its test suite an infringement of RIAA / members copyrights, and numerous defences exist for routine use or incidental transmission or copies made by others. Further, youtube-dl, digital and information liberties groups, Microsoft, and Google/Youtube should fight the RIAA's claim.

Haha! This is absurd:

> Microsoft (owner of Github) is listed on the RIAA's members page. Neither Google LLC, its Youtube subsidiary, nor parent Alphabet Inc. are. The RIAA are threatening a member for a §1201 violation against a nonmember. That's ... weird. https://www.riaa.com/about-riaa/riaa-members/

That ... was an interesting realisation.

> Youtube-dl is executing code as a World Wide Web user agent, provided by Google/YouTube [...]

I'm guessing that's a typo, and this was actually meant to say the following?

> Youtube-dl is executing code as a World Wide Web user agent, accessing a service provided by Google/YouTube [...]

No, the code is provided for the purpose enabled by ytdl.

I'll try to clarify that elsewhere:

> Youtube-dl is executing code provided by Google/YouTube, for World Wide Web user agents, as a World Wide Web user agent, and meant to be accessed _and run_ by user agents in order to access YouTube content. That is, youtube-dl's operation is entirely within YouTube's technical design and intent.

Wait, but it doesn't execute analytics or run ads.

Which are neither 1) acts infringing RIAA / members' copyrights (this is literally not copying creative works), nor 2) an anti-circumvention action.

NB: I've adapted this comment as an article here:


GitHub is not a safe place to develop software that has an association with piracy, nor would any centralized service.

Yeah, there's a difference between Microsoft saying "We love open source now" and Microsoft saying "fuck the police, IP law needs to be rebuilt from the ground up anyway".

The idea that Microsoft should lose some of the goodwill it got doing the first for not doing the second is really naive.

No, the line appears to be: "we love open source, unless it messes with some BigCorp's business model, in which case a letter of nebulous legal fudge is enough to delete your repo without due process"

To the point. If it is illegal by the hosting country, good luck. And that can be whatever legalese imaginable.

But luckily git is decentralized.

What piracy are we talking about?

In the case of YouTube-dl they had test cases supporting downloads of DRM'd material. Specifically "Shake it Off" by Taylor Swift.

I think there's more nuance to this. On one hand, they lose individual developer mind-share, but I'm sure that big corporation executives and governments are loving this kind of stuff...

Adhering to the law should be in every mindset. Stupid laws and wrong hosting country might be a different story

Why Microsoft? github.com is managed by GitHub, Inc. (that is owned by Microsoft). It is like complaining to Volkswagen if your Porsche gets broken.

They don't want to end up in jail. It's a felony to traffic in circumvention devices under the DMCA. YouTube's obfuscation methods count as "effective" copyright protection per the law.

GitHub never was a safe place to develop illegal software. For example, you could expect them to take down hacking tools lickety-split.

> I know a sample size of 1 has an effectively 100% error rate, but, I think Microsoft is losing mindshare with GitHub. Stuff like this doesn't help. I could see a small company like GitLab needing to toe the DMCA line, but, Microsoft has the deep pockets and could have built some major community will here by handling this better. Unfortunate that they didn't.

If MS didn't comply with the DMCA takedown notice, then the RIAA could go to court and get an injunction to force it down. Depending on how spiteful the RIAA and the judge are feeling, the injunction could be worded to take down the entirety of GitHub over a single repository. And the impact of actions like these is to make it more likely that such an expansive injunction is sought or granted.

What was submitted was not a DMCA notice, Microsoft was not legally obligated to comply with it.

From what I know, DMCA notice requires the identification of claimed copyrighted content and this document does no such thing.

RIAA taking down all of GitHub would instantly pit them against all of Microsoft, and make every developer who already hates their guts to pick up the pitchforks and march. They won't try doing it.

And as much as RIAA is mostly a slimy pit of lawyers, they would stand no chance against msft's goblin horde of attorneys.

More attorneys does not convince judges more. The RIAA has some merit, therefore they can get a preliminary injunction ordering github to remove the repo. If github doesn't comply then they receive sanctions. That's simply how it works.

The other commenters here have argued pretty convincingly that the RIAA has no merit at all. Microsoft should have done the same reading of the letter and determined that this is a hill they have to fight on, in order to operate Github effectively.

> RIAA taking down all of GitHub would instantly pit them against all of Microsoft

microsoft isn't gonna fight the good fight. There's no reward (to them) for doing so, and there's only a cost. It'd be a stupid business decision.

What is more correct is to lobby legislation changes to copyright. Repeal the DMCA, or change copyright law to such that you cannot abuse it this way. And this lobbying is the civil responsibility of everyone, not just microsoft.

No company would fight the good fight for the same reasons. This can only be fixed by changing the law. Not to mention finding and prosecuting the parties who were paid directly or indirectly to pass this legislation. RIAA lobbyists (and all other corporate lobbyists) don't lobby in the sense of arguing for something. They use stick and carrot techniques against legislators. It's practically racketeering.

I doubt they'd fear the PR backlash. They didn't fear it before SOPA, and even when SOPA was defeated, it was largely because the backlash was driven by efforts by sites like Wikipedia, Facebook, and Google advertising counter-SOPA messaging, and that drives far more traffic than sites like GitHub.

Taking down GitHub would instantly make millions of developers at thousands of companies unable to do their work.

Interestingly, my country (Turkey) did ban github (along with google drive, microsoft onedrive etc.) back in 2016. However they reopened it within hours (18 hours to be exact), because systems all over the country were failing. One of the driving reasons was that ATMs stopped working. Apparently they were downloading some updates from github to function.

Stupidity has no limits.

For more info: https://en.m.wikipedia.org/wiki/Censorship_of_GitHub

And GitLab has now banned Iranians:


I am well aware of that. But RIAA likely doesn't care one iota about that damage. The injunction they'd write would be along the lines of:

"It's necessary that the entirety of GitHub be shut down until this case is decided. As noted here, they have persistently failed to adhere to our DMCA take-down notices, and continue to knowingly host content that circumvents our copyright protection schemes. Furthermore, it is not sufficient to just remove this content, because it has been and will be repeatedly readded, and GitHub has taken no action to remove these extra copies. We are likely to succeed on our merits because <blah>. We are irreparably harmed because <blah>."

That doesn't mean they'd necessarily win on a broad injunction, but they can still push for one (see Apple v Epic for a recent example). And stunts like this one make it more likely that a judge might agree that a broad injunction is necessary instead of a narrow one.

If we cannot shut down criminal banks because they are too big to fail. Then we certainly cannot shut down Github.

Only for a few days, critically.

GitHub may be the single source of truth, but it’s not the only copy of the data.

It will take far more than "a few days" to change GitHub-based workflows. Data loss is not the only way to lose productivity.

Any diligent company that relies on GitHub so heavily will have a service-level agreement in their contract, which will include compensation for a major outage.

Otherwise they're getting GitHub's best effort, which is exactly what they've paid for.

More importantly GitHub could lose the status of safe haven and could be held liable for all repositories.

>I think Microsoft is losing mindshare with GitHub.

You're not wrong. Sure, the cloudflare browser checks and antibot pages on gitlab are aggravating, but there are a litany of documented reasons people are walking away from Microsoft's latest acquisition. Maybe the biggest one is the ICE contract they ardently refused to back out of.


It's also pretty obvious MS doesn't check any of these DMCA claims, and just rubber stamps the commits with an insulting and sterile "Process DMCA request" commit msg.

gitea and github both host splendidly inside kubernetes or standalone.

Well, I've been guessing that github was going to get killed off and rolled into Azure DevOps at some point.

As for the ICE contract, do you have any idea how much money the U.S. government pays to Microsoft? Of course they aren't going to back out of a contract. If they did, you can bet an emergency board meeting would be held, and whoever made that call would be axed and the contract reinstated. The JEDI cloud storage contract alone is worth $1 billion a year, not to mention the 25+ million OS licenses, the Office 365 contracts, and the GCC contracts.

It's very possible that Microsoft as a corporation doesn't even deal with github DMCA notices until they become a PR issue like this one. Github processed DMCA notices before they were acquired, and their staff likely still does. Not like that is a hot-button issue to convert and integrate in the grand scheme of things.

Just food for thought.

I think you are right, except your first sentence. Azure DevOps will be the one which will be reoriented to GitHub, not the other way around.

Azure Repos and Pipelines are basically duplication in the GitHub world. But the Azure DevOps issue tracker is superior to GitHub's (or GitLab's, but not Jira's).

Actually I suppose Skype over Lync is a good example counter to my own point, who knows?

Naming a brand after a concept is an asshole thing to do, sure it puts your product in the search results for anyone googling devops, but it makes looking for anything related to the product painful, so I'm all for another name change.

Fair. But Skype rendered itself irrelevant over the years because of this BS ;)

But I agree. Never underestimate politics vs. reality. Oh 2020.

> gitea and github both host splendidly

Do you mean GitLab?

Haha thanks, yeah I do wonder whether my account may be impacted. Best case they just delete the PR and garbage collect the repo. Worst case I get the final push to use gitlab.

why do you think this is MSFT's fault? Didn't github pre-acquisition always comply with DMCA takedown requests too?

Also, why shouldn't they? This really doesn't seem like the hill they should choose to die on, if they decide to go against the government they would have far more interesting things to fight for.

And this is one of the less defensible ones too, given they explicitly gave examples of infringing copyrights in the code.

EDIT: ah, but this isn't a proper takedown notice, as per https://twitter.com/xor/status/1319755776043384838

This is such a great hack. I approved the PR, naturally.

Note that a hack is not a fix.

All of you, write to your representatives about that if you care. In the EU, when they were going to pass some similar laws, we all organized and wrote en masse to the parliament and they backed down. It is a lot of effort but it works.

And what should we ask for?

There is stuff to complain about with the DMCA, but the core process is reasonable.

Someone claims copyright infringement by a user, and the website takes the content down.

The user claims no copyright infringement, the website restores the content.

The website is immune from liability, the alleged copyright owner can settle things in court with the user if they want to.

> There is stuff to complain about with the DMCA, but the core process is reasonable.

it absolutely is not in any way reasonable.

you must take down the content without question and then, after filing a counter-notice wait 10-14 days minimum before it can be restored [1]:

"After a counter notice has been received, a service provider must wait 10-14 days before they can reactivate the claimed infringing content."

frivolous complaints can completely cripple a business with little risk to the supposed claimant.


That is, to be fair, an aspect that could be improved on.

However, how do you do this?

For example, perhaps a large-business exemption where they are required to follow through with legal proceedings if a claimant files a counter-notice (so they can't do the whole claim, counter-claim filed, they don't follow up, like with PopcornTime, usually because the DMCA was bogus).

But then this encourages further litigation, so it discourages people taking the risk to file a counter-claim. But unless they follow through with legal action, nobody can assess if the initial notice was "fraudulent". It's a real thorny problem.

There are many possible solutions, e.g.

- A company that files a fraudulent request loses DMCA rights (that way, it doesn't have to be proven repeatedly, just once)

- Liability for false takedowns w/ punitive damages, criminal liability with actual enforcement for malicious attempts or even negligence. Potentially a deposit requirement once a false claim has been made.

Those are good ideas, but the first one should have been:

- Nothing at all happens until an unbiased court issues an injunction.

In other words, the process that was in place before the DMCA was passed.

You should ask for immunity for tools that could potentially be used to infringe copyright, but that are not actually infringing copyright.

If guns don't kill people, people kill people, then certainly congress can accept that software doesn't pirate copyrighted works, people pirate copyrighted works.

Actually I found the statement "guns don't kill people, people kill people" very oxymoronic.

Sure one might argue that if you have the intent to kill, it's not necessary for one to use a gun, you have alternatives such as grenades, knives, chainsaw or even your bare hand if it's lethal enough. This is how you derive that "the gun is not evil, it's the people who are" statement.

But guns, under the category of firearms (not only guns, what about rocket launchers?), are almost certainly designed to kill, and what is the top priority of a firearm? To be the most efficient killing machine.

So if you use a gun, it's not necessary that you are going to kill people, but it is very likely that you are imposing a threat to do so, regardless of if you're attacking/defending.

Therefore, trying to apply the gun analogy to current state of youtube-dl is an injustice to youtube-dl, as you compared an intrinsically evil entity to another.

There is nothing intrinsically evil about a gun. You might as well say that a knife, pen or any other inanimate object is intrinsically evil. Also, there is nothing intrinsically evil about defending yourself.

Oh, but in most cases there is something intrinsically evil about killing to defend yourself.

Like whenever you're not facing a murderous psychopath.

It's a tragedy when two people who have a gun feel the need to kill the other, over nothing more than the fear that the other might have and use one too. Even if one was breaking into the house of the other.

If you're a healthy human being you don't want to kill the intruder, you want to be safe. Neither does the intruder want to kill you, they want to get away with your stuff.

Two people not wanting to kill, being forced to kill simply by the presence, or even just the possible presence of a gun.

If you can't see the intrinsic evil in that, then I guess you must really think lightly about killing people.

I get the analogy, but guns are typically purposed for lethality.

Software is not typically purposed for piracy.

Maybe "trying to hit someone with your car"? Not sure.

It's a perfect analogy because it is so extreme. The powers that be have already accepted that way of thinking (because guns are not illegal) and it's therefore even less of a stretch to apply the same concept to software tools.


For one I’d like there to be some punishment for repeated abuse of DMCA. GitHub policies state that repeat offenders can get their accounts suspended/deleted. Why can’t repeat offenders (RIAA) get banned from the system at least? Some increasing back off after a rejected claim was posted? Anything that would make it un-economical to just spam the system with notices wherever you like it because there is no downside to it.

I wonder if anyone has written something to find and independently assess the validity of these bulk false DMCA take downs.

Honestly, I'm guessing an ambitious law firm could actually make a good class action suit against these offenders.

Seems like broad, real damages could be justifiably demonstrated.

If only we could retool SCO for a force-for-good?

Weren't there environmental laws that a private party could sue an offender over and would get a portion of the federal fines?

Seems like a free-market solution that the right would get behind. :)

The problem is that there is no penalty for fraudulently issuing claims.

Other than it being a felony in the US to issue a false claim with a jail time of up to 5 years aye

Reform copyright. It was based on the assumption that taxing copy production is a reasonable way to fund creation. It was an incredibly good system in the days of the press where making copies requires heavy capital investment.

It is hilariously wrong nowadays when everybody owns a copy-making machine in each pocket.

> It was an incredibly good system in the days of the press where making copies requires heavy capital investment.

It was never a good system, but it was certainly less obviously broken and destructive to society then than it is today.

But what part of it is copyrighted? URL? It does not contain copyrighted text, audio, video.

Information is in plain sight; it is like banning base64. Any general-purpose computing should be taken down as well. Browsers have DRM; as far as I know, youtube-dl can't download Netflix, YouTube Premium, etc.

For one, you could ask for due process, or the right to defend yourself before the takedown is executed.

Right now, DMCA is an accusation AND a sentence bundled into one. There's no review of its validity, you're instantly guilty unless proven wealthy.

You missed the relevant part of the DMCA for this takedown, which makes bypassing "anti-circumvention" measures a crime. That section has caused nothing but problems.

Fuck no, they didn't back down. Hundreds of thousands marched across Germany; in Munich, where a friend of mine organized it, it was one of the largest rallies in recent history. And all in vain: we got the upload filter crap regardless of all the promises.

What? Pretty much every EU member ratified the same WIPO treaties and so have almost identical laws to the DMCA.

I don't understand what's going on here. Can you explain?

I might not be able to explain well, but from what I understand about how github works, when you fork someone else's repo, github only stores 1 tree but you have your own set of tags / branches. This led to an issue that was probably fixed where if you set a repo to private, anyone who had a clone could guess commit hashes from their fork's remote. Another interesting thing about git is that you can have 2 root commits (the Linux kernel has 4 root commits iirc).

Because of these 2 "features", when I clone dmca and run `git pull some_ytdl_git_mirror master --allow-unrelated-histories`, I end up with a giant source tree that consists of both repos joined by a merge commit. Because no rebasing happened, no history was changed and it can be pushed without force permissions. Now that all the youtube-dl commits are in the same tree as the dmca repo, you can access them regardless of what fork you've cloned via `git fetch origin <hash>`.

I hope that makes sense?
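
For anyone who wants to see the mechanics without touching GitHub, here is a minimal local sketch. The throwaway repositories `a` and `b` are hypothetical stand-ins for dmca and a youtube-dl mirror, and the demo identity is made up. It only shows that pulling an unrelated history keeps the original commit hashes and leaves two root commits joined by one merge commit.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Throwaway identity so the commits work on an unconfigured machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two repositories that share no history at all.
git init -q a
echo "from a" > a/a.txt
git -C a add a.txt
git -C a commit -q -m "root of a"

git init -q b
echo "from b" > b/b.txt
git -C b add b.txt
git -C b commit -q -m "root of b"
b_head=$(git -C b rev-parse HEAD)

# Same idea as the pull in the comment above: merge b's history into a.
git -C a pull -q --no-rebase --no-edit --allow-unrelated-histories "$work/b" HEAD

# Count the words on the merge commit's line: its own hash plus one per parent.
parents=$(git -C a rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
# Count the root commits reachable from the merged HEAD.
roots=$(git -C a rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "merge line has $parents hashes, history has $roots roots"
```

Since nothing in `a` was rewritten, the new HEAD is a plain descendant of the old one, which is why a push like this needs no force.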

Nice comment. I'm sipping coffee and much to my shame i thought this was going to be some meta joke about screwing with some sort of master list of DMCA'd repos on Github or something. Sipping coffee, reading comments, chillin.. then read yours, and almost spit it out at how much funnier this is than i thought it was.

I re-opened the link and actually used my eyeballs, and yup.. it's the damn Youtube-dl library on the /dmca repo, bahaha. I almost whooshed the joke entirely, so thanks :)

This seems like a security issue, no?

It is a security issue if the presence of a commit or tree in a repo is supposed to be enough to get GitHub to nuke the repo, as this then allows malicious users to convince GitHub to nuke any repo they like, but GitHub can instead deal with this more sensibly and not make it a security issue.

It could indeed be a security issue. A few options:

- Make a PR to a project that changes e.g. one of the dependencies to typosquatted alternatives. Disguise the commit message as something trivial. Post it to HN with a GH link to the upstream project's repo at your commit.

- Make a PR to a project that adds malicious code, suggest a change to a distro package's source repo to use your commit. Unless the maintainers know about this GitHub behavior, that'll look much more trivial than it actually is.

Seems to me like you could ddos a repo this way, though I guess that would be true of any pr spamming?

>Another fun discovery is that deleting my fork of github/dmca didn't affect the PR like I thought it would

Making the pr will put the branch in the target repo under pull/<number>/head

the commit will forever be referenced

If you have a clone of the dmca repo, run:

    git config --add remote.origin.fetch 'refs/pull/*:refs/remotes/pull/*'
    git fetch origin

You'll now have all of the PR refs in "remote" `pull/<number>/head` branches.

    git log pull/8142/head
    git log 416da574e
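
The same refspec trick can be tried offline. This is a rough sketch, assuming a local repository named `server` (hypothetical) that plays the role of GitHub, with a hand-made `refs/pull/1/head` ref imitating what a PR creates. A normal clone does not fetch that ref, and adding the refspec from above does.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Throwaway identity for the demo commits (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q server
git -C server commit -q --allow-empty -m "base"

# Imitate an open PR: a commit that no branch points at, only refs/pull/1/head.
pr_commit=$(git -C server commit-tree 'HEAD^{tree}' -p HEAD -m "proposed change")
git -C server update-ref refs/pull/1/head "$pr_commit"

# Clone over the file:// transport so only requested refs are transferred.
git clone -q "file://$work/server" client

# A default clone fetches branches and tags, not refs/pull/*.
if git -C client cat-file -e "$pr_commit" 2>/dev/null; then
  before=present
else
  before=absent
fi

# Add the extra refspec and fetch again.
git -C client config --add remote.origin.fetch 'refs/pull/*:refs/remotes/pull/*'
git -C client fetch -q origin
after=$(git -C client rev-parse refs/remotes/pull/1/head)
echo "before: $before, after: $after"
```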

And if GH deletes, and anyone else opens a new PR like you did, then those commits would remain until a second GH admin intervened, and so on?

Why github/dmca? Why not e.g. tensorflow or numpy or some other package that people actually depend on?


Just curious, why is this possible with an unmerged PR? Just a weird setup on GitHub's end?

I think it's because GitHub wants to allow repo maintainers to merge in PRs without them having to add separate remotes themselves, ie `git remote add` isn't required to `git merge`.

This basically means that any content can be injected into anyone's GH repo (since PRs can't be turned off), but really only in terms of being able to view it on the GitHub website. To give an example, pull 437 on torvalds/linux[0] hasn't been merged in, but if you go to the commit hash in the browser, suddenly init/main.c has the relevant changes and commit that condense the file into one line[1].

This very well could be abused - imagine framing (or just 'canceling') someone with [insert illegal content here] by PRing their repo with a commit with a forged author[2] then linking people to their repo with the commit tree showing the illegal content.

0: https://github.com/torvalds/linux/pull/437

1: https://github.com/torvalds/linux/blob/2793ae1df012c7c3f13ea...

2: https://stackoverflow.com/a/60900120/3878893
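
To make the forged-author point concrete: git records whatever author string it is given. A small local sketch, where the name and address are made up for illustration:

```shell
set -e
work=$(mktemp -d)
cd "$work"
# The committer identity is a throwaway (assumption); the author is set separately below.
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
# --author accepts any "Name <email>" string; nothing checks that it is really you.
git -C repo -c commit.gpgsign=false commit -q --allow-empty \
  --author="Someone Else <someone.else@example.com>" -m "unsigned commit"

author=$(git -C repo log -1 --format='%an <%ae>')
committer=$(git -C repo log -1 --format='%cn')
# %G? prints N when the commit carries no signature.
sig=$(git -C repo log -1 --format='%G?')
echo "author=$author committer=$committer signature=$sig"
```

Only a signature ties a commit to a key; the author field alone proves nothing.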

Indeed. And you don't even need to create a PR to "inject" commits into a GitHub repo. You just need to fork it and push to your fork. See https://mathieularose.com/github-commit-injection

yuck, yet another reason to get people to do commit signing - and enforce it by github not attributing unsigned commits.

Does commit signing really solve this? I believe you can restrict branches to only allow signed commits, but since these commits are not in any branch on that repository it looks like that wouldn't change anything. Correct me if I'm wrong, though.

That yes, but at least the github/gitlab/... UI could refuse to link unsigned commits to the userpage belonging to the email in the commit.

It's due to how git works. In order for git tools to compare and otherwise work with two commits, both commits need to be in the same repo.

If "forking" a repo on github really cloned it in their infrastructure, they'd require far more data. So all forks of a github repo point to the same repo, only with different branches.

Note that git clone only clones the actually present branches of the upstream you point it to, but on the backend, all branches of all forks are present.

This isn’t simply because of how Git works. You can configure Git to look in multiple places for repo objects. For whatever reason, the GitHub devs either didn’t know this, or they didn’t want to implement their forking and pull request systems this way.

As someone else mentioned, this may be an intentional design to make it simpler to implement pulling down remote PRs from the destination repo.

> You can configure Git to look in multiple places for repo objects.

What do you mean by multiple places for repo objects? Do you mean multiple remotes? The remotes are fully inside your local database if you run commands like git pull or git remote update, they are just not in your checkout. Commands like git show <commit hash> work on commit hashes in those remotes as well, even if it's not in one of your local branches.

Or do you mean configuring git to use multiple .git/objects directories? I haven't heard of that feature, can you give a link?

The feature’s called alternates. You can use it on-the-fly without modifying any repos by using the GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable.

If you want the effect permanently, there’s the .git/objects/info/alternates file. For HTTP remotes, there’s apparently a .git/objects/info/http-alternates file as well (no idea how that works though). I’m assuming these files allow multiple alternates as the environment variable does.
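
A quick way to try both forms locally. This is only a sketch: `store` and `viewer` are hypothetical repository names, and it says nothing about how GitHub actually wires this up.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Throwaway identity for the demo commit (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "store" holds an object; "viewer" is a separate, empty repository.
git init -q store
git -C store commit -q --allow-empty -m "lives in store"
obj=$(git -C store rev-parse HEAD)
git init -q viewer

# Without alternates, viewer cannot see the object.
if git -C viewer cat-file -e "$obj" 2>/dev/null; then plain=visible; else plain=missing; fi

# On the fly: point the environment variable at store's object directory.
via_env=$(GIT_ALTERNATE_OBJECT_DIRECTORIES="$work/store/.git/objects" \
  git -C viewer cat-file -t "$obj")

# Permanently: list the directory in viewer's alternates file.
mkdir -p viewer/.git/objects/info
echo "$work/store/.git/objects" > viewer/.git/objects/info/alternates
via_file=$(git -C viewer cat-file -t "$obj")

echo "plain=$plain env=$via_env file=$via_file"
```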

I'm pretty sure GitHub does use alternates in the same way that GitLab does:


I vaguely recall seeing @peff comment about this on HN years ago but I can't find that comment now. Here's a GitLab employee claiming GitHub uses alternates:


The thing is, that both the dmca repo and its forks must have alternates files to the same underlying common repo, otherwise the PR ref in the dmca repo wouldn't be able to see the merge commit pushed to the fork. Pushing the merge must have duplicated all the youtube-dl commits into the common repo used by both the dmca repo and its forks because youtube-dl and dmca would have different common repos.

Oh indeed, very interesting. Apparently the feature also existed since 2005, before Github (2008) so they could have used it from the start.


There's a lot of reasons it's possible, but the one that sticks out is that the repo owner needs to be able to modify the commit before the PR is merged. AFAIK, the way that's done is by incorporating the remote repo's commit history into the destination repo underneath a pr-specific branch, which naturally brings all of the commits themselves into the repo's git database.

Follow up question, are these commits or pr-specific branch accessible in target repo's `git` (not GitHub)?


You can get it with: git fetch <remote> refs/pull/<pr>/head

My git config has the alias:

  pr = !f() { git fetch $1 refs/pull/$2/head:pr/$1/$2; } ; f

which will create a local branch corresponding to the provided PR. This is useful for evaluating large PRs that would be difficult to fully evaluate with just the online UI.

Yes, but I don't know if you have to track the branch first or not in order to pull down the data into your local repo.

But this is exactly how merging a PR locally[0] works.

0: https://docs.github.com/en/free-pro-team@latest/github/colla...

It is very useful that commits become part of the target repository as soon as a PR is created. This allows people reviewing the PR to check it out on their local machines without needing to add the source repository as an additional remote.

Would closing the PR be enough to remove it, or does it actually have to be deleted? I didn't think PRs could be deleted, only closed.

Closing will not be enough. Even deleting the fork that made the PR will not be enough. (The PR remains open and the commit URLs automatically get updated to point to the parent repo, just like the URL that was submitted.)

Users can't delete PRs but GitHub can. They do it for PRs reported as spam, etc.

Regardless, what's needed here is not just deleting the PR (and the fork) but also doing a GC (as Stephen304 said), which too is something only GitHub can do.

I don't think so, if you look at the other closed PRs, you can find some where the owner also deleted their fork like I did. Despite that you can still access the commits they wanted to merge.

The end result will be GitHub taking down this repo and possibly blocking PRs. Congratulations on making life difficult for other people.

If GitHub ever makes it possible for public repos to disable PRs, I think many regular users will be interested :) It's a quite old feature request: https://github.com/dear-github/dear-github/issues/84

It's been somewhat alleviated recently by the "archive" feature, though it would still be nice to have in cases where the repo is still being developed but doesn't want external contributions.

> Congratulations on making life difficult for other people.

Tell that to the RIAA...

It's funny to see the Streisand Effect happen with this one.

US copyright law has teeth for stuff like this.

If the RIAA goes after the OP for statutory damages, he's basically fucked for life. And they love to make examples of people. Did everybody forget Kazaa and Limewire?

Not remotely on the same level.

Kazaa and Limewire were moneymaking companies with fairly indefensible behavior from a legal perspective. This is an individual using a website the way it’s supposed to be used, for documentation purposes. A judge would “expeditiously” send the RIAA packing with a large bill for defendant’s legal fees.

Right, it's not the same level. Kazaa and Limewire faced theoretical liabilities in the billions.

That doesn't mean that the programmers of youtube-dl, or those who choose to engage in spreading the program, can't be held liable for much lower levels of damages which are still financially ruinous for an individual even if they're small on an absolute basis.

I've been on the other side of an RIAA lawsuit, and their lawyers are aggressive. They offer a carrot settlement, but if that settlement is declined they will beat you with a stick and offer no mercy.

They could call the U.S. attorney and go after the OP with federal felony charges.

In a 10 second glance at closed PRs, it doesn't seem like they ever merge any from the public.

I suspect that many would consider this to be a win, although I doubt this will happen.

The people who are responsible for this SHOULD have their lives made as difficult as possible.

It's a fun hack, but to those thinking about streisanding the source: The strength of ytdl and other downloaders isn't their source code, it's the extensive library of scrapers that are tailor-made for individual sites.

The devs have to constantly maintain and update those to keep working when a site changes its design.

So if the takedown manages to stop ongoing development on ytdl then even existing copies will become mostly useless pretty quickly.

Would it not be possible to move to some decentralized Github alternative?

You mean something like... git? ;)

But I agree with the sibling posts: while decentralised change tracking is pretty much the original idea of git, all the project management stuff (issues, comments, membership/permissions, discoverability) is a lot harder to decentralize.

I also don't think the MPAA or whoever else would be very impressed by that if the end result is still illegal activity: If the developers are known, then they'd either have to pass on the project to someone else or risk liability, whether their project repo is centralised or not.

Just host a gitlab instance through TOR or something, bish bash bosh

GitLab through tor sounds slow...... (Six dots for six hops)

Tor uses 3 middle nodes, so it's only 6 hops if you count the round trip.

It's the latency that kills you, not sheer bandwidth - I can watch YouTube on Tor on a good day when Google doesn't block my exit.

More people should try Tor. It's easier than you think.

A hidden service uses 6 middle nodes, 3 for the server, 3 for the client.

There was a Hacker News article about a decentralised github (not yt-dl) yesterday: https://news.ycombinator.com/item?id=24874994

The problem is finding a good alternative and getting everybody to move there.

“Everybody” is surely not a massive amount of people for youtube-dl. I mean, it’s hardly the Linux kernel... let’s remember that this is just about developers, distribution is a separate (arguably easier to solve) problem.

And even the Linux kernel does not use a centralized 3rd party service. They use only email.

Of course the server hosting LKML or in the end Linus' mailbox are single points of failure.

Something like... git? Git is designed from the start to work for decentralized workflows, development could continue without too much issue by sending patches by email instead of pull requests by github.

You can’t decentralise issues and pull-requests.

Edited: as pointed out, Fossil supports decentralized issues (but not PRs). However, Fossil is a totally different system than Git.

It's amazing that no one has thought to do this yet considering the checkered history of GitHub's compliance with DMCA takedowns.

In the early 2010s, there were a whole bunch of distributed issue trackers for git. They all died out due to various problems, such as working across branches, polluting the commit history, extra commands to keep synced (including not getting retrieved with the initial clone), not having a web view for project managers/bug reporters, etc.

I think this was the most popular one: https://github.com/aaiyer/bugseverywhere

Tell that to fossil devs.

I think the long term solution for this problem is the developers of open source software like the bit torrent client, youtube-dl and other similar software will have to host their code outside of the reach of DMCA regulation. That will usually mean something like a GitHub-like site hosted in Russia, China or some other bloc where DMCA does not apply.

Just as lots of users in the US scientific community already use Scihub and millions of US students use libgen, millions of users of open source software will have to get used to a non-US-controlled domain to get access to their software. Also, development of this software will be done in the shadows so as to avoid the reach of the DMCA.

Relying on other countries can't ever be a long term solution.

That pirate bay and scihub are still around, that their authors are walking around freely, is a miracle that should not be taken for granted.

We need to work on fixing these problems in our own countries. We need to talk about the freedoms we care about, put a good brand to those freedoms and win popular attention and enshrine those freedoms into law. And build a society that depends on those freedoms so much that rolling them back would be unthinkable.

That would be an actual long term solution. It is hard but if we all contribute, we can do it. Probably.

>Relying on other countries can't ever be a long term solution.

It is the only solution that can put any pressure on the US in a macro sense. The DMCA is a uniquely American creation, and it will remain intact until the US starts losing intellectual capital to other countries. Such a thing is not likely to happen any time soon, but in a decade or two, the US will still be arguing which Larry owns the copyright to a header file, while China and others will have moved on to far more interesting challenges.

"Peter Sunde, Fredrik Neij, Gottfrid Svartholm and Carl Lundström were all found guilty and sentenced to serve one year in prison and pay a fine of 30 million SEK". I wouldn't call it walk freely.

Yeah, Sweden doesn’t allow for zeroing debts due to legal judgements in personal bankruptcy, IIRC.

This means that they were effectively sentenced to lifelong debt slavery. This is another dark part of the Scandinavian model that most aren’t aware of. Same system in Norway.

I know a friend of one of those 4 mentioned and he is doing absolutely fine (money-wise at least)

Do you know any details? I had the distinct impression that these debts would never be discharged.

Are the authorities just not following up on using the legal framework to seize any of their earnings towards the debt?

> That pirate bay and scihub are still around, that their authors are walking around freely

Let's not equate these sites. scihub gives people what they are likely already paid for. Academic publishing business is a fraud. https://www.youtube.com/watch?v=PriwCi6SzLo (recent video on the topic that I liked)

> scihub gives people what they are likely already paid for

No, as long as the researcher was not bound by the grant to publish in the open and published in a journal/conference with a copyright, you do not have the legal right to get it, even if taxpayers funded the research.

It’s fucked that the law doesn’t stipulate open publications for taxpayer funded research, but that’s an issue with the law. Academic publishing isn’t a “fraud” in the normal definition.

You're countering what I perceive to be a moral argument with a legal one, which would be a category error.

Yet whether the authors walk freely is entirely a legal issue, not a moral one.


I agree in principle, but a much easier fix would be to host such code on a tor hidden service. That way the physical server can be located more or less wherever.

Then we just have to normalize the use of Tor so that banning it becomes unthinkable ...

In other words, instead of trying to change the laws, which is the whole idea of a democracy, just give up on laws altogether.

The vast majority of people don't give a shit about this, so the money being pumped by rights holders will win out.

> I think the long term solution for this problem is the developers of open source software like the bit torrent client, youtube-dl and other similar software will have to host their code outside of the reach of DMCA regualation

I'm not sure about that. I think the real long term solution is to have a decentralized version of YouTube/video publishing platform that is resistant (invulnerable?) to censorship. I'm not saying that it's not difficult, or even impossible, but I would love to see more discussion regarding true web decentralization, not legal semantics.

Zeronet [0] has implemented a decentralized website model, a Github alternative on there could be more resilient, though it does require additional client software to run. I believe the site owner can update their decentralized website with their cryptographic keys.

Perhaps WebTorrent [1] can be the basis for an in-browser implementation, along with service workers to stay cached and self-update without requiring a running host. Talk about serverless :^)

[0] https://zeronet.io/ [1] https://webtorrent.io/

Zeronet is very underrated, though I've never been able to exactly grok their high level architecture, for some reason.

LBRY.tv is a decentralized YouTube alternative. Another one is bittube.tv

> outside of the reach of DMCA regualation.

What's the latency to the Moon again? :)

2.5s is the minimum roundtrip. Add server and satellite hops and you get something closer to 3s minimum. Every time.
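
Back-of-the-envelope check of that figure, assuming the average Earth-Moon distance of about 384,400 km and ignoring every server and relay hop:

```shell
# Assumptions: average Earth-Moon distance 384,400 km; speed of light 299,792.458 km/s.
# Round trip time = 2 * distance / speed.
rtt=$(LC_ALL=C awk 'BEGIN { printf "%.2f", 2 * 384400 / 299792.458 }')
echo "Earth-Moon light round trip: ${rtt} s"   # prints 2.56 s
```

The Moon's distance varies over its orbit, so the real floor moves a little either way.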

Thanks. Sounds like it'd be workable as a backup target then, but would probably suck being the primary place to grab source code.

That being said, maybe it would be ok as a primary location if there are more localised caches (eg lower response times) commonly available.

It's git - what do you need low-latency for? Download directories, trees, as the initial requests, then send a second request for the source code you want to download. Then the bottleneck is download speed instead of latency.

For just git, sure.

For a web application to present it (eg Gitea, GitLab, etc), then the lower latency would be helpful. ;)

eh, web applications are overrated. :)

Not a problem for Git or even PR workflow then.

It would not be an issue even in the Earth-Mars case, where the roundtrip latency is 20m-1h15m depending on the time.

I wonder how this will work out in practice. Funny thing: old pre-Internet protocols might even work out of the box in this case. Interplanetary email & newsgroups! :-)

Well, you probably wouldn't want to try using standard TCP connections with that. ;)

Hosting the git repository on ipfs might be a solution.

It appears there is already a recent copy of the Git repository hosted on IPFS:


The most recent commit is from October 23.

There is also git-ssb

Nah, that won't work. China is a member of the WTO and follows the TRIPS agreement. Therefore, I guess the RIAA will just go to court in China, and they can still take down any repository they'd like to. Lots of companies from Western countries have won intellectual property lawsuits in China.

Sites like sci-hub work in a tricky way. You can't expect an ordinary company in another country to do the same, because it is still inside the framework of international capitalism.

So they need to DMCA the DMCA repository to get rid of it? Nice.

It looks like this is the corresponding PR: https://github.com/github/dmca/pull/8142

Maybe this pull request is being censored/shadowbanned right now. Opening the page takes tens of seconds and returns "Unicorn! This page is taking too long to load. Sorry about that. Please try refreshing and contact us if the problem persists.". However, PR #8142 is indeed listed in the list of open PRs at https://github.com/github/dmca/pulls?q=is%3Aopen+is%3Apr .

This just loaded fine for me in Firefox. I'd give them the benefit of the doubt for this one, after all this PR contains 10k commits and 120 comments, and then quite a lot of approvals.

I saw the unicorn when using firefox, but when I switched to chromium, the PR opened fine

if you have the dmca repo on your computer, you can check out this PR via git fetch upstream pull/8142/head:pr-8142 && git checkout pr-8142

Should work even if they blackhole the pr.

There’s no need to create that pr-8142 branch, which will otherwise hang around indefinitely, and that is usually not what you want:

  git fetch upstream pull/8142/head
  git checkout FETCH_HEAD

Isn't it more like the following?:

  git clone https://github.com/github/dmca.git
  cd dmca
  git fetch origin pull/8142/head
  git checkout FETCH_HEAD

I was just modifying the parent comment, which assumes an existing repository.

Since we’re fiddling with things in this way: if you were just trying to get a copy of the contents, this’ll do it about as efficiently as possible:

  git init
  git fetch --depth=1 https://github.com/github/dmca pull/8142/head
  git checkout FETCH_HEAD

(I don’t think you can git-clone an arbitrary ref, only a branch.)
