Microsoft sends GitHub DMCA shutdown for Skype open source project (github.com)
189 points by ecaron on May 2, 2012 | 90 comments

Please note: this is not an attempt to clean-room reverse engineer Skype, or even to reverse engineer and create an open-source replacement. As WINE shows, Microsoft does not (and probably cannot) take such projects down.

No, that's not what the "Skype Open Source" project is. This is a confusingly misnamed attempt to attach code to patched copies of copyrighted official Skype clients.

Microsoft is not overreaching here; this is what anyone would expect with a lame project like this. I'd love it if someone really created an open-source Skype client, but this is not a project to do that.

You are wrong. Your eyes are blinded by hate. Stop. Listen to Nature.

Skype-open-source code here: https://github.com/skypeopensource/epycs

The repo was closed because I hosted a deobfuscated Skype 5.5 binary in the downloads section; my fault.

No, that's bullshit. https://github.com/skypeopensource/epycs/blob/master/sources...

That function right there (and the vast majority of the rest of the file) is a straight decompiled version from Skype. You even use the register names and comment with addresses from the binary.

Having spent a lot of time reversing apps myself, I can say this is a straight copyright violation. Your code is a direct derivative work of Skype.

I wonder if that's a small enough component to count as de minimis copying.

At least 3/4 of those 4000 lines are directly copied from Skype, and that's just one file. I don't think that's small enough to not matter; anyway, if it really was so small, it should easily have been rewritten with original code.

It's the size compared to the size of the relevant copyrighted work that matters, not compared to the size of the work it's copied into.

No, that's a simplistic approach. Both matter. For example, one volume of the Encyclopedia Britannica is a very small portion of the overall work, but it cannot be freely copied. Likewise, if you write a book a million pages long, copying entire chapters into it does not suddenly become OK.

It is my understanding that "the relevant copyrighted work" in the Encyclopedia Britannica example is likely to be the article rather than the entire encyclopedia.

Is there the notion of minimum copyright infringement?

Yes and no. There is no such notion in copyright law. Case law is unclear, however: http://en.wikipedia.org/wiki/De_minimis#Copyright

The jury instructions issued by the judge in the ongoing Oracle vs Google case make considerable reference to the principle: http://www.groklaw.net/pdf3/OraGoogle-1018.pdf

"No, that's bullshit."

What are you talking about? Maybe you need to read the dictionary definitions of the words bull and shit. Pay copyright rent for those words, and stop trolling us.

How exactly is one supposed to do clean-room reverse engineering given the absence of any documentation and the sophisticated obfuscation on Skype's part?

Usually, that would mean reverse engineering to create specifications and then getting the product implemented by someone else using the specification, which would then be termed clean.

Yes, and reverse engineering to create specifications is done... how? It requires examination of the only currently working implementation that we have.

The term "clean room" has a well-established meaning you probably aren't aware of. I refer you to [1]; it might clear up what the original reply was referring to.

[1] http://en.wikipedia.org/wiki/Clean_room_design

I am well aware of the joke that is the clean-room process. Note that it supposes that the first team has access to the specification, source code, or any IP-encumbered indication about how the software in question works. In the case of Skype, the only public information we have is the binary.

> Note that it supposes that the first team has access to the specification, source code, or any IP-encumbered indication about how the software in question works

No it doesn't.

The Wikipedia link[1] mentions the original IBM PC clean room implementation by Columbia Data Products.

It doesn't have the complete story of that, but from memory the way it worked was CDP had one team documenting how the BIOS responded to inputs, and then a totally independent team reimplementing that behaviour.

Most clean room implementations don't have access to a specification, let alone source code.

[1] http://en.wikipedia.org/wiki/Clean_room_design

Interestingly enough, if you read on about more current cases it mentions Sony vs Connectix, which I thought Connectix lost, but actually the ruling was overturned on appeal.

From the ruling: "Some works are closer to the core of intended copyright protection than others. Sony's BIOS lay at a distance from the core because it contains unprotected aspects that cannot be examined without copying. The court of appeal therefore accorded it a lower degree of protection than more traditional literary works."

Thus one could try to make the same case for Skype.

>Note that it supposes that the first team has access to the specification, source code, or any IP-encumbered indication about how the software in question works. In the case of Skype, the only public information we have is the binary.

So what? That's not a valid excuse to distribute modified binaries to the public. If the "first" team wants to look at binaries to do their documentation, they can download the official Skype client from skype.com. Why do they need to download a hacked binary from Github?

Well, this decision effectively forbids the "team A" of a clean room process from operating in the regular open source mode of distributed and public development.

I understand the rationale behind it, but I think it is wrong and has bad implications for the future.

You are aware that clean room development is about protection from legal problems, not about solving the problem as efficiently as possible, right?

From my understanding, you are allowed to decompile and inspect the released binaries, as well as intercept and analyze all inbound/outbound data, as long as you aren't modifying the running binary in any fashion. Things like debuggers are commonly used for this goal, as are tools like wireshark for analyzing the traffic. Once you create a specification based on your analysis, someone can create a clean-room implementation from that spec.

Exactly. The deobfuscated version is just more comfortable to debug.

Sure, but it doesn't mean you're allowed to publish a deobfuscated version...

This is a serious obstacle to collaborative open-source reverse-engineering.

Yes it is, but that's where other solutions need to be found... say, a script that de-obfuscates a vanilla binary package so the output is created on the user's machine rather than being distributed as such. You can distribute the script and leave compliance to the end user.
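For illustration only, the "ship the script, not the output" idea might look roughly like the sketch below. Everything in it is hypothetical (the `Hunk` type, the `apply_patch` name, the offsets and bytes); it is not taken from the project. It only shows the general shape: check that the user's own copy is the exact build the patch targets, then rewrite bytes locally.

```python
import hashlib
from typing import List, NamedTuple


class Hunk(NamedTuple):
    """One byte-level change (hypothetical format for this sketch)."""
    offset: int         # where in the file the change starts
    replacement: bytes  # bytes written over the original, same length


def apply_patch(original: bytes, expected_sha256: str, hunks: List[Hunk]) -> bytes:
    """Apply hunks to the user's own copy of a binary, entirely locally.

    The hash check refuses any input that is not the exact build the
    hunks were written against.
    """
    digest = hashlib.sha256(original).hexdigest()
    if digest != expected_sha256:
        raise ValueError("input is not the build this patch targets")

    patched = bytearray(original)
    for hunk in hunks:
        end = hunk.offset + len(hunk.replacement)
        if hunk.offset < 0 or end > len(patched):
            raise ValueError(f"hunk at {hunk.offset:#x} is out of range")
        patched[hunk.offset:end] = hunk.replacement
    return bytes(patched)
```

Usage would be something like reading the vendor-supplied file, calling `apply_patch(data, known_hash, hunks)`, and writing the result next to it. Whether distributing such a script is itself safe is the legal question the replies below argue about; the code says nothing on that.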

If the result of this script is forbidden to be distributed, this could be a rationale for forbidding the distribution of the script itself.

If I write a script that generates the Matrix movies from the binary file of, say, Elephant's Dream, I won't be allowed to distribute that.

Yes, copyright law is complicated. You need to make sure the script used to create a deobfuscated binary isn't a derived work from the original binary, whatever that means (that's where consulting with a lawyer can help).

Is a patch a derived work from the file it applies to ? It is impossible to produce the patch without using the original file, yet intuitively, I would say no.

It's hard, but people have made progress with that approach:

http://recon.cx/en/f/vskype-part1.pdf http://recon.cx/en/f/vskype-part2.pdf

Those links detail reverse engineering the binary. I think Iv's point is that reverse engineering an existing binary is not what "clean room" means.

If team A uses the binary to write a spec, and team B uses the spec to re-implement without direct help from team A, it's generally considered legit.

This was the method used by Compaq to implement the IBM BIOS, back in the day.

Yes, and we agree that team A's work needs to be done before team B can work. A's work is exactly what this DMCA notification tries to shut down.

> A's work is exactly what this DMCA notification tries to shut down.

What part of A's work involves distributing deobfuscated binaries to the general public?

Team B should only have access to the specifications developed by Team A and nothing else from them. That's the whole point of isolating them.

Yup, I'm with you. I don't think anyone's done anything like that for Skype, though.

Seems like you could do it in theory by observing the protocol in action. Set up two machines a-skyping, sniff all the packets, think real hard, write code that interoperates with Skype clients.

And how do you propose to decrypt those packets?

didn't you see the part about "think real hard"?

Maybe then you're not "supposed" to do it?

For example, there are no reverse-engineered libraries for FaceTime and the iOS API. If you use Apple's libraries, they can come after you for copyright infringement.

Cappuccino is exactly such a reverse-engineered library, as is GNUstep. For that matter, Linux is a reverse-engineered drop-in replacement for the Unix kernel.

I'm not familiar enough with Cappuccino to comment, but GNUstep was an independent implementation of the published OpenStep interfaces. That isn't reverse engineering.

Linux is an independent implementation of the POSIX interfaces.

You can reverse engineer anything that you can reverse engineer. "Supposed" really doesn't come into it. If you buy a copy of OS X and reverse engineer some libraries, then publish a description of their behaviour, so what?

That's fine, but if you publish the modified binaries of OS X's libraries with hacks to make them work in Linux along with the description, you will be in legal hot water.

Are (were) they distributing the patched binaries or just the tools to do it?

It's hard to tell. The author is clearly not too fluent with English, so it's difficult to tell exactly what he's saying. It sounds like it's based on a decompiled and recompiled version of the core Skype kernel dlls and not a clean room implementation.

There was a deobfuscated version of Skype 5.5 in the downloads section of the project, not in the repo.

The code in the repo contains a reimplementation of the Skype protocol (RC4 cipher, 41 and 42 encoders) and a sample tool capable of sending an IM to a Skype peer.

There were also some tools for extracting the user certificate from a Skype profile.
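For readers who haven't met RC4: below is a minimal sketch of the textbook algorithm as it is publicly documented. It is not the project's code, and it assumes nothing about how Skype derives keys or frames data; the function name `rc4` is just a label for this example.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: encrypts or decrypts (the operation is symmetric)."""
    if not key:
        raise ValueError("key must not be empty")

    # Key-scheduling algorithm: permute 0..255 under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: XOR each input byte with a keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)
```

With the widely published test vector, `rc4(b"Key", b"Plaintext").hex()` gives `bbf316e8d940af0ad3`, and running the output through `rc4` again with the same key returns the plaintext.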

Right. The code has now moved to a better-named repo: https://github.com/skypeopensource/epycs

How about "Skype: The Missing Technical Manual"?

MS will contribute nothing to the state of the art of free voice and video calls over the internet. As is so often the case, what they purchased with the Skype deal was a user base. One that they could never obtain with their own products.

There is a way to do this without violating copyright.

Do not waste time duplicating the P2P element of Skype (the P2P protocol). P2P protocols have been done, several ways, some of them are easily good enough, maybe even smarter than Skype's (e.g., avoiding the exposure of your IP to the entire internet) and enough of the code is GPL'd or BSD licensed to keep things open. We have ample solutions for P2P. View that as the "open platform".

Now you need "apps" to run on it. First one is a softphone, but with Skype's codecs.

Focus on creating a standalone softphone using Skype's codecs.

Does MS have exclusive rights (patent rights) on Skype's codecs? Not even close. They did not develop them. The patent license could fit on a single page; it's as simple as they come: build stuff, pay nothing.


If you don't want to be compatible with Skype, you don't need their codecs. There are plenty of good voice codecs around, freely licensable.

The issue is how to tap into the existing Skype userbase -- receive and make calls to Skype clients -- from an open-source client.

I agree tapping into the existing userbase is enticing. And that's no doubt what some people are trying to do (e.g. the Russians).

But not all VoIP codecs are created equal. Skype's success is not due to NAT piercing. Even though Skype is easy to use, maybe easier than previous SIP alternatives, if calls sounded terrible, people would not use it. Skype's success is due to being usable and having decent sound quality. They did not use the decades-old codecs other softphones used. They wrote new ones. And anyone can use them.

Using the same codec as Skype uses should not in any way bind you to their network. It has nothing to do with compatibility. It has to do with getting Skype-level sound quality. Quality that the older codecs have failed to deliver.

It's easy to get people to sign up for free voice and video calls. The key word is "free". You do not have to find inroads into the "Skype user base". Skype spread by word of mouth. If people learn about another client that works as well or better (same sound quality), and it's easy to use, they will almost certainly try it.

Forgive me if I have misunderstood what you were trying to say in your comment. But I do not understand your reasoning.

Yes, they purchased users, but also a lot of patents which go far beyond codecs http://www.google.com/search?tbm=pts&tbo=1&hl=en&...

Read it please.

Demand for Immediate take-Down: Notice of Infringing Activity URL: https://github.com/downloads/skypeopensource/skypeopensource...

It's not for the repo code. It's for the patched binary, yes.

even better link:


(rendered into human readable)

That is the exact link I posted?

funny, because i got to a page showing the diff, and i thought i linked to the rendered page.

anyway, my apologies, no disrespect, have a nice day.

This seems legitimate and they are well within their rights to do this as the copyright holder.

The author's website is http://skype-open-source.blogspot.com/

Apparently he had already been served with DMCA takedown requests, and he talks about deobfuscation. Doesn't sound like a cleanroom implementation.

Right now we are working on stage 1: making working code and writing the specification. The takedown was because of the patched binary, which reversers used as a black box for writing specifications.

This project was recently discussed on HN at http://news.ycombinator.com/item?id=3899829

You linked to a discussion of a tool that lets you look up the IP addresses of Skype users (which seems like it might have been developed with this tool), but I think you meant to link to https://news.ycombinator.com/item?id=3753155

Was the code mirrored anywhere offshore before the takedown? Could someone re-host it?

Can someone explain why Microsoft would go to such lengths obfuscating the Skype code and taking down derivative work?

Is their IP worth the trouble? Do they see interoperability as a threat? Are they afraid of misbehaving clients?

Surely it's not just a case of a lawyer having too much time on his hands.

We would appreciate your help working on Epycs, the reverse-engineered Skype project.

Feel free to study the current implementation stage here: https://github.com/skypeopensource/epycs

This is the first stage of the 'clean room' reversing process: writing the specification.

If I am not mistaken, it's not the first time Skype has sent a DMCA notice to this project. But given where the author lives, the DMCA does not apply, so it doesn't go anywhere.

AFAIK the DMCA takedown applies to Github, who are in the US, regardless of where the author of the project lives.

This probably has something to do with this exploit: http://pastebin.com/yhcrRSVh

So I guess they should rename it Skipe open source.

"Skype: patched and recompiled to do other stuff" is a better title. Decompiling a proprietary product, changing a few lines of code, and then releasing that code doesn't make it "open source", it makes it stolen code (semantics of "piracy is not theft" aside, it's still not his code to be releasing in its entirety).

You are wrong. Decompiling just allows deeper research into the Skype black box. The project itself is here: https://github.com/skypeopensource/epycs

If you want to stay within the law my understanding is that you need to distribute patches against decompiled Skype rather than decompiled Skype with patches applied, otherwise you're distributing a derivative work. This may still be a legal grey area though. IANAL, you ANAL, we all ANAL and that.

The name falls under trademark protection and is not related to copyright.

So much for Microsoft loving the open-source movement.

heh, did you look at the project? It's not open source, it's a decompiled Skype binary.

No, it's not. The decompiled binary is only needed for reversing. The open-source clone of Skype is Epycs (reverse-engineered Skype). https://github.com/skypeopensource/epycs

I think you'll have some trouble fighting MS to prove that your open source version isn't directly based on code you got from the decompiled binary.

As others have said, it's better to follow the clean-room process and define the specification. Then someone else can implement it in a legally safe manner.

The specification by itself would be an enormous contribution.

I guess Google too hates open source since they won't release their proprietary modifications to the Linux kernel and their search engine code and will go after anyone getting them and putting it on Github.

That doesn't include Google's modifications.

From http://lwn.net/Articles/357658/

And there's a lot in that tree. Google started with the 2.4.18 kernel - but they patched over 2000 files, inserting 492,000 lines of code. Among other things, they backported 64-bit support into that kernel. Eventually they moved to 2.6.11, primarily because they needed SATA support. A 2.6.18-based kernel followed, and they are now working on preparing a 2.6.26-based kernel for deployment in the near future. They are currently carrying 1208 patches to 2.6.26, inserting almost 300,000 lines of code. Roughly 25% of those patches, Mike estimates, are backports of newer features. .. Linus asked: why aren't these patches upstream? Is it because Google is embarrassed by them, or is it secret stuff that they don't want to disclose, or is it a matter of internal process problems?

If it's like android, it might be much closer to upstream now (your article is from 2009).

MS can't catch a break... They defend their copyrights and get slammed on Hacker News for not being "cool" with the OSS community infringing on their properties. Weak sauce, Hacker News.

Where do you see people slamming Microsoft in this thread?

Posting this just implied, to me, that Microsoft is evil for putting in a take-down request. Why else post this? There are literally hundreds of take-down notices on GitHub. Why post this one?

Flag the post and move on. I don't see any Microsoft hating here, most of the posts are about Microsoft being in their right to take down a project that uses their copyrighted binaries and is distributing them wholesale after adding some extra code...

PG has declared all copyright null and void.

I think you are being downvoted because you forgot the qualifier "if it is too difficult to enforce"
