I wouldn't venture to say this doesn't belong on HN since it really is
interesting (if it was actually done correctly), but the files available
for download are most likely illegal, were most likely created with
pirated tools (IDA Pro/Hex-Rays, and yes, as a customer of theirs for over a
dozen years I've reported it), and of course, the usual vilification of
If you're reading this on a desktop or laptop system (rather than a
phone), then you are most likely using an "IBM PC Compatible" even if
you're using an Intel based Apple, and hence, you're using the fruits of
completely legal reverse engineering.
The way to do reverse engineering legally is to have one team reverse
engineer the target and completely document how it works. Once it's
documented, another disconnected team writes a new implementation from
the documentation. This process is how you're using an IBM PC
Compatible today, so yes, reverse engineering for compatibility is
If there is a patented algorithm required, it's not a sure thing. There
are most likely compatible ways around the patent, but there's also the
fact that the patent is only valid in the US. With open source hosted in
some other country, who are you going to sue? The users in the US?
--Nope, users are the ones paying for skype.
You might say, "But we forbid reverse engineering in our license!!!"
Contract clauses forbidding reverse engineering are invalid in many
countries and jurisdictions, and of course, you also have to prove the
other party agreed to the contract/license. With this said, it's very
easy to create an international jurisdictional nightmare to render any
such contract clause tactically impossible to enforce.
The easiest way to think about this is security research. The folks
finding and reporting exploitable flaws in software are obviously
reverse engineering it. Occasionally companies have tried to legally go
after people who have published security research on their products, but
usually this ends very badly for the company. Additionally, doing
security research is protected use in some countries and jurisdictions.
In short, competition is good for markets, and competing by studying and
mimicking the competition is both normal and legal.
For the "rights" advocates out there, there are legal problems with the
three file downloads available:
1.) According to the first file name, the original binaries are being
redistributed which may be (and usually is) against the license terms
and default rights granted by copyrights.
2.) The IDA Pro database (most likely) contains the entire target
binary, so you do have (illegal) redistribution of a copyrighted work.
You can load only parts of a target binary into IDA, but that doesn't
matter since it is still a portion of the original work. Whether or not
said portion could fall under fair use is debatable (i.e.
lawsuit). In general usage, the entire binary is loaded, since without
it, you're limited to static analysis (i.e. no debugging).
3.) Decompilation, and to a lesser degree disassembly, are equivalent to
"machine translation" in the sense of copyright. Creating a translation
is considered creating a "derivative work" and unless you have been
given rights to create derivative works, then you're in trouble. One of
the comments here on HN claims the "source code" file is the output of
the Hex-Rays Decompiler.
I've never used skype and I've never read their license so I don't know
if they specifically allow redistribution.
I have no love for skype or microsoft, but if this had been done
CORRECTLY by releasing written documentation so an entirely new
implementation could be written, then I'd have no problem with it.
There are right ways and wrong ways to legally create compatible (open
source) software through reverse engineering, and this is a perfect
example of the wrong way.
HN is an internationally read site. Things that are illegal in some countries are not illegal in others. As responsible citizens, it is up to the individual to not engage in illegal activities in the region said individual is in.
"The way to do reverse engineering legally is to have one team reverse engineer the target and completely document how it works. Once it's documented, another disconnected team writes a new implementation from the documentation."
Yes, it is a common silly practice that stems from the real madness that is copyright law. Considering that the documentation passed between the two teams contains all the information needed to make the software work correctly, I wonder what makes it different from source code. I could easily write a code generator that would be fed a "documentation" file and generate the C code that creates the final program. Hell, a C program is a specification on how to generate a given binary code. I wonder how often this really happens behind the doors at these "clean room implementation" teams.
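To make the code-generator point concrete, here is a minimal sketch in Python. The spec format (one `name: ctype` field per line) and the names `generate_c_struct` and `EXAMPLE_SPEC` are made up for illustration; nothing here describes Skype's actual protocol or any real clean-room document.

```python
def generate_c_struct(struct_name, spec_text):
    """Turn a hypothetical 'field: ctype' spec into a C struct definition."""
    lines = [f"struct {struct_name} {{"]
    for raw in spec_text.splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue  # skip blank lines and comments in the spec
        name, ctype = (part.strip() for part in raw.split(":", 1))
        lines.append(f"    {ctype} {name};")
    lines.append("};")
    return "\n".join(lines)


# Hypothetical "documentation" file, invented for this example only.
EXAMPLE_SPEC = """
# header of an imaginary packet
version: uint8_t
length: uint16_t
"""

if __name__ == "__main__":
    print(generate_c_struct("packet_header", EXAMPLE_SPEC))
```

The more mechanical the "documentation" gets, the closer it comes to being source code in another notation, which is the line the comment above is poking at.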
Yes, but the Hex-Rays decompiler is not. I can't touch the files, so I can't tell you what version was used, but in the comments here, there is a claim that the supposed source code is Hex-Rays output.
I think you actually want two answers, the question you asked and "why
The IDA Pro disassembler and the Hex-Rays decompiler are not only very
expensive tools, but they are very difficult to purchase. Due to
constant problems with piracy, these days they will only sell their
products to three areas: (1) governments/law enforcement, (2) very well
established corporations (typically well known security research
people), (3) very well established university researchers.
Typically, they refuse to sell to individuals, but there is a fourth
class of customers who are individuals; very old customers like me who
have a perfect track record of maintaining possession of their copy of
Every copy of the software is custom compiled and watermarked so it
is traceable to a particular person. Every database created by the
software is also watermarked, so when someone who is not a licensed
customer publishes a database (.idb), the software can be traced and
the account will be terminated (i.e. no further purchases allowed).
When someone does something blatantly stupid like disassembling and
decompiling skype then publicly making all of the files available, it is
fairly certain that they are using an illegal copy of the software. They
do not understand what they're doing. They do not understand the tool
they are using. And they don't have any respect for either the tool or
the work of others. --All of this loudly screams PIRATE!
The pirates either don't know about or don't care about the watermarks
in the databases they create. They don't realize that publishing a
database is discouraged. I've never heard of a case where a database
watermark was successfully forged (i.e. pin the blame on someone else),
but a cracker named "Quine" once successfully removed the watermarking
in IDA back in the late 90's.
The "correct" method to publicly share the research work done in IDA is
to dump the database to an IDC script (an internal language), then
provide the IDC script and the target binary. Customers know this, or at
least they should. With that said, friends do toss databases back and
forth on occasion, but that's a matter of trust between friends where
both of them are customers. Some people in the InfoSec and AntiVirus
crowds exchange databases, even across competing corporate lines since
they're all working together towards the same goal and they've known
each other for years.
This copy of IDA was probably pirated for the same reason Photoshop is usually pirated: because it's expensive. But you don't know it was pirated.
Also: by editing your comments to account for the responses, you make the thread incoherent. I'd appreciate it if you wouldn't do that, or, at least, if you must do it, to do so in corrections at the end of your comment. It's fine to be wrong. I'm wrong all the time.
That is simply incorrect. I do not "know" Ilfak. I just emailed him, discussed the cost of a student license, provided proof of being a student, and filed an order form for IDA Pro Standard 6.0. The only thing that at all fits with your story is that a bank transfer was required, instead of paying by credit card, but I believe that is only for students.
Even so, Hex-Rays does sell to individuals. It's not even necessary to ask Ilfak: if copies are being sold to individuals, then they sell to individuals. And those copies are being sold. Here's a picture of my CD, purchased this year, as an individual: http://dl.dropbox.com/u/3177211/idaomg.png
That is incorrect. I purchased IDA Pro as an individual this year, and I am simply a student interested in reverse engineering. (I believe what you are saying may be true for the "advanced" version, but from what I can tell anyone can purchase IDA Pro Standard.)
Assuming that the deleted parent is about the difficulty of purchasing IDA, as you know, it's only somewhat incorrect. I have wanted to buy IDA for almost a year now, but Hex-Rays is very picky about how they receive their money; I could probably arrange for it somehow but it is a completely inordinate amount of hassle. (And no, bank transfers are not only required with a student discount; I was willing to pay twice as much to avoid the requirement but it wasn't possible.)
Improving our tools is part of our birthright and responsibility; being able to modify and learn from software is a natural outgrowth of that. We of all people should not respect work intended to discourage collaboration by anyone who isn't "established" (granted the privilege of relating to software as a human being, not just a consumer).
Thank you for the insight into your field. Now I'm sorely tempted to try my hand at decompilation.
I've purchased IDA Pro for years for legitimate reversing work, but on the rare occasion that I need to do some more dodgy work for clients, where I don't want to reveal any identity (previously name, now license number) via the watermarks in the database, I will use a pirated version of the software.
My point is that it is not possible to know for sure if the user of pirated software is indeed a pirate, as there are reasons of privacy to use these editions of IDA (as well as the most common one of just not paying for it in the first place.)
As to the question of whether Bushmanov has used a pirated edition of IDA for his work, it's interesting to note that the distributed .idb files are in two different formats - as far as I can tell versions 5.2 and 5.5, but the license key is the same for both: A2-86E4-B9BB-D3. It's not one I recognise from any of the common pirated versions but I suppose only Ilfak could tell for sure.
yeah, the author of the blog post said the stuff he has came from VEST. so it's basically a POC based on the code released 2 years ago. if you google it there's a blog post by some other random guy who made a python plugin and some POC code from that stuff too
You are mentioning Compaq BIOS. Think Samba though.
Even OOo/LO .doc support is based on 1-2 FTE revEngs (which btw is dumped mfc/w32 memory on a FAT, but read Sun/IBM anyway).
The team will write a public spec, print it out on paper, and another team Down and Under will scan it and create new code (think RSA patent export). The skype protocol has long been reverse engineered and is available to several parties.
I'm not a US citizen or a lawyer but is the separation of implementation and exploration really required by the copyright laws (ignoring patent issues)?
I can understand that exploration/implementation division as a preemptive "don't sue us" move, but does US copyright really provide such strong protection that someone who has looked at a decompilation can't write an independent implementation? It seems to me that writing an implementation with a different structure or in another language ought to be different enough for copyright reasons.
>If you're reading this on a desktop or laptop system (rather than a phone), then you are most likely using an "IBM PC Compatible" even if you're using an Intel based Apple, and hence, you're using the fruits of completely legal reverse engineering.
Not if you are booting via EFI, for example if you are booting Mac OS X on an Apple.
The way to do reverse engineering legally is to have one team reverse engineer the target and completely document how it works. Once it's documented, another disconnected team writes a new implementation from the documentation.
So you'd need Skype's co-operation to do this? They are able to prevent reverse engineering by not writing the documents?
Suppose you and I work for the same company. I bust open Skype through decompilation, reading memory, the network, whatever trick I want. With that, I write documentation for how Skype's protocols work.
You read my documentation, and implement it in a new program. Since we haven't talked, and you've never seen a line of Skype's code, you haven't infringed on any copyrights.
It is important to note, though, that this does not necessarily protect us against a patent suit.
If it is done correctly, co-operation from skype is not required. The team that does the reverse engineering will write the specs and documentation from what they learn by examining and analyzing the executable binaries.
While you are correct about this particular instance of RE, I want to just take this opportunity to remind you that hands-on black-box RE is the technique used to create many of the drivers you see in Linux and BSD. Prior to AMD and Intel releasing video card documentation, every video card supported through community drivers was usually best-understood through RE experiments.
If the goal is to open up Skype, this isn't the way.
Even if some insane insomniac de-twiddles the pages upon pages of optimized indirection in this code (which I seriously doubt), all Skype has to do is tweak the protocol or encryption and the researcher is back to square one. It's a losing battle. And that's not even getting into the legality of it all.
How about instead of trying to fruitlessly crack Skype, we spend the time making something that's both open and better?
If the system can successfully masquerade as an older Skype version it stands a chance. That is unless Skype has a baked-in not yet understood mechanism for pushing protocol changes to its clients. Skype would have to find distinguishing features, implement checks for them on their servers and possibly even push them out in their client updates. Whenever such distinguishing features were found it would be a simple matter of an arms race, i.e. a difficult but fair chance. Alternatively Skype can start blocking older versions, which is rather unlikely.
Concerning the open and better issue: there definitely are open alternatives. None of them have quite the firewall-defying capabilities of Skype. Nor the user base for that matter. Building any kind of social network is fraught with chicken&egg problems and those first to reach mass have it made. Just check how Google, one of the richest and most powerful technology companies, is struggling to get a foot in Facebook's market.
There is WebRTC (http://sites.google.com/site/webrtc/) for Real Time Communication/Conferencing which Google just open sourced. I think that's the way forward rather than developing a separate "Skype-killer" protocol/app.
Both seem quite valuable. Long-term, we want an open peer-to-peer encrypted communication system. Short-term, until Skype dies, it would help to have the ability to interoperate with people who use it. Similarly, while XMPP represents the right open standard for chat, existing Open Source chat programs still need to know how to interoperate with MSN, AIM, and Yahoo.
Are they going to force an upgrade to the Skype client, in order to enforce a change to the protocol? That would not be a great move. A lot of us are using older clients, due to how much we loathe the newer versions.
I don't believe it's possible to secure any sort of intellectual-property protection for a mere protocol. The usual way of protecting them, though, is to patent some essential feature needed to implement the protocol, which may or may not be the case here.
Actually, as I've said elsewhere, one easy way of protecting a protocol is to explicitly restrict the right to reverse engineer in the Terms of Service of the client that implements the protocol. Without that client, there's nothing to reverse engineer.
The extent to which ToS are enforceable has yet to be tested, especially wrt people who aren't even using the client. I mean hackers who just take advantage of the information released by someone who actually broke the ToS. How are they even bound by the ToS?
Actually, no. If you look at those files, you're "tainted" and can't be the one who writes a new implementation. The correct way to do reverse engineering for compatibility is to have two completely separate teams. The first does the reverse engineering and writes the specification/documentation. The second completely separate team takes the specs/docs and writes an entirely new implementation.
This is the process used to achieve the "IBM PC Compatible" system you're probably using right now (including your Mac). Reading up on the development of the Compatibles is a good way to understand how to do reverse engineering correctly.
You should always emphasize that it's the correct way _in the US_. As somebody already mentioned, HN readership is international, and said restrictions on reverse engineering do not apply everywhere. Also the author, judging by his name, doesn't seem to be a US citizen.
Of course that was not the way the "IBM PC Compatible" market arose. IBM published a rather complete set of documentation of the system, including all interface signals and the BIOS source code. I still have several of those documents on my shelf. It is completely different from the complete lack of Skype technical documents.
Then the cloners moved at warp speed. According to Wikipedia, the PC AT shipped in 1984. For nostalgia, I kept my copy of IBM Personal Computer Hardware Reference Library Technical Reference, Pub #1502494.
"This manual describes the various units of the IBM Personal Computer AT and how they interact. It also has information about the basic input/output system (BIOS) and about programming support.
The information in this publication is for reference, and is intended for hardware and program designers, programmers, engineers, and anyone else who needs to understand the design and operation of the IBM Personal Computer AT."
It includes the source listing of the PC AT BIOS, as well as complete interface pinouts, etc.
The colophon for this manual reads
First Edition (March 1984)
So what is your time line for IBM only publishing this manual after the PC AT was cloned?
I was an early Compaq employee. The documentation produced by the research team was vetted for anything not descriptive of behavior, then forwarded through lawyers, who logged each document, to the engineering team designing Compaq's compatible BIOS from the functional specs. A weird side effect: the process reproduced BIOS-level bugs for complete compatibility.
Any software available for free will end up on rapidshare-like page where you can get the binaries and analyse them without accepting ToS or even installing the software. ToS is pretty useless for protecting against RE, since you don't need to look at it.
While copyright wouldn't apply to an independent reimplementation, the article included links to decompiled versions of the Skype binaries, which would definitely fall under Skype's copyrights. Nothing wrong with using those decompiled binaries to reverse-engineer and document the Skype protocol, and I hope this produces useful results there, but that doesn't make it OK to directly redistribute the decompiled binaries.
IANAL, but might the DMCA exemption on reverse engineering for program-to-program interoperability possibly apply in the US?
Of course I have NFI what country the skype-open-source poster is in. FWIW the blog host (blogspot) is obviously in the US, the depositfiles.com file host has DNS registered in Seychelles but seemingly resolves to a US server...
No, but he seems factually correct that the files are IDA Pro output from Skype binaries, and a patched/modified version of a Skype binary.
This is as opposed to publishing observations/specifications from looking at such dumps, or from a black-box observation of Skype's behaviour. In this case I believe jcr is 100% correct that what's being distributed isn't protected reverse engineering output, it's a derivative work of the original.
In short: Decompiling or cracking a program and posting it online with notes is not the same as reverse engineering it, although it's a step in that direction.
It wasn't an ad hominem attack and I have nothing against jcr. I don't even disagree. I just felt it was important to advise people to check their sources since the parent comment seemed to be putting a lot of trust in a comment that could easily contain misinformation.
- For every bit of the subject matter that I've learned, I can name at
least a half dozen people who know that bit better than I do.
I would suggest not wasting your time trying to authenticate me, the
source, but instead, put your effort into finding outside authentication
of the statements. The source in this case really doesn't matter, but
outside confirmation really does matter.
And things like compression algorithms are patented, and it's very likely skype is using some of them. Reminds me of a project by Intel providing an implementation of g729 (a voice codec). The source was available, but it was "non commercial usage only", mostly because of the patents.
An interesting project, but I doubt we'll see any usable implementation anytime soon IMO. And even if we do, skype will probably alter the protocol a bit to make it fail if it reaches a critical mass.
"All rights reserved" is not an obstacle to an OSI license. If you open source some code then you can still reserve all your rights to the code. This means that you can redistribute/sell/license it under any other licensing schemes you wish to. Other people, of course, can only use it under the license you grant them.
From what I understand, the big unsolved problem open VOIP options have isn't the voice bit, it's negotiating a connection through NAT/firewalls/dodgy routers etc. I doubt this helps with those problems.
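For anyone wondering what the NAT problem looks like in code, one common approach is UDP hole punching: both peers send a datagram to each other's public endpoint so that each NAT opens a mapping for the return traffic. Below is a minimal sketch in Python that runs entirely on loopback, so no NAT is actually involved. The helper names (`make_peer`, `punch`, `demo`) are made up for illustration, and a real deployment would need a rendezvous server (assumed here, not shown) to tell each peer the other's public address. Nothing in this thread says this is how Skype does it.

```python
import socket


def make_peer():
    """Create a UDP socket bound to an ephemeral loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)  # don't hang forever if a datagram is lost
    return sock


def punch(sock, peer_addr, message):
    """Send a datagram to the peer; behind a NAT this opens the outbound mapping."""
    sock.sendto(message, peer_addr)


def demo():
    a, b = make_peer(), make_peer()
    try:
        # In real life a rendezvous server would hand out these endpoints.
        addr_a, addr_b = a.getsockname(), b.getsockname()
        # Both sides send first, then listen for the other's datagram.
        punch(a, addr_b, b"hello from a")
        punch(b, addr_a, b"hello from b")
        data_b, src_b = b.recvfrom(1024)
        data_a, src_a = a.recvfrom(1024)
        return data_a, data_b, src_a == addr_b, src_b == addr_a
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

The hard part in practice is everything this sketch leaves out: symmetric NATs that pick a new port per destination, firewalls that drop UDP entirely, and falling back to relays when punching fails.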
Backwards compatibility with linux clients probably isn't the top priority of Skype now that they're being rolled into MS, but I suspect embedded deployments on handheld devices might still keep them from deprecating.
aside from the awesome technical exercise in hacking, i don't see this as any net benefit for VOIP.
the time would have been much better spent working on the GNU VOIP client, not only would improvements have been usable without legal issues, they would be there in an (ostensibly, perhaps) understandable format - working code.
Skype voice and video calls route media directly among participants with two exceptions. First, some thin clients on mobile phones and a few embedded devices don't connect directly to the Skype network, so media streams through Skype gateways operated by Skype or by Skype mobile operator partners. The other exception is for group video calling, a premium service, which redirects video streams through Skype servers to push computational loads for media transcoding from the desktop or mobile clients to Skype's cloud.
Has anyone actually read the revealed code? Aside from the commented copyright text at the top, can you explain what the code does well enough to document Skype protocols? Is this in any way useful if you want to talk to Skype servers or clients?
I haven't tried Skype since 4.0, and I just tried it again now. When did it become such bloatware? I don't think I'll ever want to use it again if it stays this way. The new interface looks pretty confusing, too.
It seems that the encryption algorithm has been reverse engineered. I guess you'll still need the keys to decrypt the voice data using this algorithm, assuming it works. It's a big deal if it has been done, because a lot of people have been trying to crack it. Some governments are going to love this. The Skype client itself has a lot of obfuscation to prevent something like this.
I think it's fair to surmise that those intelligence agencies that care have probably had the algorithm for a long time and searched for weaknesses. Bear in mind that at one time they were complaining about its use by criminals to avoid phonetaps.
There has been some speculation about a backdoor in Skype which it has shared with intelligence agencies. Never confirmed by Skype of course. But this could allow anyone to decrypt a Skype conversation stream. All you need is a Skype supernode to get started. Or some kind of spyware on the subject's computer which stores/transmits the data stream.
Easier said than done. The most reliable way to dump audio output is by API hooking, which isn't easy in the first place and is detectable by AV, Spybot etc. It's much simpler to intercept the network traffic.
Quite possibly, and certainly I think they could get it if they really wanted. Certainly it can't be much worse than typical public email providers where law enforcement seems to pretty much just have to type an email into a form to browse your inbox.
The ability to interoperate with Skype as one of the participants in a call has little to do with the ability to intercept and decrypt Skype calls as a non-participant in a call, unless Skype uses broken crypto (a very real possibility).
You come up with a name based on the observed behavior. I'm honestly amazed that they've made it as far as they have without decent naming conventions -- the use of good identifiers in your IDBs cuts reversing time way, way down.
1. I would love it if MSFT, as the new owner, gave up on the cat-and-mouse of security through obscurity and obfuscation and settled on a published and peer-reviewed protocol. I am sick of the memory footprint and cpu spikes in having to run skype clients because 70% of its resources are dedicated to hiding what is really going on. I would love a nice, clean, light version.
2. we can well assume that if this is happening in the public domain then it was probably done a few years ago behind closed doors at the NSA et al