- Ghidra is basically the first real competitor to IDA Pro, the extremely expensive and often pirated state-of-the-art software for reverse engineering. Nothing else has come close to IDA Pro.
- Ghidra is open-source, IDA Pro is not.
- Ghidra has a lot of really cool features that IDA Pro doesn't, such as decompiling binaries to pseudo-C code.
- It's also collaborative, which is interesting because multiple people can reverse engineer the same binary at the same time -- something IDA only got VERY recently.
Ghidra also appears to have a functioning Undo operation, which IDA seems to still not have. Being able to make changes without worrying about your IDB accidentally becoming unusable is huge.
Context: in IDA, certain changes you make can inadvertently wipe out a lot of work - for example, undefining a function (U) can erase all your annotations in a single keystroke; defining a return type incorrectly can completely mess up callers, sometimes to the point where they won't even decompile properly; a typo in an array size argument can obliterate the stack and every variable annotation you made on it; and so on. Many of these require much more work to undo than simply reverting the change you made, so a functioning undo is a big deal.
Some more comparisons:
- Ghidra's type system is nice, and in some ways nicer than IDA's. Semi-automatic struct inference rocks, and it comes with a big type library.
- Ghidra will decompile code from a dozen different architectures. IDA will only do x86, x64, ARM and AArch64 (and you pay for each of those separately). In theory Ghidra could even decompile a custom architecture if you implement your disassembler backend thoroughly enough.
- Ghidra's UI is marginally worse than IDA because it's implemented in Java Swing (compared with IDA's Qt).
- Ghidra and IDA both use Python for scripting. However, Ghidra's Python is actually Jython, which gives it access to the entire state of the system (minus the decompiler, which is native code - but you can interact with all the code that drives the decompiler). This is really big - the API surface of the entirety of Ghidra is pretty massive, so the scripting opportunities are similarly exciting (see the sketch after this list).
- Ghidra has a (mostly functional) patching interface which understands assembly. IDA Pro, despite costing many thousands of dollars, gets confused when you try to assemble something as basic as "mov rdi, rdx" in 64-bit code. (There's an outstanding bug which breaks ELF files - but being open-source, I'm sure it will be fixed soon)
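As a rough feel for what that Jython scripting surface looks like, here's a minimal sketch (run from Ghidra's Script Manager) that walks every function the analyzer found; currentProgram is a state variable Ghidra injects into scripts, and the rest is the ordinary Java API exposed through Jython:

    # Minimal sketch of a Ghidra Jython script (run via the Script Manager).
    # currentProgram is injected into the script's state by Ghidra.
    fm = currentProgram.getFunctionManager()
    for func in fm.getFunctions(True):   # True = iterate in address order
        print("%s @ %s" % (func.getName(), func.getEntryPoint()))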
I don't think I've ever seen an application where a working undo function was added a posteriori. You either design it in from the start, or you don't get a working one, or you have to rewrite most of the application.
I think BinaryNinja has been acting pretty effectively as a competitor to IDA Pro. It's much cheaper than IDA, has a good API, and I have been a very happy customer.
I'd like to hear the rest of this story; I was considering buying IDA Pro (though these days I'm having a lot of fun adding M68k support to Avast's Retdec).
No, they never replied at all. Their self-service site broke and they completely and utterly ignored my emails to the associated service address and to a number of other addresses posted on their site.
I grew up using, ah, other methods of satisfying my need for an interactive debugger and those methods continued to be viable after giving hex-rays $1100 and getting flaked, so I wasn't materially impacted by the flakery, even though I "should" have been. Had I been materially impacted rather than merely angered and frustrated, I probably would have tried other escalation paths -- phone, twitter, maybe even snail mail -- and I suspect I eventually could have gotten through, and my expectation absent evidence to the contrary is that if I were to have gotten through they would have helped me out. My takeaway would be "their self-service site is rubbish and they force you to use a company email address (as opposed to your gmail) and then their support email servers sometimes silently drop messages from your company email address, or something," not "they're crooks." On a professional app sold at a professional price, though, that's still not a good thing, and it informed my software choices going forward.
This is really strange to hear. What self service site did you use? Our website chat goes right to a slack channel that multiple folks monitor and reply to at all hours of the day. Worst case if we're all sleeping and you leave an email we respond when we're awake.
Just searched for your username in our chat and our email and don't see anything so I assume you've got a different email?
My problems were with an IDA Pro license I purchased from hex-rays. I have no complaints about BinaryNinja. If you represent hex-rays I'd be happy to PM you my business email.
I'm sorry you had a bad experience. This is the first I've heard about it! We normally get nothing but praise during any customer support interaction. Feel free to email me directly (jordan at vector35 com) with your email address so I can try to figure out what happened. Apologies it didn't go well.
Email privately and get noted for life. It's really annoying when someone complains publicly and they ask you to email in private. Why don't you go through the old emails and provide a solution?
Because it wasn't sufficiently clear to GPP that it was IDA Pro's support that was problematic. GPP incorrectly thought it was BinaryNinja support that was the problem, which isn't the case.
Being open-sourced is a big advantage. I just fixed a bug in GHIDRA relating to trackpad scrolling which makes it MUCH more usable for me. I could never do the same with IDA or Binary Ninja.
I do so love the shell code compiler of Binary Ninja, though. It works very well and has definitely saved me a lot of time.
If you have any pointers to a company / individual making a living building open source tools for developers, please let me know. (Working for a large cloud / OS provider that is subsidizing tool development as part of a platform play does not count.)
> So you're excluding deploying/maintaining Open Source Software as a service. That basically excludes how Open Source is supposed to get monetized.
Is it? This means that incentives would be wrong, because then developers would be incentivized to produce difficult to use (but useful!) software - with various poorly documented features, multiple ways to solve the same problems, poor and inconsistent UX... Oh wait... </s>
I am sorry Redis Labs got the heat for their license change and I really hope some solution crops up. I appreciate open source, but I am getting tired of poorly implemented systems, just because there is no incentive to do it differently.
I think the solution lies in "free-to-use (but not free-to-sell), source available" licenses. I haven't seen one that would convince me yet, but I am certain that with big-tech companies behaving like they do, more and more developers will think twice before giving away their work for free just so others can make billions off it (and take away income from the very companies providing bread and butter to FOSS developers - coughAWScough).
There are examples which other commenters have already given you, I see. What I'm curious about is why this matters; obviously we as users care about having open-source tooling (and we are in a thread discussing one such tool). Whether or not we can come up with examples of this off the cuff is completely immaterial to the constraint that proposed tools should be open-source.
The reason I asked the question was to find pointers to people to talk to about their experience making money off open source. I have been wanting to build tools in some domain and am struggling with how to monetize desktop based software tools in 2019.
Evan You (creator of VueJS) makes ~$16k a month on Patreon, plus an untold amount of fees for speaking at conferences. They also have an open collective which at the moment has a budget of just under $67k/year. It's hard to say what the money is actually used for, but at least some is used to pay for "VueJS maintenance" which probably goes towards paying Evan as well.
Thanks for this information, it helps. The issue for me is that the target audience is probably < 10k people globally. The tool would act as a serious productivity multiplier for those 10k people (and those people are very well paid). In Evan's case, his creation is not only awesome, it also has a large target audience.
Wrong answer. Selling support means that the developer is incentivized to make the product unnecessarily complex. Also, with recent developments, it turns out that other entities might be better at providing support than the original developers, taking away the option to monetize OSS. Why bother writing your own software when you can just sell services and provide support for a different one?
Note that RedHat, often quoted as a success story, is really a services company that just happens to write an occasional piece of software and support it when their customers need it.
There's a lot more going on than just incentives to make a product "unnecessarily complex". It's certainly a concern but many companies make it work. I judge this an acceptable answer, not a wrong answer.
> It's certainly a concern but many companies make it work.
They do? Maybe. But they are walking a thin line between their short term (make it complex so we can, you know, sell support) and long term (make it simple enough that people don't jump ship) commercial interests. This is the reason I call it "unnecessarily" complex - it is in the interest of companies who sell support services that the product is not as easy to use as it could be.
Do we really need this? I think not, and I actually prefer what Redis Labs, MariaDB and others have been doing with the licenses for their modules. Sure, Business Source license and similar are not open-source (as in "freedom to take the product you have built and sell services on top of it, driving you out of similar business as collateral damage"), but at least they provide developers with the incentive to produce easy to use software, and not just because they feel like it, but because they can actually earn their living from it.
There might be some exceptions that "make it work", but this is in spite of just selling services on top of their product, not because of it. The cards are stacked against them - it is much easier (and more profitable) to take another person's product and build on it than it is to build your own.
In my experience, selling support is an acceptable answer only in the eyes of would-be competitors. Otherwise it is just plain wrong. </rant>
> Restrictions. Subject to applicable copyright, trade secret and other laws, you are permitted under this License to reverse engineer or de-compile the Software but you may not alter, duplicate, modify, rent, lease, loan, sublicense, create derivative works from or provide others with the Software in whole or part, or transmit or communicate any of the Software over a network in order to share it with others.
Can't edit but I'll reply:
This was meant as more than a throwaway comment, please see the many discussions - Oracle's chief security officer got extremely upset by it.
Ghidra, on the other hand, you could, since even if they never get around to fully releasing the source (unlikely), they still granted us an Apache license on the whole thing :)
I think it's probably pretty unique right now in that it's under an OSS license without all the source available.
The java sources seem to all be there in zip files actually (as far as I can tell). The part I was most interested in atm (the decompiler) turns out to be some sort of native language compiled to an executable, and its source isn't there.
It's a funny situation, though: decompilation probably should cost a small fortune. If you're in a line of work that needs it, the quality of your decompiler is probably a huge factor in how valuable an hour of your time is, and many [most?] fields where people routinely decompile stuff are very highly compensated.
IDA has always had a weirdly low price point given the bill rates of people who use it, and it's interesting to see that price being competed all the way down to free.
> It's a funny situation, though: decompilation probably should cost a small fortune.
In the past, the same could have been said of compilers and even web server and mail server software.
> many [most?] fields where people routinely decompile stuff are very highly compensated.
If it's more freely available, and more people have experience with it, then the compensation might go down as the supply of people with this experience goes up. I'm not sure using salary as a justification for what a tool's price should be makes a whole lot of sense. To me, it just sounds like an inefficient market because there's not enough competition (justification on the grounds that it does much more than any competitor, and thus can command a premium, does make sense though).
Another way to think about it is that if any piece of professional software should cost a lot, a super-specialized piece of software that is hard to duplicate, is a near industry standard, and is used almost exclusively by people with high bill rates should be expensive. But again, my point is: IDA costs a lot less than its place in the market suggests it should.
I'm not arguing that a capable free alternative is a bad thing. I think there's an industry business case study in what Hex-Rays could have done to keep this from happening, though.
> I think there's an industry business case study in what Hex-Rays could have done to keep this from happening, though.
Is the fact that Hex-Rays is Russian one of the reasons why Ghidra exists? (Honest question.) If so, is there anything they could have done differently?
No, "the market" means financial considerations created an opportunity that was fulfilled to generate profit. It could be that the economy gained via this FOSS release, and that was part of the consideration, but that would be structural interference by the government (which can be a good thing IMO) working in opposition to the market.
Market forces are often reined in, eg by human rights, those things aren't part of the market operating they're mitigations of the damaging effects. In this case it's an external force (gov action), not the market, that has created the availability of the product.
Hmm? Large customer of leading widget manufacturer decides to make its own widgets in-house instead and keep the build minus buy money (plus possibly getting a better widget). Totally normal market practice.
Large player in widget market has low marginal cost and deep pockets, sells widgets at marginal cost its competitors can't match. Also totally normal market practice.
Competitors exit market or are relegated to minor market share, leaving de facto sole survivor. Also totally normal market practice.
We could be talking about chrome just as easily as IDA pro here.
> In the past, the same could have been said of compilers and even web server and mail server software.
I'm not sure the exact same thing could have been said which to me seems like a testament to how complicated software pricing can be. Web servers never really sold†, platform vendors eventually figured out it's better for them for compilers to be free (non-platform vendors still sell compilers), etc.
† back in the 90s, Netscape used to pester web companies to make their Apache installs lie that they are Netscape web servers.
Video game modders certainly use IDA. IDA's purchase price, though, is, ah, not an issue for them- not because they have lots of funds available, but rather quite the opposite.
To be fair, I don't think HexRays is oblivious to this dynamic, and to that end I think the freeware version they offer makes a lot of sense. Especially if it supports AMD64, which I'm hearing it does nowadays.
That's not going to prevent many people from taking the five finger discount I'm sure, since they'd rather have as many of the features as they can, but at least nobody can say HexRays isn't trying.
Yeah, but I think the big issue is the lack of decompiler. If you're new to RE, it's literally night and day between that and "assembly with stack variables renamed and some helpful comments". (Even Binja's MLIL is a huge step up from the annotated assembly IDA provides.)
Biggest limitation in the free version for me, as somebody who likes to tinker with old games as a pastime, is that it does not support any other instruction set than x86/x64 and not many executable formats either.
There are a lot of good and interesting games that were made in the DOS era for PCs which used the DOS4GW DOS extender, and their binaries come in the OS/2 executable format (LE/LX) which is unsupported in IDA's free version. A lot of good and interesting games also happen to run on game consoles which use non-x86 processors.
Ghidra probably won't have plugins to support all of these weird old legacy formats and CPUs which the full IDA package does for a while, but hopefully it'll get there eventually. If it doesn't seem too difficult, I might even try creating an LE loader for it myself.
I agree that the decompiler is amazingly useful... but it's pretty telling that there are really no alternatives that come close. I sympathize with hobbyists that pirate IDA Pro, but I personally wish we could do better than that.
Isn't that true for lots of software that's been driven down to zero cost, though? Like, say, TensorFlow? Given the business value of people who need to use TensorFlow, it "should" cost more than even IDA.
It feels like to stay in business with software like this, it has to be lucrative, but not too lucrative, or else FAANG companies (or occasionally governments, like in this case) will either gobble up or kill the market.
The pricing of IDA Pro is set to limit the amount of support work and to avoid liabilities. I don't know how it works now, but many years ago, as you were buying IDA Pro, they asked questions, and if anything seemed to imply that you wanted to hide the buyer's identity, they refused to sell.
That is, Hex-Rays do not want to have any business relationship with the proverbial would-be teenage hackers.
Outside of that, Hex-Rays is a small business which probably has around 1 million EUR/year of turnover, and they do not want to grow it much more. It was a Basecamp-style business long before DHH made the concept of anti-growth popular.
It still works like that. I practically had to beg Hex Rays to take my money. They were very skeptical of me at initial purchase, it took about 2 hours of email exchange and phone calls.
When I went to renew my support, they grilled me again. This was just a few weeks ago. I gave up and figured Ghidra was just around the corner. Looking forward to trying it.
I emailed them and told them a) I didn't appreciate being treated like a criminal (I won't get into the specifics, but one set of answers just led to another set of questions, even though I'm a consultant with my own company, website, physical address, company history, blog posts, etc. -- I work in security / reverse engineering of electronic devices)
I also told them I've never had to work so hard to give someone my money. Finally I gave up. Let the market speak.
I think IDA's lack of significant competition until now is nearly a textbook example of how charging a lot for a tool is no indication that the funds will go toward improving the quality.
What's been significantly improved in IDA over the last 10-15 years? Certainly not the x86 decompiler, which costs something like five times as much as IDA itself. The interface is still super-clunky and missing functionality like keyboard shortcuts for frequently-used functions.
I'm ecstatic that there's finally a realistic alternative.
Certainly the x86 decompiler improved! It hadn't existed. We also got graph view, a Python interface, a native Linux port using Qt, and 64-bit binaries.
IDA comes with amazing technical support. I've emailed complaints, then gotten a freshly-compiled build with a bug fix within a couple days. Funds are thus improving quality in ways that customers request.
In fact, the essence of decompilation is an NP-complete problem: Graph Isomorphism.
So far, our decompilers are just greedy schemes that approximate the original expressions as best they can by treating each instruction as a tree and then as a graph; even so, a single assignment can change the overall outcome of the code a lot, let alone correctly recognizing heavily optimized procedures.
Edit: Wikipedia said it is NP-complete, but I was pretty unsure about that. I think the better wording would be "at least NP".
Graph Isomorphism is not known to be NP-hard, that is we don't know a proof that a polynomial algorithm for GI implies NP=P. So it is "at most NP" rather than "at least NP", because GI is obviously in NP.
Reverse engineering doesn't pay that well-- IME at or just below par with software engineers.
So as one point of comparison you might look at the tools of software engineers, which are essentially all free today.
To get to hex-rays having a reasonable price you probably have to look at jobs like pipe welding where the equipment is expensive and the hourly high, but the comparison is much less direct.
Besides the usual ones, I've had to use IDA Pro occasionally for compatibility purposes in my job at a NAS vendor.
There are lots of apps that make lots of assumptions about how filesystems behave, generally based on the local filesystem and maybe on one popular networked filesystem for the platform (NFS, SMB, AFP).
If one of those assumptions is violated, applications can crash or refuse to interact with you. Some just refuse to write to any networked filesystem. Some run only on whitelisted filesystems. Some will hit an error due to an unsupported operation on your filesystem, fall back to some ancient code path using long-since deprecated Carbon APIs that only work properly on 32 bit systems, and so truncate all of your data to 2 GB.
Problems like the latter are really helped by being able to do some reverse engineering of the application to figure out why the heck it just writes out the first 2 GB of the file.
Because this isn't our bread and butter but only an occasional tool in our toolbox, the licensing on IDA Pro can be rather frustrating. We use it only once every couple of years to debug some kind of compatibility issue like this, and so we usually have to dig around to figure out if we still have valid licenses, deactivate systems that we're no longer using, and so on.
Would you mind answering some questions if you're familiar with the area (edit: hah, just noticed you posted the OP to this whole thread)? What are some examples of firms that are involved in this work? Is it mostly a collection of smaller shops/individual contractors? After a cursory search, I seem to be seeing a lot of groups/labs comprised of relatively few people. Why are there so many references to high bill rates in these comments? Is the pay especially notorious? That's something I haven't heard before.
> My office is in the same building as BitDefender. (...) They mostly hire their researchers straight out of college if they have high C proficiency
Also partially OT, just wanted to say that I was a sort of college-roommate with one of their present-day senior security researchers in the early 2000s and to this day I remember that person as one of the most code-obsessed persons I have ever met, and I say that in a good way.
He was looking at almost every program running on our room's computer (yes, we only had one computer in our room of 4 or 5, no laptops) as a thing to be "broken apart"/analyzed/made sense of; he had a state of mind and a way of looking at things when it came to computers that I've never encountered since in any other computer programmer (I've mostly met desktop, backend and front-end programmers; I'm a data-obsessed person myself). I realized in the meantime that in order to enter this "computer security" field, and especially in order to be good at it, you need to have a different set of skills and especially a different way of looking at things compared to other computer programmers.
Quite ironically for the GP, that's exactly what has happened in this case: a taxpayer-funded governmental organisation (NSA) has produced and released a public good for free consumption. They literally saw the toll bridge (IDA Pro), said ‘nope’ for whatever internal reason, built a new one downstream, drove their vehicles across it, and then said “hey folks, this over here is for you to use for free whenever you want”.
While I agree with your overall point, this isn't the greatest analogy, because a finite number of users are able to use a bridge simultaneously. A bridge with too many users is called a traffic jam.
Along these lines, the first customer should pay the millions of dollars it takes to market and produce the software and then they can do what they want with it.
$4k/cpu/year is really not very expensive at all for industry-leading niche software.
Good comparison might be Synopsys VCS. Prices are not published but I believe they are over $30k/cpu/year and for larger designs you really want a big sim server.
Wait, IDA has a collaborative mode? I couldn't find one; link please?
This is shocking, because, in an E-mail exchange a few years ago, Ilfak wrote to me:
> [...] we at hex-rays do not have any ideas how to implement dynamic database synchronization, so it is unlikely that others will come up with a good solution.
How well does it work? The aforementioned exchange with Ilfak came after my poor experience with collabREate, a previous plugin that claimed to do the same thing.
Ghidra appears to use version control, with a need to merge changes. Merges could get ugly.
I think Binary Ninja's enterprise version might involve clients connecting to a server that maintains the database. It would be more like Google Docs if that is the case. Actions in the GUI would request atomic transactions on the server, then display the current state.
Yeah, my mouse was hovering over the download button eager to test it out when my brain suddenly went "wait, don't do that!"
If you wanna run this thing, you should probably build it from source yourself (don't trust the binaries) and even then run it in a pretty well sandboxed virtual machine. I would not be surprised at all if the NSA left some surprises in that thing.
> I would not be surprised at all if the NSA left some surprises in that thing.
Huh. I would be extremely surprised if the NSA were to include some kind of malicious or pseudo-malicious easter egg in the open source RE toolkit they're releasing. How dumb would they have to be to pull a move like that, and for what? The self interest just doesn't line up.
Sure, the ones officially contributed by the NSA. What you have to wonder is how much code was contributed by some seemingly normal community member that is actually a front for the NSA to introduce subtly flawed code that they can use to their advantage while being plausibly just a bug?
There are those who suspect Heartbleed came about this way.
- SELinux has been free-software for over a decade, with many open-source contributions.
- Many people don't use SELinux, especially on distros like Arch, Gentoo etc. where it (luckily) doesn't come as part of the package; SELinux is far from universal.
If you're decompiling and analysing ANYTHING, you should be running it on a reflashable, airgapped machine in case it does something unexpected, even if it isn't intentionally malicious. Plus you probably don't want it phoning-home either...
IDA can use Hex-Rays to decompile, which works great. But of course you need some employer to pay for it, which is a pain; no one will personally pay some $10k for a tool. However, the free version is very adequate for most things.
I would say both these tools, as well as r2, have their own merits and weak points, and it would be good not to exclude one and take the other as better, but to have them complement each other in your arsenal.
In the end, if you want quality, manual work is always better than these opinionated tools, and sometimes that is required. So really the tools offer different perspectives / opinions on the same thing, and that is valuable in any case: when you run into the limit of one tool, another might just fill that gap.
BAP is not a competitor to IDA Pro or Ghidra; it's a platform for implementing automated analysis, while IDA and Ghidra are more like reverse engineering tools that are focused on human interaction. We do support IDA Pro so that you can run BAP analysis from it and have the best of both worlds. We will soon roll out support for Ghidra too (the issue is created [1]).
What is a really great contribution of Ghidra, in my opinion, is the detailed specification of all supported ISAs in Sleigh (their terse and concrete ISA specification language). Ghidra ships with about ~200kLOC of instruction descriptions, and this is the most valuable contribution to the community. We're planning to support Sleigh in the near future, and I believe that Sleigh might become a de facto standard for instruction semantics specification.
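To give a feel for how those Sleigh specifications surface to a user: Ghidra lifts every instruction to p-code via its Sleigh spec, and you can inspect the result from the bundled Jython console. A minimal sketch, assuming the usual injected state variables (currentProgram, currentAddress):

    # Sketch: print the p-code generated from the Sleigh spec for the
    # instruction under the cursor (Ghidra Jython console, state vars assumed).
    listing = currentProgram.getListing()
    instr = listing.getInstructionAt(currentAddress)
    if instr is not None:
        for op in instr.getPcode():   # PcodeOp objects derived from Sleigh
            print(op)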
Ghidra’s source code was not released. The only thing on GitHub is the Readme, the license, and some git files. The best you can likely do is use Ghidra to reverse-engineer Ghidra.
IDA Pro is not expensive at all for serious professionals in the field. Other common software in the industry costs way more. Nessus is $2k a year, Metasploit like $1500 to $15000, and Core Impact is $30k and up.
If this is expensive to you, then it’s not for you. This is for people who are making real money with these tools, not hobbyists dicking around.
> If this is expensive to you, then it’s not for you. This is for people who are making real money with these tools, not hobbyists dicking around.
That's an odd perspective. Imagine if this type of sentiment were applied to paint brushes. There is a lot of useful work that is not economically viable per se, and to discount that and to be pejorative feels wrong.
It is less odd if you look at it from the perspective of a professional in the field. Being expensive (and a little mean about it) discourages potential competitors from entry.
If you are using these tools you are either defending systems from threats or breaking into systems and making money through illegal activities. There is not really any other useful work you can do with these tools.
I don’t see how the perspective is odd. Having tools like Core Impact and the knowledge of how to use them well can propel you to a six figure income easily. On top of that these tools are also business expenses you can use for tax write offs.
They are certainly worth the investment. The only people who see the price as steep are those who cannot see any viable way to make a decent ROI off them.
1) No one is entitled to a career in cybersecurity or reverse engineering, no matter how poor or sad your origin story is.
2) There are always lucrative opportunities in this world that are out of reach by people who lack some resource. In this case, it's money, but it could easily just have been something like popularity, beauty, connections, location, or even plain old brains.
I always wanted to be popular and loved by many, but I came to accept long ago that it just wasn't going to happen. I'm an introvert, I keep to myself a lot, don't get much pleasure from social outings, and at the end of the day people just don't give a fuck about weird people like that. So I just try to enjoy the gifts I do have and the things that come naturally to me. We all have to accept the realities of our lives at some point, even the poor.
Sorry to hear you're finding it tough. Some people find ways of becoming less constrained by their introversion, but no judgment on you for doing what works for you.
It's true that some pre-existing conditions can limit what options people have, but it doesn't apply to everything.
It's important to be discerning about when this effect applies and when it needn't, and work to open more opportunities to more people wherever possible.
If you are studying to become a (paid) professional in the field, be it offensive or defensive, having a quality, open source, free tool available which is also the de facto standard is a big plus for getting you started. Elitists will fear such competition; those with love for the field of work will endorse it.
From someone who does binary reverse engineering full time, in my experience, BinaryNinja, Hopper, radare2, etc are toys compared to IDA Pro + Hex Rays Decompiler. The quality of the results and the features supported are unmatched... until now. I haven’t spent too much time with ghidra yet but it’s the real deal. The output of the decompiler looks alright (not complete garbage like I’ve seen with other tools). Even if everything else sucks, the decompiler by itself makes it outrank every other tool aside from IDA. And it costs $10k less! The fact that it’ll be open source is just icing on the cake.
Binja is the only real competitor in any remote sense IMO, and while the LLIL/MLIL are nothing compared to Hex-Rays, they do still dramatically improve the speed of the job. Binja is also fairly extensible/pluggable, though it's pretty undocumented... I just don't do it enough in my spare time these days (not in the field anymore) to justify a Hex-Rays license for myself (even if it is permanent...)
That said I just renewed my license so I have to get some use out of it, but Ghidra does seem like it could be the real deal. Honestly, I never really expected any free/FOSS alternative to IDA to ever exist at this point, so the possibility is tantalizing.
Binary-Ninja and IDA are a completely different class of tool from Radare. Don't get me wrong, I'm happy Radare exists. And I occasionally check it out and play with it -- I think "the vim of RE tools" is a cool point in the design space. As a Linux person, I find that attractive, especially for certain kinds of automated stuff (vs loading Python scripts in through a UX or whatever). But that kind of aesthetic is an extremely small part of these tools in the whole, and it simply does not matter if the tool cannot "keep up" with your work. All of that comes later on. You're comparing a Jalopy to a Prius -- and that Prius is already going up against a Ferrari.
When I use IDA, almost all of my actual work in the tool itself is very "boring" RE stuff, because it does its job. I am not constantly fighting with it to get basic things analyzed properly, or fighting a lack of supported features that prevents it from opening something, or a bad analysis engine that misses 80% of things I later reverse by hand. You could comparatively stitch something together with the tools in Radare to patch over this for the cases it doesn't handle. You might even call those "edge cases", but reverse engineering is 90% edge cases and 10% easy stuff. I'll already be done by then.
I should also be clear that part of the issue is that reverse engineering is a money game, one where money is easy to come by if you have the clients -- and as a result a lot of the developers of those tools have more money/labor available than the Radare developers. That also means people who need this can simply throw money at a problem, like an expensive IDA license, and move on. That doesn't mean Radare developers are incompetent. If you gave them a lot of money -- like, enough to fund 5-10 core developers for a couple years -- Radare would dramatically improve extremely quickly, I'm sure. (This is one of the reasons why I suspected a true competitor to IDA would never come around as FOSS -- it takes a shitload of money to do that, and it's also something you can make a shitload of money from.)
But I'll say this: if you put me into a situation where I had to reverse something, I'd pay for an IDA license 10/10 times even if every Radare developer was at my command, and I'd probably still get it done faster (most RE tools I know of lack even the most basic, fundamental features IDA has had for years -- such as FLIRT -- that can dramatically improve reversing speed.)
R2 has had the Cutter GUI, along with FLIRT support (and a custom signature format as well), for years. So that's a bad example. And there is not much money even for IDA developers - it is a very small market. So no tool would ever get a "shitload" of money.
Reverse engineering the firmware for an embedded product where someone lost the source code.
Bonus points available for:
* "the source control is ZIP files on a network share"
* "yeah we use forced squash commits on everything to keep the Git history nice and linear"
* "it was designed by a contractor who is now uncontactable"
Too often companies pay 6 digits for a feature that some supplier rips directly from an open source project on the Internet (often GPL) and then sells as their own.
That depends on your definition. Many people, myself included, take 'red team' to mean -> attack simulation. If you have access to source, it implies a white box test, which is not an attack simulation but 'ordinary' vulnerability research.
The concrete difference between the two is that vulnerability research is mostly focused on the technical security aspects. Eg. is there a buffer overflow here yes or no? From an efficiency perspective it makes no sense to hide the source code or even credentials from the pentesters performing this research.
An attack simulation is more holistic in nature, the question becomes "can your security team detect when we exploit this buffer overflow?". The blue team and the red team do not share details, and to give the blue team a proper exercise they are often not even informed. To do a proper red team exercise the scope must be very broad. Both technical controls as well as procedural operations are in scope. If you call application/network security research a red team exercise I think you're doing it wrong.
So a red team, in the sense of the word that I specified, does not have access to source code, and definitely sometimes needs to reverse engineer binaries.
Because although you don't have source code (like other commenters are saying), reversing a program to get into a company would be the hardest way to go. Red teams are used to test a company's overall security, and reversing normally wouldn't make sense compared to phishing, using common exploits, and owning the network. Reversing binaries is not the job of a red team, but pentesters of specific systems.
Red teaming isn't limited to "get into a company" testing of networks, it's also used for testing products and infrastructure that's outside the company. For example, you can reasonably have a red team evaluation of some authentication or payment infrastructure based on smartcards or mobile apps, and that'd inevitably include reverse engineering of all the artifacts that are available to the users; and in such cases also likely that many/most software parts of "your" product or device aren't made by you but redistributed from some other vendor, and you don't necessarily have the source available for that.
Auto analysis when you have barely any information. Any tool can make nice output if you feed it nice input. Try a partial dump from an exotic device and then you’ll see IDA shine.
See, that's really most of what I ever did with IDA (I don't do a lot of Windows reversing) and I always had to do a lot of binutils munging to get weird architectures to work. But things may have improved dramatically in the last 8 years or so.
Definitely - it's all the years of tweaking and the massive number of heuristics to handle, I don't know, code emitted by Microsoft Visual FORTRAN from 1972 - that's IDA's moat. Screw the decompiler. If GHIDRA can match that, it's a huge step forwards.
Not sure about everything, but last i looked IDA had a lot more support for different architectures and file formats compared to most of the open source stuff (not sure about other proprietary ones).
I’m a casual bystander who has only played with these tools, but I’ve been interested in this field for a long time. Do you think that radare2’s UI is a step forward? I like the Unix-esque command line and how composable everything feels. IDA (and now Ghidra) feel like an IDE, while radare2 feels more like Vim.
I mean, having a good UI is great, but without the features to back it up, you can't do anything serious. I tried Cutter again a few months ago and went back to IDA after an hour of frustration. When handed a binary dump with no executable format or symbols, Cutter just chokes, while IDA was able to quickly find 90% of functions in memory as well as data xrefs and strings and so on.
I'm sure everything performs well on ELFs built with -O0 -g, but in most real world usage, IDA is queen.
Since everything is open source, if ghidra is as good as people say it is, I’m sure people will make better guis for it (and tui) in no time.
Pretty much all of the seriously talented reverse engineers I've met started out hacking video games as teenagers. Also, IDK if you remember me from back in the day, but hi! ^.^
It really is a job for a GUI, but even IDA lets you type commands. You can use Python, or a built-in language that is in the C family, or anything custom that you have attached to the plug-in interface. I would be surprised if one of these tools is lacking such a capability.
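For what it's worth, the Python route in IDA looks roughly like this; a hedged sketch using the standard IDAPython modules (idautils, idc), whose exact function names can vary between IDA versions:

    # Rough IDAPython sketch: list every function IDA recognized.
    # Targets the IDA 7.x API naming; older versions use different names.
    import idautils
    import idc

    for ea in idautils.Functions():            # entry address of each function
        print("0x%x  %s" % (ea, idc.get_func_name(ea)))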
I've found radare2 pretty neat for doing some automated analysis (specifically on RISC-V binaries), but I agree, IDA Pro has, until now at least, been the undisputed champion.
One day you are the undisputed leader in your segment of the market. Then you wake up and the NSA decides to send out a free competitor with matching or better functionality. Tough blow. But good for us.
"Eschew flamebait. Don't introduce flamewar topics unless you have something genuinely new to say. Avoid unrelated controversies and generic tangents."
OP does have a point that this software was subsidized by taxpayers. One could argue the NSA needs advanced tools and that the costs of IDA Pro add up.
This, unfortunately, occurs so infrequently that it can safely be ignored by 99.9% of the economy. Businesses have really enjoyed having their cake and eating it too with the transition away from a highly involved acquisition process that generally resulted in a tailored solution that the USG owned, to the present COTS policy that allows them to then go on to sell software to people that have already effectively paid for it through taxes. While there was an impressive amount of bureaucracy and an infinitely self referential system of standards in the old method, it did lead to some pretty interesting side effects: Ada[0], IDEF[1], MIL-STD-498[2], etc.
The most recent liberation of useful taxpayer funded software that I can think of was over ten years ago, when NIST released NFIS2 - the fingerprint software that the FBI relied on. They of course had to be crappy about it and wrap it in export controls that limited its utility, but it was interesting to see all the work that internal development had done - very polished, with man pages going back to '97. Ah the memories: software classified as munitions, the clipper chip...
Likely nothing, it's the source code for an RE toolkit with an NSA sticker right on the box. There is literally no worse place to try to hide back doors.
At worst they will know how to mask their real malware from analysis with their own tools.
It's part of the NSA's recruiting push. "If you are interested in projects like this ... consider applying" is even mentioned in the README.
There's zero chance there's some secret trojan, because the people who are interested in this type of software are the exact people who would be able to find it.
Well... I suppose you could argue that it would make sense for them to add a secret trojan, encrypted alongside a message along the lines of "we'd like to talk to you about an interesting employment offer, give us a call on 00000" ;)
Fortunately, a lot of people who are very knowledgeable about reverse engineering are downloading & opening it as we speak. They will point out any flaws/viruses found. Seems one has already been found [1]. All you need to do is to wait.
AFAICT all the source is there, beside every `.jar` there is a `.zip` with the corresponding source. The source in a more usable form should be posted here soon: https://github.com/NationalSecurityAgency/ghidra/
(And if not I'm sure the community will reconstitute it)
Seeing 403 (I'm in Russia). Maybe some export restrictions, but more probably just a glitch in CDN (Cloudfront). P.S. I'm not in Crimea, never encountered software export restrictions before.
I'm curious what feature specifically prompted the NSA to develop their own IDA Pro alternative. I mean, someone somewhere at the NSA must have been trying to do something with IDA Pro only to repeatedly fail before the decision was made that whatever the NSA was trying to do warranted developing their own IDA Pro... right? Or perhaps they used IDA Pro so often and grew so frustrated by it that they started their own?
2. supporting classified proprietary architectures (think missile chips or something)
3. The intermediate representation (architecture independent representation of code) can be integrated in to many other classified tools. Maybe for automated analysis for example.
Good point... "we need a site license. No, I can't tell you for how many employees, that's classified. No, I can't tell you who we are, that's classified. No, I can't tell you what we are working on, that's classified. Hello? Hello? Darn they hung up again..."
When I worked for a hedge fund we had to deal with this sort of thing (not classified obviously, but wanted exemptions from certain things), but it was actually pretty easy to deal with. They just charge you more to get custom terms.
Operational reasons? Perhaps they found that a compromisable worker was employed there, or that modifications had somehow been made to IDA, making the software compromisable and so not safe for them?
Hex-Rays can be hard to deal with, and the IC deals pretty extensively with large federal contractors like Raytheon, so it's possible they just needed something as capable as IDA that they could roll out across all their suppliers to use as a common toolchain and interchange format.
But it's also possible this is just sort of a labor of love type thing.
There’s a third possibility: they wanted a piece of software they could customize to meet their needs. Admittedly, for simple bugfixes and the like, Hex-Rays’ support is known for being quite responsive (as suits the small number of customers). There’s also a quite large (albeit poorly-documented and crash-prone) SDK, which can handle a wide variety of needs and has gained functionality over time. But if you want to add new functionality that can’t be implemented using the SDK? (That includes just about anything related to the decompiler.) You’re at the mercy of Ilfak’s priority list. Well, mere mortals are, at least; the NSA has enough money that it might be able to set up some sort of special contract with Hex-Rays, but I suspect that’s easier said than done.
From that perspective, the ideal is what the NSA ended up with, a codebase whose development is fully in-house. Notably, though, second best would be to just have access to the source code of an existing tool, so you can at least make your own patches if necessary, even if you’re not in control of the codebase’s overall direction. Did the NSA ever seek that in IDA’s case, and could they have obtained it if they did? I don’t know the answer to either question… but source access certainly isn’t offered to typical customers. In general I’m surprised that “paid + source access for customers” isn’t a more popular model of software development.
From my perspective, which admittedly is very different from the NSA’s, I was never very interested in low-cost IDA competitors like Hopper or Binary Ninja, but I’m very excited about Ghidra. Why? Partly because it’s a more full-fledged competitor in terms of feature set, I admit – but the competitors I mentioned are bound to narrow the gap over time. Partly because of cost: I myself am at a point where I could justify the $600/y for Binary Ninja’s commercial edition, or even the order-of-magnitude-higher cost of the Hex-Rays decompilers, without wincing too badly, but I believe that reverse engineering should be accessible to beginners and amateurs. (Piracy is a partial solution, including in IDA’s case, but some people don’t like to do that.)
But the main reason I’m excited about Ghidra is that I have the source code. As a concrete example, I’ve spent a good amount of time reverse engineering software for the Nintendo Wii and Wii U. Both consoles have a main CPU based on the PowerPC architecture, but with a custom ISA extension for an extremely barebones version of SIMD. Well, both Hex-Rays and Ghidra support PowerPC decompilation (although that’s a relatively recent development), but unsurprisingly, neither of them have full support for that ISA extension. IDA actually does have built-in support for disassembling it, but AFAIK not for decompiling; Ghidra doesn’t seem to support it at all (but I may just need to configure it properly). What can I do? Well, in practice, nothing, because I don’t care about the Wii U anymore. But if Ghidra had been released a few years ago, I’m pretty sure I would have gone and implemented support for the extension myself; I haven’t looked at Ghidra’s source yet, but since it already supports other vector ISAs, it probably wouldn’t be that hard. With IDA, I was stuck. The SDK supports adding custom instruction sets for disassembly, but the decompiler SDK is so limited that supporting them there is either impossible or at least would be a huge hack.
And that’s just one of many customizations I’ve wanted over the years. Some of them are probably easier said than implemented, but at least now I can put that to the test!
If you need some very specific small functionality by next monday for $1million, then that development can be arranged.
However, NSA could also reasonably want that their targets (who have extensive capabilities of their own, likely including insiders in various companies) can't find out that NSA needs that very specific small functionality by next monday. They may not care if that functionality becomes available to the general public sometime in the next year (preferably in a more general manner that covers the reasonable/common usecases instead of just the one NSA had at the moment), but leaking the information that you needed (and thus probably used) X at time Y often isn't acceptable.
At the very least, they'd need every single employee who works on that feature or can see that this feature was developed to be vetted by them i.e. to have a security clearance; and that also requires the company to have the appropriate processes and infrastructure for separate, secret codebases and builds that can't be seen by uncleared people. And that is something many companies can't or don't want to provide.
I'm also in the situation where I'd like to try to reverse binaries for the custom PowerPC chips in Gamecube/Wii/Wii U. I'm wondering if it might be easier to rewrite the assembly such that the SIMD instructions are rewritten in terms of standard instructions, and then just use that. I hope it's possible.
Maybe the NSA had this before IDA came about; it's the NSA, after all. They are now releasing it because it's not a competitive edge anymore and can be used as a recruiting tool.
I don't think that is possible. I used early versions of IDA Pro on MS-DOS in 1996... and it had the core analysis and interactive disasm mode then. Ghidra seems to be based on this very concept, all the way down to the "XREF" labels. Even the sub-panels in the UI look the same: Imports, Exports, Symbols...
I believe the included decompiler predates the one that goes with IDA Pro, so that could be a motivation. IDA Pro itself is really really old, having once been a 16-bit DOS shareware program, which explains why IDA Pro doesn't use the GUI shortcuts that were standardized much later.
IDA Pro still doesn't support collaboration, although there are very broken hacks that attempt to add it. Binary Ninja supports collaboration if you buy the enterprise edition.
Just used it to solve the 2015 flare-on challenge #1. Rudimentary, but I am blown away. The interface feels better than IDA, I was able to write a python script straight away! 10/10 recommended.
The Python interpreter attached to it is aware of the state: where my cursor is, which memory module I have selected, etc. It's easy to write scripts for.
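For example, a minimal sketch of what that state awareness looks like in the bundled console (currentProgram and currentAddress are variables Ghidra injects; everything else is the normal program API):

    # Sketch of the "state aware" interactive Python console.
    print(currentProgram.getName())       # name of the loaded binary
    print(currentAddress)                 # address under the cursor
    fm = currentProgram.getFunctionManager()
    func = fm.getFunctionContaining(currentAddress)
    if func is not None:
        print("cursor is inside %s" % func.getName())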
A common work-around is for a federal agency to hire a contractor to write software and require the copyright be transferred as part of the contract. The federal government can hold copyrights, so it's legally kosher.
Can they? How does that work? Surely a resident of the USA could take the work to another country and make it available; if they can't, then it wasn't public domain in the first place.
I don't quite get it either, but one of their documentation files seems to imply that they can:
>In countries where copyright protection is available (which does not include the U.S.), contributions made by U.S. Federal Government employees are released under the License. Merged contributions from private contributors are released under the License.
I'm definitely excited for this, considering I couldn't fork out the thousands of dollars needed to use IDA. I can't really justify that on a small hobby project (reverse engineering games).
My personal reason for not pirating IDA Pro is because I don't want to contribute to the problem. It's one thing to argue about the effects of piracy on things like video games, where the unit price is much cheaper, and a large number of users are casual users who mostly are going to buy legitimate copies if it's convenient and not exorbitantly expensive.
Power user software, like Photoshop, IDA Pro, VMWare, etc. are a different story. They provide tremendous value to both companies and individuals and yet I have no doubt an enormous amount of their poweruser userbase simply have never paid for them. As a young adult or child with no practical way to get a license, this is pretty innocuous since frankly it's hard to argue any sale was lost. But there's plenty of cases where large companies and of course hobbyist users end up pirating the tools they use. I believe Windows XP shipped with some audio files that were produced with a pirated version of Sony Soundforge, for example. That's just silly, but.. it happened.
IDA Pro is an excellent piece of software. They provide a freeware version, which is a pretty nice thing to do. And while the licenses are expensive I have no doubt it is worth it to the companies that purchase it, many times over.
Sadly, I can't afford IDA (as I've discussed eerily recently in HN comments, actually) so I've been mostly avoiding it for now, but I do buy other software, including Windows licenses, Adobe Creative Suite, VMWare, etc. If they're useful enough for me to use, then as an adult with decent income, I pay for them.
> yet I have no doubt an enormous amount of their poweruser userbase simply have never paid for them.
Do keep in mind that many of these companies expect users to pirate their software. Indeed, piracy is ironically part of what has made Adobe such a big player - teenagers pirating software in highschool, and using it up until their first job, make it their go-to tool when they actually do enter a company. Often leaving the company with no choice but Photoshop!
So if you are not paying IDA's authors either way, what is the difference?
I'd say, by learning to use IDA through a pirated version you create a possibility that one day you will use it for something more serious and you or your employers will pay for it.
One possible argument against that is that by learning how to use all IDA features through a pirated version, you erase the competitive advantage of people who can afford to pay right from the start and remove their incentive to pay.
To that I'd say you would be just levelling the playing field :)
Somebody sort of casually pirated a copy of IDA Pro back in the mid-2000s (IIRC, he shared his copy on a public server). The IDA people (DataRescue, at the time, but from what I recall the page survived the move to Hex-Rays) found out, banned him from using IDA, and then put a page on their website threatening to rescind IDA licenses from any company that employed him. The IDA team is pretty aggro.
To be fair, there's definitely a difference between downloading a pirated version of software, and actually leaking copies of software for others to pirate. The DataRescue and now Hex-Rays folks seem particularly sensitive to leaked copies and I imagine leaked copies genuinely do affect their bottom line of sales given the kinds of markets they're in.
They certainly seemed to be very much against selling it to private individuals (or were when I asked many years ago).
I guess unless you've got a CV which says "presented at Defcon and Blackhat, five times" or "currently work at {big infosec company}", even if you can afford it the answer will be "nope".
The end result for me was that I bought a Mac Mini and a copy of Hopper and Synalyze It. My entire reverse-engineering of the Polaroid film recorder driver (and the resulting Linux port) was done by reversing the driver DLL in Hopper and shimming the driver and ASPI calls with PyDbg.
I keep looking away for a month or so and finding a new version of Hopper with shiny new features to play with...
This is an unusually large open source project, especially for NSA. I wonder whether they were motivated to release this tool because of their recent brain drain / hiring problems.
Here's a potential angle: if you're going to use a tool internally, it's in your best interest to be able to hire people with experience with that tool (i.e., people learn the tool for free on their own time).
Oh yeah, for those who are wondering; there's another NSA project where they made a tool that's a direct competitor with a product that's "out there": https://github.com/redhawksdr
I just don't understand the doubt and hate. It's perfectly reasonable to distrust the NSA in most cases, but look at the context - the NSA has a huge brain drain and PR problem. They desperately need qualified people to start trusting and applying to them again. Does anyone seriously think they would try to backdoor security researchers in such a stupidly obvious way?
I was actually at the RSA talk where they released the tool - the presenter was very open in saying that this is a recruiting tool. They want college kids just getting into RE to learn their tools and have their name in the back of their mind so they apply for internships and jobs, and are trained for those roles from day zero. There are other benefits to releasing the tool, like free labor and testing from people submitting patches and bug reports, but the real value is in making the NSA appear like the good guys and getting people on their side.
It seems pretty obvious to me that this gives the NSA more benefit than trying (and probably failing) to hack random people. And yet the dude sitting next to me was shaking his head and saying he would only ever run it in a VM. Irrational as hell.
Well, they shouldn't have gotten involved in "hacking random people". Then we would trust them. They didn't, and they still have surveillance and hacking programs. Why would I expose myself and become a target for years to come? Are they trying to find out where the new targets are?
Your comment makes no sense whatsoever. Let's say you're an NSA target. You're probably already hacked. If not, then you are very smart or you haven't been an NSA target for long. Let's assume you're a very smart malware researcher - that means you 1. already have tools like IDA and don't need this, 2. have in-depth knowledge of how to acquire and run potentially malicious code safely, and 3. have experience figuring out whether that code is malicious.
Do you think the winning strategy for the NSA here is to attack you in a way that you're perfectly equipped to deal with?
So, I've tried it on some MIPS binaries I've been reverse engineering from assembly, on and off, for the last 7 years, for various reasons. I'm completely blown away by the quality of the decompiler output. The binaries include symbols, so everything global is named correctly, which helps. Anyway, nothing I've tried over the years comes even close to the clean output I'm seeing from Ghidra.
I’m really hoping this release will improve the situation with learning RE in universities etc. The free version of IDA is very limiting, and few people use the open source and cheaper alternatives (radare2/cutter, binary ninja, hopper). I’m also hoping I can get that decompiler (or something similar) in cutter at some point, but with the source not yet available we’ll have to wait.
Are they serious? They are blocking Russian IPs from the decompiler source code. Hmm, I know ARM and x86 assembly. Of course, I don't know how to download these sources :)
As far as I can see, it's licensed under Apache 2.0. I don't know of any regional restrictions for this type of license, but there could be a real reason for it because of our stupid government.
I'm wondering this too. I haven't heard of retdec being used too much, but it looks very cool. I'll guess that Hex-Rays is better, but I still am interested in the opinion of someone more experienced who has tried retdec.
I'll go one better: I've contributed patches to retdec.
Retdec is ... okay.
On small binaries it's usable. On even average sized windows binaries (a few meg), not really.
On things that take IDA 10-15 minutes and a reasonable amount of memory (like a 7 MB Windows binary), retdec can take forever and an unbounded amount of memory.
I started fixing a lot of the memory issues (completely recursive CFG traversal, etc), but there are also very serious algorithmic issues (N^3/N^4 algorithms in the optimizers).
If I disable a lot of the backend optimizers, I can make it work okay.
But then the output is also a lot larger/worse. To be fair: It used to be about 50x bigger than similar IDA output. The latest development version of retdec now has a new backend IR converter, and the output is only 5x-10x bigger than IDA output.
So as a TL;DR: retdec in its default state is unusable for anything but small binaries.
If you understand what is going on, you can get it to work on a lot of binaries as long as you have a ton of memory and time to spare.
Yeah, sorry, as I understand it, it uses Capstone as the disassembler and implements an LLVM lifter over it. It was pretty dumb to describe it as "based on Capstone"; I was just mentally breaking tools down by which CFG recovery system it relied on.
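For anyone unfamiliar with the layering here, a minimal Capstone example in Python (illustrative only, not retdec code): Capstone just gives you a linear stream of decoded instructions, and everything above that - CFG recovery, lifting to an IR, decompilation - is the part the surrounding tools have to add.

    from capstone import Cs, CS_ARCH_X86, CS_MODE_64

    # Decode a few raw bytes; Capstone does no control-flow analysis of its own.
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    code = b"\x48\x89\xd7\xc3"  # mov rdi, rdx ; ret
    for insn in md.disasm(code, 0x1000):
        print("0x%x:\t%s\t%s" % (insn.address, insn.mnemonic, insn.op_str))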
It's not the first real competitor available to the public. Hopper Disassembler and Binary Ninja are both capable. They have been available for a few years.
They're arguably competitors if you don't care about decompilation. But Binary Ninja has no decompiler and Hopper's was awful last I checked. Ghidra's decompiler seems as competent as Hex-Rays.
Binary Ninja has most of a decompiler and is expected to get the rest soon.
Binary Ninja offers multiple views of the code, each with an API that gives you the same access that the GUI has. The different views vary in how much they are like assembly or C. Only that last step, real C code, is still missing. Those other views are quite good if your goal is to understand things, but less good if you were hoping to throw the results into a C compiler.
Binja could get a decent "C-like" view on top of MLIL, sure, but it still fails in a large number of relatively rare cases.
Anybody use SEH or MSVCRT exceptions on x86? Well, there are non-inlined functions that adjust the stack pointer dynamically there. Binary Ninja can't capture that. To be fair, it's unlikely IDA can either- but IDA has a heuristic (read- hack) that treats those functions specially. Result? SP-analysis for all callers generally fails, and Binja becomes convinced that arguments are being passed in eax and ebp.
Ah, but you can just patch the LLIL for calls to those functions to adjust the stack. Oh, no, you actually can't patch LLIL that way- it's immutable after the lifter creates it. Now, you can write your own architecture hook, and there you can be your own lifter- you can call the real lifter, see if it emits a LLIL_CALL to a function you recognize, and if so just emit the stack adjustment LLIL instead. Ah, heh, but you can't- you can't call the real lifter, because it doesn't emit LLIL, it adds LLIL to an existing function, and you can't remove that IL later- it's append-only. And you can't recognize functions easily, because the things passed into your GetInstructionLowLevelIL callback don't include a BinaryView pointer- the thing you'd need to find out anything at all about other functions. You can sort of, kind of, hack around this by calling about five other functions... for every CALL instruction in every function in the binary. This is, ah, less than performant.
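To make that concrete, here is a rough, untested sketch of what such an architecture hook looks like in Binary Ninja's Python API. The STACK_ADJUSTING_FUNCS table and the byte-level call matching are hypothetical stand-ins; as described above, the hook callback gets no BinaryView, so guessing the call target from raw bytes is exactly the awkward part.

    from binaryninja.architecture import Architecture, ArchitectureHook

    # Hypothetical: addresses of the stack-adjusting helpers -> bytes they pop.
    STACK_ADJUSTING_FUNCS = {0x401000: 8}

    class StackFixupHook(ArchitectureHook):
        def get_instruction_low_level_il(self, data, addr, il):
            # Let the real lifter append its IL first (it's append-only).
            length = super(StackFixupHook, self).get_instruction_low_level_il(data, addr, il)
            # Crude guess at a direct call target from an x86 E8 rel32 encoding,
            # since we have no BinaryView to ask about the callee.
            if data[:1] == b"\xe8" and len(data) >= 5:
                rel = int.from_bytes(data[1:5], "little", signed=True)
                adjust = STACK_ADJUSTING_FUNCS.get(addr + 5 + rel)
                if adjust:
                    # Append esp = esp + adjust after the lifted call.
                    il.append(il.set_reg(4, "esp",
                              il.add(4, il.reg(4, "esp"), il.const(4, adjust))))
            return length

    StackFixupHook(Architecture["x86"]).register()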
Ever reversed a Win32 binary that uses the Win32 API a lot? I hope you like defining structs by hand, because OH BOY are you going to be defining a lot of structs to do anything useful. And you also get to define DWORD, LPDWORD, LPVOID, and every other annoying Windows typedef by hand. (You can be clever and use libclang hackery on the Windows SDK and automate some of this. But you'll have to do it yourself.)
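If you do go the libclang route, the sketch below shows the general shape of it (illustrative only; the header name and compiler flags are made up, and you still have to translate the output into your tool's type system yourself).

    from clang import cindex

    index = cindex.Index.create()
    # Hypothetical header that #includes the Windows SDK bits you care about.
    tu = index.parse("win32_subset.h",
                     args=["-x", "c", "--target=i686-pc-windows-msvc"])

    def dump_structs(cursor):
        # Walk the translation unit and print each struct's field names and types.
        for child in cursor.get_children():
            if child.kind == cindex.CursorKind.STRUCT_DECL and child.is_definition():
                fields = [(f.spelling, f.type.spelling)
                          for f in child.get_children()
                          if f.kind == cindex.CursorKind.FIELD_DECL]
                print(child.spelling, fields)

    dump_structs(tu.cursor)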
Then there's stuff like type propagation only going forwards inside functions- sometimes. The GUI occasionally deciding that all basic blocks should be laid out in one small square, on top of each other. (You have to reanalyze the function to fix this.)
Mind you, I love Binary Ninja- I bought my own dang commercial license, and renewed it! It's getting better, fast... but it's got its warts.
Oh, and I forgot to mention- despite being multi-threaded, it's slooow on massive (50MB+) binaries. Bother your co-workers! Play Pokemon GO outside! Make lunch! Take a nap! Use the foosball table in the 'game room' that's there because we want to seem trendy! When you're done, perhaps the initial analysis will have finished.
If you're on the dev branch of binja (which, at least until recently, was miles ahead of stable), you get to do this again in a few days when binja updates and throws out all its old cached information.
Also, saving and loading massive databases can easily be a 5-minutes-or-more process. Again- this does provide you with ample time to explore the area around your office building, but still.
(Mind, this isn't a problem if you mostly see small binaries- for malware it's probably entirely fine.)
Ah, I hadn't heard of the IL functionality. From a quick test of the Binary Ninja demo, it looks like an approach that could become a viable competitor to a decompiler in the future, but isn't a good one in its current state.
For instance, one of the most useful aspects of a decompiler for me is the ability to recover high-level control flow, which Binary Ninja apparently doesn't support. Instead it gives you an IDA-like graph view (but with IL instead of assembly in the graph nodes); but at least in my experience using IDA without a decompiler, even moderately large functions tend to result in a spiderweb of a graph, and recovering the control flow by hand ends up feeling like a pointless brain teaser. (This condition being true short-circuits this set of comparisons, so it must be an ||... but it still ends up doing this other set of comparisons, which you can also get to from... wait, where was I again?)
It also doesn't seem to allow eliding temporary assignments. Here's a short sample from some random function (retyped by hand since I don't see a way to copy and paste):
It does do a sort of SSA transformation and assigns unique variable names (like eax_1 instead of eax) to the same register based on the location in the program, so that's nice. But what I really want is
eax_1, ecx_2 = sub_3c670(esi, arg4)
I may be missing some option to do this manually, but it should be automatic.
> It also doesn't seem to allow eliding temporary assignments.
Yeah, there's no copy propagation for MLIL yet. I think they're saving that for HLIL, for some reason. It's exactly as obnoxious as you think it is, though. (For example- click on a variable name. Now other uses are highlighted. Ah, but when 80% of the other uses are just the right hand of assignments, which are then used... you get to trace through that fun chain by yourself!)
There's a community plugin to kind of try to fix this, by actually renaming the intermediate variables to match the RHS's name. This works, sometimes, but is written in Python, which means it's single-threaded and slow, and occasionally it will get stuck in a loop, and sometimes it decides that it wants to rename everything to "ecx_1" or something, in which case you become very grateful that undo exists.
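The general idea is something like the following - a very rough, untested sketch against what I believe the current MLIL Python API looks like, not the actual plugin - and, like the real plugin, it will happily produce silly or duplicate names:

    from binaryninja import MediumLevelILOperation

    def propagate_rhs_names(func):
        # Rename the destination of simple var-to-var copies after the source var.
        for block in func.mlil:
            for insn in block:
                if (insn.operation == MediumLevelILOperation.MLIL_SET_VAR
                        and insn.src.operation == MediumLevelILOperation.MLIL_VAR):
                    src_var = insn.src.src
                    dest_var = insn.dest
                    func.create_user_var(dest_var, dest_var.type, src_var.name)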
This looks like an excellent free competitor. Been trying to learn; a tedious process without the fancy tools. Even hopper and binary ninja are very expensive (for a student). Radare2 has been a godsend so far and very helpful, but not as user-friendly.
I did some minimal stuff with it and I like it. It's slow, and the Java UI is as bad on Windows as expected (in part due to OpenJDK), but I'll reach for it the next time I have the need. The main thing that I seem to have missed was some sort of scripting like Jython, which presumably can be added via extensions or just via code if it is actually missing. It's been a while, but I'm happy to see some of my tax dollars at work and this stuff being released back to the public for free rather than used against it.
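For what it's worth, Ghidra does ship with Jython scripting out of the box (under the Script Manager), plus a Java extension API. A trivial sketch of a Python script there, just to show the flavor - currentProgram is handed to scripts by the GhidraScript environment:

    # List every function the analyzer found, with its entry point.
    fm = currentProgram.getFunctionManager()
    for func in fm.getFunctions(True):  # True = iterate in ascending address order
        print("%s @ %s" % (func.getName(), func.getEntryPoint()))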
Though the obvious explanation for that is that it was an intentional backdoor, that honestly looks more to me like a legitimate oversight than a backdoor. I think an actual backdoor would be a lot more subtle and clever than that. Especially since this way, absolutely anyone could exploit it (it's just Java Debug Wire Protocol).
Also, you have to explicitly run it in debug mode for this to happen, which probably only a small percentage of end users will do. Kind of seems like the equivalent of running Flask apps in debug mode, which by default will handle exceptions by showing a traceback with an interactive debugger that can be used to execute arbitrary code.
There could be some backdoors in it, but I'm leaning towards that not being an intentional one. (But I definitely could be totally wrong; you never know when it comes to intelligence agencies.)
> Kind of seems like the equivalent of running Flask apps in debug mode, which by default will handle exceptions by showing a traceback with an interactive debugger that can be used to execute arbitrary code.
As an aside, this is no longer precisely the case, though it was for quite some time.
With modern Flask (> 1.0.0), the debug server prints a randomly generated PIN to STDOUT when it starts.
In turn this PIN must be entered on the web interface to execute commands.
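For anyone who hasn't seen it, the Flask behavior being described looks roughly like this (a minimal sketch; with a modern Werkzeug the in-browser console then demands the PIN printed at startup before it will evaluate anything):

    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def index():
        raise RuntimeError("boom")  # any unhandled exception brings up the interactive debugger

    if __name__ == "__main__":
        # debug=True turns on the Werkzeug debugger; never do this on a reachable host.
        app.run(debug=True)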
I wonder if they run Ghidra on a remote machine with some sort of command-and-control center to automate tasks (i.e., regularly run some basic automated stuff).
This makes the whole release even more interesting, I wonder if we'll get a statement on why they have that debug mode.
>but the best way to hide a backdoor is to make it look like a mistake
It is, but usually the best way to do is to make it look like a mistake that's very subtle and difficult to notice without careful testing and analysis, kind of like Apple's infamous SSL "goto fail". That's a classic example of a vulnerability that really could be either an honest mistake or a very insidious backdoor.
This is more like leaving the house's sliding glass door to the backyard wide open for everyone to see.
Fairly portable and future-proof, and very easy to compile (just collect dependencies; you need to really go out of your way to compile and "link" Java applications wrong)
I am going to sound pessimistic here, but isn't there a real danger in having this technology available to bad actors, and is there any value in keeping such things confidential if they play a role in national security?
If someone were releasing malicious software to hijack the power grid, as an example, wouldn't they first be able to use this to try to improve the robustness and invisibility of their attack?
Or is the functionality here common place enough that it doesn't tilt the axis of power in an unfavorable way?
The general philosophy in computer security is that bad actors will be using the tools they need to use anyway, and the world is actually more secure if the best tools to use are widely available, since the security professionals in charge of defending systems are trained on their use, and can better anticipate an attacker's methods. Even better if they're open source, so that they're easily analyzable, and the barrier for training is lowered.
Uh, you're sure about that? I can't say I have first hand experience with the leak (running second hand NSA software not meant for public release seems like a bad idea somehow) but I know I've heard that it was, and this seems pretty suggestive...
So, my use case is a little weird, I guess, and I'm generally using it for embedded systems, but:
Hopper - is Capstone.
BinaryNinja - The extension API wasn't well documented last time I checked. Embedded work sort of requires letting me fill in some of the gaps myself.
Capstone - I got frustrated when the translation script behind it that auto-generates code from the LLVM definitions wasn't available (as source or otherwise), which meant that I couldn't add to the instruction set in a meaningful way like I needed to.
Radare(2) - Feels like the barely glued together independent projects that it is. Somehow has a more inscrutable interface than IDA.
One of the frontends I tried (can't remember if it was Hopper, Clipper, or something else) for some reason thought PowerPC had branch delay slots, which was totally screwing up the basic block determination.
I'm not especially a fan of IDA, but I don't do much of this work anymore and haven't had a reason to catch up. IDA definitely wouldn't be the first tool I'd reach for in 2019.
It's the de facto standard and the program you can assume everyone is already using, plus the fact that a lot of tooling relies on IDA (in part because, for a long time, it was the only game in town) for analysis and function recovery. I don't know if that really makes it "better".
I got out of this stuff before decompilation became a mainstream feature, so it might be a big deal that Ghidra has a strong decompiler.
Not at all; this helps analyze malware, not create it. There is no security obtained by preventing reverse engineering of a binary. If anything, this makes it harder for adversaries to hide their methodologies, a strategic advantage for someone like the US government.
I am not sure I completely agree. If I know how my adversary detects and studies stealth code, I may be able to design better stealth code that is better at evading their methods of detection.
I mean the evolution of stealth tech in military has followed a similar path. As radar systems improve over decades, they keep on working on new ways to evade detection for aviation/missile tech.
I understand the high level point of good tools being more widely available to the white hat crowd, but I am trying to understand the argument that this is 100% better in all cases and there are no downsides.
It's just another disassembler. There are a bunch of them already. It is, to the state of the art of reverse engineering, about as big a deal as the first release of Sublime Text was for programmers. It's hard to think of a "downside", or at least one that wouldn't be equivalent to "Sublime Text made it easier for people to code malware".
The main cat-and-mouse game with malware isn't in making disassembly/decompilation hard--quite frankly, the problem is simply too trivial--it's in trying to keep the malware analysis people from finding the malware in the first place. The "I'm being run in a VM for malware analysis, so don't trigger my payload in the first place" game.
Nothing is ever 100% better with no downsides, except maybe drinking water. That's a silly standard.
The upsides of people getting it who aren't willing to break the law outweigh the downsides of bad people getting it more easily. Probably. That's the best you can expect with security tools.
Code is so vulnerable right now that only a small number of projects/products command high payments from companies wanting to buy exploits. If this affects the situation at all, it will be marginal: an increase in vulnerabilities that's a drop in the bucket compared to what's already being introduced through poor QA and found by bug hunters. Here's just one project as an example:
I suspect they're just trying to expand US cyber capabilities and recruiting.
“If I go to the next capture-the-flag contest and I see some college students using Ghidra, I will be really excited” - Rob Joyce, senior cybersecurity adviser at NSA
Well, it should. It kept having issues with the only APK I had on hand, but when I just pulled some DEX files out and loaded them, it handled them just fine (including decompilation).
>includes a suite of software analysis tools for analyzing compiled code on a variety of platforms including Windows, Mac OS, and Linux
From the site, so yes it works on non-windows binaries. It also runs on Linux, Mac and Windows. This is the list of file formats I found in the docs that are supported by Ghidra
No. This is clear apophenia. Ghidra is a reference to the Japanese video game boss of the same name, which was supposed to be called Hydra but, due to mis-translation, came out as Ghidra.
In Sanskrit, the word for vulture is pronounced something like [Giddh], with emphasis on the end.
It appears that it isn't actually available as of now. Apparently, the NSA is going to release it at RSA Conference 2019, so it'll probably actually be published within the next couple of days.