A commercial decompiler yielding 0% accuracy sounds odd.
The same AST graph.
The same types.
The same text tokens.
I assume they’re talking about Hex Rays. Optimizing compilers produce code structured differently from your input, and Hex Rays takes many liberties on output. It’s not _trying_ to match the input perfectly, it’s trying to emit valid C that a human can understand. It’s full of casts and weird control flow. A break might turn into a goto. Functions will be inlined. Structure and class information is lost. A switch might turn into a few if statements, in the wrong order.
- Hex Rays output is best when massaged by an experienced user (which is the primary mode of use; it’s an interactive tool, not a one-way transform). I assume that didn’t happen here.
- The accuracy is probably based on literal structure or tokens, and Hex Rays doesn’t (and probably shouldn’t) try to guess the original structure to that degree. Decompiler output is noisy.
I guess their accuracy metric means "the same AST", because according to Appendix F of this paper they show at least one case where RetDec did produce a correct decompilation, but with optimization quirks, so it doesn't look exactly the same as the original.
Now that I’ve said it out loud I bet accuracy goes way down if you train or evaluate against a wide spectrum of compiler versions, as these hidden parameters will change.
...but those parameters also don’t entirely matter. So I think they probably need a better evaluation method.
Also, I'd argue their benchmark is unfair. Check out Appendix F: in the second example RetDec actually produced a correct result (though its output is not the most readable, and it didn't de-optimize enough to remove the noise introduced by compiler optimization), but they dismissed it as "difficult for human understanding". And in the first example I suspect they hand-picked a case where RetDec failed to recover the function signature because the function argument is floating point.
Edit: also, they ran their benchmark on MIPS, because apparently their NN performs worse on x86 than on MIPS. But no traditional decompiler was heavily optimized for MIPS.
Their version (ii) cannot compile, since the variable names don't match (e.g. they return "c" instead of "v3" and multiply by "b3" instead of "v3").
They did use x86-64 ISA as well, though without any optimisation enabled, i.e. "-O0" (see Table 5 in appendix D). I doubt those results are very useful in practice due to the complete lack of optimisation.
Still, not bad.