
How Professional Hackers Understand Protected Code While Performing Attack Tasks [pdf] - lainon
https://pdfs.semanticscholar.org/4bd1/2a9823b55d29a0d75c9ea9c8cd08b6fdca3e.pdf
======
workerthread
It seems to me that the input data for the study is "just" the final report
from the hackers to the customer. The academic researchers (who are
presumably nowhere near the level of expertise of the hackers) then annotate
and categorize the conceptual tasks behind each word and sentence in the
report.

It seems to me that this puts a lot of bias on the input data, based on the
annotators' knowledge. They try to account for this by using multiple (n=7)
annotators, but I doubt that is enough.

Two questions come to my mind:

1) What level of detail do the final reports contain? I have procured and read
a few pen testing reports myself, and the level of technical detail seemed too
low to infer the hour-by-hour activities of the hackers in any meaningful way.
Would be nice if the paper explained what those reports actually contained.

2) I wonder what it would take to get the hackers themselves to keep a
diary/journal of their hour-by-hour activities. That would remove a lot of noise
from the input data.

------
bitexploder
Their technique is pretty interesting (open coding).

There is a level of abstraction below that (more real) which is greatly helpful
in breaking DRM and finding other weird bits of code: running code in an
instrumented QEMU, for example. It is much more accessible and productive for
me than SMT solver tooling.

Generally their attack tree is very nice, but they sweep a lot under the rug
in their dynamic analysis section. This is a short paper though, so no fault
there. Their tree is a concise outline for anyone looking to methodically
reverse engineer any black box, not just DRM and protection schemes.

