jeremiahar's comments

jeremiahar · 2025-02-12T17:44:12 1739382252

I worked on something like this back in 2016, so I'm not sure how much things have changed since then. I used dynamic binary instrumentation to deal with the field encryption. Basically, I manually mapped the executable into executable memory on Linux (as if it were a shared library) and began execution at the packet switch. Before executing a block of code, I disassembled it up to the next conditional branch and modified it according to some heuristics to remove the at-rest encryption. The modified code might not fit into the space of the original block, so the original block was never executed in place; new blocks were mmap'd for this instead. malloc/free were hooked and replaced with wrappers over glibc's malloc/free, but with bookkeeping so that the memory could be freed after the packet switch finished executing. atexit was just replaced with a no-op.

That all just dealt with the encryption, but there were also randomized packet IDs and field orders. Those problems were dealt with using manually written heuristics for the packet IDs that were actually interesting. Packet handlers with references to text strings (even hashed ones) were a gold mine here, because they made static detection of packet IDs simple. If there was no text string, many of the offsets could be auto-detected just by parsing a replay and running small snippets to determine which offsets actually "made sense" for the field being searched for. For example, in a gold gain packet, the amount of gold gained shouldn't fall outside an expected range; if it does, the offset likely doesn't correspond to that field.

Once all of the high-volume code blocks had been instrumented, replays could be parsed in 2-3 seconds (including generating the desired data aggregations). This is all from memory, so it's possible there's a minor mistake or two.
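To make the offset auto-detection idea concrete, here is a minimal sketch of how such a range check might look. It is not the original tool: the function name `plausible_offsets`, the little-endian 32-bit integer encoding, the 1..1000 gold range, and the sample payloads are all assumptions for illustration. It tries every byte offset across a set of already-decrypted payloads of one packet type and keeps the offsets where nearly every decoded value lands in the expected range.

```python
import struct


def plausible_offsets(payloads, lo, hi, fmt="<I", min_ratio=0.95):
    """Return (offset, hit_ratio) pairs, best first.

    An offset is kept when at least `min_ratio` of the payloads decode to a
    value inside [lo, hi] at that offset. `fmt` is a struct format string;
    the default (little-endian uint32) is an assumption, not a known layout.
    """
    if not payloads:
        return []
    size = struct.calcsize(fmt)
    shortest = min(len(p) for p in payloads)
    results = []
    for offset in range(shortest - size + 1):
        hits = 0
        for payload in payloads:
            (value,) = struct.unpack_from(fmt, payload, offset)
            if lo <= value <= hi:  # NaN also fails this check for float formats
                hits += 1
        ratio = hits / len(payloads)
        if ratio >= min_ratio:
            results.append((offset, ratio))
    # Highest hit ratio first, then lowest offset.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Hypothetical "gold gain" payloads: 6 filler bytes, the amount, 2 filler bytes.
samples = [
    b"\xff" * 6 + struct.pack("<I", gold) + b"\xff" * 2
    for gold in (10, 25, 300, 450)
]
print(plausible_offsets(samples, 1, 1000))  # [(6, 1.0)]
```

With real replay data, several offsets will often pass a loose range check, so a check like this is probably best treated as a filter that narrows the candidates, with further per-field checks deciding between the survivors.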