Wish I'd known about that years ago just for the ability to grab a call stack from a running program.
edit: made more accurate
Amusingly, Simon and I were working on the same codebase during Eventbrite's quarterly hackathon when this happened. He was working solo on one feature while I and three other developers were working on another. Because we were collaborating, we had to commit frequently simply to stay in sync. Because he was working alone, he was able to avoid committing and ended up in a situation where figuring out the process in this gist was necessary.
Despite this distraction and a quarter of the person-power, Simon arguably still wrote something cooler than we did.
However, if you're reverse engineering a system you've inherited, or one where source code isn't readily available, then this decompilation makes sense.
Fortunately, you can just write:
(Please, please, don't be condescending to people who are learning something new.)
Of course, since the key is in the binary, you can hack at it: https://www.usenix.org/conference/woot13/workshop-program/pr...
fetched all dependencies..lets try decompiling
no saved opcode mapping found; try to generate it
pass one: automatically generate opcode mapping
pass two: decrypt files, patch bytecode and decompile
successfully decrypted and decompiled: 1727 files
error while decrypting: 0 files
error while decompiling: 196 files
opcode misses: 7 total 0x6c (108) [#9], 0x2c (44) [#14], 0x8d (141) [#15], 0x2e (46) [#1], 0x2d (45) [#14], 0x30 (48) [#5], 0x71 (113) [#11783],
After that you get the binary blob back, which you can now unmarshall. But you still need to figure out the opcode mapping. For that I used a trick first done publicly (to the best of my knowledge) by the author of PyREtic (Rich Smith), released at Black Hat 2010. He simply compares the stdlib pyc files byte by byte with the stdlib included within Dropbox (after decrypting those pyc files). That should yield a mapping of opcodes.
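That byte-by-byte comparison trick can be sketched roughly as follows. This is a toy illustration, not PyREtic itself: the "patched interpreter" is simulated here by a random permutation of opcode bytes, since the real comparison would be between a stock interpreter's stdlib pyc files and the decrypted ones shipped with the binary.

```python
import random

def remap(bytecode, table):
    """Simulate a patched interpreter: rewrite each opcode byte via `table`."""
    out = bytearray(bytecode)
    for i in range(0, len(out), 2):  # CPython 3.6+ wordcode: 2 bytes per unit
        out[i] = table[out[i]]
    return bytes(out)

def derive_mapping(standard, foreign):
    """Walk both streams in lockstep, recording foreign -> standard opcode."""
    return {foreign[i]: standard[i] for i in range(0, len(standard), 2)}

# The same "stdlib" source, as compiled by the stock interpreter...
code = compile("x = 1\nfor i in range(3):\n    x += i\n", "<mod>", "exec")
std = code.co_code

# ...and as it would appear under a permuted opcode table.
perm = list(range(256))
random.shuffle(perm)
obf = remap(std, perm)

# Byte-by-byte comparison recovers the mapping for every opcode seen.
mapping = derive_mapping(std, obf)
inverse = [mapping.get(b, b) for b in range(256)]
assert remap(obf, inverse) == std
print(f"recovered {len(mapping)} opcode mappings")
```

Opcodes that never appear in the compared files stay unmapped, which is exactly why the log above reports a handful of "opcode misses" that need hand-tuning afterwards.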
Then pass everything through uncompyle2 and you've got pretty readable source code back. Some files will refuse to decompile, but that just means hand-editing / fine-tuning the last bits of your opcode table.
EDIT: follow-up on parent comment; the encryption keys are not in the interpreter. The interpreter is patched not to expose co_code and more (to make this memory dumping more difficult; injecting a shared object is a different technique that I used too). It's also patched to use the different opcode mapping and to decrypt pyc files upon unmarshalling them. However, the key for each pyc file is derived from data strictly within those files themselves. It's pretty clear when you load up the binary in IDA Pro and compare the unmarshalling code with a standard Python interpreter's code.
- A rotating opcode table that changes every X opcodes
- Multiple opcodes that reference the same operation, selected randomly at generation time
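Either scheme is cheap to prototype. Here's a hedged sketch of the rotating variant (X, the table layout, and all names are invented for illustration); the point is that a single static opcode map recovered by byte comparison would only match the first X instructions before the table moves on:

```python
import random

X = 8  # rotate the table every 8 opcodes (arbitrary period for the demo)

def encode(opcodes, base):
    """Encode with a table that shifts by one position every X opcodes."""
    return [base[(op + i // X) % 256] for i, op in enumerate(opcodes)]

def decode(encoded, base):
    """Invert: look up the byte, then undo the rotation for its position."""
    inv = {b: op for op, b in enumerate(base)}
    return [(inv[b] - i // X) % 256 for i, b in enumerate(encoded)]

base = list(range(256))
random.shuffle(base)
ops = [random.randrange(256) for _ in range(64)]

# Round-trips only if the decoder knows both the base table and X.
assert decode(encode(ops, base), base) == ops
```

Of course, an attacker who can run the generator (or diff enough known plaintext) can still recover both parameters; this raises the cost rather than preventing the attack.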
call rb_eval_string("puts 'hi from the Ruby process'")
You can use that to (try to) load code. For example try loading pry and setting up a pry-remote connection. You can also use any other part of the MRI extension API - gdb can do quite complicated calls.
Note that any uncaught error can easily cause the Ruby script you're connected to to terminate, and there are plenty of opportunities to cause problems: when you attach gdb to the process, the Ruby interpreter is paused at an arbitrary point, which may leave it in an inconsistent state.
To wrap the above up into anything reasonable, you'd first want to ensure the interpreter is in a decent state by setting a breakpoint somewhere "sane" in the interpreter and continuing execution, and then execute some very carefully crafted code that injects what you need without triggering errors or changing variables the script depends on.
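A minimal sketch of such a session (the breakpoint symbol and the pry-remote snippet are illustrative assumptions, not a tested recipe — rb_funcall is just a frequently-hit MRI entry point to stop at):

```
$ gdb -p <ruby-pid>
(gdb) break rb_funcall      # stop at a "sane" point inside the interpreter
(gdb) continue
(gdb) call rb_eval_string("begin; require 'pry-remote'; binding.remote_pry; rescue Exception; end")
(gdb) delete breakpoints
(gdb) detach
```

The begin/rescue wrapper is there to swallow any error per the caveat above, since an uncaught exception injected this way can take the whole process down.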
It probably could be wrapped up into something quite nice combined with pry/pry-remote, though.
In my case I get "No symbol "rb_backtrace" in current context.", I will look deeper into it.
Also, there is a great gist about Ruby GDB debugging: https://gist.github.com/mmullis/6211061.
One time I was able to go back weeks to fetch some code I'd long since deleted and that was nowhere to be found in git.
" persist undo history to file / across sessions
set undofile
" max out history length
set undolevels=100000
set undoreload=100000
Combine that with https://github.com/mbbill/undotree and you can easily walk the entire history of every file you edit.
(Well, the other big gripe is that when I'm forced to use a program with crippled undo, e.g. any Microsoft product ever, it drives me up the wall. The worst-in-class award goes to Word with Endnote, where many Endnote commands wipe your entire undo history. How people can use that on a permanent basis is beyond me.)
gdb -p `pidof python`
strings foo.core | grep -a -B200 -A200 knowntext
.. would also work.
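To see why: strings pulls every printable run out of the dump, and grep's -B/-A flags keep 200 lines of context around a marker you remember. A toy run (fake.core and its contents are fabricated stand-ins for a real core dump):

```shell
# Plant recognizable text amid binary noise, standing in for a core
# dump that still holds your long-deleted buffer contents.
printf 'x = compute()\x00\x01\x02knowntext\x00\x03junkbytes' > fake.core

# Pull out printable runs, then keep the region around the marker.
strings fake.core | grep -B200 -A200 knowntext
```

This prints the deleted line "x = compute()" alongside the marker, no debugger required.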