Hackme: Deconstructing an ELF File (manoharvanga.com)
147 points by mvanga on Sept 5, 2011 | 19 comments

Don't want to take the wind out of anyone's sails, but this program is hardly hard-to-hack. Bravo for getting to grips with ELF, assembly and reverse engineering. But this article represents just the first few steps on a long and intriguing road.

If it was hard-to-hack then I would expect (at least) the following:

  * Output messages can't be discovered using "strings"
  * Program is self-encrypted
  * Password isn't even stored, just its hash

The "hard-to-hack" program presented would take about 30 seconds to crack using IDA[1].

[1] http://www.hex-rays.com/idapro/

(And I consider myself an amateur at this kind of thing).

That was mostly the point of the article: that it wasn't so hard to hack in the end and all the information needed to break it was visible in plain sight.

Like you said, if you really wanted to write a hard to hack binary, just use a strong hash without the plaintext on a hellish password. Heck, just leave the hash in the strings output :)
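A minimal sketch of the "store only the hash" idea, in Python for brevity. The password, salt, and iteration count here are made up for illustration, and PBKDF2 is just one reasonable choice of strong hash; nothing in the thread says which one to use. In a real binary only the salt and digest would be embedded, never the plaintext.

```python
import hashlib
import hmac

ITERATIONS = 100_000  # assumed work factor, chosen for illustration


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a digest from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Hypothetical values. STORED is computed here only so the sketch is
# self-contained; a shipped program would contain just these bytes.
SALT = bytes.fromhex("00112233445566778899aabbccddeeff")
STORED = hash_password("correct horse battery staple", SALT)


def check_password(attempt: str) -> bool:
    """Compare digests in constant time; the plaintext is never needed."""
    return hmac.compare_digest(hash_password(attempt, SALT), STORED)


print(check_password("correct horse battery staple"))
print(check_password("letmein"))
```

With this layout, running strings on the program reveals at most the digest, which does not help unless the password itself is weak.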

That's brilliant. It reminds me of the "size matters" article I read a couple of years ago, which also has the most difficult title to google: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...

Any chance someone who knows more assembly than me can explain how the symbol names for dlsym() are retrieved?

ie I would have expected to see 'ptrace', 'scanf' and 'printf' in the strings output, but they must be obfuscated in some way (otherwise I guess there's no point using the dlopen/dlsym trick at all.)

I only see one call to dlsym (at 8048506), so it seems to me the program is doing something tricky to build each symbol name string and then calling a routine there to dlsym() it.

That's about where my x86-fu fails me, though, and I remember I should be working on other things. :/
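For anyone unfamiliar with the dlopen/dlsym trick being discussed, here is a rough Python analogue using ctypes. It assumes a POSIX system such as Linux, and it looks up getpid purely as a harmless stand-in; it is not what the hackme binary does, only an illustration of resolving a function from a name that is built at runtime.

```python
import ctypes
import os

# Equivalent of dlopen(NULL): a handle to the symbols already loaded
# into this process (assumes a POSIX system such as Linux).
handle = ctypes.CDLL(None)


def lookup(name: str):
    """Resolve a symbol by name; ctypes performs the dlsym() call for us."""
    return getattr(handle, name)


# The name is assembled at runtime instead of being passed as one literal,
# which is the general shape of the trick.
name = "".join(["get", "pid"])
getpid = lookup(name)

print(getpid() == os.getpid())
```

Because the lookup takes an ordinary string, the program is free to compute that string however it likes before the call.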

The function names are hidden in the .text section; each character is xored with 0x55. You can see the xoring here:

  80484e0: 83 f2 55              xor    $0x55,%edx

The encoded strings are:

  >>> def ascii_to_xored_hex(s, xorval):
  ...   return ''.join(['%02x' % (ord(c) ^ xorval) for c in s])
  ...
  >>> ascii_to_xored_hex('ptrace', 0x55)
  '252127343630'
  >>> ascii_to_xored_hex('printf', 0x55)
  '25273c3b2133'

They're hidden in plain sight!

  mrj10@mjlap:~/Downloads$ xxd hackme | grep 2521
  0000680: 008d 7600 2521 2734 3630 0090 2636 343b  ..v.%!'460..&64;
  mrj10@mjlap:~/Downloads$ xxd hackme | grep 2527
  0000690: 3300 6690 2527 3c3b 2133 0090 6afb 4c8d  3.f.%'<;!3..j.L.

To the disassembler, these strings look like and/xor instruction sequences. For example, for 'ptrace':

   8048684: 25 21 27 34 36        and    $0x36342721,%eax
   8048689: 30 00                 xor    %al,(%eax)

As you can see from the hexdump, these did show up when he ran strings (e.g., %!'460 and %'<;!3); they just weren't recognizable.
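To make the whole chain concrete, here is a small sketch that takes the 32 bytes from the two xxd lines above, pulls out printable runs roughly the way strings does, and XORs each run with 0x55. The helper names and the minimum run length of 4 are my own choices for illustration.

```python
import re

KEY = 0x55  # key taken from the `xor $0x55,%edx` instruction

# Bytes copied from the two xxd lines (file offsets 0x680 to 0x69f).
DUMP = bytes.fromhex(
    "008d 7600 2521 2734 3630 0090 2636 343b"
    "3300 6690 2527 3c3b 2133 0090 6afb 4c8d"
)


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len bytes long."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def xor_decode(data: bytes, key: int = KEY) -> str:
    """XOR every byte with the key and interpret the result as ASCII."""
    return bytes(b ^ key for b in data).decode("ascii")


runs = printable_runs(DUMP)
decoded = [xor_decode(r) for r in runs]

for raw, name in zip(runs, decoded):
    print(raw.decode("ascii"), "->", name)
```

The three runs it finds decode to ptrace, scanf and printf, which are exactly the names that were missing from the plain strings output.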

Nice catch! I didn't notice this because I restricted my search to between the printf calls. The strings output makes more sense now!

Neat, thanks. :)

I have a hardcopy of Paul Carter's "PC Assembly Language" book at home that I started reading once but never finished. Some day... :/

Interesting, but the objdump output is very primitive compared to more advanced disassemblers, which should be able to provide string cross-references etc in-line.

I actually tried to find a good one when I first set out to be lazy, but to no avail.

A few friends also (unsuccessfully) tried in parallel to get the password and they were using IDA (http://www.hex-rays.com/idapro/) but I have not personally tried it. It seems like a good option (although it is not open source, which irks me a little :P).

I also tried to use an existing ASM-to-C decompiler called Boomerang (http://boomerang.sourceforge.net/license.php) but the output was a complete mess to understand (and compile). Maybe I'll try writing one of these when I'm bored on another lazy Friday :)

Any other (preferably open) recommendations for Linux?

I didn't have much luck using Boomerang recently either.

The REC decompiler (http://www.backerstreet.com/rec/rec.htm) isn't horrible. For simple stuff, it'll give you reasonable-looking C-ish code. For anything slightly more complex, it may produce wrong code. It's not so good at eliminating duplicate variables, but manually removing them isn't hard; they're easy to spot.

I've recently been reversing a few stripped DLLs on Windows. REC worked well on the short functions but severely changed the logic of a few more complicated ones, especially doing bit shifts, concatenating bytes, and doing complex loops.

I've seen IDA. I'd love to use it but it's expensive and I don't reverse engineer enough to justify asking the company to buy it. That, and I'd also have to learn how to use it effectively, which would take time and possibly stunt my learning of the basics first. Since I'm most certainly doing my work for commercial purposes, the demo / educational versions of IDA aren't usable for me (the license agreement says so).

EDIT: REC studio does not appear to be free (as in speech) software but it is free (as in beer) to use for most purposes and it runs on Windows / Linux / Mac.

Probably because writing disassemblers is a pain in the ass. (I have a half-finished x86 disassembler written in JavaScript: https://github.com/luser/disasmx86.js )

Not every disassembler, just x86 ones. The opcode + ModR/M + SIB combinations are just insane.

Probably worth noting that you can use --disassembler-options=intel with objdump, if that's your thing. Makes it much nicer for me.

That is much better! I had to dig up the articles on the AT&T syntax to understand the plain objdump output, but with this, that's no longer needed. I need to get to know my tools much better :)

Yeah, I hate to think about how long I suffered using objdump before I thought to check the manpage for that. :)

Any particular disassembler(s) you recommend? How about on other architectures; MIPS or ARM?

For free (as in beer) you can get IDA Pro 5.0 (an older version than the latest.)

Not sure what architectures that version supports outside of x86 (the latest commercial version does ARM and the Advanced variant does MIPS.)

IDA Free has traditionally only supported x86 PE binaries, but I don't know if that's changed with the bump up to 5.0.

Actually when I dabbled with disassemblers it was mostly on win32, where OllyDBG is nice.

But IDA is probably the best you can get. Haven't had a look at the new versions, but even the freeware version is useful (on windows at least)
