Adventures in reverse engineering Broadcom NIC firmware (devever.net)
168 points by jandeboevrie on April 17, 2019 | 34 comments



Wow, I had a very similar LZSS-related experience just a few weeks ago reverse-engineering a VW ECU, although my process didn't end up being as painful. Through luck (and no skill of my own, really), I chose to approach the decompression process by analyzing the data rather than disassembling the decompression routine. LZSS-compressed data appears more compressed/garbled as the file goes on, because the previous contents of the file are used as the dictionary for compression. LZSS basically works by encoding what I'll call a "copy command": a series of bits which tell the decompressor whether to copy subsequent bytes from the "dictionary" (addressed as an offset into the previous content of the file) or to copy the bytes verbatim. Thankfully, the beginning of my file didn't have much repetition for the first 20-30 bytes, so I was able to recognize the periodically-zero bitfield "commands." Next I was able to recognize them as they became nonzero (as items became available in the dictionary) and begin to see how the bitfield drove the decompressor. Doing this again, I could spot LZSS from orbit, but having never seen it (and having no grounding or background in compression algorithms), I ended up putting together enough Google terms around "dictionary compression copy bitfield" to get exceptionally lucky (again!) and land on a page about LZ77, which took me to enough example implementations to allow me to rapidly implement a decompressor.


A wide background experience on computing in general helps greatly for RE; in this case the specific variant of LZ that's used is categorised by an old document I have as "LZ 12/4" (4K sliding window, 16-bit match offsets split into 12 and 4 for offset and length) which was very common in the late 80s/early 90s. A decompressor for it fits in a few dozen bytes of x86, which certainly helped its popularity, and the simplicity meant it was extremely fast.

Unfortunately Google's forgetfulness is infuriating, since the only reference to this variant name I could find is a previous comment I made here about compression algorithms: https://news.ycombinator.com/item?id=14965064
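
For readers who have not met it, the "LZ 12/4" layout described above can be sketched in a few lines of Python. This is a minimal sketch, not the decompressor from the article: the flag-bit order (least significant bit first, 1 meaning literal), the space-filled 4K ring buffer, the starting write position and the minimum match length of 3 are assumptions taken from a commonly seen LZSS layout, and nothing in this thread confirms them for the firmware in question.

```python
N = 4096        # ring buffer (sliding window) size, addressed by 12-bit offsets
F = 18          # longest match: 4-bit length field (0..15) + THRESHOLD + 1
THRESHOLD = 2   # a match always copies more than THRESHOLD bytes


def lzss_decompress(data: bytes) -> bytes:
    """Decompress a 12/4 LZSS stream (layout assumed, see lead-in)."""
    window = bytearray(b" " * N)   # assumed initial dictionary contents
    r = N - F                      # assumed initial write position
    out = bytearray()
    i = 0
    flags = 0
    while i < len(data):
        flags >>= 1
        if not (flags & 0x100):
            # All eight flag bits consumed: fetch the next flag byte.
            # The 0xFF00 marker tells us when eight shifts have happened.
            flags = data[i] | 0xFF00
            i += 1
            if i >= len(data):
                break
        if flags & 1:
            # Literal: copy one byte verbatim.
            c = data[i]
            i += 1
            out.append(c)
            window[r] = c
            r = (r + 1) & (N - 1)
        else:
            # Match: 16 bits split into a 12-bit offset and a 4-bit length.
            if i + 1 >= len(data):
                break
            lo, hi = data[i], data[i + 1]
            i += 2
            pos = lo | ((hi & 0xF0) << 4)
            length = (hi & 0x0F) + THRESHOLD + 1
            for k in range(length):
                c = window[(pos + k) & (N - 1)]
                out.append(c)
                window[r] = c
                r = (r + 1) & (N - 1)
    return bytes(out)
```

With these assumptions, the six-byte stream `07 61 62 63 EE F3` holds three literals followed by one overlapping six-byte match, and decompresses to `abcabcabc`. The leading flag byte is the "periodically-zero bitfield" mentioned in the parent comment: it only becomes interesting once the dictionary has something worth copying.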


I've dealt with a few.

One turned out to be LZRW3, kind of like LZSS but with the 12-bit string source being a hash table index. That was totally reverse engineered from staring at hex bytes, and then later we found out the identity of the compression algorithm.

Another was never discovered. It was one of two compression algorithms used by VxWorks, as part of transitioning between boot stages. My solution was to put the entire RAM content into an ELF file (including all of VxWorks) as one big section, then hack it up to run as a Linux binary. I thus made a decompression program.

Writing an emulator is also a great choice. I do that a lot.


Fun fact - I wrote some customised firmware (video stream offload) for the Tigon2 AceNIC ancestor in this tale. The original MIPS firmware solved the lack of mul/div ... by carefully writing code that never emitted those instructions. That caused a few head scratching moments!

Also, an interesting feature of this architecture was that it had no interrupt mechanism. There was a hardware ‘event’ bit register, and a special instruction to convert that into a ‘most important event’ offset into the dispatch table. This made all race/concurrency issues go away - and the code was easy to reason about.
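
The event-register scheme described above can be sketched roughly as follows. This is an illustrative model only: the function names, the choice of the lowest set bit as the "most important" event, and clearing the bit in software are all assumptions, not details of the Tigon2 hardware.

```python
def most_important_event(events: int) -> int:
    """Return the index of the lowest set bit, or -1 if no event is pending.

    Treating the lowest bit as the highest priority is an assumption.
    """
    if events == 0:
        return -1
    return (events & -events).bit_length() - 1


def dispatch_once(events: int, handlers) -> int:
    """Run the handler for the most important pending event.

    Returns the event register with that event's bit cleared.
    """
    idx = most_important_event(events)
    if idx < 0:
        return events
    handlers[idx]()              # the handler runs to completion
    return events & ~(1 << idx)


def main_loop(events: int, handlers) -> None:
    """Drain all pending events, one handler at a time."""
    while events:
        events = dispatch_once(events, handlers)
```

Because each handler runs to completion before the next event is examined, no handler can be preempted halfway through an update, which is one plausible reading of why the races went away.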



>> reverse engineering as a process tends to alternate between periods of exhilaration and of feeling like it's completely hopeless and there's no prospect of ever figuring out what's going on.

Having RE'd and even patched bugs in closed source firmware myself (mostly for ARM), I can tell you those words describe the process quite well.

Perhaps it's worst when you have to do it for work and not for sport, as was the case most of the times I did it.


I do it for work all the time. Maybe that does take something away from sport, like a car mechanic losing the desire to tinker with his own car every day. One does need an income though, and IMHO you might as well do something you basically enjoy. See "Who is hiring?" post at https://news.ycombinator.com/item?id=19543995 if it seems like your thing.


Thanks. I have no complaints about doing it either for work or for sport; I just wanted to say that nothing happens if you give up and admit defeat when doing it for sport. At work, when reverse-engineering, there are two options: you either figure it out, or you figure it out.


>I was also then able to figure out the origins of the compression algorithm; it's called LZSS, and the particular LZSS format used here turns out to originate from some public domain DOS code which someone posted on a Japanese BBS in 1988.

Seriously? One of the latest high-performance, server-grade NIC firmwares from a very famous vendor still uses code that originated from somebody's post on a Japanese BBS in 1988? We can guess the quality of the rest of the firmware from this fact.


There is probably no reason to use anything else. You want a small and fast decompressor and do not care that much about the compression ratio. And it is not as if things like FastLZ or LZO are so much better as to be worth the effort. (Also, one can assume that when this was originally written the only widely used algorithm of this class was LZO, whose commercial licence IIRC is not particularly cheap.)


FTA :

>>> Since this entire reverse engineering project involved my extensive exposure to reverse engineered, proprietary code, I can't exactly just go and write FOSS firmware for this thing.

he's even good at handling the lawyer stuff :-)


This is also why I think you should never use your real identity on projects like this.

Note that in some countries there is a legal right to RE anything, so people there could be more lax about it.


Will the software produced using this documentation be legal to use?



Where? I'm pretty sure that here in Poland he would be able to write a FLOSS driver himself.


What a hero! I would also love to hear about the tools/technique used in the process.


> Actually compiling this turned out to be an amusing exercise, because MIPS cores without hardware multiply or divide support aren't officially a thing anymore, which means that neither clang or GCC support targeting such devices.

I'm curious what the demand is for a simple, non-optimizing C compiler that translates code into the most straightforward assembly possible (i.e. a true "portable assembler").


Several of those are available already. https://bellard.org/tcc/ is one example, although not "portable" in the sense that it only has an x86 backend.


> not "portable" in the sense that it only has an x86 backend

Yeah, that's kind of a dealbreaker.


On the other hand, once you have a portable intermediate representation, and code for various backends, it's hard to imagine anything smaller than LLVM?

Assembly itself is highly non-portable.


Define simple - what do you leave out? From the article, not having divide or multiply is an anomaly.


All you need is `mov`:

https://github.com/xoreaxeaxeax/movfuscator/

Including the obligatory port of doom:

https://github.com/xoreaxeaxeax/movfuscator/tree/master/vali...

Note: The mov-only DOOM renders approximately one frame every 7 hours, so playing this version requires somewhat increased patience.


I'm saying something like a very simple mapping between operations and their assembly, which means it's easy to, say, rip out the multiplication and division and replace them with an alternate implementation in the code generator.
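
As an illustration of what such an "alternate implementation" could look like, here is a minimal sketch of 32-bit multiply and unsigned divide built only from shifts, adds, subtracts and compares, modelled in Python. The function names are hypothetical, and this is not what the article's author or any particular compiler runtime actually emits.

```python
MASK32 = 0xFFFFFFFF


def soft_mul32(a: int, b: int) -> int:
    """Shift-and-add multiply; returns the low 32 bits of a * b."""
    a &= MASK32
    b &= MASK32
    result = 0
    while b:
        if b & 1:                       # add the shifted multiplicand
            result = (result + a) & MASK32
        a = (a << 1) & MASK32
        b >>= 1
    return result


def soft_divmod32(n: int, d: int):
    """Restoring division on unsigned 32-bit values.

    Returns (quotient, remainder).
    """
    n &= MASK32
    d &= MASK32
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    r = 0
    for bit in range(31, -1, -1):
        r = (r << 1) | ((n >> bit) & 1)  # bring down the next dividend bit
        if r >= d:
            r -= d
            q |= 1 << bit
    return q, r
```

A code generator in this style would lower each multiply or divide to a call to a routine like these, at the cost of up to 32 loop iterations per operation.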


It would be interesting to read about the tools used and more technical information on the RE process.


There weren't that many off-the-shelf tools involved, other than things like binwalk and disassemblers. My workflow can also be hilariously ghetto at times: I like to output hexdump -C to a file, then annotate that in vim. For example: https://github.com/hlandau/ortega/blob/master/notes/bcm5719_...

Lots of tools were written from scratch. I wrote otgdbg for probing the device; this program has tons of subcommands to let me manipulate the device in various ways, get/set registers, boot a program on the MIPS side from memory, boot a program on the APE side from memory, copy a new image to flash, etc.

otgimg examines firmware images and prints information about them, like the MAC addresses in the configuration block, etc. apeimg shows information about APE firmware images and can decompress them.

Since the image formats are custom, I had to use linker scripts to build the images, but some fixups could only be done programmatically, like calculating CRC fields. These fixups were done with small C programs which the build system runs afterwards.

The APE used a more sophisticated image format with section headers, etc. The fixup program for the APE had to compress some of the sections, etc. before setting the CRCs.
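
To give a flavour of what a post-link fixup step does (the real ones are small C programs, as noted above), here is a minimal Python sketch that patches a checksum field into an image. The function name, the standard zlib CRC-32, the little-endian field and the "everything before the field" coverage are all assumptions for illustration; the actual image formats define their own CRC variant and layout.

```python
import struct
import zlib


def fix_crc(image: bytes, crc_offset: int) -> bytes:
    """Return a copy of image with a CRC-32 stored at crc_offset.

    The CRC covers every byte before crc_offset (assumed layout) and is
    written as a 4-byte little-endian value (assumed byte order).
    """
    if crc_offset < 0 or crc_offset + 4 > len(image):
        raise ValueError("CRC field does not fit inside the image")
    crc = zlib.crc32(image[:crc_offset]) & 0xFFFFFFFF
    return image[:crc_offset] + struct.pack("<I", crc) + image[crc_offset + 4:]
```

The linker script would leave four placeholder bytes at the field's position, and the build system would run a step like this on the linked output.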

These tools are all available in the repository, but most of them link to small amounts of proprietary/reversed code which is automatically scrubbed from the public release. It's not a large amount of code which would need to be replaced, though, if someone wants a tool like otgdbg to probe Broadcom NICs in arbitrary ways.

Oh, I should also mention that using clang and lld rather than gcc/binutils made targeting different architectures a breeze. It's long been a bone of irritation to me that you have to recompile gcc to retarget it; with clang, I could target both MIPS and ARM without compiling a new toolchain. https://github.com/hlandau/ortega/blob/master/cc_mips https://github.com/hlandau/ortega/blob/master/cc_arm


Great work Hugo. Thank you!


I'm not sure there is much standard tooling to a job like this, other than staring at hex dumps a lot. He mentions binwalk, which is a good place to start (see https://reverseengineering.stackexchange.com/questions/17262... for a good workflow).

Other than that it's a question of building job-specific scaffolding as you go.

> I decided to “emulate” x86 real mode inside C by translating x86 real mode disassembly into C very directly, modelling segment registers explicitly in C

(from the section described as "mentally draining")

Once he'd figured out the algorithm:

> the particular LZSS format used here turns out to originate from some public domain DOS code which someone posted on a Japanese BBS in 1988

Ah, the glory of the Internet long tail.

> After writing the shellcode to facilitate this mode of access, I finally had a way to access the APE's address space.

(relating to the use of provided functions to inject code into the APE, allowing inspection of its point of view and boot code)

> Although the diagnostic tool was quite helpful, ironically this is not because I ever managed to run it. Neither the DOS, UEFI nor Windows versions have ever worked for me. Instead, the diagnostic tools are useful because they contain various routines to probe APE registers, and then print the contents of these registers along with their names. It's not much information, but it's all I have, and it makes all the difference. Pretty much everything I know about the APE that isn't guessed is from the dry reverse engineering of this diagnostic tool.

More reverse engineering heroics, this time on the support tools.

Rather like marathon running or the ascent of Everest, a prime attribute for doing the thing is a refusal to give up long after the process has become painful and unrewarding.


On the topic of tools, the National Security Agency (!) very recently open sourced (!) their tool for reverse engineering under the Apache 2.0 license (!) and it apparently does quite well against the closed source and expensive IDA. Check ghidra out. They are actively pulling in patches from the community, tracking issues, etc. I think this will progress forward quite quickly.

https://github.com/NationalSecurityAgency/ghidra


Note that ghidra is not that useful for analyzing things that are not well-formed loadable object files, because (in comparison to IDA) it relies too much on autodetection.

See for example almost any screenshot of ghidra in mainstream-ish media, which shows completely nonsensical i386 disassembly interspersed with meaningful-looking autogenerated labels and comments. That is what reliably happens when you feed a .NET binary into the thing, as it sees embedded debug info and proceeds to disassemble the .NET bytecode as i386 code...


For RE/exploration the most popular tools at the moment are probably radare2 (with various guides), hopper, and ida. Maybe ollydbg if you deal with windows. Each has its fans.


Related: Reverse engineering the Qualcomm baseband processor talk from 28C3

http://events.ccc.de/congress/2011/Fahrplan/attachments/2022...

https://youtu.be/IWSCdpAeONA


BMC = Baseboard Management Controller (for those like me who don't know the acronym)


I feel obliged to confess here that whenever I come upon "NVM" in text like this I have to remind myself not to just skip to the next paragraph (because "nevermind").


That was a wild ride. If he thinks he’s bad at reverse engineering, I wonder who’s good at it.



