
Adventures in reverse engineering Broadcom NIC firmware - jandeboevrie
https://www.devever.net/~hl/ortega
======
bri3d
Wow, I had a very similar LZSS related experience just a few weeks ago
reverse-engineering a VW ECU, although my process didn't end up being as
painful. Through luck (and no skill of my own, really), I chose to approach
the decompression process by analyzing the data rather than disassembling the
decompression routine. LZSS compressed data appears more compressed/garbled as
the file goes on as the previous contents of the file are used as the
dictionary for compression. LZSS basically works by encoding what I'll call a
"copy command" - a series of bits which tell the decompressor whether to copy
subsequent bytes from the "dictionary" (which is an offset into the previous
content of the file) or to copy the bytes verbatim. Thankfully, the beginning
of my file didn't have much repetition for the first 20-30 bytes, so I was
able to recognize the periodically-zero bitfield "commands." Next I was able
to recognize where they became nonzero (as items became available in the
dictionary) and begin to see how the bitfield drove the decompressor. Doing
this again, I could spot LZSS from orbit, but having never seen it (and having
no grounding or background in compression algorithms), I ended up putting
together enough Google terms around "dictionary compression copy bitfield" to
get exceptionally lucky (again!) and land on a page about LZ77, which took me
to enough example implementations to allow me to rapidly implement a
decompressor.
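
As a rough illustration of the "copy command" idea described above, here is a
minimal Python sketch of an LZSS-style decoder. The stream layout is a made-up
toy format, not the one used by the VW ECU or the Broadcom firmware: each flag
byte covers the next eight items (least significant bit first), a 0 bit means a
literal byte, and a 1 bit means a two-byte (distance, length) copy from the
already-decoded output. The name `lzss_decompress` and the byte layout are
assumptions for illustration only.

```python
def lzss_decompress(data: bytes) -> bytes:
    """Decode a toy LZSS stream (hypothetical format, see the text above)."""
    out = bytearray()
    i = 0
    while i < len(data):
        flags = data[i]  # one flag byte governs the next 8 items
        i += 1
        for bit in range(8):
            if i >= len(data):
                break  # stream ended partway through a flag group
            if flags & (1 << bit):
                # Copy command: 1-byte distance back, then 1-byte length.
                if i + 1 >= len(data):
                    raise ValueError("truncated copy command")
                distance, length = data[i], data[i + 1]
                i += 2
                if distance == 0 or distance > len(out):
                    raise ValueError("copy reaches outside decoded data")
                for _ in range(length):
                    # Byte by byte, so overlapping copies (length > distance)
                    # repeat the most recent bytes, as LZ77-family codecs expect.
                    out.append(out[-distance])
            else:
                out.append(data[i])  # literal byte, copied verbatim
                i += 1
    return bytes(out)


# Flag byte 0x00: eight literals. Flag byte 0x01: one copy (distance 8, length 8).
stream = bytes([0x00]) + b"abcdefgh" + bytes([0x01, 8, 8])
decoded = lzss_decompress(stream)
```

With little repetition at the start of a file, the flag bytes stay zero, which
is the periodically-zero pattern mentioned above; they only become nonzero once
the decoded output contains something worth copying.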

~~~
userbinator
Broad background experience in computing in general helps greatly for RE; in
this case the specific variant of LZ that's used is categorised by an old
document I have as "LZ 12/4" (4K sliding window, 16-bit match tokens split
into 12 and 4 bits for offset and length) which was _very_ common in the late
80s/early 90s. A decompressor for it fits in a few dozen bytes of x86, which
certainly helped its popularity, and the simplicity meant it was extremely
fast.
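
To make the "12/4" split concrete, here is a small Python sketch of packing and
unpacking a 16-bit match token. The exact bit layout (offset in the high 12
bits, length in the low 4 bits, big-endian) and the minimum match length of 3
are assumptions; real variants differ in byte order and in how the offset is
interpreted. `pack_token` and `unpack_token` are hypothetical names.

```python
WINDOW_SIZE = 1 << 12  # 12-bit offset addresses a 4K sliding window
MIN_MATCH = 3          # assumed: shorter matches are cheaper as literals
MAX_MATCH = MIN_MATCH + 15  # 4-bit length field stores (length - MIN_MATCH)


def pack_token(offset: int, length: int) -> bytes:
    """Pack a match into 2 bytes: 12 bits of offset, 4 bits of length."""
    if not 0 <= offset < WINDOW_SIZE:
        raise ValueError("offset does not fit in 12 bits")
    if not MIN_MATCH <= length <= MAX_MATCH:
        raise ValueError("length does not fit in 4 bits")
    word = (offset << 4) | (length - MIN_MATCH)
    return word.to_bytes(2, "big")


def unpack_token(pair: bytes) -> tuple[int, int]:
    """Inverse of pack_token: returns (offset, length)."""
    word = int.from_bytes(pair[:2], "big")
    return word >> 4, (word & 0xF) + MIN_MATCH
```

Under these assumptions a token can describe matches of 3 to 18 bytes anywhere
in the last 4096 bytes, which is part of why the decoder can be so small.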

Unfortunately Google's forgetfulness is infuriating, since the only reference
to this variant name I could find is a previous comment I made here about
compression algorithms:
[https://news.ycombinator.com/item?id=14965064](https://news.ycombinator.com/item?id=14965064)

------
samlittlewood
Fun fact - I wrote some customised firmware for the Tigon2 AceNIC ancestor in
this tale - (video stream offload). The original MIPS firmware solved the lack
of mul/div ... by carefully writing code that never emitted those
instructions. That caused a few head scratching moments!
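
For illustration only: the comment does not say how the firmware avoided
multiplies and divides, but when such operations cannot be avoided on a core
without the instructions, a common fallback is shift-and-add routines. Below is
a Python sketch modelling 32-bit unsigned arithmetic; `soft_mul32` and
`soft_divmod32` are hypothetical names, not anything from the Tigon2 firmware.

```python
MASK32 = 0xFFFFFFFF


def soft_mul32(a: int, b: int) -> int:
    """32-bit unsigned multiply using only shifts, adds and bit tests."""
    a &= MASK32
    b &= MASK32
    result = 0
    while b:
        if b & 1:  # add the shifted multiplicand for each set multiplier bit
            result = (result + a) & MASK32
        a = (a << 1) & MASK32
        b >>= 1
    return result


def soft_divmod32(n: int, d: int) -> tuple[int, int]:
    """32-bit unsigned restoring division: returns (quotient, remainder)."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    n &= MASK32
    d &= MASK32
    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((n >> bit) & 1)
        if remainder >= d:
            remainder -= d
            quotient |= 1 << bit
    return quotient, remainder
```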

Also, an interesting feature of this architecture was that it had no interrupt
mechanism. There was a hardware ‘event’ bit register, and a special
instruction to convert that into a ‘most important event’ offset into a
dispatch table. This made all race/concurrency issues go away - and the code
was easy to reason about.
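
A rough Python model of that event-dispatch style is sketched below. It assumes
the highest-numbered set bit is the "most important" event and that each handler
runs to completion before the next is chosen; the real hardware's priority
order and register layout are not given in the comment, and
`highest_priority_event` and `run_pending` are hypothetical names.

```python
from typing import Callable, Mapping, Optional


def highest_priority_event(events: int) -> Optional[int]:
    """Index of the most important pending event bit, or None if idle."""
    if events == 0:
        return None
    return events.bit_length() - 1  # assumed: highest set bit wins


def run_pending(events: int, handlers: Mapping[int, Callable[[], str]]) -> list:
    """Service every pending event, most important first, one at a time."""
    log = []
    while events:
        index = highest_priority_event(events)
        events &= ~(1 << index)       # acknowledge the event
        log.append(handlers[index]())  # handler runs to completion
    return log


# Example dispatch table with two hypothetical event sources.
handlers = {0: lambda: "timer", 3: lambda: "rx"}
order = run_pending(0b1001, handlers)
```

Because nothing preempts a running handler, shared state is only ever touched
by one handler at a time, which is the property the comment credits for the
absence of race conditions.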

------
voltagex_
[https://github.com/hlandau/ortega/blob/master/rtg-spec.md#fu...](https://github.com/hlandau/ortega/blob/master/rtg-spec.md#fucking-broadcom)
is pretty funny/sad.

------
scoutt
>> _reverse engineering as a process tends to alternate between periods of
exhilaration and of feeling like it's completely hopeless and there's no
prospect of ever figuring out what's going on._

Having RE'd and even patched bugs in closed-source firmware myself (mostly for
ARM), I can tell you those words describe the process quite well.

Perhaps it's worse when you have to do it for _work_ and not for _sport_,
as was the case most of the times it happened to me.

~~~
souprock
I do it for work all the time. Maybe that does take something away from sport,
like a car mechanic losing the desire to tinker with his own car every day.
One does need an income though, and IMHO you might as well do something you
basically enjoy. See "Who is hiring?" post at
[https://news.ycombinator.com/item?id=19543995](https://news.ycombinator.com/item?id=19543995)
if it seems like your thing.

~~~
scoutt
Thanks. I have no complaints about doing it for either _work_ or _sport_; I
just wanted to say that nothing happens if you give up and admit defeat when
doing it for _sport_. For _work_, on the other hand, there are two options
when reverse-engineering: you either figure it out, or you figure it out.

------
ezoe
>I was also then able to figure out the origins of the compression algorithm;
it's called LZSS, and the particular LZSS format used here turns out to
originate from some public domain DOS code which someone posted on a Japanese
BBS in 1988.

Seriously? The firmware of one of the latest high-performance, server-grade
NICs from a very famous vendor still uses code that originated from somebody's
post on a Japanese BBS in 1988? We can guess the quality of the rest of the
firmware from this fact.

~~~
dfox
There is probably no reason to use anything else. You want a small and fast
decompressor and do not care that much about the compression ratio. And it is
not as if things like FastLZ or LZO are so much better as to be worth the
effort. (Also, one can assume that when this was originally written, the only
widely used algorithm of this class was LZO, whose commercial licence IIRC is
not particularly cheap.)

------
wiz21c
FTA:

>>> Since this entire reverse engineering project involved my extensive
exposure to reverse engineered, proprietary code, I can't exactly just go and
write FOSS firmware for this thing.

he's even good at handling the lawyer stuff :-)

~~~
simula67
Will the software produced using this documentation be legal to use?

~~~
hlandau
Yes. That's the whole point.
[https://en.wikipedia.org/wiki/Clean_room_design](https://en.wikipedia.org/wiki/Clean_room_design)

------
huxflux
What a hero! I would also love to hear about the tools/technique used in the
process.

------
saagarjha
> Actually compiling this turned out to be an amusing exercise, because MIPS
> cores without hardware multiply or divide support aren't officially a thing
> anymore, which means that neither clang nor GCC support targeting such
> devices.

I'm curious what the demand is for a simple, non-optimizing C compiler that
translates code into the most straightforward assembly possible (i.e. a true
"portable assembler").

~~~
voltagex_
Define simple - what do you leave out? From the article, not having divide or
multiply is an anomaly.

~~~
stevekemp
All you need is `mov`:

[https://github.com/xoreaxeaxeax/movfuscator/](https://github.com/xoreaxeaxeax/movfuscator/)

Including the obligatory port of doom:

[https://github.com/xoreaxeaxeax/movfuscator/tree/master/vali...](https://github.com/xoreaxeaxeax/movfuscator/tree/master/validation/doom)

Note: The mov-only DOOM renders approximately one frame every 7 hours, so
playing this version requires somewhat increased patience.

------
xvilka
Would be interesting to read about the tools used and more technical
information on the RE process.

~~~
hlandau
There weren't that many off-the-shelf tools involved, other than things like
binwalk and disassemblers. My workflow can also be hilariously ghetto at
times: I like to output hexdump -C to a file, then annotate that in vim. For
example:
[https://github.com/hlandau/ortega/blob/master/notes/bcm5719_...](https://github.com/hlandau/ortega/blob/master/notes/bcm5719_talos.txt)

Lots of tools were written from scratch. I wrote otgdbg for probing the
device; this program has tons of subcommands to let me manipulate the device
in various ways, get/set registers, boot a program on the MIPS side from
memory, boot a program on the APE side from memory, copy a new image to flash,
etc.

otgimg examines firmware images and prints information about them, like the
MAC addresses in the configuration block, etc. apeimg shows information about
APE firmware images and can decompress them.

Since the image formats are custom, I had to use linker scripts to build the
images, but some fixups could only be done programmatically, like calculating
CRC fields. These fixups were done with small C programs which the build
system runs afterwards.
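
The post-link fixup step might look roughly like the sketch below. The author's
fixups are small C programs; this Python version is only an illustration, and
it assumes a layout that is not from the source: a standard CRC-32 (as computed
by `zlib.crc32`) stored little-endian in the last four bytes of the image. The
real Broadcom formats may use a different CRC variant, field position and byte
order. `patch_crc` and `crc_ok` are hypothetical names.

```python
import struct
import zlib


def patch_crc(image: bytes) -> bytes:
    """Return the image with its trailing 4-byte CRC field filled in."""
    if len(image) < 4:
        raise ValueError("image too small to hold a CRC field")
    body = image[:-4]  # everything the CRC covers
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", crc)  # assumed little-endian field


def crc_ok(image: bytes) -> bool:
    """Check that the trailing CRC field matches the image body."""
    if len(image) < 4:
        return False
    (stored,) = struct.unpack("<I", image[-4:])
    return stored == zlib.crc32(image[:-4]) & 0xFFFFFFFF


# The linker leaves a zeroed placeholder; the fixup fills it in afterwards.
linked = b"firmware image contents" + b"\x00\x00\x00\x00"
fixed = patch_crc(linked)
```

Doing this as a separate pass makes sense because the CRC depends on the final
bytes of the image, which are only known once linking has finished.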

The APE used a more sophisticated image format with section headers, etc. The
fixup program for the APE had to compress some of the sections, etc. before
setting the CRCs.

These tools are all available in the repository, but most of them link to
small amounts of proprietary/reversed code which is automatically scrubbed
from the public release. It's not a large amount of code which would need to
be replaced, though, if someone wants a tool like otgdbg to probe Broadcom
NICs in arbitrary ways.

Oh, I should also mention that using clang and lld rather than gcc/binutils
made targeting different architectures a breeze. It's long been a bone of
irritation to me that you have to recompile gcc to retarget it; with clang, I
could target both MIPS and ARM without compiling a new toolchain.
[https://github.com/hlandau/ortega/blob/master/cc_mips](https://github.com/hlandau/ortega/blob/master/cc_mips)
[https://github.com/hlandau/ortega/blob/master/cc_arm](https://github.com/hlandau/ortega/blob/master/cc_arm)

~~~
markjenkinswpg
Great work Hugo. Thank you!

------
ignoramous
Related: Reverse engineering the Qualcomm baseband processor talk from 28C3

[http://events.ccc.de/congress/2011/Fahrplan/attachments/2022...](http://events.ccc.de/congress/2011/Fahrplan/attachments/2022_11-ccc-qcombbdbg.pdf)

[https://youtu.be/IWSCdpAeONA](https://youtu.be/IWSCdpAeONA)

------
delta1
BMC = Baseboard Management Controller (for those like me who don't know the
acronym)

~~~
ncmncm
I feel obliged to confess here that whenever I come upon "NVM" in text like
this I have to remind myself not to just skip to the next paragraph (because
"nevermind").

------
monochromatic
That was a wild ride. If he thinks he’s bad at reverse engineering, I wonder
who’s good at it.

