
Reverse engineering a custom CPU from a single program - nneonneo
https://www.robertxiao.ca/hacking/dsctf-2019-cpu-adventure-unknown-cpu-reversing/
======
Arathorn
I spent a summer doing something very similar to this at a major military
radio manufacturer in the mid 90s - it turned out that one of their product
lines from the late 70s used an entirely custom 8-bit CPU for which the
instruction set had somehow been entirely lost. However, they still had the
firmware on a stack of EPROMs. So, the mission was to reverse engineer the old
CPU to reimplement it on a modern DSP. Turned out that you can get
surprisingly far based on a frequency analysis of things that look like
opcodes ("let's assume it has two accumulator registers; that loading is the
most common instruction; etc."), making some educated guesses about how the
designer would have allocated the opcode bits, and then plonking an HP logic
state analyser straight over the top of the 32-pin DIL to check the
hypothesis. Fun times :)
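
For anyone wanting to try the frequency-analysis step, here is a minimal sketch in Python. It is not the tooling used back then: the function names, the sample bytes and the choice of the high nibble as a guessed opcode field are all assumptions for illustration.

```python
from collections import Counter

def byte_histogram(image: bytes, top: int = 5) -> list[tuple[int, int]]:
    """Return the `top` most common byte values with their counts."""
    return Counter(image).most_common(top)

def field_histogram(image: bytes, mask: int, shift: int) -> Counter:
    """Count the values of one bit field, e.g. a guessed opcode field."""
    return Counter((b & mask) >> shift for b in image)

# Hypothetical toy "firmware": 0xA1 is deliberately the most frequent byte.
sample = bytes([0xA1, 0x10, 0xA1, 0x22, 0xA1, 0x10, 0x3F])

print(byte_histogram(sample, top=2))
print(field_histogram(sample, mask=0xF0, shift=4).most_common(1))
```

On a real image you would run this over bytes you believe are instruction starts, then test a hypothesis such as "the most common value is a load" against the hardware.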

~~~
nomadluap
I'm curious as to why you call the package "DIL" instead of "DIP". I've never
heard them called "DIL" before.

~~~
woodrowbarlow
i'm gonna guess it's because L and P are next to each other on the keyboard.

~~~
misterdoubt
Good ol' DI;s

------
phire
This is pretty much the same process we went through to reverse engineer the
custom "VPU" instruction set for the co-processor that the Raspberry Pi's
firmware runs on.

We used the publicly available bootcode.bin and loader.bin to RE most of the
ISA before the Pis even started shipping, though there were some more obscure
instructions that we weren't sure about until we could run our own code.

But when my pi did arrive, we knew enough to write a binary that would blink
an LED on more-or-less the first attempt.

I guess the real lesson is that custom ISAs are not a good form of security.

~~~
earenndil
Security by obscurity is, in general, not a great idea.

~~~
SomeOldThrow
Depends on your needs and the threat profile.

------
ccurrens
Cool! This reminds me a lot of this[0] where there was a specification for a
custom VM in a newspaper along with a binary. It still blows me away when I
read how they solved the puzzle.

[0] [https://safiire.github.io/blog/2017/08/19/solving-danish-def...](https://safiire.github.io/blog/2017/08/19/solving-danish-defense-intelligence-puzzle/)

~~~
jve
I imagine that this would be a good way to find some of the best reverse-
engineers in the world :)

~~~
wolfgke
> I imagine that this would be a good way to find some of the best reverse-
> engineers in the world :)

Just a consideration: aren't the names of these persons "principally" known
(at least if you are willing to do some investigations) if you are a
company/government agency that has an interest in them?

~~~
lallysingh
Nope. How would you find out who reverse engineers a lot and gets good at it?

Some names will be popular through fame or common channels, but you'll never
get a full list. Especially RE when some of their activities aren't legal and
they don't want to be found.

~~~
wolfgke
> Especially RE when some of their activities aren't legal and they don't want
> to be found.

I have trouble believing that if you don't want to be found, you will
participate in a reverse-engineering competition with your real-life identity.

~~~
lallysingh
No, you wouldn't. I was responding to this idea that all the really skilled
people are well known.

------
anyfoo
Hah, I use that "resize the window until aligned/a pattern emerges" trick,
too. If you think about it, humans' pattern recognition over vision works
impressively well. I'm sure there is plenty of reason why we evolved that way,
but the fact that you can take that ability and adapt it to something which is
completely artificial and "unnatural" (file representations on a screen),
completely without any conscious effort (you just resize the screen until you
suddenly intuitively "perceive" a very abrupt and marked change), is amazing.
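
The same trick can be roughly automated. The sketch below is an assumption-laden illustration, not what the article did: for each candidate row width it measures how often a byte equals the byte exactly one row below it, and picks the width where columns line up best. The names `alignment_score` and `best_width` and the sample records are hypothetical.

```python
from fractions import Fraction

def alignment_score(data: bytes, width: int) -> Fraction:
    """Fraction of bytes equal to the byte exactly `width` positions later."""
    pairs = len(data) - width
    if pairs <= 0:
        return Fraction(0)
    matches = sum(1 for i in range(pairs) if data[i] == data[i + width])
    return Fraction(matches, pairs)

def best_width(data: bytes, max_width: int = 64) -> int:
    """Return the smallest candidate width with the highest alignment score.

    Multiples of the true width tend to score equally well; exact fractions
    keep such ties exact, and max() then returns the first (smallest) one.
    """
    return max(range(1, max_width + 1), key=lambda w: alignment_score(data, w))

# Hypothetical data: 5-byte records that always start with 0xFF.
records = b"".join(
    bytes([0xFF, i, (i * 3) % 256, (i * 7) % 256, (i * 11) % 256])
    for i in range(1, 40)
)

print(best_width(records))
```

Real binaries are noisier than this, so it is worth printing the top few widths rather than trusting a single winner; the eyeball method has the advantage of noticing patterns a simple equality test misses.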

~~~
souprock
That is fun with an FPGA bitstream. Used parts of the chip can look like a
fluffy cloud.

------
zxcvgm
I was following along OpenTechLab [0] as they tried to reverse engineer a real
CPU used in HDMI repeaters. The instruction set was already partially reversed
[1], but gaining code execution allowed a small stub to be written to infer
more based on before & after register states.

[0] [https://opentechlab.org.uk/](https://opentechlab.org.uk/)

[1] [https://github.com/v3l0c1r4pt0r/lkv-wiki/wiki/Instruction-Se...](https://github.com/v3l0c1r4pt0r/lkv-wiki/wiki/Instruction-Set-Architecture)
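
To illustrate the before/after idea, here is a small sketch in Python. It is not the OpenTechLab stub: it assumes you already have register dumps from the target as dictionaries, and the candidate operation table, the register names and the 32-bit width are all hypothetical.

```python
import operator

# Hypothetical table of two-operand operations to test against observations.
CANDIDATES = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}

def changed_registers(before: dict[str, int], after: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {register: (old, new)} for every register the instruction changed."""
    return {r: (before[r], after[r]) for r in before if before[r] != after[r]}

def consistent_ops(samples: list[tuple[int, int, int]], bits: int = 32) -> list[str]:
    """Names of candidate ops that explain every (a, b, result) observation."""
    mask = (1 << bits) - 1
    return [
        name for name, fn in CANDIDATES.items()
        if all(fn(a, b) & mask == r for a, b, r in samples)
    ]

# Made-up observations of one unknown opcode with different inputs.
before = {"r1": 0b1100, "r2": 0b1010, "r3": 0}
after = {"r1": 0b1100, "r2": 0b1010, "r3": 0b0110}
print(changed_registers(before, after))
print(consistent_ops([(0b1100, 0b1010, 0b0110), (5, 1, 4)]))
```

One observation is rarely enough: the second sample alone, (5, 1, 4), fits both "sub" and "xor", which is why you feed the unknown instruction several different inputs before settling on a meaning.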

~~~
v3l0c1r4pt0r
In the end, the CPU in question turned out to be an OpenRISC core, but it was
nevertheless quite an interesting challenge. I did my part based entirely on
a source file that I was sure had been compiled into the binary (part of
FreeRTOS), but yes, being able to execute code and observe the results
allowed the guy from OpenTechLab to achieve a lot more than I did.

------
archi42
As someone who sees a lot of different assemblers I really enjoyed this read.
But the most important lesson learned for the next CTF: Just probe the PRNG
and see if it is predictable :P

~~~
gp2000
Yes, enjoyable and good ideas. I feel like I've been on that "project" before.
You do a bunch of work, dig into things and then discover a simple answer
that, in hindsight, could have been applied immediately and made all the
investigation irrelevant.

Usually out of pride or the sunk cost fallacy (or something like it) I'll
convince myself there was no other way the problem was going to be solved.
Either way the next time around I spend just a little bit longer trying to
think of an easy way out.

------
saagarjha
> We internally had a bunch other names for these things – I called them
> kibbles, and Zach called them hecs.

Being a CTF challenge, I'm surprised that they didn't settle on something
decidedly more rude ;) I wonder if the organizers can release their assembler
for the architecture, or a spec at least…

~~~
q3k
The author has released their tooling:
[https://github.com/koriakin/cpuadventure/blob/master/README....](https://github.com/koriakin/cpuadventure/blob/master/README.md)

~~~
saagarjha
Thanks!

------
djmips
I'll contribute the obvious observation that game system emulators are reverse
engineered in this way. Most of the time CPU specs are available but in some
cases a weird custom DSP on a cart or other coprocessor needs to be figured
out and this is the kind of puzzling that does that.

------
classified
I once reverse-engineered the complete instruction set of the CPU in a pocket
calculator with a built-in BASIC interpreter. That was fun. My disassembler
and assembler BASIC programs are still functional today.

~~~
sq_
That sounds awesome. If you don’t mind, could you talk about what you did some
more?

~~~
classified
It's a Sharp PC-E500S. The BASIC has PEEK, POKE, and CALL (for running
assembly) commands, which is all you need for a hacking orgy. I'm not sure you
could even still get hold of that hardware today.

------
zests
How did they even get the game running in the first place?

~~~
artemist
There was a server that we socat'ed into. (I made some minor contributions to
solving this problem on PPP)

------
vectorEQ
very nice write up :D good job!

