
An 8086 PC emulator in 4043 bytes - epsylon
http://ioccc.org/2013/cable3/hint.html
======
mmastrac
This is interesting, though it might be a bit of a cheat. He's put some data
in the BIOS image to assist in decoding instructions (not to take away from
how amazing this is!):

"CPU supports the full 8086/186 instruction set. Due to the complexities of
the 8086’s arbitrary-length instruction decoding and flags, 8086 instructions
are first converted to a simpler intermediate format before being executed.
This conversion, along with instruction lengths and how each instruction
modifies the flags, is assisted by some lookup tables which form part of the
BIOS binary."

With that in mind, the BIOS is _hand-written_ and about 12kB (gzips to <4kB),
so that's also a fantastic achievement. All-in-all, this is a wonderful entry.

~~~
mikeash
Abuse of the rules, if done well, is a time-honored IOCCC tradition.

~~~
TazeTSchnitzel
It's not the only abuse of the rules. Looking at the hint, I think there's
some flaw in the size-testing tool which they've exploited.

~~~
mikeash
The hint mentions "the importance of comments". The only comment I can spot in
the actual code is here (with a couple of characters of context):

    
    
        I/**/n
    

The code up to that point (2100 bytes total) is roughly as long as the listed
output of the size tool (1979). The difference is probably due to the fact
that I just counted raw bytes rather than accounting for whitespace and such.
My guess, then, is that /* */ without a space hits a bug in the size tool
where it probably counts that as the beginning of a comment but misses the
end, causing it to count the entire remainder of the program as a comment. The
compiler, meanwhile, will see that as an empty comment and continue reading
code.

This was a fairly predictable consequence of providing an official size tool.
Used to be, they just laid out the rules for figuring out sizes, and you had
to either count manually, or build your own tool to count. Now, bugs in the
size tool end up being bugs in the rules, and fun like this ensues. Whether
this is good or bad is a matter of opinion. I would say it certainly makes
things interesting!

Edit: HN formatting makes the inline form of that comment look like a
C++/C99-style // comment. Fun!

~~~
jrockway
Clearly the official tool runs the C source code through a markdown
interpreter first!

------
ori_b
I went through and reformatted the code, after expanding all of the macros
that contained unbalanced parens:
[http://eigenstate.org/cable3-formatted.c](http://eigenstate.org/cable3-formatted.c)

I still haven't puzzled out exactly what all the variables are, but I think
that this makes the structure much clearer.

------
petercooper
The source itself:
[http://ioccc.org/2013/cable3/cable3.c](http://ioccc.org/2013/cable3/cable3.c)

Or after running through preprocessing (not a huge help I found, sadly):
[https://gist.github.com/peterc/8259713](https://gist.github.com/peterc/8259713)

~~~
mschuster91
Did you do this by hand? Line 38 looks weird: main(BX,nE)char __nE; {

~~~
sdevlin
That's an old-school C function declaration. Note that the function return
type are parameter types are implicitly int. nE is then explicitly declared to
be a char pointer pointer.

~~~
mschuster91
But the char __is outside of the declaration and before the opening brace - is
this legal? And especially, why isn 't it written like main(BX,char __nE)?
Doesn 't save any bytes,it rather adds one semicolon...

~~~
petercooper
Yes, it's very old fashioned (K&R era).

I think the reason for doing it this way rather than putting the type
declaration alongside the parameter is then it would also demand _also_ adding
an _int_ declaration for the BX parameter which would then cause the code to
be longer. That is to say with:

    
    
        main(BX,char**nE){
    

.. it would throw an _expected identifier_ error on the char. So you'd be
putting the original:

    
    
        main(BX,nE)n**nE;{
    

Against this error-free alternative:

    
    
        main(int BX,n**nE){
    

.. but that's one byte longer.

If I got any of this wrong, I hope someone will correct me though :)

~~~
mschuster91
Ah, thanks for the explanation. Looks like the author did a deep, deep dive in
the C language specs!

~~~
hafflelog
K&R predates the specs, actually. C89 codified the current C parameter
convention. Compilers typically continue to support K&R style for very old C
code and because it doesn't break modern C support.

------
userbinator
> Instruction codings are complex and irregular in size and structure, with
> multiple addressing modes and no consistent memory placement for operands,
> very often multiple possible encodings for the same instruction

It's only complex and irregular if you look at it wrong. ;)

The x86 (like the 808{5,6} and the Z80) instructions look organised in an
octal format: ftp://mipt.cc/Opcode.txt

I haven't inspected the code in any detail to see if it takes advantage of
this.

------
bitwize
I thought "Wow, that's a neat trick."

Then I saw AutoCAD, Flight Simulator, and Windows.

Mind = blown.

------
kyberias
Here's a link to a fully functional Windows binary (built with Visual Studio
2013):

[http://sdrv.ms/1hodqjP](http://sdrv.ms/1hodqjP)

Just add the bios and image files elsewhere.

~~~
joebo
Missing MSVCR120D.dll. Can you compile in release mode or include that DLL
too? Installing the Visual C++ Redistributable Packages for Visual Studio 2013
didn't help.

~~~
kyberias
Updated with Release-build that should use the DLL in VS2013 redist package.

------
lifeisstillgood
/snip This entry weighs in at a magical 4043 bytes (8086 nibbles, 28,301
bits). It manages to implement most of the hardware in a 1980’s era IBM-PC
using a few hundred fewer bits than the total number of transistors used to
implement the original 8086 CPU. /snip

Just a small reminder how far we have come - and what Moore's law (RIP) has
done for us. Yes the original PC had few enough transistors one could actually
count them. NowadAys ...

An awesome piece of work

------
joebo
Neat, the linked hd.img file contains windows, Simcity, AutoCAD, etc. Has
anyone compiled this for Windows? If so, would you kindly post the binary?
It'd be fun to play with this old software again.

------
coin
The BIOS is part of the PC, yet it's not included in the 4043 bytes.

~~~
ctdonath
The BIOS is replaceable software. Standard on the IBM PC, but not the "bare
metal" itself. One can only emulate so much within 4096 bytes.

~~~
swetland
"Due to the complexities of the 8086’s arbitrary-length instruction decoding
and flags, 8086 instructions are first converted to a simpler intermediate
format before being executed. This conversion, along with instruction lengths
and how each instruction modifies the flags, is assisted by some lookup tables
which form part of the BIOS binary."

I agree regarding the BIOS, but this sounds like the BIOS image contains
"microcode" necessary for the emulator to run, which would seem to be cheating
just a little.

------
TazeTSchnitzel
If you're on Linux and don't want to bother installing a VGA font, there's no
need. Do Ctrl+Alt+F6 (or lower) to switch to a terminal and run it there, then
do Ctrl+Alt+F7 to switch back later.

Also, if someone can tell me how to make Windows not make the emulator
segfault on my machine, that'd be great.

EDIT: Oh, it works fine in a graphical terminal. I'm guessing SDL failing to
init on a text terminal broke everything.

EDIT 2: ALL THIS, PLUS REVERSI! You can even play Reversi!

------
kyberias
MS Visual C++ 2013 chokes on this with "'KB' undeclared identifier" error.

~~~
kyberias
Oh it's defined in the makefile... never mind.

------
mschuster91
Has someone already unobfuscated the source code?

~~~
faragon
I did. Do you know the software license of the original code?

edit: P.S. working deobfuscation, i.e. you can build the executable. Including
some minor corrections to the orgininal (e.g. 1MB of RAM allocation instead
2MB, BIOS file size limited to 65536-256 bytes, etc.). It a nice code,
minimalistic, e.g. the "rep" prefixed instructions are actually repeated, not
unrolled. Really optimized for code size. Even deobfuscated it takes 12-13KB
and 22KB of executable (x86-64 Linux, stripped binary). Very cool.

------
lelf
All IOCCC winners —
[http://ioccc.org/years.html#2013](http://ioccc.org/years.html#2013)

------
faragon
Is licensed/copyrighted? Do the IOCCC have any implicit license?

------
kriro
Fantastic. Finally some "hacker news" on Hacker News ;P

