
Show HN: A small virtual machine written in C - tekknolagi
https://github.com/tekknolagi/carp
======
axman6
Seems like [1] could benefit from use of the X macro [2], which should make
adding new instructions much easier and save you the hassle of keeping
two separate tables in sync. There are probably quite a few places where the
code could be made clearer by using it. Also, in the implementation of your
functions, there's a hell of a lot of repetition in all the binary operations;
another macro to which you pass the operation and the function name would make
life easier:

    
    
        #define binop(NAME, OP) definstr (NAME) { \
            long long b, a; \
            if (carp_stack_pop(&m->stack, &b) == -1)\
                carp_vm_err(m, CARP_STACK_EMPTY);\
            if (carp_stack_pop(&m->stack, &a) == -1)\
                carp_vm_err(m, CARP_STACK_EMPTY);\
            carp_stack_push(&m->stack, a OP b);}
        
        binop(ADD,+)
        binop(MUL,*)
        ...
    

The repetition of

    
    
        if (carp_stack_pop(&m->stack, &a) == -1)
            carp_vm_err(m, CARP_STACK_EMPTY);
    

seems like a good place to just use a function to encapsulate all the error
checking and handling.
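
A minimal sketch of what such a wrapper function could look like. The `demo_*` types and functions are hypothetical stand-ins so the example compiles on its own; they are not the project's API, and the sketch records the error in a flag where the real code would presumably call its own error routine.

```c
/* Hypothetical stand-ins for the project's stack and machine types. */
enum { DEMO_STACK_MAX = 16 };

typedef struct {
    long long data[DEMO_STACK_MAX];
    int height;                     /* number of values currently stored */
} demo_stack;

typedef struct {
    demo_stack stack;
    int error;                      /* assumption: set a flag instead of aborting */
} demo_vm;

/* Returns -1 on an empty stack, mirroring the convention in the snippet above. */
int demo_stack_pop(demo_stack *s, long long *out) {
    if (s->height == 0) return -1;
    *out = s->data[--s->height];
    return 0;
}

/* Returns -1 when the stack is full. */
int demo_stack_push(demo_stack *s, long long v) {
    if (s->height == DEMO_STACK_MAX) return -1;
    s->data[s->height++] = v;
    return 0;
}

/* The wrapper: the single place that pops and handles the empty-stack case. */
long long demo_pop_checked(demo_vm *m) {
    long long v = 0;
    if (demo_stack_pop(&m->stack, &v) == -1)
        m->error = 1;               /* real code would report the error here */
    return v;
}

/* A binary instruction now reads as two pops and a push. */
void demo_add(demo_vm *m) {
    long long b = demo_pop_checked(m);
    long long a = demo_pop_checked(m);
    if (!m->error)
        demo_stack_push(&m->stack, a + b);
}
```

With the check in one place, each instruction body shrinks to the part that actually differs between instructions.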

[1]
[https://github.com/tekknolagi/carp/blob/master/src/carp_inst...](https://github.com/tekknolagi/carp/blob/master/src/carp_instructions.h)

[2] [http://www.embedded.com/design/programming-languages-and-
too...](http://www.embedded.com/design/programming-languages-and-
tools/4403953/C-language-coding-errors-with-X-macros-Part-1) and the following
parts
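
For anyone who hasn't seen the X macro before, here is a rough sketch of the idea. The instruction list and every `demo_`/`DEMO_` name are made up for illustration and are not taken from the project; the point is only that the enum, the name table, the functions and the dispatch table are all generated from one list, so they cannot drift apart.

```c
#include <string.h>

/* Hypothetical instruction list: one line per instruction, (name, operator). */
#define DEMO_BINOPS(X) \
    X(ADD, +)          \
    X(SUB, -)          \
    X(MUL, *)

/* Expansion 1: the enum of opcodes. */
#define X(NAME, OP) DEMO_##NAME,
typedef enum { DEMO_BINOPS(X) DEMO_NUM_OPS } demo_op;
#undef X

/* Expansion 2: printable names, in the same order by construction. */
#define X(NAME, OP) #NAME,
const char *const demo_op_names[] = { DEMO_BINOPS(X) };
#undef X

/* Expansion 3: one function per instruction. */
#define X(NAME, OP) \
    long long demo_##NAME(long long a, long long b) { return a OP b; }
DEMO_BINOPS(X)
#undef X

/* Expansion 4: the dispatch table, indexed by opcode. */
#define X(NAME, OP) demo_##NAME,
long long (*const demo_op_funcs[])(long long, long long) = { DEMO_BINOPS(X) };
#undef X

/* Look up an opcode by name; returns DEMO_NUM_OPS if the name is unknown. */
demo_op demo_lookup(const char *name) {
    for (int i = 0; i < DEMO_NUM_OPS; i++)
        if (strcmp(name, demo_op_names[i]) == 0)
            return (demo_op)i;
    return DEMO_NUM_OPS;
}
```

Adding an instruction then means adding one `X(...)` line to the list and nothing else.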

~~~
tekknolagi
Definitely interesting — I shall take a look! Feel free to make a pull request
if you're game.

~~~
axman6
If I had time and a job where I got to code I would, but it'll be a few days
before I have time.

~~~
tekknolagi
Quick question. Is that code you've given ok to use in the project? I just
realized that I added and pushed without asking.

GPL?

~~~
zura
Btw, I remember there was a law stating that code shorter than 10(?) lines is
not copyrightable, no?

~~~
voidlogic
That would be a silly rule; for better or worse, 10 lines of Haskell might
perform the computation of 500 lines of Java (even if they took you the same
amount of time to write.. heh).

~~~
zura
Well, maybe they made this rule in days when COBOL was in hype? ;)

------
panic
The implementation is really clean! The mixing of general-purpose registers,
special-purpose registers and the stack makes the instruction set a bit weird,
though.

For example, instead of using registers, why not have OR pick values off the
stack like ADD, or vice versa? Why use EAX for the conditional jump
instructions when you could look at the top of the stack? Why have REM when
you can just MOV from ERX?
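
To make the suggestion concrete, a stack-only OR and conditional jump might look roughly like this. The `sk_*` names and the machine layout are invented for the sketch and do not reflect the project's actual structures; error handling is left out to keep it short.

```c
/* Hypothetical machine state, not the project's actual layout. */
enum { SK_STACK_MAX = 16 };

typedef struct {
    long long stack[SK_STACK_MAX];
    int sp;                         /* number of values currently on the stack */
    int pc;                         /* index of the next instruction */
} sk_vm;

/* Push silently ignores overflow; a real VM would report it. */
void sk_push(sk_vm *m, long long v) {
    if (m->sp < SK_STACK_MAX)
        m->stack[m->sp++] = v;
}

/* Pop yields 0 on an empty stack; a real VM would report an error instead. */
long long sk_pop(sk_vm *m) {
    return m->sp > 0 ? m->stack[--m->sp] : 0;
}

/* OR takes both operands from the stack, the same way ADD would. */
void sk_or(sk_vm *m) {
    long long b = sk_pop(m);
    long long a = sk_pop(m);
    sk_push(m, a | b);
}

/* Jump-if-zero tests the popped top of stack instead of a dedicated register. */
void sk_jz(sk_vm *m, int target) {
    if (sk_pop(m) == 0)
        m->pc = target;
}
```

The upside is that every instruction follows the same pop-operands, push-result convention, so no register has a special role.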

~~~
tekknolagi
You know, I was thinking about that this morning actually. All great
suggestions! If you could make an issue, that would be fantastic.

------
papaf
The code looks really clean to me.

If you feel the urge to convert the VM to a JIT, rather than calling function
pointers for each instruction, you might find this blog post useful:
[http://blog.reverberate.org/2012/12/hello-jit-world-joy-
of-s...](http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-simple-
jits.html)

~~~
tekknolagi
Thanks! The goal was readability and ease of use, so that's probably why it's
slow :P

Not sure I understand. What would it do differently?

Edit: Hang on, so I should generate C or asm? or something?

~~~
papaf
I think that the method you use is "Token Threaded Code" [1]. It's one good
way of implementing a VM.

I only added the blog link in case you were _interested_ in extending the VM
to use Just In Time compiling. Using a JIT would mean generating assembly.

I like your VM - there is no need to change it. Thanks for sharing.

[1] [http://realityforge.org/code/virtual-
machines/2011/05/19/int...](http://realityforge.org/code/virtual-
machines/2011/05/19/interpreters.html)
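
For readers following along, the dispatch style I mean (each opcode indexes a table of function pointers) can be sketched as below. This is a toy with made-up `tt_*` names, not the project's code, and it does no bounds or validity checking on the program.

```c
typedef struct tt_vm tt_vm;
typedef void (*tt_handler)(tt_vm *);

/* Hypothetical opcodes for the toy machine. */
enum { TT_PUSH, TT_ADD, TT_HALT, TT_NUM_OPS };

struct tt_vm {
    const long long *code;          /* program: opcodes with inline operands */
    int pc;                         /* index of the next word to fetch */
    long long stack[16];
    int sp;                         /* number of values on the stack */
    int running;
};

/* PUSH reads its operand from the word following the opcode. */
void tt_push(tt_vm *m) { m->stack[m->sp++] = m->code[m->pc++]; }

/* ADD pops one value and adds it into the new top of stack. */
void tt_add(tt_vm *m) {
    long long b = m->stack[--m->sp];
    m->stack[m->sp - 1] += b;
}

void tt_halt(tt_vm *m) { m->running = 0; }

/* The table: the opcode (the "token") is the index of its handler. */
const tt_handler tt_table[TT_NUM_OPS] = {
    [TT_PUSH] = tt_push,
    [TT_ADD]  = tt_add,
    [TT_HALT] = tt_halt,
};

/* Fetch an opcode, call its handler, repeat until HALT. */
long long tt_run(const long long *code) {
    tt_vm m = { .code = code, .running = 1 };
    while (m.running) {
        long long op = m.code[m.pc++];
        tt_table[op](&m);
    }
    return m.sp > 0 ? m.stack[m.sp - 1] : 0;
}
```

A JIT would replace the fetch-and-call loop with generated machine code for the whole program, which is where the indirect-call overhead goes away.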

~~~
tekknolagi
Thanks for the info! :)

------
tekknolagi
I have been working on this project on and off for about 6 months. I would
love feedback on the code, design - whatever!

~~~
dsturnbull2049
Very concise implementation. I keep thinking there are missing files or at
least I'm missing something in my understanding. :)

I wrote a 0-operand VM in C a few years ago that used a lot of the same
concepts but considerably more code, like 10x at least. I will learn a lot
from this.

~~~
tekknolagi
What kinds of files look missing? I'm curious.

What is a 0 operand VM? I'd love to see it!

~~~
chewxy
Using only PUSH and POP, I think, i.e. a stack machine.

also, OP, your code looks great.

~~~
dsturnbull2049
Yeah, a stack machine.
[https://github.com/dsturnbull/stack_cpu](https://github.com/dsturnbull/stack_cpu)

What I mean by missing files is that it's so clean that at first glance it
didn't look like the whole deal :)

~~~
tekknolagi
Oh well thanks — that's quite a compliment.

This is pretty cool!

------
lindig
As inspiration, I would suggest looking at the virtual machine (bytecode
interpreter) of the Lua language. You can find several papers describing the
design at [http://www.lua.org/docs.html](http://www.lua.org/docs.html). The
code base is also very small and clean.

~~~
tekknolagi
Awesome!

------
akkartik
Can you give some directions for how to run it?

    
    
      $ git clone https://github.com/tekknolagi/carp
      $ make
      $ ./carp
      $ ./carp --help  # after peeking in carp.c
      help msg
    

There's a tantalizing target called 'tests' in the Makefile, but it doesn't
work.

There are some examples, but no makefile to build them. How do you run them?

I see you used to have some .carp files but you just deleted them:
[https://github.com/tekknolagi/carp/commit/fa16eeb443](https://github.com/tekknolagi/carp/commit/fa16eeb443)

I tried restoring one, but it doesn't work:

    
    
      $ git checkout fa16eeb443~1 examples/reg.carp
      $ ./carp -f examples/reg.carp
      Unknown label <add>
    

So at this point I'm ready to give up..

~~~
tekknolagi
Hi there! Sorry for this. I actually have a help message but for some reason I
forgot to commit it. It's coming soon!

You run the examples like so: carp -f examples/carp/call.carp

You can compile C files that use the Carp API, but I have not written any
documentation yet.

~~~
tekknolagi
Oh that's weird. the carp folder from examples disappeared. Gimme a sec.

------
zencoder
>The goal is to try and build a small (and decently reliable) VM from the
ground up, learning more and more C as I go.

The author explicitly mentions that he is in the process of learning C while
he's building something using it.

This is interesting because the folks at
[osdev.org](http://wiki.osdev.org) keep stressing
that you must be a god-level expert in C, before you even think of getting
into systems programming.

~~~
TheSoftwareGuy
I think the guys at osdev mean you need to be good at C before doing systems
programming _at any form of a professional level_. If you don't intend for
your code to _actually_ do anything (I mean this is just a side project or
something, not for somebody to use in production) then you can really do
whatever you want.

~~~
tekknolagi
Definitely a side project.

------
abhorrence
This is neat. For others interested in this sort of thing,
[https://challenge.synacor.com/](https://challenge.synacor.com/) specifies a
virtual machine, and comes with a binary that performs a self-test -- it is
perhaps a neat way to get started.

~~~
tekknolagi
Cool!

------
cobookman
Very cool!

Georgia Tech offers a class where you code a VM for the LC3b instruction set
([http://users.ece.gatech.edu/~moin/s13a/hw.html](http://users.ece.gatech.edu/~moin/s13a/hw.html)).

It was a great learning experience, and pretty fun too.

~~~
tekknolagi
Ah, that's neat! I am not headed off to Georgia Tech though :)

------
manish_gill
Can anyone link me to some literature on VMs? Sure, I can analyse the code,
but from a design perspective, I would love to learn more about how VMs work,
where they can be used, etc etc.

~~~
tekknolagi
So I started off following this
([http://en.wikibooks.org/wiki/Creating_a_Virtual_Machine/Regi...](http://en.wikibooks.org/wiki/Creating_a_Virtual_Machine/Register_VM_in_C))
which was interesting but clearly never super influential in my design (except
in earlier stages, but that died).

Then I read this: [http://stackoverflow.com/questions/2034422/tutorial-
resource...](http://stackoverflow.com/questions/2034422/tutorial-resource-for-
implementing-vm)

and this:
[http://courses.cms.caltech.edu/cs11/material/c/mike/lab8/lab...](http://courses.cms.caltech.edu/cs11/material/c/mike/lab8/lab8.html)

and took a look at the MSP430's instruction set.

As far as where they can be used... anywhere? I don't know the answer to your
question - sorry :)

~~~
VLM
"and took a look at the MSP430's instruction set."

I wonder in a general sense if there are many VMs that explicitly copy good
older CPUs. I've always thought the 6809 would make an awesome VM. After all,
it was pretty awesome to code on in the real world. Or a VM based on the
classic 1802, 6502, or Z80 with some minor mods.

If you really want to warp people's minds, give them a VM based on IBM HAL
assembler, basically turn Hercules and VM into a hypervisor rather than an
application.

A PDP-11 inspired/based virtual machine. Hmm.

~~~
astrobe_
CISC would be better than RISC, because one should keep in mind that for many
operations most of the execution time is VM overhead.

But if you want performance you'd rather tailor your instruction set to the
language it will execute (assuming you're not making a general purpose VM).
CPU makers like Intel do actually keep a close eye on what programs are doing
in order to evolve their instruction set.

------
ravyoli
Out of curiosity - How are you testing your VM? meaning, how do you know it
actually works correctly? I didn't see any tests in the github repository

~~~
tekknolagi
I am now starting to write some tests with libtap. If you are game, come help
out! :D

------
jheriko
good stuff.

the power of function pointers for stuff like this is quite enormous. making a
vm like this is great practice for a real compiler too... :)

i'm curious though, why it isn't a direct copy of x86? other than the obvious
'thats really complicated' it would save a lot of wheel reinventing...

~~~
tekknolagi
I started off following MSP430 but just kind of wanted to make my own thing.

------
n1ghtmare_
"NOP (): Does nothing. Seriously." - I love this :)

~~~
tekknolagi
Me too :D

~~~
n1ghtmare_
Good job man, code looks tight, wish I had some more time to jump on this.
Keep it up.

EDIT: Btw, do you happen to have a book or some study material on this
subject? Anything you can recommend?

~~~
tekknolagi
Whenever you get a chance just send a stray pull request :)

I don't - I wasn't really following anything unified. I just pulled together a
bunch of crappy web resources :)

