
CorePy: Assembly Programming in Python - mace
http://www.corepy.org/
======
tptacek
We've got something similar in Ruby (we're not the first, and ours is nowhere
near as mature as CorePy). We use it for building ad-hoc debuggers for
reversing targets that don't support native debugging and tracing interfaces.

Ruby is an excellent language for this, because it gets out of the way. Here's
a Rasm snippet:

    
    
                @epilog ||= @prog.add {
                    push ebx 
                    mov ebx, retv
                    mov [ebx], eax
                    pop ebx
                    xor eax, eax
                    ret
                }
    

Obviously, that's all pure Ruby.

Once you have a class mapping for each of the instructions in your target ISA
--- _way_ easier than it sounds --- getting your code to jump into a
runtime-generated buffer is pretty easy.
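
Here's a minimal sketch of that last step in pure Python rather than Ruby
(standard library only; assumes Linux on x86-64, and note that hardened
kernels may refuse a writable-and-executable mapping):

    
    
        import ctypes, mmap
        
        # x86-64 machine code for: mov eax, 42 ; ret
        code = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])
        
        # Map an anonymous page we can both write to and execute from.
        buf = mmap.mmap(-1, mmap.PAGESIZE,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
        buf.write(code)
        
        # Treat the buffer's address as a C function pointer and call it.
        addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
        func = ctypes.CFUNCTYPE(ctypes.c_int)(addr)
        print(func())  # prints 42
    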

We're starting to throw code onto GitHub; I've been thinking about publishing
this, but didn't think it would be very interesting, except as a hack.

~~~
tc
SBCL has an extremely sophisticated facility called a VOP (Virtual Operation)
that allows you to write assembly with s-expressions and integrate the code
into the compiler and runtime. The SBCL backend itself is composed of these
VOPs.

Significantly, this means that you can write assembly macros with the full
power and elegance of lisp. A trivial example is the move macro, which is used
extensively in VOPs:

    
    
      (defmacro move (dst src)
        "Move SRC into DST unless they are location=."
        (once-only ((n-dst dst)
                    (n-src src))
          `(unless (location= ,n-dst ,n-src)
             (inst mov ,n-dst ,n-src))))
    

Check out the SBCL source if you want to see some wild uses of VOPs and macros
that expand into assembly code. In compiler/x86/arith.lisp there is a
lisp/assembly implementation of the Mersenne Twister.

VOPs can optionally handle argument type checking and can be tuned to emit
specialized code in various circumstances, such as when one of the arguments
to the call is a compile-time constant.

~~~
tptacek
What I want to see is Python or Ruby taking advantage of a runtime code
generation library to start rewriting the VM or JIT'ing method calls.

~~~
aaronblohowiak
Psyco.

~~~
tptacek
Yep. I forgot about Psyco. Does it still work?

~~~
aaronblohowiak
I am not sure which version of Python will break Psyco. It looks like it
stopped being maintained in '06.

------
est
> "CorePy makes assembly fun again!" (Alex Breuer)

------
illume
This is great because it allows you to create algorithms at runtime for your
data.

So rather than doing...

    
    
        def doit(option1, data):
            for x in data:
                if option1:
                    x += 7
    

You can create a function that doesn't include the "if option1" part. This is
great if you have many different options... instead of bloating your code with
many different optimized functions, or settling for one slower generic
function, you can create the optimized functions at run time.
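
A pure-Python sketch of that idea, using runtime source generation and exec
rather than CorePy's machine-code generation (the names here are just for
illustration):

    
    
        def make_doit(option1):
            # Resolve the branch now, so the generated loop body
            # carries no "if option1" test at all.
            body = "        x += 7\n" if option1 else ""
            src = ("def doit(data):\n"
                   "    out = []\n"
                   "    for x in data:\n"
                   + body +
                   "        out.append(x)\n"
                   "    return out\n")
            namespace = {}
            exec(src, namespace)
            return namespace["doit"]
        
        doit_add7 = make_doit(True)    # loop body: x += 7, then append
        doit_plain = make_doit(False)  # loop body: append only
    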

Looks very nice. I just wish it had support for 3DNow! and Windows :)

------
flashgordon
I would have liked to see LLVM bindings for Python... I am sure they are out
there...

~~~
flashgordon
Sure enough!

<http://code.google.com/p/llvm-py/>

------
riobard
Has anybody benchmarked the performance of CorePy?

~~~
snorkel
No, but I'd assume any intense math operations that happen inside a loop would
be much faster.

~~~
wheels
Then your understanding of machine language and compilers would be flawed.

In most cases the compiler will do a better job at optimizing machine code
than humans will.

~~~
lbrandy
I am interested to hear how often you've actually tried to beat a C compiler
with hand-optimized assembly code. In my experience, this statement is only
made by people who have never tried.

For example, numerical calculations that are highly SIMD-friendly can be
improved substantially by using SSE instructions. Autovectorization is OK, but
not great. Furthermore, a programmer who is familiar with SSE instructions can
alter/swizzle data to make it easier to work with SIMD instructions. Further
still, a programmer can take advantage of things like non-temporal stores,
which compilers will not do on their own.

Now, granted, you can massage gcc into giving you "good" code with a lot of
hints, but to do it "right" you are still peering at the assembly and making
sure gcc isn't doing anything "stupid". Highly optimized C code is so dense
with compiler directives as to be unrecognizable to someone unfamiliar with
the underlying architecture.

The belief that your naive for-loop computation is somehow transformed
automagically into perfection by the compiler is a pure and unadulterated
myth.

~~~
wheels
Using SSE is specifically one of the cases where writing something in assembly
can make sense (and in fact, the only case where I've written things in asm
for purely performance reasons).

Let's go back to the original statement:

 _I'd assume any intense math operations that happen inside a loop would be
much faster._

That's what I was responding to. Just translating the logic to asm won't make
it fast. If you look at my earlier follow-up, I mentioned that if you can make
assumptions about your code that the compiler can't know, then you're back in
the land where asm optimizations can make sense. That seems to be what you're
getting at with the rest of your points.

------
juliend2
Wow. So cool.

Thanks for this news.

