
Segfaulting Python with afl-fuzz - orf
http://tomforb.es/segfaulting-python-with-afl-fuzz
======
__s
I've been segfaulting CPython quite a bit with stack underflows while
developing a befunge-to-python-bytecode JIT that uses the Python stack as the
Befunge stack. It has to include instrumentation to track the stack depth so
that it substitutes 0 when the user pops a value on an empty stack. The latest
issue was this weekend: I was reducing the bytecode size of `p` by converting
a while loop that moves the stack into an array for recompilation to use
FOR_ITER, and it didn't like being called on non-iterables.

[https://github.com/serprex/Befunge/blob/master/funge.py](https://github.com/serprex/Befunge/blob/master/funge.py)
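
The empty-stack rule described above can be sketched in plain Python. This is
only an illustration of the semantics, not the code from funge.py (which emits
bytecode rather than using a wrapper object); the class name `FungeStack` is
hypothetical.

```python
class FungeStack:
    """Befunge-style stack: popping an empty stack yields 0 instead of failing."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        # Befunge semantics: an empty stack behaves as if it held endless zeros.
        return self._items.pop() if self._items else 0


stack = FungeStack()
stack.push(7)
first = stack.pop()   # the value pushed above
second = stack.pop()  # the stack is empty now, so this is 0
```

A JIT that maps the Befunge stack onto the interpreter's own value stack has no
such wrapper to fall back on, which is why it has to track the depth itself.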

It'd be neat to see how PyPy handles fuzzing. It uses CPython's bytecode, so I
was able to get it to run beer6.bf (it was pretty slow, since that's a
benchmark that mostly tests recompile speed), but it locked up when testing
mandel.bf (odd, since mandel.bf doesn't trigger recompilation).

------
mateo411
Here is a quick edit that you should make.

Where you write:

> In laments terms

You probably want to write:

In layman's terms

"In layman's terms" is an idiomatic way of saying, simply put, or explaining
something to somebody who might not be technically inclined.

To lament is to feel upset about something; it often refers to the grief one
feels when a loved one has died.

Overall, this was an interesting read, and I'm looking forward to your next
installment.

------
lmm
So... what are the crashes? What was the goal of all this? I feel like the
article ended just as it was about to get interesting.

~~~
orf
Oh my, I finished this quite late last night and forgot to add a conclusion. I
thought the post was getting a bit lengthy, and the next step is to use gdb to
dive into the crashes + make a patch, which I feel is too much for one post.

So tune in next week :)

~~~
masklinn
Would the core team really be interested in that? The bytecode interpreter
relies on implicit invariants from the codegen; re-checking these invariants
on the bytecode means slowing down the interpreter for very little value.
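
To make the trade-off concrete, here is a rough sketch of the kind of
stack-depth check a verifier would have to run. It uses a made-up,
straight-line stack machine rather than CPython's real opcodes, ignores jumps
entirely, and the names `STACK_EFFECTS` and `verify_stack_depth` are
hypothetical.

```python
# Hypothetical toy instruction set: name -> (values popped, values pushed).
STACK_EFFECTS = {
    "PUSH": (0, 1),
    "POP": (1, 0),
    "ADD": (2, 1),
    "DUP": (1, 2),
}


def verify_stack_depth(program):
    """Return the final stack depth, or raise ValueError on an underflow."""
    depth = 0
    for index, op in enumerate(program):
        pops, pushes = STACK_EFFECTS[op]
        if depth < pops:
            raise ValueError(f"stack underflow at instruction {index}: {op}")
        depth += pushes - pops
    return depth


# A well-formed program: two pushes, then ADD consumes both and pushes one.
final_depth = verify_stack_depth(["PUSH", "PUSH", "ADD"])

# A malformed program: ADD needs two operands but only one is available.
try:
    verify_stack_depth(["PUSH", "ADD"])
    underflow_detected = False
except ValueError:
    underflow_detected = True
```

Even this toy version is an extra pass over every instruction, and a real
verifier would also have to follow branches.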

~~~
electrum
That's interesting, because bytecode verification is extremely well-defined
for the JVM:
[https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.ht...](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.10)

~~~
lmm
The JVM is explicitly designed to run _untrusted_ bytecode.

