
Make CPython segfault in 5 lines of code - coolreader18
https://gist.github.com/coolreader18/6dbe0be2ae2192e90e1a809f1624c694
======
loeg
(CPython 3)

------
westurner
FWIW, this segfaults CPython in 2 lines:

    
    
      import ctypes
      ctypes.cast(1, ctypes.py_object)
    

Interestingly, this works:

    
    
      import ctypes, gc
      x = 22
      _id = id(x)
      del x
      gc.collect()
      y = ctypes.cast(_id, ctypes.py_object).value
      assert y == 22
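
A likely reason the second snippet works: CPython caches small integers (-5 through 256), so the object for `22` is never freed by `del x` or `gc.collect()`, and its address stays valid. Below is a minimal sketch of that, assuming CPython, where `id()` happens to be the object's address. `object_at` is a hypothetical helper name, not part of `ctypes`.

```python
import ctypes


def object_at(address):
    # Hypothetical helper: reinterpret an integer address as a Python object.
    # Only valid on CPython, and only while the object at that address is alive.
    return ctypes.cast(address, ctypes.py_object).value


x = 22
addr = id(x)          # CPython implementation detail: id() is the address
del x                 # drops one reference; the cached int 22 stays alive
same = object_at(addr)

keep = ["any", "live", "object"]
also_same = object_at(id(keep))   # safe because `keep` still holds a reference
```

Passing an address that does not point at a live object (such as `1` in the first snippet) is undefined behaviour at the C level, which is why that version crashes.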

------
angrygoat
Here's the Python bug tracking the issue:
[https://bugs.python.org/issue39091](https://bugs.python.org/issue39091)

------
PixyMisa
Works fine in PyPy, which is not surprising since the execution model is
entirely different.

------
Loranubi
Even some "safe" Python libraries can be segfaulted using similar techniques.
See for example
[https://github.com/pydata/numexpr/issues/323](https://github.com/pydata/numexpr/issues/323)
I think this particular bug is even marked "won't fix". But I cannot find the
bpo at the moment.

------
paulddraper
If the number of lines is important, this problem can actually be demonstrated
in 1 line:

    
    
        for x, x.__new__ in [(__import__('queue').Full, print)]: __import__('glob').iglob(0).throw(x)

~~~
filmor
This doesn't trigger a segfault for me in Python 3.7, just an exception that
`print_exception` expects an `Exception`.

Someone commented below the gist with this one-liner:

    
    
        (i for i in []).throw(type('E', (BaseException,), dict(__new__=lambda cls, *args: cls))())
    

I managed to golf it a bit down to this ;):

    
    
        n="__new__";(i for i in []).throw(type(n,(IOError,),{n:lambda c,*a:c})())

~~~
ehsankia
Just one character but replace `[]` with `n` too :)

~~~
chrismorgan
Or save that character by removing the space before `[]` instead (which you
can’t do if you write `n`).

------
dependenttypes
This would not have happened in a language with a proper type system as the
type checker would have rejected the program at compile time.

~~~
shakna
> This would not have happened in a language with a proper type system as the
> type checker would have rejected the program at compile time.

How about in Rust, then? [0]

Bugs happen in every language. When memory corruption occurs, you can
segfault.

[0] [https://users.rust-lang.org/t/rust-guarantees-no-segfaults-with-only-safe-code-but-it-segfaults-stack-overflow/4305](https://users.rust-lang.org/t/rust-guarantees-no-segfaults-with-only-safe-code-but-it-segfaults-stack-overflow/4305)

~~~
dependenttypes
I said "a language with a proper type system". Rust is not one such language.

~~~
shakna
You'll need to define "proper type system" then.

At a guess, you want something with dependent types?

Like Idris, or Haskell. You already have a Haskell example. This [0] release
of Idris fixed a segfault when concatenating strings.

Maybe you meant a language that is proven from the ground up. Like CakeML. You
can find a segfault example here [1].

Maybe you meant a language with an algebraic type system like Ada. You can
find a segfault example here [2].

Maybe you meant something like Dotty (Research for the next version of Scala).
You can find a segfault example here [3].

In short: You'll need to describe what you believe to be a "proper" type
system, and name the languages you think fit that description, or no one can
have a conversation with you.

[0] [https://www.idris-lang.org/idris-1-1-1-released/](https://www.idris-lang.org/idris-1-1-1-released/)

[1] [https://github.com/CakeML/cakeml/issues/438](https://github.com/CakeML/cakeml/issues/438)

[2] [https://stackoverflow.com/questions/56227629/segmentation-fault-during-runtime-elaboration-ada](https://stackoverflow.com/questions/56227629/segmentation-fault-during-runtime-elaboration-ada)

[3] [https://github.com/lampepfl/dotty/pull/7466/](https://github.com/lampepfl/dotty/pull/7466/)

~~~
dependenttypes
You can trivially prove bottom in Haskell. Something like Agda or Idris would
indeed fit the bill better.

Anyway, I said that this specific issue would not occur in a language with
dependent types -- where incorrect code would cause the implementation to
crash. Not that it is impossible to have a buggy compiler that in certain
cases produces segfaults.

~~~
shakna
> Not that it is impossible to have a buggy compiler that in certain cases
> produces segfaults.

That's exactly what happened here, however. The instance check was missing
from the interpreter.

Dependent types wouldn't have solved the underlying problem.
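
To make the "missing instance check" concrete, here is a rough Python-level sketch of the kind of validation involved. It is not the actual CPython patch (which lives in C), and `instantiate_checked` is a hypothetical name. It assumes only what the one-liners above show: an exception class whose `__new__` returns the class itself rather than an instance.

```python
def instantiate_checked(exc_type, *args):
    # Hypothetical helper: build the exception, then verify that what came
    # back really is an exception instance before anything tries to raise it.
    value = exc_type(*args)
    if not isinstance(value, BaseException):
        raise TypeError(
            f"calling {exc_type!r} should have returned an instance of "
            f"BaseException, not {type(value).__name__}"
        )
    return value


# Same trick as the one-liners: __new__ hands back the class, not an instance.
E = type("E", (BaseException,), {"__new__": lambda cls, *args: cls})

try:
    instantiate_checked(E)
except TypeError as err:
    rejected = str(err)
else:
    rejected = None

# A well-behaved exception class passes through untouched.
ok = instantiate_checked(ValueError, "boom")
```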

------
sigjuice
A quick search in bugs.python.org shows several crashes. I don’t know much
about Python. Is there anything particularly special about this one?

~~~
lucy_gatenby
Just for fun, here's another one:

    
    
      import sys, threading
      def r():
          sys.stdin.buffer.read(1)
      t = threading.Thread(target=r, daemon=True)
      t.start()

~~~
Thorrez
Here's a Python 2 segfault I ran into recently.

    
    
        import sys, threading, time
        
        t = threading.Thread(target=sys.stdin.read, args=(1,))
        t.start()
        time.sleep(1)
        sys.stdin.close()
    

Run it, then after a few seconds press enter. It doesn't segfault in Python 3,
but it still doesn't behave how I'd like: I want the close() to unblock the
read(), but it doesn't. The read() still hangs until it gets some input.
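
One common workaround for a read that should be cancellable is to wait on two file descriptors at once and use the second purely as a wake-up signal. This is only a sketch under assumptions: POSIX (where `select` works on pipes), a pipe standing in for stdin, and a hypothetical helper named `read_or_cancel`.

```python
import os
import select
import threading


def read_or_cancel(data_fd, cancel_fd, size=1):
    # Hypothetical helper: block until either descriptor is readable.
    # Returns None if the cancel descriptor fired, else the bytes read.
    ready, _, _ = select.select([data_fd, cancel_fd], [], [])
    if cancel_fd in ready:
        return None
    return os.read(data_fd, size)


data_r, data_w = os.pipe()      # stand-in for stdin in this sketch
cancel_r, cancel_w = os.pipe()  # written to when the reader should give up

results = []
t = threading.Thread(
    target=lambda: results.append(read_or_cancel(data_r, cancel_r))
)
t.start()

os.write(cancel_w, b"x")        # wake the reader without sending any data
t.join(timeout=5)

for fd in (data_r, data_w, cancel_r, cancel_w):
    os.close(fd)
```

The reader thread returns as soon as the cancel pipe is written to, without anyone having to close the descriptor it is blocked on.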

~~~
loeg
The whole threading library in Python is a mess. Python was designed around
single threaded programs with shared-nothing state and the cracks show as you
move beyond that. The whole idea of replacing the GIL with... multiple same-
process distinct-state Python interpreters with cheap-ish message passing sort
of highlights how ugly it gets.

~~~
h2odragon
Back around python 1.5, there was almost a fork of python where every object
had locks, there were memory arenas, and multiprocessing was almost
thoughtlessly easy. That and stackless would've been great.

~~~
PixyMisa
It was also terribly slow.

IronPython did that too, on .NET. It ran at around one quarter the speed of
CPython.

------
lpghatguy
Segfaults in scripting languages are remarkably common, especially if
arbitrary bytecode can be loaded into the VM.

One I ran into in the wild recently is that in older versions of Lua,
exceptions in GC finalizers (the `__gc` metamethod) can trigger a segfault. In
those same versions of Lua, the bytecode format is notoriously dangerous to
load.

I wonder whether this will be a large component of newer scripting language
implementations. Do these safety issues warrant use of memory safe languages
like Rust, or use of existing sandboxed VM implementations like WebAssembly?

~~~
schoen
Is anybody fuzzing Python bytecodes? This sounds like a super-great
application for afl.

~~~
poizan42
It has always been the position of the CPython developers that using Python
for sandboxing is unsupported. With that in mind it doesn't really matter if
you can "exploit" Python with weird bytecode because you are supposed to be on
the other side of the airtight hatchway[0] anyways.

I don't know what the stance of other Python runtimes is, but you should
probably just use a sandbox at OS level which is likely to be tested far more
thoroughly.

[0]:
[https://devblogs.microsoft.com/oldnewthing/20060508-22/?p=31...](https://devblogs.microsoft.com/oldnewthing/20060508-22/?p=31283)
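
As a small illustration of why the process boundary is the useful one, the sketch below runs crashing code in a child interpreter so the parent survives. This is plain process isolation, not a sandbox: it contains the crash but does nothing to restrict what the child may access. `run_isolated` is a hypothetical helper name, and `ctypes.string_at(0)` is used only as a convenient way to dereference a null pointer.

```python
import subprocess
import sys


def run_isolated(source, timeout=10):
    # Hypothetical helper: execute `source` in a fresh interpreter process.
    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        timeout=timeout,
    )


# The child dereferences a null pointer; only the child dies.
crashed = run_isolated("import ctypes; ctypes.string_at(0)")

# The parent is unaffected and can keep launching work.
fine = run_isolated("print('still alive')")
```

On POSIX systems a negative `returncode` means the child was killed by a signal; on other platforms the failure may surface as an ordinary non-zero exit instead.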

~~~
ChrisSD
It's worth mentioning the interesting failure of pysandbox:

> I now think that putting a sandbox directly in Python cannot be secure. To
> build a secure sandbox, the whole Python process must be put in an external
> sandbox.

[https://mail.python.org/pipermail/python-dev/2013-November/130132.html](https://mail.python.org/pipermail/python-dev/2013-November/130132.html)

