Reviving PyMiniRacer: A Python <> JavaScript Bridge

simonw · 2024-03-23T20:11:12.000000Z

This looks very promising!

The problem I most want to solve with this kind of library is execution of untrusted user-provided code in a sandbox.

For that I need three things:

1. Total control over what APIs the user's code can call. I don't want their code being able to access the filesystem, or run subprocesses, or make network calls - not without me explicitly allowing a controlled subset of those things.

2. Memory limits. I need to be able to run code without fear that it will attempt to allocate all available memory on my computer - generally that means I want to be able to set e.g. a 128MB maximum on the amount it can use.

3. Time limits. I don't want someone to be able to paste "while (true) {}" into my system and consume an entire CPU thread in an infinite loop. Usually I want to say something like "run this untrusted code and throw an error if it takes more than 1s to run".

My most recent favourite solution to this is the https://pypi.org/project/quickjs/ Python library wrapper around QuickJS, which offers those exact features that I want - memory limits, control over what the code can do, and a robust time limit.

(The one thing it's missing is good documentation, but the https://github.com/PetterS/quickjs/blob/master/test_quickjs.... test suite covers all of those features and is quite readable.)
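
To make the three requirements concrete, here is a minimal sketch using the quickjs package. It assumes the `Context`, `set_memory_limit` (bytes) and `set_time_limit` (seconds) calls behave the way the test suite exercises them; `run_untrusted` and the two constants are hypothetical names for illustration, not part of the library.

```python
MEMORY_LIMIT_BYTES = 128 * 1024 * 1024  # the 128MB ceiling from requirement 2
TIME_LIMIT_SECONDS = 1                  # the 1s budget from requirement 3


def run_untrusted(source):
    """Evaluate JavaScript source in a fresh QuickJS context with limits set.

    Hypothetical helper. The import is deferred so that defining the
    function does not require the third-party quickjs package.
    """
    import quickjs  # third-party wrapper around QuickJS

    # Assumption: a fresh context exposes only the core JavaScript language,
    # with no filesystem, subprocess or network APIs unless the host adds them.
    ctx = quickjs.Context()
    ctx.set_memory_limit(MEMORY_LIMIT_BYTES)
    ctx.set_time_limit(TIME_LIMIT_SECONDS)
    # Errors raised by the script, including hitting a limit, are expected
    # to surface as quickjs.JSException and are left to the caller.
    return ctx.eval(source)
```

With the package installed, `run_untrusted("1 + 2")` should evaluate normally, while `run_untrusted("while (true) {}")` should be interrupted once the time limit is reached rather than spinning forever.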

Can PyMiniRacer handle those requirements as well?

bpcreech · 2024-03-24T03:24:53.000000Z

New owner here!

It should indeed be able to handle those requirements, and I think that matches the original use case from Sqreen.

Of course, there is no warranty, and especially since I only recently adopted this project, I'd encourage anyone using it to run untrusted code to go over the codebase and its assumptions very carefully.

And as far as running untrusted code goes, anything running V8 is subject to V8's stream of CVEs: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=v8. Whereas V8 in your browser gets aggressively auto-updated by the browser's auto-update feature, PyMiniRacer's update schedule is unlikely to be as reliable (and depends on your own action to update your pip installation!).

N.B.: the memory limits operate on a per-context (per MiniRacer object) basis, even though (for historical reasons) you can (attempt to) set them on a per-eval basis.

simonw · 2024-03-24T17:34:52.000000Z

Blogged about it here https://simonwillison.net/2024/Mar/24/reviving-pyminiracer/

mahmoudimus · 2024-03-24T02:39:00.000000Z

Google's Starlark is a pretty good Python-like language that does a lot of what you want; we use it at my company to run untrusted code. It's not exactly Python, but a pretty good subset. It's deterministic, hermetically sealed and non-Turing complete.

jbaviat · 2024-03-25T08:54:07.000000Z

PyMiniRacer (original) author here. PyMiniRacer is definitely a way to run insecure code, but as pointed out, the several CVEs in V8 require measures beyond "just" relying on PyMiniRacer to make it safe.

For the record, PyMiniRacer was itself the victim of a CVE: https://nvd.nist.gov/vuln/detail/CVE-2020-25489 - a heap overflow, my mistake.

1. Total control over what APIs the user's code can call: you kinda got it... users can just do plain JS.

2. Memory limits: you got it.

3. Time limits: you got it, but the current model is unreliable under high CPU load and with a high number of threads.

And thank you so much bpcreech for taking back the ownership of PyMiniRacer!

deathanatos · 2024-03-24T00:24:12.000000Z

WASM meets those requirements; that's basically its whole proposition. The only question is whether the higher-level language you want compiles to it.

simonw · 2024-03-24T00:48:45.000000Z

Yeah, I've been hoping to get good results from this out of WASM for ages. My latest attempt was this one: https://til.simonwillison.net/webassembly/python-in-a-wasm-s...

It's still not quite as easy as I want it to be!

simonw · 2024-03-24T00:55:02.000000Z

... and I just spotted this here: https://bpcreech.com/PyMiniRacer/api/#py_mini_racer.py_mini_...

    call(expr, *args, encoder=None, timeout=None, max_memory=None)

Where timeout is "number of milliseconds after which the execution is interrupted" and max_memory is "hard memory limit after which the execution is interrupted" (it doesn't specify what unit, presumably bytes? - Yes, it's bytes: https://bpcreech.com/PyMiniRacer/api/#py_mini_racer.py_mini_...)
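
A minimal sketch of how those two parameters might be used, assuming `MiniRacer` is importable from `py_mini_racer` and that `call` accepts `timeout` (milliseconds) and `max_memory` (bytes) as in the signature quoted above. `call_with_limits` and its defaults are hypothetical, chosen to match the 1s / 128MB figures from earlier in the thread.

```python
def call_with_limits(js_source, func_name, *args,
                     timeout_ms=1000,
                     max_memory_bytes=128 * 1024 * 1024):
    """Define functions from js_source in a fresh context, then call one.

    Hypothetical helper. The import is deferred so that defining the
    function does not require the third-party package.
    """
    from py_mini_racer import MiniRacer  # third-party V8 bridge

    # A new context per call keeps one script's globals and memory use
    # from leaking into the next one.
    ctx = MiniRacer()
    ctx.eval(js_source)
    # timeout is in milliseconds and max_memory in bytes, per the API docs
    # quoted above.
    return ctx.call(func_name, *args,
                    timeout=timeout_ms, max_memory=max_memory_bytes)
```

For example, `call_with_limits("function add(a, b) { return a + b; }", "add", 2, 3)` would be expected to return 5, while a function that loops forever should be interrupted after roughly `timeout_ms` milliseconds.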

nickpsecurity · 2024-03-23T20:23:02.000000Z

On a related note, Brython lets you run Python in the browser through JavaScript. You can even see Python in the HTML with “text/python” SCRIPT tags.

https://brython.info/

leontrolski · 2024-03-23T23:05:21.000000Z

I'm always excited by the idea of rendering JSX from Python in the same process. Mostly as a bridge between e.g. an existing Django app and full SPA React land. You'd swap out the scrappy Django string templating with JSX, then once a page passes some frontend interaction complexity threshold, shift it over entirely (with shared components between both). Could this project help achieve this, or are imports/build processes etc. too much of an impediment?

jennasys · 2024-03-24T02:43:00.000000Z

I've made React applications using Python via Transcrypt, but I wrap component functions in a Python decorator that makes direct calls to React.createElement() instead of using JSX (example: https://github.com/JennaSys/tictacreact2). It's possible to use JSX with this approach as well, but IMO it starts to get messy and defeats the purpose of using JSX in the first place.

btown · 2024-03-23T23:26:42.000000Z

I’ve often wondered about an automatic codemod that takes a Django template and turns it into an API endpoint + React component that does the same thing. Your {% if request.user.is_superuser %} checks get propagated to protecting the API data within. Frankly, though, the impetus to upgrade is often accompanied by a design rework anyways. But I do think it would be viable!

rossant · 2024-03-23T21:59:51.000000Z

There's also https://pyodide.org/en/stable/

punnerud · 2024-03-23T20:01:52.000000Z

All the JSON that works in JavaScript but not in Python. Finally a good solution?
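
One reading of this comment is "JSON-ish" data that a JavaScript engine accepts as an object literal but Python's strict `json` module rejects (unquoted keys, single quotes, trailing commas). A small standard-library sketch of that gap; `parses_as_strict_json` and the sample string are made up for illustration.

```python
import json

# Valid as a JavaScript object literal, but not valid JSON:
# unquoted keys, single-quoted strings and trailing commas.
js_object_literal = "{name: 'racer', tags: ['v8',],}"


def parses_as_strict_json(text):
    """Return True if Python's json module accepts text, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Here `parses_as_strict_json('{"name": "racer"}')` is True and `parses_as_strict_json(js_object_literal)` is False. Handing such a literal to a JavaScript engine and reading the result back is presumably the "solution" meant here, with the caveat that evaluating it is code execution, so the sandboxing concerns above apply.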