Nice. Maybe pie in the sky, but considering how prevalent Python is in machine learning, how likely is it that at some point most ML frameworks written in Python become executable in a web browser through WASM?
Last we checked, for one of our use cases around sandboxing, key pydata libraries were slowly moving there, but it takes a village.
At that time, I think our blockers were Apache Arrow, Parquet readers, and startup time; there were active issues on all three. The GPU & multicore thing is a different story, as that is more about Google & Apple than WASM wrt browsers, and I'd be curious about the CUDA story serverside. We didn't research the equivalent of volume mounts and shmem for fast read-only data imports.
For Louie.AI, we ended up continuing with serverside & container-based sandboxing (containers w libseccomp, nsjail, disabled networking, ...).
Hard to motivate WASM on the server for general pydata sandboxing in 2024 given startup times, slowdowns, device support, etc.: most of our users who want the extra protection would also rather just self-host, which shrinks the threat model. Maybe look again in 2025? It's still a good direction, just not there yet.
As a browser option, or for some more limited cloud FaaS use cases, it's still potentially interesting for us, and we'll keep tracking it.
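For reference, a rough sketch (not our actual config) of the nsjail side of that setup; the flags follow nsjail's README examples, and the chroot and script paths here are placeholders:

    import subprocess

    # Hypothetical sketch of running untrusted code under nsjail.
    # /jail is a placeholder chroot containing a Python install;
    # nsjail drops network access by default via a fresh net namespace.
    cmd = [
        "nsjail",
        "-Mo",                 # standalone mode, run the command once
        "--chroot", "/jail",   # minimal filesystem root
        "--user", "99999",     # unprivileged uid/gid inside the jail
        "--group", "99999",
        "--time_limit", "30",  # hard wall-clock kill after 30s
        "--",
        "/usr/bin/python3", "/sandbox/untrusted.py",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)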
Python in ML acts as a glue language for loading the data, the interface, APIs, etc. The hard work is done by C libraries running in parallel on GPUs.
Python is quite slow, and handles parallelization very badly. This is not very important for data loading and conditioning tasks, which would benefit little from parallelization, but it is critical for inference.
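To make that concrete, a toy benchmark (illustrative only; exact numbers vary by machine) comparing an interpreted-Python reduction with the same work done in NumPy's C kernel:

    import time
    import numpy as np

    x = np.random.rand(1_000_000)

    # Interpreted Python: one bytecode dispatch per element.
    t0 = time.perf_counter()
    total = 0.0
    for v in x:
        total += v
    print(f"python loop: {time.perf_counter() - t0:.3f}s")

    # The same reduction inside NumPy's compiled C kernel.
    t0 = time.perf_counter()
    total = x.sum()
    print(f"numpy sum:   {time.perf_counter() - t0:.3f}s")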
Depending on the underlying machine learning activity (GPU/no-GPU), it should already be possible today. Any low-level machine learning loops are raw native code or GPU kernels already, and Python's execution speed is irrelevant there.
The question is whether web browsers and WebAssembly give you enough RAM to do any meaningful machine learning work (wasm32 linear memory is capped at 4 GiB, and browsers have often allowed less).
> FWIU Service Workers and Web Workers and Web Locks are the browser APIs available for concurrency in browsers
sqlite-wasm, sqlite-wasm-http, duckdb-wasm, (edit) PostgreSQL WASM / pglite; WhatTheDuck, pretzelai; lancedb/lance is faster than pandas with dtype_backend="arrow" and has a vector index
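For anyone unfamiliar with the last one, a sketch along the lines of lancedb's Python quickstart (the table name and data here are made up):

    import lancedb

    # Connect to (or create) an on-disk database, insert a couple of
    # rows with vectors, then run a nearest-neighbor search.
    db = lancedb.connect("data/sample-lancedb")
    table = db.create_table(
        "my_table",
        data=[
            {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
            {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
        ],
    )
    result = table.search([100, 100]).limit(2).to_pandas()
    print(result)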
The actual underlying models run in a lower-level language (not Python).
But with the right toolchain, you can already do this. You can use Pyodide to embed a Python interpreter in WASM, and if you set things up correctly you should be able to make the underlying C/FORTRAN/whatever extensions also target WASM and link them up.
TFA is compiling a subset of actual raw Python to WASM (no extensions). To be honest, I think applications for this are pretty niche. I don't think Python is a super ergonomic language once you remove all the dynamicity to allow it to compile down. But maybe someone can prove me wrong.
We implemented an in-browser Python editor/interpreter built on Pyodide over at Comet. Our users are data scientists who need to build custom visualizations quite often, and the most familiar language for most of them is Python.
One of the issues you'll run into is that Pyodide only works by default with packages that have pure-Python wheels available. The team has developed support for some libraries with C dependencies (like scikit-learn, I believe), but frameworks like PyTorch are particularly thorny (see this issue: https://github.com/pyodide/pyodide/issues/1625)
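For reference, package loading from the Python side inside Pyodide looks roughly like this (this runs inside Pyodide, e.g. its REPL, where top-level await works; micropip is Pyodide's package installer):

    # Runs inside Pyodide, not in a regular CPython interpreter.
    import micropip

    # Pure-Python wheels install straight from PyPI.
    await micropip.install("attrs")

    # Packages with C extensions resolve to Pyodide's own WASM builds
    # when the team has ported them (scikit-learn is one such package).
    await micropip.install("scikit-learn")

    from sklearn.linear_model import LinearRegression
    print(LinearRegression)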
For popular ML/DL models, you can already export the model to ONNX format for inference. However, the glue code is still Python, and you may need to replace that part with the host's language
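As a sketch (the model and shapes here are made up), exporting a PyTorch module with the standard torch.onnx.export call looks like this; the resulting model.onnx can then run in the browser with something like onnxruntime-web:

    import torch

    # Toy model standing in for a real trained network.
    model = torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 2),
    )
    model.eval()

    # Trace the model with a dummy input and write the ONNX graph.
    dummy = torch.randn(1, 4)
    torch.onnx.export(
        model, dummy, "model.onnx",
        input_names=["input"],
        output_names=["output"],
    )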