Nice. Maybe pie in the sky, but considering how prevalent Python is in machine learning, how likely is it that at some point most ML frameworks written in Python become executable in a web browser through WASM?
Last we checked, for one of our use cases around sandboxing, key pydata libraries were slowly moving there, but it takes a village.
At that time, I think our blockers were Apache Arrow, Parquet readers, and startup time; there were active issues on all three. The GPU & multicore thing is a different story, as that is more about Google & Apple than WASM wrt browsers, and I'd be curious about the CUDA story serverside. We didn't research the equivalent of volume mounts and shmem for fast read-only data imports.
For Louie.AI, we ended up continuing with serverside & container-based sandboxing (containers w libseccomp, nsjail, disabled networking, ...).
Hard to motivate WASM on the server for general pydata sandboxing in 2024 given startup times, slowdowns, device support, etc.: most of our users who want the extra protection would also rather just self-host, which shrinks the threat model. Maybe look again in 2025? It's still a good direction, just not there yet.
As a browser option, or for some more limited cloud FaaS use cases, it's still potentially interesting for us, and we'll keep tracking it.
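For reference, a rough sketch (not our actual config) of the nsjail side of that setup; the flags follow nsjail's README examples, and the chroot and script paths here are placeholders:

    import subprocess

    # Hypothetical sketch of running untrusted code under nsjail.
    # /jail is a placeholder chroot containing a Python install;
    # nsjail drops network access by default via a fresh net namespace.
    cmd = [
        "nsjail",
        "-Mo",                 # standalone mode, run the command once
        "--chroot", "/jail",   # minimal filesystem root
        "--user", "99999",     # unprivileged uid/gid inside the jail
        "--group", "99999",
        "--time_limit", "30",  # hard wall-clock kill after 30s
        "--",
        "/usr/bin/python3", "/sandbox/untrusted.py",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)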
Python in ML acts as a glue language for loading the data, the interface, APIs, etc. The hard work is done by C libraries running in parallel on GPUs.
Python is quite slow, and handles parallelization very badly. This is not very important for data loading and conditioning tasks, which would benefit little from parallelization, but it is critical for inference.
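To make that concrete, a toy benchmark (illustrative only; exact numbers vary by machine) comparing an interpreted-Python reduction with the same work done in NumPy's C kernel:

    import time
    import numpy as np

    x = np.random.rand(1_000_000)

    # Interpreted Python: one bytecode dispatch per element.
    t0 = time.perf_counter()
    total = 0.0
    for v in x:
        total += v
    print(f"python loop: {time.perf_counter() - t0:.3f}s")

    # The same reduction inside NumPy's compiled C kernel.
    t0 = time.perf_counter()
    total = x.sum()
    print(f"numpy sum:   {time.perf_counter() - t0:.3f}s")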
Depending on the underlying machine learning activity (GPU/no-GPU), it should already be possible today. Any low-level machine learning loops are raw native code or GPU kernels already, and Python's execution speed is irrelevant there.
The question is whether web browsers and WebAssembly give you enough RAM to do any meaningful machine learning work (wasm32 linear memory is capped at 4 GiB, and browsers have often allowed less).
> FWIU Service Workers and Web Workers and Web Locks are the browser APIs available for concurrency in browsers
sqlite-wasm, sqlite-wasm-http, duckdb-wasm, (edit) PostgreSQL WASM / pglite; WhatTheDuck, pretzelai; lancedb/lance is faster than pandas with dtype_backend="arrow" and has a vector index
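For anyone unfamiliar with the last one, a sketch along the lines of lancedb's Python quickstart (the table name and data here are made up):

    import lancedb

    # Connect to (or create) an on-disk database, insert a couple of
    # rows with vectors, then run a nearest-neighbor search.
    db = lancedb.connect("data/sample-lancedb")
    table = db.create_table(
        "my_table",
        data=[
            {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
            {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
        ],
    )
    result = table.search([100, 100]).limit(2).to_pandas()
    print(result)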
The actual underlying models run in a lower-level language (not Python).
But with the right toolchain, you can already do this. You can use Pyodide to embed a Python interpreter in WASM, and if you set things up correctly you should be able to make the underlying C/FORTRAN/whatever extensions also target WASM and link them up.
TFA is compiling a subset of actual raw Python to WASM (no extensions). To be honest, I think applications for this are pretty niche. I don't think Python is a super ergonomic language once you remove all the dynamicity to allow it to compile down. But maybe someone can prove me wrong.
We implemented an in-browser Python editor/interpreter built on Pyodide over at Comet. Our users are data scientists who need to build custom visualizations quite often, and the most familiar language for most of them is Python.
One of the issues you'll run into is that Pyodide only works by default with packages that have pure-Python wheels available. The team has developed support for some libraries with C dependencies (like scikit-learn, I believe), but frameworks like PyTorch are particularly thorny (see this issue: https://github.com/pyodide/pyodide/issues/1625)
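For reference, package loading from the Python side inside Pyodide looks roughly like this (this runs inside Pyodide, e.g. its REPL, where top-level await works; micropip is Pyodide's package installer):

    # Runs inside Pyodide, not in a regular CPython interpreter.
    import micropip

    # Pure-Python wheels install straight from PyPI.
    await micropip.install("attrs")

    # Packages with C extensions resolve to Pyodide's own WASM builds
    # when the team has ported them (scikit-learn is one such package).
    await micropip.install("scikit-learn")

    from sklearn.linear_model import LinearRegression
    print(LinearRegression)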
For popular ML/DL models, you can already export the model to ONNX format for inference. However, the glue code is still Python, and you may need to replace that part with the host's language
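As a sketch (the model and shapes here are made up), exporting a PyTorch module with the standard torch.onnx.export call looks like this; the resulting model.onnx can then run in the browser with something like onnxruntime-web:

    import torch

    # Toy model standing in for a real trained network.
    model = torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 2),
    )
    model.eval()

    # Trace the model with a dummy input and write the ONNX graph.
    dummy = torch.randn(1, 4)
    torch.onnx.export(
        model, dummy, "model.onnx",
        input_names=["input"],
        output_names=["output"],
    )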