benryanx's comments

benryanx · 2026-02-26T22:23:35 1772144615

That's the postMessage bottleneck. PR #1 replaces it with Atomics-based dispatch, which should push utilisation much higher. Early numbers look like 6.4 tok/sec on an M2 Max.
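For anyone who hasn't seen the pattern, here is a rough sketch of what Atomics-based dispatch can look like. It is not the code from the PR: the control-block layout (`GEN`, `NEXT`, `END`, `DONE`) and the function names are assumptions made up for illustration. The idea is that workers sleep on a shared generation counter and claim rows with `Atomics.add`, so no message is posted per job.

```javascript
// Hypothetical layout of a shared control block (illustrative only).
const GEN = 0;  // generation counter, bumped once per dispatched job
const NEXT = 1; // next unclaimed row index
const END = 2;  // one past the last row of the current job
const DONE = 3; // number of rows finished so far

function createControl() {
  return new Int32Array(new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT));
}

// Coordinator side: publish a job of `rows` rows and wake every sleeping worker.
// Assumes the previous job has fully drained (DONE === END) before it is called.
function dispatch(ctrl, rows) {
  Atomics.store(ctrl, NEXT, 0);
  Atomics.store(ctrl, DONE, 0);
  Atomics.store(ctrl, END, rows);
  Atomics.add(ctrl, GEN, 1);
  Atomics.notify(ctrl, GEN); // default count wakes all waiters
}

// Worker side: atomically claim the next chunk of rows, or null if none remain.
function claimChunk(ctrl, chunk) {
  const start = Atomics.add(ctrl, NEXT, chunk); // returns the value before the add
  const end = Math.min(start + chunk, Atomics.load(ctrl, END));
  return start < end ? [start, end] : null;
}

// Worker side: sleep until GEN differs from `lastGen`, then drain the job.
// Returns the generation that was processed so the caller can loop on it.
function runJob(ctrl, lastGen, chunk, work) {
  Atomics.wait(ctrl, GEN, lastGen); // returns at once if GEN already changed
  let range;
  while ((range = claimChunk(ctrl, chunk)) !== null) {
    for (let i = range[0]; i < range[1]; i++) work(i);
    Atomics.add(ctrl, DONE, range[1] - range[0]);
  }
  return Atomics.load(ctrl, GEN);
}
```

In a real setup each worker would receive the same `SharedArrayBuffer` once at startup and then call `runJob` in a loop, while the coordinator polls or waits on `DONE`. In browsers, `SharedArrayBuffer` needs a cross-origin isolated page, and `Atomics.wait` is only allowed off the main thread.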

benryanx · 2026-02-23T21:45:57 1771883157

The part I'd point people to first is ARCHITECTURE.md, specifically the WASM binary construction section. Every other CPU inference project I know of uses Emscripten or a compiled Rust backend. PureBee builds the binary in JavaScript. That's the claim I'd most want challenged, in case I'm wrong about it being novel.
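To make "builds the binary in JavaScript" concrete, here is a minimal sketch of the general technique: emitting the bytes of a WebAssembly module by hand and instantiating them, with no compiler toolchain involved. This is not taken from ARCHITECTURE.md or the PureBee source. The helper names (`uleb`, `vec`, `section`, `buildAddModule`) are hypothetical, and the module only exports a toy `add` function.

```javascript
// Encode a non-negative integer as unsigned LEB128, the varint format WASM uses.
function uleb(n) {
  const out = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80; // continuation bit
    out.push(byte);
  } while (n !== 0);
  return out;
}

// A WASM "vector": element count followed by the encoded elements.
const vec = (items) => [...uleb(items.length), ...items.flat()];

// A section: id byte, payload size, payload.
const section = (id, payload) => [id, ...uleb(payload.length), ...payload];

// A name: byte length followed by UTF-8 bytes.
const name = (s) => {
  const bytes = [...new TextEncoder().encode(s)];
  return [...uleb(bytes.length), ...bytes];
};

const I32 = 0x7f;

// Hypothetical example: a module exporting add(i32, i32) -> i32.
function buildAddModule() {
  const funcType = [0x60, ...vec([[I32], [I32]]), ...vec([[I32]])];
  const body = [
    ...vec([]), // no local declarations
    0x20, 0x00, // local.get 0
    0x20, 0x01, // local.get 1
    0x6a,       // i32.add
    0x0b,       // end
  ];
  return new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
    0x01, 0x00, 0x00, 0x00, // version 1
    ...section(1, vec([funcType])),                          // type section
    ...section(3, vec([uleb(0)])),                           // function section
    ...section(7, vec([[...name("add"), 0x00, ...uleb(0)]])), // export section
    ...section(10, vec([[...uleb(body.length), ...body]])),  // code section
  ]);
}

const wasmBytes = buildAddModule();
const { add } = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes)).exports;
console.log(add(2, 3)); // 5
```

A real inference kernel would need memory, loops and SIMD opcodes on top of this, but the mechanics are the same: sections are byte arrays with length prefixes, and `WebAssembly.Module` accepts whatever bytes are handed to it.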