Agreed, I find this to be a super productive environment, because you get all of VS Code's IDE features plus the niceties of Jupyter and IPython.
I wrote a small vscode extension that builds upon this to automatically infer code blocks via indentation, so that you don't have to select them manually: [0]
I develop Lonboard [0], a Python library for plotting large geospatial data. If you have small data (up to roughly 30,000 coordinates), Leaflet-based Python libraries like folium and ipyleaflet can be fine, but because Lonboard uses deck.gl for GPU-accelerated rendering, it's 30-50x faster than Leaflet for large datasets [1].
It can read from HTTP URLs, but you'd need to manage signing the URLs yourself. On the writing side, it currently writes to an ArrayBuffer, which you could then upload to a server or save on the user's machine.
Arrow JS is just ArrayBuffers underneath. You do want to amortize some operations to avoid unnecessary conversions. E.g. Arrow JS stores strings as UTF-8, but native JS strings are UTF-16, I believe, so every string you read out has to be decoded.
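To illustrate the amortization point, here's a minimal sketch in plain Python (not Arrow JS itself). It assumes the usual Arrow-style string layout of an int32 offsets buffer plus one contiguous UTF-8 data buffer. `build_utf8_column` and `Utf8Column` are hypothetical names made up for this example, and the offsets use the machine's native byte order for simplicity.

```python
from array import array


def build_utf8_column(values):
    """Pack strings into an offsets buffer plus one contiguous UTF-8 data buffer."""
    data = bytearray()
    offsets = [0]
    for value in values:
        data += value.encode("utf-8")
        offsets.append(len(data))
    # One more offset than there are values; value i spans offsets[i]:offsets[i + 1].
    return array("i", offsets).tobytes(), bytes(data)


class Utf8Column:
    """Hypothetical reader that decodes values lazily and caches the result."""

    def __init__(self, offsets_buf, data_buf):
        self._offsets = memoryview(offsets_buf).cast("i")
        self._data = memoryview(data_buf)
        self._cache = {}
        self.decode_count = 0  # how many times we paid the decoding cost

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, i):
        if i not in self._cache:
            start, end = self._offsets[i], self._offsets[i + 1]
            # Decoding allocates a new native string, so do it once per value.
            self._cache[i] = str(self._data[start:end], "utf-8")
            self.decode_count += 1
        return self._cache[i]


offsets, data = build_utf8_column(["héllo", "arrow", "日本語"])
col = Utf8Column(offsets, data)
first = col[0]
again = col[0]  # served from the cache, no second decode
```

The cache is the "amortize" part: the raw bytes stay untouched in the buffer, and the conversion cost is only paid for values you actually read, once each.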
Arrow is especially powerful across the WASM <--> JS boundary! In fact, I wrote a library to interpret Arrow from Wasm memory into JS without any copies [0]. (Motivating blog post [1])
Yeah, we built it to essentially stream columnar record batches from server GPUs to browser GPUs with minimal touching of any of the array buffers. It was very happy-path for that kind of fast bulk columnar processing, and we donated it to the community to grow to use cases beyond that. So it sounds like the client code may have been doing more than that.
For high-performance code, I'd have expected overhead in %s, not Xs. And I'm not surprised to hear of slowdowns for anything straying beyond that -- cool to see folks have expanded further! More recently, we've been having good experiences here in Perspective <-arrow-> Loaders, enough so that we haven't had to dig deeper. Our current code is targeting < 24 FPS, as genAI data analytics is more about bigger volumes than velocity, so I'm unsure. That said, it's hard to imagine going much faster given it's bulk typed arrays without copying, especially on real code.
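For anyone wondering what "bulk typed arrays without copying" looks like in practice, here's a rough sketch in plain Python rather than the actual Arrow JS code. It assumes a toy batch layout of two float32 columns packed back to back. `make_batch` and `column_views` are hypothetical helpers for this example only.

```python
from array import array


def make_batch(xs, ys):
    """Pack two float32 columns back to back into one contiguous buffer."""
    return bytearray(array("f", xs).tobytes() + array("f", ys).tobytes())


def column_views(batch, num_rows):
    """Return typed views of both columns without copying any bytes."""
    floats = memoryview(batch).cast("f")
    return floats[:num_rows], floats[num_rows:2 * num_rows]


batch = make_batch([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
x, y = column_views(batch, 3)

# The views share memory with the batch: a write through a view lands in the buffer.
x[0] = 5.0
```

Reading a column this way is just pointer arithmetic over the original buffer, which is why I'd expect the overhead to stay small as long as you stick to that path.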
[0]: https://news.ycombinator.com/item?id=47482185