IMO, the next step for distributed programming, and one WebAssembly could finally enable, will be communicating with code, not messages.

Imagine you're building an app that lets users of a music streaming service see how much time they spent listening to particular genres of music.

The way you would do this with REST, SOAP or similar could roughly be:

1. Retrieve the user's whole listen history.

2. For each track, send a request to get the track info, which includes album and duration.

3. For each album, retrieve the album info, which contains the genre.

4. Tabulate the returned data.

That's thousands of requests, and a lot of network bandwidth wasted.
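
Sketched in Python against a hypothetical REST API (the endpoints and field names here are invented), the flow looks roughly like this; note that every loop iteration is a network round trip:

    import requests

    BASE = "https://api.example-music.com"  # hypothetical service

    def genre_listening_time(user_id):
        seconds_by_genre = {}
        history = requests.get(f"{BASE}/users/{user_id}/history").json()
        for entry in history:  # one request per track...
            track = requests.get(f"{BASE}/tracks/{entry['track_id']}").json()
            # ...and another per album, just to read its genre
            album = requests.get(f"{BASE}/albums/{track['album_id']}").json()
            genre = album["genre"]
            seconds_by_genre[genre] = seconds_by_genre.get(genre, 0) + track["duration"]
        return seconds_by_genre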

GraphQL is a bit better: you receive only what you need, and you send just one request, but you still get a whole lot of track objects, which have to travel over the network and be processed on your side.
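
A hypothetical GraphQL query for the same data might look like this; it's one request, but every track object still crosses the wire:

    query {
      user(id: "12345") {
        listenHistory {
          track {
            duration
            album { genre }
          }
        }
      }
    }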

With my proposed approach, you would send a piece of wasm code, a bit similar to a Lambda function, which would run on the streaming service's API cloud.

Such a function could use fine-grained API calls to retrieve what it needs. It would be executing on the service's network, so round trips would be either short or nonexistent. After tabulating the data, it would send back a simple JSON with the genres and time spent.
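
A sketch of such a function, written in Python for readability (in practice it would be compiled to wasm), against a hypothetical host-provided api object:

    # hypothetical guest code shipped to the service; "api" is provided by the host
    def genre_listening_time(api, user_id):
        seconds_by_genre = {}
        for entry in api.listen_history(user_id):  # in-datacenter calls, cheap round trips
            track = api.track(entry.track_id)
            album = api.album(track.album_id)
            genre = album.genre
            seconds_by_genre[genre] = seconds_by_genre.get(genre, 0) + track.duration
        return seconds_by_genre  # only this small result crosses the public network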

This approach would allow developers to extend any service in ways the original developers never imagined. We could also rethink the way backends and frontends communicate, giving frontend devs a way to do even more without changing the backend.




On the Windows accessibility team where I work, we've done something like this for the UI Automation API, not using WebAssembly but using our own bytecode. We call it Remote Operations. The idea is that certain parts of the UI Automation API require a very large number of cross-process requests (i.e. from the assistive technology to the application process) to do anything non-trivial. So with Remote Operations, the client (assistive technology) sends a little program to the provider (application) to do the API calls.

This feature hasn't yet made it into a stable Windows release, but it's available in Insider builds, and the high-level client libraries and accompanying tests are on GitHub here:

https://github.com/microsoft/microsoft-ui-uiautomation

Disclaimer: Remote Operations is still a work in progress. There's no public documentation yet aside from the README in the GitHub repo, and the client libraries aren't yet packaged for easy consumption. Moreover, this is not an official Microsoft announcement; I'm just bringing it up on this thread because it's relevant and I'm proud of this feature that I helped develop.


As far as I remember, NVDA either does or used to do something similar: injecting C++ code into other processes so that it could quickly query their APIs and return just the data it needed. I never really delved into that part of the codebase, so I might be wrong here.

As an aside, how do you generate the bytecode you use? Do you write quasi-assembly by hand? Did you develop your own compiler? For what language?

Will other, non-Microsoft assistive technologies, like NVDA for example, be able to use it?


You're right; all serious third-party screen readers for Windows currently inject code into the application processes. In particular, they all use this technique to efficiently traverse browser DOMs and Office object models. The point of Remote Operations is to provide a way to efficiently get the equivalent information through UI Automation without the risks (in security and robustness) of injecting native code in-process.

As for how the bytecode is built, the GitHub repository I linked has a library with a WinRT API for building the bytecode at run time, by calling methods that correspond to the individual opcodes. It's an object-oriented API, so there's a class for each type of operand. And for control flow blocks (e.g. if-else and while loops), the method takes a WinRT delegate (basically a lambda) that builds the body of the block. You can see how it works in the functional tests; stay tuned for actual sample code.
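
Not the actual WinRT API, but the delegate-per-block idea can be sketched in a few lines of self-contained Python with invented opcodes:

    # minimal sketch of the builder pattern described above (invented opcodes)
    class Builder:
        def __init__(self):
            self.code = []

        def load_const(self, value):
            self.code.append(("LOAD_CONST", value))

        def if_else(self, build_then, build_else):
            # expects the condition on the stack; delegates build each block
            branch_at = len(self.code)
            self.code.append(None)                  # patched below
            build_then()
            jump_at = len(self.code)
            self.code.append(None)                  # jump over the else block
            self.code[branch_at] = ("JUMP_IF_FALSE", len(self.code))
            build_else()
            self.code[jump_at] = ("JUMP", len(self.code))

    b = Builder()
    b.load_const(True)                              # condition
    b.if_else(lambda: b.load_const("then branch"),
              lambda: b.load_const("else branch"))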


Do you think it could be useful to wrap this in a LINQ provider for C# usage?


You’re really on to something with this query language for structured data idea! I’d find it super-useful as a dev to be able to think of a backing service not as a set of disconnected endpoints, but a base of operations for dealing with data.

If only such structured query language data bases existed, I bet they would get a lot of use.

(No but really: how is what you’re describing meaningfully different from SQL, especially with the restrictions you’d want to put on the code in practice?)


You're onto something with SQL here. It definitely should be a big part of this, especially for the data-querying aspect. Of course, we wouldn't query over raw database data, but over sanitized, secure views.

SQL isn't everything, though. Such APIs, for security reasons, wouldn't allow you to modify some of the data in arbitrary ways, but would require you to use special functions instead. Think of being able to query the number of upvotes and call an upvote function, without being able to just change the upvote count.
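
A sketch of the kind of surface the sandboxed code might see, with hypothetical names throughout:

    # hypothetical API surface exposed to untrusted guest code
    class PostApi:
        def __init__(self, store, user):
            self._store, self._user = store, user

        def upvote_count(self, post_id):
            # reads go through a sanitized view
            return self._store.query_view("post_view", post_id)["upvotes"]

        def upvote(self, post_id):
            # the only way to change the count; validated and deduplicated by the host
            self._store.record_vote(post_id, self._user)

        # deliberately absent: set_upvote_count(post_id, n)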

Maybe wasm (as an imperative platform) combined with a structured query API? Maybe a programming language with something like Elixir's Ecto or C#'s LINQ? Not plain SQL, but an imperative language with the power of relational queries?


Your characterization of sanitized secure views as something other than data in a database strikes me as vitalistic (and dubious).

The “communicate with code, not messages” in your first comment strikes me the same way: code is data, and data is code, and Wasm does nothing to change that. Nothing about making an interface more expressive guarantees one is making it better, and overspecification of user intent can make it much harder for the system to optimize execution.

Even if we really do want imperativity: your upvote example, even including the security aspects, is easily handled using the stored procedure support that’s been available in SQL databases for decades. It is not unfair to describe that support and its variants (e.g. Postgres lets you use Python) as “not plain SQL, but an imperative language with the power of relational queries”.
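
For instance, the upvote example as a PostgreSQL function with a Python body (hypothetical posts table); clients get EXECUTE on the function, never UPDATE on the table:

    -- hypothetical schema; the body is Python via plpython3u
    CREATE FUNCTION upvote(post integer) RETURNS integer AS $$
        plan = plpy.prepare(
            "UPDATE posts SET upvotes = upvotes + 1 WHERE id = $1 RETURNING upvotes",
            ["integer"])
        return plpy.execute(plan, [post])[0]["upvotes"]
    $$ LANGUAGE plpython3u;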

Can this approach handle everything? No; it would be madness to try and build e.g. huge neural network models in SQL + some stored procedures. But it handles a lot, including your two examples, and handles them more compactly than a general purpose language would, and exists.

Mind you, I’m not saying your general concept isn’t great. It is great, it is a great step. It’s just that distributed programming already took it, back in the 70s.


I strongly agree with your comment.

I also find the idea of arbitrary code execution on a remote host to build a derivative service to be quite a naive point of view. Even if you can validate that the code isn't going to pwn you (a mighty stretch), how can one justify being a recipient of free compute cycles on someone else's infra? It doesn't seem realistic to me.


> I also find the idea of arbitrary code execution on a remote host to build a derivative service to be quite a naive point of view. Even if you can validate that the code isn't going to pwn you (a mighty stretch)

Be aware that's exactly what you're doing when you visit any modern site: you let a remote server execute arbitrary code on your computer, trusting it's not going to pwn you. Sure, there were tons of loopholes in the first few years of JS, but now browsers and operating systems are so secure that it's mostly a non-issue. WebAssembly is a sandboxing-first technology, so I don't think this is an impossible, or even a particularly hard, issue to solve.

>how can one justify being a recipient of free compute cycles on someone else's infra? It doesn't seem realistic to me

Well, there are API call limits, or something equivalent to them; maybe computing credits in this instance. Calculating how much compute you're using is something that e.g. AWS Lambda already does; no reason an open-source runtime couldn't do the same. The fact that it wouldn't be native code executed directly by the CPU, but Wasm executed by a VM, makes things even simpler.
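
A toy sketch of the accounting; real wasm runtimes (e.g. Wasmtime's fuel mechanism) meter at the instruction level, but the shape is the same (names here are invented):

    # toy fuel metering: charge per executed guest instruction,
    # abort when the caller's compute credit runs out
    class OutOfFuel(Exception):
        pass

    def run_metered(instructions, fuel):
        for op in instructions:
            if fuel <= 0:
                raise OutOfFuel("compute credit exhausted")
            fuel -= 1
            op()                      # execute one guest instruction
        return fuel                   # leftover credit

    remaining = run_metered([lambda: None] * 40, fuel=1000)
    print(1000 - remaining)           # 40 units to bill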


>but now browsers and operating systems are so secure that it's mostly a non-issue

Right, which is the point - it took a behemoth of an effort, backed by megacorps, to get us here. These applications are public-facing because that's the key way for those corps to make money - content delivery.

Why would anyone put this same effort into building more complicated internal services enabling derivative services to extract value from their platform?

> Calculating how much compute you're using is something that e.g. AWS Lambda already does

Yes, exactly - another megacorp built a service that makes them huge amounts of money, and now they sell access to it. That code is certainly not publicly accessible.

So, again, help me bridge this gap - what possible reason would Spotify (to extend GP's example) have to build such a platform? It would mean Spotify maintaining massively more infrastructure in order to help others extract value from its platform.

It just doesn't check out. Is it objectively a more efficient method of computation? Yes. Is it also completely misaligned with corporate incentives in the modern day? In my opinion, yes.


I don't think OP necessarily implied that the compute cycles had to be free.


Right, so then he's proposing that this streaming service enable derivative services by not only managing the API for the streaming service but also running its own public cloud that issues micro-charges for compute cycles. That seems even more pie-in-the-sky.


It's definitely pie-in-the-sky. I wouldn't argue otherwise.

But I do think this is a problem our industry will have to solve eventually. The volume of data we have to move between servers in order to ask questions is exploding. We move petabytes of image data between filesystems in order to train machine learning models. If the problems in this thread can be solved (admittedly a big if), moving the code to the data would be fundamentally better.


Mind you, most of this should be handled by some kind of runtime. Any complex-enough web service already runs HTTP servers, databases, and sometimes message queues, caches, or even things like Kubernetes. There's no reason why it couldn't be running yet another program to manage all this.

Such a thing could become the de facto API; if REST or GraphQL were needed, they could even be implemented on top of it.


>There's no reason why it couldn't be running yet another program to manage all this

Yes there is - that program would need to be extremely hardened (even against DDoS-style attacks: run a ton of memory-consuming cycle-hogs on a service's platform), integrated with tracking and billing, and open source (or itself charging).

I just don't see the incentive for someone to build this, and I doubly don't see the incentive for a corp to run it. Why take on this huge extra burden of infra to help derivative services? It's not going to make you money.


The code you’re talking about is a query. There are straightforward ways to do this now (including graphql). There are some issues with it, though.

The query (or code, if you will) is separate from your data, meaning the aspects of the data a query depends on can’t change without breaking the query. You can have versioned apis and keep the old apis around, or have a query compatibility rewrite layer, etc. but obviously these kinds of things have serious complications/limitations.

That approach is useful only in proportion to how broad, granular, and flexible the API is that you provide for the code/query to access. That sounds good, but such an API is a lot of work to create and a lot more work to maintain. An API is like a promise. If you make big promises, it will take a lot of effort to keep them.

Also, realistically you have to limit arbitrary code sent in from untrusted sources that runs on your server. Even completely non-malicious users will write inefficient code or have inadvertent infinite loops that will eat up your compute budget if you don’t have controls in place.

So...

It’s very often better to add a “listeningStatsByGenre” endpoint if that seems to be what your users want.
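
E.g., a purpose-built endpoint (Flask; stats_by_genre is a hypothetical aggregation helper that runs one server-side query):

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.get("/users/<user_id>/listeningStatsByGenre")
    def listening_stats_by_genre(user_id):
        # stats_by_genre is hypothetical: one server-side aggregate query
        return jsonify(stats_by_genre(user_id))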


Uh... ever heard of SQL? Here's the "code" you would send:

    -- total listening time per genre for one user
    select s.genre,
        sum(s.duration) as total_duration
    from listen_history h
    inner join songs s on s.song_id = h.song_id
    where h.user_id = 12345
    group by s.genre


Plus

    ; DROP TABLE songs

Sandboxing in environments with no trust is always a challenge. GP's idea was tried and failed many times, which is why we prefer passing data around now.


No doubt. It is a shame we are all forced to create our own DSLs (e.g. GET /genre/12345) that dispatch to the above, though. I really had hopes for the OData specification, but it never seemed to get much traction.

It is worth noting, though, that many RDBMSes allow for "read only" permission groups, so the risk of malicious DELETEs/UPDATEs isn't a huge problem in practice. Rather, it's the unbounded queries that could make your servers spin forever that are the real problem.
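
In PostgreSQL terms, for example, both concerns are one-liners (table names borrowed from the snippet above):

    -- read-only access plus a hard cap on query runtime
    CREATE ROLE api_reader NOLOGIN;
    GRANT SELECT ON listen_history, songs TO api_reader;
    ALTER ROLE api_reader SET statement_timeout = '5s';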


What about web browsers?


I'm pretty sure web browsers are sandboxed and that it has taken a significant effort to get there.


If the songs the user has listened to were stored on the client, it would be trivial to run code on the client to compute anything over that data, rather than expecting the original service to execute arbitrary code.

You can move code to the data, or the data to the code.


Except that sometimes you have a phone, a computer, and a smart TV, and syncing the data between all of those is non-trivial. It's easier to treat the server as the source of truth.

Sometimes you want to work with big data banks that you have no hope of ever storing. Your listening history is small enough to fit on your phone. Amazon's whole product directory, with reviews and indexes for quick data retrieval, is not.


Syncing data is an easier problem than what the parent is trying to solve; rsync is one such solution.

If you want to run compute over big data, you replicate the database to a server you own and run compute on a copy of the data. Realistically, that is what Amazon would have to do for you either way: they'd be running replicas you could compute over, and those replicas need to stay in sync via DB replication.

Secure computation over anonymized datasets is useful for security reasons, not efficiency reasons, IMHO.


This was implemented ages ago in the Voyager application server, written in Java. It supported a concept of agents: pieces of code that could be sent to a particular location, execute, and come back with the results. That was way before Java app servers got their standards. I'm not sure what the ultimate fate of this particular tech was.



