
On the Windows accessibility team where I work, we've done something like this for the UI Automation API, not using WebAssembly but using our own bytecode. We call it Remote Operations. The idea is that certain parts of the UI Automation API require a very large number of cross-process requests (i.e. from the assistive technology to the application process) to do anything non-trivial. So with Remote Operations, the client (assistive technology) sends a little program to the provider (application) to do the API calls.
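A toy sketch of why that helps, assuming a made-up `FakeProvider` that only counts requests. None of these names come from UI Automation, and a Python callable stands in for the bytecode program:

```python
# Toy model only: FakeProvider and its methods are hypothetical,
# not part of the UI Automation API.
class FakeProvider:
    """Stands in for the application process and counts cross-process requests."""

    def __init__(self, tree):
        self.tree = tree          # node id -> (name, list of child ids)
        self.round_trips = 0

    # Chatty style: every property read is its own request.
    def get_name(self, node):
        self.round_trips += 1
        return self.tree[node][0]

    def get_children(self, node):
        self.round_trips += 1
        return list(self.tree[node][1])

    # Remote-operation style: one request carries a whole program.
    def run_program(self, program):
        self.round_trips += 1
        return program(self.tree)


def collect_names_chatty(provider, root):
    """Walk the tree from the client side, one request per property."""
    names, stack = [], [root]
    while stack:
        node = stack.pop()
        names.append(provider.get_name(node))
        stack.extend(reversed(provider.get_children(node)))
    return names


def names_program(root):
    """Build a 'program' that does the same walk inside the provider."""
    def program(tree):
        names, stack = [], [root]
        while stack:
            node = stack.pop()
            name, children = tree[node]
            names.append(name)
            stack.extend(reversed(children))
        return names
    return program


TREE = {
    "root": ("Window", ["a", "b"]),
    "a": ("Button", []),
    "b": ("List", ["c"]),
    "c": ("Item", []),
}

chatty = FakeProvider(TREE)
chatty_names = collect_names_chatty(chatty, "root")

batched = FakeProvider(TREE)
batched_names = batched.run_program(names_program("root"))
```

Both walks return the same names, but the chatty one costs two requests per node while the batched one costs a single request for the whole tree.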

This feature hasn't yet made it into a stable Windows release, but it's available in Insider builds, and the high-level client libraries and accompanying tests are on GitHub here:


Disclaimer: Remote Operations is still a work in progress. There's no public documentation yet aside from the README in the GitHub repo, and the client libraries aren't yet packaged for easy consumption. Moreover, this is not an official Microsoft announcement; I'm just bringing it up on this thread because it's relevant and I'm proud of this feature that I helped develop.

As far as I remember, NVDA either does or used to do something similar. They were injecting C++ code into other processes so that it could quickly query their APIs and return just the data NVDA needed. I never really delved into that part of the codebase, so I might be wrong here.

As an aside, how do you produce the bytecode you use? Do you write quasi-assembly by hand? Did you develop your own compiler? For what language?

Will other, non-Microsoft assistive technologies, like NVDA for example, be able to use it?

You're right; all serious third-party screen readers for Windows currently inject code into the application processes. In particular, they all use this technique to efficiently traverse browser DOMs and Office object models. The point of Remote Operations is to provide a way to efficiently get the equivalent information through UI Automation without the risks (in security and robustness) of injecting native code in-process.

As for how the bytecode is built, the GitHub repository I linked has a library with a WinRT API for building the bytecode at run time, by calling methods that correspond to the individual opcodes. It's an object-oriented API, so there's a class for each type of operand. And for control flow blocks (e.g. if-else and while loops), the method takes a WinRT delegate (basically a lambda) that builds the body of the block. You can see how it works in the functional tests; stay tuned for actual sample code.
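To make the builder pattern concrete, here is a minimal sketch in Python rather than WinRT. The `Builder` and `Operand` classes, the opcode names, and the tiny interpreter are all invented for illustration and are not the library's actual API; the only thing taken from the description above is the shape: one method call per opcode, operand objects, and a callback that builds the body of a control flow block.

```python
# Hypothetical builder: names and opcodes are assumptions, not the real API.
class Operand:
    """Handle to a virtual register in the program being built."""

    def __init__(self, builder, reg):
        self.builder, self.reg = builder, reg

    def add(self, other):
        out = self.builder.new_operand()
        self.builder.emit(("add", out.reg, self.reg, other.reg))
        return out

    def less_than(self, other):
        out = self.builder.new_operand()
        self.builder.emit(("lt", out.reg, self.reg, other.reg))
        return out

    def set(self, other):
        self.builder.emit(("mov", self.reg, other.reg))


class Builder:
    def __init__(self):
        self.code = []
        self.reg_count = 0

    def new_operand(self):
        operand = Operand(self, self.reg_count)
        self.reg_count += 1
        return operand

    def emit(self, instr):
        self.code.append(instr)

    def const(self, value):
        out = self.new_operand()
        self.emit(("const", out.reg, value))
        return out

    def while_block(self, build_condition, build_body):
        """Both arguments are callbacks that emit instructions when called."""
        start = len(self.code)
        cond = build_condition()
        exit_jump = len(self.code)
        self.emit(None)                       # placeholder, patched below
        build_body()
        self.emit(("jmp", start))
        self.code[exit_jump] = ("jmp_if_false", cond.reg, len(self.code))


def run(code, reg_count):
    """Tiny interpreter standing in for the provider-side executor."""
    regs, pc = [None] * reg_count, 0
    while pc < len(code):
        op = code[pc]
        if op[0] == "const":
            regs[op[1]] = op[2]
        elif op[0] == "add":
            regs[op[1]] = regs[op[2]] + regs[op[3]]
        elif op[0] == "lt":
            regs[op[1]] = regs[op[2]] < regs[op[3]]
        elif op[0] == "mov":
            regs[op[1]] = regs[op[2]]
        elif op[0] == "jmp":
            pc = op[1]
            continue
        elif op[0] == "jmp_if_false" and not regs[op[1]]:
            pc = op[2]
            continue
        pc += 1
    return regs


# Build "total = 0 + 1 + 2 + 3 + 4" with a while loop.
b = Builder()
i, total, one, limit = b.const(0), b.const(0), b.const(1), b.const(5)


def loop_body():
    total.set(total.add(i))
    i.set(i.add(one))


b.while_block(lambda: i.less_than(limit), loop_body)
result = run(b.code, b.reg_count)[total.reg]
```

Note that the loop body runs once at build time to record instructions; the actual looping happens later, when the recorded program is executed.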

Do you think it could be useful to wrap this in a LINQ provider for C# usage?
