It would simply take a browser plugin for every type of browser. NewSDK code would sit inside HTML comments, just as conditional comments do. The browser plugin would parse this code and cache it (in effect, compile it). From there, it would be trivial to implement image copy and paste.
Keep in mind that I have no idea what the limitations of a browser plugin are, but if one can:
1) parse the raw source code that's sent down from the server, including comments
2) manipulate the DOM
3) open an outgoing connection
then you could do, well, anything. Embed Ruby into pages, for example. You can do that because the Ruby interpreter doesn't have to be part of your browser plugin. All you have to do is convince users to download your C++ app that DOES contain a Ruby interpreter and runs in the background; your plugin then communicates with it over a socket. Your app issues commands back to your plugin (you can think of those commands as assembly instructions), and those commands manipulate the DOM or do whatever else a plugin can do.
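Here's a minimal sketch of that plugin/helper split, written in Python rather than C++ just to keep it short. Everything in it is an assumption for illustration: the JSON-per-line wire format, the `set_text` command, and the field names are invented, and uppercasing the input stands in for handing the code to a real embedded interpreter. It uses `socket.socketpair()` so both ends can run in one process; a real helper would listen on a local port instead.

```python
import json
import socket
import threading


def helper_serve(conn):
    """Background app: read one JSON request per line, reply with DOM commands."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            request = json.loads(line)
            # A real helper would run request["code"] in its interpreter.
            # Uppercasing is only a placeholder for that step.
            commands = [{
                "op": "set_text",  # hypothetical DOM instruction
                "selector": request["target"],
                "value": request["code"].upper(),
            }]
            writer.write(json.dumps(commands) + "\n")
            writer.flush()


def plugin_request(conn, code, target):
    """Plugin side: send embedded code, get back the commands to apply."""
    with conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        writer.write(json.dumps({"code": code, "target": target}) + "\n")
        writer.flush()
        return json.loads(reader.readline())


def demo():
    plugin_sock, helper_sock = socket.socketpair()
    worker = threading.Thread(target=helper_serve, args=(helper_sock,), daemon=True)
    worker.start()
    with plugin_sock:
        reply = plugin_request(plugin_sock, "hello", "#greeting")
    worker.join(timeout=1)
    return reply


print(demo())
```

The plugin never needs to understand the embedded language. It only has to ship the source out and apply whatever small set of DOM instructions comes back.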
Hmm... They meet the requirements to an extent, but they're really little islands of functionality. I'm talking about a plugin that can react to and manipulate HTML elements, persist across page loads, and work without browser-specific quirks.
I remember hearing about someone making a 'desktop' out of Firefox. That would be the perfect way to implement some of the extra functionality he's talking about. Just run a browser as your window manager. Then you could build your interpreter into that and copy and paste between web applications as much as you want.