

Rewriting pixels to add new features to closed-source software - chaosmachine
http://www.cs.washington.edu/homes/jfogarty/research/prefab/?

======
ximeng
Summary:

UI researchers can't easily add new features designed to make software easy to
use to existing software. This is a particular problem with closed source
software. "Prefab" is a tool that looks at the pixels on the display and
infers what the underlying UI widgets are. Once this is done, additional
features can be retrospectively added. This functionality is platform
independent, and is demonstrated to run on YouTube videos (Flash), Mac, and
PC.

Examples of functionality that can be added:

"Bubble cursor" - highlighting of the nearest UI element to the cursor as if
the hit area of the cursor dynamically increased to the nearest element.

Dynamic mouse acceleration change when cursor on a UI element

Animation to indicate state changes in UI elements such as tabs, sliders and
checkboxes

Generation of a preview of a multi-parameter space in graphics tools such as
GIMP or Photoshop. This works by automatically changing several sliders to
affect parameters in a graphics transform and recording the preview image. The
results are then dynamically displayed allowing users to view the effects of
parameter changes in parallel using a grid of output images.
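
For the bubble cursor item above, here is a minimal sketch of the
nearest-element selection. It is not Prefab's code: treating targets as
circles, the `nearest_target` name, and the sample coordinates are all
assumptions made for illustration.

```python
import math


def nearest_target(cursor, targets):
    """Return the name of the target whose edge is closest to the cursor.

    cursor:  (x, y) position of the pointer.
    targets: dict of name -> (x, y, radius); circular targets are an
             assumption of this sketch, real widgets are usually rectangles.
    """
    cx, cy = cursor

    def edge_distance(target):
        x, y, r = target
        # Distance to the circle's edge; zero when the cursor is inside it.
        return max(0.0, math.hypot(x - cx, y - cy) - r)

    return min(targets, key=lambda name: edge_distance(targets[name]))


# Hypothetical layout: two small buttons on the same row.
targets = {"save": (100, 100, 10), "discard": (200, 100, 10)}
print(nearest_target((120, 100), targets))  # prints "save"
```

The cursor never has to be on the button: whichever element is closest gets
the click, which is also where the ambiguity discussed below comes from.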

------
rauljara
The technology they demonstrate is impressive and clever. That's just a neat
way to improve GUIs. But what they use their method to do (implement bubble
cursor and sticky icons) seems like it would harm usability. With bubble
cursor, the downside is that any accidental click of the mouse (those happen
with touch pads on laptops) is guaranteed to trigger some gui element. Also,
if you are in the midpoint of two gui elements, a tiny, tiny shift in the
mouse could lead to clicking the wrong thing. Which could be very bad when
the "save draft" and "discard" buttons are right next to each other, as in
gmail. The upside is... that I don't have to move my mouse quite as far? But
if not having to move your mouse as far is a good thing, why implement that
sticky elements feature, which guarantees you will have to move your mouse
much farther after moving over an element? Both the bubbles and sticky things
seem designed to make ui elements much easier to click on, but unless you have
muscle control problems (which I suppose some people do, especially the
elderly) I don't think that's generally a good thing. There are too many
buttons that do things that you just can't undo.

~~~
ximeng
I just put together a javascript + canvas demo of bubble cursors based on the
demo in the video. It does make it a lot easier to target small elements.

<http://mwomwo.nfshost.com/bubblecursor/bubbles.html>

I agree with you that stickiness doesn't sound like a great feature. Clicking
the wrong thing also seems like a risk, but that might be better solved by
making buttons not do things that you can't undo.

Bubble cursors seem to me like they could definitely be a win.

~~~
rauljara
Upvote for you, for awesome effort.

But your demo confirms for me all my suspicion about bubble cursor. Hang out
at the midpoint between three bubbles of all different sizes and it just seems
incredibly unintuitive to me which one is highlighted. Yes it makes small
things easier to click on (sometimes much easier than gigantic things), but as
I said above, I don't think that's a good thing because of all the buttons
that have functions that can't be undone. You mentioned making buttons that do
not do things that you can't undo, but how would that work in a save dialogue
box, or when you've just typed up an angry email you never intended to hit
send with?

~~~
ximeng
Thanks.

This story:

<http://news.ycombinator.com/item?id=1235081>

talks a bit about the points you've made. Basically you put a passive delay in
that's long enough for people to cancel. It's not going to work for everything
though - sometimes you really do want to start a process immediately.

The biggest benefit comes when you've got a large bit of empty space on one
side of an object. That space then becomes clickable. In a world with big
monitors it's harder to get easy UI wins by putting important widgets on the
side of the screen. You could get a similar effect with these bubble cursors.

~~~
nwinter
I just tried limiting the max bubble cursor radius in your demo and it seemed
to prevent the unwanted gap-clicking problem well enough, while still helping
buttons be easier to click. Not sure if it reduces the win in the proposed
huge-screen example.
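
A rough sketch of that cap, assuming circular targets and a hypothetical
`pick_with_cap` helper (this is not the code from the demo): the nearest
target is only selected when its edge is within `max_radius` of the cursor.

```python
import math


def pick_with_cap(cursor, targets, max_radius):
    """Return the nearest target's name, or None if it is too far away.

    targets: dict of name -> (x, y, radius), circles assumed for simplicity.
    max_radius: largest distance the bubble is allowed to grow to.
    """
    cx, cy = cursor
    best_name, best_dist = None, math.inf
    for name, (x, y, r) in targets.items():
        dist = max(0.0, math.hypot(x - cx, y - cy) - r)
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Beyond the cap the click falls through to whatever is under the cursor.
    return best_name if best_dist <= max_radius else None


targets = {"ok": (0, 0, 5)}
print(pick_with_cap((10, 0), targets, max_radius=20))   # prints "ok"
print(pick_with_cap((100, 0), targets, max_radius=20))  # prints "None"
```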

~~~
ximeng
On my local copy I put in something so that it looks at the second closest
object to the cursor. Then only selects if the nearest object is 40% closer
than the second nearest. It works pretty well to prevent ambiguity, but still
gets a bit confusing when there's a few objects of different sizes nearby.
Quite want to try this functionality in a real user-interface now!
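
Roughly, the rule looks like the sketch below. This is a reconstruction, not
the local copy itself: it assumes circular targets and reads "40% closer" as
"nearest distance is at most 60% of the second-nearest distance". The
`pick_unambiguous` name and `margin` parameter are made up for the example.

```python
import math


def pick_unambiguous(cursor, targets, margin=0.4):
    """Select the nearest target only if it clearly beats the runner-up.

    targets: dict of name -> (x, y, radius), circles assumed.
    margin:  0.4 means the nearest must be 40% closer than the second nearest.
    """
    cx, cy = cursor
    dists = sorted(
        (max(0.0, math.hypot(x - cx, y - cy) - r), name)
        for name, (x, y, r) in targets.items()
    )
    if not dists:
        return None
    if len(dists) == 1:
        return dists[0][1]
    (d1, name1), (d2, _) = dists[0], dists[1]
    # Ambiguous positions (near the midpoint) select nothing.
    return name1 if d1 <= (1.0 - margin) * d2 else None


targets = {"a": (0, 0, 0), "b": (100, 0, 0)}
print(pick_unambiguous((20, 0), targets))  # prints "a"
print(pick_unambiguous((50, 0), targets))  # prints "None"
```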

------
cubicle67
Perhaps it's because it's late and I'm tired and a bit grumpy, but I can't see
the benefit at all here. Only downsides.

1\. I find the whole idea of teaching a user to accept another programme
pretending to be the one they're running, and intercepting all of its inputs,
dangerous.

2\. The bubble cursor would be annoying and frustrating, far more often than
it would be useful. Have another look at that video, and see just how far the
mouse is, often, from the target it's selecting. Especially when it's
sometimes over the text of another question. What about cases where you don't
want to click _anything_ , you just want to bring that window to the front, or
remove focus from the flash video so you can press space to scroll (not
pause)?

3\. The slowing of the cursor over controls looks the most useful, but would
make navigation of your application interface like playing one of those games
where your character keeps getting stuck in puddles of honey carelessly left
lying about, and slowing dramatically. It'd turn moving your cursor into a
game of dodge the gravity wells. Now picture your Mum accidentally moving her
mouse into the centre of those six rows of toolbars she has in IE. She'll
never be able to get it out again

~~~
Qz
2\. The example is a conceptual example, the points you raise can be fixed
relatively easily. You can enable deadspace by requiring the mouse to be at
least X pixels from a potential target.

3\. You could simply turn off sticky controls when the mouse is moving
quickly. Target acquisition generally has 2 phases, a high-speed movement
followed by a deceleration phase near the target. Just enable the sticky
controls once the mouse speed drops below the appropriate threshold value.
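
A minimal sketch of the speed gate in point 3. The `cursor_gain` name, the
300 px/s threshold, and the 0.4 slowdown factor are assumptions picked for
illustration, not values from the paper or video.

```python
def cursor_gain(speed, over_target, speed_threshold=300.0, sticky_gain=0.4):
    """Return the multiplier applied to raw mouse movement.

    speed:        current cursor speed in pixels per second.
    over_target:  True when the cursor is on a UI element.
    Stickiness only kicks in during the slow, final approach phase.
    """
    if over_target and speed < speed_threshold:
        return sticky_gain  # slow the cursor down over the control
    return 1.0              # fast sweeps pass over controls unaffected


print(cursor_gain(1000.0, over_target=True))   # prints 1.0
print(cursor_gain(50.0, over_target=True))     # prints 0.4
print(cursor_gain(50.0, over_target=False))    # prints 1.0
```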

------
Zak
Ok, this thing is cool, and they wrote a paper. Where do I get the code? It's
a bit ironic to use the tagline "What if every GUI were open source" and not
link to the code.

~~~
eob
You could always email the author. Often times academic code is slow to be
released because the types of things that tend to be thought of as required
for software release (documentation, install help, bug fixes, coherent code
organization) simply aren't there. The reward system set up by academia
doesn't place value on those types of things, so it takes a long time to get
them done in your free time. The result is people are hesitant to release
their code because they know it will likely be difficult to put to "real" use
without heavy refactoring.

Just my 2c.

------
rapind
I think this is pretty cool. Somewhat similar tools have been used to cheat in
MMOs where a library would recognize and interpret the pixels and you'd write
scripts to manipulate the underlying platform.

This is actually extremely flexible and could be used not just to _decorate_ a
user interface, but to provide a programmatic /scripted/ interface into an
application that doesn't traditionally provide an API.
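
As a toy illustration of the idea (not how Prefab or any of those MMO tools
actually work), the sketch below does exact pixel matching of a small widget
image against a screen buffer. The `find_template` name and the tiny integer
"pixels" are assumptions; real tools need to tolerate scaling, themes and
anti-aliasing.

```python
def find_template(screen, template):
    """Return (row, col) positions where template exactly matches screen.

    screen, template: non-empty 2D lists of pixel values (rows of equal length).
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            # Compare the template row by row against this screen window.
            if all(screen[r + i][c:c + tw] == template[i] for i in range(th)):
                hits.append((r, c))
    return hits


screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
button = [[1, 2], [3, 4]]  # hypothetical 2x2 "button" bitmap
print(find_template(screen, button))  # prints [(1, 1)]
```

Once a widget's position is known, a script can send clicks to it, which is
what turns pixel recognition into an ad hoc API.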

------
obiefernandez
Impressive work which I could definitely foresee being used by malicious code
to seamlessly hijack existing GUIs. The security implications are spine-
chilling.

~~~
yangyang
I'm quite sure that malicious code has been written to do similar things
already. I think what they've done with the technique is of interest here, not
the technique (reading pixels from the screen and intercepting the mouse /
keyboard).

~~~
fhars
It is the default method to scrape on-screen keyboards meant to prevent
phishing attacks.

------
emanuer
I am drooling over the idea of my webcam tracking my eye movement and
combining it with the power of the bubble cursor. Just imagine, you would not
need a mouse anymore, all the problems of touch-screens are solved. You would
just have your cursor wherever you are looking.

~~~
Qz
blink twice to click?

~~~
emanuer
blink 3 times for right click ;-)

------
siculars
Very cool research. Not exactly the same, but somewhat similar to what the
sikuli group at mit (<http://groups.csail.mit.edu/uid/sikuli/>) is doing
regarding visual programming. They can take their code, but they can not take
their output.

The future is bright for post processing programs that modify one program's
output in some way. Note greasemonkey that modifies html/dom within a browser,
this research which modifies drawn pixels, sikuli which does programmatic
image recognition and the new javascript audio research within mozilla which
allows one to create and record audio within the browser
(<http://vocamus.net/dave/?p=974>). How will music labels react when it
becomes easy to record audio output to mp3 from html5 video within the browser
via a javascript library?

------
rbanffy
Interesting.

But it's only multi-platform in the sense that it can recognize the underlying
widgets. I am also concerned about the added complexity of bolting a layer of
behaviour on top of software that's unaware of it and was not designed to take
it into account.

I also wonder if it is a coincidence that research focused on adding a layer
of complexity to closed-source software is from Washington.

------
robryan
It seems like there is an element of this being a means to an end, I don't
think they really want everyone to use this. Rather widen what researchers in
their field can work on and persuade the companies that make this software to
include solid UI research down the track.

------
meese_
A bit different from this, but OS X has allowed the editing of nib interface
files in any application for years (though that ability seems to have
unfortunately gone away for the most part in Snow Leopard, where most nibs are
compressed to save space).

------
jheriko
Some quite cool ideas... the image-processing-based approach is a clever way
to avoid the platform-specific nature of UI and the fact that lots of apps do
their own thing, which renders APIs like EnumWindows useless for stuff like
this.

------
wallflower
This reminds me of MIT's research initiative to automate any GUI using
screenshots

<http://news.ycombinator.com/item?id=1072710>

------
daleharvey
it's really interesting to see ui developments take more of a focus recently,
this looks like some really hefty work and it's great to see it in action on
applications I use every day, I might have to take some time to see what can
be done in javascript, the expanding cursor is vaguely familiar but looks like
a great idea.

------
elblanco
Yes, yes, yes, to all of these GUI ideas in the next version of whatever GUI
toolkit I'm using.

------
m0th87
Genuine question: is it breaking HN guidelines to post a duplicate by adding a
question mark to the end of the URL, as in this case? I did think this
deserved more attention than the original post's low score.

<http://news.ycombinator.com/item?id=1233669>

------
bitwize
That's cool. It's like Greasemonkey for your UI toolkit.

