The side-effect that destroyed touch-screens as a mainstream input technology despite a promising start in the early 1980s. It seems the designers of all those spiffy touch-menu systems failed to notice that humans aren't designed to hold their arms in front of their faces making small motions. After more than a very few selections, the arm begins to feel sore, cramped, and oversized — the operator looks like a gorilla while using the touch screen and feels like one afterwards. This is now considered a classic cautionary tale to human-factors designers; “Remember the gorilla arm!” is shorthand for “How is this going to fly in real use?”.
Well, humans aren't designed, but we did evolve holding our arms in front of our faces making small motions.
I know it's hard to remember around here, but it's not just a UI buzzword.
(Which is why I see promise in this tech; using touchscreens is unnatural, but using gestures in front of your body to communicate with something a couple of feet away is quite human.)
Maybe that's the solution to gestural interfaces - quit with the in-front-of-my-face stuff, and go towards more sustainable movements, like we already have with keyboards and mice; everything at elbow height.
I'm not envisioning it as a primary input device, naturally (although the deaf seem to get by fine). But I don't think I've ever been in a situation where a heated conversation made my arms tired; as a replacement for a few basic mouse tasks, I bet it could even improve ergonomics.
Just try it: raising your arms as if gesturing conversationally is way, way less stressful than reaching out to touch your monitor.
Also, the gorilla arm would go away fairly quickly. This used to make me tired:
Now it doesn't. I expect people would get used to Minority Report interfaces, particularly if they didn't immediately try to transition from 8 hours of keyboard to 8 hours of Minority Report.
My real question: is source code for this demo available?
I work in construction; I spend my days holding 5 lb drills in front of me. Using the standard gesture position (forearm down), your arm won't get tired during a day. I basically live my day in the 'gesture position', often for well over 8 hours a day, always with weight in hand.
It's when the upper arm is lifted that your arms get not so much tired as literally drained, which is what I think the Kinect will likely have to adapt to.
I saw him at a small conference recently talking about the future of these types of interfaces, and I think where he's going is trying to make much smaller movements meaningful. One example was having an EPG on your TV that you could navigate by just flicking your hand. He's looking at ways that smaller, more natural movements can be used for controlling a lot of the interfaces around you.
(behind a paywall, unfortunately) http://www.wired.co.uk/magazine/archive/2010/03/start/dale-h...
That means you can just work normally on your computer, until that moment comes where you want to sit back comfortably and... browse snapshots and rotate them.
I predict there will also be much more interesting input assistance beyond photo browsing. I'm using a Magic Trackpad in addition to mouse/keyboard for a bunch of gestures (window management). However, the number of distinct gestures that can be performed on a flat surface is fairly limited.
I'll gladly take the additional gestures that a Kinect-style device will give me. It could very well revolutionize the way we interact with window managers without forcing us to grow gorilla arms.
I think gesture-based interfaces may have some quick utility, like remotely operating equipment for small sequences of activities (gestures operating an Asimo come to mind, since it doesn't yet have a real input system like a mouse and keyboard -- come here, go there, stop, move aside, etc.). But I've yet to see a demo of this interface method, in this context, that isn't wildly less efficient than existing input methods.
You cannot, however, use a mouse + keyboard faster than performing very small gestures mainly with fingers (as in, raise your hands from the keyboard and perform similar gestures).
I'm an amateur photographer and after a week of taking photos I have thousands of photos I need to vet. This kind of gesture-based interface would take me until the universe grows cold to go through a single week's worth of shots.
> Sure, even if a less-sophisticated computer user (such as a cop) might not.
I'm not sure how learning a set of rather specific gestures is less complicated to a novice user than hitting the right arrow key or the spacebar a bunch of times (or clicking on a button labeled "next photo"). I've tried to use gesture based systems a number of times in the past and have always ended up just turning that feature off, mostly because I couldn't remember which one of a dozen gestures meant the particular thing I wanted to do.
The idea is a nice one: use something that humans do all the time to build an intuitive interface. But the current state of the art doesn't understand normal human gestures very well. Why does two fingers vertically up mean "grab this" and not some other random gesture? In the end the problem is that we're trying to use arbitrary physical gestures, intended for a physical, 3-dimensional world, to interact with virtual objects in a 2-dimensional window, in ways that might not have any particular meaning to us as a form of normal gesturing (let alone culture-specific gestures; the number of different ways people perform fairly universal gestures, like pointing at something, is mind-boggling).
That's not to say there isn't any value in this work; I just think the applications these interfaces are currently being designed for simply don't make any sense.
Somebody else here mentioned that it might make a great deal of sense in 3D modeling, since mouse-and-keyboard combinations are rather clumsy in that application. I happen to think using gestures as a control mechanism for robotics or machine operation might make even more sense (imagine controlling a crane remotely by making appropriate hand open/close, arm up/down gestures).
> I'm not sure how learning a set of rather specific gestures is less complicated to a novice user than hitting the right arrow key or the spacebar a bunch of times (or clicking on a button labeled "next photo").
They are just not as good with traditional computer input methods, negating the advantage that you or I would gain there.
The point about 3D is very good, and something that can be workable even with 2D displays (in a "3D" environment, like your average FPS). Rotating, pulling things from "behind" something else etc. For certain categories of use, it is probably better than voice commands - another natural UI for humans - in the same way some of us still prefer the CLI over a GUI.
The current technology obviously is not quite there, but it is already possible to implement a virtual keyboard by observing finger movements. Combine that with gestures performed above the virtual surface, and you might have something.
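As a rough illustration of that idea, here is a minimal sketch of the press-detection step for such a virtual keyboard. The tracker API is entirely hypothetical: a fingertip is reduced to an (x, y, z) tuple in millimetres, with z being the height above the virtual typing surface, and the key layout is an assumed toy data structure.

```python
# Hedged sketch: mapping a tracked fingertip to a virtual key press.
# All coordinates and the layout format are illustrative assumptions,
# not any real tracker's API.

PRESS_DEPTH_MM = 5.0  # fingertip must come within 5 mm of the surface


def key_for_fingertip(tip, layout, press_depth=PRESS_DEPTH_MM):
    """Return the key under `tip` if it is pressed, else None.

    `layout` maps key names to (x0, y0, x1, y1) rectangles on the
    virtual surface, in the same units as the fingertip coordinates.
    """
    x, y, z = tip
    if z > press_depth:  # hovering above the surface, not pressing
        return None
    for key, (x0, y0, x1, y1) in layout.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return key
    return None  # pressed, but between keys


# A toy two-key layout for illustration.
layout = {
    "A": (0.0, 0.0, 19.0, 19.0),
    "B": (19.0, 0.0, 38.0, 19.0),
}

print(key_for_fingertip((5.0, 5.0, 2.0), layout))    # pressed inside "A"
print(key_for_fingertip((25.0, 5.0, 40.0), layout))  # hovering: None
```

Gestures performed above the press threshold would then be free to carry other meanings (swipes, zooms) without colliding with typing.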
Since I don't own a holographic projector I just posted some images and videos of recent relevant Kinect hacks on my blog http://www.pmura.com/blog/2010/11/the-underestimated-power-o...
The key here is not this precise application but the technology that it implies.
Even if there are some bugs and issues with the interface now, imagine that in just a few years this will be mainstream :)
As a proof of concept this kind of interface is fine but for real use it's slow and unresponsive.
I don't know for a fact that Minority Report was the first conceptualization of that specific interface, though.
Just from waggling my hands around a little in front of my monitor, I feel like the touchscreen fatigue Apple made a big deal of isn't as much of an issue, since you're holding your arms upright, which has much more structural support than having your arm outstretched to touch the monitor. It actually feels like a very natural way to control virtual desktops or application switching. Additionally, this is a peripheral that can be added to any display of any size, or even used across multiple displays.
Is there a reason this tech hasn't gotten traction as a PC peripheral before Kinect?
Is it really so hard to imagine a task for which gesturing is appropriate?
Yes. Can you think of some examples? I can't think of a single task I'd like to do on a desktop computer that wouldn't be easier and more-precise with a mouse. You know what's easier than pinch-zoom? A scroll-wheel. Multi-touch is great for phones/tablets, but this seems pretty silly to me as an interface for, y'know, computing.
Currently we mostly just have mouse and keyboard, so it's pretty much expected that the majority of present applications are optimized to work best with these inputs.
Once we have more expressive inputs, new types of applications will start to pop up.
For example, I often work with 3D. For such applications, mouse + keyboard controls feel very clunky. Even simple tasks like setting up cameras or placing objects in the scene are a pain.
It would be much more natural to manipulate 3d objects using gestures in 3d space.
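The mapping from hand motion to object rotation can be sketched very simply. This is a hypothetical example, assuming a tracker that delivers hand positions as (x, y) samples in metres: a horizontal sweep yaws the object, a vertical one pitches it, with a tunable gain.

```python
import math

# Hedged sketch: turning tracked hand motion into object rotation.
# The hand samples and the gain constant are illustrative assumptions.

RADIANS_PER_METRE = math.pi  # half a metre of hand travel = 90 degrees


def rotate_from_hand_delta(orientation, prev_hand, cur_hand,
                           gain=RADIANS_PER_METRE):
    """Return (yaw, pitch) updated by the hand's movement between frames."""
    yaw, pitch = orientation
    dx = cur_hand[0] - prev_hand[0]  # horizontal hand travel -> yaw
    dy = cur_hand[1] - prev_hand[1]  # vertical hand travel -> pitch
    return (yaw + dx * gain, pitch + dy * gain)


# Sweep the hand 0.5 m to the right: the object yaws 90 degrees.
yaw, pitch = rotate_from_hand_delta((0.0, 0.0), (0.0, 0.0), (0.5, 0.0))
print(round(math.degrees(yaw)))  # 90
```

Applying the delta per frame rather than mapping absolute hand position to absolute orientation lets the user "ratchet": release, reposition the hand, and grab again, much like lifting a mouse.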
Let's say I'm coding on the laptop, and I want to scroll the docs that are up on the monitor. Three things need to happen here: 1) focus on the docs, 2) scroll, and 3) focus back on my work. My goal of course is to do this as swiftly and effortlessly as possible.
So, using the keyboard: I alt-tab from editor to browser, hit the space bar to scroll down, alt-tab back.
Alt-tabbing is more of a hassle the more programs I have open. Additionally, the scrolling is either a lot (page down) or a little (down arrow); usually what I want is something in between. It could be made easier if I had a specific keystroke set up to switch to Chrome, or to scroll by x amount, but that's far from intuitive.
Using the mouse: I move my hand down to the trackpad, or over to the mouse. Wiggle it around briefly to locate the cursor, then drag it across the screen to the second monitor. Use two-finger scrolling or the mouse wheel to scroll to where I want to. On OS X I blessedly don't have to click to focus scrolling, so I could start typing immediately, but to avoid confusion (and because I want to scroll my code) I need to drag the pointer back.
This provides a much better scrolling experience, but the process of moving focus across monitors with a pointing device is a huge drag.
Using a hypothetical desktop gesture reader: I lift my hand in front of the second monitor. This focuses on the window. I draw two fingers down (or up) to scroll. I drop my hand back to the keyboard and begin typing immediately.
My elbow stays on the table, so it's quite comfortable. I have excellent granularity in scrolling - not as good as the mouse wheel, since I'm not actually touching the screen, but probably about as good as two finger scrolling. Most helpfully, the task of focusing on the browser and back is about as effortless and intuitive as it possibly could be without reading my mind.
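The hypothetical gesture reader described above boils down to a tiny state machine: hand presence in front of a display changes focus, and a two-finger drag while focused emits scroll events. Here is a minimal sketch under that assumption; the frame format (monitor id or None, drag delta) is invented for illustration, not any real device's API.

```python
# Hedged sketch of the hypothetical desktop gesture reader: raising a hand
# in front of a monitor focuses it, a two-finger drag scrolls, and dropping
# the hand (monitor = None) returns focus to wherever the keyboard is.


def gesture_events(frames):
    """Translate raw tracker frames into focus/scroll events.

    Each frame is (monitor, drag_dy): `monitor` is the display the hand is
    in front of (None if no hand is raised), `drag_dy` the two-finger drag
    delta in pixels for that frame.
    """
    events = []
    focused = None
    for monitor, drag_dy in frames:
        if monitor != focused:               # hand moved to a new display
            focused = monitor
            events.append(("focus", monitor))
        if monitor is not None and drag_dy:  # drag while a display is focused
            events.append(("scroll", drag_dy))
    return events


# Lift a hand before monitor 2, drag down twice, drop back to the keyboard.
frames = [(2, 0), (2, -30), (2, -30), (None, 0)]
print(gesture_events(frames))
# [('focus', 2), ('scroll', -30), ('scroll', -30), ('focus', None)]
```

Because focus follows the hand and reverts the moment it drops, the alt-tab round trip from the earlier keyboard workflow disappears entirely.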
Coming back and using a desktop PC with a mouse felt very strange - so I'm now far more willing to consider that approaches like this might be feasible after all.
I could see this on a desk, like MS Surface.
Very nice of Microsoft to back down on the anti-hacking threats, this is great for everyone.
I wonder how long until someone hacks it to help certain disabilities.