The article focuses on "everyday object" manipulation, but he's right about technology too: there is a wealth of common HCI tools that glass cannot accommodate.
- The textual keyboard remains one of the fastest methods of text entry. It can be used without looking, offers high bandwidth, affords both serial and chorded inputs, and works well for precise navigation in discrete spaces, like text, spreadsheets, sets of objects like forms, layers, flowcharts, etc.
- MIDI keyboards are comparable, but trade discrete bandwidth for the expressiveness of pressure modulation.
- The joystick (and associated interfaces like wheels, pedals, etc.) is an excellent tool for orienting. These devices can also offer precise haptic feedback through vibration and resistance.
- The stylus is an unparalleled instrument for HCI operations involving continuous two-dimensional spaces. It takes advantage of fine dexterity in a way that mice cannot, offering position, pressure (or simply contact), altitude, angle, and tip discrimination.
- Trackballs and mice are excellent tools for analogue positional input with widely varying velocities. You can seek both finely and rapidly, taking advantage of varying grips. Trackballs offer the added tactile benefits of inertia and operating on an infinite substrate.
- Dials, wheels. A well-made dial is almost always faster and more precise than up-down digital controls. Dials offer instant visual feedback, precise tuning, spatial discrimination, and variable velocities; they can be used without looking and can be adapted for multiple resolutions.
- Sliders. They offer many of the advantages of dials--smooth control with feedback, usable without looking--but in a linear space. They trade an infinite domain for linear manipulation/display and easier layout and use in flat or crowded orientations.
And these are just some of the popular ones. You've got VR headsets for immersive 3D audio and video, haptic gloves or suits, sometimes with cabling for precise pressure and force vector feedback, variable-attitude simulators, etc. There are weirder options as well--implanted magnets or electrode arrays to simulate vision, hearing, heat, taste, etc...
Dedicated interfaces can perform far better at specific tasks, but glass interfaces offer reconfigurability at low cost. That's why sound engineers have physical mixer boards, writers are using pens or keyboards, artists are using Wacom tablets, nuclear physicists are staring at fine-tuning knobs, and motorcyclists are steering with bars, grips, and body positioning; but everyday people are enjoying using their iPads to perform similar tasks.
Glass isn't going to wipe out physical interfaces; it's just a flexible tool in an expanding space of interaction techniques. More and more devices, I predict, will incorporate multitouch displays alongside dedicated hardware to solve problems in a balanced way.
I HATE, HATE, HATE the way more and more cars are doing away with the dial. Dials are awesome. Especially in cars.
I can operate the dial while driving and not risk killing someone. Replacing dial controls with flush push buttons, or worse, consolidated touchscreens that control virtually everything, means either I don't get any control of my radio/AC/etc. when I am moving, or I risk running someone down because I'm too preoccupied dealing with the shitty no-affordance monstrosity of a UI you replaced a perfectly great thing with.
Same on laptops. If I'm in a coffee shop and forgot to turn the volume down, and suddenly some video starts playing at full volume, I want to be able to quickly turn a physical dial down to zero, which would take me under 0.1 seconds, not press fn+F11 ten times, which takes almost a second.
> A well-made dial is almost always faster and more precise than up-down digital controls.
Eight years ago I bought a microwave oven for my apartment that had a digital knob. It's a physical knob hooked to the timer, but since it's digital, it accelerates. Below one minute, each "notch" increases the time by 5 seconds, but as you go higher, each notch adds more and more time to the total until it starts adding 5 minutes per notch.
It's a fantastic input method for setting a timer since it's tactile - you feel each notch where the time changes, it's deterministic - 30 seconds is always a quarter of a turn, 2 minutes is always a bit more than one turn, 7 minutes is always a bit more than two turns, etc, and it's superior to an analogue timer - it accelerates, you get more precision in the lower ranges and less precision in the higher ranges.
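To make the accelerating-knob idea concrete, here is a rough Python sketch of how such a timer might map notches to seconds. Only the two endpoints (5 seconds per notch below one minute, eventually 5 minutes per notch) come from the description above; the breakpoints in between and the names `STEP_SCHEDULE`, `step_for`, and `time_after_notches` are made up for illustration, so this will not reproduce the exact turn counts of the real microwave.

```python
# Hypothetical step schedule: (upper bound in seconds, seconds added per notch).
# Only the 5-second and 5-minute ends are from the description; the middle
# breakpoints are invented for illustration.
STEP_SCHEDULE = [
    (60, 5),
    (300, 15),
    (600, 30),
    (1800, 60),
    (float("inf"), 300),
]


def step_for(seconds):
    """Return how many seconds the next notch adds at the current setting."""
    for upper_bound, step in STEP_SCHEDULE:
        if seconds < upper_bound:
            return step
    raise ValueError("unreachable: the last bound is infinite")


def time_after_notches(notches):
    """Total seconds on the timer after turning `notches` clicks up from zero."""
    seconds = 0
    for _ in range(notches):
        seconds += step_for(seconds)
    return seconds
```

Because the result depends only on the notch count, the same amount of rotation always lands on the same time, which is the deterministic feel described above; the growing step size is what gives fine control at the low end and coarse control at the high end.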
A few years ago I moved and had to buy a new microwave. Except I couldn't find one with a digital dial; all manufacturers had switched back to the shitty input method of +/- buttons again because... I don't know. Fashion?
It makes me furious when people and companies make interfaces that are clearly inferior to existing alternatives for no good reason at all. But most consumers don't care, and here we are. :-/
It seems to me that the touchscreen, because of the reconfigurability you point out, is the lowest common denominator of input. It makes a lot of sense for a highly portable device, because you can pack many configurations into a small device.
I guess what I lament is the richness or input resolution you give up. And if general computing trends more toward this lowest common denominator, we're faced with an input paradigm that's impoverished in every vector.
Which, as I've stated before, only really affects the input part of it, and not the consumption part.
You're absolutely right: it is a spatial resolution problem. There are only so many distinct targets for a finger on a given surface.
Part of the reason we're seeing so many finger-oriented glass interfaces is that it lets us sweep the resolution issue under the rug: capacitive sensing just isn't that good yet. However, I expect that in the next twenty years we'll be able to offer affordable glass with both display and touch resolution equivalent to ink/paper. The stylus could conceivably make a return; it'd be the natural successor to Wacom's devices, and could conceivably play a role in replacing scratch paper. We still have to solve the texture problem, but I doubt that's insurmountable.
I'd also be willing to bet on digital whiteboards (or tabletops). They suck currently, but there's no reason we can't have large-format, medium-resolution collaborative writing surfaces. It's an established workflow for corporate meetings, theoretical physics, software planning etc., and has obvious extensions: copy/paste, save-restore, colorize, resize/move, etc.
Your vision has me imagining a future where tiny stylus tips are implanted into the tips of our index fingers. Like a little nubbin that increases the pointing precision of our fingers. Not sure I like that vision.
No. I think individual fingers are actually less precise than manipulating an object with the whole hand. For one, it's hard to see what's under your fingertip, whereas a stylus can be very thin. You're able to use more muscles and apply leverage, which reduces jitter and improves precision. Barring anatomical changes, I suspect external implements will be around for specialized work for some time.
Yes – what this ultimately comes down to is the question of malleability of interface versus specificity of interface.
Capacitive touch screens offer an incredible amount of malleability compared to what we're used to in the history of user interfaces. Within two dimensions, there's simply no limit to what we can do with them. They can be reconfigured infinitely, and yet that infinity is of a low cardinality compared to the infinite number of possible user interfaces.
This difference in cardinality tricks us into thinking it can do anything.
As you say, there are still many applications for which purpose-built interfaces are vastly preferable. Until some sort of science-fiction nano-structure can increase the cardinality of the solution space, the application of malleable interfaces will have distinct limits.
The Kinect is a great example of using the most expressive form of interaction we have - our entire body. It's the right idea. The latest updates even do facial recognition (identify muscle movement). Too bad it doesn't scale to smaller devices/cramped objects. I wonder how small Kinect-like tech has to get before it makes its way into smartphones, etc.
However, I think voice might be a more expressive medium there, even better than touch. Imagine being able to detect sarcasm, inflection, accent, etc!
The Kinect is a very cool input device, but it has a lot of the same problems as “pictures behind glass” touchscreens: in particular, it doesn't provide anything like the experience of manipulating real objects: texture, resistance, weight, &c.
I think Kinects/distance detectors are still very young technology. Imagine how far it could be pushed with just engineering. A very high resolution detector could identify facial expressions, smiles, grimaces, smirks, a whole gamut of emotional inputs. Push the resolution even higher and you could use iris movements to zoom in and out based on what the user is trying to look at.
Voice control was omitted from the video, I don't really understand that. I still think natural language is potentially the best mode of interaction and other modes would be secondary after that.
The plan seems to be to make more RGBZ CMOS chips (cameras with range sensors built in). Once the R&D is more available we will start seeing more than just Kinects and engineering improvements (like the Megapixel race to get more and more in smaller form factors).
Funny you should mention MIDI keyboards. Many musicians are now seriously using iPads to play music. For a good example check out ProjectRnL on YouTube. They use a bunch of apps, some developed by Jordan Rudess (keyboard virtuoso and keyboardist for the band Dream Theater), that let you play things you can't easily play with a conventional keyboard. He managed to replicate a lot of the features of the Continuum Fingerboard (an expensive "continuous keyboard" controller) with some iPad apps. The main thing missing is, as you stated, pressure modulation. I'm convinced though that something like that will be coming to tablets not too long from now.
I think the tactile response is important. I see the iPad falling at the extreme end of the response spectrum: glass, unweighted cheap keyboards, weighted keyboards, cheap piano, Steinway. Reconfigurability is important, but I suspect that expressive performance is improved by that interplay of position and resistance.
It depends on what you are doing. If you just need a keyboard, then I think your ordering makes perfect sense. However, for an expressive instrument to be used as a sound synthesizer, I would definitely put it well above any inexpensive keyboard controller. The cheap controllers don't offer anything good in terms of tactile response anyhow, and suffer from expressive limitations. Take a look at some YouTube videos of GeoSynth or MorphWiz on the iPad for some examples.
If the iPad also had tactile response it would surely be even better, but even now it surpasses most offerings.
And one more thing: voice control. Apple has a knack for refining old ideas and making them mainstream. Voice controlled computing has been available for years. Perhaps in the future many of the computer interactions for the non-techie consumer will be accomplished without using their fingers or swiping.
For me, voice control became refined and mainstream when I told my android phone "Navigate to <friend's name>" and it automatically pulled up Google Maps and started turn-by-turn navigation. (Over a year ago.)
But I don't really think voice control is going to be very useful except in situations where hands are too busy doing something else.
One thing that is overlooked with voice control is privacy. There's a reason why the Siri demo videos show people using their iPhone in their own home; nobody wants to broadcast their intentions and interactions to bystanders. Even for 'innocent' commands, there is a while to go yet before talking in that strange sci-fi dialect (required for comprehension by computers) won't get you strange looks.
Maybe with technology that picks up subvocalisations?
I just watched the video, and typed my reactions, as I had them; no idea if this will be of interest.
* * *
People still travel for meetings?
Wait, someone is driving the car?
That's not very productive.
There are bellhops?
Why are there still bellhops in the future?
What do they do?
Why is the screen so small?
Why have a screen, if you have those perfect augmented reality glasses?
'Creating reply interface'? We still have to wait for computers?
There's still global poverty, and benefit concerts? When these people have all that fancy tech?
Copy and Paste is still around?
Kids are still taught long division? Why? Why do they use a pencil?
Also, won't the future be one of neural interfaces? Isn't there something wrong with interfacing two electrical signal processing machines (brain + computer) via all these muscles and optical sensors and so on?
I know there's a lot of science to be solved first; but surely the future of interfaces is that they are invisible, and built in to us?
I guess it's meant to depict a not so distant future. Either that, or you could criticise the Office team for not having the imagination to take it any further.
> Kids are still taught long division? Why?
Same reason people learn it today, same reason we didn't stop the second the slide rule was invented or cheap calculators became available 20 years ago or whenever. Learning simple algorithms is important for future development.
Well - those were just quick reactions I had to the video. But, I think if I could now somehow set my own childhood educational curriculum, I don't think long division would be on it.
I don't think I know how to do long division, at the moment. I certainly haven't done it in very many years. I suspect very few people ever do it. Computers even follow different algorithms.
I would definitely agree that learning simple algorithms is important for future development.
But why division? It'd be good to show kids an example, to let them see how it's possible. But I wouldn't make them practice it, to internalise it.
I'd put something like 'binary search' on the general curriculum, instead of long division. I see a future where many more people are going to have to manipulate information all the time, and I think teaching things like simple programming concepts to the general population is probably more valuable than teaching simple arithmetic rules.
I don't know what way data manipulation technology will pan out - but I'd hope that in future, we can come up with better things to have children internalise, than simple arithmetic algorithms.
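For anyone who hasn't met the binary search being proposed for the curriculum here, a minimal Python sketch follows. The function name `binary_search` and the convention of returning -1 when the item is missing are just choices for this example, not anything from the thread.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent.

    The list must already be sorted in ascending order.
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2  # halve the remaining range
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1
```

It is the same idea as looking a word up in a paper dictionary: open in the middle, decide which half the word must be in, and repeat on that half.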
Arithmetic is totally abstracted away in virtually every language we use today.
I don't think it's safe to assume binary search won't be similarly abstracted away in 20 years (to the point that even embedded developers are using something with extremely expressive template algorithms built in.)
Best to start with the bare minimum. Given a set of numbers, how do you add, subtract, multiply, and divide? Granted, it's not too unreasonable to consider a machine that has no divide instruction, but it's still a basic computer algorithm that's within the reach of computers. I think the big problem is that people don't have a grasp that computers compute things in a manner not too different from how we do it by hand (and that we can use more advanced computer methods by hand to good effect.)
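As an illustration of the "not too different from how we do it by hand" point, here is a sketch of schoolbook long division done in binary, using only shifts, comparisons, and subtraction, which is roughly what software has to do on a machine with no divide instruction. The name `long_divide` is made up for this example, and real hardware or runtime libraries may well use different algorithms.

```python
def long_divide(dividend, divisor):
    """Binary long division with shifts, comparisons, and subtraction only.

    Returns (quotient, remainder) for dividend >= 0 and divisor > 0.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    quotient, remainder = 0, 0
    # Walk the dividend from its most significant bit to its least.
    for bit in range(dividend.bit_length() - 1, -1, -1):
        # "Bring down" the next digit, exactly as on paper.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        # In base 2 the divisor goes in either once or not at all.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

The only difference from the pencil-and-paper version is the base: with binary digits there is no guessing how many times the divisor fits.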
I think we've reached a situation where huge numbers of people, across different professions, need to manipulate data.
For the last 10 to 20 years, the dominant paradigm for data manipulation was probably the spreadsheet. Spreadsheets give an enormous amount of power for relatively little investment. So now a wide range of professionals - business analysts, geologists, even engineers - use spreadsheets as their standard data manipulation tools.
But I think we've entered a world where data manipulation is so central that spreadsheets just aren't a powerful enough paradigm for these jobs.
At the moment, I feel that the best tool for manipulating data is probably a general purpose programming language, like Python. I would advise people starting a career in information work (anyone that does advanced data processing and analysis, which is a great many people these days) to try to learn to program. I see a lot of merit in the argument that programming is becoming a basic professional skill, like literacy did.
(I would advise something like Python over something like R, or even Matlab; I think it's just semantically more consistent, and cleaner. This is completely a personal opinion, but I think R, in its present form, will ultimately be considered a failure as a data manipulation tool.)
Given this worldview, it'd make a lot of sense to _currently_ be teaching binary search, in school, rather than long division.
Now, programming languages are probably the best general purpose data manipulation tools we have currently, but for all their power, I suspect that future generations will look back on what we use now as we look back on writing machine code: a painstaking and primitive exercise. There are plenty of people trying to build more powerful general data manipulation tools at present, but I've yet to see something that's a convincing advance over both the spreadsheet and the general purpose programming language.
But, even if we do have better data manipulation paradigms in future, where the binary search is completely abstracted away from the data manipulator, then I would still guess that it might make more sense to teach school kids binary search rather than long division.
I don't accept the argument that long division is in some sense more primitive, or more 'bare minimum', than binary search is. It is, in the order we currently teach things; but I don't see that order as at all fundamental. I think from an algorithmic worldview, binary search is a pretty fundamental operation. You can do a lot of algorithms before you need to divide anything - why not have them come first?
I'm not saying this is the definitive world view to have, and 'binary search' is just an example here; but I am saying that I don't see why we have to be so focused on the arithmetic-centred view of things; and I would hope that 'in the future' we'd have moved beyond that. Fewer and fewer people are going to have any experience with complex arithmetic, but more and more of them are going to have to manipulate data.
I don't think we give a lot of importance to children's time at the moment; I think they are sent to spend time doing long division because that's what they've always been sent to do; but I think that this might all be about to change, as education is coming into line for some massive disruption.
What struck me is how lonely the video felt. People hardly talked to each other. The only emotion exchanged between people was with the mom and daughter making apple pie, and that was through the screen.
How is driving a car not a productive task? There can be cases when people "want" to drive out of their preferences and passion. Can we actually classify some random and not-so-appealing work to us as non productive?
The guy that picks her up at the airport appears to be a professional driver. He has a black cap, and he nods deferentially before turning around to apparently open the door or trunk for her.
I'd have thought that in a 'Productivity Future Video' you'd have the car driven by computer, given that we seem to be on the cusp of that technology, already.
I'd have thought she'd say to her device that she wanted a car from the airport to her hotel, and it would have done the rest.
This would clearly be more productive, as we wouldn't be wasting the labour of the man who drives the car.
You are right that it's possible that he wants to be a driver for people, out of some individual preference, in which case this is essentially a leisure activity for him. But I'd leave people that do unnecessary work, because of a personal preference, out of a 'future productivity video'.
I can't help but wonder if the reason the cars don't drive themselves is that it's a Microsoft video, and Google, who have gotten a lot of PR out of driverless cars, are a competitor.
Oki doki, so given plausible future technology, let's try to brainstorm a solution that addresses the issue of tactility in interaction. I give you the ... drumroll ... KinBall (Kinectic ball). The KinBall would essentially be a wireless ball, like a small juggling sack (the smaller ones), that you could interact with to control devices.
So the KinBall would have the following features:
* Gyroscope/accelerometer so it knows which side is up and how fast it's being moved and where it is.
* Sensors so it can feel where it's being squeezed/pressed and how hard
* Some kind of detecting mechanism for when two balls (cough) are touching each other.
* Ability to vibrate in different frequencies and also only partially on different parts of the ball
* If the future allows it, ability to change color
So with a device like that you would now have to come up with a gesture language. Some ideas:
* Holding the ball and moving the thumb over it is "cursor mode"; pressing in that mode would be clicking (and you could "click" and hold for submenus)
* Similarly, swiping your thumb over the ball would be the swipe gesture
* Pinch-squeeze could be a specific gesture, perhaps combined with a gesture (like spritzing cookies :)
* If you hold the ball in the whole hand and move it from your chest and forward you could simulate resistance by varying the frequency of the vibration to "feel" interface elements
* you could roll the ball in your hand forwards and backwards, for instance for scrolling
* Double the balls, double the fun. With two balls you could perhaps do interesting things with the distance between them and again simulate resistance by vibrating the balls as you bring them closer to each other
* Social balling, you could touch someone else's ball (ahem) to transfer info, files etc
* You could have the ball on your desk and it could change color or pulse in different colors for different notifications.
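To give the "simulate resistance by vibrating" idea a bit of shape, here is a purely speculative Python sketch that maps the distance between two balls to a vibration frequency, so the buzz rises as they get closer. Everything in it is an assumption for illustration: the function name `vibration_frequency`, the 50 cm range, and the 20 to 250 Hz band are invented, and a real device would need tuning against actual hardware and perception.

```python
def vibration_frequency(distance_cm, max_distance_cm=50.0,
                        min_hz=20.0, max_hz=250.0):
    """Map the distance between two balls to a vibration frequency in Hz.

    All default values are hypothetical. The frequency rises linearly from
    min_hz at max_distance_cm (or farther) to max_hz when the balls touch.
    """
    # Clamp so readings outside the working range do not over- or undershoot.
    clamped = min(max(distance_cm, 0.0), max_distance_cm)
    closeness = 1.0 - clamped / max_distance_cm  # 0.0 far apart, 1.0 touching
    return min_hz + closeness * (max_hz - min_hz)
```

A linear ramp is just the simplest choice; a curve that steepens near contact might feel more like pushing against something, but that is guesswork without a prototype.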
This kind of interface would have some interesting features. You get tactile feedback and most gestures are pretty natural. You don't have to get smudge marks on your screens. The ball is pretty discreet and hardly visible in your hand. Heck, with a headset (for getting information, like reading SMSs) you could just get away with a ball and the headset and skip the device altogether for some scenarios.
On the other hand it's another accessory you can lose and a ball in your pants might not be the best form factor.
Anyways, if Apple introduces the iBall you know where you read it first
Well, it wouldn't be a solid ball since it would have to be able to be squeezed, so if dropped it at least shouldn't roll away, and contrary to most smartphones it would actually be able to handle being dropped quite well. Rings/camera/gestures (Minority Report) in general are of course interesting, but I think the downside of that approach is no tactile feedback.
That's true. What tactile feedback could a hackysack-size ball really give though besides perceived firmness?
If the finger rings were actually more like thimbles, could they impart some texture onto a finger tip? If you were brushing over a virtual surface, could they then give a sensation of texture? Raised on a page to denote a button, etc?
I really liked the OP's article, but found it frustrating stopping short of real suggestions and predictions - great opportunity to try to forecast where things could be in 10-15 years.
Well, the ball could in theory feel how much pressure you exert on it and where, and provide haptic feedback. In a super advanced scenario perhaps the ball could provide feedback by varying its firmness. Guess it depends on how advanced technology is how good the thimble solution could be, but it strikes me as easier to manipulate a ball, which you could do even when it's in a suit pocket, than to do gestures in the air. Some gestures might be doable though. I have a hard time coming up with a "cursor" scenario for that solution though, and tactility makes it so much easier to control. Maybe if you used your fingers against a hard surface, but that would limit its flexibility.
I can't help but think that the success of the iPhone and iPad has caused a big step back in usability among devices that try to copy them.
Two similar examples:
Garmin's newer aircraft GPS units have touch screens instead of knobs and buttons. The iPad has proven very popular among pilots. I can see why Garmin would decide that "touch is the future." But, while I'm flying an airplane, for my money I'd rather have knobs to grab and twist, and buttons to push and feel.
Tesla's new Model S uses one huge touch screen for its in-dash interface. Surely, if you want to change your music's volume or turn on air conditioning while driving, it's harder to hit touch targets that are Pictures Under Glass than to grab and twist a knob.
I would counter those sentiments with the ZuneHD experience. Most people haven't used one, but touch isn't vision focused. With the Zune in my pocket, I can unlock, skip a track, stop, play, change the volume (on the screen, not a rocker on the side). I can also try to shuffle/unshuffle, but that isn't as easy.
I was actually amazed that Apple completely missed the ability to use touch when you weren't looking at the device.
Agreed. This is something I was actually surprised at when I first got the HD, being a diehard physical button fanatic (had an iRiver H320, and fat Zune previously) I couldn't stand the idea of switching over to a touch interface, but I had to when my fat Zune died.
Fast forward a week or two after getting the HD, I was able to control the device almost as fast as the original Zune in my car (without looking at it). It just felt somewhat natural, the way they placed the buttons. Didn't put too much thought into whether or not the positioning was partially intended for quick use.
They didn't completely miss it - go to settings, general, accessibility, and you can enable voiceover which reads anything you touch out loud and makes you double click to activate, so you can navigate without sight.
It's terrible if you want a hardware button to skip a track, but it's usable for basically any iOS app with standard UI controls - including web browsing - which a hardware button isn't.
I've never seen a Zune in person - do you just mean it has lots of buttons for all those tasks, or how does the touch interaction work on it without individual hardware buttons?
No, the ZuneHD doesn't have buttons for those interactions, and the ability to read out whatever I touch is more of an aid than a true interaction element.
My ideal is that we shouldn't always be aiming for items on the screen to interact with. We've traded hardware buttons for software buttons, even though the ability to recognize things like swipes now exists and is much more natural. With a button, I have to hit that exact spot. With more natural movements, like swiping, I am creating an action over a more general area. I don't have to be so exact, and therefore, I often don't need to look at the screen at all.
On the ZuneHD, it works like this:
Swipe right to left on the screen to skip forward a track; swipe left to right to go back. Swipe up is volume up; swipe down is volume down. Holding the right side is fast forward; holding the left side is rewind. Tapping the center is play/pause; tapping the bottom left corner is shuffle/unshuffle (though this is a visual button, it is fairly easy to hit without looking at the screen).
Unlocking a ZuneHD is the same as a WP7 (though many people haven't seen those either). You just swipe up on the lock screen and it unlocks.
With all of these interactions (excluding shuffle), you aren't aiming for a button, just a general area on the screen.
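The "general area, not a button" idea is easy to sketch in code. The following Python is a rough guess at how touches like the ones described above could be classified; it is not how the ZuneHD actually does it. The thresholds, the action names, and the function `classify_touch` are all hypothetical, and the shuffle button is left out since it is a visual target rather than a gesture.

```python
import math

# Hypothetical thresholds, chosen only for illustration.
SWIPE_MIN_DISTANCE = 40   # pixels a finger must travel to count as a swipe
HOLD_MIN_SECONDS = 0.5    # how long a stationary touch must last to be a hold


def classify_touch(start, end, duration, screen_width):
    """Map one touch to a playback action.

    start, end: (x, y) in pixels, origin at top-left, y growing downward.
    duration: seconds between finger down and finger up.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    if math.hypot(dx, dy) >= SWIPE_MIN_DISTANCE:
        # Only the dominant direction matters, not where the swipe began.
        if abs(dx) >= abs(dy):
            return "next_track" if dx < 0 else "previous_track"
        return "volume_up" if dy < 0 else "volume_down"
    if duration >= HOLD_MIN_SECONDS:
        # A hold only needs to land on the correct half of the screen.
        return "fast_forward" if start[0] >= screen_width / 2 else "rewind"
    return "play_pause"
```

Note that none of the branches tests for a small rectangle: a swipe is judged by direction and a hold by which half of the screen it is on, which is why it works from inside a pocket.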
In comparison with the iPod Touch/iPhone, to unlock, you are targeting a fairly specific area at the bottom of the screen. Easy to miss the swipe area if you aren't looking. On WP7 the whole screen is the swipe area. As long as you start at the bottom, and swipe up.
Same with skipping tracks, on iPod Touch, it is a forward button on screen. You have to hit the specific spot. With the ZuneHD, you just swipe across the middle-ish area of the screen, and it will go forward the track. You aren't aiming at a specific item to interact with. It is all a very natural feeling.
This was one of my big disappointments with the iPod Touch. After the original iPod was such an amazing design for navigating and playing music, Apple told us about this great new touch interface and how we could swipe our fingers across the screen to interact, and how it was all so natural. But the swipe interaction was only on coverflow, which is somewhat useless. They missed the opportunity to make swipe a key navigation element in iOS.
I feel the same way about iOS apps today. Why do I have a back button at the top of the app? The back button slides a panel from left to right and shows me the previous content. Why am I aiming for a target at the top left (rarely comfortable if you are holding the phone in your right hand), rather than swiping across the screen and sliding the panel?
I was never sold on the idea of swipe becoming pervasive. That being said, I'm not sure if a 'natural feeling' interaction for the sake of convenience is necessarily the best.
The swipe interaction isn't the most intuitive interaction for new users navigating around screens or other digital content, and Apple seems to have used it for content that's derived from physical content or pages, such as books or images. If Apple were to use swipes as OS-wide navigational gestures, it would preclude the use of swipes in app-specific contexts.
I have a ZuneHD and didn't even realize there was a swipe interface except to switch songs (when playing music).
pedalpete is still right though. If you tap the center of the screen when playing the song, you get an overlay that is separated into quadrants. Tap the top of the screen, volume up. Bottom, volume down. Left, previous track. Right, next track. Center, pause/play. I never look at my screen.
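A rough Python sketch of how a tap on that kind of overlay could be resolved to an action, assuming a center zone surrounded by four large edge zones. The real ZuneHD layout isn't documented in this thread, so the zone shapes, the `center_fraction` parameter, and the name `overlay_action` are assumptions.

```python
def overlay_action(x, y, width, height, center_fraction=0.25):
    """Resolve a tap at (x, y) on the overlay to a playback action.

    Origin is top-left with y growing downward. center_fraction is a
    hypothetical size for the central play/pause zone, expressed as a
    fraction of the half-width and half-height.
    """
    # Offsets from the screen center, normalised to the range [-1, 1].
    nx = (x - width / 2) / (width / 2)
    ny = (y - height / 2) / (height / 2)
    if abs(nx) <= center_fraction and abs(ny) <= center_fraction:
        return "play_pause"
    # Outside the center, whichever axis dominates picks the zone.
    if abs(ny) >= abs(nx):
        return "volume_up" if ny < 0 else "volume_down"
    return "previous_track" if nx < 0 else "next_track"
```

With only five targets covering the whole screen, each zone is far bigger than a fingertip, which fits with never needing to look.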
This is very interesting, and relevant, since I just found out the air traffic control display at the local airport in my hometown has been replaced with, to quote one guy, "a big iPad".
They're up in the air about it because now, instead of knobs and switches, informed by paper maps and pencils, they've got this fancy, 'high-tech' touch screen display - 3x5 feet of shiny glass.
There was a minor storm in a teacup about it, because it leads to a much greater potential for accidents, bad landings and congestion (it's a very busy airport)- but because it cost millions, nobody can go back to good ol' pencil, paper and brain.
I appreciate your point and prefer physical controls when driving, but obviously one major advantage of the described approach is that the same space can be used to control different things - e.g., swap out the sat-nav area for air-con controls, put up weather or traffic reports if the music is switched off, etc.
That's the tradeoff of touch interfaces in a nutshell. They're jacks of all trades, masters of none. This is why musicians, for example, are still willing to shell out many thousands of dollars for a real synthesizer or drum machine even though a standard PC and generic MIDI keyboard can make all the same sounds. For some tasks, it's worth paying the heavy premium for a dedicated physical interface.
We may shed the need for accurate targets, but I would imagine there would be at least a need for feedback that is not demonstrated in the 'vision.' But that leads to your second point.
There's been anecdotal evidence that young children can just figure these interfaces out, and I'm wondering if it's true that the confidence may come from learning and built-up trust. I'm doubtful that the built-up trust can account for 100% of the confidence. Touch-based interfaces are largely form-less designs. There are an infinite number of affordances offered by a touch screen. As it stands now, all of these affordances are defined by the visual appearance and lack the 'resolution', so to speak, that a physical form can provide.
You're being snarky about Siri for some reason. But what you describe is in fact similar to the Ground Proximity Warning System, a feature of modern aircraft. When the computer announces "pull up! pull up! terrain!" you can be certain the pilot pays heed.
Another major problem with "research visions" like this is that they portray a thoroughly "bourgeois" future. We know already that in order for every human on this planet to have basic needs taken care of, highly consumptive 1st world lifestyles like the one portrayed in the video will need to be replaced. If you've ever built anything, you'll know that it takes an immense amount of resources to obtain that kind of polish. I know that some designers like clean, shiny things, but perpetuating the meme that the future won't be characterized by rough-edges is escapist if not simply irresponsible. If we don't imagine a future for ourselves that involves patterns of behavior that are conducive to conservation of resources and supply-chain+community resilience, then I'm afraid that the only people using tools other than shovels and guns will be a super-elite living in fortified micro-cities (so perhaps it's accurate after all).
For those sympathetic to the argument of the OP, you may be interested in Bill Buxton's papers on bi-manual interaction. Bill is a huge (and early) proponent of this point of view (that computer interfaces should make full use of the capabilities of the human body): http://www.billbuxton.com/papers.html#anchor1442822
Wise words of Bruce Sterling in "Shaping Things" (MIT Press):
"We need to understand technology with a depth of maturity that mankind has never shown before. We need to stop fussing over mere decade-long "Ages" and realize that there are only three basic kinds of "technology" truly worthy of civilized use. None of them are entirely possible yet.
1. The first kind, and likely the most sensible one, is technology that can eventually rot and go away all by itself. Its materials and processes are biodegradable, so it's an auto-recycling technology. The natural environment can do this kind of work for itself, while also producing complicated forests, grasslands and coral reefs, so, someday, an artificial environment ought to be able to biomimetically mimic that achievement. This doesn't mean merely using available "natural materials" that we repurpose from crops or wilderness. It means room-temperature industrial assembly without toxins. [...]
2. The second kind of technology is monumental. These are artifacts deliberately built to outlast the passage of time. This is very hard to do and much over-estimated. Many objects we consider timeless monuments, such as the Great Pyramid and the Roman Colosseum, are in fact ruins. They no longer serve their original purposes: a royal tomb and a giant urban playground, and they no longer look remotely like they did when their original builders finally dusted off their hands and demanded their pay. But at least these "monuments" don't crumble unpredictably, leach into the water table and emit carcinogens while they offgas. [...]
3. The last kind of decent technology is the kind I have tried to haltingly describe here. It's a fully documented, trackable, searchable technology. This whirring, ultra-buzzy technology can keep track of all its moving parts and, when its time inevitably comes, it would have the grace and power to turn itself in at the gates of the junkyard and suffer itself to be mindfully pulled apart. It's a toy box for inventive, meddlesome humankind that can put its own toys neatly and safely away. [...]"
Disclaimer: I used to work in the group that produced this video.
You have to remember, that while the "wow" factor (to some folks) is the screens and form factors, this video is made by a group in Office - the things they're really researching and trying to demonstrate is the vision of how your personal information and your "work" information (i.e. your social circles, your coworkers, your job) interact with each other.
How can context really be used effectively with productivity in an office setting? Context is this huge term here - device form factor, the people you're with, the things you're doing, where you're at; there is a ton of information available to apps / services now about who you are, what you're doing, etc - what are scenarios in which that information is actually combined and put to good use?
They really should've made the Director's Commentary to go along with this, there's a lot of research and data behind this video along with the special effects.
This was similar to my own reaction, that these concept videos don't look far enough forward.
And, maybe because I'm just a born contrarian, as the world moves toward touch-based direct-manipulation paradigms, I've personally been moving toward a more tactile, indirect paradigm. I recently bought a mechanical-switch keyboard, for example, that I'm growing more and more fond of every day. I've also started looking for a mouse that feels better in the hand, with a better weight, and better tactility to the button clicks.
The lack of tactility in touch screen keyboards has always been especially annoying to me. There's just so much information there between my fingers and the keys. I mean, there's an entire state -- the reassuring feeling of fingers resting on keys -- that's completely missing.
I accept the compromise in a phone, something that needs to fit in my pocket so I can carry it around all the time. But this makes me lament the rise of tablet computing. This is the sort of place that I refer to when I talk about tablets privileging consumption over production.
I don't think the problem is relegated to UI hardware, though. I think part of what's holding back a lot richer and more meaningful social interaction online is the fact that current social networking paradigms map better to data than to human psychology. It's the parallel problem of fitting the tool to the problem, but not the user.
I'm not sure I agree with the direction he points to (if I understand him correctly). Making our digital tools act and feel more like real, physical objects is akin to 3D skeuomorphism. It's like making a device to drive nails that looks like a human fist, but bigger and harder. Better, I think, to figure out new ways to take advantage of the full potential of our senses and bodies to manipulate digital objects in ways that aren't possible with physical objects. And, please, Minority Report is not it.
"""these concept videos don't look far enough forward."""
If you look at any 'visions of the future' from the last 150 years, their visions retrospectively look silly and naive. From a forward looking perspective, though, would a true vision of the future make sense to a person seeing it? Maybe a clunky 'TV + rotary telephone' 60's vision of videophones would be more understandable/realistic/visionary than the sight of an iphone with a forward facing camera...?
Incremental steps... there's no point in looking too far forward or you won't get anywhere.
Did you even read the essay? Visions of the future are critical in shaping the focus of today's research. GUIs, laptops, etc. came almost entirely out of Alan Kay's mind. Visions of the future that just plain suck, such as Microsoft's, then become misguiding.
Excellent point on the skeuomorphism, though I'm not sure what the alternative is. He mainly focuses on all these things we don't have to learn to know how to use, but for any kind of sophisticated functionality I think some learning will be required and it's more about making it flexible and easy to learn, and utilizing additional channels for information in both directions.
Man, Bret Victor seems to have some of the most consistently interesting and inspiring articles I've seen.
I suppose he's just pointing out one area of the future to think about, but I wish he'd mentioned other ideas. I think voice and language, in particular, have some of the most room to grow to make interfaces more intuitive.
edit to add: Along this line, I've often wondered if it'd be worth learning Lojban to interact with the computer more easily. Supposedly the language is perfectly regular and well suited to that sort of thing, but I don't know for sure.
It could be easier to teach humans Lojban than computers English (or however many other languages).
Computers can probably learn English as soon as they have learned Lojban. (Does that make sense? Assume a program that can understand Lojban. Then write a program, in Lojban, that understands the basic English grammar and the major exceptions. Then add the ability to deduce minor exceptions.)
The author makes a very valid point, and it would be quite interesting to see what kinds of tactile UI designs might be achieved. But I think there's an important distinction to be made in how we build tools to solve physical problems and how we build tools to solve conceptual ones.
Apart from purely remediative technologies such as Braille, I can't think of any technology from any era of human history in which conceptual information has ever been conveyed via the tactile sense. There have never been tactile clocks, tactile books, or any kind of tactile language. When human minds attempt to import ideas from the outside world, they use the eyes and ears, not the hands.
There's certainly a real problem with the UIs presented in the MS video, but it's not that they're visually-oriented. It's that they're designed to appeal to the eyes themselves, and fail to encode information in a way that's optimally suited to the mind. The aesthetics of the UIs in that video are stunningly beautiful, but I have no idea from looking at them how I would use them as tools; each notification, dialog box, and prompt for input seems fine in isolation, but when I try to conceptually 'zoom out' and understand how each function integrates into a workflow that allows me to apply my capabilities toward fulfilling my needs, I'm completely at a loss.
There seems to be an unfortunate trend toward pure visual aesthetics in the software industry today - perhaps a cargo-cult attempt to emulate some of Apple's successes - and MS seems to be suffering from it almost as badly as the Ubuntu and Gnome folks.
You should check out the book The Myth of the Paperless Office. They report research where they gave folks tasks like writing a summary of several magazine articles and one group did it all on a computer and the other did it on paper and they watched how people actually worked. There was a lot of subtle physical interactions in the paper group, such as moving different articles closer and farther away on the table that the computer group tried to do analogues of and failed because of the limitations of the medium. So it's not just the eyes and ears.
The position of the paper on the table seems like an entirely visual variable. What information about the documents or the content within them did the latter group ascertain through tactile senses?
I'd assume that the amount of information about the content that was acquired tactilely was none. Feeling the paper can only give you information about the paper medium itself, not the ideas encoded in written language on it - conceptual content is non-tangible, by definition.
So the mistake the computer group made was to try to model the human-paper interaction with software, when software isn't made out of paper. They should have attempted to figure out what human-content interaction was being proxied via re-positioning the paper, and modeled that in the software.
The iPad is really, really awesome. But. All that's really changed is that they've added an extra finger. (sure there are three- and four-finger gestures but those just boil down to a different kind of single-finger gesture)
Sadly, we're probably going to have to wait for the advent of supersubstances that can dynamically reconfigure their physical characteristics before we get beyond the finger-and-eye, which I doubt will happen in my lifetime (tears).
This technology has the capacity to bring us beyond "pictures under glass", and seems ready for integration in today's devices, with proper OS and API support.
I could see combining an e-ink display with this kind of tactile feedback surface to replace the user-exposed lower half of a laptop with a device capable of contextual interfaces. Something like this would offer great potential benefits to the user, with no apparent drawbacks.
Love the effort put into the presentation of this blog post.
Although I personally love all the shiny finger gestures, must agree that this "vision" is only a sexy marketing trick and contains very little actual innovation, and probably even less actual innovation that Microsoft will actually build in the near future, or the long future.
As for the abundance of motor skills that we have, it would indeed be lovely to have those utilized in the future, along with voice and vision, all combined in some complexly simple and elegant way of interacting. Baby steps at a time?
The OP is rehashing the concepts around pervasive or ubiquitous computing: the notion that computing will expand out to meet us in tangible products, as opposed to being solely accessed on dedicated computing devices.
There's been much more than "a smattering" of work in this area. Lots of really smart industrial designers and engineers have been working on these ideas for quite some time. I personally based my Industrial Design degree thesis around these concepts almost 12 years ago. Hiroshi Ishii’s Tangible Media Group at the MIT Media Lab comes to mind. The Ambient Devices Orb was a well-covered, if early and underdeveloped, attempt to bring a consumer pervasive computing device to market.
These products are here today and will continue to emerge. A recent example would be the thermostat from Nest Labs, a device that beautifully marries the industrial design of Henry Dreyfuss’ Honeywell round thermostat with a digital display, the tangible and intangible interfaces working seamlessly in concert.
Yeah, I was going to bring up the Media Lab and TMG myself (I was a solder monkey there for a while as an undergrad). Go look at some of the stuff they have on their webpage.
I would submit that part of the problem is that nobody really has a clue how to use your hands. Are we going to have a thingy for every subtask that we want to do? Are our computer workstations going to resemble carpenter workbenches? Probably not, if for no other reason than lack of cost effectiveness. We've got something like the Wiimote as pretty much the epitome of hand-based interaction, but it's not very precise for anything that isn't a game.
I don't mean this as a criticism of the post, I mean it as a stab at an explanation. It is a good point and I've been complaining about the primitive point and grunt interfaces we've had for a while, but it's not even remotely clear where to go from here without another huge leap in processing power and hardware, at the minimum encompassing some sort of 3D glasses overlay for augmented reality or something. (Touchscreens are only an incremental point & grunt improvement over mice; you get a couple more gestures [...])
The mouse is point & grunt. You get one point of focus and 1 - 5 buttons (including the mousewheel as up, down, and click). For as excited as some people have been about touchscreens, they're only a marginal improvement if they're even that; you still have only a couple kinds of "clicks", and you lose a lot on the precision of your pointing. Interfaces have papered that over by being designed for your even-more-amorphous-than-usual grunting, but when you look for it you realize that touchscreens are a huge step back on precision. They'll probably have a place in our lives for a long time but they are hardly the final answer to all problems, and trying to remove the touchscreen and read vague gestures directly has even bigger precision problems.
A relatively near-term possibility is the recognition of natural gestures. If it's too loud, I pat my hand in the air downward a few times or put a finger to my lips for mute. If it's too bright, I shield my eyes. If it's too hot, I jerk my hand back. If I want to see the time, I look at the clock, and it turns on.
There's also plenty of room for big, dumb controls too. Hit the phone to make it stop ringing. Or squeeze it or shake it or tap it on your palm like a pack of cigarettes.
Similarly, how would we use touch feedback? If physics and technology were no object, would you want your phone to have more mass and inertia when you turn it when it has more email, and be less massive when you've read it?
If you could simulate sloshing water in a glass, what general infomanipulation would map well to that feeling?
What sort of mapping is there between hand manipulation and "next song" or "post Facebook status"?
I'm actually surprised no one has mentioned Rainbows End by Vernor Vinge yet. He actually presents a vision of a very natural, expressive near future UI.
While he doesn't go into technical details about everything, he does describe interacting with "Ubiquity" through small gestures throughout the body, whether small shrugs or interacting with hands. Further, he touches on the issues surrounding flat interfaces and even the virtual 3D interfaces.
the problem with revolutionary user interfaces is that nobody knows how to use them. when you see a "picture under glass" of a piano keyboard, you know that in order to make noise you tap the keys. if your interface is a minor incremental change from the status quo, it doesn't require education.
this vision of the future isn't just cool, it's relatable. anybody can look at the products displayed there and think "hey, i know how to use that". if you dream up some amazing new tactile user experience, it might be revolutionary but will people understand it?
Not only that, but you might not be able to see the interfaces of the future. A couple that come to mind:
mind reading interfaces (EEG headsets)
camera based (like the kinect)
Not to mention the leaps that machine learning will bring. As we have more and more semantic interfaces with technology things that seem magical now will soon be routine.
What this article does inspire for me is the need for better 3D prototyping with home 3D printers and CNC machines. These are the tools you need to come up with better touch interfaces, not yet another SDK. I think we will see more and more of those as well, so here's hoping for a brighter future for interaction design!
It'd also be fairly hard to convey haptic interaction.
More fundamentally, it seems to me like there's no point in showing us something, even if it's achievable, if we have no frame of reference for understanding it, or believing that it's possible. 'Visions' aren't really visions if we can't imagine it happening. Seeing Avatar didn't make me go "wow...one day", seeing this video does.
We can learn what we don't understand. To use a current example, in modern user interfaces there are two types of metaphors: the ones you can deduce by looking at them (a push button affords to be pushed) and the ones you learn (clicking underlined words was not obvious not so long ago).
If it is done well, they should understand it better.
Think of a progression from the most abstract interface ("C:\>") to the most concrete (hold hammer, hit things). It's implicit in the article that future visions should be moving further along that progression.
Has there been any interaction research done on using something like a stress ball as an interaction device for digital environment? In my imagination a ball would have standard accelerometers and gyroscopes, but in addition fine-grained sensing capabilities to sense different kind of grips. It could also provide tactile feedback.
Well, this is supposed to be a video about the future of interaction design and not the future in general. But I have two points that I want to say:
- Future technology should help mankind be independent. It doesn't need to make you rich, just let you do your own thing. I don't like that someone is driving my car or waiting for me at the airport. I'd prefer that they play music or baseball.
- We don't need high-tech gadgets and assistance. Get out of your computer and go see the world. There are hundreds of millions of people around the world who are diabetic. Go and solve that; billions and maybe trillions of $$ are there.
In brief, we don't need touch screens everywhere in the future. We don't need valets; actually, having them is worse for mankind. There are huge-scale problems like disease and famine and joblessness that need to be solved.
I've read many versions of this rant, written under many different authors' names over the last thirty or more years.
I'm waiting for the iteration of this rant - or even the actual UI some engineer's put together or some designer's rastubated - that shows that anyone involved has spent a single, solitary moment thinking about how a disabled person could use these interfaces. The next one will be the first one.
The more of the range of human ability an interface requires, the more human disability becomes a barrier to use it. What's so exciting about a future that shuts out people because they don't have the full range of action that some twit thought would be cool to make gratuitous use of?
No, contra Victor's ignorance, every human being does not in fact have hands or feet or normally-capable versions thereof.
I feel like I have just undergone a major epiphany. And just after the epiphany, I realized that pretty much the exact same content is hiding as one of those relatively insignificant background world-building details baked into Neal Stephenson's _Anathem_.
And in the non-fiction realm, Wii-mote and Kinect devices. We've totally got the beginnings of tactile, full-body interface technology that's just as reconfigurable and programmable as pictures under glass.
The long-standing mantra has been 'form follows function.' This is predicated on a particular product or object having a form (that reflects its purpose) to begin with. As items become increasingly 'form-less' I've always wondered what this means for design. There are a lot more considerations that must be made.
I know it's been mentioned in the documentary 'Objectified,' but has anyone seen any other commentary?
It's a little inconsistent as well. On the one hand, he argues that pictures behind glass isn't where we should be heading, and instead we should come up with better visions; as an example he uses the someone who came up with the original idea for a "goddamn ipad"...
But wait, didn't he just say pictures behind glass was a bad model to work towards?
Personally I think if we could implement a fraction of the things in that vision video, the world would be a better place for it. If some of the things don't work, or the interaction feels wrong... we can always change the vision.
The Apple Knowledge Navigator doesn't resemble the iPhone at all... But it was/is a good vision to work towards.
You are taking that first bit out of context. He is pretty clear in the article what he means. He believes that pictures behind glass are a transitionary phase and not something we should work toward, because we already have them. The Dynabook was a vision of the iPad but it was a vision someone had 40 years ago. His point is that Alan Kay's vision truly was visionary in that he saw a future interface that did not come to pass for decades, but that Microsoft's vision here is not truly visionary because it basically shows you where we are today.
As I said in my other response, there is a lot in that video that is visionary. Alan Kay's drawing had a keyboard in it... That's not visionary... he's just copying the typewriter - what a stupid vision!?
Augmented reality, Proximity networking, transparent computers, visual recognition, AI, interfacing with the kitchen... seriously we are saying "Not visionary" because there are glass interfaces with hands swiping on them...
He was a UX engineer at Apple. He probably was involved with the "goddamn ipad". He is not criticizing the interface method per se, since it is probably the best we can do right now given the size constraints and technological capabilities we have.
What he says, however, is that in the long run, pictures behind glass cannot be the solution. Or better said: shouldn't be.
I know who he is and where he's worked. But authority != always right
The vision held a lot more than just swiping on glass: there were spatially aware devices, proximity networking, augmented reality... But that's not vision worthy... because the ipad already exists!?
People are talking like, if you get a vision wrong (at the start) the whole journey is pointless. I'm saying, we can always change the things that don't work. If interacting with glass stops working out... we can ditch that for something better.
Just like the iPhone/iPad doesn't fold in the middle (like the knowledge navigator vision presented).
I agree that we need more vision to expand our capabilities.
But I would add that all of our senses are ripe for innovations in interaction. There are things we have yet to accomplish in terms of audio and sight, and even smell. Our fingers get in the way when we use a touch screen-- will there be a system powerful enough to track my eye position to the point where it can assist navigation?
One input method that a colleague discussed with me recently was in using a webcam to determine where keyboard input should go. The computer would track your eyes and where your focus went, and then make the application you're looking at have focus. So that it wouldn't matter where your mouse was, but when you typed the application you're looking at would get the input.
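A minimal sketch of that idea, assuming some eye tracker already hands us a gaze point in screen coordinates (the tracking itself is the hard part and is not shown). Everything here is hypothetical: the Window and GazeFocus names, and especially the dwell_frames delay, which is my own addition so that a quick glance at another window doesn't steal your keystrokes.

```python
from dataclasses import dataclass


@dataclass
class Window:
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def window_under_gaze(windows, gaze_x, gaze_y):
    """Return the first window (list is ordered front to back) containing the gaze point."""
    for w in windows:
        if w.contains(gaze_x, gaze_y):
            return w
    return None


class GazeFocus:
    """Move keyboard focus to the window being looked at.

    Focus only changes after the gaze has rested on the same new window
    for dwell_frames consecutive samples (an assumed debounce, not part
    of the original idea).
    """

    def __init__(self, dwell_frames=5):
        self.dwell_frames = dwell_frames
        self.focused = None
        self._candidate = None
        self._count = 0

    def update(self, windows, gaze_x, gaze_y):
        target = window_under_gaze(windows, gaze_x, gaze_y)
        # Looking at nothing, or at the already focused window: no change.
        if target is None or target is self.focused:
            self._candidate, self._count = None, 0
            return self.focused
        # Count consecutive samples on the same new window.
        if target is self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = target, 1
        if self._count >= self.dwell_frames:
            self.focused = target
            self._candidate, self._count = None, 0
        return self.focused
```

You'd call update() once per tracker sample and route keystrokes to whatever it returns. Whether a dwell delay feels natural or just laggy is exactly the kind of thing you'd have to try on real people.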
He's making a specific point regarding the relative importance of our senses. On the surface level, vision seems to play a far more important role in human life than tactile sensation.
Losing your sight (the sense screens use) results in a normal person who has to make a few adjustments to accommodate their blindness. Losing your sense of touch (the sense that touchscreens can't use) is paradoxically far more debilitating.
"Rant" may not have been the best word choice for the article title. I got the impression that designers shouldn't be so closed minded in futuristic thinking that may or may not be that far away. And I agree with him. Nor did I realize this until reading this excellent post.
I don't think the author thinks Kinect is wrong. His point is that a touch interface, where you physically touch something, needs to supply tactile feedback.
Kinect however is IMO not bound to the same scrutiny for tactile feedback. If you take the Kinect Sports games, you can pretty much see yourself in the game in 3d, making it really intuitive and very natural to move to the game.
I agree though, the menu navigation in Kinect is horrible.
We've been talking about gloves with accelerometers in them at work, but now that you mention it I expect that has much the same problem as Kinect, though I suppose you could add some ferromagnetic material to the gloves and give at least subtle resistance and feedback through electromagnets or something...
When we get one of those neural interfaces where a computer is implanted entirely in our nervous system, it'd be great if when I "wonder" what that thing tastes like, I'd actually get to taste it, through the computer interfacing with my nerves.
Say I want to buy some flowers for my spouse: at the whim of my thoughts, an array of flower images comes into my head. I "wander" to the one I think my spouse would like, and when I feel like I want to smell it, I do. I'd then decide "Yes, buy it", and the order gets sent.
Smell is an interesting one. Memory, along with vision, is tied to smells. A lot of UI is based on memory recognition. It would be neat to see a UI based on smell (i.e. smell flowers if you hit the right button).
That would be so cool! I bet it would also be useful for educational software. You load up the chem program for the first time in months, and immediately smell your studying chem smell, let's say hazelnut coffee, and the memories start flowing back.
I see the article as pointing to incremental improvements by advocating fine motor manipulations with feedback over touching-glass-and-seeing-the-result (which is in ways harder than hitting a keyboard and at least getting some kinesthetic feedback). But I think we need to consider things in greater generality:
1) Interface designers seem universally fixated on designs that are visually and touch/kinesthetically oriented. What's missing in this is language. In a lot of ways this winds up with interfaces which indeed look and feel great on first blush but which become pretty crappy over time, given that most sophisticated human work is tied up with using language.
2) Even the touch part of interaction seldom considers what's ergonomically sustainable. Pointing with your index finger is fabulously intuitive to start with but is something you'd get really annoyed at doing constantly. There are lots of fine motor manipulations that would get hard over time as well.