How the API works: what it tracks and how. Almost all of the key parts are revealed in the post, which is exactly what the makers of Leap didn't want leaked early; they made that point very clear multiple times throughout the registration process.
It seems like they encourage sharing demos and videos, as long as we don't talk about the API. So I've been talking with friends quite freely about the Leap: after all, most people aren't interested in the inner workings so much as the magic of wiggling a finger in the air and seeing the computer respond immediately and precisely.
I asked if I could release data I've recorded off the JavaScript API and they were okay with it-- they even retweeted it. Anyways, here are a bunch of gestures as JSON arrays. I posted a demo gallery link lower in the thread as well.
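For anyone curious about the shape of that data, each entry in those arrays is roughly one frame as it arrives over the WebSocket. I'm writing the field names from memory, so treat this as approximate rather than the actual protocol:

    {
      "id": 41872,
      "timestamp": 1051477056,
      "hands": [
        { "id": 3, "palmPosition": [12.1, 180.4, -3.2] }
      ],
      "pointables": [
        {
          "id": 7,
          "handId": 3,
          "tipPosition": [28.5, 210.7, -14.9],
          "tipVelocity": [4.2, -1.1, 0.3],
          "direction": [0.1, 0.9, -0.4],
          "length": 52.0,
          "tool": false
        }
      ]
    }

Positions appear to be in millimeters relative to the device, as far as I can tell.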
Not yet. The idea is to draw variable-width ribbons in the air, but given my total lack of experience in programmatic 3D modelling, I'm moving very slowly towards this goal.
I have had one since way before they shipped out the SDK, and all I have to say is that it is quite a bit more janky than the hype would suggest.
Fingers will disappear without notice when nothing all that crazy is happening, and the frame rate of the device (which is specced at 120+ fps) is much closer to 45-55 fps. This leads to some major problems with long-term finger acquisition that have to be handled by the developer. It's quite frustrating to do things yourself that should be handled by the SDK.
While I understand that this SDK batch is a "beta/alpha" test, it is much buggier than it should be. The SDK will hang the entire OS quite often, and there is simply no way to detect if the device is actually plugged in. It will report invalid finger data rather than telling you that no device exists.
And the JavaScript API is so new that it is borderline useless. It doesn't even properly report finger width, which is kind of sad since that worked many versions ago.
Overall a cool device with lots of hype, but needs a lot more work to even be mildly useful for anything more than just simple gestures.
It's possible your device has older hardware. I have one from several weeks ago. It does indeed get 120+ fps even with the JavaScript API. Fingers do drop, but I notice that problem less in "precise" mode which is closer to 60 fps.
The JavaScript API doesn't do much for you, but the data is still quite good coming over the WebSocket. I've used it to create some galleries of gestures, and a gesture diagnostic tool. I've been trying to come up with solutions to the problems you've described... if it starts working out I'll release a 3rd-party JavaScript library for better finger permanence and filtering out noisy data.
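Roughly, the idea is nothing fancy: re-associate fingers across frames by nearest neighbour and smooth the tip positions with an exponential filter. Here's a sketch of what I'm experimenting with (it assumes the frame shape above, i.e. pointables with a tipPosition array; the thresholds are guesses):

    // Rough sketch: finger "permanence" + smoothing over WebSocket frames.
    var tracked = {};      // our own stable IDs -> last smoothed tip position
    var nextId = 0;
    var ALPHA = 0.35;      // smoothing factor (1 = raw data, 0 = frozen)
    var MAX_JUMP = 40;     // mm: farther than this counts as a new finger

    function dist(a, b) {
      var dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
      return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    function update(frame) {
      var seen = {};
      (frame.pointables || []).forEach(function (p) {
        // Find the closest finger we were already tracking.
        var bestId = null, bestD = MAX_JUMP;
        Object.keys(tracked).forEach(function (id) {
          if (seen[id]) return;                  // already claimed this frame
          var d = dist(tracked[id].pos, p.tipPosition);
          if (d < bestD) { bestD = d; bestId = id; }
        });
        if (bestId === null) {
          bestId = 'f' + (nextId++);             // brand new finger
          tracked[bestId] = { pos: p.tipPosition.slice() };
        } else {
          var t = tracked[bestId];               // smooth toward the new sample
          t.pos = t.pos.map(function (v, i) {
            return v + ALPHA * (p.tipPosition[i] - v);
          });
        }
        seen[bestId] = true;
      });
      // Fingers we didn't see this frame get dropped; a miss counter with a
      // short grace period would give better permanence than deleting here.
      Object.keys(tracked).forEach(function (id) {
        if (!seen[id]) delete tracked[id];
      });
      return tracked;
    }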
Update to this post: I was able to test on newer Leap hardware and fingers no longer drop or jitter. I don't think a JavaScript library will be necessary to filter noise or track fingers in the production version.
You can skip data-binding, the update pattern and SVG for this kind of step-by-step animation. I use Canvas to render these and just a few d3 functions (scales, extents, json). Here's another example to learn from that uses similar techniques to get good performance out of Canvas:
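If anyone wants to try the same approach, the pattern looks roughly like this. I'm writing it against the d3 v3-era API (d3.scale.linear, d3.extent, d3.json) and a hypothetical gestures.json where each frame has a tip [x, y] position, so adjust to your own data:

    // Canvas + a few d3 helpers: no data-binding, no enter/update/exit, no SVG.
    d3.json('gestures.json', function (error, frames) {
      if (error) throw error;

      var canvas = document.querySelector('canvas');
      var ctx = canvas.getContext('2d');

      // d3 is only used for scales and extents here.
      var xs = frames.map(function (f) { return f.tip[0]; });
      var ys = frames.map(function (f) { return f.tip[1]; });
      var x = d3.scale.linear().domain(d3.extent(xs)).range([0, canvas.width]);
      var y = d3.scale.linear().domain(d3.extent(ys)).range([canvas.height, 0]);

      var i = 0;
      (function tick() {
        ctx.clearRect(0, 0, canvas.width, canvas.height);
        ctx.beginPath();
        // Redraw the path up to the current frame on every tick.
        for (var j = 0; j <= i; j++) {
          var px = x(frames[j].tip[0]), py = y(frames[j].tip[1]);
          if (j === 0) ctx.moveTo(px, py); else ctx.lineTo(px, py);
        }
        ctx.stroke();
        i = (i + 1) % frames.length;
        requestAnimationFrame(tick);
      })();
    });

Redrawing the whole canvas every frame sounds wasteful, but for a few hundred points it's far cheaper than churning that many SVG nodes through the update pattern.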
My big question is really a paraphrase of Douglas Adams -
"Radio had advanced beyond touchscreen and into motion detection. It meant you could control the radio with minimal effort, but you had to sit annoyingly still if you wanted to keep listening to the same channel."
I can see it working like the Kinect - really useful in a specific and narrow use case - but there is a reason we use pens and not paintbrushes for writing. Similarly, this does not seem like a tool that is easy to use for day-to-day tasks.
If you have an informed (hands-on) opinion to the contrary, I would be very interested.
The main problem there is that, since all the information used by the Leap is continuous in nature, there is essentially no (foreseeable) way to unequivocally convey the intended start and end of a gesture.
I personally imagine an interface not unlike a wand: people might carry around a pointing stick of some kind, with a button built into the handle to mark where a gesture starts and ends (at least until we can start doing crazy things like gauging intent - what if the computer only responded to your motions if it knew you were looking at it?).
Try to move away from the PC analogy. Imagine a 30" display, running iPad applications. Your input consists of the exact same gestures, but now in the air instead of touching the screen.
Painting, drum-machines, games and many more applications come to mind. DJs moving from turning knobs on impenetrable devices to conducting a live mix on a huge ass screen? Yes, please.
It seems this would be useful for interpreting sign language - and may be another GUI for talking to a computer - but if you can sign, you can type... Although I can see the value of accurate gesturing in a room to, say, turn off the lights.
If it is that precise, I think gestures with fewer movements are best. To get the same value as a mouse, the best scheme for me would be a finger that points and moves the cursor, with the middle finger touching the thumb to close a circle while the gesture is being drawn. When you open the "circle" (slightly detach the middle finger from the thumb), the gesture is complete and can be interpreted.
In the same way you could simulate single and double clicks, always using the thumb as the button, and you could use the other fingers against the thumb to get a right click or more buttons. Maybe it's even possible to simulate the scroll wheel by sliding a finger along the surface of the thumb.
If you try those movements you will see how easy and natural they are to adopt. Imagine dragging and dropping a window on a desktop: point at the title bar, close the circle to "click", drag the window, and finally open the circle to release. This also makes it easy to layer on the normal drawing-based gestures that already exist for mice, and to mix in movements of the other hand for more complex manipulations like rotation and zoom if needed.
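A sketch of how that thumb-and-middle-finger "circle" could be detected, assuming you can get per-frame tip positions for both fingers (the inputs and thresholds here are mine, not anything from the SDK):

    // Hypothetical pinch-as-click: the "circle" is closed when the thumb and
    // middle-finger tips come within CLOSE_MM of each other.
    var CLOSE_MM = 15;        // close the circle
    var OPEN_MM = 25;         // open it again (hysteresis so it doesn't flicker)
    var circleClosed = false;

    function distance(a, b) {
      var dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
      return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // thumbTip and middleTip are [x, y, z] positions in mm for the current frame.
    function updatePinch(thumbTip, middleTip, onDown, onUp) {
      var d = distance(thumbTip, middleTip);
      if (!circleClosed && d < CLOSE_MM) {
        circleClosed = true;
        onDown();             // "click" starts: begin the drag
      } else if (circleClosed && d > OPEN_MM) {
        circleClosed = false;
        onUp();               // circle opened: the gesture is complete
      }
    }

The two thresholds are what would make dragging a window feel stable; with a single threshold the "button" chatters right at the boundary.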
People were skeptical about the mouse, but that was before anybody really knew how GUIs would take off. I'm not saying this is a similar advance, but it's very hard to predict what kinds of novel interfaces or applications might be developed.
I'm sure someone probably said something similar when a touch-screen phone was proposed ("But what happens when it's in your pocket??"). I, ahem, do not have hands on experience (so am uninformed, I guess), but it's easy for me to see the large number of huge opportunities.
I think there's a fundamental difference between this, which is interacting with an empty 3D space, and all other modes of input such as keyboards, mice, touchpads and touchscreens which are interacting with a physical 2D plane.
My understanding is that the area where the Leap Motion detects movement is fairly small, so leaning back and throwing your hands in the air would not cause problems.
I miss one of these every time I'm cooking or doing other things that leave my hands dirty or unavailable. I'd love to have something like this so I can answer the phone (Skype!), look at recipes/howtos, or change playlists while elbow-deep in batter, or while still holding a hot soldering gun and a fiddly piece of kit.
So, as much as Gorilla Arm would be a problem for everyday/all-day use of no-touch gestural interfaces, they are a great solution for existing problems.
Suggestion for the OP - read more about computer vision.
Extracting gestures is indeed a problem. Most of the approaches I know depend on a state triggered by the appearance of a new input (in the video, when you add or remove a finger) and then work by doing a temporal sum of the movement to get a shape.
This of course introduces problems with how fast or slow the person draws the shape in the air - unless you reset when a finger is added, when a finger is removed (as explained before), OR when you have just detected a gesture boundary - I don't mean identified the gesture, but a quick deceleration of the finger followed by a short immobilization can reset the "frame of reading".
You may or may not have successfully captured what came before that point, but a human will usually stop and try again, so you eventually land on the right "frame of reading".
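In code, that "frame of reading" reset comes down to watching tip speed: a drop in speed held for a short dwell ends the current stroke, and a burst of speed opens the next one. A rough sketch, with made-up thresholds and the assumption that you get (or can compute) a tip speed per frame:

    // Segment strokes by speed: below SLOW mm/s for DWELL_MS ends the current
    // stroke, above FAST mm/s starts a new one.
    var SLOW = 50;          // mm/s: "the finger has stopped"
    var FAST = 150;         // mm/s: "the finger is drawing again"
    var DWELL_MS = 200;

    var drawing = false;
    var slowSince = null;
    var stroke = [];

    function onSample(tipPosition, speed, timestampMs, onStroke) {
      if (!drawing) {
        if (speed > FAST) {          // movement started: open a new stroke
          drawing = true;
          stroke = [tipPosition];
        }
        return;
      }
      stroke.push(tipPosition);
      if (speed < SLOW) {
        if (slowSince === null) slowSince = timestampMs;
        if (timestampMs - slowSince > DWELL_MS) {
          drawing = false;           // deceleration + short immobilization
          slowSince = null;
          onStroke(stroke);          // hand the finished stroke to a recognizer
        }
      } else {
        slowSince = null;
      }
    }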
I've done a little work (a computer vision MA thesis) on using Gestalt perceptual grouping on 3D+t (video) imagery. The goal was automating sign language interpretation (especially when shapes are drawn in the air, something very common in French Sign Language - and therefore, I suppose, in American Sign Language, considering how close they are linguistically).
However, we were far from that in 2003, and we used webcams only. A lot of work went into separating each finger, which depends on many things, such as its position relative to the others: at the end of the row you have either the index or the pinky, and you can guess which one if you know which hand it is and which side is facing the camera.
I don't think it is, or even was, that innovative. I've stopped working on that, so I imagine there have been lots of new and innovative approaches since. So once again: go read more about computer vision. It's fascinating!
I'd be happy to send anyone a copy, but it's in French :-)
It's innovative because it's a solid implementation. You can write all the papers you want on the subject, but until somebody creates an accurate, low-cost application of it, it's as good as nothing.
This has been tried many times before, but this is the first product I've seen that is accurate, extensible (with an SDK), and offered at a decent cost.
This is apparently a slick hardware implementation, accurate and all.
But after reading the OP's article, you'll realize that all he did was play with the hello-world-style demo that comes with the SDK and create nothing (and, BTW, break the SDK agreement by giving too much detail about how it works, since he couldn't do anything else with it - oh, I forgot, he also complained about the bent mini USB cable).
So no, I would not call what he did innovative by any stretch of the imagination. Using the algorithms I proposed, he could have done something better than hello world. At least I had the excuse of having no real hardware implementation - I would have killed for something like that Leap thingy.
I had the opportunity to get some hands-on time with the Leap before Christmas. Leap Motion sponsored a hackathon at my school and brought in dev boards for everyone to borrow, although they were not the finished product like the one pictured in this article.
I cannot tell you how incredible this product is. I'm a first year CS student, and I've never done anything even remotely close to gesture tracking before. But at the end of the night, I was able to play rock paper scissors with my computer. The API is that simple to use.
Yet, as mentioned, it's incredibly accurate. One of the biggest bugs we faced was that even when we thought our hands were still, the device registered imperceptible movements which were translated into false moves.
Overall it's a great product, especially for the price.
There's a paper [1] that looks back at a decade of gesture tracking research and talks about the end system: "an automatic Australian sign language (Auslan) recognition system, which tracks multiple target objects (the face and hands) throughout an image sequence and extracts features for the recognition of sign phrases." which achieves "Using a known grammar, the system achieved over 97% recognition rate on a sentence level and 99% success rate at a word level."
While this is about the extraction of finger gestures from 2D video, many of the techniques translate forward and are applicable to 3D point cloud data (if that is what the raw leap motion device exposes).
I also have one of these through the developer program. It is very neat, but it does not seem ready for prime time. When it is locked on to your fingers, it is fast; shockingly and amazingly fast and accurate. However, it drops fingers all the time; it almost never gets thumbs.
Even doing the calibration, I could never get to all four corners of a 24-inch screen. I suspect they will get it ironed out in the end; it does seem like AI/software issues rather than hardware issues.
I will say that when it's working, it's a really magical feeling. It feels accurate, like I am truly controlling something; but beyond that, it didn't feel nearly robust enough for real-world use.
Seriously. I have a very niche (but killer--at least for me) application for a Leap Motion–Oculus Rift combo. I can't wait to give it a go, but I'm waiting for the consumer versions to be released first.
What people never seem to get about Minority Report is that it's not cool because he used his hands.
It was cool because it had kick ass software behind it that could do all the work.
I see this constantly with things like glass/mirrors that can be touch screens, etc.
They look awesome because the (imaginary) software demoing it does awesome things.
Leap could be a cool device, but you’ll need to think outside the box to see how.
My fat ass is not going to wave at anything it can do with a mouse, let alone give up the speed at which we can type/shortcut/mouse compared to physical movement.
Personally, if I were a developer, I'd look at totally new things.
This gives me some good/interesting ideas. You could mount it under a cabinet or on a wall to control the lights/music/alarm system/etc... Or mount it to your shirt to control your cell phone + bluetooth headset. Mount it to your door / car and use gestures as keys. Build it in to the armrest of your couch so you can control your entertainment center.
Stop thinking about it as a computer accessory and start thinking about it as a general UI device that can be used in everyday situations.
It also might be something that works well with a mouse or touchscreen, redesigned with the new interaction in mind.
For example, most gamers would say 3D shooters work well from the keyboard. However, if you could just point your index finger and then simulate the recoil motion (saying 'poof' would be optional), I think one could make a nice shooter (probably a bit slow, as the software could only register a shot after it detected the 'recoil', but if all users suffer from that, it can be designed around).
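For what it's worth, the "recoil" part is easy to sketch if the API gives you a finger direction and a tip velocity (the names and thresholds below are hypothetical): fire when there is a sharp burst of motion backwards along the pointing axis.

    // Hypothetical recoil-to-fire: a quick jerk backwards along the finger's
    // pointing direction counts as pulling the trigger.
    var RECOIL_SPEED = 300;   // mm/s of backwards motion along the finger axis
    var COOLDOWN_MS = 250;    // don't fire twice off one jerk
    var lastShot = 0;

    function dot(a, b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }

    // direction: unit vector the finger points along; tipVelocity in mm/s.
    function maybeFire(direction, tipVelocity, timestampMs, fire) {
      var along = dot(tipVelocity, direction);   // negative = moving backwards
      if (along < -RECOIL_SPEED && timestampMs - lastShot > COOLDOWN_MS) {
        lastShot = timestampMs;
        fire();
      }
    }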
Here's my big question on the interfaces people are making for the leap motion:
Everyone seems to be trying to replicate iPad gestures in 3D, e.g., point your finger at something, drag something around by pointing.
How about instead we create a virtual hand that shows up on your screen, mirrors the movements of your real hand, and lets you use it to interact with objects on the screen?
I just think it would be awesome to be able to virtually reach into your screen and move things around! And it seems like it would be quite intuitive, no?
As some examples: I'm picturing moving your hand to the top of a window, making a grabbing motion to grab its title bar, and moving it around. Grab the corner of a window to resize it. You could even have the hand type on a virtual keyboard shown on the screen. What do you guys think?
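One rough way to wire that up, assuming you can read a palm position and a visible-finger count per frame (toScreen, scene.objectAt, scene.moveObject and the rest are placeholders for whatever windowing layer you have):

    // Hypothetical grab-to-drag: an open hand moves the on-screen hand model,
    // closing it (few visible fingers) grabs whatever is under it.
    var held = null;    // the on-screen object currently grabbed, if any

    function updateVirtualHand(palmPosition, fingerCount, scene) {
      var cursor = toScreen(palmPosition);        // map mm -> screen coordinates
      scene.moveHandModel(cursor);
      var closed = fingerCount <= 1;              // rough "fist" heuristic
      if (closed && !held) {
        held = scene.objectAt(cursor);            // grab: title bar, corner, key...
      } else if (!closed && held) {
        held = null;                              // open hand: release
      }
      if (held) scene.moveObject(held, cursor);   // drag while the hand is closed
    }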
It feels like you're taking a 3D interaction and smashing it back down into 2D. While neat, it's unlikely to catch on.
What would be amazing is using this to interact with a 3D display system. Not the 3D-in-a-box that you get with TVs, but something closer to what we usually think of as holograms.
Holograms are probably a ways off (though they are doing interesting things with targeting light at retinas).
The Oculus Rift, though (as others have mentioned), may be able to do something truly awesome when paired with an interface like this.
Both devices are incredibly exciting. If they manage to mesh up...
Good point. I think this would complement the Oculus Rift perfectly. You could actually use your hands to interact with the environment with centimeter precision! Let's hope someone thinks of this.
I'm excited about the Leap because I'm constantly moving my hands from the keyboard to the mouse and back again. That takes a lot of time, and the mouse is hard on my wrists. I would much rather "mouse in the air", even if I lost some precision. My productivity would soar.
Furthermore, my pinkies and thumbs take a beating pressing the command, control, and shift keys. I would much rather wave my thumb or pinky in a particular direction to get those modifiers. This may not be possible with the current Leap, but will no doubt be possible soon.
It seems like a fit for rough 3D sculpting, but not CAD work, in my opinion. CAD generally requires inputting precise dimensions and perfect angles. Artists and sculptors however, will have fun making 3D models by hand.
It's nice. Very simple: plastic case, black plastic on top, black rubbery base on the bottom. Two USB ports on the side, but they are hardly noticeable. Overall, it's one little brick, and there are no moving parts, so construction is very simple and works very well.
Is it really USB 3.0? That sounds ... over-specced to the extreme, especially considering that USB 3.0 ports are still quite rare. I couldn't find any indication on Leap's site (or Wikipedia) that the Leap uses 3.0.
The Leap uses quite a lot of bandwidth, and yes, I'm sure it's USB 3.0 because it uses a micro USB 3.0 port, which is backwards compatible with micro USB 2.0 but physically longer.
The device itself is very small and well-made. It could do with being mounted on something, as it easily moves around the table if you pull on the USB lead.
I have a developer unit and can confirm: it's legit. I don't want to go into too much detail because of the developer agreement, but think it is best characterized as like a "useful, portable, affordable, easy to integrate kinect". I'm having a lot of fun messing around with it, though I don't have much to share yet as other work has taken precedence. SOON.
I spent a week or so playing with a Leap device recently, and had a great time developing a sort of Minority Report style interface for some display screens in our office. I did end up writing all of the gesture recognition from scratch, which was a bit more difficult than I expected. :)
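For anyone curious what "from scratch" looks like at its simplest, a swipe detector can be as small as a sliding-window check like this (the thresholds and the palm-x input are guesses on my part, not anything from the SDK):

    // Minimal left/right swipe detector over a sliding window of palm positions.
    var WINDOW_MS = 300;
    var MIN_TRAVEL = 120;     // mm of horizontal travel within the window
    var samples = [];         // { t: timestamp in ms, x: palm x in mm }

    function onPalm(x, t, onSwipe) {
      samples.push({ t: t, x: x });
      while (samples.length && t - samples[0].t > WINDOW_MS) samples.shift();
      var travel = x - samples[0].x;
      if (Math.abs(travel) > MIN_TRAVEL) {
        onSwipe(travel > 0 ? 'right' : 'left');
        samples = [];         // reset so one swipe only fires once
      }
    }

Most of the difficulty is in tuning those numbers so it neither misses swipes nor fires on ordinary hand movement.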
Not so much a touch as a point-and-gesture deal. Touch screens force you to have the screen close, or you're extending your arm too far to be comfortable. The Leap would let you "touch" (in quotes because you don't touch anything) where the screen would be if it were close, and have that affect the screen as if you had touched it. So it's fine-motor-skill gesture based (unlike Kinect et al.), and yet you don't compromise on the focus/size of the screen.
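One way to get that "touch without touching" behaviour is a virtual plane floating in front of the device: a fingertip crossing it is a touch-down, pulling back past it is a touch-up. A sketch, with a made-up coordinate convention (z pointing from the screen toward the user):

    // Hypothetical virtual touch plane at z = TOUCH_Z (mm in front of the device).
    var TOUCH_Z = 0;        // crossing this toward the screen is "touch down"
    var RELEASE_Z = 20;     // hysteresis so hovering near the plane doesn't flicker
    var touching = false;

    function updateTouch(tipPosition, onDown, onMove, onUp) {
      var x = tipPosition[0], y = tipPosition[1], z = tipPosition[2];
      if (!touching && z < TOUCH_Z) {
        touching = true;
        onDown(x, y);
      } else if (touching && z > RELEASE_Z) {
        touching = false;
        onUp(x, y);
      } else if (touching) {
        onMove(x, y);       // behaves like dragging a finger across a touchscreen
      }
    }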
Yes, as far as I understand it, it's an IR LED light source with a camera sensor and some simple optics. I don't know if it has stereo vision or not, although even if it did, since it is such a small unit, the spatial separation between the two views would be quite small, which would make the advantages of stereo fairly negligible in terms of occlusion, etc.
Here are some images of an early version. Basically it's two webcam-grade cameras and 3 IR LEDs. http://monleap.fr/537-kit-developpement-leap-motion-photos/ That's pretty much it. It makes me wonder why people are getting so excited about this ...
I think it's the opposite. At university, we wrote software for a tablet (like a Wacom) for users with Parkinson's. Because we had full information about the movement (speed, pressure, trajectory), not just the result (as in OCR), we could interpret the input better.
BTW, this post reveals a bit more information than Leap Motion would like developers to reveal. Essentially, OP broke the agreement.