Kinect reverse-engineered; open driver available (adafruit.com)
369 points by jgrahamc on Nov 10, 2010 | 85 comments


The code for the camera.c is here: http://git.marcansoft.com/?p=libfreenect.git;a=blob;f=lib/ca...

I've always wondered how people reverse engineer these things. Do they just guess what the interface might be based on the chips? Or are they able to probe it somehow through the port?


Here is a very nice metaphor by Andrew Tridgell (from Samba fame) on the subject:

http://www.samba.org/ftp/tridge/misc/french_cafe.txt

"I call this method the "French Cafe technique". Imagine you wanted to learn French, and there were no books, courses etc available to teach you. You might decide to learn by flying to France and sitting in a French Cafe and just listening to the conversations around you. You take copious notes on what the customers say to the waiter and what food arrives. That way you eventually learn the words for "bread", "coffee" etc.

We use the same technique to learn about protocol additions that Microsoft makes. We use a network sniffer to listen in on conversations between Microsoft clients and servers and over time we learn the "words" for "file size", "datestamp" as we observe what is sent for each query.

Now one problem with the "French Cafe" technique is that you can only learn words that the customers use. What if you want to learn other words? Say for example you want to learn to swear in French? You would try ordering something at the cafe, then stepping on the waiters toe or poking him in the eye when he gives you your order. As you are being kicked out you take copious notes on the words he uses.

The equivalent of "swear words" in a network protocol are "error packets". When implementing Samba we need to know how to respond to error conditions. To work this out we write a program that deliberately accesses a file that doesn't exist, or uses a buffer that is too small or accesses a file we don't own. Then we watch what error code is returned for each condition, and take notes. "


Hector said in the video that he doesn't even have an Xbox. I guess that's like turning up to the cafe five minutes after opening time when it's just you and the waiter.


I don't think the Xbox would be of much use in that case. Can he run a USB sniffer on an Xbox? It makes more sense to connect it to his PC and use one of the hundreds of USB sniffers available.


The guy actually says in the video "this is not a sniffer, this is not a man in the middle", which is quite incredible. Just a laptop running Linux.

Look at his desk: a complete mess of wires and hardware and a single Rubik's Cube. Total hacker :-)


I loved this book as a kid: http://www.amazon.com/Reversing-Secrets-Engineering-Eldad-Ei...

It's probably a bit out of date now, but my dog-eared copy is still a good read. Ah, nostalgia. There once were days when I dreamed that a CS degree would make me as a god; the silly thoughts of a child. Now I know that it is a _PhD_ which makes gods of men.


Now I know that it is a _PhD_ which makes gods of men.

I know you're joking, but...

When I was halfway through my Ph.D. I formulated a hypothesis: The proximate challenge that keeps you from graduating is that you have to write a thesis. But the ultimate challenge to getting your Ph.D. is this: You somehow have to learn to understand, deep down, that all your romantic notions about the Ph.D. are bunk, that you will be exactly the same person on the day after you get it that you were the day before, and that you need to stop waiting for the day when you feel like a god and just write something down and get on with life.

It may take you years to accept this, and it may drive you to drink, but after you get to that point you can graduate.

Only then will you be able to live with the fact that your thesis looks like crap to you. Your thesis will always look like crap to you. Either you will have figured out absolutely everything and your thesis will look incredibly boring to you, because you've moved on, or -- vastly more likely -- your thesis will look woefully incomplete because, geez, there is so much that you couldn't figure out, and you're just so stupid!

Or, most likely of all, you will think both of these things at the same time.

Similarly: Being the world's foremost expert on a particular scientific problem is a lot less exciting in real life than it seems in the movies. In fact, being on the frontier of science feels like being totally, hopelessly lost and confused. Why this came as a surprise to me I'll never know.


When I went to do mathematics at Chicago I figured I was the smartest person alive. There I was, facing the gargoyles of my dreams; a poor kid aspiring to a better life by shrugging off the accent I was born into and the mentality of defeat so common among the poor. But I had gone too far, become too confident and failed horribly. I was sure that the world had failed--I was too good--and that everything was bullshit. I left, walked away from a full scholarship because I had overcome the constraints of my life before and Mathematics and University were no different. I took a job at a small software shop in Portland, OR instead, enrolling part time at PSU doing computer science.

I failed at both, as you might expect. The world wasn't wrong, I was. While I could program, I had no discipline. While I had intellect, I had no ability to learn. The world was not wrong, I was. All of my anger and suffering and frustration were my fault. From the defeat of my new University and my new job I learned that my romantic notions of most things were not reality. Enrico Fermi, on whose stairway I bounded up, did not simply decide to conjure nuclear fission under what is now a library. He worked for years, a thing which I had never done.

The novice says to the master, "Coal is black." The master replies, "No, it is not."

The intermediate says to the master, "Coal is not black." The master replies, "Of course it is."

The masters say among themselves, "It is coal."

I hold no romantic notions as I held when I were a boy; I have not become a cynical man. Life is suffering and pain. Life is joy and love. I have built a business from nothing and sold it for a profit. I am now very poor. Life is life and that is beautiful. What we learn, what we truly learn, we so incorporate into our being that we cannot perceive it as unknown to all. We are the streams into which a man steps: never the same, yet always the same.

To gain mastery over the frontier of science is to gain mastery over nothing, over one's self. It is confusion and pain and truth and beauty.


You wrote all that as a 3rd level reply to an offhand comment in a random thread? Wow. This is the reason I keep coming back here.

Just yesterday I saw so much negativity and pettiness on another thread that I had pretty much written off HN as a lost cause.

Your post brought me back. Thanks!


It came as a surprise because up to that point, someone had the answer. Even if you had great teachers and even if you're a problem solver...at the end of the day, someone had done what you were doing.

I think that's why most people I know that are on "the cutting edge" are very humble: either they got "it" right and know 40 people just as smart that went in one of 40 equally promising directions and got it wrong. Or they're one of the 41 people still trying to figure out just where the heck they can go from this apparent dead-end.

Then along comes 42...


The hopelessness and confusion that come at the frontier of science are precisely why I stopped studying biochemistry for my bachelor's degree. By the first 300-level course, people begin asking relatively simple questions whose answers are not yet known to mankind. It freaked me out. I couldn't imagine ever discovering new knowledge and subsequently dropped out and into computer science...


Ah, but the spice of life is staring into the Abyss of Unknowing and recognizing your very self in it! The most beautiful questions of mathematics and computer science so very often start out "Does there exist..." and we are left with no answer other than, "Who knows!" The world is wide and strange and we are very small indeed. That is beautiful to me.


Hmmm... maybe that's the upside of a master's degree. You have that "come to jebus" moment the day after you graduate, but without all the nasty research and writing.


Your whole hypothesis is in total agreement with this: http://matt.might.net/articles/phd-school-in-pictures/


Can I get a Reverse Engineering PhD?


You can study molecular biology and become an expert at reverse-engineering the most amazing machines in existence.

Though the Kinect is apparently a lot more tractable.


I've been writing some code for people working in bioinformatics recently. It's pretty similar.


I've looked into this a bit. There are three programs that I'm aware of:

The BitBlaze project at UC Berkeley. http://bitblaze.cs.berkeley.edu/

There's also http://www.cs.kent.ac.uk/people/staff/amk/ which offers a PhD studentship in "Reverse Engineering for Security."

CERIAS at Purdue will definitely have some RE related courses, e.g. http://www.cerias.purdue.edu/site/projects/detail/malware_re...


You can probe USB commands, but that takes a lot of time. It's much faster to connect a USB analyzer and then 'replay' to see what commands do what.

http://www.adafruit.com/blog/2010/11/09/kinect-hacking-video... (demo of analyzer)

http://github.com/adafruit/Kinect (USB log dump)


I've reverse engineered some protocols -- haven't done a USB one yet, but I'm sure the principles are similar. Grab some data, look at it (usually with a good hex editor -- last time I used ghex2), look for patterns. Usually there's some sort of packet structure, or maybe some data that looks like something in ASCII -- ghex2 shows you what each 2 or 4 bytes would be if read as signed/unsigned ints, floats, etc. Usually something will jump out at you.

It's a fun game usually.
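
The "what would these bytes be as an int or a float" view described above can be approximated in a few lines of Python. This is only a sketch: the `interpret` helper, the `FORMATS` table, and the sample bytes are made up for illustration, and it assumes little-endian data, which a real capture may not use.

```python
import struct

# Hypothetical table of interpretations to try at each offset.
# "<" means little-endian; that is an assumption, not a fact about any device.
FORMATS = {
    "u16": "<H",
    "i16": "<h",
    "u32": "<I",
    "i32": "<i",
    "f32": "<f",
}


def interpret(data: bytes, offset: int) -> dict:
    """Decode the bytes at `offset` as every type that still fits in `data`."""
    values = {}
    for name, fmt in FORMATS.items():
        size = struct.calcsize(fmt)
        if offset + size <= len(data):
            values[name] = struct.unpack_from(fmt, data, offset)[0]
    return values


# Made-up sample bytes, not taken from any real device capture.
sample = bytes.fromhex("e8030000" "0000803f")

for off in (0, 4):
    print(off, interpret(sample, off))
```

Scanning a dump offset by offset this way makes round numbers and plausible floats stand out, which is roughly the "something will jump out at you" step.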


And if you can actually input data using the protocol, you can take some standard packets and tweak a byte/short/long at a time and see what changes.
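
A minimal sketch of that "tweak one byte at a time" loop, assuming you already have a known-good packet. The packet bytes are invented, and `send_and_observe` is a hypothetical placeholder: how you actually send data and watch the result depends entirely on the device and transport.

```python
from typing import Iterator, Sequence, Tuple


def single_byte_variants(
    packet: bytes, values: Sequence[int] = (0x00, 0xFF)
) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, mutated_packet) pairs with exactly one byte changed."""
    for offset, original in enumerate(packet):
        for value in values:
            if value == original:
                continue  # skip mutations that would leave the packet unchanged
            mutated = bytearray(packet)
            mutated[offset] = value
            yield offset, bytes(mutated)


def send_and_observe(offset: int, packet: bytes) -> None:
    # Hypothetical placeholder: here it only prints what would be sent.
    print(f"offset {offset}: {packet.hex()}")


known_good = bytes.fromhex("01020304")  # made-up packet for illustration
for offset, variant in single_byte_variants(known_good):
    send_and_observe(offset, variant)
```

Changing one field at a time keeps the cause of any change in behaviour unambiguous; the same generator could be widened to shorts or longs by mutating slices instead of single bytes.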

Reversing USB is the same as reversing any protocol on top of TCP, which is the same as reversing any other protocol, just with different tools.

I wish there were an open source hex editor like Hex Workshop for Windows - one of the features I loved was tagging a section of bytes with comments, and being able to use those same tags across multiple data dumps.
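
The tagging idea is simple enough to prototype outside an editor. The sketch below is not how Hex Workshop works; `Tag`, `annotate`, the field names, and both dumps are hypothetical, and it assumes the tagged fields sit at the same fixed offsets in every dump.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Tag:
    start: int    # byte offset where the tagged region begins
    length: int   # number of bytes in the region
    comment: str  # free-form note about what the bytes seem to mean


def annotate(dump: bytes, tags: List[Tag]) -> List[Tuple[str, str]]:
    """Return (comment, hex) rows for each tag applied to one dump."""
    return [
        (tag.comment, dump[tag.start:tag.start + tag.length].hex())
        for tag in tags
    ]


# Made-up tags and dumps for illustration only.
tags = [Tag(0, 2, "magic?"), Tag(2, 4, "length or counter?")]
dump_a = bytes.fromhex("aabb01000000")
dump_b = bytes.fromhex("aabb02000000")

for name, dump in (("a", dump_a), ("b", dump_b)):
    for comment, hexbytes in annotate(dump, tags):
        print(name, comment, hexbytes)
```

Keeping the tags separate from the data is what lets the same notes be reused across multiple dumps.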


Particularly handy for USB on Linux is the 'usbmon' module. With a kernel that has debugfs support, you can mount debugfs and use a new enough Wireshark to monitor the USB traffic.

http://wiki.wireshark.org/CaptureSetup/USB
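
If you read usbmon's text output directly rather than through Wireshark, the leading fields of each line are easy to split apart. The sketch below assumes the text format as I understand it from the kernel's usbmon documentation (URB tag, timestamp in microseconds, event type, then an address word like `Ci:1:001:0`). The `UsbmonEvent` class and `parse_line` function are hypothetical names, the sample line is illustrative, and everything after the address word is left unparsed.

```python
from dataclasses import dataclass
from typing import List

# First letter of the address word: transfer type.
URB_TYPES = {"C": "control", "Z": "isochronous", "I": "interrupt", "B": "bulk"}


@dataclass
class UsbmonEvent:
    tag: str           # URB identifier
    timestamp_us: int  # timestamp in microseconds
    event: str         # "S" submission, "C" callback, "E" submission error
    transfer: str      # control / isochronous / interrupt / bulk
    direction: str     # "in" or "out"
    bus: int
    device: int
    endpoint: int
    rest: List[str]    # remaining words, not interpreted by this sketch


def parse_line(line: str) -> UsbmonEvent:
    """Split one usbmon text line into its leading fields."""
    tag, timestamp, event, address, *rest = line.split()
    type_dir, bus, device, endpoint = address.split(":")
    return UsbmonEvent(
        tag=tag,
        timestamp_us=int(timestamp),
        event=event,
        transfer=URB_TYPES[type_dir[0]],
        direction="in" if type_dir[1] == "i" else "out",
        bus=int(bus),
        device=int(device),
        endpoint=int(endpoint),
        rest=rest,
    )


# Illustrative line in the documented shape, not from a Kinect capture.
line = "d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 <"
print(parse_line(line))
```

Filtering parsed events by bus and device is a quick way to isolate one device's traffic before looking at payloads.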


You could write your own by extending OSS like hexdump or hexcurse or any other hex editor. It shouldn't be too big a deal to add that functionality. It's just beautiful what you can do with open source code! Don't forget to share.


As others have pointed out, in this case he probably used a hardware USB sniffer (though he apparently doesn't own an XBox, so who knows). If your device has existing drivers that will run on an OS in a VM, you can use USB passthrough (most virtualisers including VirtualBox) to run it on top of Linux, which comes with a USB logging module called "usbmon" (http://www.mjmwired.net/kernel/Documentation/usb/usbmon.txt).

There are also software loggers for Windows such as "Snoopy Pro" but last time I tried that it dropped certain types of packets. No such issues with usbmon.


Wireshark has had support for usbmon for a few releases now too, which gives you a considerably more useful GUI for working with USB streams.

Sadly, there aren't many (any?) protocol dissectors yet, so comms with common chips like FTDI devices don't automatically translate into something human-readable like, say, HTTP conversations.


Really? That's sort of baffling. FTDI's USB chips are damn easy to work with, I'd expect dissectors for them to be out there already. I'll have to remedy that.


The README says he used the great big (~500 MB) data dump of the USB data that Adafruit released the other day.


You use a USB sniffer to observe traffic between the host and the device and make guesses about how to interpret the data.


Exactly this.

I'm knee-deep in a personal project that involves reverse-engineering a USB device; since the only drivers for the device are for Windows, the solution was to virtualize Windows on a Linux host (presenting the USB device to the guest OS), fire up wireshark on the host OS (using the usbmon kernel module), interact with the Windows software and drivers as per usual, and capture anything the guest OS sends to the device for analysis.

Really simple stuff; much easier than following this stuff with scopes or logic analyzers.


Guessing the interface is like shooting in the dark. More likely is someone had access to a USB snooper like the USB Beagle - http://www.totalphase.com/products/beagle_usb480/


My last reverse engineering project consisted mostly of hours sitting at Chipotle with hex printouts and a highlighter. That was a file format though, not a protocol.


So much for "With Kinect, Microsoft built in numerous hardware and software safeguards designed to reduce the chances of product tampering" ... :-)


Microsoft had already started backpedaling with its "that's not hacking" comment to Gamespot.

"Kinect for Xbox 360 has not been hacked--in any way--as the software and hardware that are part of Kinect for Xbox 360 have not been modified. What has happened is someone has created drivers that allow other devices to interface with the Kinect for Xbox 360. The creation of these drivers, and the use of Kinect for Xbox 360 with other devices, is unsupported. We strongly encourage customers to use Kinect for Xbox 360 with their Xbox 360 to get the best experience possible."

http://www.gamespot.com/xbox360/sports/thebiggestloserultima...


Yeah. I thought it was some exec shooting his mouth off the first time. There didn't seem to be much of a traditional DRM reason to lock the Kinect down. So the most obvious answer is it wasn't.


At the start of a product cycle, game consoles are typically sold at a loss. Manufacturers recoup this loss through licensing fees paid by game publishers.

If Microsoft is selling the Kinect at a loss, then Microsoft will be motivated to keep it locked down to ensure that they can recoup the loss through licensing fees.


MS actually makes money on each Kinect sold. See http://www.mcvuk.com/news/41487/Kinect-priced-for-profit

So in theory each Kinect sold for hacking is money in MS's pocket.

Given that MS has bought a slew of patents all around this space, they may be playing coy here.

"Please don't hack this... please don't. Oh drats, you figured out a way around our advanced hack-proof system. But what can you do with it? Nothing probably, you surely can't think of some really cool ideas that we haven't... Oh you can give some additional hints to your Roomba from the couch with hand gestures... go on..."


I don't think this is a very convincing argument. I don't think there have been many examples of people building games for the PC using the Wii controller. I suspect that the same goes for the Kinect. The vast majority of people buying this will be people buying it for the XBOX, with a small proportion of geeks using the open source driver for interest's sake, but not as the sole purpose of buying it.


Sony suffered a bit with the PS3 being used by universities as cheap supercomputer clusters. I believe from memory the PS3 had a very low attach rate (the number of games purchased per console). Though from Sony's viewpoint they are also trying to move TVs and ensure the success of Blu-ray, so perhaps they were happy with a low rate.


The Kinect isn't a game console. It's possible that they are selling it at a loss, but not likely. That said, Nintendo doesn't sell game consoles at a loss (at least they didn't with the Wii). I think that by 'typically' you mean that Microsoft and Sony do it. I have no idea if Sega or ATARI ever did that for their consoles.


The Kinect isn't a game console, yes. It is still a significant piece of hardware though, and there is quite a bundle of technology inside it. Just because it is an "Add On" device does not mean they priced it as a single use device. From what I've heard it requires its own developers kit additional to the standard kits sold now, and is unique to the way Microsoft usually deals with developer extras. It sounds like they are investing in this taking off and the licensing of game studios, and the sales of those games, to subsidize the product in some degree.

To answer the second question, I know the Dreamcast was sold at a loss. The Nintendo 64 and GameCube were as well, but Nintendo decided that the Wii could be a success if they tried something new instead of trying to be bigger and better.


Selling the Wii above cost may not be standard industry practice, but it was standard practice for Nintendo. The GameCube was the only Nintendo console that was ever sold at a loss. And it was a small loss for a very small period of time. The N64 definitely wasn't -- why do you think Nintendo used a cartridge system while everybody else was going for rotating discs?

http://www.actsofgord.com/Proclamations/chapter02.php


Given these two statements by MS, I wonder what threats they do care about. I don't understand what "tampering" means in this context.


Given these two statements by MS, I wonder what threats they do care about.

Getting laid off before retirement.


This is a misleading title. The Kinect sensor has been hacked, but Kinect proper is a combination of hardware and software. Arguably the more interesting aspects of Kinect are in the software.

That being said, this is still pretty cool, it'll be interesting to see what people come up with using this technology.


Honestly the hardware is more interesting than the software. Yeah, skeleton tracking is cool, but the hardware is capable of so much more. 3D scanning, mapping, and localization for robots come to mind. With the microphone array built in, Kinect is basically begging to be the eyes and ears of a robot.


Right now the Kinect drivers can't do anything that a pair of cheap USB webcams can't. The only advantage seems to be the on-board depth processing instead of having to use something like OpenCV. http://opencv.willowgarage.com/wiki/


Completely wrong. Kinect is not a stereo vision system. It's the first consumer application of a completely new class of cameras which directly measures depth by the time of flight of a reflected laser beam. It's literally a 320x240 array of laser rangefinders. It's called a flash LIDAR camera, and the quality and robustness of the depth data is far beyond anything achievable from stereo vision today even with high-end cameras, let alone webcams.

edit: Actually, there seems to be some confusion on the Internet about exactly how Kinect's depth sensor works, and it may be more likely based on structured light scanning than LIDAR (though Microsoft has recently acquired some LIDAR startups). Either way, though, the depth data quality is far beyond any kind of stereo vision.


From what I understand, an IR pattern is projected on to the room and the cameras pick it up. The level of distortion in the pattern shows how far the object is. NOT Lidar. Probably a laser projector though.
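
To make the "distortion shows how far the object is" idea concrete, here is the textbook triangulation relation for a projector and camera separated by a baseline: a pattern feature shifts sideways in the image by a disparity d, and depth is roughly z = f * b / d. This is a simplified sketch, not the Kinect's actual algorithm, and the focal length and baseline below are made-up numbers.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Textbook triangulation: depth z = f * b / d.

    disparity_px: sideways shift of a pattern feature, in pixels
    focal_px:     camera focal length, in pixels (assumed value)
    baseline_m:   projector-to-camera distance, in metres (assumed value)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Made-up optics for illustration; not measured Kinect parameters.
FOCAL_PX = 600.0
BASELINE_M = 0.1

for d in (60.0, 30.0, 15.0):
    print(d, depth_from_disparity(d, FOCAL_PX, BASELINE_M))
```

Because depth is inversely proportional to the shift, nearby objects produce large, easy-to-measure shifts while distant ones produce small shifts, which is one reason depth precision falls off with range in any triangulation-based sensor.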

The tech is nice, and it's nice that it's bundled up in one affordable package, but the real power in Kinect is in the software. Right now all that the drivers return is raw video and depth. It's what you do with that data that counts.


It's cheap, off-the-shelf colour and depth capture from IR structured light. That's nothing to scoff at. A pair of cheap webcams only gets you as far as trying to do real-time depth processing from passive stereo, and that's not easy.


Yeah, but it does make it easier and more accessible to those who don't hack with OpenCV every day. One could also argue that one can always build digital components from scratch instead of using something like an Arduino.


A pair of cheap USB webcams is pretty suboptimal for stereo vision. To start, they aren't synchronized. Plus you have to rig them somehow to know their relative positions. They may use rolling shutter, which is bad for stereo. It's not that it can't be done, but it's not easy.


Yeah, but one of these things I can buy at the store, go home, and an hour later be doing cool stuff with.


"The Kinect sensor has been hacked"

I did not know that writing a driver for a piece of hardware can now be called hacking the chip, but well, I guess things have changed...


When it's proprietary hardware it's fair to call it a hack I think.


Oh man. Ordering one ASAP. So much cool stuff could be done with this. Going to try to create a gestures thing so I can browse my email from bed.


Could this all have been genius marketing by Microsoft?


I'm imagining a sweaty Ballmer holding an impromptu press conference tomorrow announcing a Kinect SDK screaming Developers! Developers! Developers! Developers!


I think it's probably an executive with enough power to say things but without enough knowledge to know what he's saying. That's fairly normal for large companies.


I thought it was mandatory for large companies. If lower-upper management actually had new ideas they might usurp someone else's power and then where would the top execs be? And where would the company be then!? (besides rapidly bankrupting itself through moron ideas)


Perhaps. Although there is speculation that the Kinect is a loss-leader, so that wouldn't be the smartest move.


I'm worried that this will lead Microsoft to take measures to prevent the misappropriation of Kinect devices, which would be very, very bad. Hopefully we'll see a teardown and a BOM sometime soon...


Gestures to browse ...ummmmm... email from bed???


I can imagine the other kind of hand-waving might confuse Kinect.


Don't hate, email is a big problem of mine. :p


I agree. How precise is the measurement of depth perception?


That was quick.


I'm actually surprised it took this long - how much more motivation did the REs need than MSFT's lawsuit threats?


He did it within 3 hours of it going on sale in Europe. That's pretty quick!


I think the internet is getting people to the point they expect it to be hacked a month before the product is actually released ;)


I should have read TFA. I was going off of time between headlines.


Depressingly quick. Where was the suspense?


This is an exciting achievement and I'm very impressed that it was carried out so rapidly. Good hacking indeed!

Although, I have two questions which HN may be able to answer for me:

1) What are the benefits of the Kinect over building a servo-driven IR bar with audio yourself out of cheap commodity hardware? Is the price far less when all components are integrated? Is the construction just that much simpler?

2) Presumably now that the "easier" task of reverse engineering the comms protocol has been achieved, the next step is to understand/replicate/replace the "proprietary" algorithms in use by Microsoft that run in the XBox in order to have some meaningful interaction with the device. Is it possible to use some FOSS such as OpenCV with Kinect? I know it's early days and many of you probably haven't had time to look at the protocol yet, but I am curious.


What would be very interesting, is if you could mod the device to work with a better camera. Then, it would probably be useful in the photography industry. Maybe it would allow you to correct issues with lighting? It would allow someone editing pictures to easily select a part of the picture in the foreground or background. With selections being much easier, you could enhance specific parts of photos to make them stand out around other less-important parts without much effort at all.

I'm sure there are many more applications for this technology in the photography and even videography industry. Any ideas?


I always wanted to be able to set focus on windows by looking at them. Perhaps that can be achieved with a Kinect.


You should be able to do that with a normal webcam, since you don't need depth information. I've thought of that too; I think there will be a similar interface when heads-up displays in glasses become viable.


I haven't been paying attention, but has eyetracking software gotten that good? Can you point to people or companies who are working on this, or currently available solutions?

I've been doing some usability testing stuff and it'd be nice to have eyetracking (especially paired with click/mouseover data), and not the much less helpful head/face tracking that is built into the webcam.



GazeHawk (YC funded) is, but the software isn't open.

http://www.gazehawk.com/


If I am wearing glasses it becomes much easier (3 IR LEDs, one webcam) to track where my head is turned.


I would have thought there was some onboard CPU on the Kinect, based on the power requirements (it can't be powered by a USB port alone). If so, I suspect that any heavy lifting the unit does is probably done by software that is uploaded to the camera via USB at startup.

Anyone have further details?


There is an onboard CPU on the Kinect, but it doesn't do all the work. It still needs the 360's CPU to finish the processing and such. As a result, Kinect games will suffer a bit visually compared to normal 360 games. It's outlined in the Ars Technica review (and I'm sure in other places).

http://arstechnica.com/gaming/reviews/2010/11/buy-a-house-cl...


Yes, there's at least one CPU. It looks like the Kinect performs depth extraction on-board and the 360 performs gesture recognition.

http://www.ifixit.com/Teardown/Microsoft-Kinect-Teardown/406...


Which, as stated elsewhere in the comments, is the really interesting bit of this tech.

On the other hand, it's still an impressive hack in terms of speed.


Certainly there's some post-processing going on on the Kinect; whether it's via CPU, ASIC, DSP, or FPGA I have no idea. I'd imagine there's some signed code image that is factory-installed to some Flash or EEPROM on the Kinect; it's probably upgradeable via USB, but I don't think it can rely on the code being transferred over every time, since the posted driver gets depth images off the Kinect without doing so.

EDIT: The link posted by another commenter http://www.ifixit.com/Teardown/Microsoft-Kinect-Teardown/406... (Step 11) shows that the Prime Sense reference platform has Flash storage, almost certainly for code and calibration data.


For posterity: it was revealed that the Kinect has a Marvell PXA168 (1GHz ARM + many peripherals).

Source: http://www.eetimes.com/electronics-news/4210757/Teardown--Ki...



