The quality of the images (depth and RGB) is staggering compared to everything else I've used (ZED, RealSense, Kinect v2, and a couple of others).
They can be chained together for near microsecond timing accuracy.
The mic array is awesome.
The IMU is awesome.
But right now I've set them all back in their boxes.
Why? Because without being a vision specialist, there's nothing I can do with these devices.
The SDK and sample code are so incredibly bare bones, it is almost laughable.
There's no way to make use of those mics for anything. It's literally not in the SDK.
There's no way to make use of multiple devices in any practical manner: no point cloud merging, no calibration or shared-space alignment.
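To illustrate the kind of shared-space alignment I mean: once you've calibrated each device's extrinsics (e.g., with a checkerboard, which the SDK doesn't help with either), merging is just transforming every per-device point cloud into one common frame and concatenating. A minimal NumPy sketch, with function names of my own invention:

```python
import numpy as np

def make_extrinsic(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def to_shared_frame(points, T):
    """Map an (N, 3) point cloud into the shared frame via the 4x4 extrinsic T."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # homogeneous coords
    return (homo @ T.T)[:, :3]

def merge_clouds(clouds, extrinsics):
    """Concatenate per-device clouds after mapping each into the shared frame."""
    return np.vstack([to_shared_frame(p, T) for p, T in zip(clouds, extrinsics)])
```

That's the easy 90%; the hard part the SDK also omits is estimating those extrinsics and refining them (e.g., with ICP), which is exactly where a non-specialist gets stuck.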
Then there's the problem that buried deep in the SDK is a binary blob: the depth engine. No source, no docs, just a black box.
Also, these cameras require a big GPU. Nothing seems to be happening onboard. And you're limited to at best two Kinects per USB 3 controller.
All that said, I'm still a very happy early adopter and will continue checking in every month or two to see if they've filled in enough critical gaps for me to build on top of.
If any devs in Seattle want to collaborate (or know computer vision well enough to fill in some of these gaps for the OSS community) let me know :)
This is why I held off ordering a few. I took an hour looking through the docs and concluded the overall offering isn't fully baked yet. The hardware looks incredible, but the software looks anemic.
I must say: I'm liking this new Microsoft.
A Raspberry Pi or a Jetson Nano is probably not gonna work... it seems to be x86-only.
The other growing slice is services, where they get you to buy stuff from their storefront, subscribe to Game Pass or Office 365, or have personal interactions with Bing and Windows (so they can sell targeted ads). This naturally gives Microsoft an incentive to know everything about you.
This camera is way better quality, so it'll be neat to see the sorts of projects that can be done now.
Here is the actual Microsoft link, in case you don't want blog spam that is nearly unreadable on a phone.
So my next question would be: why would this be significantly harder than regular facial recognition approaches as found in, say, OpenCV? Naively, one would think more data makes it easier, not harder.
Neglecting hardware requirements/performance, I mean purely from an accuracy perspective, via a fairly straightforward extension of current facial identification algorithms.
I'm not talking about identifying people who are moving or far away, but someone looking straight at the camera from a fairly close distance.
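One simple way depth could bolt onto an existing 2D pipeline is as a post-filter: detect the face in the color image as usual, then use the aligned depth crop to reject faces that are too far away or have no 3D relief (a flat photo held up to the camera). A hedged NumPy sketch; the function names and the distance/relief thresholds are mine, not from any SDK:

```python
import numpy as np

def face_depth_stats(depth_mm, box):
    """Median distance and depth relief inside a face bounding box (x, y, w, h)."""
    x, y, w, h = box
    crop = depth_mm[y:y + h, x:x + w].astype(float)
    valid = crop[crop > 0]  # zero usually means an invalid depth reading
    med = np.median(valid)
    # Spread between near and far surfaces of the face, robust to outliers.
    relief = np.percentile(valid, 95) - np.percentile(valid, 5)
    return med, relief

def accept_face(depth_mm, box, max_dist_mm=1500, min_relief_mm=20):
    """Accept only close-range faces with real 3D structure (a photo is flat)."""
    med, relief = face_depth_stats(depth_mm, box)
    return med <= max_dist_mm and relief >= min_relief_mm
```

The harder research question is folding depth into the identification features themselves (pose normalization, 3D landmarks), which is where "more data" stops being a trivial refactor.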
If you just need the same sensors for depth, but significantly cheaper, then look at Occipital.
This exhibit had at least a dozen Kinect cameras: https://nysci.org/home/exhibits/connected-worlds/
>The system requirements are Windows® 10 PC or Ubuntu 18.04 LTS...