
Direct Sparse Odometry (DSO) open sourced - runesoerensen
https://github.com/JakobEngel/dso
======
runesoerensen
I'm pretty excited to see this project released. This is from the researchers
who made LSD-SLAM[0], a widely used monocular SLAM system, and DSO is showing
some very impressive results in terms of accuracy, robustness and efficiency.
For convenience, here are a few more resources related to this release:

* Video demo (with ORB-SLAM and LSD-SLAM comparisons): https://www.youtube.com/watch?v=C6-xwSOOdqQ

* DSO research paper: https://arxiv.org/abs/1607.02565

* A small ROS wrapper for DSO was also released yesterday: https://github.com/JakobEngel/dso_ros

It'd be interesting to try to integrate DSO as a visual "sensor" for use with
Google Cartographer [1]. This might allow people without a LIDAR or a
stereo/RGB-D/ToF camera to perform fairly accurate and robust SLAM that's
quite possibly efficient enough to run in real time on a smartphone (although
the rolling-shutter cameras usually found in smartphones might complicate that
a bit). A rough sketch of the ROS plumbing follows the footnotes.

[0] http://vision.in.tum.de/research/vslam/lsdslam

[1] https://github.com/googlecartographer
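
For what it's worth, cartographer_ros can consume a nav_msgs/Odometry input
(with use_odometry enabled in its Lua config), so the glue could be as thin as
a republisher. A minimal, untested sketch; the "dso/pose" topic is
hypothetical, and dso_ros would likely need patching to publish its tracked
pose this way:

```python
# Republish a DSO camera pose as nav_msgs/Odometry so Cartographer can
# ingest it as an odometry source.
import rospy
from geometry_msgs.msg import PoseStamped
from nav_msgs.msg import Odometry

def pose_cb(msg):
    odom = Odometry()
    odom.header.stamp = msg.header.stamp
    odom.header.frame_id = "odom"      # fixed frame Cartographer expects
    odom.child_frame_id = "base_link"  # tracked body frame
    odom.pose.pose = msg.pose          # copy the estimated camera pose
    pub.publish(odom)

rospy.init_node("dso_to_odom")
pub = rospy.Publisher("odom", Odometry, queue_size=10)
rospy.Subscriber("dso/pose", PoseStamped, pose_cb)  # hypothetical topic
rospy.spin()
```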

~~~
ice109
Forgive me if this is way off, but does this, as a happy side effect, solve
the inertial navigation problem (i.e. the commercial-grade MEMS accelerometers
in smartphones accumulate so much error that inertial tracking is impossible)?

~~~
Jack000
That's called loop closure in SLAM systems. You would typically fuse the
accelerometer data with odometry in an EKF as the best guess from dead
reckoning, then correct that guess when it disagrees with the lidar/camera
data; roughly like the toy sketch below.
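
A minimal 1-D sketch of that fusion (my own toy example, nothing
DSO-specific; with these linear models the "EKF" reduces to a plain Kalman
filter). The accelerometer drives the predict step as a control input, and a
visual-odometry position fix corrects the accumulated drift in the update
step:

```python
import numpy as np

# State x = [position, velocity]; accelerometer enters as a control input.
dt = 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity transition
B = np.array([[0.5 * dt**2], [dt]])    # how acceleration moves the state
H = np.array([[1.0, 0.0]])             # visual odometry observes position
Q = 1e-3 * np.eye(2)                   # process noise (IMU bias, drift)
R = np.array([[1e-2]])                 # VO measurement noise (made-up value)

x = np.zeros((2, 1))                   # state estimate
P = np.eye(2)                          # state covariance

def predict(accel):
    """Dead reckoning: integrate the accelerometer; uncertainty grows."""
    global x, P
    x = F @ x + B * accel
    P = F @ P @ F.T + Q

def update(vo_position):
    """Correct the drifted dead-reckoning guess with a VO position fix."""
    global x, P
    y = np.array([[vo_position]]) - H @ x       # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
```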

I'm not sure whether DSO is a full SLAM system or only provides visual
odometry, however.

~~~
robotresearcher
The demo video shows VO only, with no loop closure.

------
rasz_pl
Ever since I learned about SLAM/SfM, I've wondered why both feature-based and
direct methods output point clouds. If you detect a line of points, why store
them as individual points when you could store a vector representation, i.e. a
line primitive? That would be a win for storage and tracking, and potentially
for drift correction (easier to correlate?). A rough sketch of the compression
is below.
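
The compression is easy to sketch (illustrative only; I'm not claiming DSO
does this, and it keeps a sparse point representation by design): fit a line
to roughly collinear map points and keep just the two endpoints.

```python
import numpy as np

def fit_segment(points):
    """Collapse (N, 3) roughly collinear map points into one line segment."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Dominant right singular vector = best-fit line direction (PCA).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # Project onto the line; keep only the two extreme endpoints.
    t = centered @ direction
    return centroid + t.min() * direction, centroid + t.max() * direction

pts = np.array([[0, 0, 0], [1, 0.01, 0], [2, -0.02, 0], [3, 0.0, 0.01]])
p0, p1 = fit_segment(pts)  # four points stored as two endpoints
```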

------
zump
This math is crazy. How does one even begin to understand it?

