
PointCNN – A simple and general framework for feature learning from point cloud - lainon
https://yangyanli.github.io/PointCNN/
======
andreyk
Link to arXiv:
[https://arxiv.org/abs/1801.07791](https://arxiv.org/abs/1801.07791)

Pretty exciting: the problem of learning from point cloud data is still very
open, and this beats the prior best work on it (PointNet++). It seems they treat
point clouds as quasi-graphs based on point closeness. Promising, but it
requires nearest neighbor search, which seems like it'd be expensive.
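
For a sense of what that search looks like, here's a minimal toy sketch (mine, not from the paper) that builds per-point neighborhoods with a KD-tree, the usual way to keep the query cost near O(N log N) instead of the O(N^2) brute-force cost:

```python
# Toy sketch: the k-nearest-neighbor groupings a PointCNN-style model
# would need, built with a KD-tree. Not from the paper.
import numpy as np
from scipy.spatial import cKDTree

def knn_neighborhoods(points, k=8):
    """For each point, return the indices of its k nearest neighbors.

    points: (N, 3) array of xyz coordinates.
    """
    tree = cKDTree(points)
    # query k+1 neighbors because each point is its own nearest neighbor
    _, idx = tree.query(points, k=k + 1)
    return idx[:, 1:]  # drop the self-match in column 0

if __name__ == "__main__":
    cloud = np.random.rand(1024, 3)       # toy point cloud
    neighbors = knn_neighborhoods(cloud)  # (1024, 8) index array
    print(neighbors.shape)
```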

~~~
0xfaded
I've been working to get orb_slam to run on the Raspberry Pi in real time. One
of the most interesting parts, to me, is the bag-of-words approach to feature
matching. It's shockingly reliable for such a simple method.

It's used for initialization when no world model is available, as a backup
when the motion model fails to track a new frame, and for relocalization and
loop closing.

Orb_slam wouldn't work without it, and all it amounts to is a k-means tree for
classifying nearest neighbors.
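
A toy sketch of that idea (orb_slam actually uses the DBoW2 vocabulary tree; real ORB descriptors are binary strings matched by Hamming distance, and the version below uses float vectors and Euclidean distance purely to keep the sketch short):

```python
# Toy hierarchical k-means "vocabulary tree": each level splits the
# descriptors into `branching` clusters; a leaf is a visual word.
import numpy as np

class VocabTree:
    def __init__(self, descriptors, branching=4, depth=3):
        self.branching = branching
        self.root = self._build(descriptors, depth)

    def _build(self, descs, depth):
        if depth == 0 or len(descs) <= self.branching:
            return {"centers": None, "children": None}
        # crude k-means: random init, a few Lloyd iterations
        centers = descs[np.random.choice(len(descs), self.branching,
                                         replace=False)]
        for _ in range(5):
            labels = self._nearest(descs, centers)
            for j in range(self.branching):
                members = descs[labels == j]
                if len(members):
                    centers[j] = members.mean(axis=0)
        labels = self._nearest(descs, centers)
        children = [self._build(descs[labels == j], depth - 1)
                    for j in range(self.branching)]
        return {"centers": centers, "children": children}

    @staticmethod
    def _nearest(descs, centers):
        d = np.linalg.norm(descs[:, None, :] - centers[None, :, :], axis=2)
        return d.argmin(axis=1)

    def word(self, desc):
        """Descend the tree: O(branching * depth) comparisons instead of
        comparing against every leaf word, which is the whole trick."""
        node, path = self.root, []
        while node["centers"] is not None:
            j = int(np.linalg.norm(node["centers"] - desc, axis=1).argmin())
            path.append(j)
            node = node["children"][j]
        return tuple(path)  # path to the leaf = the visual word id

if __name__ == "__main__":
    descs = np.random.rand(5000, 32).astype(np.float32)
    tree = VocabTree(descs)
    print(tree.word(descs[0]))
```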

When I was starting out, I was looking for a full SLAM system, not just visual
odometry, that could realistically be sped up for the Pi. The winning factor
for orb_slam was that it used sparse, simple feature points that could be
cheaply classified.

After spending almost a year working on this, I have the system running at
10fps on the KITTI benchmark, which consists of sequences of larger images,
also recorded at 10fps. VGA images can run at up to 20fps depending on various
trade-offs.

My present opinion is that ORB features are great for mapping, but not robust
enough for reliable visual odometry.

It wouldn't be so much of an issue to generate a local point cloud on the Pi,
but processing the cloud in real time is out of the question. If someone works
out how to reliably and cheaply classify point cloud features, it would be a
game changer.

------
billconan
Why can this model handle permutation, whereas PointNet, the paper PointCNN is
based on, used a permutation-invariant function?

~~~
jimfleming
If I'm understanding correctly (I haven't read it in depth yet):

Points are processed as local clusters. Each cluster of points is ordered
according to a transformation matrix X. This produces a "canonical" ordering,
so the processing of points in these local clusters does not need to be
invariant to the order, since the order now has consistency and meaning.

It's kind of like placing each point into a regular grid, like an image, before
running the convolution. The trick is deciding which points go in which grid
cell, and that is determined by the transformation matrix X. This takes
advantage of locality, which PointNet does not, if I recall correctly. By
acting locally and stacking many layers of these, you can produce a hierarchy
of more and more abstract clusters of points, each with an inherent
relationship to nearby clusters of points. In addition, the transformation
matrix also appears to act as an attention over the points in the cluster.
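
For concreteness, here's a rough numpy sketch of that reading of the X-Conv operator (my own interpretation, not the authors' code; the learned MLPs are stubbed out with fixed random projections):

```python
# Rough sketch of one X-Conv step for a single representative point.
import numpy as np

def mlp_stub(x, out_dim):
    # stand-in for a learned MLP: fixed random projection + ReLU
    rng = np.random.default_rng(0)
    w = rng.standard_normal((x.shape[-1], out_dim))
    return np.maximum(x @ w, 0.0)

def x_conv(rep_point, neighbor_pts, neighbor_feats, out_dim=32):
    """neighbor_pts:   (K, 3) coordinates of the K nearest neighbors
    neighbor_feats: (K, C) their input features"""
    K = neighbor_pts.shape[0]
    # 1. work in local coordinates around the representative point
    local = neighbor_pts - rep_point
    # 2. lift coordinates to features, concatenate with input features
    lifted = mlp_stub(local, 16)                        # (K, 16)
    feats = np.concatenate([lifted, neighbor_feats], 1)
    # 3. predict the K x K matrix X from the local coordinates; this
    #    is what imposes the "canonical" ordering / weighting
    X = mlp_stub(local.reshape(1, -1), K * K).reshape(K, K)
    # 4. permute/weight the features with X, then apply a dense layer
    #    standing in for the learned convolution kernel
    weighted = X @ feats                                # (K, 16 + C)
    return mlp_stub(weighted.reshape(1, -1), out_dim)[0]

if __name__ == "__main__":
    K, C = 8, 16
    out = x_conv(np.zeros(3), np.random.rand(K, 3), np.random.rand(K, C))
    print(out.shape)  # (32,)
```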

