Ask HN: Data augmentation tools for 3d object detection
28 points by iluvdata 3 months ago | 6 comments
What are good data augmentation techniques that work with 3D data, specifically where depth is the third dimension? Also, any pointers on tools for them would be of great help.

Thanks for sharing these links. I found https://github.com/aleju/imgaug and https://github.com/mdbloice/Augmentor to be good when you have 2D data. My challenge is that I now have x, y and depth (Z). All image transformations in colour space can't be applied to depth space, so which ones would work best? Any pointers on tooling around them would be helpful.

> all image transformations in colour space can't be applied to depth space

Why? If you treat the depth channel as an additional color, you might even be able to use the libraries you mentioned without modification (unless they have hard-coded assumptions about the number of color channels). Depth isn't really special; all the same ideas for data augmentation still apply. You just have to transform it with the rest of the data, so if you e.g. mirror the image, the depth gets mirrored as well.
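
A minimal numpy sketch of that idea, assuming the colour image is an (H, W, 3) array and the depth map is an (H, W) array aligned to it. The function names `stack_rgbd` and `random_flip_and_crop` are made up for illustration and are not part of imgaug or Augmentor:

```python
import numpy as np

def stack_rgbd(rgb, depth):
    """Stack an (H, W, 3) colour image and an (H, W) depth map into (H, W, 4)."""
    rgb = rgb.astype(np.float32)
    depth = depth.astype(np.float32)[..., None]  # add a trailing channel axis
    return np.concatenate([rgb, depth], axis=-1)

def random_flip_and_crop(rgbd, crop_hw, rng):
    """Apply one random horizontal flip and one random crop to all 4 channels."""
    h, w, _ = rgbd.shape
    ch, cw = crop_hw
    if rng.random() < 0.5:
        rgbd = rgbd[:, ::-1, :]  # mirror columns; depth is mirrored with colour
    top = rng.integers(0, h - ch + 1)   # upper bound is exclusive
    left = rng.integers(0, w - cw + 1)
    return rgbd[top:top + ch, left:left + cw, :]

rng = np.random.default_rng(0)
rgb = np.zeros((8, 10, 3))
depth = np.arange(80, dtype=np.float32).reshape(8, 10)
rgb[..., 0] = depth  # make the red channel track depth to check alignment
patch = random_flip_and_crop(stack_rgbd(rgb, depth), (4, 5), rng)
```

Because the flip and crop index the stacked array once, the colour and depth channels cannot drift out of alignment.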

But when the RGB image is scaled, tilted or sheared, how do you mathematically move the depth accordingly?
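
One possible answer, sketched under assumptions not stated in the thread: for in-plane warps (scale, rotation, shear), resample the depth map with the same pixel mapping as the RGB image, using nearest-neighbour sampling so that depths from different surfaces are not averaged at object edges. Whether the depth values themselves should also change depends on the camera model and is left out here. `warp_nearest` and `shear_rgbd` are hypothetical helper names:

```python
import numpy as np

def warp_nearest(img, matrix, fill=0.0):
    """Warp an (H, W) or (H, W, C) array with a 2x3 affine `matrix`.

    `matrix` maps each OUTPUT pixel (x, y) to a SOURCE location (inverse
    mapping); the nearest source pixel is copied, out-of-range gets `fill`.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = matrix[0, 0] * xs + matrix[0, 1] * ys + matrix[0, 2]
    src_y = matrix[1, 0] * xs + matrix[1, 1] * ys + matrix[1, 2]
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.full_like(img, fill)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def shear_rgbd(rgb, depth, shear):
    """Shear colour and depth with the same matrix so they stay aligned."""
    m = np.array([[1.0, shear, 0.0],
                  [0.0, 1.0, 0.0]])
    return warp_nearest(rgb, m), warp_nearest(depth, m)

depth = np.arange(1.0, 49.0).reshape(6, 8)
rgb = np.repeat(depth[..., None], 3, axis=-1)  # colour tracks depth for checking
rgb_s, depth_s = shear_rgbd(rgb, depth, shear=0.5)
```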

Are you talking about a depth channel on an image plane? It also largely depends on the problem, but here are a few tricks that helped me with my problems (object tracking).

- Generating synthetic data is powerful if you have the depth modality as it is easy to render. Also the real/synthetic domain gap is narrow compared to RGB. I consider it as data augmentation: you usually do many renders from a single 3D model.

- If you can somehow normalize the offset (e.g. compute normals), that can help. In my case I could offset the center of the object to 0 depth, and it greatly helped the network to converge.

- Classic augmentations like Gaussian noise, Gaussian blur and downsampling the depth also help (apply these randomly).

As for tooling, I just use numpy/pytorch for most operations and OpenGL for renders.
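
The depth-offset and classic-augmentation tricks above could look roughly like this in plain numpy. This is a sketch, not the commenter's code: the function names, the `center_z` input, the sigma values and the 0.5 probability are all assumptions that would need tuning to the sensor's depth units:

```python
import numpy as np

def center_depth(depth, center_z):
    """Shift depth so the object centre (assumed known as `center_z`) is at 0."""
    return depth - center_z

def add_gaussian_noise(depth, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return depth + rng.normal(0.0, sigma, size=depth.shape)

def gaussian_blur(depth, sigma):
    """Separable Gaussian blur with edge padding; output keeps the input shape."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(depth, radius, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def downsample_upsample(depth, factor):
    """Keep every `factor`-th pixel, then repeat pixels back to the input size."""
    small = depth[::factor, ::factor]
    up = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return up[:depth.shape[0], :depth.shape[1]]

def augment_depth(depth, center_z, rng, p=0.5):
    """Centre the depth map, then apply each augmentation with probability `p`."""
    out = center_depth(depth.astype(np.float32), center_z)
    if rng.random() < p:
        out = add_gaussian_noise(out, sigma=0.01, rng=rng)
    if rng.random() < p:
        out = gaussian_blur(out, sigma=1.0)
    if rng.random() < p:
        out = downsample_upsample(out, factor=2)
    return out

rng = np.random.default_rng(0)
depth = np.full((6, 7), 2.0)
augmented = augment_depth(depth, center_z=2.0, rng=rng)
```

Each step keeps the (H, W) shape, so the result can be fed to the network in place of the original depth map.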

Regarding tools, OpenCV has a PLY-to-2D-images renderer [1].

[1]: https://github.com/opencv/opencv_contrib/blob/master/modules...
