Creating aerial imagery with a bike helmet camera (GoPro) and OpenDroneMap (jakecoppinger.com)
223 points by jakecopp on Dec 11, 2022 | 24 comments



I wrote this guide on generating 3D models and orthorectified imagery from 360 degree cameras. Usually this sort of thing is a lot easier with drones, but there are a lot of places you can't fly drones!

I'm hoping this makes mapping curb/street/parking data more accessible.

For example, OpenStreetMap has a new street parking spec which could make use of lots of street imagery! https://wiki.openstreetmap.org/wiki/Street_parking
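
To give a flavour of the tagging (quoting roughly from memory of that wiki page, so double-check the exact keys there), a street with parallel kerbside parking on the right would look something like:

    parking:right=lane
    parking:right:orientation=parallel
    parking:right:fee=no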


I've spent a lot of time trying to get 3D models of rural roads with a truck-mounted GoPro and photogrammetry.

At the end of the day, I couldn't get it to work despite having some great data sets. I talked to someone who did manage to get this to work, proven down to centimeter accuracy, but only with great difficulty and a rig of 5 different cameras. They said that by the time you do all that, you've invested enough that lidar makes as much sense cost-wise.

I believe the problem is at the MVS step, but it seems like none of the libraries handle this use case very well for some reason, despite having great overlapping features.

When capturing on the ground, doing orbits looking in toward a common object works reasonably well, and COLMAP handled this the best, especially with exhaustive matching, but that takes DAYS to process (or 4-6 hours if you REALLY optimize the process).
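
(For anyone who wants to poke at the same trade-off: exhaustive vs sequential matching is just a different matcher invocation in the COLMAP CLI. A rough sketch with placeholder paths; the sequential matcher only matches each frame against its neighbours, which is one way the run time comes down for ordered captures.)

    import subprocess

    db, imgs = "project/database.db", "project/images"

    # Feature extraction is the same either way
    subprocess.run(["colmap", "feature_extractor",
                    "--database_path", db, "--image_path", imgs], check=True)

    # Exhaustive matching: every image pair, O(n^2) -- this is the multi-day step
    subprocess.run(["colmap", "exhaustive_matcher", "--database_path", db], check=True)

    # Sequential matching: only neighbouring frames, suited to ordered/video captures
    subprocess.run(["colmap", "sequential_matcher", "--database_path", db,
                    "--SequentialMatching.overlap", "20"], check=True)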

But moving in a line, no matter how much overlap of the features you get, just doesn't work that well, and I do not understand why. I think it fails in the feature-matching step, which makes no sense since it has really similar images to match if you capture at any reasonable frequency.

I think THIS worked better than I would have expected because of the 3D aspect, where the lens gives you multiple perspectives on objects along the sides.


The reason is that the underlying matrices are rank-deficient, and the calculation becomes numerically unstable or outright impossible if you don't vary several degrees of freedom of the motion. Just changing one axis is not enough. If you look at 3D-reconstruction image sequences from around 2000, you get motion sick watching them, because they made sure to always vary rotation and translation in more than one direction. One video with some explanations I found:

https://youtu.be/AQfRdr_gZ8g
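
If you want to see the conditioning problem in miniature: with a purely forward translation, points near the direction of travel sit close to the epipole, the triangulation system is nearly rank-deficient, and a fraction of a pixel of noise blows up the recovered depth, while points well off to the side are fine. A toy numpy sketch (normalized cameras, made-up numbers, not any particular library's pipeline):

    import numpy as np

    def triangulate(P1, P2, x1, x2):
        # Linear (DLT) triangulation of a single point from two views
        A = np.vstack([x1[0]*P1[2] - P1[0], x1[1]*P1[2] - P1[1],
                       x2[0]*P2[2] - P2[0], x2[1]*P2[2] - P2[1]])
        Xh = np.linalg.svd(A)[2][-1]
        return Xh[:3] / Xh[3]

    # Two identical (normalized) cameras, the second moved 1 m straight forward (+Z)
    P1 = np.hstack([np.eye(3), [[0.0], [0.0], [0.0]]])
    P2 = np.hstack([np.eye(3), [[0.0], [0.0], [-1.0]]])

    rng = np.random.default_rng(0)
    for lateral in (0.1, 1.0, 5.0):            # sideways offset of the point, metres
        pt = np.array([lateral, 0.0, 10.0])    # true point, 10 m ahead of camera 1
        x1 = pt[:2] / pt[2]                    # projection into camera 1
        x2 = pt[:2] / (pt[2] - 1.0)            # projection into camera 2 (1 m closer)
        noise = rng.normal(0.0, 1e-3, 2)       # roughly a pixel at ~1000 px focal length
        est = triangulate(P1, P2, x1, x2 + noise)
        print(f"lateral offset {lateral:4.1f} m -> depth error {abs(est[2] - 10.0):6.2f} m")

The error drops sharply as the point moves away from the motion axis, which is consistent with side objects reconstructing much better than anything ahead of you.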


This sounds correct to me and aligns with a lot of other results from my experiments. Thank you for the insight.

The only thing I'll add is that it can't be an impossible task, because I can look at the same sequence of images and get a pretty good understanding of the 3D topography. That said, some things DON'T come through very well, like slopes, so I think starting with priors that contain the gimbal position in addition to the GPS-derived location would really help pick up slopes, if that information were actually fed into the pipeline.

As advanced as it all is right now, computer vision still has a long way to go I think, with some very fruitful ideas yet to be implemented.


I think that is the strength of the deep learning methods that have been thrown at the problem in the meantime. They learn to pick a reasonable choice out of the null space by looking at the scene. Only good if the assumptions encoded into the network actually hold though. Kind of applies to us deep human networks as well. That's why optical illusions are fun to see.


> But moving in a line, no matter how much overlap of the features you get, just doesn't work that well, and I do not understand why.

How far apart were the photos taken? I think OpenDroneMap needs lots of overlap with drone photos, and even more with spherical photos to extract features well.

I haven't experimented with a non-360-degree camera - the benefit of a 360-degree camera is that you effectively double the photos you get (front/back). Cycling or walking also ensures huge overlap.


When you said 360 camera, I had forgotten about the el cheapo versions with only 2 lenses. I'm used to 8-16 lens setups doing live action 360 VR video. It's been years now since I was involved in that, but the 16 camera system was the one from Google, and because of the extreme overlap, every object was visible in at least 4 lenses at any time. Because of this, it was some of the most compelling 3D available at the time. Funny, with 16 cameras each with their own microSD card that had to be data managed (puke), we felt it wasn't enough as it only produced a cylindrical output and not a full sphere. So, like the glutton for punishment fools we were, we mounted an additional 6 cameras to the rig to capture the zenith/nadir and would stitch/composite into the final.

All of that to say, generating 3D photogrammetry models was something we never even considered as a purpose of the footage. We were solely focused on creating VR content. Never mind that some of us involved would absolutely acquire footage for the sole purpose of creating 3D models to be used in SFX for post work. I guess sometimes, you're just too close to something to see it?!?!


I can confidently say that sufficient overlap was not the issue. I experimented a lot, but I definitely had well over 99% overlap on some runs without really any improvement. I suspect it has to do with the training data used for feature matching.


Been meaning to play with OpenDroneMap for quite a while now. Thanks for the write-up.

Regarding the fisheye correction, maybe you can find some code and presets in the Gyroflow project? They have presets for a lot of camera lenses out there, and they do fisheye correction and stabilization in real time on the GPU: https://github.com/gyroflow/gyroflow/blob/master/src/core/ca...


Thanks! OpenDroneMap is an awful lot of fun to use; I got into it when making models with a DJI Mini 3 Pro.

Wow that is some awesome software! I know OpenDroneMap does have a library of lenses but it doesn't seem to be able to detect the GoPro Fusion. I can't find the code with the presets right now, but https://docs.opendronemap.org/arguments/camera-lens/ is the documentation for selecting the lens type.

My understanding is that if it can't find a lens preset it's not an issue; it will instead figure out the lens geometry, which it can also output. I haven't tinkered with the code though, and I'm still getting to know the software!
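
If auto-detection doesn't pick it up, you can also force the lens model as a task option. For example via the NodeODM Python client (pyodm) - just a sketch, assuming a NodeODM instance on localhost and placeholder file names; the option name comes from the camera-lens docs page above:

    from pyodm import Node

    node = Node("localhost", 3000)           # a running NodeODM instance
    task = node.create_task(
        ["frames/img_0001.jpg", "frames/img_0002.jpg"],   # extracted frames
        {"camera-lens": "spherical"}          # force the 360/equirectangular model
    )
    task.wait_for_completion()
    task.download_assets("./results")         # orthophoto, point cloud, textured mesh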


This is fantastic! Great write-up. A question: any idea what sort of accuracy you’re getting at the end of the process with respect to things like carriageway widths? As you allude to in the article, this could be a great tool for advocacy groups to identify where a road is wide enough to accommodate cycle lanes, or where footpath widths are substandard. I was just wondering if it’s possible to get the +/- 10cm accuracy that’d be ideal for doing this sort of work. Thanks!


> any idea what sort of accuracy you’re getting at the end of the process with respect to things like carriageway widths?

Good question! I'll go out there later today, get an on-the-ground measurement and get back to you on this thread (and update the blog post).

> great tool for advocacy groups to identify where a road is wide enough to accommodate cycle lanes

Agreed :D


I added a section on taking measurements (under the heading "Output accuracy: taking measurements") - it's pretty damn accurate!

Large path:

- OpenDroneMap: 4.54 metres

- Tape measure: 4.56 metres

Small path:

- OpenDroneMap: 1.89 metres (though this varies slightly due to distortions)

- Tape measure: 1.89 metres

> I was just wondering if it’s possible to get the +/- 10cm accuracy

I'm quite confident this is possible - maybe do a few sanity checks though!
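
(For what it's worth, the differences above work out to about 2 cm on the large path and effectively zero on the small one:)

    # differences between the OpenDroneMap and tape measurements above
    print(abs(4.54 - 4.56) / 4.56)   # ~0.004, i.e. about 0.4% / 2 cm on the large path
    print(abs(1.89 - 1.89))          # 0.0 m on the small path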


Brilliant, thank you so much. Impressive numbers!


Hey, nice writeup. I saw you linked to Mapillary on there - have you used it and the API? The CV side can really add to mapping functionality.
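
(For anyone curious, pulling image metadata for an area out of the v4 Graph API looks roughly like this - the token and bbox are placeholders, and the field names are from memory, so check the API docs:)

    import requests

    params = {
        "access_token": "MLY|...",                   # your Mapillary token
        "bbox": "151.20,-33.88,151.21,-33.87",       # west,south,east,north in lon/lat
        "fields": "id,thumb_2048_url,computed_geometry",
    }
    r = requests.get("https://graph.mapillary.com/images", params=params)
    for img in r.json()["data"]:
        print(img["id"], img.get("thumb_2048_url"))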


Yep I've contributed a fair few Mapillary images! I like that OpenStreetMap integrates with it well; I don't like that it no longer shows _who_ contributed an image (whether in the OSM viewer or the Mapillary viewer).

I believe OpenDroneMap uses OpenSfM which Mapillary open sourced:

https://blog.mapillary.com/update/2016/10/31/denser-3d-point...

> OpenSfM is also used internally by the OpenDroneMap project. OpenDroneMap creates 3D models and orthophotos from drone imagery. It includes all the steps of the pipeline so one can go from images to 3D models in one command. It uses OpenSfM to get the camera positions, and can now use it to compute the dense point clouds as well.


Is this an alternative to Mapillary? I'm not very motivated to contribute more images since FB bought them.





The best example that I've seen of this is this one: https://twitter.com/klaskarlsson/status/1583401741386936320

It's made using a GoPro on a stick and they even added a tank and a stormtrooper to the final image!


How about using NeRF-based techniques? Shouldn't they be much better for this use case?


You mean using something like this technique?

https://developer.nvidia.com/blog/getting-started-with-nvidi...



I've heard nerf.studio is pretty good for something like this
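
From memory, the basic nerfstudio flow is to run its COLMAP wrapper over the frames and then train; roughly the following, though the exact commands and flags may have changed between versions:

    import subprocess

    # estimate camera poses from a video (uses COLMAP under the hood)
    subprocess.run(["ns-process-data", "video", "--data", "ride.mp4",
                    "--output-dir", "processed/"], check=True)

    # train a nerfacto model on the processed frames/poses
    subprocess.run(["ns-train", "nerfacto", "--data", "processed/"], check=True)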


This is cool! I had an idea a few years ago about crowdsourcing Google Street View by ingesting data from the forward and rear car cameras that are growing in popularity.



