
Motion Estimation with Optical Flow - ole_gooner
https://blog.nanonets.com/optical-flow/
======
pontifier
I hit upon a similar idea back in about 2000... the key at the time was video-
encoding hardware designed around MPEG compression. Pulling motion vectors
from the encoded stream could have given realtime optical flow way back then.
Unfortunately I was nowhere near good enough as a programmer to take advantage
of it for real projects. A team at MIT had a paper about this that I was able
to find.

Had a page about it on robots.net, but it seems to be down now.

~~~
tbirdz
We're doing this with H.264 at work. Pretty useful since we need to generate
the encoded stream anyway, so now we get motion estimation for free.
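
For anyone who wants to poke at this without a vendor SDK: stock ffmpeg can export the decoder's motion vectors as frame side data and draw them with the codecview filter. A rough sketch, assuming an ffmpeg build that includes codecview, and a hypothetical input.mp4:

```shell
# Export per-block motion vectors from the H.264 decoder as frame side
# data, then overlay them as arrows (pf/bf/bb = motion vectors of
# forward-predicted and backward-predicted P- and B-frames).
ffmpeg -flags2 +export_mvs -i input.mp4 -vf codecview=mv=pf+bf+bb mv_overlay.mp4
```

Keep in mind these are the encoder's block-matching vectors, optimized for compression rather than true motion, so expect them to be noisy on flat or repetitive regions.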

~~~
rjeli
Awesome. Which encoder provides this? Can you pull it from cuvid?

~~~
haxiomic
NVIDIA has some example code for getting the motion vectors during video
encoding:

[https://github.com/NVIDIA/video-sdk-samples/blob/master/Samp...](https://github.com/NVIDIA/video-sdk-samples/blob/master/Samples/AppEncode/AppEncME/AppEncME.cpp)

~~~
rjeli
Thank you!

------
ole_gooner
Hey,

Most real-time video processing systems only consider relationships between
objects within a single frame, disregarding temporal information. Optical flow
captures this temporal relationship between frames. Advances in optical flow
have changed the game in object tracking and human activity recognition in
videos.

This article explains the fundamentals and gives you the code to try it out
for yourself.
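
To give a flavor of the simplest technique involved: a sparse Lucas-Kanade estimate at a single pixel is only a few lines of NumPy. A minimal toy sketch (my own, not the article's code; the function name, window size, and test point are arbitrary choices):

```python
import numpy as np

def lucas_kanade(prev, curr, x, y, win=7):
    """Estimate flow (u, v) at pixel (x, y) by least-squares on the
    brightness-constancy equation Ix*u + Iy*v = -It over a small window."""
    half = win // 2
    p = prev.astype(np.float64)
    c = curr.astype(np.float64)
    Ix = np.gradient(p, axis=1)   # spatial gradient, x direction
    Iy = np.gradient(p, axis=0)   # spatial gradient, y direction
    It = c - p                    # temporal gradient between the two frames
    sl = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
    b = -It[sl].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares (u, v)
    return u, v
```

Real implementations add image pyramids (to handle large motions) and a check on the eigenvalues of A^T A (the aperture problem); OpenCV's cv2.calcOpticalFlowPyrLK does both.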

------
Darkphibre
I've dabbled with optical flow for a hobby side project (using the Windows
ShaderEffect class, of all things). I worked on the Kinect back in the day;
while I was primarily on audio, I was fascinated by applying DSP techniques to
two- and three-dimensional temporal data.

I've always felt that it was a missed opportunity to tap into temporal
information for entity recognition. I'm excited to see this take hold!

~~~
chuanenlin
@Darkphibre, wonderful background! Yes, and combining optical flow with recent
advances in deep learning is certainly exciting to see flourish! Action
recognition seems like a promising research area
([https://research.google.com/ava/](https://research.google.com/ava/), CVPR
'18).

------
haxiomic
Back in 2008 there was an amazing demo of realtime dense optical flow on a
GPU[0], but all the links are dead now. I've searched hard for a comparable
implementation since then but had no luck.

Does anyone have a hint on what technique they might have used?

[0]
[https://www.youtube.com/watch?v=ssINeWRb58M](https://www.youtube.com/watch?v=ssINeWRb58M)

~~~
lcrs
A lot of that group's publications are listed here, many involving optical
flow:
[http://web.archive.org/web/20161014025823/http://gpu4vision....](http://web.archive.org/web/20161014025823/http://gpu4vision.icg.tugraz.at/index.php?content=Cat_0)

Maybe this one from 2007: [https://www-pequan.lip6.fr/~bereziat/cours/master/vision/pap...](https://www-pequan.lip6.fr/~bereziat/cours/master/vision/papers/zach07.pdf)

~~~
haxiomic
Oh, amazing find lcrs, thank you! :)

~~~
mlthoughts2018
Thomas Brox’s lab had a ton of these around 2008-2012 as well, such as [0]. I
believe Brox had a freely available early CUDA program for computing optical
flow that was more or less SOTA for many years.

[0]: [https://lmb.informatik.uni-freiburg.de/Publications/2010/Bro...](https://lmb.informatik.uni-freiburg.de/Publications/2010/Bro10e/)

------
bitL
(Hierarchical) optical flow is slow and tends to require manually tuning its
constants for each scene (it's not really universal). Did you think about
using 3D convolutions and attention with deep learning instead?

~~~
chuanenlin
Hi @bitL, I'm the author of the article. Yes, I think deep learning is a
promising direction that removes the need for manual fine-tuning, and it is
certainly driving momentum in optical flow research. Something you may want to
look into is sequence-to-sequence (seq2seq) learning
([https://google.github.io/seq2seq/](https://google.github.io/seq2seq/)).

------
adzm
Really neat article. I've always wondered whether these techniques are used at
all in higher-fps interpolation, which always seems... off.

~~~
youbetcha
Yes, they are. In fact, I think my Samsung TV from circa 2009 actually calls
it "Motion Estimation". It makes everything look like it was filmed on a Sony
Handycam from the 80s. I don't understand why anyone would want to turn it on.

