
Helping computers fill in the gaps between video frames - jonbaer
http://news.mit.edu/2018/machine-learning-video-activity-recognition-0914
======
menzoic
This could lead to major size reductions in new compression algorithms based
on AI that can take frames that are partially filled or even sequences with
completely missing frames and fill in the blanks.

------
knicholes
It's taking everything within me to not scream with excitement (I'm at work).
I've never seen activity recognition before, let alone something so detailed!

~~~
exikyut
What would you do with this? More accurately, what would you [like to] do with
this if you could?

The reason I ask: you obviously see the thousand and one possibilities this
enables. I look at it, appreciate that it's ridiculously good and go
"...coool.", but the best I can come up with are the So Very Original™
robotics suggestions already mentioned in the article.

The only related thing this prompts me to think of is motion analysis in
video encoding (a la `mplayer -lavdopts vismv=7 <file>`), but I suspect
video encoding requires a different kind of specificity than the one
demonstrated in the article.

~~~
knicholes
I'd use it for something like processing video so I don't have to. For
example, locally people have been robbing flowers and other things off of
graves. I'd like to set up a camera and receive video clips of people when
they rob the graves, but not at any other time. I don't want to sift through
hours and hours of footage every day.

I wouldn't mind something similar for someone stealing a package off of my
porch, or a dog crapping on my lawn. I know dogs crap on my lawn, and I want
to know which owner isn't picking it up. If I set up a camera to send me a
clip of just the dog crapping on the lawn, I'd be much closer to solving my
mystery.

Another use could be checking whether crossing guards are leading kids across
the street, or whether cars are stopping past the stop line (or at all).
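The "send me a clip only when the activity happens" idea above could be sketched roughly like this. Everything here is hypothetical: the per-frame scores would come from a real activity-recognition model (e.g. the TRN model discussed in the article), and the threshold/padding values are made-up placeholders.

```python
# Hypothetical sketch: condense long surveillance footage into short clips
# around the frames where a target activity is detected. The per-frame
# `scores` list stands in for the output of a real activity classifier.

from typing import List, Tuple

def find_activity_clips(
    scores: List[float],     # per-frame probability of the target activity
    threshold: float = 0.8,  # keep frames scoring above this (assumed value)
    padding: int = 30,       # frames of context to keep around each detection
) -> List[Tuple[int, int]]:
    """Merge above-threshold frames into padded (start, end) clip ranges."""
    clips: List[Tuple[int, int]] = []
    for i, score in enumerate(scores):
        if score < threshold:
            continue
        start, end = max(0, i - padding), i + padding
        if clips and start <= clips[-1][1]:
            # Overlaps the previous clip: extend it instead of starting a new one.
            clips[-1] = (clips[-1][0], end)
        else:
            clips.append((start, end))
    return clips

# Toy example: 200 frames, with activity spikes around frames 50-55 and 120.
scores = [0.0] * 200
for i in list(range(50, 56)) + [120]:
    scores[i] = 0.95
print(find_activity_clips(scores))  # one padded clip per burst of activity
```

You'd then only cut and upload the frame ranges this returns, instead of the whole night's footage.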

~~~
exikyut
Okay, these use cases are _very cool_, and I suddenly see a lot more
relevance. Thanks very much for the enlightenment; I reread the article with
newfound interest, while ruminating on how sad it is that so many new areas
of research are announced without any way for me to play with them too.
"_Where's the source code?_" I thought... as my eyes fell upon the last
words of the article,

> _“We also open source all the code and models. Activity understanding is an
> exciting area of artificial intelligence right now.”_

Oh.

A couple of clicks later, in rapid sequence:

1. Name of the author:
[https://www.google.com/search?q=Bolei+Zhou](https://www.google.com/search?q=Bolei+Zhou)

2. Their MIT page:
[http://people.csail.mit.edu/bzhou/](http://people.csail.mit.edu/bzhou/)

3. Linked GitHub account:
[https://github.com/metalbubble?tab=repositories](https://github.com/metalbubble?tab=repositories)

4. This looks like It™!:
[https://github.com/metalbubble/TRN-pytorch](https://github.com/metalbubble/TRN-pytorch) (BSD 2-clause license)

(NB. Someone needs to clue in MIT's media department about the existence of
the wonderful HTML _anchor_ tag.)

~~~
knicholes
Just had some more ideas: video categorization. A less-than-upstanding
example would be knowing when certain actions are being performed in porn
videos by auto-tagging portions of the video. Maybe something less nefarious
could be education, where you automatically pull in a bunch of videos of,
say, beetles flying for a compilation. It could be useful for art as well:
I've seen a recent project that creates an intermediate stick figure for
"DeepFake"-style video generation from one video to another; I think the
paper was called "Everybody Dance Now". Or maybe for stores to know when
someone might be shoplifting.

I think my point is that if you could label activities in video, you could
collect datasets more quickly for training other networks as well.
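The dataset-bootstrapping point could look something like the sketch below. The tagger here is purely a placeholder (it just reads a label out of the filename); in practice it would be a trained activity recognizer, and the grouping step is the part being illustrated.

```python
# Hedged sketch of bootstrapping a labeled dataset from auto-tagged clips.
# `tag_clip` is a hypothetical stand-in for a real activity-recognition model.

from collections import defaultdict
from typing import Dict, List

def tag_clip(clip_id: str) -> str:
    """Placeholder tagger: pretend the predicted label is the filename prefix."""
    return clip_id.split("_")[0]

def build_dataset(clip_ids: List[str]) -> Dict[str, List[str]]:
    """Group clips by predicted activity label, ready for training other models."""
    dataset: Dict[str, List[str]] = defaultdict(list)
    for clip_id in clip_ids:
        dataset[tag_clip(clip_id)].append(clip_id)
    return dict(dataset)

clips = ["flying_001.mp4", "flying_002.mp4", "dancing_001.mp4"]
print(build_dataset(clips))  # clips bucketed by predicted activity
```

The same loop, pointed at a real tagger and a large video dump, is the "collect datasets more quickly" workflow: noisy labels first, human cleanup after.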

------
John_KZ
Paper? The video is really impressive if it's not just a highlight reel.

~~~
yorwba
[https://arxiv.org/abs/1711.08496](https://arxiv.org/abs/1711.08496) is linked
in the "Related" section.

~~~
John_KZ
Thanks, I couldn't find it on mobile.

------
foxfired
Not that it's any less impressive, but the title gave me the impression that
it was about tween frames.

~~~
Wistar
Same here. The title should be something like: Helping Computers Fill the
Gaps in Machine-Observed Actions

