Ask HN: What is the engineering behind Netflix's 'Skip Intro' button? - rahulskn86
======
pontifier
Is it so hard to believe that they are manually tagged?

They only have several thousand videos. Each one has some sort of process that
it goes through to be included in their service. Someone decides which
screenshots will be shown, edits the intro video, writes the synopsis, etc.

~~~
tsjq
This has been my assumption! :)

------
andreareina
Don't know if it's what Netflix actually uses, but you can use perceptual
hashing[1] to identify spans that are common to multiple episodes of the same
series.

[1] [https://en.wikipedia.org/wiki/Perceptual_hashing](https://en.wikipedia.org/wiki/Perceptual_hashing)
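
To make the idea concrete, here is a minimal sketch of one simple perceptual hash (an average hash) plus a search for the longest run of near-identical frames shared by two episodes. It is only an illustration, not a claim about what Netflix does: the frame representation (tiny grayscale thumbnails as nested lists), the function names, and the `max_dist` threshold are all assumptions.

```python
from typing import List, Sequence, Tuple

# Assumption: each frame is already downscaled to a tiny grayscale
# thumbnail (e.g. 8x8), stored as rows of 0..255 brightness values.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> int:
    """Average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def longest_common_span(
    a: List[int], b: List[int], max_dist: int = 2
) -> Tuple[int, int, int]:
    """Return (start_in_a, start_in_b, length) of the longest run of
    consecutive frames whose hashes differ by at most max_dist bits.

    Standard longest-common-substring dynamic programme, O(len(a) * len(b)).
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if hamming(a[i - 1], b[j - 1]) <= max_dist:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[2]:
                    best = (i - cur[j], j - cur[j], cur[j])
        prev = cur
    return best
```

Hashing one frame per second of each episode and calling `longest_common_span` on the two hash lists would give a candidate intro span in each episode. A real system would need something faster than the quadratic comparison and more robust to re-encoding.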

------
darepublic
Is it naive to think that, just as the shows provide previews, box art, etc.,
they would also provide the length of their intro? Then skip intro just needs
to increment the video time by that amount.
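
If the intro timing did arrive as metadata, the player side could be as small as the sketch below. The `IntroMarker` fields and both function names are hypothetical. The sketch jumps to the end of the intro rather than adding the full length to the current position, so a viewer who clicks partway through the intro does not overshoot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntroMarker:
    """Hypothetical per-episode metadata supplied with the video."""
    start_s: float   # where the intro begins, in seconds
    length_s: float  # how long the intro runs, in seconds

    @property
    def end_s(self) -> float:
        return self.start_s + self.length_s


def should_show_skip_button(position_s: float, marker: IntroMarker) -> bool:
    """Show the button only while playback is inside the intro."""
    return marker.start_s <= position_s < marker.end_s


def skip_intro(position_s: float, marker: IntroMarker) -> float:
    """Return the new playback position after pressing 'Skip Intro'."""
    if should_show_skip_button(position_s, marker):
        return marker.end_s
    return position_s  # outside the intro: leave playback where it is
```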

~~~
quickthrower2
It makes sense that they’d provide the length because they wouldn’t want it
split in two by ads when shown on TV.

------
tastroder
In addition to the other suggestion, I'd expect it to be something like this
[0] or [1], a mix of data on what people skip in the first 10 minutes and the
superstar of modern machine learning: human annotations. I've seen several
shows where all episodes were consistently off by a few seconds; that might
be either algorithmic bias or due to some annotator's personal preference.

[0] [https://outline.com/BjsXnF](https://outline.com/BjsXnF)

[1] [https://medium.com/an-attempt-at-writing/netflixs-skip-intro...](https://medium.com/an-attempt-at-writing/netflixs-skip-intro-feature-how-the-hell-do-they-do-that-7c5db9408f82)

Edit: looks like these sequence fingerprinting patents might be relevant:
[https://patents.google.com/patent/US20190028525A1/en](https://patents.google.com/patent/US20190028525A1/en)
[https://patents.google.com/patent/US9418296B1/en](https://patents.google.com/patent/US9418296B1/en)
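
As a rough illustration of the "data on what people skip" half of that guess, the sketch below estimates where an intro ends from logged forward seeks in the first 10 minutes. The event format (pairs of from/to positions in seconds), the function name, and every threshold are assumptions made for the example, not anything taken from the linked articles or patents.

```python
from statistics import median
from typing import Iterable, Optional, Tuple


def estimate_intro_end(
    seeks: Iterable[Tuple[float, float]],
    window_s: float = 600.0,   # only look at the first 10 minutes
    min_jump_s: float = 10.0,  # ignore tiny scrubs
    min_events: int = 3,       # refuse to guess from too little data
) -> Optional[float]:
    """Estimate the intro end as the median landing point of forward seeks.

    seeks: (from_s, to_s) pairs, one per viewer seek event.
    Returns None when there are not enough qualifying events.
    """
    landings = [
        dst
        for src, dst in seeks
        if src < window_s and dst <= window_s and dst - src >= min_jump_s
    ]
    if len(landings) < min_events:
        return None
    return median(landings)
```

An estimate like this would presumably only be a starting point for a human annotator to confirm, which would fit the few-seconds offsets mentioned above.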

------
karmakaze
In the case of Netflix-created content, it can be as simple as a number of
seconds in metadata. Things get easy if you control both the production and
consumption ends.

