
Show HN: I made an extension to watch Netflix films with screenplays in sync - justEgan
https://screenplaysubs.com/
======
crazygringo
Very interesting!

Fun tidbit: for TV actors, regularly reading pilot scripts and then watching
the produced pilot for comparison is a _huge_ common educational technique.
You get to imagine what kind of acting and directorial choices you'd make, and
then see what was actually done. Oftentimes you'll realize you had totally
misinterpreted what a scene was even about.

It's also fun to see how every script is filled with lines that are
"unactable" -- there's just no way any real person would ever say anything
like that. Then nine times out of ten, those lines are cut from the final
product, because even the best actors couldn't make them work.

~~~
justEgan
Fascinating! It can also be the other way around, where an actor miraculously
interpreted an unactable script well. E.g. Joaquin Phoenix delivered "It vexes
me. I am vexed." in Gladiator quite well.

~~~
Kaze404
I watched a video recently that touched on this subject when talking about the
TV show House. If you read the script for a regular House episode, there's no
conclusion to be had besides House being an insufferable, racist prick.
Instead, Hugh Laurie delivers the obviously racist lines so sarcastically that
it makes them work very well (which is obviously what the writers were going
for).

~~~
learnstats2
I've recently watched House and I found this more uncomfy than I did the first
time.

House (the character) is often being plainly racist and sexist. The fact that
he presents it as sarcasm is a vehicle for him, used to make his racism more
difficult to challenge.

~~~
matt-attack
But can’t shows be about racist people? I mean there’s shows about murderers
all day. Do people critique shows (think Dexter) as “uncomfy”? You can laugh
at, and even find endearing, racist characters. It’s part of being a grown up
I believe.

~~~
learnstats2
If there are more shows with racist people as the endearing title character
than there are with black people as the endearing title character, is that not
a problem?

------
AriaMinaei
This is great!

Also check out LanguageLearningWithNetflix [0] which lets you watch videos
with two subs in different languages, displays the subs as HTML so
select/copy/define will work (and it has a built-in dictionary too). It also
allows you to quickly jump to the beginning of each sentence so you can hear
it multiple times, which helps improve your listening skills. For me, it has
been a fun way to improve my German.

On a side-note, please notice how none of these great features are available
to mobile users. iOS for example, is technically perfectly capable of
supporting this kind of extensibility, but the App Store model limits it to a
few narrow and specific use-cases.

[0]
[https://languagelearningwithnetflix.com](https://languagelearningwithnetflix.com)

~~~
leppr
LLN is good but seems to favor anti-features like disabling text selection on
subtitles (except in the side pane) in order to push their in-house paid-for
features.

Regarding your comment about extensibility, this applies to mostly every
software platform other than the web, unfortunately. And even there, it feels
like a happy accident. There's much work to be done in this area.

~~~
davidzweig
Text selection is a fiddly CSS issue that Ognjen didn't get around to fixing
yet. There's only two paid features (saving words and machine translation.)
About 3500 paying users at $3.50/mo after taxes and fees (from 800k total
users), but before paying for servers/APIs, if you are curious. Free users are
welcome.

[ Dear Netflix, let's be friends. There's still a lot of work to do, and we
can go further, faster, with a little help. Can we get a test account?
Regards, David. languagelearningextension@gmail.com ]

------
Animats
Oh, someone has to manually time the movie. They only support about six
movies. I expected that it would use closed captioning data, do the sync
automatically, and support far more titles.

~~~
justEgan
The syncing is done automatically, at least mostly.

TL;DR: ScreenplaySubs fetches the subtitles from Netflix, parses the PDF-
formatted screenplays into JSON, and syncs by calculating the sentence
similarities between subtitle and screenplay dialogue.

In particular, we use the Universal Sentence Encoder to decide whether a
subtitle matches a line of screenplay dialogue. If a line of dialogue is
similar enough to a subtitle, the former is tagged with the timestamp
provided by the latter.
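
A minimal sketch of that matching step, not ScreenplaySubs' actual code: a
bag-of-words cosine similarity stands in for the Universal Sentence Encoder so
the example runs with only the standard library. The names `embed`, `cosine`,
`tag_dialogue`, the 0.6 threshold, and the sample lines are all assumptions
made for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_dialogue(dialogues, subtitles, threshold=0.6):
    """Give each screenplay line the timestamp of its best-matching subtitle.

    dialogues: list of str, in script order.
    subtitles: list of (start_seconds, text) pairs.
    Lines whose best score is below the threshold are tagged with None.
    """
    sub_vectors = [(start, embed(text)) for start, text in subtitles]
    tagged = []
    for line in dialogues:
        vec = embed(line)
        best_start, best_score = None, 0.0
        for start, sub_vec in sub_vectors:
            score = cosine(vec, sub_vec)
            if score > best_score:
                best_start, best_score = start, score
        tagged.append((line, best_start if best_score >= threshold else None))
    return tagged


# Made-up sample data, purely for illustration.
subtitles = [
    (12.0, "Where were you last night?"),
    (15.5, "I was at home, I swear."),
    (21.0, "Nobody is going to believe that."),
]
dialogues = ["Where were you last night?", "I was home. I swear.", "(sighs) Get out."]

for line, start in tag_dialogue(dialogues, subtitles):
    print(start, line)
```

Swapping `embed` for a real sentence encoder would change the scores, and
probably the right threshold, but not the tagging logic.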

Many of the underlying problems at each step sound deceptively simple at
first, but turn out to be quite challenging and fun to research. E.g. parsing
PDFs in general is not straightforward
([https://filingdb.com/b/pdf-text-extraction](https://filingdb.com/b/pdf-text-extraction)),
and there are only a handful of resources on parsing PDF screenplays besides a
few research papers
([https://github.com/drwiner/ScreenPy/blob/master/INT17_screen...](https://github.com/drwiner/ScreenPy/blob/master/INT17_screenplays.pdf)),
which led us to create our own open source repo for this
([https://github.com/SMASH-CUT/screenplay-pdf-to-json](https://github.com/SMASH-CUT/screenplay-pdf-to-json)).

Our screenplay-pdf-to-JSON converter captures all dialogue, transitions, and
actions within each screenplay scene. With this, we’re treating scenes as
atomic, and can detect changes in scene ordering based on the tagged scene
timestamps. This also means that if lines of dialogue are swapped within a
scene in the movie, there will be some syncing inconsistencies.
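
The scene-ordering check can be sketched roughly as follows. This is an
illustration under assumptions, not the converter's real data model: each
scene is a dict with a hypothetical `number` (its position in the script) and
`dialogue_times` (the timestamps tagged onto its dialogue, `None` where
nothing matched), and a scene's start is taken to be its earliest tagged
timestamp.

```python
def scene_start(scene):
    """Earliest tagged dialogue timestamp in a scene, or None if nothing matched."""
    times = [t for t in scene["dialogue_times"] if t is not None]
    return min(times) if times else None


def film_order(scenes):
    """Scene numbers sorted by when they start in the film.

    Scenes with no tagged dialogue are left out, since timestamps alone
    cannot place them.
    """
    timed = [(scene_start(s), s["number"]) for s in scenes]
    timed = [(start, number) for start, number in timed if start is not None]
    return [number for _, number in sorted(timed)]


def out_of_order(scenes):
    """Scenes that start earlier in the film than a scene before them in the script."""
    latest = float("-inf")
    moved = []
    for scene in scenes:  # scenes are assumed to be listed in script order
        start = scene_start(scene)
        if start is None:
            continue
        if start < latest:
            moved.append(scene["number"])
        else:
            latest = start
    return moved


# Made-up sample: scene 3 was moved ahead of scene 2 in the edit.
scenes = [
    {"number": 1, "dialogue_times": [10.0, 14.0]},
    {"number": 2, "dialogue_times": [95.0, None]},
    {"number": 3, "dialogue_times": [40.0, 44.5]},
    {"number": 4, "dialogue_times": []},
    {"number": 5, "dialogue_times": [130.0]},
]

print(film_order(scenes))
print(out_of_order(scenes))
```

Because only the scene's earliest timestamp is used, a swap of two lines
inside one scene is invisible to this check, which matches the limitation
described above.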

Some scenes have little to no dialogue, in which case the extension works on a
best-effort basis. E.g. the opening scene of There Will Be Blood has minimal
dialogue, if any. This is where I need to jump in and sync up the screenplay
manually. OTOH, the opening scene of Inglourious Basterds works very well,
since there is a ton of dialogue in it. This is why I can’t just add movies
and instantly upload them to the site.
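
One way to find the scenes that need that manual pass, sketched under
assumptions rather than taken from the extension: flag any scene whose
dialogue produced fewer than `min_matches` tagged timestamps. The scene dicts,
the `dialogue_times` field (`None` meaning no subtitle matched), and the
default of 1 are hypothetical.

```python
def needs_manual_sync(scenes, min_matches=1):
    """Return the numbers of scenes with too few tagged dialogue lines."""
    flagged = []
    for scene in scenes:
        matched = sum(1 for t in scene["dialogue_times"] if t is not None)
        if matched < min_matches:
            flagged.append(scene["number"])
    return flagged


# Made-up sample: scene 1 has no dialogue, scene 2 has dialogue that never matched.
scenes = [
    {"number": 1, "dialogue_times": []},
    {"number": 2, "dialogue_times": [None, None]},
    {"number": 3, "dialogue_times": [30.0, 41.0, None]},
]

print(needs_manual_sync(scenes))
```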

Would you be interested in me going into more detail? I was thinking of
writing a series of technical blog posts if there is enough interest!

~~~
walterbell
Please blog about the details! Are you following the W3C work on synchronized
multimedia?

[https://github.com/w3c/sync-media-pub](https://github.com/w3c/sync-media-pub)

[https://www.w3.org/community/sync-media-pub/](https://www.w3.org/community/sync-media-pub/)

~~~
justEgan
Will do! I am not aware of that, tell me more!

------
mapgrep
Very cool. I have been fascinated by this whole area of what I call “media
stapling” since I spent about two years obsessively watching the Big Lebowski
as a stress reliever. This film has no commentary track so people have
recorded their own and you have to sort of just manually sync up the mp3. I
also do a lot of interview transcription where text is stapled to audio.

Anyway I see you have a comment here where you say you use the closed captions
to figure out where to staple in the script. Would be cool to be able to
staple in arbitrary other media - text audio video whatever.

~~~
schwartzworld
Sounds like Rifftrax, the successor to Mystery Science Theater 3000.

~~~
mapgrep
Wow, thanks for the pointer, had no idea about this. They have an app that
looks like it could be so cool if they opened it to other people to use as a
platform. Although for now just using it for their own commentaries (which I'm
sure are great). Sort of a Hollywood approach as opposed to a Silicon Valley
approach (even if they aren't literally in Hollywood).

------
nmstoker
I've seen plugins like this from time to time and I always wonder to what
extent using them with a secured service (like Netflix etc) means that you've
opened yourself up to them doing all sorts of things with your account. You
need to log in, and once that's done the plugin code effectively acts as you,
doesn't it? I'm guessing there are Chrome/FF protections on the password
field, but if the plugin can do anything on a site, might it not draw its
own fake password box on top of the real one?

I'm certainly not suggesting this is done by this author and I applaud the
creation of the tool, but I'd be interested to hear opinions as to whether my
interpretation above is correct or if I'm overly cautious/overlooking
something.

~~~
apendleton
I mean, unsurprisingly, it looks like it requests permissions to execute
arbitrary code on your behalf on netflix.com. So yeah, it can do... a lot. It
could, for example, click the logout button on your behalf, wait for you to
log back in again and keylog your password when you do (there aren't any
special protections there -- you can access it like any other field from
privileged JS, and other extensions like password managers depend on this
being the case), then use that to, in the background, change your password and
recovery email, and then log you back out again. Any use of Chrome extensions
that can execute scripts requires some degree of trust, for better or for
worse.

------
justEgan
ScreenplaySubs is a browser extension for Netflix that syncs up movies with
screenplays, displaying them side by side. It's like having a subtitle that
provides more insight into your films.

Demo: [https://vimeo.com/447986440](https://vimeo.com/447986440)

~~~
tyingq
I really like the Amazon Prime Video "X ray" feature that shows the actors,
bios, etc, in the current scene. It's odd to me that other services like
Netflix don't have something similar.

~~~
sigjuice
Most likely other services don't do this because of patents.

~~~
smegger001
and they don't own IMDB where much of this information is already organized
into a usable state.

------
noisy_boy
As I checked the demo video and read the screenplay, I could actually imagine
the shots, camera angle - the images basically appeared in my head with Tom
Holland in them (without actually playing the video or remembering the movie).
This is very interesting.

------
qwertox
It would be interesting to have a TTS synth output the screenplay on another
card, one which could be used by a blind person to plug some headphones in
(not covering all the sound, in order to hear the environment and speech).
Maybe even optionally disable the spoken words, and only output the scene
description, and emit a beep on a cut.

The demo on the page looks great, and this is stuff which should be
automatable at some point by AI.

------
Gaelan
Where do you get the screenplays, out of curiosity?

~~~
abathur
I wondered this, too. Also--which draft, if there's more than one? Is it like,
"latest available", or "easiest to parse", or do you have a policy like _only
shooting scripts_?

------
gitgud
Nicely implemented, you just need to recruit people to add more movies!

------
maydemir
It's not for me but it's a great extension!

~~~
justEgan
much appreciated!

------
anotheryou
I don't use netflix and prefer subtitles, but it looks nice!

Maybe you could implement smooth scroll and some sort of an overlay mode.

~~~
justEgan
Thank you, and that's a great idea I've considered for future releases. More
specifically, the layout presented in this video looks ideal:
[https://www.youtube.com/watch?v=HybzbDBF7HQ](https://www.youtube.com/watch?v=HybzbDBF7HQ).
Where the screenplay can replace the letterbox at the bottom.

One of the reasons we decided not to implement that for now is to leave more
room for error, since our algorithm is still not perfect. Sometimes the
extension chooses to focus on 1 or 2 sentences next to the accurate dialogue.
Having an entire viewport height to show the screenplay means that even if
some inconsistencies occur, the user may still be able to see the accurate
dialogue.

~~~
anotheryou
makes sense :)

yea I like this example better, but thought of an actual overlay.

------
sytelus
This is a Chrome extension that is asking to read browser history. Why?

------
ryanisnan
Really cool! It seems like the data synchronization is done manually?

~~~
justEgan
There’s some automation going on. Check out my reply to one of the skeptical
comments!

------
kyle_martin1
This is cool. I did this exact thing in 2013 with DistanceFlix.com

------
digitalsushi
Nice!

Nerd out with your word out.

------
csours
Site is flagged from work PC

------
phreeza
Data source for GPT-4?

