
Show HN: QuoDB – Movie quote search engine based on subtitles - lusob
http://www.quodb.com
======
boomzilla
Ah, some search relevance needs to be worked on :) No movie should rank higher
than the Terminators for "I'll be back":

[http://www.quodb.com/#search/i'll%20be%20back](http://www.quodb.com/#search/i'll%20be%20back)

~~~
cettox
The "asta la vista baby" query does not return Terminator at all!

~~~
pelario
For that you need auto correction; "hasta la vista baby" gives the correct
result
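
A minimal sketch of the kind of autocorrection meant here, assuming a small
in-memory list of known quotes (the list and the `suggest` helper are
hypothetical; nothing here reflects how QuoDB actually works). It uses
Python's standard `difflib` to map a misspelled query to the closest known
quote:

```python
import difflib

# Hypothetical stand-in for the site's quote index.
KNOWN_QUOTES = ["hasta la vista baby", "i'll be back", "we've got company"]

def suggest(query, quotes=KNOWN_QUOTES, cutoff=0.8):
    """Return the closest known quote, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(query.lower(), quotes, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(suggest("asta la vista baby"))  # prints: hasta la vista baby
```

A real index would be far too large to scan like this, so treat it only as an
illustration of "did you mean" behaviour.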

------
thisjepisje
I think lots of us have had this idea, great to see it implemented.

------
superasn
I've often wondered how a database like this could be used in other areas of
programming, say a text-to-speech engine[1], where the algorithm could use
subtitles to guess the context of the conversation and produce better results.

[1]
[http://www.slate.com/articles/technology/technology/2009/03/...](http://www.slate.com/articles/technology/technology/2009/03/read_me_a_story_mr_roboto.html)

~~~
stingraycharles
I actually worked on this exact problem during an internship at our
university. We used a _huge_ corpus of communication (for example, we had
access to all the emails ever sent internally at Enron).

We used this as the basis to train a speech-to-text engine by automatically
correcting likely-wrong interpretations. "I go loo school" would be corrected
to "I go to school", for example. It worked remarkably well.
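
A rough sketch of that idea, not the actual system: it assumes a tiny
hypothetical corpus, counts word pairs (bigrams), and replaces any word the
corpus has never seen with the known word that best fits its neighbours. The
names `train_bigrams`, `context_score` and `correct` are made up for
illustration, and a real engine would also weigh acoustic similarity.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count adjacent word pairs and collect the vocabulary."""
    counts, vocab = Counter(), set()
    for sentence in sentences:
        tokens = sentence.lower().split()
        vocab.update(tokens)
        counts.update(zip(tokens, tokens[1:]))
    return counts, vocab

def context_score(candidate, prev_tok, next_tok, counts):
    """How often the candidate was seen after prev_tok and before next_tok."""
    return counts[(prev_tok, candidate)] + counts[(candidate, next_tok)]

def correct(sentence, counts, vocab):
    """Replace out-of-vocabulary words with the best-fitting known word."""
    tokens = sentence.lower().split()
    fixed = list(tokens)
    for i, token in enumerate(tokens):
        if token in vocab:
            continue  # only touch words the corpus has never seen
        prev_tok = fixed[i - 1] if i > 0 else None
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
        best = max(sorted(vocab),
                   key=lambda c: context_score(c, prev_tok, next_tok, counts))
        if context_score(best, prev_tok, next_tok, counts) > 0:
            fixed[i] = best
    return " ".join(fixed)

# Hypothetical training data standing in for a large email corpus.
CORPUS = ["I go to school", "we go to work", "I walk to school"]
COUNTS, VOCAB = train_bigrams(CORPUS)

print(correct("I go loo school", COUNTS, VOCAB))  # prints: i go to school
```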

These subtitles could serve as a basis too, but there are far bigger (and
better?) collections of data for training these machine learning engines.

~~~
haraball
Could you recommend any of these data collections if they are open to the
public?

~~~
law
This is very likely the Enron corpus that was used:
[https://www.cs.cmu.edu/~./enron/](https://www.cs.cmu.edu/~./enron/)

~~~
stingraycharles
I can confirm that this is the corpus. I can also confirm that, even though
the emails are all from mid-to-senior management, the writing style is very
sloppy.

------
acangiano
The first thing I searched for was "you look like shit" which is a very common
remark in movies and shows.

[http://www.quodb.com/#search/you%20look%20like%20shit](http://www.quodb.com/#search/you%20look%20like%20shit)

531 titles. Wow.

~~~
jnks
I'm partial to "We've got company!"

[http://www.quodb.com/#search/we've%20got%20company](http://www.quodb.com/#search/we've%20got%20company)

Incoming is the moral equivalent (and is much more popular), but is less
impressive since it's only one word.

[http://www.quodb.com/#search/incoming](http://www.quodb.com/#search/incoming)!

~~~
oneeyedpigeon
[http://www.quodb.com/#search/is%20my%20middle%20name](http://www.quodb.com/#search/is%20my%20middle%20name)

------
jamespo
Where are the movie cover thumbnail images from? I'd like a source for an idea
I have.

------
adityar
Suddenly Fight Club is in the same league as Ugly Betty...

[http://www.quodb.com/#search/i%20want%20you%20to%20hit%20me%...](http://www.quodb.com/#search/i%20want%20you%20to%20hit%20me%20as%20hard%20as%20you%20can)

------
cafard
Very neat. I queried for a line I remembered as "a story Englishmen tell when
they're down in the mouth", and it corrected this to "Englishmen tell it
[etc.]", identifying the movie as Beat the Devil.

------
krmmalik
I typed in "finality" as the search term. There's a scene where Nick Nolte
uses this word in a speech to the Hulk. It only came up with results
containing "finaLLy"(?)

------
wamatt
Nice design; fast and functional too. Kudos!

So while the fuzzy matching is neat, sometimes it's handy to be able to
perform an exact search as well.

Typically this is done using "quotation marks" around the search term(s).
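
A small sketch of how that convention could be handled, assuming a
hypothetical `parse_query` helper: text inside double quotes must appear
verbatim, and everything else is matched loosely (plain substring matching
stands in here for whatever fuzzy matching the site really uses).

```python
import re

def parse_query(query):
    """Split a query into exact phrases (in double quotes) and loose terms."""
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    return phrases, remainder.split()

def matches(line, query):
    """True if the line contains every quoted phrase verbatim and every loose term."""
    phrases, terms = parse_query(query.lower())
    text = line.lower()
    return all(p in text for p in phrases) and all(t in text for t in terms)

print(matches("Are you screwing with me?", 'screw you'))    # prints: True
print(matches("Are you screwing with me?", '"screw you"'))  # prints: False
```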

------
xerophtye
Interesting case: I searched "screw you" and it also turned up results like
"are you screwing with me?", "we would have been screwed...", etc.

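One plausible explanation, and only a guess, is that the index stems words
before matching. A deliberately crude suffix stripper (hypothetical, far
simpler than a real stemmer) shows how "screwing" and "screwed" would both
collapse to "screw":

```python
import re

SUFFIXES = ("ing", "ed", "s")

def crude_stem(word):
    """Strip one common suffix, keeping at least three letters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def stems(text):
    """Set of crude stems for every word in the text."""
    return {crude_stem(w) for w in re.findall(r"[a-z']+", text.lower())}

def shared_stems(query, line):
    """Stems that the query and the subtitle line have in common."""
    return stems(query) & stems(line)

print(crude_stem("screwing"), crude_stem("screwed"))  # prints: screw screw
```

With this, `shared_stems("screw you", "we would have been screwed")` is
`{"screw"}`, so a ranker that matches on any shared stem would return that
line.
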
------
LukeShu
Very cool!

What is it using localStorage for? Without dom.storage.enabled, it's just a
blank white page with a footer.

------
fletchowns
Is this legal?

------
golergka
As someone who often hunts old movies for samples of random phrases, this is
just so perfect. Thanks.

------
boristhespider
Excellent. One useful feature would be the ability to sort search results by
age, for example.

------
mileschet
Congrats! Do you plan to include subtitles from other languages?

------
tonglil
Waterboy: "this is some high quality h2o" yields nothing.

------
HugoDias
Ok, really works!
[http://cl.ly/image/2b1l1s321123](http://cl.ly/image/2b1l1s321123)

------
d0100
This is halfway to what I wanted to do. Just add the movie's clip together
with the quote and, boom, gold.

~~~
xerophtye
One developer did something like that. She'd use subtitles to create GIFs of
movie quotes. Made a utility for it! Will post if I can find the link.

~~~
mpeg
[http://quotacle.com/](http://quotacle.com/)

Also posted to HN not too long ago

