

Over 500,000,000 assertions extracted from 100 million Web pages - coderdude
http://www.cs.washington.edu/research/textrunner/

======
onewland
Predicate = Invented has some scary results for "reliable" sites:

Al Gore invented global warming (33)

Jews invented the Catholics (7), Christianity (5), the legend of the Holocaust
(5)

Staples invented the office superstore concept (6)

~~~
hugh3
Well, didn't Jews invent Christianity?

------
hugh3
My first attempt was "Who shot JFK?" The first answer I got was "President
Kennedy shot dead JFK", which is at least a novel theory.

~~~
ultrasaurus
On the other hand it kind of correctly fingers Kristin as "Who shot JR"
(although it considers the JR of "Secretary of Balloon Doggies" fame to be the
more relevant).

Video is correctly tagged as killing the radio star. I'm mostly impressed.

~~~
hugh3
It also correctly identifies the shooters of Abraham Lincoln, Ronald Reagan
and Mr Burns.

In unsolved mystery news, it told me that Jimmy Hoffa is buried under Giants
Stadium (I thought Mythbusters busted that one...) and John Karr _did not_
kill JonBenet Ramsey. In answer to "Who was Jack the Ripper?" I was simply
told "the fictional world is Jack the Ripper", which seems a little too
abstract to be useful.

------
cldwalker
Why query open extraction data when there are billions of triples available
via the semantic web?

Movie queries: [http://www.snee.com/bobdc.blog/2008/11/sparql-at-the-movies....](http://www.snee.com/bobdc.blog/2008/11/sparql-at-the-movies.html)

Geographical queries: <http://geosparql.appspot.com/>

Misc queries: <http://sparql.me/queries.php>

~~~
henrikschroder
Because no one gives a shit about the semantic web except its core supporters?

------
waterlesscloud
Whatever else it does, the site makes me immediately start thinking of prolog
projects (not that I'll actually pursue them).

------
bnoordhuis
The results for 'what causes promiscuity' are both amusing and slightly
worrisome.

    
    
      objectification of the self (2), Spring break parties (2) can lead to unintended promiscuity
      vaccine (2) will cause promiscuity
      sex education (2) causes promiscuity

~~~
Tichy
Pondering the concept of "unintended promiscuity".

~~~
Deestan
A nicer way of saying "gang rape"?

------
ph0rque
Does anyone have a query with an impressive answer? The ones I thought up have
no results.

~~~
moconnor
Argument 1: Paul Graham

Predicate: is

Top result: Paul Graham is Dead (17)

Which just goes to show what happens when you consider ancient slashdot
comments to be an authoritative source, I guess...
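For anyone unsure what the Argument 1 / Predicate form is doing, here is a minimal sketch of that kind of lookup. It is not TextRunner's actual implementation or API: the `Assertion` type and `query` function are hypothetical names, the rows are copied from results quoted in this thread, and treating the number in parentheses as an extraction count is an assumption.

```python
from typing import List, NamedTuple, Optional


class Assertion(NamedTuple):
    arg1: str
    predicate: str
    arg2: str
    count: int  # the number shown in parentheses (assumed to be an extraction count)


# Toy data: results quoted by commenters in this thread, nothing more.
ASSERTIONS = [
    Assertion("Paul Graham", "is", "Dead", 17),
    Assertion("Al Gore", "invented", "global warming", 33),
    Assertion("Staples", "invented", "the office superstore concept", 6),
]


def query(
    assertions: List[Assertion],
    arg1: Optional[str] = None,
    predicate: Optional[str] = None,
    arg2: Optional[str] = None,
) -> List[Assertion]:
    """Return assertions matching every field that is not None, highest count first."""

    def matches(a: Assertion) -> bool:
        pairs = ((arg1, a.arg1), (predicate, a.predicate), (arg2, a.arg2))
        # A field left as None acts as a wildcard; comparison ignores case.
        return all(want is None or want.lower() == got.lower() for want, got in pairs)

    hits = [a for a in assertions if matches(a)]
    return sorted(hits, key=lambda a: a.count, reverse=True)


if __name__ == "__main__":
    for hit in query(ASSERTIONS, arg1="Paul Graham", predicate="is"):
        print(f"{hit.arg1} {hit.predicate} {hit.arg2} ({hit.count})")
```

Leaving a field out makes it a wildcard, so `query(ASSERTIONS, predicate="invented")` corresponds to the "Predicate = Invented" search mentioned earlier in the thread.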

~~~
hugh3
Or you can just ask it "who is dead?"

You get 287 results, the top ten being: Queen, Hip Hop, Heath Ledger,
Microsoft, Science, Democracy, Rosencrantz and Guildenstern, Jazz, All Humans,
and Elvis.

(Incidentally, after that I had a sudden urge to check google news to make
sure the Queen wasn't dead. She isn't.)

------
stcredzero
_names of python libraries_ - only 1 result (PyRobot)

But "python library" returned 219 results. It also understands relationships
in the text.

_script requires the Python Imaging Library (3)_

_EAN bar code required PIL ( python imaging library (2)_

_python bindings require the C library (2)_

------
ilitirit
"TextRunner searches hundreds of millions of assertions extracted from 500
million high-quality Web pages."

~~~
coderdude
I got the title from this page:
<http://www.cs.washington.edu/research/knowitall/>

"Demo: TextRunner extracted over 500,000,000 assertions from 100 million Web
pages."

I didn't realize there was a discrepancy between the two pages when I
submitted the URL.

------
slapshot
Strangely, it has no answer for "What is textrunner?" (does not appear to be
linkable, but I'm serious).

~~~
sumeeta
I really wonder what “high-quality Web pages” are. `What is Hacker News?`
doesn’t give any results either.

~~~
coderdude
In the context I believe they mean pages with a large number of grammatically
well-formed sentences.

------
krosaen
"what has apps has apple banned?" didn't work :)

------
mitko
Another great application of CRFs. Go AI!

