

Google "Reading Level" filter - madhurk
http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=1095407

======
jimrandomh
You can estimate the literacy of a forum by choosing "annotate results with
reading levels", then doing a search for site:example.com. It gives a
breakdown of the fraction of pages at each reading level. Far from perfect,
but interesting nevertheless.

    
    
      site:news.ycombinator.com: 39/55/4
      site:mathworld.wolfram.com: <1/3/96
      site:stackoverflow.com: 5/88/5
      site:lesswrong.com: 12/67/20
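
A minimal sketch for comparing breakdowns like the ones above, assuming the three numbers are basic/intermediate/advanced percentages. The name `parse_breakdown` and the choice to treat `<1` as an upper bound of 1 are illustrative assumptions, not anything Google documents.

```python
def parse_breakdown(text):
    """Parse a 'basic/intermediate/advanced' string such as '39/55/4'.

    A field like '<1' is stored as 1.0, i.e. as its upper bound
    (an assumption made for this sketch).
    """
    parts = [p.strip() for p in text.split("/")]
    if len(parts) != 3:
        raise ValueError(f"expected three fields, got {text!r}")
    values = [float(p.lstrip("<")) for p in parts]
    return dict(zip(("basic", "intermediate", "advanced"), values))


# Figures copied from the comment above.
sites = {
    "news.ycombinator.com": "39/55/4",
    "mathworld.wolfram.com": "<1/3/96",
    "stackoverflow.com": "5/88/5",
    "lesswrong.com": "12/67/20",
}

# Rank sites by their share of "advanced" pages, highest first.
ranked = sorted(
    sites,
    key=lambda s: parse_breakdown(sites[s])["advanced"],
    reverse=True,
)
for site in ranked:
    print(site, parse_breakdown(sites[site]))
```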

~~~
MichaelApproved
site:reddit.com: 80/19/1

All sites I've searched seem to have at least 1% advanced.

~~~
m-photonic
site:4chan.org - 38/59/2

Huh.

------
anigbrowl
Oh YES. This makes me very, very happy. Now, can I have it as an option in the
search settings, or at least readily accessible from the left-hand gutter,
please?

Pretty please? I'll be your best friend, heroic anonymous Google search
engineers!

------
njrc
It would be awesome if they could integrate that with a translation feature,
such that content in high reading level (perhaps dense with technical jargon,
etc.) could be translated into a simpler, yet semantically correct version.

~~~
nowarninglabel
And then run it across Wikipedia and add the subsequent results as
edits/additions to <http://simple.wikipedia.org/>

------
6ren
Searching for "xsd" (XML Schema Definition), the actual specs only come up at
the _advanced_ level, which is pretty accurate if you've ever suffered through
them. Wikipedia entries appear here too.
[http://www.google.com.au/search?tbs=rl%3A1%2Crls%3A2&q=x...](http://www.google.com.au/search?tbs=rl%3A1%2Crls%3A2&q=xsd)

_intermediate_ gives you tutorials and tools.

_basic_ has news, forum posts, YouTube (comments?) and (apparently) random-ish
lists.

You can also use it to filter HN, for the _advanced_ "interesting" comments:
[http://www.google.com.au/search?tbs=rl:1,rls:2&q=site:ne...](http://www.google.com.au/search?tbs=rl:1,rls:2&q=site:news.ycombinator.com+interesting)
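
A rough sketch of building such filter URLs, based only on the `tbs=rl:1,rls:2` parameter visible in the links above. Only `rls:2` for _advanced_ is suggested by those links; the values 0 and 1 for the other levels, and the helper name `reading_level_url`, are guesses for illustration.

```python
from urllib.parse import urlencode

# Only "advanced" -> 2 is suggested by the links in the comment;
# the other two values are assumptions.
LEVELS = {"basic": 0, "intermediate": 1, "advanced": 2}


def reading_level_url(query, level, host="www.google.com"):
    """Build a search URL restricted to one reading level."""
    tbs = f"rl:1,rls:{LEVELS[level]}"
    # urlencode percent-escapes ':' and ',' and turns spaces into '+'.
    return f"http://{host}/search?" + urlencode({"tbs": tbs, "q": query})


print(reading_level_url("xsd", "advanced", host="www.google.com.au"))
print(reading_level_url("site:news.ycombinator.com interesting", "advanced"))
```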

------
carucez
I got to thinking, how would America's universities compare? It's an
interesting thought: The best universities should have the most advanced
reading level.

I've published my findings, a complete ranking, and source code:
[http://log.largevoid.com/2010/12/ranking-colleges-by-reading...](http://log.largevoid.com/2010/12/ranking-colleges-by-reading-level/)

------
yagibear
I'm surprised I can't yet find a service that rates the reading level of books
so as to guide selection by early readers (typically children, but also
learners of foreign languages). Amazon? Startup?

------
MichaelApproved
Not available from the mobile site. Make sure to switch to full site if you're
using your phone.

Edit: Doesn't even seem to be there when I switch to classic. Using Droid 2.

------
akozak
I'm curious if they'll ever supplement machine learning data with structured
data, e.g. something like dct:educationLevel expressed in rdfa.

~~~
adambyrtek
Who do you mean by "they"? This is something that only site owners might be
expected to do, definitely not Google. But even assuming there is a standard
way of describing that, I doubt that anybody would care enough to include such
metadata. An algorithmic approach is so much more effective in this case.

~~~
akozak
I suppose I should have said "supplemented the search results". I disagree
that people don't care (many people already do care about educational
metadata).

------
coderdude
Does anyone know of a technical overview of this? I'm curious as to whether
they're assigning every word in some lexicon a reading level.

~~~
ronnier
There are multiple ways of doing it; here's one:

[http://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readabil...](http://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_test)
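
For a sense of how that test works, here is a small sketch of the Flesch–Kincaid grade level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. Nothing in this thread confirms that Google uses it. The syllable counter is a crude vowel-group heuristic chosen for illustration, and the function names are made up here.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, minus a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Approximate U.S. school grade level of an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


simple = "The cat sat on the mat."
dense = "Iatrogenic complications necessitate multidisciplinary deliberation."
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

Longer sentences and longer words both push the score up, which fits the single-word results reported elsewhere in this thread ('lemma', 'iatrogenic').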

~~~
coderdude
Thanks for the link!

------
gojomo
Anyone know what level-estimator they're using?

Game: find the 1-word query that has the largest percentage of 'basic' or
'advanced' results.

My starting entry after about 5 tries: 'doctoral' -- 51% advanced.

Update: 'enzyme' -- 94% advanced.

Update2: 'iatrogenic' -- 95% advanced. And no more entries from me.

~~~
whatusername
theorem: 97%

Theoretical: 60%

Theory: 41%

~~~
wnoise
Theorem made me try lemma: 99% advanced, <1% basic, <1% intermediate.

~~~
whatusername
nice.

A new challenge for HN: Find a pair or series of words that are synonymous but
have the largest variance in Advanced readers. Bonus points if you can provide
a progression of words from basic to advanced.

~~~
derefr
"black hole": 57/37/5

"gravitational singularity": 4/7/88

------
TotlolRon
Search for [Donkey].

Advanced results bring the _Equus africanus asinus_ from Wikipedia first. Good.
Basic results bring _Donkey (Shrek)_ from Wikipedia second. Google might be
onto something here... but wait... what is that at number one for the
basic reader... hmm... "YouTube - Donkey Rapes Man". Hmm... who are our
"basic" readers? ... Gotta think about this feature a little more... ;)
Meanwhile, a classic: <http://www.youtube.com/watch?v=qRm8okHhapU>

~~~
Lewisham
This is actually a reasonably good search. It does highlight the clashes
decently well.

I would like it if Google expanded this to allow you to prioritize.
Prioritizing advanced reading level documents seems like it would be of value
to me.

It's worth noting that, yes, effective communication means that the document
should be readable by anyone, but it doesn't seem that Google is tagging
pompous, impenetrable documents as "advanced". For example, the [donkey]
search comes up with "All About DONKEYS!" [1] as an "advanced" text. It
doesn't seem to be difficult to read, it is just information-rich.

The search for [cheese], however, doesn't sort the wheat from the chaff like
[donkey]. GNOME's Cheese project, for example, is classified as "Basic reading
level", and so is filtered out from the "advanced" search.

[1] <http://www.lovelongears.com/about_donkeys2.html>

