
The science of science - pama
http://www.economist.com/node/18618025?story_id=18618025
======
adi92
They are talking about Latent Dirichlet Allocation, which was introduced by
Blei, Ng, and Jordan in 2003
( <http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation> )

Here is a video by him where he explains the basic model as well as the
timeline-related hacks the article talks about -
<http://video.google.com/videoplay?docid=3077213787166426672#>

You can very easily play with LDA yourself with this toolkit -
<http://mallet.cs.umass.edu/>
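
If you just want to see roughly what the basic model does before installing anything, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling in plain Python. It is not MALLET's code or the code from the paper. The function name `lda_gibbs`, the toy corpus, and the hyperparameter values are all made up for illustration, and Gibbs sampling is just one common way to fit the model.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized).

    docs is a list of documents, each a list of integer word ids.
    Returns (doc_topic, topic_word) count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic:
                # (how much doc d likes topic k) * (how much topic k likes word w).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word

# Hypothetical toy corpus: three biology-flavored and three astronomy-flavored docs.
vocab = ["gene", "dna", "cell", "star", "galaxy", "orbit"]
word_id = {w: i for i, w in enumerate(vocab)}
texts = [
    "gene dna cell gene", "dna cell cell gene", "cell gene dna dna",
    "star galaxy orbit star", "galaxy orbit orbit star", "orbit star galaxy galaxy",
]
docs = [[word_id[w] for w in t.split()] for t in texts]

doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))

# Show the most frequent words assigned to each topic.
for k, counts in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda w: -counts[w])[:3]
    print("topic", k, [vocab[w] for w in top])
```

On a real corpus you would want a proper toolkit like the one above, since this sketch does no stopword handling, hyperparameter tuning, or convergence checking.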

~~~
jkan
I think they're talking about this one:

<http://projecteuclid.org/euclid.aoas/1183143727>

This work uses the correlated topic model, which extends LDA to model
correlations among the extracted topics.

Edit: and this one [pdf]:

<http://www.icml2010.org/papers/384.pdf>

------
Anon84
"If It Has "Science" in the Name, It's Not"

~~~
possibilistic
Unfortunately true. This seems to be a pop-sci write-up of a data mining and
semantic clustering algorithm for scientific publications. The author claims
the groupings may be better than citations (less political, for instance) in
showing how ideas impact a given research community.

~~~
dereg
People with last names in the "lower" part of the alphabet are also cited more
than their higher-alphabetized counterparts.

~~~
crocowhile
[citation needed]

------
btilly
Classifying science sounds like a boring application of this algorithm. I'd be
more interested to see it used for spontaneously identifying groups of related
publications of any kind, such as blogs.

~~~
jacques_chester
Blogs are a simple case because the default is to refer to each other through
inline hyperlinks.

I've also seen linguistic data mining studies done that suggest that blogs
which commonly agree with each other, commonly agree with each other. For
purposes of politeness, I acted astonished at this revelation.

