
Stuff Harvard, MIT, Stanford & Caltech People Like - krat0sprakhar
http://blog.echen.me/2011/09/29/stuff-harvard-people-like/
======
dm8
_Berkeley, sadly, is perhaps too large and diverse for an overall
characterization._

So true. As a Cal alum, I think he is spot on here. :-)

 _I pulled about 400 followers from each school, and added a couple filters,
to try to ensure that followers were actual attendees of the schools rather
than general people simply interested in them_

How did he ensure followers were actual attendees of the schools
programmatically? It would be really hard to find out this type of
information, and doing so could be considered borderline creepy in some cases.

EDIT: He also works as a data scientist at Twitter. So I'm sure he has access
to a lot more internal data, rather than relying on some sort of mashup of the
Twitter, FB, and LinkedIn APIs.

~~~
aggie
_How did he ensure followers were actual attendees of the schools
programmatically?_

He mentions in the comments "I basically just checked that they didn't follow
any other schools (from a small list). It's certainly not the greatest filter,
but it did seem to work for a small number of people I hand-checked."
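
A minimal sketch of the filter described in that quote, assuming each user is represented by the set of things they follow. The school list, the user names, and the `likely_attendees` helper are all hypothetical, not taken from the author's code.

```python
# Hypothetical "small list" of schools to check against.
SCHOOLS = {"Harvard", "MIT", "Stanford", "Caltech", "Berkeley"}

def likely_attendees(follows_by_user, school, schools=SCHOOLS):
    """Return users who follow `school` and no other school in `schools`."""
    return sorted(
        user
        for user, follows in follows_by_user.items()
        # Keep the user only if the schools they follow are exactly {school}.
        if follows & schools == {school}
    )

# Made-up example data: user -> set of followed accounts/topics.
follows_by_user = {
    "alice": {"MIT", "Hip-Hop Music"},
    "bob": {"MIT", "Stanford", "Startups"},
    "carol": {"Harvard", "Taylor Swift"},
    "dave": {"MIT"},
}

print(likely_attendees(follows_by_user, "MIT"))  # ['alice', 'dave']
```

As the quote says, this is a weak filter: it drops people who follow several schools, but it cannot tell an attendee from a fan who happens to follow only one.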

------
ScottBurson
Interesting. Would also like to see p(topic|school=X).

~~~
ballooney
Well, god gave us Bayes' rule for a reason.
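
Spelled out, and assuming the article's numbers are read as P(school | topic), the inversion would be:

```latex
P(\text{topic} \mid \text{school} = X)
  = \frac{P(\text{school} = X \mid \text{topic}) \, P(\text{topic})}{P(\text{school} = X)}
```

That still needs the marginals P(topic) and P(school = X), which would have to be estimated from the same sample.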

The article does not point out that sampling Quora users might not be (I would
say is probably not) an unbiased estimator of the students of these places
as a whole. Quora attracts a certain kind of person; I've never seen it
mentioned anywhere outside of techy startup / valley circles. Maybe that's
implied by virtue of it being in the HN ecosystem, but still it should be
explicitly stated as a flaw in the method. Lies, damned lies and statistics
etc.

~~~
azylman
_The article does not point out that sampling Quora users might not be (I
would say is probably not) an unbiased estimator of the students of these
places as a whole._

From the article:

 _Also, a word of warning: my dataset was fairly small and users on Quora are
almost certainly not representative of their schools as a whole (though I
tried to be rigorous with what I had)._

~~~
ballooney
I _completely_ missed that, mea culpa, my apologies to the author. I'll leave
the message above unedited as a little warning sign to other people who can't
read: you too can look silly like me.

------
rjtavares
MIT and Stanford like Hip-Hop Music. I'm pretty sure MIT prefers Biggie and
Stanford prefers Tupac (note: this has nothing to do with east/west coast, but
rather pure skill vs. charisma)

------
carlob
Interesting, but wrong use of conditional probabilities. All OP is saying is
that the frequency of people from school x following topic y is p.

The dataset is just not the right one to say that P(x|y) = p, because of, say,
all the people who follow food in NYC and go to NYU, who were not taken into
account here.
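
A toy illustration of the point, using made-up (school, topic) pairs rather than anything from the article. The helper names are hypothetical. Leaving NYU out of the sample changes the estimate of P(school | topic) but not P(topic | school):

```python
# Hypothetical (school, topic) pairs, one per user-topic follow.
full = [
    ("MIT", "Hip-Hop Music"), ("MIT", "Food"),
    ("Stanford", "Hip-Hop Music"), ("Stanford", "Food"),
    ("NYU", "Food"), ("NYU", "Food"),
]
# The sampled dataset only covers a few schools, so NYU is missing.
sampled = [(s, t) for s, t in full if s != "NYU"]

def p_school_given_topic(pairs, school, topic):
    """Fraction of follows of `topic` that come from `school`."""
    topic_total = sum(1 for _, t in pairs if t == topic)
    joint = sum(1 for s, t in pairs if s == school and t == topic)
    return joint / topic_total if topic_total else 0.0

def p_topic_given_school(pairs, school, topic):
    """Fraction of follows from `school` that are of `topic`."""
    school_total = sum(1 for s, _ in pairs if s == school)
    joint = sum(1 for s, t in pairs if s == school and t == topic)
    return joint / school_total if school_total else 0.0

print(p_school_given_topic(sampled, "MIT", "Food"))  # 0.5
print(p_school_given_topic(full, "MIT", "Food"))     # 0.25
print(p_topic_given_school(sampled, "MIT", "Food"))  # 0.5
print(p_topic_given_school(full, "MIT", "Food"))     # 0.5
```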

------
sesqu
Surprisingly diverse interests, considering the huge bias in the dataset
(public quora profiles). Well worth reading, though I was left wondering about
the inverse probabilities.

------
sadga
Article title should be "Reverse index of where people went to school who talk
about various topics".

Obviously, the author chose a more wieldy title, but a far less correct
one.

------
romain_g
Taylor Swift ?

~~~
probably
Dude Taylor Swift is awesome. I can relate.

But good eye on that. ;)

------
wilfra
"Berkeley, sadly, is perhaps too large and diverse for an overall
characterization."

That isn't sad at all. It's great. As a UC grad, the diversity of the student
body was one of the absolute best things about my college experience. Do I
wish I went to Harvard or Stanford? Sure, but not for their homogeneous
student bodies.

Otherwise, pretty interesting read.

~~~
echen
It was a tongue-in-cheek comment :) -- sad from the perspective of me trying
to find a cute pattern for Berkeley.

------
baritalia
Interesting thing to read while sipping your morning coffee, but that's it,
really.

~~~
nuttendorfer
Which is just what I did. Nothing wrong with that.

