
What makes one appear smarter and more sociable?  - bvi
http://judg.me/blog/judgment-day/
======
pflats
When you don't start your y-axis at 0, you skew the interpretation of your
data. At best, this is a significant mistake, and at worst, this is
intentionally misleading.

Take this graph:

<http://i.imgur.com/bBzCK.png>

It looks like women are rated as more than twice as smart as men. Huge
difference.

That is, until you run the numbers. Women are rated about 4.3% "smarter" than
men. Not twice as smart, like the graph implies. Not 20% smarter. Not even 5%.

Please, pay attention to your graphs. They're great tools, but they can
mislead as much as they can help elucidate.
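
To make the arithmetic concrete, here is a minimal sketch of how a truncated axis inflates a difference. The two averages are hypothetical values chosen only to reproduce a roughly 4.3% gap; they are not taken from the post. The baseline of 5 is the alternative axis start mentioned elsewhere in this thread.

```python
def true_ratio(a, b):
    """Ratio of the two values themselves."""
    return a / b


def apparent_ratio(a, b, baseline):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


# Hypothetical averages on a 0-10 scale (assumed, not from the blog post).
men, women = 5.20, 5.424
baseline = 5.0  # y-axis starting at 5 instead of 0

actual_pct = (true_ratio(women, men) - 1) * 100
bar_ratio = apparent_ratio(women, men, baseline)

print(f"actual difference: {actual_pct:.1f}%")
print(f"bar for women is drawn {bar_ratio:.2f}x as tall as the bar for men")
```

With the axis starting at 0 the two bars would be nearly the same height, which is the honest picture of a gap this small.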

~~~
bvi
True, and I was having a tough time deciding which option to take (whether to
start the y-axis at 0, or 5). As far as averages are concerned, the
differences are in the tenths and hundredths - so starting the y-axis at 0
would have made it more difficult to see any differences across the graphs.

~~~
mhartl
_starting the y-axis at 0 would have made it more difficult to see any differences
across the graphs._

That's the point: the differences are small.

~~~
scott_s
I would go further: the differences probably are not _significant_. (Small
differences may still be significant. In this case, I doubt it.)

------
boredguy8
Inter-rater reliability is super important for tests like this.
<http://en.wikipedia.org/wiki/Inter-rater_reliability> The gist is: you can't
simply mark one person as "asian" and assume that categorization is correct.
In that respect, the data would reveal more about the person sorting the
photos than it would reveal about the perceptions of those that are rating the
photos.

Second, there is a huge problem with causality here. So for instance, the
author writes: "Be Asian if you want to appear smart; Latino if you want to
appear extroverted." The problem is that there is a methodological flaw. On
the first photo I saw on judg.me, I was presented with this image:
<http://images.judg.me/82e7fcbd988dbdcac0d00bd53fb93e96.jpg> This would appear
to me to be a latino or hispanic male at a party. I'm highly inclined to rate
them highly on the extrovert scale: they're at a party. But that doesn't
show that stereotypically latino or hispanic features indicate extroversion. It
could be that people with stereotypically latino or hispanic features were
more likely to upload photos _in which the image portrayed_ a more
stereotypically extroverted activity.

Third, it appears that users can upload a photo to the site _and_ see their
feedback from votes. It seems highly possible that users self-select a photo
that will best affirm the image of themselves they wish to cultivate. In that
respect, there's both a huge confirmation bias and huge self-selection bias.
If I want to think of myself as an academic, I'll upload a picture of me at my
desk studying and watch the "intellectual" ratings pour in. Then I can feel
assured that other people perceive me the way I want to be perceived.
Additionally, if one wants to conform to social expectations (and things like
Asch's line test <http://en.wikipedia.org/wiki/Asch_conformity_experiments>
indicate conformity is common), this data might really be nothing more than
showing the degree to which people post photos affirming their conformity to
their social expectations (i.e. 'smart' ethnicities posting 'smart-looking'
photos) and be saying nothing at all about how people actually perceive ethnic
cues.

There are huge methodological concerns for this 'study'. Instead, the
revelation of this data might _actually_ be the insight that "pictures of
yourself at social events make you look more social." Taking much of anything
at all away from this data set would be rather unwise.

~~~
blahblahblah
In addition to the inter-rater reliability issue, there are also a lot of
unanswered questions about the statistical distributions involved. The results
are reported as population means, but without information about the underlying
distribution of the results it's unclear whether the mean is a meaningful
measure of central tendency for the data or how much overlap there was in the
distributions. How did the mean compare with the median and mode? What were
the standard deviations? Interquartile range? They're using a visual analog
scale for the ranking which is reasonable, but it seems that it's just been
assumed that the data can be treated as interval data for the analysis and the
validity of that assumption hasn't been established. If I were doing the
analysis I'd have been inclined to bin the data and report the results as odds
ratios with 95% confidence intervals (e.g. people wearing glasses are N + or -
95% CI times more likely to be regarded as "smart", where "smart" is defined
as a score >= some reasonable threshold on the "smartness" axis than those
without glasses).
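
A rough sketch of the binned odds-ratio approach described above, assuming a simple Wald interval on the log odds ratio. The scores, the threshold of 6, and the function names are all hypothetical; nothing here comes from the site's actual data.

```python
import math


def bin_counts(scores, threshold):
    """Split scores into (count >= threshold, count < threshold)."""
    above = sum(1 for s in scores if s >= threshold)
    return above, len(scores) - above


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a, b: "smart" / "not smart" counts in the group with the trait
    c, d: "smart" / "not smart" counts in the group without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical "smartness" scores (made up for illustration only).
glasses_scores = [7.0] * 60 + [4.0] * 40
no_glasses_scores = [7.0] * 45 + [4.0] * 55
threshold = 6.0  # assumed cut-off for counting a photo as "smart"

a, b = bin_counts(glasses_scores, threshold)
c, d = bin_counts(no_glasses_scores, threshold)
odds_ratio, lower, upper = odds_ratio_ci(a, b, c, d)
print(f"OR = {odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

If the interval includes 1, the binned data give no clear evidence that the trait changes the odds of being rated "smart".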

~~~
Homunculiheaded
"it's just been assumed that the data can be treated as interval data"

Which is especially problematic since user generated ratings are ordinal, not
interval data. Since the idea of an interval between points in ordinal data is
essentially meaningless the summary statistics you mentioned are not
meaningful either.

It's one thing for Amazon to come up with a mean user rating to give you a
sense of how people like something, but it's not a valid method of comparing
the data we have here, especially when the differences are so small.

------
pessimizer
The problem with this blog is so obvious that I'm surprised that I haven't seen
it in the comments yet (I probably missed it), but you can't use a random
selection of photographs for this if you want to expect gross ratings to mean
something. You would have to normalise each trait that you were comparing
against every other trait. Otherwise, when you were trying to isolate how
smart people judge black people to be, and black people were wearing caps a
quarter more often than the average person, you would think you were getting
interesting data for blacks when really you were getting interesting data for
caps.

If you didn't plan to use gross ratings like this blog did (I think), then I'm
pretty sure that you could do a post-normalization by analyzing the
frequencies in the sample and determining how much you'd expect each of the
traits to affect the rating for every other trait, then trying to determine
if the deviations from that were statistically significant in a universe
that contained only those traits.

Honestly - just take the original data and assign every trait a 5 rating, then
pick a random trait and pull that value up or down, then check and see what
the gross ratings now say about the other traits.
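
The experiment suggested above can be sketched in a few lines. This is a toy setup under stated assumptions: two made-up groups "A" and "B", a single extra trait (caps), everyone starting at a rating of 5, and only the cap pulling the rating down. The counts and the size of the cap effect are invented for illustration.

```python
from statistics import mean


def make_photos(group, n, n_caps):
    """Build n photo records for a group; the first n_caps wear a cap."""
    return [{"group": group, "cap": i < n_caps} for i in range(n)]


def rate(photo, cap_effect):
    """Every photo starts at 5; only wearing a cap moves the rating."""
    return 5.0 + (cap_effect if photo["cap"] else 0.0)


def gross_mean_by(photos, key, cap_effect):
    """Unadjusted ("gross") mean rating for each value of `key`."""
    buckets = {}
    for p in photos:
        buckets.setdefault(p[key], []).append(rate(p, cap_effect))
    return {k: mean(v) for k, v in buckets.items()}


# Hypothetical sample: group A wears caps twice as often as group B.
photos = make_photos("A", 100, 50) + make_photos("B", 100, 25)

by_group = gross_mean_by(photos, "group", cap_effect=-1.0)
print("gross means by group:", by_group)

# Comparing like with like (no caps only) removes the spurious gap.
no_cap = [p for p in photos if not p["cap"]]
print("no-cap means by group:", gross_mean_by(no_cap, "group", cap_effect=-1.0))
```

The gross means differ between the groups even though group membership has no effect on the rating in this model; the gap is entirely the cap effect leaking through the uneven cap frequencies.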

I apologize if the methodology was more complicated than it looks, and I hope
there's a link to the spreadsheet of the original distribution somewhere in
the blog that I missed, so someone could make sense of this data.

------
ori_b
I have no idea what the standard deviation on this is. Lots of the numbers
look close enough to be noise. Others in this thread have pointed out other
missing information that makes this a fairly poor survey.

~~~
andrewfelix
Agreed. What's the margin of error on these? Sample group of 1000 is also far
too small to make any kind of conclusion considering all the variables.

------
kvh
What a tragic waste of data and time. Not one mention of confidence intervals
(are _any_ of these differences statistically significant??), selection bias
(who was more likely to submit photos, and why did they choose a specific
photo??), or sampling errors (who rated the attributes, and how consistent
were they?). The OK Cupid blog posts are a great source for similar (but
statistically sound) studies.

------
larrys
"thousands of photos have been uploaded and judged by users since."

Who are the users that are _judging_? What is the breakdown of those users
(age,sex,location,education etc.)? What can possibly be inferred from this
without knowing that info?

~~~
bvi
It's entirely anonymous - I know nothing about the users who are uploading the
photos (apart from the email addresses they use when uploading a photo), and
nothing about users who judge the photos.

The entire premise of the site is for the user to be judged by strangers. Why
would age/sex/location/education of the person doing the judging matter?

~~~
oskarth
Because it's most likely skewed. Without knowing anything about where your
votes come from, I'm pretty sure the WEIRD demographic is over-represented
(Western, educated, industrialized, rich, democratic).

Not trying to be overly critical - I like the concept and execution, but these
things are really important in statistics.

------
Mz
I read just enough to decide it isn't really worth reading. I love the
articles OK Cupid does with hard statistical data backing up their inferences
about similar social stuff. This does not strike me as of that ilk.

I am disappointed. I was recently thinking about how people are judged based
on looks (and blogged about it) so was hoping for/looking forward to something
meatier.

------
nates
Your graphs appear to be very misleading. There is little to be learned from
the data. Learn some data analysis and learn how not to introduce bias via
graphs.

------
Danieru
I think they drove too far into the details considering their sample size.

It is an interesting study so I hope they update the post once they have been
in business longer.

------
simonster
These graphs could use some error bars.

~~~
slashcom
Indeed. While the results are very interesting, and some of them are clearly
very strong differences, I'd like to see some t-tests.

~~~
celer
When you mention the strong differences, are you accounting for the y axis
starting at varying locations? Once I realized this, the significance of the
results went down significantly.

------
scotty79
Why do people assume that parameters are independent?

If most black women who sent their photos are fat and people don't rate fat
women highly, then black women will be rated low not because of race but
because being a black woman and being a fat woman are correlated in the sample
data.

Owners of such sites have a large sample of some data, and they assume that large
equals representative, and they go on slicing their data by different
parameters, not controlling for anything, and making statements that are only
technically true with respect to their data but strongly misleading in many
ways.

------
lunchbox
The authors conflate "extroversion" and "social skills". For example, based on
his pic I'd rate this guy high on extroversion but low on social skills:

[http://madconfessionsofaman.files.wordpress.com/2011/05/douc...](http://madconfessionsofaman.files.wordpress.com/2011/05/douchebags2.jpg)

Similarly, being introverted doesn't mean you have low social skills.

~~~
paulhauggis
"Similarly, being introverted doesn't mean you have low social skills."

I don't know how you can say that.

The definition here: <http://dictionary.reference.com/browse/introvert?s=t>

implies low social skills.

Even when people discuss it here, they talk about wanting to be left alone,
not going to parties because they aren't interested in socializing, and
feeling "weak" after socializing for a short amount of time.

With all of this, I don't know how your social skills could ever be considered
high.

~~~
mmcdan
As an introvert, I can be a social butterfly at parties but need to leave
after an hour because it felt like work. As an extrovert, I can be a wallower
but want to stay out all night because the ambiance is energizing.

------
drewwwwww
the y axes vary a great deal. there's no information on the distribution of
ratings for each class. really difficult to tell if there's anything
meaningful or even interesting here at all.

------
tsumnia
Awesome analysis, though I agree the results look like they need error bars.

I know you mention a random sample of 1000 images, but what were your overall
metrics? Did you have a good data set across the board (i.e. as many Hispanic
females as Caucasian males)? What kind of advertising did you do as well?

Reason I ask is I've been working on trying to build a face-morpher based on
different criteria (make you look 80, fat, African) and these are some of the
questions I've got bouncing in my head about how to collect the data.

------
kami8845
Coral Cache to the rescue

<http://judg.me.nyud.net/blog/judgment-day/>

------
dmvaldman
Error bars!

Law of large numbers says that your error should scale like 1/sqrt(N) where N
is the sample size. In this case N = 1000, so 1/sqrt(N) ~ 3%

This measures 1 STD (68% of values lie in an interval of 3% of reported
value). To be on the safe side you should take 2 or 3 STDs for the error bars.
This already nullifies most of the results!
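
The arithmetic above, written out as a short sketch. It treats 1/sqrt(N) as a rough relative error, as the comment does; strictly, the standard error of a mean is sigma/sqrt(N), and the post does not report sigma, so this is only a back-of-the-envelope estimate. The function name is made up for illustration.

```python
import math


def relative_standard_error(n):
    """Rough 1/sqrt(N) scaling of the error for a sample of size n."""
    return 1.0 / math.sqrt(n)


n = 1000  # sample size quoted in the thread
se = relative_standard_error(n)

for k in (1, 2, 3):
    print(f"{k} standard deviation(s): +/- {k * se * 100:.1f}%")
```

Any per-category average rests on fewer than 1000 photos, so under the same rough rule its error bars would be wider still.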

------
willpearse
I'm sorry, but there's no 'unsexy data crunching' here - just a series of
ratios compared against one another. There is a whole body of statistical
literature about how to do anything of this kind, and they haven't done any of
it. I'd quite happily believe that none of these differences have any kind of
significance in the statistical sense (i.e., it's due to background
variation). But then again, I wasn't given any information to know whether
they've even looked. So I can't say...

------
MichailP
Prof. Dan Ariely mentioned Hot or Not website in one of his books. He used the
website to get his attractiveness score and other interesting data. The book
is a great read, and an analysis of how people _perceive_ you by your looks. As
for the website, I think that judg.me looks very promising as a source of
social data, which is otherwise very difficult to obtain.

~~~
bvi
That's very interesting - I had no idea. Do you know which book it is?

~~~
MichailP
It's "The Upside of Irrationality", Chapter 7. Great read.

~~~
bvi
Thanks, appreciate it! I'll try to get a copy.

------
B-Con
These people have a wrong definition of extroversion.

The actual site rates extroversion vs introversion, but the analysis here
mistakenly uses the term social scale, implying that extroversion and
sociability are interchangeable. They are correlated, but by absolutely no
means are they interchangeable. This analysis should have stuck with the
original vocabulary more consistently.

------
jcc80
For someone in his early 30s whose hair is starting to thin out the results
are interesting, though expected. I won't lose many social points but will
pick up a good amount of perceived smartness when the baldness battle is
finally lost. And to throw the "I'm a fun-loving extrovert" vibe out there for
special occasions, I just throw on some shades.

------
craigmoore
Long live Hot or Not! I don't want to like this sort of site. I'm ashamed that
I read the whole post (and found it interesting).

------
geraldfong
This is cool data, but it would be best if you could release numbers about the
distribution more than just the average, i.e. standard deviation, quartiles,
medians.

It is hard to determine significance from these graphs, especially as pflats
commented that the y-axes are skewed.

------
inDesperateZone
Okay, that's it, I'm cutting off my hair. Everyone seems to hate long hair on
men.

But I wonder how many long-haired men were in the sample. They are quite rare
and a few ponytailed grad students might lower the score.

------
Kiro
The comments here are depressing. Why can't you just enjoy it for what it is?
No-one believes this is pioneering research so you don't need to analyze it as
such.

~~~
bvi
That's okay. :) I'm learning quite a bit from the comments, and it's always
good to know what others think.

~~~
Mz
Upvoted. I hope you do turn this into a source for solid analysis of this
sort.

Best of luck.

------
blt
here is a 2D scatter plot of the smart/social ratings:
[https://docs.google.com/spreadsheet/oimg?key=0AmoarnvJ2W0ndF...](https://docs.google.com/spreadsheet/oimg?key=0AmoarnvJ2W0ndFh5aTF2Ti1Va2VKUnNDNTE0Vm1WX1E&oid=2&zx=t13973iv1gp2%22)
. It is still very misleading without axes going to zero, but at least you can
see the purported differences clearly.

------
sparkie
I'm confused as to what Perceived Smartness `div` Extroversion is meant to
represent. Or is it implying that extroverts are perceived as smart?

~~~
dpark
They phrased that poorly, but the graphs make it pretty clear. Every picture
was rated for "looks intelligent" and "looks extroverted". The two are
orthogonal in presentation to the voter and as presented in the graphs. The
blue bars are for the "smartness" ratings and the red bars are for the
"extroversion" ratings.

------
MarlonPro
Be asian and bald = smart! It ain't that simple!

~~~
antihero
And female?

------
antsam
I guess all future photos of me will have to be full body shots, with five
o'clock shadow, outside, smiling, and dyed gray hair.

------
AngryParsley
According to this, the smartest most sociable people should be smiling bald
indian women with glasses and 5 o'clock shadows.

------
roarktoohey
What would be interesting to see is the likely decline in intelligence as the
user base increases.

------
mc32
I had to stop reading after the following entered their glossary:

sistas, ladies.

Sorry but that is a turn-off to me.

~~~
sycr
I'll give you sistas. But ladies? Really? Why?

~~~
mc32
It's the context.

It's pretend deference and, to me, mildly sexist. He said men but not
'gentlemen'.

------
mahrain
So, please mark what data is statistically significant at a 95% level?

------
hoop
Oh good, just what the Internet needed: two-axis hot-or-not

------
superslug
Nothing says sociable like an iced grill ..

------
carguy1983
So in other words (for men) to get the most responses from dating sites, be
happy and white and wear sunglasses in an outdoor setting with a 5 o'clock
shadow and show your fit body?

In other words be a rugged, outdoorsy, all-american white guy.

Pretty sure this is only confirming what was already common knowledge.

