Instagram photos reveal predictive markers of depression (arxiv.org)
157 points by minimaxir on Aug 12, 2016 | 52 comments

Well, I came in skeptical about this, but the analysis seems pretty solid.

Tl;dr: fairly strong and reliable correlation between depression and posting images that are bluer, grayer and darker. This is predictive in that users can be identified as depressed before they are diagnosed. # of faces appearing in user pics was also indicative of depression: fewer faces per picture correlated with depression. # of comments was more weakly correlated with depression, and # of likes was negatively correlated. Mechanical Turk-tasked humans were also able to fairly accurately identify depressed users, but often identified different users than the machine.
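For readers wondering what "bluer, grayer and darker" might mean numerically, here is a minimal sketch that assumes the three properties map to mean hue, saturation and brightness (HSV) over an image's pixels. The function name `mean_hsv` and the sample pixel values are made up for illustration; this is not the authors' code.

```python
import colorsys

def mean_hsv(pixels):
    """Average hue, saturation and value over (r, g, b) pixels, channels 0-255.

    Note: hue is circular, so a plain arithmetic mean is only a rough summary.
    """
    h_sum = s_sum = v_sum = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        h_sum += h
        s_sum += s
        v_sum += v
    n = len(pixels)
    return h_sum / n, s_sum / n, v_sum / n

# Hypothetical pixels: a dim, washed-out bluish image vs. a bright orange one.
dull = [(60, 70, 90), (50, 60, 80)]
vivid = [(255, 180, 60), (250, 200, 80)]

dull_h, dull_s, dull_v = mean_hsv(dull)
vivid_h, vivid_s, vivid_v = mean_hsv(vivid)
print(f"dull:  hue={dull_h:.2f} sat={dull_s:.2f} val={dull_v:.2f}")
print(f"vivid: hue={vivid_h:.2f} sat={vivid_s:.2f} val={vivid_v:.2f}")
```

In this toy case the dull image has a higher hue (toward blue), lower saturation (grayer) and lower value (darker) than the vivid one.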

Statistical methods: Bayesian feature extraction with uninformed priors. 100-tree random forest for classification.

Some points of caution: depression is a broad, fairly fuzzy term. The authors acknowledge that this complicates matters. Some self-selection bias possible, as users had to provide permission to access Instagram streams and many users opted not to.

The study has examined 166 individuals. Some sub-group of these were depressed, and many depressed people could be identified based on photos.

This is then compared with GPs' rate of success at diagnosing depression. I fear that's a slightly misleading comparison, because the study works with a different pool of patients to be diagnosed than the typical GP. The Instagram methodology solves a different, maybe easier, problem.

Yeah, and the samples don't seem like they come from the same population.

You have to go to the Appendix, but it appears that 71 of the sample were depressed (which presumably means that they have answered yes to the depression question). It's not clear if they used the CES to identify depression.

The remaining participants were classified as healthy (N=95). These do not seem like balanced samples to me, at least (it would be nice if we knew what proportion of depressed vs not depressed agreed to share IG data).

Additionally, the appendices also mention that gender was not available for the depressed sample, which leads me to believe that they collected one sample of depressed participants and then another from the general population. This is a little shady (at least to me).

All that being said, I really like this paper. I think that the approach is novel, they use pretty good methods and it actually represents a contribution (if small) to human knowledge.

And I definitely went in with a prior against it. The sample size is far too small to support their inferences, but that's a less hard problem to fix :)

    70% of all depressed cases (n=37), with a relatively low number of 
    false alarms (n=23) and misses (n=17).
Not sure, but isn't n=40 "wrong" (23 false alarms plus 17 misses) more than the 37 found cases?

Not just that, but if we take 54 out of the 100 mentioned by that quote as being depressed (37 correctly classified and 17 misses), that means of the 46 non-depressed there are 23/46=50% false positives... I mean, maybe that's acceptable, but imagine being not depressed and getting targeted incorrectly with the same chances as a coin toss... seems like an undesirable aspect of that model to me.
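The arithmetic in this comment can be checked with a short sketch. It takes the three counts from the quote and, like the comment, assumes the counts are out of 100 observations; that total (and hence the true-negative count) is the commenter's assumption, not something stated in the quote.

```python
# Counts from the quoted passage.
true_pos = 37    # depressed, correctly flagged
false_pos = 23   # "false alarms"
false_neg = 17   # "misses"

# Assumption carried over from the comment: 100 observations in total.
total = 100
true_neg = total - true_pos - false_pos - false_neg

recall = true_pos / (true_pos + false_neg)                 # share of depressed cases found
false_pos_rate = false_pos / (false_pos + true_neg)        # share of healthy cases flagged
precision = true_pos / (true_pos + false_pos)              # share of flags that are correct

print(f"true negatives:      {true_neg}")
print(f"recall:              {recall:.3f}")
print(f"false positive rate: {false_pos_rate:.3f}")
print(f"precision:           {precision:.3f}")
```

Under that assumption, recall is 37/54 (roughly the 70% in the quote), the false positive rate is 23/46, and precision is 37/60.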

Edit: they mention an alternative model too that gives about 30% true positive rate for depression but is more accurate for non-depressed (probably since it mostly classifies people as non-depressed). The whole survey methodology and everything sounds suspect though; this study is just not something I'd put a lot of weight on.

Thought exactly the same.

>> "Some points of caution: depression is a broad, fairly fuzzy term."

I've only had a quick scan through this but it seems like they were simply asking people if they were depressed. Is that correct? If so it seems very broad and unreliable. Why wouldn't they just use the standard PHQ-9 questionnaire to diagnose?

I think the idea is to "learn" with human assistance and then use the "intelligence" to classify other individuals without having to ask them.

I interpret the issue k-mcgrady brings up as the inconsistency inherent to self-assessing issues like these.

Standardized assessment has its issues (it doesn't replace a professional), but at least it works with a single definition that doesn't rely on calibrating each participant's understanding of depression. People struggling with depression often fail to recognize or be willing to admit it, and the opposite can be true as well.

By testing against participant answers, this study is actually determining if your photos correlate with saying you're depressed, which is a different and less predictable thing.

It seems they worked with two groups. One consisted of individuals clinically diagnosed with depression.

This seems pretty intuitive to me. I don't need this paper to know that depressed people post darker pictures. But it is kind of interesting to see this 'formalized'

Also, I think detecting depression preemptively via social media is a terrifying idea.

Well, I was skeptical, wondering if maybe people posted darker / grayer images because of where they live; so seasonal affective disorder (SAD) was causing the depression, not that depression was swaying people's filter choices. However, this study seems to account for that (I only skimmed it quickly):

"We also checked metadata to assess whether an Instagram-provided filter was applied to alter the appearance of a photograph."

"A closer look at filter usage in depressed versus healthy participants provided additional texture. Instagram filters were used differently by depressed and healthy individuals. In particular, depressed participants were less likely than healthy participants to use any filters at all. When depressed participants did employ filters, they most disproportionately favored the “Inkwell” filter, which converts color photographs to black-and-white images. Conversely, healthy participants most disproportionately favored the Valencia filter, which lightens the tint of photos."

> Also, I think detecting depression preemptively via social media is a terrifying idea.

Yeah, they also called their model "Pre-diagnosis". :)

That doesn't account for editing outside of Instagram, though, which many people do exclusively.

You certainly do need this paper because saying "I have a hunch" without supporting it with anything does not further discourse and it definitely is not actionable.

It's more than just 'I have a hunch' - I'm sure there are studies that show depressed people prefer darker colors. For example I found this after a quick search for "colors and depression"[0] and there are many more like it. Why wouldn't that carry over to Instagram too?

[0] http://www.academia.edu/3880952/RELATIONSHIP_BETWEEN_COLOR_A...

You need to prove that it carries over to social media in a meaningful manner.


You support "I don't need this study" with prior studies.

It's very very important for even intuitive things to be backed by research.

Very true, though on the other hand, we might have finally discovered an actual useful side of this "social media".

When I said I was skeptical, it wasn't about the concept or theses per se. It was that the study would be conducted well and seem solid, versus being attention-grabbing... which sadly all too much "science" lately is.

Yeah, all of those buts and ifs make the analysis not statistically significant. It's not a random sample, and asking people whether they are depressed is not meaningful. Depression is a disease that is quite distinct from simply being sad or in a melancholic mood.

This may not be a good thing. Always remember that social media is used by industry as a means of carrying out background checks. What's this? One of our job applicants has a poor mental health rating? Let's pick the other one instead.

Anonymity for free expression is the only solution to prevent any of these social 'big data' problems. Tying real ID to human data is not just bad for people. It is bad for business. Ad tech relies on a user base that doesn't filter themselves or find means to circumvent tracking.

De-anonymization is relatively easy. The only real solution is for users to not use online services. It's sad, but it's just how it is now.

This is why ephemeral identities and vanishing content shine; for example, on 4chan your post is only online for a few weeks maximum (as it's stored in a board's particular archive after it gets pushed off the end of the board). Only 4chan knows what your IP is. Archive websites collect posts from 4chan but can't see your IP.

So eventually, if you make a post with the name "Anonymous" on 4chan, the only record of your post will be on an archive website where they don't have your IP (so they can't semi-uniquely identify you) and your name is "Anonymous". That's virtually untraceable except with prior knowledge of posting habits (time of day and what board) and text analysis (which has been shown to be quite effective in revealing authors behind pseudonyms).

When someone only has the content of your post, where you posted it, and at what time (the minimum amount of information 4chan lets you submit), we can express ourselves but quite effectively avoid employer snooping or otherwise.

...which is the right of the employer. They are free to discriminate based on social media contents, shirt color, or any other non-protected criteria they choose.

Too many misspellings or rageful comments on social media would be a clear NO HIRE signal for me.

Why shouldn't they be?

Depression is a disability and you're not allowed to discriminate against people because they're depressed.



I thought you are allowed to discriminate based on disability, if it impedes work function.

Any depression that gets to the level of being qualified as a disability seems severe enough to impede work.

IIRC, you are allowed to discriminate if it impedes your ability to work. But you are required to provide them a chance (aka equal opportunity).

What about the knock-on effects of everyone doing that: a society where it's common knowledge that the 'wrong kind' of public speech can make you unemployable?

Isn't that kind of totalitarian?

>a society where it's common knowledge that the 'wrong kind' of public speech can make you unemployable?

This has always existed, and you're more okay with it than you think you are: would you hire someone who says "nigger" in public?

The above is a prime example of the 'wrong kind' of public speech that makes you unemployable, as it should. Free speech doesn't mean "speech without consequence".

> would you hire someone who says "nigger" in public?

Yes. Obama said it in public, after all.


In the US the n-word is part of common speech, especially for young blacks. If you refused to hire anyone who said it in public, that would be a form of racist discrimination against black people.

This all relates to the racial double standard over permissible speech, of course.

> Free speech doesn't mean "speech without consequence".

In that case it isn't really free speech, any more than you've got free speech if I'm holding a gun to your head and threatening to shoot you if you say something I disagree with.

I just said "nigger" technically, but as you well know, there's a difference between using the word in the first-degree and referring to the word to make a point about its semantics or cultural context. "Fuck you, nigger" is not the same as "The word 'nigger' has a lot of baggage", and that is still not the same as "that's my nigga over there". You know this, and I know that you know this.

Further, if you don't understand that "free speech" refers to governmental policy rather than social norms, and if you don't understand that free speech comes with limitations (shouting "fire" in a crowded place, blah blah blah), then there is no hope of a thoughtful discussion.

You are confusing free speech with the right to free speech. The latter, in the US, is a Constitutionally guaranteed right. The former is a description of your state of freedom, and the law saying you have the right to free speech does not necessarily mean that you actually have it. The government is not the only entity that has an effect on free speech.

Not all rights that you have on paper are enforced. A right to life is not the same as being alive. A right to free speech is not the same as actually having free speech. A society where saying unpopular things will get you attacked by a mob, instantly fired with loss of healthcare/housing, etc does not have free speech, even if the letter of the law says this right exists.

> shouting "fire" in a crowded place, blah blah blah

I've said this on the internet many, many times: That quote comes from a USSC case where the right to protest against the draft in the First World War was taken away, it was not upheld, and frankly, it's fascist.


I think we're already there - stand on a corner of your street and yell ALL BUSINESS IS EVIL!! and you're pretty much unemployable in nearby businesses. Do it publicly and under your own name on the internet and you're unemployable globally by businesses who bother to check. And why would they hire you? You consider them evil after all.

But how is this fair or moral?

If someone lived in the USSR, wouldn't it be right for them to protest against communism, even if they had to participate in the communist system to survive?

Would you tell them they should starve if they were against communism?

If you think all businesses are bad and go work in a grocery store and complain about "The Man" to customers, then you'll be fired. It seems reasonable to not bother with them in the first place. A better "communism" analogy would be: Someone who works for the USSR but protests communism. It's not good for business if your own employees are bad mouthing your company.

It isn't totalitarian.

There are countries where there may not be legally enforced death penalties for Islamic blasphemy, but defaming the prophet Muhammed will surely get you murdered by an enraged mob. I don't think this distinction makes a lot of difference to the blasphemers.

A society so averse to dissenting speech that it bears comparison with totalitarianism doesn't have to be state-enforced. The internet seems to do a good job of amplifying outrage and enabling witch hunts.

It seems to me they estimated prediction accuracy on the same data that was used to fit the models, which overestimates accuracy. It would be better to assess accuracy on data that was not used to fit the models, for example via cross-validation.
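To illustrate the general point (not the paper's actual procedure), here is a minimal sketch on synthetic data. It assumes a 1-nearest-neighbour classifier as a stand-in for the paper's random forest, and 166 made-up "participants" whose label is unrelated to their single feature, so any apparent skill is pure overfitting. All names here are hypothetical.

```python
import random

def nearest_label(train, x):
    """Label of the training point whose feature is closest to x (1-NN)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, test):
    """Fraction of test points whose label the 1-NN rule gets right."""
    hits = sum(nearest_label(train, x) == y for x, y in test)
    return hits / len(test)

def k_fold_accuracy(data, k=5):
    """Mean accuracy over k folds, each scored by a model that never saw it."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(accuracy(train, held_out))
    return sum(scores) / k

rng = random.Random(0)
# Synthetic data: one random feature and a coin-flip label per "participant".
data = [(rng.random(), rng.randint(0, 1)) for _ in range(166)]

train_acc = accuracy(data, data)   # scored on the data used for fitting
cv_acc = k_fold_accuracy(data)     # scored on held-out folds
print(f"accuracy on training data: {train_acc:.2f}")
print(f"5-fold cross-validated:    {cv_acc:.2f}")
```

Scored on its own training data, 1-NN looks perfect because every point is its own nearest neighbour; the cross-validated figure should land near chance, which is the honest answer for labels that carry no signal.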

Wow, this is cool. I wonder what the end game of machine learning is, given it's inherently successful only with a noisy data-set?

Obvious next step: automatic police notification of potential suicide in progress whenever a depressed photo is posted.

Palantir will gladly take that contract.

Wow, it's scary to think that someone could know that about someone else simply from checking out their Instagram feed.

I'm surprised that color factors had greater predictive power than presence of human faces.

I'm an amateur photographer and a lot of pictures I take don't have too many human faces in them just as an aesthetic thing, and I'm pretty sure I'm not depressed when posting them :)

A note -- if you're linking to arXiv, it's better to link to the abstract (https://arxiv.org/abs/1608.03282) rather than directly to the PDF. From the abstract, one can easily click through to the PDF; not so the reverse. And the abstract allows one to do things like see different versions of the paper, search for other things by the same authors, etc. Thank you!

You seem like you might be knowledgeable here, so I'll ask this one here.

Is arXiv considered a peer-reviewed journal, or is it Cornell's submission database?

Their site really doesn't indicate one way or the other, so it makes me think it's just a database.

ArXiv is certainly not a peer-reviewed journal. There is some basic review (https://arxiv.org/help/moderation) and an endorsement system (https://arxiv.org/help/endorsement); other than that, you can freely submit content there.

Cheers for the clarification :D

Thanks, we updated the link.

Considering that depression is mostly defined as lack of social interaction, picture streams that depict fewer social interactions correlate well with photographer depression. Much Science.
