Well, I came in skeptical about this, but the analysis seems pretty solid.
Tl;dr: fairly strong and reliable correlation between depression and posting images that are bluer, grayer and darker. This is predictive in that users can be identified as depressed before they are diagnosed. # of faces appearing in user pics was also indicative of depression: fewer faces per picture correlated with depression. # of comments more weakly correlated with depression, and # of likes was negatively correlated. Mechanical Turk-tasked humans were also able to fairly accurately identify depressed users, but often identified different users than the machine.
Statistical methods: Bayesian feature extraction with uninformed priors. 100-tree random forest for classification.
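For concreteness, here's a minimal, hypothetical sketch of that kind of pipeline: mean per-photo color features (hue/saturation/brightness) fed to a 100-tree random forest. The features, data, and setup are invented for illustration; this is not the paper's actual code or feature set.

```python
# Hypothetical sketch: per-photo color features + a 100-tree random forest.
# All data here is simulated; only the general shape mirrors the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def color_features(pixels_hsv):
    """Mean hue, saturation, and value (brightness) of one photo."""
    return pixels_hsv.reshape(-1, 3).mean(axis=0)

def fake_photo(depressed):
    # Simulated stream: "depressed" photos skew bluer, grayer, darker.
    base = np.array([0.6, 0.3, 0.3]) if depressed else np.array([0.4, 0.6, 0.7])
    return np.clip(base + rng.normal(0, 0.1, (32, 32, 3)), 0, 1)

labels = [0, 1] * 100
X = np.array([color_features(fake_photo(d)) for d in labels])
y = np.array(labels)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```

On data this cleanly separable the forest fits easily; the real question (raised further down the thread) is how it performs on held-out users.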
Some points of caution: depression is a broad, fairly fuzzy term. The authors acknowledge that this complicates matters. Some self-selection bias possible, as users had to provide permission to access Instagram streams and many users opted not to.
The study examined 166 individuals. Some subgroup of these was depressed, and many of the depressed could be identified based on their photos.
This is then compared with GP's rate of success at diagnosing depression. I fear that's a slightly misleading comparison, because the study works with a different pool of patients to be diagnosed than the typical GP. The Instagram methodology solves a different, maybe easier, problem.
Yeah, and the samples don't seem like they come from the same population.
You have to go to the Appendix, but it appears that 71 of the sample were depressed (which presumably means they answered yes to the depression question). It's not clear if they used the CES to identify depression.
The remaining participants were classified as healthy (N=95). These do not seem like balanced samples to me, at least (it would be nice if we knew what proportion of depressed vs not depressed agreed to share IG data).
Additionally, the appendices also mention that gender was not available for the depressed sample, which leads me to believe that they collected one sample of depressed participants and then another from the general population. This is a little shady (at least to me).
All that being said, I really like this paper. I think that the approach is novel, they use pretty good methods and it actually represents a contribution (if small) to human knowledge.
And I definitely went in with a prior against it.
The sample size is far too small to support their inferences, but that's an easier problem to fix :)
Not just that, but if we take 54 of the 100 mentioned in that quote as depressed (37 correctly classified plus 17 misses), then 23 of the remaining 46 non-depressed were false positives: 23/46 = 50%. Maybe that's acceptable, but imagine being not depressed and getting incorrectly flagged with the same odds as a coin toss... that seems like an undesirable property of the model to me.
Edit: they also mention an alternative model that gives about a 30% true positive rate for depression but is more accurate on the non-depressed (probably because it mostly classifies people as non-depressed). The whole survey methodology sounds suspect though; this study is just not something I'd put a lot of weight on.
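Spelling out the arithmetic from the parent comment (counts as quoted there, not re-verified against the paper):

```python
# Confusion-matrix arithmetic, using the counts cited in the comment above.
total = 100
true_pos = 37                         # depressed, correctly flagged
false_neg = 17                        # depressed, missed
depressed = true_pos + false_neg      # 54
healthy = total - depressed           # 46
false_pos = 23                        # healthy, but flagged as depressed
false_pos_rate = false_pos / healthy  # 23/46 = 0.5 -- a coin toss
```

A 50% false positive rate means a healthy user flagged by this model carries essentially no information beyond the base rate.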
>> "Some points of caution: depression is a broad, fairly fuzzy term."
I've only had a quick scan through this but it seems like they were simply asking people if they were depressed. Is that correct? If so it seems very broad and unreliable. Why wouldn't they just use the standard PHQ-9 questionnaire to diagnose?
I interpret the issue k-mcgrady brings up as the inconsistency inherent to self-assessing issues like these.
Standardized assessment has its issues (it doesn't replace a professional), but at least it works with a single definition that doesn't rely on calibrating each participant's understanding of depression. People struggling with depression often fail to recognize or be willing to admit it, and the opposite can be true as well.
By testing against participant answers, this study is actually determining if your photos correlate with saying you're depressed, which is a different and less predictable thing.
This seems pretty intuitive to me. I don't need this paper to know that depressed people post darker pictures, but it is kind of interesting to see it 'formalized'.
Also, I think detecting depression preemptively via social media is a terrifying idea.
Well, I was skeptical, wondering if maybe people posted darker/grayer images because of where they live; that is, seasonal affective disorder (SAD) was causing the depression, not depression swaying people's filter choices. However, the study seems to address that possibility (I only skimmed it quickly):
"We also checked metadata to assess whether an Instagram provided filter was applied to alter the appearance of a photograph."
[...]
"A closer look at filter usage in depressed versus healthy participants provided additional texture. Instagram filters were used differently by depressed and healthy individuals. In particular, depressed participants were less likely than healthy participants to use any filters at all. When depressed participants did employ filters, they most disproportionately favored the “Inkwell” filter, which converts color photographs to black and white images. Conversely, healthy participants most disproportionately favored the Valencia filter, which lightens the tint of photos."
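One simple way to make "disproportionately favored" concrete is to compare each filter's share of a group's filtered posts. The counts below are invented for illustration; the paper does not publish these numbers in this form.

```python
# Toy comparison of filter preferences between two groups (made-up counts).
from collections import Counter

depressed_filters = Counter({"Inkwell": 30, "Valencia": 5, "none": 200})
healthy_filters   = Counter({"Inkwell": 4, "Valencia": 60, "none": 120})

def share(counts, name):
    """A filter's share of the group's posts that used any filter at all."""
    filtered = sum(v for k, v in counts.items() if k != "none")
    return counts[name] / filtered
```

With numbers like these, Inkwell dominates the depressed group's (small) set of filtered posts, while Valencia dominates the healthy group's, matching the pattern the quote describes.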
> Also, I think detecting depression preemptively via social media is a terrifying idea.
Yeah, they also called their model "Pre-diagnosis". :)
You certainly do need this paper because saying "I have a hunch" without supporting it with anything does not further discourse and it definitely is not actionable.
It's more than just 'I have a hunch' - I'm sure there are studies that show depressed people prefer darker colors. For example I found this after a quick search for "colors and depression"[0] and there are many more like it. Why wouldn't that carry over to Instagram too?
When I said I was skeptical, it wasn't about the concept or theses per se. It was that the study would be conducted well and seem solid, versus being attention-grabbing... which sadly all too much "science" lately is.
Yeah, all of those buts and ifs make the analysis statistically unsound. It's not a random sample, and asking people whether they are depressed is not meaningful; depression is a disease that is quite distinct from simply being sad or in a melancholic mood.
This may not be a good thing. Always remember that social media is used by industry as a means of carrying out background checks. What's this? One of our job applicants has a poor mental health rating? Let's pick the other one instead.
Anonymity for free expression is the only solution to prevent any of these social 'big data' problems. Tying real ID to human data is not just bad for people. It is bad for business. Ad tech relies on a user base that doesn't filter themselves or find means to circumvent tracking.
This is why ephemeral identities and vanishing content shines; for example, on 4chan your post is only online for a few weeks maximum (as it's stored in a board's particular archive after it gets pushed off the end of the board). Only 4chan knows what your IP is. Archive websites collect posts from 4chan but can't see your IP.
So eventually, if you make a post with the name "Anonymous" on 4chan, the only record of your post will be on an archive website where they don't have your IP (so they can't semi-uniquely identify you) and your name is "Anonymous". That's virtually untraceable except with prior knowledge of posting habits (time of day and what board) and text analysis (which has been shown to be quite effective in revealing authors behind pseudonyms).
When someone only has the content of your post, where you posted it, and at what time (the minimum amount of information 4chan lets you submit), we can express ourselves but quite effectively avoid employer snooping or otherwise.
...which is the right of the employer. They are free to discriminate based on social media contents, shirt color, or any other non-protected criteria they choose.
Too many misspellings or rageful comments on social media would be a clear NO HIRE signal for me.
What about the knock-on effects of everyone doing that: a society where it's common knowledge that the 'wrong kind' of public speech can make you unemployable?
>a society where it's common knowledge that the 'wrong kind' of public speech can make you unemployable?
This has always existed, and you're more okay with it than you think you are: would you hire someone who says "nigger" in public?
The above is a prime example of the 'wrong kind' of public speech that makes you unemployable, as it should. Free speech doesn't mean "speech without consequence".
In the US the n-word is part of common speech, especially for young blacks. If you refused to hire anyone who said it in public, that would be a form of racist discrimination against black people.
This all relates to the racial double standard over permissible speech, of course.
> Free speech doesn't mean "speech without consequence".
In that case it isn't really free speech, any more than you've got free speech if I'm holding a gun to your head and threatening to shoot you if you say something I disagree with.
I just said "nigger" technically, but as you well know, there's a difference between using the word in the first-degree and referring to the word to make a point about its semantics or cultural context. "Fuck you, nigger" is not the same as "The word 'nigger' has a lot of baggage", and that is still not the same as "that's my nigga over there". You know this, and I know that you know this.
Further, if you don't understand that "free speech" refers to governmental policy rather than social norms, and if you don't understand that free speech comes with limitations (shouting "fire" in a crowded place, blah blah blah), then there is no hope of a thoughtful discussion.
You are confusing free speech with the right to free speech. The latter, in the US, is a Constitutionally guaranteed right. The former is a description of your state of freedom, and the law saying you have the right to free speech does not necessarily mean that you actually have it. The government is not the only entity that has an effect on free speech.
Not all rights that you have on paper are enforced. A right to life is not the same as being alive. A right to free speech is not the same as actually having free speech. A society where saying unpopular things will get you attacked by a mob, instantly fired with loss of healthcare/housing, etc does not have free speech, even if the letter of the law says this right exists.
> shouting "fire" in a crowded place, blah blah blah
I've said this on the internet many, many times: that quote comes from a USSC case in which the right to protest against the draft in the First World War was not upheld, and frankly, it's fascist.
I think we're already there - stand on a corner of your street and yell ALL BUSINESS IS EVIL!! and you're pretty much unemployable in nearby businesses. Do it publicly and under your own name on the internet and you're unemployable globally by businesses who bother to check. And why would they hire you? You consider them evil after all.
If someone lived in the USSR, wouldn't it be right for them to protest against communism, even if they had to participate in the communist system to survive?
Would you tell them they should starve if they were against communism?
If you think all businesses are bad and go work in a grocery store and complain about "The Man" to customers, then you'll be fired. It seems reasonable to not bother with them in the first place. A better "communism" analogy would be: Someone who works for the USSR but protests communism. It's not good for business if your own employees are bad mouthing your company.
There are countries where there may not be legally enforced death penalties for Islamic blasphemy, but defaming the prophet Muhammed will surely get you murdered by an enraged mob. I don't think this distinction makes a lot of difference to the blasphemers.
A society so averse to dissenting speech that it bears comparison with totalitarianism doesn't have to be state-enforced. The internet seems to do a good job of amplifying outrage and enabling witch hunts.
Seems to me they used the same data to estimate prediction accuracy as was used to fit the models. This overestimates accuracy; it would be better to assess accuracy on held-out data that was not used for fitting. This is known as cross-validation.
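A quick sketch of why in-sample accuracy is optimistic, using synthetic data where the labels are pure noise (so honest accuracy should hover near 50%). The sklearn functions are real; the data and setup are made up.

```python
# In-sample accuracy vs. 5-fold cross-validation on pure noise:
# the forest memorizes the training set, but CV exposes that there
# is nothing real to learn.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(166, 20))     # 166 "participants", noise features
y = rng.integers(0, 2, size=166)   # labels unrelated to the features

clf = RandomForestClassifier(n_estimators=100, random_state=0)
in_sample = clf.fit(X, y).score(X, y)         # optimistic, near 1.0
cv = cross_val_score(clf, X, y, cv=5).mean()  # honest, near 0.5
```

With only 166 participants (the study's sample size), the gap between those two numbers can be enormous, which is exactly the concern raised above.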
I'm an amateur photographer and a lot of pictures I take don't have too many human faces in them just as an aesthetic thing, and I'm pretty sure I'm not depressed when posting them :)
A note -- if you're linking to arXiv, it's better to link to the abstract (https://arxiv.org/abs/1608.03282) rather than directly to the PDF. From the abstract, one can easily click through to the PDF; not so the reverse. And the abstract allows one to do things like see different versions of the paper, search for other things by the same authors, etc. Thank you!
Considering that depression is mostly defined as a lack of social interaction, picture streams that depict fewer social interactions correlate well with photographer depression. Much Science.