
U.S. military to analyze 350B social media posts to understand popular movements - pseudolus
https://www.bloomberg.com/news/articles/2019-05-25/u-s-military-to-trawl-through-350-billion-social-media-messages
======
Anon84
This seems to be much ado about nothing. This type of project
has been going on for over a decade now (just search for Twitter or Facebook
in Google Scholar). Even the government has been involved in financing some of
this research (For example:
[https://www.iarpa.gov/images/press/osi/24_open_source_indica...](https://www.iarpa.gov/images/press/osi/24_open_source_indicators.pdf)).

Many interesting things can be done with this kind of data (and it's what
originally got me interested in data science
[https://www.springer.com/us/book/9783319140100](https://www.springer.com/us/book/9783319140100)).
That being said, the specific solicitation
([https://www.neco.navy.mil/synopsis/detail.aspx?id=537809](https://www.neco.navy.mil/synopsis/detail.aspx?id=537809))
doesn't seem particularly exciting or novel. Just an overgrown grad school
project to keep the students busy. With some Python experience and patience to
collect the data, pretty much anyone can do something similar.

You can find the slides for one of my old (PyData 2015) tutorials on social
media mining here:
[https://www.slideshare.net/bgoncalves/mining-georeferenced-d...](https://www.slideshare.net/bgoncalves/mining-georeferenced-data)

~~~
coldtea
> _This seems to be much ado about nothing. This type of project has been
> going on for over a decade now_

Yeah, the Overton window has now moved, fascism is now acceptable. /s

~~~
jcims
Is there something recent that brought ‘Overton window’ into common use? I
honestly believe the first time I’ve ever seen the term was within the past
few months and now I see it regularly.

------
Gpetrium
The reality is that private entities and external actors are already using
data to figure out better offensive and defensive maneuvers. If the entities
of your nation do not take similar steps, they are likely to be ill prepared
to "serve & protect" you and your interests, which may include a fair and
functional democracy.

Some may ask, "Why is the military involved?" The answer is that today's
militaries have realized that foreign actors will weaponize technology (e.g.
the internet, social media) to serve their goals. To ensure that the goals of
your nation are protected, the military has moved more and more into
cyberspace.

Here are a few benefits that can come from such an endeavor:

* Relief efforts - How do people react on social media before, during, and after a disaster? How can social media be used to connect with those in need? How can X entity use social media to advise groups in a risky area (e.g. landslide) to reduce damage and risk to life?

* Trolls - Are there cues that can be used to determine when someone is a troll? Can that information be used to help society become more resilient in the future?

* Training - Can this research help dozens/hundreds of personnel gain valuable know-how and capabilities in the cyber arena? Will that knowledge translate into a more capable entity?

Another question to ask is whether X entity follows your ideological view and
whether there is enough oversight. If the answer to either is no, then as
part of a democracy, you should raise your concern with your representatives &
the people around you.

------
AndrewKemendo
This is a scary headline for a run of the mill research project at Naval
Postgraduate School. Here's the solicitation:
[https://www.neco.navy.mil/synopsis/detail.aspx?id=537809](https://www.neco.navy.mil/synopsis/detail.aspx?id=537809)

> Our research aims to provide enhanced understanding of fundamental social
> dynamics, to model the evolution of linguistic communities, and emerging
> modes of collective expression, over time and across countries ...

> As a central requirement for this research, we seek to acquire a large-scale
> global historical archive of social media data, providing the full text of
> all public social media posts, across all countries and languages covered by
> the social media platform. We aim to use this research to advance knowledge
> through scientific publications. This data will also be used for pedagogical
> purposes in the classroom, giving students new opportunities for thesis
> research and the development of “big data” analytic skills.

How can the Department of Defense do a better job at interfacing with the
general public to better explain the intentions, motives, etc. behind the work
that we do?

~~~
soulofmischief
They could start with being honest that those aren't the DoD's _only_ motive
behind wanting such a corpus and analytical model.

There are many positive and negative things that could result from something
like this, and it isn't the fact that the Navy is doing it that scares people.
It's the fact that the US government is doing it.

~~~
AndrewKemendo
_They could start with being honest that those aren't the DoD's only motive
behind wanting such a corpus and analytical model_

That would be akin to asking [insert company] if it will possibly ever use
data collected from one thing on a different project at some later date. That
is a complete unknown, so it wouldn't even be appropriate to put into a
proposal, as they have specific formats.

NPS is a graduate study program, with a research component and a long history of
academic publishing. Whether that data could possibly, someday be used beyond
that is completely out of scope for this solicitation. Further, if other
groups in the military wanted similar data, they would likely not go to NPS
for it, for a variety of reasons.

I think at the end of the day though, if someone has an assumption that
everything the DoD does or messages to the world is somehow subterfuge,
there's no amount of discussion that would reverse that feeling.

~~~
soulofmischief
As I just mentioned in a sister comment, I know this kind of research is
inevitable and like I said there are positive aspects.

Similar undertakings have no doubt already been made in private inside other
organizations.

I'm just explaining why people feel uneasy about this kind of thing. It's
because of who's boss.

------
cjslep
I remember being a uni student at a career fair nearly a decade ago and the
folks at the CIA booth were wanting to hire to "predict the next Arab Spring"
(which at that time was fully underway). When I asked a little bit, they
didn't divulge more and weren't interested in me (my degree is not comp sci).

I've just been assuming this sort of work has been going on since then.

~~~
dictum
> predict the next Arab Spring

The idea of the military arm of the U.S. government predicting the next Arab
Spring reminds me of that classic mind-bender of a quote, which I think is
often misunderstood as a defense of lying to the public, when it is about
something deeper:

> The aide said that guys like me were "in what we call the reality-based
> community," which he defined as people who "believe that solutions emerge
> from your judicious study of discernible reality." I nodded and murmured
> something about enlightenment principles and empiricism. He cut me off.
> "That's not the way the world really works anymore," he continued. "We're an
> empire now, and when we act, we create our own reality. And while you're
> studying that reality -- judiciously, as you will -- we'll act again,
> creating other new realities, which you can study too, and that's how things
> will sort out. We're history's actors . . . and you, all of you, will be
> left to just study what we do."

[https://www.nytimes.com/2004/10/17/magazine/faith-certainty-...](https://www.nytimes.com/2004/10/17/magazine/faith-certainty-and-the-presidency-of-george-w-bush.html)

~~~
GreedCtrl
Have you heard of the free energy principle? The two ideas seem related.

From Wikipedia:

> The free energy principle is that systems—those that are defined by their
> enclosure in a Markov blanket—try to minimize the difference between their
> model of the world and their sense and associated perception. This
> difference can be described as "surprise" and is minimized by continuous
> correction of the world model of the system. As such, the principle is based
> on the Bayesian idea of the brain as an “inference engine.” [Karl] Friston
> added a second way to minimization: action. By actively changing the world
> into the expected state, systems can also minimize the free energy of the
> system. Friston assumes this to be the principle of all biological reaction.

~~~
sitkack
I don’t have to change my behavior if I change the world instead.

------
lettergram
If they are using public posts, this seems fine. It’s more of a general study
and could have a real impact on our understanding of how stories spread (at
least via this medium, today).

~~~
tty2300
What is not fine is the fact that social media companies have spent years
using dark patterns and tricks to get us to publish more and more of our info
publicly. It used to be common knowledge that you never post any of your real
life details on the internet but now the most popular websites will ban you
for not using your real name.

~~~
austincheney
I remember a decade ago when
[http://pleaserobme.com/](http://pleaserobme.com/) came out and how my
coworkers were shocked. I honestly thought over-sharing was a common sense bad
idea, but it was not as common sense as I had thought. They did not require
any dark patterns to willfully give up everything to social media.

------
jmartrican
I just assumed this was going on already. Maybe because they are working with
a professor on this effort it had to be revealed. I wonder if behind the
scenes the Navy or the military had already done this multiple times and this
is just another attempt.

~~~
mlb_hn
US Government's been great at collecting tons of data. Analyzing it... not so
much.

~~~
squarefoot
"Analyzing it... not so much."

Or maybe they have learned a few things by reading Sun Tzu.

~~~
mlb_hn
I'd guess there have been significant advances in network theory and NLP since
Sun Tzu's time but I'm not an expert.

~~~
squarefoot
Yes, my point is that the military usually don't show what they can do and
more importantly what they can't do. Should they have the best algorithm in
the world to match anything with anything in that data, there would be many
reasons not to reveal that.

~~~
mlb_hn
I'm not sure there's been any evidence that the military has had success
utilizing current computational approaches to data. E.g. the DCGS project was
a failure and I don't think that project tried anything beyond what you'd get
out of Microsoft Excel
([https://nypost.com/2014/10/27/army-spent-5b-on-failed-techno...](https://nypost.com/2014/10/27/army-spent-5b-on-failed-technology-created-by-vets/)).
DoD plans for projects that last for decades, and I'd guess the algorithms
people are building for commercial purposes today are developing way faster
than DoD's procurement can keep up with.

------
chiefalchemist
> "...and individual users won’t be identified..."

Perhaps not as individuals, but with enough personal metadata your identity
can be reverse engineered.

That aside, anyone who believes there is no hidden agenda here is naive. Any
understanding will have a second benefit for doing the opposite (i.e.,
preventing movements).

~~~
jolfdb
That's a boring complaint. Any tech can be used for good or evil. When is a
movement a movement, and not a guerilla terrorist cell network?

~~~
SturgeonsLaw
> When is a movement a movement, and not a guerilla terrorist cell network?

That depends on who wins the war, and thus writes the history books. From the
point of view of the British, the American Revolution was an act of terrorism
and treason.

Attempts by the military to understand (and, one would assume, prevent)
popular uprisings would nip a second American Revolution in the bud, for
better or worse.

The linked article refers to an attempt to entrench the status quo, and not
everyone is pleased about that.

------
mtgx
Why exactly is the military doing this? What business is it of the military
what "popular movements" do inside the country?

~~~
itronitron
The US military engages in a lot of aid/relief work; this may help them better
understand where that work will be best received and most beneficial.

~~~
rjf72
Is your comment meant to be sarcastic? If not, can you elaborate on why - of
all possible options for the department whose primary purview _in practice_
(in contemporary times) is the destabilization and/or destruction of other
nations, you think this would be the most probable explanation for a mass
scale analysis of 350 billion social media posts?

If that message comes off abrasive it's only because I am genuinely confused,
but would like to know where you're coming from. There's always a risk of us
living in our little bubbles and so I make every effort to try to see things
through the eyes of others, especially when I simply cannot otherwise
empathize with their views on any level.

~~~
itronitron
I wasn't being sarcastic, they care a lot about situational awareness so have
an interest in 'checking the weather' local to their operations, whatever
those may be.

------
techrich
If you post it publicly, then it's fair game what it can be used for.

~~~
luckylion
Should you expect & accept that everything you say in public will be recorded
and monitored? Is it "fair game" to do facial recognition on everybody that
shows their face "in public"?

~~~
soulofmischief
Walking around the park and speaking privately with a friend is not the same
as making a permanent, public post on Twitter.

~~~
luckylion
You're "tweeting" to your followers and random people that happen upon your
feed can read it too, just as some random person sitting on a bench in the
park would hear you speak. I don't believe that most people (normal people,
that is, not professional pundits that use Twitter) are fully aware that they
are "completely public" most of the time. They'll likely "know" if you ask
them, but they'll use Twitter as if it's an extended group chat a lot of the
time: they don't expect that the world is watching/can watch.

There's probably a cultural component to which side of this one comes down on;
how much privacy you can expect (and be granted by law) in public spaces
differs between countries.

~~~
soulofmischief
When I'm done speaking, I'm done speaking. The moment has passed. I know the
information is now safe unless my friends share it.

I don't have to worry about someone seeing a crude joke made 10 years ago and
starting a massive deplatformization campaign against me in the name of social
justice.

When I make a post, it's there forever.

I know your analogy seems to make sense, but these two phenomena are
incompatible. You cannot translate this digital phenomenon to that physical
phenomenon.

~~~
luckylion
> You cannot translate this digital phenomenon to that physical phenomenon.

I agree, they aren't the same in that regard. The expectation is similar,
that's what I was trying to say. When people make a crude joke or call
somebody an asshole on Twitter, they don't expect that to be read by
everybody.

Somebody compared a public post to standing on a soap box. If you're standing
on a soap box, you expect what you say to be heard publicly. I believe that's
primarily because you usually don't stand on a soap box, so being that exposed
really stands out. If everybody stood on a soap box all the time because the
floor is lava and soap boxes are heat resistant, that would quickly change and
people wouldn't associate a soap box with "public". The early days of social
media certainly had that, it was new and everybody was acutely aware of it.
That changes with exposure, now it's no longer an outlet to let the public
know about your opinion, it's a chat where you talk one on one or in small
groups (for most users, very few have large followings and get a lot of
visibility) to friends and strangers alike. Having some tweet go viral is the
equivalent of some reporter overhearing your private conversation and
mentioning what a vile person you are in his column.

~~~
soulofmischief
> When people make a crude joke or call somebody an asshole on Twitter, they
> don't expect that to be read by everybody.

If we're keeping up with analogies, that's like saying someone doesn't expect
what they say on a soap box to be heard by anyone. That's a false expectation,
however.

We have to strike a balance between ideology and pragmatism. The best
compromise is likely preventing the government from having access to data we
consider private, while understanding anything on the other side of the line
is fair game and accessible to friends, family, politicians, corporations, and
historians alike.

The only way we can achieve this is by limiting what can be done with this
data aka fixing the US government. We cannot, for example, have public or
inferred health data used to prevent people from attaining health insurance.

~~~
luckylion
> If we're keeping up with analogies, that's like saying someone doesn't
> expect what they say on a soap box to be heard by anyone. That's a false
> expectation, however.

That's my point. If "soap box speaking" was as common as tweeting or posting
on Facebook or speaking to a friend in a pub, people wouldn't expect it to be
different. When social media changed from "technology pioneers use it
carefully" to "everybody uses it and rarely thinks twice before hitting send",
the expectation of the average user shifted.

> while understanding anything on the other side of the line is fair game

I do agree: Fair game for any _person_. As soon as technology enters the game
(scrapers, databases etc), it changes. Similarly, you can sit in a park and
catch fragments of what the people walking by talk about, that's all right. If
you're using a directional microphone to listen in on people on the other side
of the park, that's not. A user doesn't expect to be the target of a microphone
or a webscraper/data mining operation.

~~~
soulofmischief
Fair game means fair game. It means it's open for anyone.

If people don't _expect_ this, it's because they don't _understand_ it.

Let's work on educating people instead of warping reality around the ignorant.

~~~
luckylion
> Fair game means fair game. It means it's open for anyone.

In that case, you're allowed to remotely listen in on everyone as long as
they're in a public space.

> If people don't expect this, it's because they don't understand it.

Absolutely, and because everything is done so that they either don't
understand it, or quickly forget it. Change Twitter's send button to "I want
everyone, including my enemies, the government, journalists and my mother to
read this" ... and you'll pretty much kill Twitter.

> Let's work on educating people instead of warping reality around the
> ignorant.

The ignorant make up the majority. We can't educate everybody about machine
learning to the required degree where they can make an informed decision about
it. The same goes for financial instruments or law. Instead, we just outlaw
the most egregious transgressions ("no, it said on page 277 of the contract
that we get all the money he'll ever make in return for this bicycle") and
base what's allowed on what a reasonable person would expect.

~~~
soulofmischief
> In that case, you're allowed to remotely listen in on everyone as long as
> they're in a public space.

You're willfully continuing to ignore the soap box analogy.

------
skilled
In a perfect world people would use this as an opportunity to tell the
government to shove it. But, sadly, it's not a perfect world.

~~~
JumpCrisscross
> _In a perfect world people would use this as an opportunity to tell the
> government to shove it_

In _your_ vision of a perfect world. Personally, I’m fine with this. They
aren’t using call logs or other private data. Information on social media is
public or semi-public. Fair game, and I’m curious about the results.

(Agree that a more productive public debate is needed. But that shouldn’t
start by presuming values.)

~~~
abdullahkhalids
The problem in this case is not that private data might have been used. The
problem is that the outcome of this research will be the US military
developing a weapon that can be used to create or stop or influence political
movements across the world (including inside the US). The question is whether
it is a good idea to create such a weapon. War is inevitable, but since
time immemorial people have respected certain rules of engagement. It's
difficult to draw the line in the sand to decide what type of propaganda is
unacceptable, but it should be obvious that propaganda created using every
dark pattern in the book, trained on 350B social media posts, is probably not
going to bode well for Earth's future. It's a lose-lose situation.

~~~
JumpCrisscross
> _whether it is a good idea to create such a weapon_

These “should we do X” questions tend to be useless. If it can be done, it
will be done, absent multilateral action. Being able to predict social
upheaval for humanitarian and geopolitical positioning purposes is obviously
useful.

The government using such models to suppress activism in the U.S. is bad. But
knowing how such phenomena work is also a precursor to preventing their abuse.

------
zozbot123
Maybe we could all settle on a common insult towards the government then?

------
freeflight
I would be very surprised if this hasn't been going on for years already; back
in 2014 they were already as far as using Facebook users as test groups [0].

Then there was also the case of a Pentagon contractor leaving their AWS
buckets open exposing billions of scraped social media posts. [1]

As much as people can reach for benign reasons for "research" like this, the
reality is that it's not solely done to "understand" movements but rather to
better influence them [2].

[0] [https://news.vice.com/en_us/article/zm5k4a/the-troubling-lin...](https://news.vice.com/en_us/article/zm5k4a/the-troubling-link-between-facebooks-emotion-study-and-pentagon-research)

[1] [https://www.upguard.com/breaches/cloud-leak-centcom](https://www.upguard.com/breaches/cloud-leak-centcom)

[2] [https://www.theguardian.com/technology/2011/mar/17/us-spy-op...](https://www.theguardian.com/technology/2011/mar/17/us-spy-operation-social-networks)

