
The Socialist Origins of Big Data (2014) - 666_howitzer
http://www.newyorker.com/magazine/2014/10/13/planning-machine
======
DanielBMarkham
It's the causation problem.

Big Data is getting better at giving us lots of odd correlations -- people buy
poptarts before a hurricane. Uber knows where people are going to go. This
information is extremely useful.

But what we're missing here is this central fact: correlation data is only
good inside a very limited set of preconditions. Once you have a WalMart or an
Uber, it helps them operate better. It does not have the ability to create the
next WalMart or Uber.

That means that Big Data, as it is now, will always be able to continue to
optimize within a limited system, but will not be able to see outside of that
system. Big Data will not be able to create the next paradigm-changing thing
like Uber, because paradigm-changing things are by definition outside the
scope of the data already collected.

Recently there was a study published by some MIT students about startups. It
ran a bunch of numbers and gave you advice: pick a small name, operate outside
the valley, use older workers, and so on. But as somebody pointed out, you
really need to have a great idea, spot-on execution, and market traction. If
you have that, the rest of it doesn't matter. More to the point, if you have
all of those things the MIT guys came up with, they're not going to give you
the other things you need. As it turns out, the things you need for a great
startup are still fiercely debated -- hence the MIT study in the first place.
Correlation does not equal causation.

I see this in big companies all of the time. Our projects are running, on
average, 100% late! So somebody looks at the data and finds that most of the
time is spent testing. What do we need? Better testers, of course!

Simply because you can point to a couple of different things that track
together does not mean you understand anything. And in IT, unlike Big Data, we
_do_ have to worry about causation, because we're inventing the universe every
time we ship.

You always optimize complex systems from the bottom-up, never from the top-
down. Otherwise you're just fooling yourself in various interesting ways.
That's true no matter what the system is.

------
hownottowrite
Recommended: Eden Medina's Cybernetic Revolutionaries:
[http://mitpress.mit.edu/books/cybernetic-revolutionaries](http://mitpress.mit.edu/books/cybernetic-revolutionaries)

------
swatow
The article doesn't show that the Cybersyn experiment actually influenced the
modern big data movement (or anything for that matter). I think that "The
socialist origins of the Big Data nation" is much more accurate, since even
though the article didn't show any influence on current government big data
initiatives, at least there is a lot of similarity.

In spite of being a free market supporter, I think that command economies have
been useful at times, e.g. the PRC and the USSR in their early stages. Both
these countries practiced something similar to what Chile tried, in the 70's
and early 80's. My theory is that before communism these countries were
practicing a very corrupt state capitalism. So a logical step before
capitalism was to rationalize state capitalism by making government decisions
based on utilitarian goals instead of corruption.

EDIT: modified my request for title change after I read the original title
more carefully.

~~~
shoegumfoot
Prior to their communist revolutions, both the USSR and PRC were actually
practicing something not very different from feudalism. In fact, one of the
reforms of the last Tsar was to unbind (legally, if not in practice) serfs
from their estates.

~~~
pandaman
>In fact, one of the reforms of the last Tsar was to unbind (legally, if not
in practice) serfs from their estates.

Serfs were freed in 1861[1], by Alexander II - the grandfather of the last
Tsar. That was about two years before the Emancipation Proclamation took
effect in the USA[2].

[1]
[http://en.wikipedia.org/wiki/Emancipation_reform_of_1861](http://en.wikipedia.org/wiki/Emancipation_reform_of_1861)

[2][http://en.wikipedia.org/wiki/Emancipation_Proclamation](http://en.wikipedia.org/wiki/Emancipation_Proclamation)

------
drawnalong
There will always be power and profit in being able to monitor and control
what people are going to do. This is a scary fact of life in the human arena.
And though this is not the first time that technology has advanced so far so
fast, some pretty scary basic concerns about governance and fundamental rights
have been raised in recent years.

Eventually, in the midst of the well-wishers there will come those souls who
are not good actors, and they do seem to spoil the bunch sometimes -
don't they?

Given enough power, anything is worthy of unusual inspection. Similarly, the
current success of Uber is no different. Regardless of whether the social,
political and market narratives about Uber are currently accurate, fair or
not, it is the sudden nature of company power and the driving essential
questions around the nature of new economies that compel articles like these.
Uber and AirBnB are challenging fundamental labor and property rights
shibboleths, and demanding real consideration from people across the
philosophical, commercial, and political realms.

Similarly, the 20th century spread of dictatorial socialism was incredible. No
matter how much I try to sympathize with the promises of various ideologies, I
always return to the premise that in all circumstances, it is the
CONCENTRATION OF POWER that is the problem. It is even more disturbing for
pragmatists and empiricists when they see something moving fast that is
denying anyone the chance to scrutinize, test and debate the merits of the
method, the motives, the means.

As a good friend says though - it is easy to make these conceptions
"heuristics" and see someone's intent as globally bad. Some people's
intentions can be good and means and methodology sound. Yet with data as with
literacy at large, the medium by which the person's intent is made manifest is
the very means by which the power will be exercised and the rules challenged.
In this case, massive volumes of data are being generated that have multi-
dimensional consequences for all humankind. This IS a concentration of power.
Even though a witch hunt is far from a beautiful thing, the world citizen has
very real reason to be concerned, and truly should be immediately concerned,
about the very topics that are daily discussion here on HN. Matters of
technology, privacy, rights, commerce and liberty are not weak, and neither
are musings on the mating habits of whales.. from time to time.

HN is a fount of a certain kind of literacy. How does that literacy get
expanded to all, and seen as a human right?

~~~
Pamar
Another good read on related subjects (i.e. data-driven management, and the
dilemma faced by Socialist state in trying to remove prices - i.e. market
forces - as self-regulating mechanisms) is "Red Plenty" by Francis Spufford.
Here is the website:
[http://redplenty.com/Red_Plenty/Front_page.html](http://redplenty.com/Red_Plenty/Front_page.html)
but I heartily recommend reading the book.

------
Tycho
One thing that often bugs me is when management demand more analytics on the
work flow of a department. This usually entails more manual logging by the
staff, which takes time out of their day. 'But we need this information,' they
say, 'otherwise we're just flapping about in the dark not knowing the full
picture.' Or something to that effect.

The thing is, how is it that we seemed to manage _just fine_ without all those
business analytics in the past? Somehow people were able to exercise their
judgement and run the department effectively. Not that there's no room for
improvement of course, but maybe you'd be better skipping straight to
diagnosis/solutions rather than spending time logging and collecting
indicators.

~~~
Pamar
I believe this is a result of the shift from production of physical goods to a
more service-oriented economy. If your main concern is "manufacturing X" it is
usually fairly simple to tally "Number of X produced" (even "Number of X
produced by operator #128"), number of defective parts, number of returned
items after sale... If you deal with software development, translations,
advertising, or copy editing, getting an idea of "productivity" is far more
complicated. Do you count "lines of text written by operator #128"? This might
make marginal sense for translators (and you are still missing any idea of
quality/defects, because even if a customer rejects your work as "completely
wrong" you won't get any specific metrics) but completely misses the mark for
the other examples I listed.

So yeah, we are stuck with the idea that "you cannot control what you cannot
measure", but we have a serious deficit in terms of measures.

------
SocksCanClose
Check out Morozov's amazing work on the philosophical underpinnings of one of
the web's most powerful figures:
[http://www.thebaffler.com/salvos/the-meme-hustler](http://www.thebaffler.com/salvos/the-meme-hustler)

