
The Fall of Big Data - henridf
http://www.argmin.net/2016/11/14/fall-of-big-data/
======
sudoscript
> The third major failure has been a general apathy about politics amongst my
> colleagues here in the Bay Area. When many of the best minds in machine
> learning have decided that the most existential threat to civilization is
> the rise of Skynet, we have had a major failure of group think.

I cannot agree more. I am also in the Bay Area, and I feel surrounded by
people who are deeply out of touch with the rest of the country.

There is real desperation in this country. That's why Trump is President.
People didn't vote for him because they are racist. They voted for him because
they worry about putting food on their tables. They worry about being left
behind in a world that has fewer and fewer ways for the average Joe to dream
about a future. They're angry, rightfully so, and in Trump, they found someone
willing to listen.

They sure as hell didn't find it in us. Sure, we've given them lip service.
We've looked at them through our rose-tinted Google Glasses, and at best,
we've felt a twinge of regret at the way the world is going. Not enough to do
anything. For all the efforts to teach girls to code (a problem deeply related
to our own talent shortage), how much have we done to figure out what a blue-
collar worker in rural Indiana is supposed to do when the sole factory in his
town shuts down?

Because let's face it -- we look down on them. We call them hicks and
rednecks. We scoff at their lack of high-falutin education. We think they got
left behind because they're not good enough. Because they didn't pass the
tests we did, didn't have the ambition we do, didn't try as hard as we do.
It's a nice thought if you want something to let you sleep at night, but it's
not true. Those people aren't worse than us. They're not left behind because
they're stupid. They're being left behind because people like us stopped
giving a damn a long time ago.

The solutions we did offer were exactly the ones you'd expect from a far-
removed elite aristocracy. Teach everyone to code, or throw them money.
Neither of which they want. So the people you laughed at, the people you
thought yourselves better than, the people you offered little more to than pie
in the sky techno-utopian dreams -- is it any surprise they turn around and
elect someone you don't even understand? You never understood them anyway.

~~~
lsc
>There is real desperation in this country. That's why Trump is President.
People didn't vote for him because they are racist. They voted for him because
they worry about putting food on their tables. They worry about being left
behind in a world that has fewer and fewer ways for the average Joe to dream
about a future. They're angry, rightfully so, and in Trump, they found someone
willing to listen

How do you square the poverty story above with the fact that Trump ran on a
fiscally conservative platform, within a fiscally conservative party that
promises to reduce the safety net?

~~~
quicklyfrozen
I have to assume that many don't want a safety net; they want to be able to
support themselves.

~~~
lsc
Which is fine... but personally, I don't see that much difference between
charging a tax that makes my imported goods more expensive (in an effort to
subsidize American workers) and, say, just setting up some sort of reverse
income tax, or strengthening the Earned Income Credit or what have you so that
you directly tax me and directly give those dollars to people who don't have
competitive job skills. If anything, I bet the latter would be more
'efficient' in that I'd have to pay fewer dollars to keep those folks fed and
housed, just 'cause if you make imports more expensive, sure, more stuff will
be manufactured here, but most of that manufacturing is going to be done by
robots, so it seems to me you'd mostly be making me pay more for stuff I buy,
then turning around and increasing the demand for labor from people with my
sorts of skills.

------
bitL
I don't think this has any relation to Big Data. It's simply Trump supporters
were clever enough to give out false signals due to being ostracized by media
and worried about consequences of going public (fired from a job, removed from
social circles etc.) So the massive pressure on conformity created a late-
Soviet-Union-style dissonance, outwardly many supporting "approved"
candidates, inwardly supporting Trump (maybe due to this pressure from all
sides as a simple vote against). Many would've probably voted for "nobody" if
that were a choice alongside Hillary and Donald.

------
huffpopo
A) This is NOT Big Data

B) This election was 101 in ML. The biases were obvious and easy to undo.

C) I called it and won a lot of money
[https://news.ycombinator.com/item?id=12863445](https://news.ycombinator.com/item?id=12863445)

D) Best not to take advice from people who are constantly wrong

E) Big Data and ML are still awesome :)

See below from my linked post a week before the election:

The problem with Bayesian is that it's often overconfident in the model.
Error bars don't take this into account and people over-rely on error bars -
especially with complex domains. Then there are the intentional and
unintentional sampling biases in the polls that have to be taken into account.
Then there are known human effects that add additional biases. E.g. the shy
voter effect (e.g. Tories and Brexit), silent majorities (Nixon), momentum,
dam breaking (Reagan), enthusiasm gap (Obama), the giant fk you to the
establishment vote, pending indictments etc. In my view with the election
being so emotional this year these effects are big enough to swing it to Trump
making a poll based model wildly inaccurate.

I've taken an actual position in the election. $4k on Trump to win at 15% odds
made just after the tapes. At the time I felt Trump had at least 50% chance so
the bet was positive expected return for me. I figured the public would get
over it. My Bayesian friends had his odds at 2%. Today I consider Trump to be
at least 70% with the whole FBI indictment saga picking up steam.
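The expected-return reasoning in the paragraph above can be sketched in a few lines. This is only an illustration, assuming the quoted "15% odds" is the market-implied probability with no bookmaker margin; the function name `expected_profit` and its parameters are made up for this example and are not from the post.

```python
def expected_profit(stake, market_prob, believed_prob):
    """Expected profit of a win/lose bet.

    Assumption: the market pays fair odds for `market_prob`, i.e. a winning
    bet returns stake / market_prob in total (no fees, no margin).
    """
    profit_if_win = stake * (1.0 / market_prob - 1.0)
    loss_if_lose = stake
    return believed_prob * profit_if_win - (1.0 - believed_prob) * loss_if_lose


if __name__ == "__main__":
    stake = 4000.0
    market_prob = 0.15
    # Compare the bettor's belief (50%), the market's view (15%),
    # and the 2% attributed to the "Bayesian friends".
    for believed in (0.50, 0.15, 0.02):
        ev = expected_profit(stake, market_prob, believed)
        print(f"believed p={believed:.2f}: expected profit = {ev:,.2f}")
```

Under these assumptions the bet is positive-expectation only if the bettor's probability is above the market's 15%; at exactly 15% it breaks even, and at 2% it is a losing bet.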

In addition, the betting market behavior mirrors Brexit. A few very big bets
on the status quo and many small bets against. It appears as if once again
deep pocket punters are intentionally trying to manipulate the odds on the
illiquid market in order to send a message that affects the vastly larger
financial markets. So I think the odds are that there is some free money
there.

~~~
matt4077
Congratulations on the money, but it doesn't really prove anything, especially
not (B). There were hundreds of people working on this, considering all sorts
of possible sources of errors. Any bias that is "obvious and easy to undo" was
priced in.

And the polls were actually not as bad – the difference was below the 3% from
2012. It just happened to swing the result.

~~~
huffpopo
By "obvious" I meant to ML practitioners (esp frequentists which is more
typical for big data). Not everyone. So not priced in. It is not an efficient
market. There are huge information asymmetries. I will give you that calling
it ML 101 was over the top.

If you dig into the polling breakdown and compare the individual polls (states
as well as national) with the actuals it is clear that the polls were bad.
That the aggregate error was smaller than it otherwise could have
been is more luck than anything. It could be argued that you can assume a
gaussian error distribution so a more precise aggregate is to be expected but
I would argue that that is not a safe assumption.
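A rough simulation of why that assumption matters, offered as a sketch and not as a model of the actual 2016 polls: if poll errors are independent, averaging shrinks them; if every poll shares a common bias (for example a shy-voter effect), averaging does not help. All numbers and the name `aggregate_error_spread` are hypothetical.

```python
import random
import statistics


def aggregate_error_spread(n_polls, noise_sd, shared_sd, trials=2000, seed=0):
    """Standard deviation of the poll-average error across simulated elections.

    Each poll's error = one shared bias (same for all polls in an election)
    plus independent Gaussian noise. Setting shared_sd to 0 gives the
    fully independent case.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd)
        errors = [shared + rng.gauss(0.0, noise_sd) for _ in range(n_polls)]
        averages.append(statistics.mean(errors))
    return statistics.pstdev(averages)


if __name__ == "__main__":
    independent = aggregate_error_spread(n_polls=20, noise_sd=3.0, shared_sd=0.0)
    correlated = aggregate_error_spread(n_polls=20, noise_sd=3.0, shared_sd=2.0)
    print(f"independent errors: spread of average = {independent:.2f} points")
    print(f"shared bias added:  spread of average = {correlated:.2f} points")
```

With independent noise the spread of the average falls roughly as 1/sqrt(n_polls); the shared component stays at full size no matter how many polls are averaged.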

TL;DR: Just because it was close doesn't make it a good model.

As an aside. All of my Bayesian friends lost money and all of my frequentist
friends made money :)

------
forgetsusername
The Fourth major failing is in assuming that the majority of the country faces
the same concerns as the rich people piled into the coastal cities.

There's nothing inherently wrong with calling upon like-minded people to
change government policies as the author asks (it's called lobbying), but the
assumption that the "other-half" are just mistaken bigots and racists is
exactly how this mess started.

~~~
michaelsbradley
Exactly. Frank Bruni over at the NYT wrote this yesterday:

"Liberals miss this by being illiberal. They shame not just the racists and
sexists who deserve it but all who disagree. A 64-year-old Southern woman not
onboard with marriage equality finds herself characterized as a hateful boob.
Never mind that Barack Obama and Hillary Clinton weren’t themselves onboard
just five short years ago.

"Political correctness has morphed into a moral purity that may feel
exhilarating but isn’t remotely tactical. It’s a handmaiden to smugness and
sanctimony, undermining its own goals."[+]

And note too that some of us in tech who went to big-name private
universities, maxed out our SATs, are good at math and love the physical
sciences....... some of us "held our noses" and voted for Trump as well,
finding, in comparison, the politics and platform of Hillary Clinton to be
alienating and slightly worse, on the whole.

[+] [http://www.nytimes.com/2016/11/13/opinion/the-democrats-
scre...](http://www.nytimes.com/2016/11/13/opinion/the-democrats-screwed-
up.html)

------
jimmydddd
It does seem like a smart guy with a twitter account, an airplane and a lot of
energy was able to beat large embedded organizations and large
infrastructures. Maybe it's like a nimble startup overtaking a large status
quo industry. Some people say that his party actually had a more sophisticated
big data system, though. It's just that they used it, instead of talking about
it.

------
sluggg
I urge everyone to read Social Physics by Alex Pentland, it addresses some of
the topics in the article and provides a more holistic mental framework for
big data. It's a good read. [https://www.amazon.com/Social-Physics-Networks-
Make-Smarter/...](https://www.amazon.com/Social-Physics-Networks-Make-
Smarter/dp/0143126334)

------
ramblenode
Clickbait.

The TLDR:

A) Polls were biased and we don't have many elections on which to build good
models.

B) Social media created echo chambers.

C) Other people's politics are sci-fi and more people should agree with me.

------
LordHog
I stopped reading after the first few sentences that were politically driven.
Click bait?

------
clifanatic
Wow, talk about a click-bait headline... absolutely nothing to do with big
data.

