

The next frontier for big data is the individual - tellarin
http://www.technologyreview.com/news/514346/the-data-made-me-do-it/

======
EvanMiller
Ah yes, "the power of a society based on 10 times as much data". The best
examples that the authors could come up with are:

* A professor receiving an automatic notification that his flight was delayed. This saved him approximately 2 minutes compared to checking the flight number before he left for the airport.

* Stephen Wolfram figuring out what time he likes to send emails.

Big data proponents need to tell a better story about how data will empower
the individual. So far it seems like it's just about large corporations
engaging in cross-merchandising, advertisers getting you to click on things,
and spy agencies building dossiers without all that pesky legwork.

~~~
taliesinb
There's a really rich set of analyses in
<http://blog.stephenwolfram.com/2012/03/the-personal-analytics-of-my-life/>.
"Times he likes to send emails" is a poor description for that body of work
[I know the people who worked on it, and they worked hard _and_ smart on it
for a long time].

While I like neither buzzword, "data science" is probably better than "big
data", for the following reasons:

1\. Not all data needs to be big to be interesting.

2\. The majority of 'science' that is typically done on data, both by
enthusiasts and corporations, is both stereotyped and shallow. It won't be
long before this is disrupted. Unfortunately, "big data" makes it sound like
the problem is an engineering one -- in reality, the problem is cultural.

3\. Like 'traditional' science, data science is irreducibly _hard_. You need
to be smart and creative, to know a diversity of methods, and to have
interactive tools that allow you to explore and test hypotheses.

------
brandon_wirtz
My company's core product (<http://www.stremor.com>) is designed to take
unstructured data from text and convert it to structured data. It also looks
at qualitative factors in language and assigns them quantitative values.

While working on a "detector" to determine whether an author was paid to write
an article about a company, which we got working most of the time, I
discovered something else we could detect.

Only among women bloggers, I noticed that every so often, after profiling how
an author writes, I'd suddenly get a whole bunch of false positives for "paid
shill" detection.

Turns out we were detecting a change in optimism, which was what we were
trying to detect. But the change wasn't because of monetary incentives to
appear optimistic; it was because the women had learned they were pregnant.
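
A minimal sketch of this kind of per-author baseline check, in Python. It is
not Stremor's actual method: the word lists and the `optimism_score` and
`flag_shifts` functions are hypothetical names invented for illustration, and
a real system would presumably use far richer language features than word
counts. It also shows why such a detector cannot tell a paid post from a life
event: it only sees that the score moved.

```python
from statistics import mean, stdev

# Hypothetical word lists, purely for illustration.
POSITIVE = {"great", "love", "excited", "wonderful", "happy", "hope"}
NEGATIVE = {"bad", "hate", "worried", "awful", "sad", "tired"}


def optimism_score(text):
    """Crude score: (positive words - negative words) / total words."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def flag_shifts(posts, baseline_size=5, threshold=2.0):
    """Return indices of posts whose score deviates from the author's
    baseline (the first `baseline_size` posts) by more than `threshold`
    sample standard deviations."""
    scores = [optimism_score(p) for p in posts]
    baseline = scores[:baseline_size]
    if len(baseline) < 2:
        return []  # not enough history to estimate a spread
    mu = mean(baseline)
    sigma = stdev(baseline)
    flagged = []
    for i in range(baseline_size, len(scores)):
        if sigma == 0:
            deviates = scores[i] != mu
        else:
            deviates = abs(scores[i] - mu) / sigma > threshold
        if deviates:
            flagged.append(i)
    return flagged
```

Calling `flag_shifts` on one author's posts in chronological order returns the
positions of posts that break from that author's usual tone. The cause of the
break still has to be worked out some other way.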

I also did analysis to see whether I could detect bloggers who were turning
suicidal, and I actually could. But I would get false positives for things
like "was diagnosed with a terminal illness" or "lost a parent or spouse".

While I wasn't detecting exactly what I was looking for, imagine the
possibilities of being able to know when an employee had a life-changing
event, and being able to offer them the help they need.

Going through the corpus of Enron emails, I can detect when employees started
to know that they were doing something wrong.

I don't know that all of this data is "good" or that I would want an employer
to have all the metrics I can extract, but for things like scouring the Enron
emails for witnesses who would be sympathetic and willing to testify, it could
be amazing. For monitoring people with depression to make sure they aren't
getting worse, it seems like it would be worth the privacy invasion.

I also think I might be willing to let a machine do things that I wouldn't let
a human do.

~~~
taliesinb
These are interesting examples. Have you given any talks or written any blog
posts about these results?

------
Executor
At least they promote a bill to allow people to view all the big data that
companies have on them. Not a fan of giving away my privacy.

