The next frontier for big data is the individual (technologyreview.com)
17 points by tellarin on May 3, 2013 | 6 comments



Ah yes, "the power of a society based on 10 times as much data". The best examples that the authors could come up with are:

* A professor receiving an automatic notification that his flight was delayed. This saved him approximately 2 minutes compared to checking the flight number before he left for the airport.

* Stephen Wolfram figuring out what time he likes to send emails.

Big data proponents need to tell a better story about how data will empower the individual. So far it seems like it's just about large corporations engaging in cross-merchandising, advertisers getting you to click on things, and spy agencies building dossiers without all that pesky legwork.


There's a really rich set of analyses in http://blog.stephenwolfram.com/2012/03/the-personal-analytic... . "Times he likes to send emails" is a poor description for that body of work [I know the people who worked on it, and they worked hard and smart on it for a long time].

While I like neither buzzword, "data science" is probably better than "big data", for the following reasons:

1. Not all data needs to be big to be interesting.

2. The majority of 'science' that is typically done on data, both by enthusiasts and corporations, is both stereotyped and shallow. It won't be long before this is disrupted. Unfortunately "big data" makes it sound like the problem is an engineering one -- in reality, the problem is cultural.

3. Like 'traditional' science, data science is irreducibly hard. You need to be smart and creative, to know a diversity of methods, and to have interactive tools that allow you to explore and test hypotheses.


I quite agree! I enjoyed this story along those lines a while back: "The best minds of my generation are thinking about how to make people click ads. That sucks." -- http://www.businessweek.com/magazine/content/11_17/b42250609...


My company's core product (http://www.stremor.com) is designed to take unstructured data from text and convert it to structured data. It also looks at qualitative factors in language and assigns them quantitative values.

While working on a "detector" to determine whether an author was paid to write an article about a company (which we got working most of the time), I discovered something else we could detect.

Only among women bloggers, I noticed that every so often, after profiling how the author writes, I'd suddenly get a whole bunch of false positives for "paid shill" detection.

Turns out we were detecting a change in optimism, which was what we were trying to detect. But the change wasn't because of monetary incentives to appear optimistic; it was because the women had learned they were pregnant.

I also did an analysis to see if I could detect bloggers who were turning suicidal, and I actually could. But I would get false positives for things like "was diagnosed with a terminal illness" or "lost a parent or spouse".

While I wasn't detecting exactly what I was looking for, imagine the possibilities of being able to know when an employee had a life-changing event, and being able to offer them the help they need.

Going through the corpus of Enron emails, I can detect when employees started to know that they were doing something wrong.

I don't know that all of this data is "good" or that I would want an employer to have all the metrics I can extract, but for things like scouring the Enron emails for witnesses who would be sympathetic and willing to testify, it could be amazing. For monitoring people with depression to make sure they aren't getting worse, it seems like it would be worth the privacy invasion.

I also think I might be willing to let a machine do things that I wouldn't let a human do.


These are interesting examples. Have you given any talks or written any blog posts about these results?


At least they promote a bill to allow people to view all the big data that companies have on them. Not a fan of giving away my privacy.



