
A Funny Thing Happened on the Way to Academia - tim_sw
http://cacm.acm.org/blogs/blog-cacm/157645-a-funny-thing-happened-on-the-way-to-academia/fulltext
======
jacoblyles
There are a lot of good ideas and sentiments in this article. And then at the
end I find out the trail-blazing product he left academia for is "a newsfeed
based on your interests."

How many startups like this are there now? 300? That is probably a low
estimate. I'm sure that everyone living in the Valley personally knows at
least one founder working on the exact same product. And none of them are
better than my Hacker News/ Facebook/ Reddit/ RSS feed combo.

I've heard some of the many Prismatic competitors describe themselves as "a
Pandora for news" which is an apt description, since rdio, soundcloud, and
Spotify are better than Pandora. Self, social and community sourcing for
digital entertainment tends to be better than algorithms. While I believe the
"problem" domain is severely overworked, if you're going to stick with it then
my bet is that social algorithms like collaborative filtering a la Netflix
do a better job than topic modeling.

On the optimistic side, a good company is more than a single product. Maybe
the author's considerable expertise and experience from building Prismatic
will lead to something cool down the road.

~~~
dbecker
As a prismatic user, I think the product he's built is pretty far ahead of any
competitors I've seen.

I don't think the fact that others have tried with limited success is an
indictment of prismatic at all. If anything, that strengthens the argument
that this project required a lot of NLP skill.

~~~
jacoblyles
When I think about what I want the future to look like and what's missing now,
a smarter news aggregator isn't on the top 10 list. I have no problem wasting
infinite amounts of time with existing entertainment technologies. At the same
time, thousands of engineers clearly disagree with me as they are risking
their livelihoods and investing their talents in that field.

I do find the whole field boring. Making a news aggregator that's X% better
than existing ones doesn't improve the world a whole lot, or even offer a
compelling consumer value proposition.

I don't want to sound like I'm against all consumer entertainment tech. It's
clearly something that people enjoy, and it's exciting for some people to
make. But this particular application is one whose value I am very skeptical
of.

~~~
dbecker
Completely agree that a news aggregator isn't one of the world's 10 most
pressing needs. It isn't in the top 100 either.

But Prismatic has told me about articles that make me better at what I do.
Depending how widespread that experience is, they may be having a larger
impact on our 10 most pressing needs than most teams that attack those needs
directly.

------
jamesjporter
Great post. As someone who works in a different field of science, one bit got
me thinking:

>Like any academic community, the work within NLP had become largely an
internal dialogue about approaches to problems the community had itself
reified into importance.

I would argue that this is an issue of goals. If you're motivated by the
application of research results to solve practical problems (as the author
is), then this is a valid criticism. But for me science is also self evidently
valuable — understanding the principles that govern the universe is a noble
goal irrespective of finding opportunities to apply them.

Perhaps the field plays a role as well. In all sciences one's goal is to
discover the rules and facts that govern a particular system. In the natural
sciences, this system happens to be the world in which we live, whereas in the
formal sciences (math / CS), the system is often an artificial one of human
construction. In the natural sciences, the importance of the rules you
discover is self evident (they govern our own lives and capabilities!),
whereas in the formal sciences, it's more necessary to justify the importance
of your discoveries (with, say, practical applications) because the importance
of the system you're studying isn't as self-evident.

This isn't to say that the natural sciences are in some way superior; just a
speculation about the attitudes/motivations of academics in different fields.

~~~
iyulaev
_In the natural sciences, this system happens to be the world in which we
live_

Not to say I disagree with your post, but isn't it more accurate to say that
the natural sciences today explore _models_ for the world in which we live?
The models tend to be fairly obvious for things on the human scale, but when
you talk about systems on the atomic scale, or on the cosmic scale, the
immediacy of the models tends to break down. The result is that, just as for
the abstract sciences (math, CS, etc), the usefulness of the models has to be
justified. So, I feel that the line between purely abstract sciences and
"natural sciences" is not quite as fine and well-defined as your post makes it
to be.

~~~
tsewlliw
I really like knowing that there are people dedicated to thinking deep
thoughts. There was a recent potential proof published on a tough math
problem, and apparently the author had gone dark for 15 years creating a
fantastic mathematical universe and vocabulary all on his own. That's
worthwhile. Maybe not for everyone to go off doing it, but definitely for
some.

~~~
Evbn
He didn't go dark. He has been publishing but no one knows what he is going on
about.

------
aria
I'm the post author and would be happy to answer any questions and discuss
academia CS and startups.

~~~
temphn
Great article, and agreed on most everything you said. Regarding Prismatic
itself, some constructive feedback (feel free to ignore).

My immediate first impression was that this was another app that would waste
my time. I want less of those, not more. The absolute best news feed app I've
used is Flipboard for the iPad, and even then I don't use it too much as I
feel too much like I'm wasting time with it (like HN!).

Secondarily, the homepage doesn't grab you enough. The text in the graphics is
too blurry and the pictures are generic. The homepage below the icons doesn't
have the production values to explain why it's cool.

I'm not trying to be negative here. Some of the points in your article really
hit home (email summarization would be awesome and paradigm shifting). So I
wonder if you might focus more on making something that will save people time
and solve a pain point vs. "another web-based time-wasting thing" (that may
not be fair, but that was a first impression).

For example, can you scrape an inbox and list of
Facebook/Github/Tumblr/RSS/Twitter feeds to get a single high-sensitivity
greatest hits from all monitored news feeds? This way you can check a single
page in five minutes on your phone and feel reasonably content that you saw
the top headlines for that week. Kind of like news.ycombinator.com/best.

Just some thoughts, FWIW.

~~~
robrenaud
> For example, can you scrape an inbox and list of
> Facebook/Github/Tumblr/RSS/Twitter feeds to get a single high-sensitivity
> greatest hits from all monitored news feeds

You've missed the point already. The web is vast, on the whole filled with
99.99% crap. But .01% of something large is still very big. You cannot list
the interesting things a priori with a few feeds. Prismatic is about learning
about what you actually like to read and giving it to you. It's got the
smooth/sleek usability of Google reader (read title, keep smashing J when
something isn't worth reading, essentially letting you skim/reject more than
10 uninteresting articles/minute), but without the low recall "enumerate every
site I think I am interested in" subscription model.

It's compelling and awesome.

~~~
temphn
Ok, I might well be missing the point. But enough people upvoted that I think
this is a common perception: "Oh, not another time-waster". So, maybe make the
homepage address this issue head on with a good video that shows why this is
compelling, awesome, and productivity-increasing.

~~~
robrenaud
I do think it's mostly a time filler like say, hackernews or a good subreddit.
Maybe you'll find some article that will increase your productivity, but
that's definitely not the focus. It's just about giving you a stream of
articles that you'll probably enjoy reading.

------
marshray
A really fantastic article overall. I was never personally at risk of
academia, so I don't resonate with that part of the article. But I
particularly liked these two sentences for their practical insight:

 _that process of taking qualitative ideas and struggling to represent them
computationally is the core of artificial intelligence (AI)._

and

 _that path from research to product rarely works, and when it does it's
because a company is built with research at its core_

~~~
aria
Thanks! It seems like an obvious thing, but thinking about "operationalizing"
intuitions computationally really helped me hone in on what I should be
focusing on with research. My advisor Dan Klein was really awesome at teaching
that.

------
iamtheneal
Awesome post. I -- along with many of my colleagues from grad school -- have
had pretty similar experiences with academia (minus the outlook for a
promising academic career ;) ). This sort of thing (very smart people
abandoning academia) is going to continue to happen unless academia figures
out a way to make itself more relevant.

academic |ˌakəˈdemik| adjective

2 not of practical relevance; of only theoretical interest : the debate has
been largely academic.

~~~
hyperbovine
My experience in academia thus far: for every five smart people abandoning
academia for industry, one truly brilliant person remains behind. That's all
that is required for the model to sustain. (And even then, there are not
enough academic jobs to go around.)

------
betula82
Do you have ideas for how academia can be brought back into touch, or is it a
lost cause?

~~~
aria
Well, I think the number one thing that would correct the balance between
academia and industry is if people could freely go from one to the other.
Someone could take insights from one arena to the other, enriching both.

There are many reasons why this is difficult, but the primary obstacle is the
tenure system.

~~~
opminion
In the UK it is possible to switch between academia and industry, so long as
you are satisfied with fixed term contracts in academia.

Recruiters tend to get confused when someone turns up at their doorstep with a
PhD ("which marks did you get?"), and academics tend to not have a clue of
what's the benefit of industrial experience (in spite of being effectively
"agile" product managers themselves when participating in, say, a European
Framework project).

------
mitultiwari
Nicely written blog!

Agree with the author that academia sometimes focusses on a very narrow band
of research topics, which might or might not improve end user experience.
Also, agree with the author that personalized news has very interesting
NLP+Machine Learning problems.

However, I am not convinced that personalized news is what users want. They
may prefer to discover popular news and articles by serendipity and socially,
rather than through personalization by algorithms. Further, I think
personalized news has very limited opportunity for generating significant
revenue.

~~~
jacoblyles
My suspicion about personalized news is that entrepreneurs are building things
they know how to build rather than building something people want.

------
junktest
Zite too seems to be working in the same space of ML-aided personalized news
magazines.

~~~
mikeklaas
Several of us Zite employees are also refugees from academia, it turns out.

------
zissou
I faced a similar problem in academia, but it was in economics. I started my
undergrad career as a computer science major, later ending up in economics,
and I went on for my PhD in economics after finishing my BA. For those that
aren't familiar, a PhD in economics is a lot like a PhD in mathematics. If you
don't already respect economists' mathematical ability, you really should, but
that's not my point here because math is just a tool, and contains no absolute
truths for the social scientist.

In an effort to combine solid theoretical research with well built empirical
models, I learned how to program Python to scrape data from web sites. Since
nobody around me knew anything about computer science, I found myself having
to teach myself everything when it came to "how would I write a program that
collects X things from Y products in Z Internet markets". In the beginning,
when I was learning to use Python libraries for doing HTTP requests
(eventually converging to using 'requests'), how to parse that HTML (started
with BeautifulSoup and then converged to 'lxml'), and how to aggregate and
analyze that data (eventually converging to 'pandas'), I had to spend a lot of
time learning the ins and outs of Python. To this day, I still think it was a
great choice because Python has changed the way I approach any kind of
business question that could be answered with a well defined empirical model.
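
A minimal sketch of the fetch, parse, and aggregate pipeline described above. It is not the commenter's code: to stay runnable without installs it swaps in the standard library (`html.parser`, `statistics`) for the 'requests', 'lxml', and 'pandas' libraries named in the comment, and the HTML snippet, class names, and field names are all made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser
from statistics import mean

# Hypothetical page; in practice this string would be the body of an HTTP response.
SAMPLE_HTML = """
<ul>
  <li class="product" data-market="apps"><span class="name">Foo</span><span class="price">0.99</span></li>
  <li class="product" data-market="apps"><span class="name">Bar</span><span class="price">2.99</span></li>
  <li class="product" data-market="games"><span class="name">Baz</span><span class="price">4.99</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # which span ("name" or "price") is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self.rows.append({"market": attrs.get("data-market")})
        elif tag == "span" and attrs.get("class") in ("name", "price") and self.rows:
            self._field = attrs["class"]

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(html):
    """Parse the page and return a list of product rows with numeric prices."""
    parser = ProductParser()
    parser.feed(html)
    for row in parser.rows:
        row["price"] = float(row["price"])
    return parser.rows


def mean_price_by_market(rows):
    """Aggregate step: average price per market."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["market"]].append(row["price"])
    return {market: mean(values) for market, values in prices.items()}


rows = scrape(SAMPLE_HTML)
summary = mean_price_by_market(rows)
print(summary)
```

With the libraries the comment mentions, the same three stages would map onto an HTTP fetch, an XPath or CSS query over the parsed tree, and a group-by on a data frame.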

As a research assistant, I would spend 12 hours writing scripts to automate
the collection and analysis of data for some project I discussed with a
professor. The day after writing that script, I would go and talk to the
professor and want to discuss some of the computational issues with the script
(say, encoding issues, or even the use of computers on EC2 to help collect
massive amounts of data every day), but the economics professors would not
care at all to hear about any of that. "You need to be studying the economics,
not the computer science" they'd tell me. This would infuriate me, because in
my mind, if you wanted to be able to ask all these interesting questions that
rely on the data, then you need to spend time to make sure you've done the
computation correctly and in a way that is reliable. Maybe I dug my own grave
by showing my excitement as I began to become more fluent with Python, and
thus more confident in my abilities.

Nonetheless, I eventually decided to leave my PhD program because my interests
in topics that lie at the intersection of computer science and economics were
a bad fit for my program. It was so disheartening to me because I truly loved
economics, and was confident that what I intended to do with my PhD research
was going to be unique and lay a framework for the field of industrial
organization. However, the experience I received in my program was the lack of
an adviser that could truly help me achieve what I wanted to do, and a regular
negative response to anything I'd bring up that was out of the realm of what
my professors were used to talking to graduate students about.

Overall, as my tone may signal, graduate school has left me very, very bitter.

~~~
zzleeper
I've talked with many friends in a similar situation as yours (econ phd) and
they all faced the same issues. Some advisors don't even care if you are
coding in Python or in stone tables, as long as you "have three stars in your
results table". Your comments about the quality of the computations also
reminded me a lot about Ken Judd's (often ignored) arguments about paying
attention to all these implementation details.

Good luck with your current goals, and I really hope you are more successful
than within the limitations and issues of academia.

PS: Just out of curiosity, what were you interested in, regarding your IO
research?

~~~
zissou
>>> Some advisors don't even care if you are coding in Python or in stone
tables, as long as you "have three stars in your results table".

That is beautiful. It perfectly describes the situation!

I still to this day have scripts that collect data on a variety of markets,
but I tend to favor applications to platforms and multi-sided markets. My main
one was mobile apps and app stores, but I also am researching video games,
hotels, desktop CPUs, to name a few. The questions that interest me most are
antitrust problems and finding creative ways to measure competition in an
Internet market. I also enjoy IO theory, with my absolute favorite topic being
low-price guarantees. To this day, I'm still digging for a creative way to do
an empirical study on low-price guarantees to test a theoretical hypothesis I
wrote a paper about in an IO theory course I took in the 1st semester of my
PhD (was the only 1st year student to take a field course in year 1).

In the first semester of my 2nd year of my PhD (left after year 2), I got my
solo work into a conference held jointly by Harvard and MIT called the "First
Cambridge Area Economics and Computation Day", but my advisors didn't seem
to care much. The conference was awesome and my work got a lot of attention
there, including a long talk with the Chief Economist at Microsoft.

~~~
zzleeper
Trying to understand how firms compete in the android store actually sounds
like a great topic. Of course, we can only observe--at most--prices and sales
per product, but it would be nice to think of how firms compete between each
other and their reactions. My guess is that the biggest problem would be to
find an identification strategy, like a change in the structure of the app
store, so we can exogenize our regressors.

BTW, great job with AppNash, it looks very interesting!

~~~
zissou
If you want to talk more about the econometric approach, feel free to email
me. Would love to chat more. If you discovered AppNash, then you can also
figure out my email :). FWIW, your point about finding exogenous changes is
right on (model = 2SLS), however getting the quantity sold data is not a
simple procedure -- it's a matter of converting sales ranks to quantities.
Without giving away all the secrets in public, here's a classic paper that
guided some of my approach:
[http://www.business.illinois.edu/finance/papers/2003/chevali...](http://www.business.illinois.edu/finance/papers/2003/chevalier.pdf)
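
For readers unfamiliar with the idea of converting sales ranks to quantities, here is a rough sketch under one stated assumption: that quantity falls off with rank as a power law, `qty = scale * rank ** (-exponent)`, a common assumption in this literature. The commenter does not disclose the actual procedure, so the function names and the two calibration points below are hypothetical.

```python
import math


def fit_power_law(rank_a, qty_a, rank_b, qty_b):
    """Solve qty = scale * rank ** (-exponent) through two known (rank, qty) points."""
    exponent = math.log(qty_a / qty_b) / math.log(rank_b / rank_a)
    scale = qty_a * rank_a ** exponent
    return scale, exponent


def rank_to_quantity(rank, scale, exponent):
    """Estimate quantity sold for a product at the given sales rank."""
    return scale * rank ** (-exponent)


# Hypothetical calibration: rank 1 sells 10,000 units, rank 100 sells 100 units.
scale, exponent = fit_power_law(1, 10000.0, 100, 100.0)
estimate = rank_to_quantity(10, scale, exponent)
print(scale, exponent, estimate)
```

In practice the calibration points would have to come from products whose true sales are known, and more than two points would be fit by regression on the logs rather than solved exactly.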

The original intent for my dissertation was the following: Chapter 1: Describe
the theoretical framework of how to set up a system to automate the collection
of data from an entire market (enabling the safe use of population
estimators); Chapter 2: Apply that theory to analyze the app store data to
test various firm- and product-level questions about the structure of
strategies taken by firms (developers) in the app stores; Chapter 3: Another
application of the theory using higher frequency data from the Internet
(probably hotels, since hotels change their rates at the minutely/hourly level
quite often, and regional hotel markets provide very well defined markets so
new entry takes a while and is thus easy to account for).

------
Irishsteve
What exactly do people get out of all these news aggregators? I'd imagine the
front page of reddit or at least a localised version is more interesting than a
personalised version.

When does it go from news to infotainment?

And if that's not the case why would someone think something beyond simple
facebook like or twitter comment extractions is a big deal? I've seen lots of
papers where representing a web page and a user in a vector space based on
TFIDF performs well.
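
As an illustration of the vector-space approach mentioned above, here is a minimal sketch that represents pages and a user as TF-IDF vectors and ranks pages by cosine similarity. It is an assumption-laden toy, not any specific paper's method: the whitespace tokenizer, the smoothed IDF formula, the "average of liked pages" user profile, and the sample pages are all choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF: terms found in every document still get a small positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def user_profile(liked_vectors):
    """Represent a user as the average of the vectors of pages they liked."""
    profile = Counter()
    for vec in liked_vectors:
        for term, weight in vec.items():
            profile[term] += weight / len(liked_vectors)
    return dict(profile)


pages = [
    "python scraping with requests and lxml",
    "pandas tips for data analysis in python",
    "celebrity gossip from the red carpet",
]
vecs = tfidf_vectors(pages)
profile = user_profile(vecs[:1])  # hypothetical user who liked only the first page
scores = [cosine(profile, v) for v in vecs]
print(scores)
```

The second page outranks the third for this user only because it shares the term "python" with the liked page, which shows both the appeal and the limits of a purely lexical representation.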

I've even seen Yahoo research post slides about using some form of random
bucket for news recommendation because the novelty of newer / stranger
articles improves ad click throughs.

Then again did this article come out the same time as the PR release about
funding for their company? If that's the case is this just a wide scale PR
initiative?

------
yawgmoth
I signed up to see how accurate it could be - the interests outlined in the
e-mail I received seem pretty accurate but the stories that are being
displayed have nothing to do with those topics. There was a dearth of
interesting reading material.

------
sbashyal
Having spent several years doing academic research before finally leaving to
work on real-world problems in machine learning and NLP, I can relate to this
article.

If any of you have problems in this domain, I am interested to chat.

------
galois198
I was wondering, is it possible for individuals with no formal ML background
to implement algorithms with as much subtlety as Prismatic?

~~~
aria
Building Prismatic is way more than just about having a formal ML background.
It's that along with large-scale systems skills and having a sixth sense for
working with data (text especially) and knowing what simple ideas will work
and what needs to be complicated.

So to answer your question yes, you need a formal ML background but you need a
lot else. Luckily, you can pick up all these skills from online courses, real
world building, and a lot of self study and improvement.

------
hna0002
<http://techcrunch.com/2012/12/05/prismatic/>

------
HalcyonicStorm
I love that the name is a play on the amazing movie "A Funny Thing Happened On
the Way to the Forum" +1

