

Ask HN: An academic web app idea, anyone interested? - alimoeeny

Problem:
As an academic I am overwhelmed by the number of articles published in the relevant journals every week, and there is no "efficient" way to filter out the irrelevant ones and keep up with the ongoing progress in my field of research.

Solution:
All good journals have RSS feeds, and in the context of academic interests you can "easily" detect relevant papers based on author, keywords and affiliation (one might eventually add some social features as well, but it would definitely work based on individual ratings without any social features).
So a user would select some journals and start by starring the relevant papers or authors or ...
After a while there would be enough evidence accumulated to support the classification of new papers into relevant or irrelevant groups.

Also there would be room for all sorts of alerts and sharing and commenting and ...

I (and I am sure many others) have had this idea for a long time, but I am currently not in a position to do this myself, and other people who attempted it have done it wrong. I was wondering if there is anyone who wants to do this (I would be happy to help in design or even implementation, or I'd also be happy if anyone takes the idea and builds their own tool without getting me involved).

Ali
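
To make the "accumulated evidence" idea concrete, here is a minimal sketch of how starred and skipped papers could feed a relevance score. It is only one possible approach (a simplified Naive Bayes-style log-odds score), not a tested design. The `RelevanceModel` class, the `features` helper, the dictionary layout of a paper, and the sample names are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def features(paper):
    """Turn a paper's metadata into a set of 'kind:value' strings (hypothetical layout)."""
    feats = set()
    for kind in ("authors", "keywords", "affiliations"):
        for value in paper.get(kind, []):
            feats.add(f"{kind}:{value.lower()}")
    return feats


class RelevanceModel:
    """Simplified Naive Bayes-style scorer trained on starred / skipped papers."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.totals = {True: 0, False: 0}

    def rate(self, paper, relevant):
        """Record one user rating (starred = True, skipped = False)."""
        self.totals[relevant] += 1
        self.counts[relevant].update(features(paper))

    def score(self, paper):
        """Log-odds that the paper is relevant; values above 0 lean relevant."""
        # Smoothed prior from how many papers were starred vs skipped.
        log_odds = math.log((self.totals[True] + 1) / (self.totals[False] + 1))
        # Only features present on the paper contribute (a simplification).
        for feat in features(paper):
            p_rel = (self.counts[True][feat] + 1) / (self.totals[True] + 2)
            p_irr = (self.counts[False][feat] + 1) / (self.totals[False] + 2)
            log_odds += math.log(p_rel / p_irr)
        return log_odds


# Made-up ratings and papers, purely for illustration.
model = RelevanceModel()
starred = {"authors": ["A. Researcher"], "keywords": ["visual cortex"]}
skipped = {"authors": ["B. Other"], "keywords": ["soil chemistry"]}
model.rate(starred, relevant=True)
model.rate(skipped, relevant=False)

new_paper = {"authors": ["A. Researcher"], "keywords": ["attention"]}
is_relevant = model.score(new_paper) > 0
```

With only two ratings the score is driven entirely by the shared author, which is the point: the model gets more useful as ratings accumulate. Titles, abstracts, or social signals could be added as further feature kinds.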
======
sunir
Hey Ali and others who are interested,

I've been working towards launching a startup called Bibdex
(<http://www.bibdex.com>) for a little while now. I want to help all research-
oriented fields to sift through the large number of publications.

It's amazing how many articles and manuscripts are published every day, not to
mention blog posts and social commentary. Unlike the average person's surfing
habits online, professionals in these fields are required to be on top of the
available research, but most of the available research is not very good. That
seems like a problem worth solving, and a potentially profitable one if
industrial and institutional researchers are willing to pay to save their time
and publish higher quality research.

I'm curious. Have you seen <http://www.mendeley.com> yet? If you have, why
doesn't that solve the problem for you?

\-- Sunir, Founder & Developer, Bibdex

P.S. I'm interested in having other people join as co-founders.

~~~
jsarch
Sunir,

If you are not heavily invested in the name "Bibdex" for your company, you
might want to look at changing the name because "BibDesk"
(<http://bibdesk.sourceforge.net/>) already exists as a usable product for
organizing publications. Specifically, the problem is that I would
type/write/search for "bibdesk" when I hear "bibdex" and be rewarded by a
product in the same market.

-J

~~~
sunir
Yes, thanks for the feedback. As with all things, good names are in short
supply. I've learnt that it's a very common thing for grad students to build
citation management software, and I fell into that trap as well.

------
emilis_info
I am developing an open-source webapp with similar requirements (noticing
relevant parliament/government news in Lithuania).

It can index more than RSS feeds (you know, government websites ;-)) and uses
Lucene/Solr for searching documents.

Our website: <http://kaveikiavaldzia.lt/> Source code:
<http://github.com/emilis/PolicyFeed>

I would gladly answer any questions.

------
adbge
It sounds to me like maybe you should go one further than just creating a
superb academic RSS reader, and instead try to create an online community for
academics -- a la reddit for PhDs.

Instead of simply indexing links to other sites, I think it would make sense
to provide the necessary tools for academics to _easily_ host papers via this
service. This way you wouldn't need to rely on typically paywalled online
academic journals, and you wouldn't have to wait for papers to trickle down
from Nature, etc.

~~~
barmstrong
Yep - do some crowd sourcing like HN for academics.

~~~
alimoeeny
You all know, things like HN do not happen that often, especially in a world
ruled by old professors. To build an HN for academics, first you need to be
field specific (really narrowly specific!) and you need to be well known or
supported in that community (this would make you an old professor!)

~~~
stoney
Don't start with the professors - start at the bottom and work up. Find a
large-ish existing community of PhD students (e.g. all students at a certain
lab) and build up from there.

------
equark
This needs to be done well, I agree. Everybody I know in academia has had this
idea, so there is demand. However, journals are far too late. In many
disciplines everything is shared a year in advance via working papers. These
are often posted on authors' webpages or on conference pages before anywhere
with a formal RSS feed. I want to subscribe to specific authors and be
confident I get things first.

This will be hard to do well and while the CPM might be great for some
categories, it's a niche product. It would be a great service though.

~~~
alimoeeny
Not all fields are like that, and you don't need to be on the bleeding edge
in every area. For many purposes, keeping up to date at the journal-paper
level is more than enough.

~~~
equark
True, but that isn't really that hard to do now. I can basically keep up to
date by subscribing to top journals in my field.

------
boskone
Well here is the rub of it. Are you willing to pay a monthly subscription fee?
How much?

~~~
_corbett
If you could convince labs themselves to pay for it, it'd be a money maker,
although this may require more direct marketing at first. Many academic
institutions hold subscriptions/licenses in the $10k range.

~~~
alimoeeny
this would actually boost education and research performance and I am sure
with the correct product and correct marketing could make good (!?) money.

~~~
gsaines
I wouldn't bet on it. Making money in the academic institutional licensing
world is tough, especially if you are marketing something that doesn't impact
bottom-line figures for people who are concerned. Arguably this use case is as
far removed from bottom line value adds as possible. In an optimal world you
have a killer app that reduces University lead COA by 30% or something, but
this would have to grow organically (so a lot of direct sales approaches are
no good).

A friend of mine has been doing a lot of direct sales (quite successfully I
might add) to institutions, but the reason he's making money is that his
target audience is already doing a lot of number crunching, they have big
budgets, and he's directly saving them cash. Something like this is probably a
"nice to have" feature that would be near impossible to convince the school to
pay for, to speak nothing of a worthy price tag (greater than $2k/yr for
instance).

It sounds like a fun project, but I would not expect it to pay much, if
anything. And an ad model's monthly revenue would be a joke.

------
crazyjimbo
I've looked into this problem briefly myself, and as a PhD student who runs an
academically oriented start-up/web app, it's something I'd be very interested
in seeing happen. In principle it should be quite easy to get working, at
least for a few select journals (in my case it's the arXiv I'm interested in).

However, the one stumbling block that I see is an easy and effective way to
extract citations. An important filter for whether a paper is relevant to me
would be whether it cites some seminal papers in my field. There are services
out there which index citations (SPIRES in my field[1]) but there is always a
bit of a delay before they are updated. Using them would ruin the real-time
nature of the service.

The only way I see around that is to extract the citations from the papers in
the journal feed as they come in. But now the service has gone from a weekend
project to a large undertaking that would require a substantial infrastructure
to work.
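
As a rough illustration of the citation filter described above, here is a small sketch that checks whether a paper's reference list mentions any arXiv identifiers from a personal "seminal papers" list. It assumes the reference text has already been extracted from the paper, which is the hard part. The function names and sample identifiers are hypothetical, and the pattern will also match unrelated numbers that happen to look like arXiv IDs.

```python
import re

# New-style arXiv IDs (e.g. 0704.0001, 1501.00001) and old-style ones
# (e.g. hep-th/9711200), with an optional version suffix such as "v2".
ARXIV_ID = re.compile(
    r"\b(\d{4}\.\d{4,5}|[a-z-]+(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?\b"
)


def cited_arxiv_ids(reference_text):
    """Return the set of arXiv IDs (version suffix dropped) found in the text."""
    return set(ARXIV_ID.findall(reference_text))


def cites_seminal(reference_text, seminal_ids):
    """True if the reference text mentions at least one ID from seminal_ids."""
    return bool(cited_arxiv_ids(reference_text) & set(seminal_ids))


# Sample reference text, for illustration only.
refs = "[1] arXiv:hep-th/9711200 [2] arXiv:1501.00001v2"
found = cited_arxiv_ids(refs)
hit = cites_seminal(refs, {"hep-th/9711200"})
miss = cites_seminal(refs, {"0704.0001"})
```

This only catches references that carry an arXiv identifier; citations given purely as journal references would need title or DOI matching, which is where the larger undertaking begins.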

I'll see how this thread continues to evolve, but I'd like to have further
discussions about making this a reality.

[1] <http://www.slac.stanford.edu/spires/>

------
delano
Whatever you do, don't reinvent the wheel. Build on top of a service like
<http://superfeedr.com/> or <http://www.postrank.com/>

------
alimoeeny
Wow! I am speechless at the moment!

Thanks for all the comments and suggestions.

Go on guys!

I am checking the websites and products mentioned.

At least now we know for sure that everybody is having this problem AND there
is no popular solution to this.

By the way, I am in Brain research.

<http://www.yourxiv.com> at least explains the problem and promises a solution
but it only covers Physicists.

<http://www.academia.edu/> seems to cover everyone (I am giving it a try right
now) but they are overdoing the social part. (too much facebook!)

<http://pubget.com/> is not addressing this problem (as far as I understand)

<http://techlens.cs.umn.edu/> TechLens also addresses this problem but it
seems not to cover everybody (all research areas).

Many other things mentioned are not related or are addressing different
problems.

Still reading your comments, Ali

~~~
nuitblanche
Techlens does not cover preprints!

------
bioinformatics
Isn't that something that Mendeley is trying to do? I would be interested in
seeing something like this and helping when and where needed.

~~~
_corbett
I've used Mendeley (and use papers.app when I'm on a mac) and it has helped me
more with organizing papers than in filtering them. What you see as an
academic in the morning is 100 papers staring you in the face and you fall
back on pattern recognition at points if you don't have more than a few
minutes. Whose names and institutions do you recognize? Which titles seem most
similar to your line of research? But you can miss a lot of interesting things
and not view things tangentially related, but possibly useful as often.
Journal clubs attempt to mediate this by having a group of people each pick an
article and bring it to the table but crowdsourcing this and having cross lab
"bringing to the table" would be very useful.

~~~
jcruz
I started JournalFire so journal clubs could post what they were reading
online. You can follow clubs or individuals, and the articles they're reading
or discussing will show up in your feed. Let me know what you think
<http://journalfire.com>

~~~
borga
I'm using it and trying to setup our journal club through it. Nice tool.

------
_corbett
I've faced the same problem with info overload, even with the new astrophysics
subfields introduced by arXiv. I am an active user of papers.app and have used
Mendeley in the past; it helps with organizing but filtering is still a
challenge.

I've thought about doing something like this for the arXiv, even something as
simple as setting up a <http://www.pligg.com/> may be a start; I'm planning to
do at least that much for my own lab in the next few months.

I've already got two other startups of my own so I couldn't commit to a larger
version of this side project at this point, but ping me if you want to chat
about ideas and I could possibly point you to others with more bandwidth. I
have experience in iPhone app development, web dev, and machine learning--and
of course in startup founding and academia (I've worked in 7 labs doing
research, now am a PhD student in astrophysics).

------
nuitblanche
Ali,

I have some experience in the process of doing the filtering you are
mentioning as I write a blog on compressed sensing
(<http://nuit-blanche.blogspot.com/>), a field that spans a wide range of
communities that are, often, not talking to each other nor have the same
culture.

I don't think Mendeley or any RSS feed services help much as they both target
the same dead tree people who are generally about two years behind on the good
stuff. Arxiv is good but not all the communities use arxiv either and so it
really is a mix of different tools that yield the most results.

I have a high interest in expanding and automating my current processes and
would be very much willing to talk to somebody who would want to spend some
time on this.

I also agree with one of the posters, how much do you think people would be
willing to pay for this service ?

~~~
nl
I thought you meant you were using compressed sensing powered filters, maybe
working on the summaries of papers from an RSS file.

That would be VERY interesting (and - dare I say it - probably impossible)

------
danielrhodes
<http://academia.edu/>

~~~
SkyMarshal
Was going to post this too. Seems designed precisely for what the OP wants.

~~~
alimoeeny
Yes, it claims so, but so far I can't see how it does it. It is full of social
features but seems to lack the intelligent filtering thing we need.

------
hooeezit
Looks like what you want is a version of this:
<http://techlens.cs.umn.edu/tl3/>

Read an article about Techlens on ACM here:
http://portal.acm.org/citation.cfm?id=1297268&dl=GUIDE&#...

~~~
Elite
The ACM abstract gave a good overview of what the site was supposed to do, but
when I went to the actual site, it was horribly designed with no clear
description of what the service was supposed to do.

Maybe if the OP gave a very well designed application using existing
implementations it would take off.

------
organicgrant
There is definitely a need for this.

@kaelswanson had this idea 4 years ago (and I'm sure many others)

BUILD IT

------
grinich
Have you heard of PubGet?

<http://www.PubGet.com>

~~~
sunir
I haven't heard of it yet. Have you been using it? What do you like / dislike
about it?

------
fnazeeri
I think you could construct this in a day using Yahoo Pipes.
<http://pipes.yahoo.com/pipes/> It doesn't require programming skills.

------
stoney
Mendeley has been mentioned a few times already. I'll just point out that it
has some kind of API. I have no idea what it enables, I've been meaning to
look into it, but it might be useful in terms of getting citation information
for new papers, etc. In other words you might be able to build the system you
want on top of Mendeley.

~~~
MrGunn
Hi alimoeeny, et al., I'm the community liaison for Mendeley and would like to
make myself available for any questions you may have if you decide to start
building with the Mendeley API. You can indeed get citation information from
it, as well as tags for finding related research, readership, and article
metadata. The developers portal is <http://dev.mendeley.com>.

Additionally, on article pages on the main site, for example,
<http://www.mendeley.com/research-papers/search/#0/abstract:recommender+systems+discipline:computer-and-information-science>
you can see a "more like this" link which finds related research.

The combination of article recommendations and collaborative filtering
improves discovery, so if you want to build something, I would of course
suggest that Mendeley is a good place to start.

------
sz
Check out <http://www.yourxiv.com>

------
wybo
Have you had a look at PhilPapers: <http://philpapers.org/> ? It does exactly
what you propose, but only for philosophy.

------
Jerpi
This problem isn't unique to academia. If I were you I'd create a few Pipes
(Yahoo Pipes) and then use Google Reader as a front-end that supports sharing
and commenting.

~~~
alimoeeny
I agree. I am surprised that Google Scholar has not made such a thing (maybe
as a Google Reader feature).

~~~
nl
What exactly are RSS feeds from Google Scholar missing? (e.g.
<http://nsaunders.wordpress.com/2010/06/17/create-your-own-google-scholar-rss-feed/>)

------
go4ajeet
There is one website working on these lines <http://www.NanoInfoline.com/>

------
tmcw
Don't most professors pay students to do this?

------
T_S_
Have you tried Google Scholar Alerts? Works well for me. If that doesn't suit,
consider it as a baseline for functionality.

~~~
alimoeeny
I actually use Pubmed to get notifications on people or keywords that I am
interested in. And I should confess that it is not that bad (but it could be
much much better)

------
psyklic
<http://journalfire.com/>

------
benjaminlotan
i want something like this. :-)

------
korch
I'll volunteer 10-15 hours a week to build a recommendation engine like this!

I too have wanted a search-engine/feed-reader, no-holds-barred web-experiment
that does exactly what you describe. I'm not an academic, but I have fairly
deep & broad reading interests across mathematics, physics, economics,
history, etc. Ok, make that broad interests in basically every subject that is
truly interesting. I'm the guy who pretty much reads journal articles for fun
because newspapers, magazines, blogs, and most books suck and just aren't
smart enough.

But it's more work for me to find reading material I like and want to learn
from than it is for me to read it.

I could certainly hack around on something like this, as I too have been
wanting to _scratch this itch_ for many years. I've got the time, since I'm
currently un(der)-employed (ahh, gotta love contracting) and bored, but my
first two roadblocks would be:

1) instant non-starter: I no longer have any university library proxy account
to use to login and scrape the journals, abstracts, authors, citations, etc. I
personally could care less about copyrights on academic papers. _I justify my
position via jury nullification using a jury of one—me._ And I don't give a
second thought to doing whatever dirty work it takes for building a system of
scraping scripts to pull all info about every paper from every journal out
there.

2) If the beta site gets even the smallest amount of traffic, I will be
crushed by EC2 server and bandwidth bills.

I realize a bajillion other folks probably want to build the exact same thing,
and I realize it's stupid to build your own when someone else might already be
well along the way and just needs some help.

Anyone need a Linux/Ruby/SQL/Sysadmin/Wear-All-Hats-Get-Shit-Done developer?
I'm in LA and on #freenode. Just say the word!

