
ArXiv preprint server plans multimillion-dollar overhaul - micaeloliveira
http://www.nature.com/news/arxiv-preprint-server-plans-multimillion-dollar-overhaul-1.20181?WT.mc_id=TWT_NatureNews
======
karpathy
I developed and maintain Arxiv Sanity Preserver
([http://www.arxiv-sanity.com/](http://www.arxiv-sanity.com/)), one of the
arXiv overlays the article mentions. I built it to try to address some of the
pains that the "raw" arXiv introduces, such as being flooded by paper
submissions without any support or tools for sifting through them.

I'm torn on how arXiv should proceed in becoming more complex. I support what
seems to be the cited poll consensus ("The message was more or less ‘stay
focused on the basic dissemination task, and don’t get distracted by getting
overextended or going commercial’") and I think the simplicity/rawness of
arXiv was partly what made it succeed, but there is also a clear value
proposition offered by more advanced search/filter/recommendation tools like
Arxiv Sanity Preserver. It's not clear to me to what extent arXiv should
strive to develop these kinds of features internally.

Whether they go a simple or more complex route, I really hope that they keep
their API open and allow 3rd party developers such as myself to explore new
ways of making the arXiv repository useful to researchers. Somewhat
disappointingly, the arXiv poll they ran did not include any mention of their
API, which in my opinion is critical, overlooked and somehow undervalued.
For example, today their rate limits are very aggressive and make it tricky to
pull down publication metadata for Arxiv Sanity Preserver, even when this
undoubtedly costs minimal bandwidth. In the future, I'm concerned they will
discontinue the API altogether and prevent similar 3rd party overlays from
being built.

~~~
visarga
Hi Andrej, first of all let me thank you for creating my preferred interface to
arXiv.

Do you select papers to be displayed on arXiv-sanity by hand or automatically?
Does manual selection explain why there are sometimes 2-3 days with no
publications, and then suddenly a batch of papers?

~~~
karpathy
I don't select anything by hand; it's listed by date, as it comes from the
arXiv API. The 2-day lags are due to weekends, when arXiv does not update.

------
stared
There are some arXiv overlays for voting and comments, like:
[https://scirate.com/](https://scirate.com/) (which didn't catch on for some
reason).

While I was initially for having arXiv with more features, now I appreciate
this Unix philosophy of "doing one thing, but doing it well". Various voting,
comment, recommendation or accreditation systems may catch on or not. So I am
all in favor of providing separate services (even if by the same team),
communicating via API.

~~~
jessriedel
The major issue is just that arXiv may be able to gather a critical mass in
a way that third parties like SciRate can't. Also, if a third party _did_ get
a foothold, it's possible they could go bad (in a way that's less likely for
the arXiv), and the community would be stuck.

Still, all things considered I agree that I'd rather arXiv not take this on.

------
j2kun
Readers commenting on papers would be a welcome addition. Hopefully, with a
reputation system to go with it.

~~~
santaclaus
I would _love_ to have some kind of Stack Overflow-type system for commenting
on papers. If you find a bug in some paper, chances are someone else will
benefit from the knowledge. Oh, the authors didn't report some key step in the
algorithm and you figure it out? Pay it forward and leave a comment!
Crowdsourced errata, if you will.

Hell, integrate with GitHub so you can link to implementations, etc.

~~~
fuzzythinker
[http://gitxiv.com/](http://gitxiv.com/)

------
slater
I just hope they keep it Lucida Grande :)

