
I wrote http://www.arxiv-sanity.com/ (the code is open source on GitHub: https://github.com/karpathy/arxiv-sanity-preserver) as a side project to mitigate the problem of finding the newest relevant work in an area (among many related problems, such as finding similar papers or seeing what others are reading). It sees a steady few hundred users every day and has a few thousand accounts. It's designed around modular views of lists of arXiv papers, each view supporting a use case. I'm always eager to hear feedback on how people use the site, what could be improved, or what other use cases could be added.



Andrej, thank you very much for making this site. I use it every day.

A problem: I think one of the most needed things missing from arXiv.org is comments. People just come, read, and then take their discussions somewhere else, fragmented all around the net. Arxiv-Sanity already filters just the ML articles and does personalized feeds; maybe it could also be a place for discussion. I know it potentially leads to other complications (like moderation), but I really think readers would benefit from reviews, questions and answers.

The current ML-related discussion sites (blogs, /r/machinelearning, G+, Twitter, StackExchange and YC) are often mixed with lots of noise. I'd like to read what researchers think.

Another suggestion: add links to code repositories, where they are available. Maybe some of your trusted users could be empowered with the right to add such links, if it's too much work for a single person. If interesting discussions are reported on other pages on the internet, they could also be added to the article, to make them easier to find.


A simple-ish way of subsidizing some of that effort is to just make a subreddit for arxiv submissions and link to the comments section from arxiv-sanity for a given paper. You still don't tie into other communities, but if someone has something to say about a particular paper, it provides a straightforward mechanism (until the, what, 6-month mark, at which point the submission is archived and can't be voted or commented on any further). You only need a couple of moderators and some strict rules (an automoderator rule to only allow submissions from the arxiv-sanity user, etc).


If you want links to code repositories for each of the papers, there is already a project for that: http://www.gitxiv.com/. It also has a comments section. Maybe the two maintainers could work together to integrate the projects. I actually subscribe to the GitXiv mailing lists as well, since they send a list of top articles under particular categories.


Thanks! The option to contribute additional links would be a great feature.

As for discussions about papers, there are plans (semi-related to arxiv-sanity) in motion to do that well and correctly, and not just from me. I think we'll see a big delta here over the coming months.


What about a Gitter equivalent for each paper, with logs? That's one way I envision papers and conversation threads being linked. Each paper would be a different channel. Maybe there are topic channels too.


For me, getting alerted when new papers cite papers that are relevant to my current research topic would be ideal. Google Scholar has alerts on authors and search queries, but for me they don't have enough recall.

It's much easier to tell whether a paper is relevant for me if it happens to cite 3 of the commonly used datasets for my particular task.

btw I use arxiv-sanity, it's pretty great, thanks a lot!
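
The citation-overlap rule in the comment above could be sketched roughly as follows. This is only an illustration: the identifiers, the `new_papers` mapping, and the threshold of 3 are hypothetical, and it assumes each new paper's reference list has already been extracted somehow (neither arXiv nor arxiv-sanity is claimed to provide this).

```python
def is_relevant(cited_ids, watched_ids, min_overlap=3):
    """Return True when a paper cites at least `min_overlap` watched works."""
    return len(set(cited_ids) & set(watched_ids)) >= min_overlap


def alerts(new_papers, watched_ids, min_overlap=3):
    """Return titles of new papers whose references overlap enough with the watch list."""
    return [
        title
        for title, cited in new_papers.items()
        if is_relevant(cited, watched_ids, min_overlap)
    ]


# Hypothetical identifiers for the datasets commonly used in one task.
watched = {"dataset-a", "dataset-b", "dataset-c", "dataset-d"}

# Hypothetical new papers mapped to the identifiers they cite.
new_papers = {
    "Paper X": ["dataset-a", "dataset-b", "dataset-c", "other-1"],
    "Paper Y": ["dataset-a", "other-2"],
}

flagged = alerts(new_papers, watched)
```

Lowering `min_overlap` trades precision for recall, which is the knob the comment is really asking for.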


Thanks! Email alerts are one of the top requested features; they're definitely on my shortlist for the next feature to incorporate.

Another feature I'd like to add is the ability to follow people, but I'm worried about the exact implementation, since the current assumed contract is that your library is private.

One more feature I hear about often, of course, is comments, but I'm afraid of the site disintegrating into YouTube comments. I think comments have to be done very carefully and would require significantly higher code complexity to incorporate moderation tools, etc. Tricky and non-trivial not just implementation-wise but design-wise, incentive-wise, etc.


Consider having a minimum number of characters or words for comments. It's basically the opposite of Twitter, and would result in people having to actually put some effort into their comments. Also, I've found that even on YouTube, if the comments are moderated, the degenerates stop showing up.
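
A minimum-length check like the one suggested above might look something like this. It is a sketch only: the function name and the default thresholds are made up for illustration, and nothing here reflects how arxiv-sanity actually handles input.

```python
def meets_minimum(comment, min_words=20, min_chars=0):
    """Return True when a comment has enough words and characters to be accepted."""
    text = comment.strip()
    # Whitespace-separated tokens are counted as words; surrounding whitespace is ignored.
    return len(text.split()) >= min_words and len(text) >= min_chars
```

A check like this only filters out drive-by one-liners; it does nothing about long low-quality comments, so it would complement moderation rather than replace it.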


Draft help from your readers. There have to be a few who want to contribute.


I also use some homemade code to keep up with new papers.

I feed in a .bib file with papers I like and use a Naive Bayes classifier to find papers I might like in news feeds (Science, Nature, PNAS, etc).

It works pretty well. As a bonus, you can post highly ranked papers to Slack, or use papers sent to me by other people to repopulate the .bib file.

Always welcoming suggestions: https://github.com/pfdamasceno/shakespeare
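
For readers unfamiliar with the approach, here is a minimal sketch of ranking feed items with a Naive Bayes classifier. It is not the code from the linked repository. It assumes titles have already been pulled out of the .bib file and the feeds, and it assumes a second set of "skip" titles exists as negative examples (the comment does not say where those come from). All titles and names below are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        # Words never seen in training carry no information and are dropped.
        words = [w for w in tokenize(text) if w in self.vocab]
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data: titles standing in for the .bib entries,
# plus assumed negative examples.
liked = [
    "Deep convolutional networks for image recognition",
    "Recurrent neural networks for language modeling",
]
skipped = [
    "Protein folding dynamics in yeast cells",
    "Soil microbiome diversity in alpine meadows",
]
model = NaiveBayes().fit(liked + skipped, ["like"] * 2 + ["skip"] * 2)

# Hypothetical feed items, ranked by how strongly they lean toward "like".
feed = [
    "Microbiome diversity in yeast",
    "Convolutional networks for video recognition",
]


def like_margin(title):
    scores = model.log_scores(title)
    return scores["like"] - scores["skip"]


ranked = sorted(feed, key=like_margin, reverse=True)
```

Working in log space avoids numerical underflow on long abstracts, and the add-one smoothing keeps a single unseen word from zeroing out a class.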


not exactly the same thing, but http://www.gitxiv.com/ is pretty cool for pairing papers with source


Wouldn't you miss a lot of important publications when just checking arxiv?


In AI, all the important publications are posted on arxiv first. If a paper is sent to a journal before arxiv, it is clear the authors believe their paper is not significant enough to alert the community.

The publishing culture from the life sciences is toxic and will be avoided by the AI community.


Thanks. This is an interesting approach to getting what's needed.


it is very helpful



