
Best of Arxiv.org for AI, Machine Learning, and Deep Learning – January 2019 - ghosthamlet
https://insidebigdata.com/2019/02/20/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-january-2019/
======
Cynddl
I see more and more posts on HN that rely on arXiv for scientific articles.
While many researchers in CS publish their work on preprint servers, these are
(a) only one part of the scientific literature out there and (b) often not
(yet) peer reviewed.

I get that arXiv is a convenient way to download scientific papers, especially
for those without university or library access, but one should always be
careful with non-peer-reviewed research.

~~~
mlevental
what a weird criticism. first of all, in CS/math/physics I would wager >90% of
publications are on arxiv as preprints before they're accepted into
journals/conferences. so while it's only "one part," it's definitely the
biggest part. second of all, what should one be careful about wrt pre-review
CS/math papers? that you implement some algo that doesn't work? this isn't
cancer research that informs treatment regimens, nor social science that
informs public policy. it's code and theorems (please, no imaginative
extrapolation to algos that encode racism or something like that as a very,
very low-likelihood risk).

but I agree with you that it's unfortunate that not all papers are open. good
thing there's sci-hub.tw and libgen.io though :)

~~~
throwawaymath
It's not that cut and dried. Only a small subset of arXiv preprints ever makes
it into a journal or conference proceedings.

On balance it's good that more researchers these days will publish preprints
before submitting for peer review. But there are still two specific dangers
with relying on preprints:

1. Most people reading papers don't try to implement or test them. Unless
there is a glaring error, it's hard to tell whether a paper is critically
incorrect without a lot of effort.

2. Even if most people _did_ try to implement papers, that would still be a
poor heuristic for a paper's merit. In an ideal world every valid paper could
be implemented. But even in conference proceedings, it's extremely common for
papers to be missing details critical to their implementation. In many cases
you _can't_ do the implementation because it requires a vast amount of
computation or proprietary data available only to the company whose
researchers wrote the paper.

~~~
yorwba
Peer review doesn't really address those problems, though. Reviewers are just
people reading papers; if it's hard to reproduce a paper's results, they can't
verify that they are correct. Peer review only helps to decide whether a paper
looks worthy of attention, and if you found a paper on arXiv, you probably
already have some other reason to think that it might be worth looking at.

If the paper comes with code you can simply run to reproduce the results,
that's a stronger signal of correctness than whether it went through peer
review or not.

~~~
throwawaymath
_> If the paper comes with code you can simply run to reproduce the results,
that's a stronger signal of correctness than whether it went through peer
review or not._

1. Most peer-reviewed papers do not come with code.

2. Many that do come with code don't work out of the box.

~~~
yorwba
Exactly. That's why it's a stronger signal.

------
pmoriarty
For AI- and CS-related papers, I've long used CiteSeerX[1]. I never use ArXiv,
and would be interested to hear a comparison.

[1] - [http://citeseerx.ist.psu.edu](http://citeseerx.ist.psu.edu)

~~~
mooneater
I just searched CiteSeerX for recent AI topics; the results were super stale,
with nothing on important recent advances.

arXiv is where it's at.
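For anyone who wants to run this comparison themselves, arXiv has a public
Atom-feed API at `http://export.arxiv.org/api/query`. A minimal sketch of
querying it for the newest submissions on a topic, using only the standard
library (the search term and result count here are just illustrative):

```python
# Sketch: query arXiv's public Atom API for recent papers on a topic.
# Endpoint and parameters follow arXiv's export API; the search term
# "deep learning" is only an example.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace

def build_query_url(terms, max_results=5):
    """Build an arXiv API query URL, newest submissions first."""
    params = urllib.parse.urlencode({
        "search_query": f"all:{terms}",
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    })
    return f"{API}?{params}"

def search_arxiv(terms, max_results=5):
    """Fetch the Atom feed and return (title, link) pairs. Needs network."""
    with urllib.request.urlopen(build_query_url(terms, max_results)) as resp:
        root = ET.fromstring(resp.read())
    return [
        (entry.findtext(f"{ATOM}title", "").strip(),
         entry.findtext(f"{ATOM}id", "").strip())
        for entry in root.iter(f"{ATOM}entry")
    ]

# usage (requires network access):
#   for title, link in search_arxiv("deep learning"):
#       print(title, link)
```

CiteSeerX doesn't offer an equivalently simple public search API, which is
part of why arXiv tends to win for this kind of quick topical check.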

