
Let’s Publish Everything - luu
https://statmodeling.stat.columbia.edu/2019/05/29/lets-publish-everything/
======
dwheeler
The lede got buried here. This article does not just propose "let's publish
everything", it proposes "let's publish everything and have endorsement as a
completely separate step." I think that is a far more rational approach than
the current model. The current model assumes that the only way to distribute
information is by printing on paper. That is clearly no longer the case; in
fact, it is a laughable assumption. So let's get the papers published online,
and then separately have discussions leading to various kinds of endorsements.
Or not, depending on the strength of the arguments.

------
rdiddly
The same thing is trying to happen in scientific publication that's been
happening in every other published medium: scarcity is gone. So why not
embrace the abundance? Well if there's any contrary case to be made, it's that
the explosion in quantity inevitably reduces the average quality. In the arts,
low quality corresponds to that ineffable quality of _shittiness_. In news,
it's fakeness. In science, it's irreproducibility. So what do you do? All the
former built-in scarcity-based ways of filtering for quality went out the
window along with the scarcity itself. You can no longer use the limited space
on physical sheets of papyrus (or vinyl LP records, or celluloid) as an excuse
to exclude things. So you need new ways of sorting out good quality work from
the (suddenly vast) quantities of mediocre work. Seems like you're doing
pretty well if you're lucky enough to have a whole new class of people step in
to help with the filtering part. Unless of course you believe a scientific
discovery is fundamentally the work of a person, an ego, rather than a fact
that existed long before that particular person happened to discover it.

Democratization ends elitism; unfortunately elites sometimes really do consist
of the best of a given thing. Especially newer elites. Every elite starts out
as a meritocracy and ends up some kind of weird legacy-based cabal/cartel that
deserves to be overthrown (that's the one you're probably accustomed to
thinking of when you hear the term "elitism").

~~~
petschge
(I agree with the parent poster, but this seems like the right place to add
the following.)

One thing that people who claim "the scarcity is gone, let's just publish
everything" miss is that there is still one scarcity: the time and attention
of researchers.

Let me expand a bit on that from my own field of studying reconnection in
astrophysical plasma. As you can tell from the description, it is not a well-
defined, closed topic. Instead there are tons of more or less adjacent fields:
The study of solar flares (which might be triggered by reconnection), the
study of coronal heating (the heat might be due to magnetic field getting
converted to heat by reconnection), astronomical observations of AGN jets
(reconnection might be what produces energetic particles there that we see in
the SED of those sources), observations of pulsar wind nebulae (the energetic
particles there might have been accelerated at the termination shock. or due
to reconnection), observations of giant pulses in (some) pulsars (that might
be due to reconnection at the Y point of the current sheet just outside the
light cylinder) and so on. On top of the related physical topics, I need to
stay on top of developments in the simulation method I use (particle-in-cell
codes) and in several related simulation methods (either because
improvements there might make them viable to study reconnection which wasn't
possible before, or because the neat new trick in method X might also improve
the characteristics in PiC codes).

If I wanted to I could spend easily 100 hours per week just reading papers.
But staying current with the field is just 5 or maybe 10 percent of my job. So
what I do is the following: Papers that fall close enough to my special topic
I will read. All of them. I will actually print them out, annotate them, go
over them with a fine-toothed comb. For many other papers I will read the
abstracts (10 to 50 each morning) to find the 1 or 2 papers that are worth
reading. And this is where selection by the editors and limitations to
publications come in. They rank important papers. I am much more likely to
read "a novel approach to plasma simulation" if it was accepted into the
Astrophysical Journal than if it just appears on arXiv, because ApJ does not
like pure code papers. So if it made the cut, 2 or 3 experts in the field
deemed it worth the time of the community.

Now that doesn't mean that all ArXiv papers are bad, that we should publish
less or anything like that. When I talk to a colleague and ask about a
technical detail, I am SOOOO happy when they say "it's in the arXiv paper ID
1906.bla". And on the other hand there is the notion of "it was in Nature but
it still might be right". Bottom line is:

Do not discount the sorting by topic (numerical vs observational vs
theoretical), impact ("here is one data point" vs "here is a completely new
approach") and quality ("I'm not too convinced, but maybe it gives somebody an
idea" vs "holy smokes how did we all miss that!") that journals provide. Any
better, future alternative needs this sorting and ranking. Just dumping it
onto the internet is not the solution.

~~~
marcus_holmes
Interesting. This seems to correlate almost exactly with discovery in music
streaming too. Given the torrent of new music available, how do you discover
the new music that you like?

At the moment, AI is doing a pretty bad job of this - my Spotify Discover
Weekly is an interesting listen, but I know it's not the best selection of new
music out there suited to my tastes (to be fair, it's not really trying to be
that, though). My "recommended" list on Netflix bores me. I get why they're
recommending them, but it's all things that I've seen and rejected from My
List, rather than things that I would find genuinely new and interesting.

I think this is the next big problem to solve, for everyone. The combination
of Discoverability (how do I get my music/paper/novel/art/film discovered by
the audience of people who will like it?) and Search (how do I find new
music/papers/novels/art/film that suit my tastes/research subject?).

~~~
isodude
Way back, it was always through word of mouth that one discovered things. I
still hold that true. You may have 10 friends who each have their niche, and
once in a while they recommend something they think you want to listen to.

I think the music matters more based on who recommended it. An algorithm
won't have the same effect. It's missing the story of how you ended up
watching movie x or listening to song y.

~~~
marcus_holmes
Really good point. There are a number of bands that I listened to (and ended
up liking) purely because of who told me about them.

~~~
isodude
I think the same applies to most stuff. Your idols influence you a lot. I see this
as the best reason to not rely purely on algorithms.

Producing playlists is also a sort of art.

------
lifeisstillgood
>>> Publishing in Psychological Science and PNAS has value because these
journals reject a lot of papers.

That seems to be the crux - scientific papers will have to fall into two camps
"blogs or basically deciphering the lab notes of everyone in your field" and
"look this is a real effect and worthy of your attention"

People seem to want the second without wading through the first - but I don't
think you can.

~~~
mirimir
For PNAS, I've heard, getting sponsored by a NAS member greatly increases
chances for publication. So it seems an odd example to cite for "value".

~~~
zwaps
It's more that NAS members can publish some articles basically for free, so
co-authoring with them is a good way to get in more easily.

That being said, NAS members I know do not want to erode their reputation by
publishing junk.

------
wsxcde
I am in sympathy with the author's goals, but the major problem with publishing
everything on arXiv-like forums is that (i) it becomes impossible to sort out
the good stuff from the nonsense produced by cranks, and (ii) it unfairly
advantages "high profile" groups.

Today, if I have a grad student interested in security who wants some ideas
for things to work on, I could ask them to go look up papers in Oakland, CCS
and NDSS over the last couple of years and see if anything catches their
fancy. This works because there's a reasonable number of papers that I could
reasonably expect a PhD student to look through. Asking them to look up all
the stuff that ends up on arXiv or the IACR's ePrints is not reasonable.
There's just too much stuff there, and most of it is not worth looking at.

So, you might say the endorsements will take care of this. CCS et al. could
just endorse some limited subset of the IACR ePrints. But this leads to
problem (ii) above. Right now, we have a bunch of high-profile researchers
(who are mostly at places like MIT and Stanford) who just put up stuff on
arXiv without any peer review, and they start picking up citations right away
because of their "elite" status. Some of this is deserved, because these
people have done good work. But some of this is also just a publication cartel
where everybody cites their friends' work, making it impossible for others to
break into a field.

The larger point is that when there are so many papers that nobody could
possibly look at all of them, a few groups will end up accumulating all the
citations and all the awards. This will be especially unfair to researchers in
the developing world -- people at places like Bilkent, Tsinghua, and the IITs
-- who don't have the PR muscle to highlight unpublished stuff.

~~~
diffserv
> I could ask them to go look up papers in Oakland, CCS and NDSS over the last
> couple of years and see if anything catches their fancy.

Isn't this exactly the major thing that is wrong with research today? Limiting
work/creativity to a few well known conferences done by elites for elites? I
read blog posts, posted daily here on HN, that are way more informative,
honest, and replicable than many papers published in the three conferences you
named.

> "elite" status. Some of this is deserved, because these people have done
> good work. But some of this is also just a publication cartel where
> everybody cites their friends' work and make it impossible for others to
> break into a field.

Elite status happens exactly because there are conferences like the ones you
mentioned. If you work with an advisor who publishes in Oakland, your chances
of getting a paper into Oakland increase multiplicatively. And hint: that's
not because your ideas (or papers) are better than anybody else's.

> The larger point is that in a scenario where there are so many papers that
> nobody could possibly look at all of them will lead to a few groups
> accumulating all the citations and all the awards.

This is already happening. Look at all the "prestigious" conferences.

> where researchers don't have the PR muscle power to highlight unpublished
> stuff.

Who cares? If the work is worth anything, people will cite it. If not, it will
remain as is. Why does it matter? Why do you care if 10 people cited your work
or 100 people if you are happy with the work?

Unfortunately, nobody in this forsaken field (computer science) cares about
the scientific aspect of the field anymore; everybody wants their name to be
known and that's all there is to it. The measure of success is how many papers
you publish in elite conferences ...

I do actually think that by breaking down all the barriers, people will care
less about having their name in conference X or Y and more about either the
scientific aspect or citation cartels. The first is good; the second can be
fixed (at least more easily than giving a few elites lots of power with no
checks and balances).

~~~
wsxcde
> Isn't this exactly the major thing that is wrong with research today?
> Limiting work/creativity to a few well known conferences done by elites for
> elites? I read blog posts, posted daily here on HN, that are way more
> informative, honest, and replicable than many papers published in the three
> conferences you named.

I totally disagree. The quality of papers at the "elite" conferences is way
higher than most things I've read on HN. What is an example of an HN post that
in your opinion is better than equivalent academic research in that area?

> Elite status happens exactly because there are conferences like the ones you
> mentioned. If you work with an advisor that publishes in Oakland, your
> chances of getting a paper in Oakland gets increased multiplicatively. And
> hint, that's not because your ideas (or papers) are better than anybody
> else's.

Sure, having an advisor on the Oakland PC helps a great deal. But it doesn't
follow that your work is just the same as everyone else's. Have you peer
reviewed papers for these conferences? A majority of submissions, even at the
"elite" conferences, are just junk. That doesn't mean everything that gets
published is good, but the stuff that does get published is significantly
better than the average submission.

> Who cares? If the work is worth anything, people will cite it. If not, it
> will remain as is. Why does it matter? Why do you care if 10 people cited
> your work or 100 people if you are happy with the work?

Because the point of my research is not to sit in an ivory tower and produce
academese that no one cares about. The goal is to have real impact on computer
system design, and in my specific case, push practitioners towards
methodologies that make systems more secure. That's not going to happen if no
one reads our work.

Another way of looking at it is that a lot of our work is funded by taxpayer
money. They aren't paying us to have fun proving lemmas that no one else cares
about, the taxpayer would like us to produce research that results in tangible
improvements in computer system design. In the system that we have today, the
only way to have this tangible impact is to produce high quality papers that
other people read, cite and build on top of.

------
noname120
I believe that there is a misunderstanding regarding the role of academic
publishing. Before the Internet era, the only way to make your work known to
other scientists was to publish it in a journal—prestigious if possible.
Journals were a medium to spread scientific information and make it
_available_.

Nowadays the situation is very different. For example, there are preprint
repositories such as arXiv where researchers can publish papers with very
little oversight[1]. The journals don't serve the role of making information
_available_ anymore. Their main role is to act as a _filter_ for scientific
information, and the reputation of a journal signals whether it's a high-pass
or a low-pass filter. Researchers want to publish in prestigious journals
since it signals that their work is high quality, and it gives greater
exposure to their work—prestigious journals have a broader reader base than
unknown journals.

This considered, the author has a point. In experimental domains of science
such as Biology, Physics, Chemistry, Psychology (and sometimes even Math!
[2]), reproducibility is of paramount importance, and publishing negative
results helps researchers both save time by not repeating work and avoid
statistical bias in meta-studies.

In fact, what the author of this article implicitly looks for already
exists![3] But such journals usually fail to gain traction, and for a precise
reason: writing and publishing a paper takes a great amount of time.
Publishing a paper in favor of the null hypothesis usually does not bring much
recognition, and that time could instead be used to do research or write the
next paper.

What can be done then? A solution could be to create a platform where
scientific _results_ can be made available without all the publishing
overhead: no literature review, no verbose explanations, only "what we did"
and "what we got". And if possible, in a machine-readable format. Any fellow
HNers interested?

[1]
[https://en.wikipedia.org/wiki/ArXiv#Controversy](https://en.wikipedia.org/wiki/ArXiv#Controversy)

[2] [https://arxiv.org/abs/1201.0749](https://arxiv.org/abs/1201.0749)

[3] [http://www.jasnh.com/](http://www.jasnh.com/)
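For what it's worth, the "machine-readable results" idea above could be
sketched as a bare JSON record, just "what we did" and "what we got". All
field names here are hypothetical; a real platform would need an agreed
schema:

```python
import json

# Hypothetical minimal result record: method and outcome only,
# no literature review, no prose. Field names are invented for
# illustration, not from any existing platform.
result = {
    "hypothesis": "Intervention X has no effect on outcome Y",
    "method": {"design": "RCT", "n": 48, "analysis": "two-sample t-test"},
    "outcome": {"effect_size": 0.04, "p_value": 0.61, "supports_null": True},
    "data_url": "https://example.org/dataset",  # placeholder
}

# Serialize for publication; any consumer can parse it back.
print(json.dumps(result, indent=2))
```

A record this small is cheap to publish, which addresses the "writing a null
result takes too much time" objection, and it is trivially aggregable for
meta-studies.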

------
ridaj
It's a good idea, but without scarcity and the attendant prestige of being a
reviewer for fancy publications, how do you get people to volunteer their time
to review and endorse papers?

------
droithomme
_> the authors of the above article, and other people who present similar
anti-replication arguments_

Study replication is good, and fairly rare.

