Redesigning the Scientific Paper (theatlantic.com)
339 points by jsomers 10 months ago | 107 comments

> These programs tend to be both so sloppily written and so central to the results that it’s contributed to a replication crisis, or put another way, a failure of the paper to perform its most basic task: to report what you’ve actually discovered, clearly enough that someone else can discover it for themselves.

This is the crux of the problem IMHO, at least for the fields I study (AI/ML). Replicating the results in papers I read is way harder than it needs to be. For these fields it should be as simple as firing up a Jupyter notebook and downloading the actual dataset they used (which is much harder to get your hands on than it seems). Very few papers actually link to all of this in a final, polished form that is both #1 understandable and #2 repeatable.

Honestly, if I had to choose, I'd much rather have the actual code and data you used to get your results than read through the research paper (assuming the paper is not pure theory). Instead there is a disproportionate focus on paper quality over "project quality", at least IMHO.

I don't really know what the solution is since apparently most academics have been perfectly fine with the status quo. I feel like we could build a much better system if we redefined our goals, since I don't think the current system is optimal for disseminating knowledge or finding and fixing mistakes in research or even generally working in a fast iterative process.

I've had a paper peer reviewed. It was ultimately rejected, but I can't help suspecting that by making all my code publicly available, I hurt my chances of publication. The reviewers' comments were about my coding style, my choice of build tool (I didn't use make, but something else which is just as easy to use), the choice of C vs C++...

It's like best practices for computer security -- always strive to minimize the attack surface. :) Without source code there is much less stuff to criticize!

>The reviewers' comments were about my coding style, my choice of build tool (I didn't use make, but something else which is just as easy to use), the choice of C vs C++...

I don't know precisely what field you submitted in, but this is maliciously bad reviewing practice. You should have submitted a rebuttal and written to the editor calling the relevance of such "reviews" into question.

Yeah, just reading about that made my blood boil. Especially if it was a biology paper, I don't know what I would have done...

> It's like best practices for computer security -- always strive to minimize the attack surface.

I suspect that's also why some papers are unnecessarily verbose and describe simple things in as complicated a way as possible. You can't criticize something that can't be understood.

It's unfair that your comment is downvoted because it's spot on.

Hiding code, obfuscating language, fudging data: all are symptoms of the same problem, of being interested in getting a paper on a CV instead of doing research.

There are many circumstances that can put even a good scientist in a situation where he/she has to do this, but that's not a good argument against sharing the code.

Then why submit it for peer review at all?

Because we need peer reviewed papers on our CVs!

I also detest simple things made complex, though. In my experience (which has covered electronics, epidemiology and geography), reviewers tend to pick up on obscure issues in the text but miss glaring errors in the math. It's sad, and you can see why someone less than scrupulous would exploit that tendency by overcomplicating things. That said, I think plenty of authors are honest but just not very clear thinkers!

In machine learning / computer vision people often release their code after the paper is already accepted.

Time before the submission deadline is usually spent doing more experiments and writing text, not polishing the code. And after the deadline there is no hurry. What people (who want to share code) consider important is to release it a bit before the actual conference (but this doesn't transfer to journal-based fields).

The paper should be accepted in rough form, but publication should be held up pending approval of the data and code.

Or a paper should be published in a probationary form, and not certified (by the journal) until an independent lab replicates the result. A paper that isn't making adequate progress toward replication should be retracted by the publishing journal.

That's terribly unfortunate.

I'm entirely open to being shown otherwise, but working in that field seems more akin to working in physics than in software engineering (those are engineering problems, after all, not computer science, and sometimes they're only opinions). Being critiqued for that in an ML/AI paper would be like critiquing a physics paper over the author's coding style: it is misdirected, IMO.

They could be squashing some legitimately good work by being too heavy handed around coding style and build process.

This is why artifact reviews should be separated from publication review. Artifact reviews are notoriously horrible, with reviewers inexperienced in them simply bikeshedding and grasping at straws. At most, artifact reviews should simply check for reproducibility (i.e. can they get the code to run at all).

I'm very surprised to hear this: I've submitted several artefacts; co-run an AEC for a (small-medium) conference; and spoken to a lot of people about it. I've heard virtually nothing negative until your post. Indeed, the artefact reviews I've received have nearly all been thorough and considered (one review was slightly nitpicky, but that's one review out of 10-12). For paper reviews, on the other hand, I'm very happy if 1/2 of reviews are thorough and considered. Bear in mind that most of the artefact reviewers also implicitly review the paper, and you get some idea of how good a job they do.

My main bugbear with the whole thing is the incorrect spelling of "artefact". And when that's my main bugbear... well, things aren't too bad!

I recently reviewed a computational materials science paper and was quite impressed by the fact that they included some data and a Jupyter notebook. Long term, the software ecosystem will be an issue, but in the short term, it's invaluable. It does make it easier to check for obvious errors. I think more incentives should be given by funding agencies to encourage this.

I'm really sorry to hear about your experience.

I don't work primarily in computer science, but rather in math/physics, and both as a reviewer and as an author, I have only seen a positive impact for sharing code. When I review, if code is made available, it is easy for me to see the details of a model or a calculation, which I really appreciate. When I am writing a paper and developing a model, knowing that I will make my code available ensures that I write things in a clear, transferable, and understandable way (which ultimately ends up being quite beneficial to me).

Then do we even need peer review? In my experience it is always superficial; people just feel that they have to say something, so they say something about writing style or similar trivia.

The way it should work is: you put your stuff, with code and all data, on GitHub. People interested in the field, or working for journals, read it and rate it; journals collect links to paper repositories that are highly rated by scientists who themselves have many highly rated papers in the field, and call that publication.

I've been peer reviewed once (and waiting for the second) and it was very in depth, giving me a couple of pointers to improve my paper. Field was mathematics, though.

Sure, it depends on the field, on the journal and on the reviewer, but with a GitHub-like interface and public reviews it will only get better.

Problem is, if the review is public then the article is also public, and some (most) publishers are not OK with that. At least not yet; hopefully it will get better.

I don't know what field you work on but this would be very atypical in mine.

While it's true that minimizing the attack surface is something that can work in papers, in my field reviewers typically don't look at the code. Many of my papers include code or links to it, and I haven't ever had a comment about it in reviews.

Aaah, this brought back memories from the 7 years I was in academia. The publish-or-perish culture and the peer-review process are completely broken. Academia is completely broken; I would hate having to go back to academia now that I have been 7 years in industry earning 6 figures.

Nah, they probably used your code to scoop you.

I also work in the AI/ML field (deep learning), and usually I don't care whether the paper has corresponding code or not. I read papers to find good ideas. If I find one, I can implement it myself. I rarely need more than a couple of days to test an idea (e.g. Hinton's capsules model took 4-5 hours to implement). The benefits of your own implementation should be obvious.

If something important is missing or does not make sense, I usually just email the first author. Usually they respond within a couple of days, and unlike looking at code, I can also get an explanation of why they did it that way.

In fact, I don't even usually care that much about stated results (such as improvements in state of the art).

Things that matter are: deep insight into a problem, new angle to look at something, discovery of a new phenomenon, high quality explanation, practical tricks to save resources, and comprehensive prior/related work review. That's why I read papers.

this is the right way to go about things if you have certain goals, for sure.

sometimes you need to replicate exactly the same training method, on exactly the same data — for instance if you want to use it as a baseline on a known dataset. then it becomes really important to have the code, because while an adequate replication might be easy, it takes a lot of trial and error to get perfectly the same model.
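One concrete source of that trial and error is unpinned randomness. A minimal first step, sketched here for NumPy-based code (illustrative; frameworks like PyTorch and TensorFlow add their own seed calls and determinism flags on top of this):

```python
import random

import numpy as np

def seed_everything(seed=1234):
    """Pin the common sources of randomness so two runs draw the same
    numbers, and hence the same initial weights and data shuffles."""
    random.seed(seed)
    np.random.seed(seed)

seed_everything(0)
a = np.random.rand(3)
seed_everything(0)
b = np.random.rand(3)
# a and b are identical because the generator was reseeded in between.
```

Even with seeds pinned, GPU nondeterminism and library-version drift can still make "perfectly the same model" elusive, which is the commenter's point.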

Sure, but if the code for some result is not available, I feel free to report whatever result I got implementing their method. I’m also perfectly fine with using “couldn’t reproduce” phrase in my papers.

You seem to be exceptionally well funded, and/or have few deadline constraints. Your strategy will only work until you get spammed with "good ideas".

> You seem to be exceptionally well funded, and/or have few deadline constraints

I wish! :)

> you get spammed with "good ideas"

Again, I wish!

In the subfield I'm focused on at the moment (efficient mapping of NN algorithms to specialized hardware, low precision computation, model compression) I don't see good ideas very often (fewer than one good paper a week). Previously I worked on music generation - also didn't really feel spammed with good ideas.

I don't mean this to be adversarial, but what exactly is it you do that would not be sped up by checking someone else's results directly before fiddling around and then trying out your own implementation?

But that's my point: their results are not that important to me.

As an example, recently I saw a paper on NN weight quantization, which had a very interesting idea, but the results were not impressive. I don't remember if they had any code published or not, but it didn't matter - I wanted to see what kind of results I'd get if I implemented it. Turned out it works really well, much better than what they reported in the paper.
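The commenter doesn't name the paper or its method, so purely as an illustration of this kind of quick experiment, here is a round-trip uniform k-bit weight quantization in NumPy (the function name and setup are hypothetical):

```python
import numpy as np

def quantize_dequantize(w, bits=8):
    """Uniformly quantize an array to 2**bits levels over its own range,
    then map back to floats; the round trip exposes the rounding error."""
    lo, hi = float(w.min()), float(w.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((w - lo) / scale)   # integer codes in [0, levels]
    return codes * scale + lo            # dequantized approximation

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)            # stand-in for a weight tensor
w_hat = quantize_dequantize(w, bits=4)
max_err = np.abs(w - w_hat).max()        # bounded by ~scale/2
```

Testing a paper's actual scheme would mean swapping in its quantizer here and comparing model accuracy before and after, which is roughly the couple-of-days experiment described above.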

Here is an idea: inverse dropout.

How would you implement that?

Link to paper?

You linked to the original dropout paper. What’s “inverse dropout”?

It is just the description of an idea I came up with without any implementation.

I was leaving it purposefully vague, just do the "inverse" of what it says in that paper.

That's not a description of an idea, just like a paper doesn't only consist of the title. This kind of argument is insincere and not helpful at all.

What's your preferred software to implement these? A framework like Chainer, or purely numpy/MATLAB?

Tensorflow or Pytorch. Plain Numpy for quick prototyping/testing. Sometimes have to write/modify Cuda kernels.
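For a sense of why plain NumPy works for quick prototyping: a layer like the dropout discussed upthread is only a few lines. This is a sketch of the standard formulation (not the "inverse" idea, which was left undefined):

```python
import numpy as np

def dropout(x, p=0.5, rng=None, train=True):
    """Standard dropout: zero each unit with probability p at training
    time and scale survivors by 1/(1-p), so expected activations match
    test time, where the layer is the identity."""
    if not train or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones((4, 8))
y = dropout(x, p=0.5, rng=rng)
# Every entry of y is either 0.0 (dropped) or 2.0 (kept and rescaled).
```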

I would not say that most academics are "perfectly fine with the status quo". But I would say that most academics have enough competing interests taking their time away from research that they're uninterested in taking on another one with such uncertain payoff.

In a way bringing about the kind of change you reference in scientific publishing would actually be a pretty significant research accomplishment -- the field would be that much better for your efforts! But the road to get there is filled with political wrangling, talking to and serving on committees, probably forming dedicated organizations and painstakingly getting buy-in. This is not something you can realistically achieve without probably a good career's worth of political capital in your field and the drive and people skills to make it happen.

Until it does happen, making your own lab adhere to these standards is admirable but with unfortunately limited upside. I'm not saying the status quo is good, just that there are reasons for it still being the status quo.

>"Honestly, I'd much rather have your actual code and data that you used to get your results than read through the research paper if I had to choose (assuming the paper is not pure theory) - but instead there is a disproportionate focus on paper quality over "project quality" at least IMHO."

I think at first new students know this is wrong, but then get dragged into the circular logic of:

  it is standard in the field -> it is ok -> it is standard in the field
It starts with just being so busy and confronted with so many new things that you just use the standard behavior as a "stand-in" (no pun intended) for a rational approach. Then you never have the time to go back and reassess that decision.

> I don't really know what the solution is since apparently most academics have been perfectly fine with the status quo.

Simple - change the incentives. Currently, academics are evaluated based on paper publications not "actual code". If you want code and data to be shipped, create enough incentive for them and you'd see the change.

> there is a disproportionate focus on paper quality over "project quality"

One problem is bitrot. Stuff that runs now is not guaranteed to work in 1 or 2 years, let alone 10 years.

Even more so when it runs on fancy hardware, like GPUs.

This is one of the main reasons to require source release. Open source software is much more likely to run in 10 years. It’s actually useful to package everything together into a container or VM so all the packages are there too.

I work with some genome guys and they have this problem: their sequencers basically turn over every year or two, the advances are so fast. So they have to maintain the specimens as well as all the software versions they used for analysis. It's a pain, but otherwise nothing is reproducible.

The work should be reproducible not just from artifacts, but also from a container. Sourcing compilers, libraries, etc. later is almost impossible. The NSF should really be running an archive and cluster for housing reproducible research that remains executable far into the future.
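Containers and lockfiles do the heavy lifting, but even a lightweight habit helps: recording the exact runtime alongside the code. A minimal sketch in Python (the function name and package list are illustrative, not from any comment above):

```python
import json
import platform
import sys
import importlib.metadata as md

def environment_manifest(packages):
    """Describe the Python runtime and pinned versions of the given
    packages, as a first step toward a reproducible code release."""
    manifest = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            manifest["packages"][name] = md.version(name)
        except md.PackageNotFoundError:
            manifest["packages"][name] = "not installed"
    return manifest

# Dump a manifest for the packages an analysis depends on.
print(json.dumps(environment_manifest(["numpy", "pip"]), indent=2))
```

A manifest like this can't substitute for archiving the actual binaries, which is why the comment argues for containers, but it at least makes the bitrot visible.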

I'm in opto/bio/eng. I think you misunderstand the 'real' reason for research papers as they currently stand: Money. It's a bit of a path, but I'll try and explain.

In the US at least, research costs a LOT of cash. Many departments are chronically underfunded. In my state, the university only gets ~10% of its funding from the state-house. The rest is grants. The only real writers of grants are the professor corps. So, departments look to the professors to fund the enterprise. Some of my advisers spent about 40 hours per week just on grant writing, neglecting the teaching and research hours required alongside. It is not a fun/good job. So most/all research is done by students, mostly PhD students, with little to no input from their advisers, and it's a stressful mess. As a result, most research is, well, amateur. Stats get mangled, code quality is non-existent, rats get loose, etc. Yes, yes, none of that 'actually' happens, but for real? It's a shitshow.

So, where does that leave the PhD student that has been in the program for 7 years? They may have one first author paper, if that, a thumb-drive filled with nearly unreadable 'data', and a dozen failed experiments. Failed experiments don't get published, mostly because science is hard and doing all the controls to say that you have a genuine/real failure is much harder. So the professor, now running into a very firm deadline to graduate the student via the grad office, must rush and publish something, just to get the student to leave. The professor's track record in graduating students is part of their evaluation, as well as their publication record. Hence, the unreadable graduation paper; one of two types of unreadable paper.

This paper is a targeted missile that is meant to do one thing: get the student off the payroll. It is not meant to be good, or a viable piece of science. It is never meant to be replicated. It is trying to be obtuse. It is there just to graduate a student, nothing more, nothing less.

The other class of unreadable paper is the turf-war paper. These papers are also meant to be just readable enough, but not so much as to be repeatable. The reason is that the paper is a 'big' paper. What is published is meant to stake a claim in a 'big' area of the field. Hopefully this will guarantee more funding in the future, now that the professor is a 'big' player in it. Hopefully no others can report that it is unrepeatable before the next grant comes in. The trick is to make certain that the paper exposes just enough of the experimental design as to truly 'claim' the new big thing, but not enough that you can replicate it at all. Karl Deisseroth is infamous for this in the bio world. The paper creates buzz, but safeguards the turf of the lab from any other lab that may want to replicate it independently; they need the first lab to re-do it, and they must come with funding in hand.

So, to sum up: papers are weapons. One type is the missile that causes a student to graduate. The other is a trap with a golden idol on it.

This is spot on. I was surprised the first time I worked at a major university just how toxic the environment was and how little mindshare was spent towards actually contemplating compelling hypotheses / experiments. It was much less of the ideal "life of the mind" I thought it would be and much more like show business / social climbing, minus the widespread name recognition and glamor.

I was already on the way out of science when I started working at that job, but the publish or perish culture really accelerated my departure.

It's also interesting how the current incentives really warp the incentive structures not just at big research universities, but also at small liberal arts colleges. I grew up as a fac brat, and so I've been able to tune into a lot of dialogue about the latest crop of new professors coming in to replace older professors as they retire, and a lot of the older professors are genuinely shocked at how little emphasis the newer professors place on teaching (traditionally what SLACs have focused on) compared to research. Even at schools with around 2000 students, new professors are demanding generous starter packages that no one would really have thought to ask for in the 70s.

To be fair, it's been ~50 years since the 70s. The Professor Corps should pretty much be entirely different people.

From an ideal point of view, I agree with your criticism. Probably most honest academics would, as we all have had frustrations after spending a lot of time trying to reproduce someone else's research. But it is very difficult to solve this problem.

Peer review takes a large amount of time from most academics, time that is totally unpaid. With the status quo, we are OK with that: it's a service we do for each other (we need our own papers reviewed, after all), and reviewing also has the advantage of finding new ideas sooner. Although precisely in AI/ML, many academics are currently complaining: due to the rapid expansion of the field, the peer-review load has gone beyond acceptable in many cases. For the last AAAI conference I had to review 6 papers on a fairly short deadline. In the last 3 months I have reviewed 40 or so papers, and I'm very far from being a top-tier star in my field; there are people who are probably getting many more review requests (although they're probably saying no to some if they want to keep their sanity).

Reviewing code and data seriously would take how long? I would estimate an order of magnitude more than reviewing a conventional research paper in PDF.

So currently, the situation is that if you post a link to source code you may get some positive reaction in the reviews, but in 99% of the cases reviewers are not going to actually look at the code (or at least not beyond a cursory look to see if it seems coherent at a first glance) because there is just no time.

Unless we fix this, I don't think we will see papers really focusing on the code and data, regardless of good intentions.

Have you seen OpenML? There are solutions for this, and I think most people would agree they are useful, it's just the change/adoption/standardization cost is high as always.

Seems like this could pair well with the journal crisis and suggestions to implement a blockchain journal: Your paper cannot be accepted by the journal unless it includes executable code; the results of which are then injected into the "paper" view...?

So, basically - a paper consists of what it takes to replicate the paper, and the blockchain journal's first step is running the replication.

This would be problematic for papers that require expensive computation, however...

So where, exactly, is a block chain required here? Everything you listed could just as easily be a requirement set by the journal, after all. I mean, every journal has at least some requirements already (at the very least, nearly all require publishing in a specific language). So aside from jumping on the blockchain bandwagon just because that's the new exciting thing, what value is added here?

Good God, I'm getting tired of every single thing needing to use the magic word 'blockchain' ATM.


It actually seems like journals could benefit from the application of this technology.

So yes, you don't need a blockchain to set these requirements, and if you're using a blockchain you don't need these requirements, _but_ a blockchain journal and these requirements would likely pair very well together, as they cover each other's weak points (centralized journals might only ensure the journal's publisher can replicate; decentralized journals have to have some kind of automated validation).

Buzzwords become buzzwords because there's something to them, after all.

> How to integrate billions of base pairs of genomic data, and 10 times that amount of proteomic data, and historical patient data, and the results of pharmacological screens into a coherent account of how somebody got sick and what to do to make them better? How to make actionable an endless stream of new temperature and precipitation data, and oceanographic and volcanic and seismic data? How to build, and make sense of, a neuron-by-neuron map of a thinking brain? Equipping scientists with computational notebooks, or some evolved form of them, might bring their minds to a level with problems now out of reach.

The article seems to conflate the praxis of science with the archival of it. Scientists do all of the above on gigantic clusters, not in an IPython/Mathematica notebook. The purpose of publishing papers, on the other hand, is adding to the archive of knowledge, and they can be easily rendered on a laptop with LaTeX.

And they are excellent at archival, by the way. You can see papers from the 19th century still being cited. On the other hand I have had issues running a Mathematica notebook from a few releases back -- and I seriously doubt one will be able to read any of my Mathematica notebooks 150 years from now. The same with the nifty web-based redesign of the Nature paper that is mentioned: I bet the original Nature article will be readable 150 years from now, whereas I doubt the web version will last 20.

A group upstairs at the Broad Institute built out a system to use Jupyter notebooks for analysis of genomic data, with backend computation happening on a Spark cluster[1]. Science on large datasets can happen via interactive notebook.

In connection with a recent GWAS on a massive dataset from the UK Biobank, the researchers involved decided not to write a traditional scientific journal article (at least for now), since their analyses will continue to mature. Instead, they've been posting insights online in blog form, with associated code on GitHub[2]. It's a daring move toward publishing at the speed of research. Once their conclusions mature, traditional journal articles may follow to distill and preserve the key findings. In the meantime, those in the field can apply the same code to their data, replicate the analyses, and get an early look at the output of the research. This works partly because the methods (univariate GWAS) are understood in the field, and the interpretation and rendering of a particular dataset is the science in this case, rather than a new method (which would still likely warrant a paper).

1. https://hail.is

2. http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousand...

> Science on large datasets can happen via interactive notebook.

I did not claim the opposite, just that it regularly happens without interactive notebooks. This seems like an interesting project though. Regarding the blog posts, it seems that there's a bug that makes all the entries appear as published on September 20, 2017?

Of course people will be able to read your Mathematica notebook in 150 years. They will just load it into a 2020-era Mathematica-engine.

The problem of software packaging will be solved by then, at least well enough to trivially emulate any of the popular environments we have today.

I used to work as a software developer for a research institute. I wanted to open source our research code and tools, and the department head was in favour of it because it would raise the profile of the research unit.

There were two forces working against us. First many of the grants came from governments, and a stipulation was that we would devote some resources to helping startups commercialise the output of the research. Some felt that open sourcing would remove the need for the startups to work directly with them to integrate the algorithms, and that this would hurt future grant applications by making the research look ineffective.

The main opposition, though, came from PhD students and postdocs. Most didn't want anything related to their work open sourced. They believed that it would make it easy for others to pick up from where they were and render their next paper unpublishable by beating them to the punch.

Sadly I think there was some truth to both claims. Papers are the currency of academics, and all metrics for grants and careers hinge off it. It hinders cooperation and fosters a cynical environment of trying to game the metrics to secure a future in academics.

I don't know how else you would measure academics' performance, but until those incentives change, the journal paper in its current form is going nowhere.

> Papers are the currency of academics, and all metrics for grants and careers hinge off it. It hinders cooperation and fosters a cynical environment of trying to game the metrics to secure a future in academics.

I honestly can't contemplate who in their right mind would want "a future in academics" where academia is defined as a constant stream of metric gaming rather than actually accomplishing what you originally set out to accomplish.

Imagine it like this: maybe you joined your field (be it software or academia) with dreams of making the world better. So it shouldn't matter to you who does the problem solving, as long as the problems get solved and the world gets better in a timely manner. But then, you've found a gig solving an important problem, with ok-ish pay. And you've also found yourself with dreams of a home, a spouse, maybe children. And suddenly, it starts to matter to you whether or not you get paid, so that you can fulfill your dreams. Maybe someone else could do your important work better than you. But then you won't have money to support your family. So it's gaming time, and now you end up focusing on "a future in <your field>".

Cynicism is rife in every industry and walk of life.

But I think most people tell themselves "once I get the PhD... once I have a permanent position... once I have tenure..." and by that stage they're institutionalized.


Maybe an embargo would work: code is progressively published over time, with the grace period extended by reaching goals/milestones.

In other words: use it or lose it.

Regarding the article itself: Bret Victor is amazing and so is Strogatz. They are both my heroes, actually.

But I do think there is a difference between scientific professionals communicating amongst each other and scientific communication to the public. If mathematicians understood Strogatz's paper at the time it was published, and there were enough mathematicians to disseminate the knowledge, then should you require that algorithms be presented as animations?

Part of the reason why mathematicians and computer scientists (as researchers) conceive of new algorithms in the first place is because a lot of them are very strong in visualizing algorithms and 'being their own computer'.

Though, if a scientist wants to appeal to a broader group of scientists, then I'd recommend her or him to use every educational tool possible. For example, they could create an interactive blogpost a la parable of the polygons[1] and link that in their paper.

On an unrelated note, it is such a pity ncase isn't mentioned at all in this article!

Also related is explorabl.es, not everything is science communication in an interactive way but a lot of it is[2].

[1] http://ncase.me/polygons/ [2] http://explorabl.es/

>Part of the reason why mathematicians and computer scientists (as researchers) conceive of new algorithms in the first place is because a lot of them are very strong in visualizing algorithms and 'being their own computer'.

Can you explain this in more detail?

We need GitHub for science. But that's not enough. It needs to be combined with a mechanism for peer-review and publishing that funding agencies will find acceptable--that's the key.

There are academics interested in building journals on top of the arXiv. It's an idea that's been kicking around for a while. In math, Tim Gowers is one of the people who has been pushing for something like this.

If your background is outside academia, you may well be forgiven for looking at the scene and seeing no progress. However, academia is more slow-moving and conservative in some aspects than other places. This is not a bad thing; not everyone has to move fast and break things. But by academia's normal pace, things are going pretty fast.

It's not only about funding agencies, though. Researchers need to be trained in the use of a GitHub like system - as a former graduate student, I knew NO ONE using Github. Furthermore, after working in scientific publishing where it was MS Office or bust, I doubt you will convince Elsevier and the publishing giants that GitHub for Science is the way to go. How do these huge companies make money if scientists are sharing their work for free online? Meanwhile, you have scientists themselves, many of whom are required to hit certain numbers of articles in certain "prestige tiers" of journals in order to get tenure. If you're an extremely busy junior faculty member who will be FIRED when tenure review comes up if you don't have 1 paper per year, will you risk it on a GitHub scheme? I doubt that.

These problems are not unique to science. Take any large group of humans, be it government, the military, a company, a set of companies: they will coalesce onto the use of a certain number of tools, and it will take a LOT of energy and work to switch tools. Getting an entire industry to switch tools is extremely hard, perhaps downright impossible, and I'm not sure if it can be done in the rather slow-moving world of academic science. It simply won't be possible to enact a huge change on the system with so much institutional momentum built up.

It seems to me the best solution is to make incremental changes toward a github like system, training new scientists in grad programs to have better practices. Maybe we can't get researchers to post their full data set for all the public, but perhaps it can be made available to non-competitors, those who researchers aren't actively competing with for the same grants. Maybe researchers can be required by journals to post the peer-review comments/responses, as well as the drafts of the article, alongside the article itself. Maybe we can have researchers posting extremely detailed code/methodological information alongside the article, without forcing them to give over the entire dataset to other researchers?

Finally, maybe the whole incentive structure of academic science needs to change: maybe articles/publishing metrics should be de-valued in favor of teaching skills, mentorship ability, and collaboration within and across disciplines?

I doubt you will convince Elsevier and the publishing giants

Who says we need to convince them? How about we leave them behind? They are rentiers, gatekeeping society's access to publicly funded scientific knowledge. I can't think of a reason why society should allow this hostage situation to continue.

OK, but you need to figure out a way to take over the universities then because scientists are also animals who need food and shelter etc. etc. and depend on grants, stipends, and salaries to buy those things.

I'm not saying this to be dismissive - I'm strongly in favor of faculties organizing to unseat administrators from their privileged positions. It's baffling to me that bright minds on campuses complain at length about the state of higher education but seem oddly averse to doing anything about it.

Exactly. These rent-seeking gatekeepers should be sent to the dustbin of history where they belong.

When did you do your PhD? When we collaborated for papers we used Subversion (at the start of my PhD) & git (towards the end) both for code and papers (in latex).

I assume you don't just mean using GitHub for version control of scientific papers, because that sounds pretty pointless to me.

As far as tracking incremental improvements over time goes, I think it'll be hard to do better than our current method of including references to papers. It's impossible to track ideas the same way tracking code works (which itself is limited for similar reasons).

It would be nice if you could reference papers in a way that immediately allowed you to access them. And the technology to do so is there, but has limited use if papers aren't freely accessible from the internet.

But, given that the system of papers with references is essentially a DAG, I imagine someone will attempt to 'solve' the problem with blockchains before you can say 'initial coin offering'.
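To make the DAG point concrete: the citation graph really is just a dependency graph, and you don't need a blockchain to do useful things with it. A toy sketch (the paper IDs here are made up for illustration) using only Python's standard library:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical citation graph: each paper maps to the papers it cites.
citations = {
    "smith2020": {"doe2015", "lee2018"},
    "lee2018": {"doe2015"},
    "doe2015": set(),
}

# A reading order in which every cited paper comes before the papers
# that cite it -- valid precisely because citations form a DAG.
reading_order = list(TopologicalSorter(citations).static_order())
print(reading_order)  # foundational work first, newest work last
```

Anything fancier (tracking how an idea evolves across versions of a paper, say) is harder, but the plain graph structure is already there in the references.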

ArXiv has versioned papers and a way of citing them. It’s working pretty well.

> I assume you don't just mean using GitHub for version control of scientific papers, because that sounds pretty pointless to me.

Why? I recently wrote a paper on GitHub. I loved it.

> I recently wrote a paper on GitHub.

When you have a paper deadline in two weeks and four authors are furiously hacking in edits left and right, Git is absolutely invaluable when writing a paper. I can't fathom how people handled merges of written text without a version control system.

I mean sure use whatever you like, but I'm just not sure how much use it would be for people interested in reading your paper.

Oh I follow. No, I meant I like using git for version control.

Some scientists are using GitHub as "GitHub for science", including development done in the open. For instance, this is just one particle physics project I happen to be aware of: https://github.com/DisplacedHiggs

Open development is a policy for these projects and part of the grant stipulations.

Of course, data sets are more closely guarded initially until the groups that created them can publish. After that, though, CERN /LHC has done a decent job of making data publicly available from my understanding (not as someone directly involved).

I would be interested to hear more from scientists involved in projects doing open science.

I think that's sort of what services like Texture [1], ShareLaTeX [2], and Authorea [3] are trying to become (at least for papers). For code GitHub for science is generally the same as GitHub for everyone else.

[1] https://elifesciences.org/labs/8de87c33/texture-an-open-scie...

[2] https://www.sharelatex.com/

[3] https://www.authorea.com/

> We need GitHub for science


> It needs to be combined with a mechanism for peer-review and publishing...

Maybe an "interpretation" mechanism, similar to what the Distill[1] project is doing, can serve two purposes at once: review and digestion.

[1] https://distill.pub/
You rang? https://osf.io/

I would really like the Open Science Framework to become just that (with other tools like OpenML as needed). But it requires people to actually work on it to happen...

I really want to like OSF, but I feel like they sometimes waste time on pointless efforts that go nowhere instead of working on substantive features that scientists would actually use. In particular, I found their effort to create badges for open science to be misdirected at best, since credentials of any kind don't really count for anything unless they're backed by an authority that's seen as universally legitimate in a field.

I also feel a bit weird about badging in science in general, since most of the passionate people I know in science are intrinsically motivated enough that I could never see them concerning themselves with such carrots unless it meant that they'd get more funds to do more of what they find fun.

I do not agree that the scientific paper needs to be replaced. It should be complemented with the help of new tools, that is a very good thing, but I still want the article.

I work every day with papers from decades ago, and I hope people will work with my papers in the future. How can I guarantee that researchers of 2050 will be able to run my Jupyter notebooks?

Moreover, it is not uncommon to not be able to publish source code. I can write about models and algorithms, but I am not allowed to publish the code I write for some projects.

"... the skill most in demand among physicists, biologists, chemists, geologists, even anthropologists and research psychologists, is facility with programming languages and "data science" packages."

If I wanted to prove to someone this statement was true, what would be the most effective way to do that?

Is author basing this conclusion on job postings somewhere?

Has he interviewed anyone working in these fields?

Has he worked in a lab or for a company doing R&D?

How does he know?

What evidence (cf. media hype) could I cite in order to convince someone he is right?

When I look at the other articles he has written, they seem focused on popularised notions about computers, but I do not see any articles about the academic disciplines he mentions.

GitXiv very much worth taking a look at if you're into this kind of thing: http://www.gitxiv.com/page/about

edit: as is Chris Olah's Distill project: https://distill.pub/

How well does it work to version control Mathematica notebooks in git? For example, is it possible to get meaningful textual diffs when comparing two versions of a mathematica notebook, and can git compress them enough to keep repo size down?

With iPython this is also an issue -- tracking code in JSON is much less clean than tracking code in text files.

It's interesting that Mathematica and iPython both left code-as-plain-text behind as a storage format. I wonder if it would have been possible to come up with a hybrid solution, i.e. retain plain-text code files but with a serialized data structure (JSON-like, or binary) as the glue.
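One common mitigation for the Jupyter case is to strip outputs and execution counts from the notebook JSON before committing, so diffs only reflect source changes; tools like nbstripout and `jupyter nbconvert --clear-output` do roughly this. A minimal stdlib-only sketch of the idea (real notebooks carry more metadata, e.g. kernel info, that those tools also normalise):

```python
import json

def strip_notebook_outputs(nb: dict) -> dict:
    """Remove outputs and execution counts from an .ipynb structure
    so that version control only sees source changes."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb

# Minimal hand-rolled notebook, just for illustration.
nb = {
    "nbformat": 4,
    "cells": [
        {"cell_type": "code", "source": ["print(1+1)"],
         "outputs": [{"output_type": "stream", "text": ["2\n"]}],
         "execution_count": 3},
        {"cell_type": "markdown", "source": ["# Notes"]},
    ],
}

clean = strip_notebook_outputs(nb)
# Sorted keys + fixed indent give stable, diffable text output.
print(json.dumps(clean, indent=1, sort_keys=True))
```

Hooked up as a git clean filter, something like this keeps the repo small and the diffs readable, at the cost of not versioning the computed outputs themselves.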

I use Mathematica daily and frequently store large-ish notebooks in Git. The format is textual, but the diffs are filled with a lot of noise.

as a practical matter, papers will remain relevant as long as they are the metric by which grant applications and tenure decisions are made.

as a philosophical matter, for computation heavy fields, i would love to see literate programming tools become de rigueur in the peer-reviewed distribution of results. In some fields (AI) this basically happens already — the blog post with code snippets and a link to arxiv at the end is a pretty common thing now.

Papers (and PDFs) are relevant because they are easy to organize and archive, essentially in perpetuity. Source code is too, so nothing wrong with a "Github for Science". Notebooks, blogs, or interactive dashboards, on the other hand, are an amazing tool both for research and for communication, but they are far more ephemeral than a paper. They need a large overhead to keep them running that cannot be sustained over decades or centuries. Typically, you'll have lots of trouble re-running a 5 year old notebook. That's not to say they're useless, e.g. as supplementary material (quite the opposite). They're just not going to replace papers anytime soon.

Those are good points. Journals have to care about this, too, these days -- supplementary information now routinely includes videos (hope you have the right codecs!), word documents, audio, and full or redacted data sets.

I think what appeals to me about literate programming style is that it encourages a return to a more clear and expository style of writing, which has been squeezed out of scientific writing in journals over the years. I don't care what instantiation is required to produce a more uniformly clear and cogent document, I just care that it happens.

As is usual for Stephen Wolfram, he has a point, and then blunts it by trying to own the whole thing. Edit: to expand, part of the answer to his question, why don't more people do this, is that it requires his expensive proprietary software. Scientific papers are (nominally at least) a commons.

Aside from my other comment, I think that any discussion about the scientific paper and the way knowledge is communicated is incomplete without a mention of Nick Sousanis' Unflattening. It is a thesis for a Doctor of Education degree about this very topic, that practises what it preaches by being written as a comic book.


This is a really interesting article. The use of Jupyter as a publication mechanism is a really neat idea! I think this path will be fruitful, and I am all for it. I do think however that some low-hanging fruit should be addressed in parallel - stuff that makes looking through the existing work a total pain:

* Date of publication and dates of research should be required in every paper. It's really difficult to trace out the research path if you start from Google or random papers you find in various archive searches. Yes, that info can be present, but often it's in the metadata where the PDF is linked rather than in the PDF itself. Even worse is getting "pubname, vol, issue" info rather than a year... now I have to track down when the publication started publishing, how they mark off volumes, and so on. I just want to know when the thing was published.

* Software versions used - if you are telling me about kernel modules or plugins/interfaces to existing software, I need to know the version to make my stuff work. Again - eventually it can be tracked down, but running a 'git bisect' on some source tree to find out when the code listings will compile is not OK.

* Actual permalinks to data, code, and other supplemental information. Some third-party escrow service is not a terrible idea, even. I hate trying to track down something from a paper only to find the link is dead and the info is no longer available or has moved several hours of Googling away.

shameless plug for the Popper Convention and CLI tool http://github.com/systemslab/popper . Our goal is to make writing papers as close as possible to writing software (DevOpsify academia) but in a domain-agnostic way.

I love science, but I have a lot of issues with it lately. I'm going to express some of them since they're related to this topic.

The basic function of a scientific paper is understanding and reproducibility (inspired by jfaucett's comment).

I wonder, is reproducibility necessary? Is it even possible when things get really complex? Isn't consensus enough? I feel in the field of psychology (and most social sciences) that is what happens. I suppose consensus can be easily gamed by publication bias and a whole slew of other things. So I suppose as jfaucett puts it, a "discover for yourself" type of thing should still be there. I wonder how qualitative research could be saved and if you could call it science. In Dutch it is all called "wetenschap" and "weten" means to know.

But how should we go about design then? HCI papers involve a lot of design decisions that are never justified. The paper is like: we built a system, it improved our user metrics. But is there any intuition or theory written down as to why they designed something a certain way? Not really.

I suppose one strong way to get reproducibility is by getting all the inputs needed. In a psychology study this means getting a dataset. Correlations are fuzzy but if I get the same answers out of the same dataset, then the claims must be true for that particular dataset.
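One cheap way to make "the same dataset" verifiable is to publish a cryptographic digest of the data file alongside the paper; anyone re-running the analysis can first check that they have byte-identical inputs. A minimal sketch (the filename below is made up):

```python
import hashlib

def dataset_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a data file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# If the digest printed here matches the one published in the paper,
# you are analysing byte-for-byte the same inputs the authors used.
# print(dataset_digest("survey_responses.csv"))
```

This doesn't solve access (you still need to get the file somehow), but it does pin down exactly which dataset the claims are about.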

Regarding design and qualitative studies, maybe, film everything? The general themes that everybody would agree upon watching everything would be the reproducible part of it?

Ok, I'll stop. The whole idea that a paper needs to satisfy the criterion of reproducibility confuses me when I look at what science is nowadays.

> I wonder, is reproducibility necessary? Is it even possible when things get really complex? Isn't consensus enough?

If my results are not reproducible, then I'm basically asking for your trust. So now, instead of actually doing the experiment, there's an incentive for me to forge my results. And they don't have to match reality anymore, by the way.

Gotta go ; back to working on my paper about psychic powers.

Sometimes you don't need to redo everything from scratch to change things.

There are a number of problems in scientific publishing. Two big ones are:

1) Distribution hurdles and paywalls imposed by rent seeking journals - who knows how much this has prevented innovation and scientific advancement in the last 20 years

2) Easily replicating experiments / easily verifying accuracy and significance of results - this is related to for instance making data used in research more easily accessible and making it easier to spot p-value hacking

Fixing these might not require a completely new format for papers. Or it could. I can envision solutions both ways.

I really like what the folks from Fermat's Library have been doing. They have been developing tools that are actually useful at the present time and push us in the right direction. I use their arXiv chrome extension https://fermatslibrary.com/librarian all the time for extracting references and bibtex. At the same time they are playing with entirely new concepts - they just posted a neat article on medium about a new unit for academic publishing https://medium.com/@fermatslibrary/a-new-unit-of-academic-pu...

> His secret weapon was his embrace of the computer at a time when most serious scientists thought computational work was beneath them.

They still think this.

Interesting timing: for the last two years I have worked for a research group headed by Sten Linnarsson at the Karolinska Institute[0]. I was specifically hired to build a data browser for a new file format for storing the ever-growing datasets[1][2][3]. The viewer is an SPA specialised in exploring the data on the fly, doing as much as possible client side while minimising the amount of data being transferred, and staying as data-agnostic as possible.

Linnarsson's group just pre-published a paper cataloguing all cell types in the mouse brain, classifying them based on gene expression[4][5]. The whole reason that I was hired was as an "experiment" to see if there was a way to make the enormous amount of data behind it more accessible for quick explorations than raw dumps of data. The viewer uses a lot of recent (as well as slightly-less-recent-but-underused) browser technologies.

Instead of downloading the full data set (which is typically around 28k genes by N cells, where N is in the tens to hundreds of thousands), only the general metadata plus requested genes are downloaded in the form of compressed JSON arrays containing raw numbers or strings. The viewer converts them to Typed Arrays (yes, even the string arrays) and then renders nearly everything on the fly client-side. This also makes it possible to interactively tweak view settings[6]. Because the viewer makes almost no assumptions about what the data represents, we recently re-used the scatterplot view to display individual cells in a tissue section[7].

Furthermore, this data is stored off-line through IndexedDB, so repeat viewings of the same dataset or specific genes within it does not require re-downloading the (meta)data. This minimises data transfer even further, and makes the whole thing a lot snappier (not to mention cheaper to host, which may matter if you're a small research group). The only reason it isn't completely offline-first is that using service workers is giving me weird interactions with react-router. Being the lone developer I have to prioritise other, more pressing bugs.

In the end however, the viewer is merely a complement to the full catalogue, which is set up with a DokuWiki[8]. No flashy bells and whistles there, but it works. For example, one can look up specific marker genes. It just uses a plugin to create a sortable table, which is established, stable technology that pretty much comes with DokuWiki[9][10]. The taxonomy tree is a simple static SVG above it. Since the expression data is known client-side to generate the table dynamically, we only need a tiny bit of JavaScript to turn that into an expression heatmap underneath the taxonomy tree. Simple and very effective, and it probably even works in IE8, if not further back. Meanwhile, I got myself into an incredibly complicated mess writing a scatterplotter with low-level sprite rendering and blitting and hand-crafted memoisation to minimise redraws[11].

Personally, I think there isn't enough praise for the pragmatic DokuWiki approach. My contract ends next week. I intend to keep contributing to the viewer, working out the (way too many) rough edges and small bugs that remain, but it won't be full-time. I hope someone will be able to maintain and develop this further. I think the DokuWiki has a better chance of still being on-line and working ten years from now.

[0] http://linnarssonlab.org/

[1] http://loompy.org/

[2] https://github.com/linnarsson-lab/loom-viewer

[3] http://loom.linnarssonlab.org/

[4] https://twitter.com/slinnarsson/status/981919808726892545

[5] https://www.biorxiv.org/content/early/2018/04/05/294918

[6] https://imgur.com/f6GpMZ1

[7] http://loom.linnarssonlab.org/dataset/cells/osmFISH/osmFISH_..., https://i.imgur.com/a7Mjyuu.png

[8] http://mousebrain.org/doku.php?id=start

[9] http://mousebrain.org/doku.php?id=genes:aw551984

[10] http://mousebrain.org/doku.php?id=genes:actb

[11] https://github.com/linnarsson-lab/loom-viewer/blob/master/cl...

This title is horrible hyperbole. Science is more than just machine learning. Hell, even if we just constrain ourselves to "computer science" probably half of it is just math, for which the scientific paper is definitely not "obsolete" nor even deficient in any way.

But outside of computer science you need laboratories to replicate experiments. Scientific papers are perfectly fine vehicles to record the necessary information to replicate experiments in this setting. Historically appendices are used for the extended details. And yes, replication is hard, but it's part of science.

What makes you think math papers wouldn’t benefit from having interactive bits in the middle? Obviously isn’t relevant for all math papers, but often would be extremely helpful. I read a lot of technical papers, and it is quite frequent that I will need to spend an hour or two decoding some formal mathematical statements whose basic idea/intuition could be more clearly conveyed pictorially in a few minutes.

Of course, making interactive diagrams often takes dramatically more work than sketching pictures with a pen (or just writing down equations), and mathematicians are not typically trained to do it, so it would be an uphill slog for many. But I would love it if there was more funding/prestige/etc. available for mathematicians to make their papers more accessible by adding better visuals.

OK, we'll hypobolize the title a bit.

But it would be better to react to the substance of the article, which is more interesting.


This reads like an ad for Mathematica

Swipes like this break the site guidelines, including these important ones:

"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

Would you please (re-)read https://news.ycombinator.com/newsguidelines.html and not post like this here?

At first, then it talks about Wolfram's idiosyncrasies - it is quite a rambling article, though there is a long bit about IPython/Jupyter.

You didn't read very carefully then.

Average clickbait

Maybe look a bit closer? I slog through "average clickbait" for hours a day in the hope of sparing this community of it, and can assure you that is far from the case.
