It took a lot of work to extract any information from a paper; I had to spend hours reading it. I was always impressed at folks who could just glance at a single figure, without referring to the methods, and could glean what the paper was trying to say.
However, after working through enough papers and replicating the authors' results, I came to learn a number of things: 1) papers are just written badly, and it's not because the authors are smart. It's because the authors are bad writers. 2) Most papers (in bio, I'd say about 90%) contain invalidating errors, which means that the figures and conclusions are worthless. It takes skilled readers to uncover methodological flaws (or infer them, as often not all the details are included).
After working in bio for a while, it was nice to be in ML because (at least it seemed) I could replicate most papers by downloading the GitHub repo, training on my local GPU for a few days, and then using the trained model to make the same predictions as the papers. Then I realized that, in most cases, what was being claimed was far more than what the trained model was actually capable of doing.
Now I stick to well-trod engineering literature that most people consider boring. In nearly all cases I can read the lit, repro the work, and get results that make sense (much of my work is ensuring that published benchmarks are reproducible).
The advice here is golden.
At times I had (still have?) a cynical look at scientific publishing, that it was a symptom of a larger problem: the Peter Principle writ large. Professors and researchers are selected for one thing (their scientific ability), but their major job duties are for something else (here, writing; but also teaching and leadership). Select for one trait, but the job description is for another, and the output is dismal.
To further extend my cynicism: why would a PI follow Cormac McCarthy's advice? Do we have evidence that better-written papers are more "successful," with more citations and shares? My argument (based only on my experiences) is for the opposite: the language is secondary. "Skip to the figures," as I have heard.
Perhaps you're saying as much, in between the lines, but I think there is a specific reason that modern scientific work has degraded in the way that it has. Most of it is rubbish, and the authors know that it's rubbish. They don't want to publish rubbish, but finding new, real, and meaningful science is incredibly difficult, yet they're expected to constantly publish or perish. And so their motivation is not to inform society and push scientific progress forward with a meaningful and relevant new discovery, but simply to produce something that hits enough checkboxes to get published and keep moving forward.
And so in this regard grandiloquent language, excessive jargon/vernacular, and an obfuscation of the fundamental points under the guise of intelligence work as a phenomenal tool. As the countless hoaxes (such as the Sokal affair) have shown, so long as you talk the talk and say something that is desired to be heard, you can get published even when what you submit is literally intentional nonsense. The title of Sokal's hoax paper was "Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity."
Sokal hypothesized he could get a paper that is literally nonsensical published in a leading journal if "(a) it sounded good and (b) it flattered the editors' ideological preconceptions." The paper led with: "There are many natural scientists, and especially physicists, who continue to reject the notion that the disciplines concerned with social and cultural criticism can have anything to contribute, except perhaps peripherally, to their research. Still less are they receptive to the idea that the very foundations of their worldview must be revised or rebuilt in the light of such criticism. Rather, they cling to the dogma..." Yeah, he's already 99% published there.
 - https://www.fourmilab.ch/etexts/einstein/specrel/www/
 - https://en.wikipedia.org/wiki/Sokal_affair
 - https://physics.nyu.edu/sokal/transgress_v2/transgress_v2_si...
As other commenters have noted, plain English is imprecise. When communicating with other scientists in the field it is often necessary to use language that conveys an exact meaning within their pre-existing body of knowledge.
If the audience is people without this knowledge, the precise meanings are useless and can be ignored, and rigorous quantifications are often unimportant. However, this is frustrating to scientists, who are then left wondering which of n different definitions of some property/parameter you are using and how this influences results. This is critical knowledge when internalizing and applying new information.
I am also adding in many "over-elaborations" because the reviewer had questions that a simple adjective or two would have answered. A paper absolutely ~is~ a dialog with potential questions, and nothing illustrates this more than peer review.
The two audiences require different papers, and peer reviewers tend to fall only into one of them.
In the early pages, there's a brief summary taking up about a quarter of the page.
Later, a 1-3 page article. Often the issue will contain multiple papers on the same subject; this article is a plain-English aggregation of those papers.
The research papers themselves are published later on.
For example, Nature published two recent papers on the determination of the thorium nucleus's excited state. The papers themselves would have meant very little without a physics degree or some serious determination, but the accompanying article explains all aspects of the papers well. It explains the context (why these papers are significant), how the two papers relate to each other, and the broader implications.
Having read the article, it's then much easier to comprehend the contents of the paper. Not easy, but easier.
Nature's not cheap (£198 a year), and it comes weekly which may be faster than most people can read it, but as a means of communicating the absolute cutting edge of all sciences I commend it.
You're saying that a bit under £4 a week isn't cheap? As a child my parents bought National Geographic monthly, which wasn't cheap where we lived, but it was well worth it. It threw me face first into a lot of scientific knowledge and amazing photography which I wouldn't have accessed through any other means. I wouldn't call £4 a week expensive.
A paper is not source code - it is a tool for scientists to share their discoveries with fellow humans, and fellow humans will always appreciate clearly expressed ideas. After all, what good is a tool for knowledge sharing that's accessible only to a few? Now, I've argued before that some papers are necessarily difficult (because the topic is very specialized) and I stand by it. At the same time I do think a good paper should attempt to reach as wide an audience as possible, and jargon goes against this concept. I think the joke that "Git gets easier once you get the basic idea that branches are homeomorphic endofunctors mapping submanifolds of a Hilbert space" illustrates this concept beautifully.
For non-toy examples, there are notes freely available on Donald Knuth's "Mathematical Writing" course. These notes are a must-read for scientists, and they include several examples of how replacing jargon with plain English can do wonders to improve readability. Page 7, in particular, explains how to rewrite a proof that "is mathematically correct (except for a minor slip) but stylistically atrocious".
To borrow your joke, please transform "homeomorphic endofunctors mapping submanifolds of a Hilbert space" into plain language with no loss of precision.
On one hand, writing that into plain language would take forever. By the time I'm done, I would have probably written a small book on algebra. Here, the use of jargon is definitely precise. If you wanted to present this in a topology conference, where it might be important that the number of dimensions of the space is infinite, it would be probably a good start.
On the other hand, thousands of people use git every day without understanding a single word of this description. But if I say instead "git is a distributed version-control system for tracking changes in source code during software development", my clarity has now skyrocketed. I definitely lost a lot of precision, but my readers will still be able to replicate my results.
That's not to say that there is no necessary jargon - my own definition uses "distributed version-control", which is absolutely key for understanding what's going on.
I've had this sort of trouble also. The way I was taught to resolve it was to define every bit of jargon you use, or at the very least refer to some previous work's definition of that term - _and then stick to that definition come what may_.
So if you use a term, even if you want to use it in a somewhat uncommon manner, define it, and if possible define it before using it. It's like coding, right? Er. In C, anyway, if memory serves.
With that in mind, this advice is really problematic:
> Try to avoid jargon, buzzwords or overly technical language. And don’t use the same word repeatedly — it’s boring.
Boring is better than incorrect! I love when authors clearly define a term and use it over and over, because I know exactly what they mean every time. Jargon may be necessary to communicate precisely (but define your jargon, avoid assuming it's familiar). E.g. mathematically, there is a difference between a ball and a sphere. And if your L2 ball becomes an L2 balloon in the middle of a proof, readers will be very confused.
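To spell out the ball/sphere distinction, here is a minimal sketch in LaTeX. The names B_r(c) and S_r(c), the center c, and the radius r are my own notation for illustration, and the snippet assumes the amsmath and amssymb packages are loaded. The closed L2 ball is the solid region; the sphere is only its boundary.

```latex
% Assumes \usepackage{amsmath,amssymb}; B_r(c) and S_r(c) are illustrative names.
\[
  B_r(c) = \{\, x \in \mathbb{R}^n : \lVert x - c \rVert_2 \le r \,\},
  \qquad
  S_r(c) = \{\, x \in \mathbb{R}^n : \lVert x - c \rVert_2 = r \,\}.
\]
```

Swap the two terms halfway through a proof and a bound that holds on the ball may be asserted only on its surface, or the other way round, which is exactly the kind of confusion a fixed definition prevents.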
Also, this is just dead wrong:
> Avoid placing equations in the middle of sentences. Mathematics is not the same as English, and we shouldn’t pretend it is.
I can see this being useful advice when a proof is full of dense lines of inequalities, but the reasoning is still incorrect. Mathematical notation is nothing more nor less than shorthand for English (or any other language). The following is fine for example: If f(x) >= 1, then either x = 0 or x \in [1,2]. So is this: Noting that x < 5, Pr[f(y) + x >= 3] <= 0.1. Perfectly grammatical.
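For what it's worth, here is roughly how those two sentences might be typeset, as a small self-contained LaTeX sketch (the document class and layout are my assumptions; the sentences are the ones above, unchanged). The formulas sit inline and the punctuation follows ordinary sentence rules.

```latex
\documentclass{article}
\begin{document}

% Inline math behaves like any other clause of the sentence.
If $f(x) \ge 1$, then either $x = 0$ or $x \in [1, 2]$.

% The probability statement is the main clause here.
Noting that $x < 5$, $\Pr[f(y) + x \ge 3] \le 0.1$.

\end{document}
```

Reading the result aloud ("if f of x is at least one, then either x equals zero or x lies in the interval from one to two") is a quick check that each sentence still parses as English.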
>> Avoid placing equations in the middle of sentences. Mathematics is not the same as English, and we shouldn’t pretend it is.
Yes, I am a mathematician and this advice goes against established practice and everything I've been taught.
Maybe correct for some fields though? I don't know. Needs a big disclaimer in any case.
> 1. Symbols in different formulas must be separated by words.
> 2. Don't start a sentence with a symbol.
13. Many readers will skim over formulas on their first reading of
your exposition. Therefore, your sentences should flow smoothly when
all but the simplest formulas are replaced by “blah” or some other
My writing has benefited greatly by treating all mathematical
expressions within sentences as nouns regardless of their relational
operators. My version of the sentence would be "Noting that x < 5
holds, we infer a probability of no more than 0.1 for f(y) + x taking
values of 3 or more.".
The advice I would give most science writers today is:
1. Have something to say.
2. Say it.
Many of the problems with overuse of jargon come from skipping directly to step 2.
For the more complicated equations, it becomes harder to unambiguously pronounce them, and the ideas are more separate anyway. Hence we need to break them out.
Moreover, I disagree that mathematical notation is a rigorous grammar. Notation is used to be concise, to represent ideas so we can work on them.
Precision is only added to resolve ambiguities that are hard to resolve based on the surrounding text.
One can make notation fully unambiguous, but that often comes at the cost of conciseness, and thus clarity.
Actually here you go, they repeat the same discussion in the preface of this free book:
See "Style", starting on page 4 of this PDF: http://assets.press.princeton.edu/chapters/gowers/gowers_VII...
More science authors should read writing advice like this.
Older papers that are considered notable today would probably be well-written compared to a random paper published last year, even if there weren't any group-level differences.
To be fair, given that the papers-published-per-scientist-per-year metric has been increasing (somewhat unhealthily, so it seems), I wouldn't be surprised if there has been at least some decrease in writing quality, simply due to less time spent editing and polishing.
It’s also interesting how formats have evolved. Older journals like Nature and Science used to have a “letters” section--essentially letters to the editor. They were brief (200-300 words) and started with “Dear sirs”, like an actual letter. Today, a “letter” in the same journals is essentially a whole (short, 2-3 page) paper, like this one: https://www.nature.com/articles/s41586-019-1578-4
Gödel, Escher, Bach is a book that was enthusiastically written with a word processor. And that explains a lot about it.
Nothing can single-handedly fix scientific (or technical, or academic) writing. Reading your own writing like a stranger is harder than reading your own code like a stranger. Aside from a few polymaths, the best technical/science writing takes careful, understanding partnerships.
That said, I think the bigger problem for science writing is how miserably disconnected pop-press coverage is from what is morally justifiable given the research design and results. Some of this is the fault of the scientists (and their reviewers), who don't always appear to see the limitations of their own results. More often, it seems to be about poorly-paid writers churning out clickbait from journal articles.
Unless (or until) we see fit to ensure our scientists (and inventors, and developers, and economists, ...) are polymaths capable of communicating their work lucidly, I think there's a lot of forsaken social value.
If it's worth giving a grant to study or build something, it's worth bolting a communication budget onto (not into) the grant to make sure the work is understood.
He's always been indifferent or even hostile toward financial success. He also doesn't really "teach" in a formal way - he just prefers hanging around with scientists, and they try to absorb his knowledge.
McCarthy has never shown interest in a steady job, a trait that seems to have annoyed both his ex-wives. "We lived in total poverty," says the second, Annie DeLisle, now a restaurateur in Florida. For nearly eight years they lived in a dairy barn outside Knoxville. "We were bathing in the lake," she says with some nostalgia. "Someone would call up and offer him $2,000 to come speak at a university about his books. And he would tell them that everything he had to say was there on the page. So we would eat beans for another week."
Well, are we going to (a) increase barriers to entry to the scientific professions until most scientists are also poets and warrior sages, or (b) punish bad paper writers with mandatory writing camps (that set back their scientific agendas), or (c) ???
Science papers are mostly made to be read by scientists.
> ... capable of communicating their work lucidly, I think there's a lot of forsaken social value.
You're tilting at windmills, here.
i. Never use a metaphor, simile, or other figure of speech which you are used to seeing in print.
ii. Never use a long word where a short one will do.
iii. If it is possible to cut a word out, always cut it out.
iv. Never use the passive where you can use the active.
v. Never use a foreign phrase, a scientific word, or a jargon word if you can think of an everyday English equivalent.
vi. Break any of these rules sooner than say anything outright barbarous.
From "The Science of Scientific Writing", 1990; there is a related video, [Judy Swan, Scientific Writing: Beyond Tips and Tricks](https://youtu.be/jLPCdDp_LE0)
educated, cultured, cultivated, learned, literate
In case you wondered like I did :-)
>> Minimize ... transition words — such as ‘however’ or ‘thus’ — so that the reader can focus on the main message <<
However, I'm not sure about this advice :) I remember reading a study showing that appropriate use of transition words is helpful in signposting how the coming sentence relates to the previous one, thereby reducing effort for the reader. What do others think?
More generally, words like "however", "thus", "although", "therefore", "indeed", and "notwithstanding" are crucial for helping the reader follow the thread of the argument.
> He recently spent four hours looking through the collected works of Lagrange trying to find the source of “Lagrange’s inequality,” but he was unsuccessful. Considering the benefit to future authors and readers, he’s not too unhappy with the new law.
The second sentence contrasts with the first, and with the rest of the paragraph, which I have omitted here. I had to read it twice to grasp its meaning. A "however" or a "nonetheless" is a nice hint to the lazy reader.
If you can, don't send your final draft off immediately - let it sit for a while and attend to something else before (hopefully) giving it a final read through from end to end, without stopping.
With regard to transitions specifically, the meaning of the text can depend on them - for example, 'therefore', 'on the other hand' and 'in other words' cannot be substituted for one another, and it is not always obvious from the context which reading is intended.
• Don’t slow the reader down. Avoid footnotes because they break the flow of
thoughts and send your eyes darting back and forth while your hands are
turning pages or clicking on links. Try to avoid jargon, buzzwords or overly
technical language. And don’t use the same word repeatedly — it’s boring.
• Don’t over-elaborate. Only use an adjective if it’s relevant. Your paper is
not a dialogue with the readers’ potential questions, so don’t go overboard
anticipating them. Don’t say the same thing in three different ways in any
single section. Don’t say both ‘elucidate’ and ‘elaborate’. Just choose one,
or you risk that your readers will give up.
In particular the first bit of advice is not good advice, to my bitter, bitter
experience. I used to do this in my papers: I would vary the terminology I
used to refer to the same concept throughout a paper, to make the text more
interesting. This was criticised very strongly and caused untold confusion
among reviewers, and the confusion caused my work to be rejected with very
strongly negative comments. And that's not just papers- I kept having to re-do
the reports on the progress of my research required by my university, because
the internal examiners were so confused by the way I was writing.
Then I had a single session with a woman from the university's center for
academic English and that was one of the first things she said to me: "use the
same term to refer to the same concept, throughout the text". It was like a
lightbulb went off and the next set of comments I got for my work was
overwhelmingly positive (with respect to my technical writing anyway). It was
just This One Simple Trick, right? But it made a world of difference.
And here's a more general bit of advice: if you ever enjoyed literature, or
indulged in writing your own, woe is you, o damned soul. Do not even consider
trying to write a technical paper as if it was literature. You will be damned
for ever to the eternal flames of damned damnation. Really- don't do it.
Literary writing is evocative, it creates fleeting impressions and conjures
emotions. Done right, no two readers will get the same impression from the
same literary text. But the same is death for technical writing. Technical
writing is precise and unambiguous. Everyone who reads it must understand it
and they must all understand the same thing. Don't mix the two kinds of
writing up or you'll make a mess. Like I have.
Pick one main audience, and a very few related audiences,
and write so that any of them can understand without
confusion or distraction. The red herring for an IPO
does not benefit from leet-speak, nor does the FAQ for
an online service benefit from legal disclaimers.
Defending against bad reading is impossible, and trying to do so in the original article is useless.
Bad reading is a huge problem for social media and internet forums. Bad readers are likely to be the first to comment on an article in forums and comment sections. They destroy discussion.
This is dangerous advice, and not just because (as others have noted) it can add ambiguity.
Finding alternatives to the simplest word for a concept will often narrow the audience. For a simple example, if I decide I've written "Venus" too many times and substitute "the second planet", I'm unnecessarily making it more difficult for people who don't happen to know the order of the planets.
If you're referring to Venus, then please ignore that advice and say "Venus" every time; don't annoy your readers by calling it the second planet, Aphrodite, the planet of love, the evening star, etc.
That is an example of the many severe issues with the advice in this article.
This avoids the elephant in the room: it's becoming mandatory to write in English for scientific communication. Won't anyone think about how this presents a risk of stagnation for scientific thought? (Starting with native English speakers, who never really have to learn another language...)
I agree with this. Newspaper style. But NeurIPS format will eat you alive if you follow this advice.