
Cormac McCarthy on how to write a science paper - danso
https://www.nature.com/articles/d41586-019-02918-5
======
dekhn
For many years I had imposter syndrome because I couldn't read papers.

It took a lot of work to extract any information from a paper; I had to spend
hours reading it. I was always impressed at folks who could just glance at a
single figure, without referring to the methods, and could glean what the
paper was trying to say.

However, after working through enough papers and replicating the results of
the authors, I came to learn a number of things- 1) papers are just written
badly and it's not because the authors are smart. It's because the authors are
bad writers. 2) most papers- in bio, I'd say about 90%- contain invalidating
errors which mean that the figures and conclusions are worthless. It takes
skilled readers to uncover methodological flaws (or infer them, as often not
all the details are included).

After working in bio for a while, it was nice to be in ML because - at least
it seemed- I could replicate most papers by downloading the github repo,
training on my local GPU for a few days, and then using the trained model to
make the same predictions as the papers. Then I realized- in most cases, what
was being claimed was far more than what the trained model was actually
capable of doing.

Now I stick to well-trod engineering literature that most people consider
boring. In nearly all cases I can read the lit, repro the work, and get
results that make sense (much of my work is ensuring that published benchmarks
are reproducible).

The advice here is golden.

~~~
rjf72
To first affirm what you're saying, this [1] is Einstein's revolutionary paper
on relativity. While people without a mathematical background may have some
difficulty following the technical components, what is said is plain enough
that a capable high school graduate ought to be able to, at least in general,
follow the logic and ideas. Indeed even for Einstein it wasn't that far
removed from high school. It was published when he was 26, and working as an
assistant examiner in a patent office - having been unable to find any
university that would take him on.

Perhaps you're saying as much, in between the lines, but I think there is a
specific reason that modern scientific work has degraded in the way that it
has. Most of it is rubbish, and the authors know that it's rubbish. They don't
want to publish rubbish, but finding new _real and meaningful_ science is
something _incredibly_ difficult, yet they're expected to constantly publish -
or perish. And so their motivation is not to inform society and help push
scientific progress forward by a meaningful and relevant new discovery, but
simply to publish something that can hit enough checkboxes to get published
and keep moving on forward with.

And so in this regard grandiloquent language, excessive jargon/vernacular, and
an obfuscation of the fundamental points under the guise of intelligence works
as a phenomenal tool. As the countless hoaxes (such as the Sokal affair [2])
have shown, so long as you talk the talk and say something that is desired to
be heard, you can get published even when what you submit is literally
intentional nonsense. Sokal's hoax paper's [3] title was "Transgressing the
Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity."

Sokal hypothesized he could get a paper that is literally nonsensical
published in a leading journal if " (a) it sounded good and (b) it flattered
the editors' ideological preconceptions." The paper led with: "There are many
natural scientists, and especially physicists, who continue to reject the
notion that the disciplines concerned with social and cultural criticism can
have anything to contribute, except perhaps peripherally, to their research.
Still less are they receptive to the idea that the very foundations of their
worldview must be revised or rebuilt in the light of such criticism. Rather,
they cling to the dogma...". Yeah, he's already 99% published there.

[1] -
[https://www.fourmilab.ch/etexts/einstein/specrel/www/](https://www.fourmilab.ch/etexts/einstein/specrel/www/)

[2] -
[https://en.wikipedia.org/wiki/Sokal_affair](https://en.wikipedia.org/wiki/Sokal_affair)

[3] -
[https://physics.nyu.edu/sokal/transgress_v2/transgress_v2_si...](https://physics.nyu.edu/sokal/transgress_v2/transgress_v2_singlefile.html)

~~~
dekhn
we are in violent agreement.

------
blix
I am editing a paper now where I am removing a lot of non-jargon statements
and replacing them with jargon. The reviewers were not pleased.

As other commenters have noted, plain English is imprecise. When communicating
with other scientists in the field it is often necessary to use language that
conveys an exact meaning within their pre-existing body of knowledge.

If the audience is people without this knowledge, the precise meanings are
useless and can be ignored and rigorous quantifications are often unimportant.
However, this is frustrating to scientists who are then left wondering which
of n different definitions of some property/parameter you are using and how
this influences results. This is critical knowledge when internalizing and
applying new information.

I am also adding in many "over-elaborations" because the reviewer had
questions that a simple adjective or two would have answered. A paper
absolutely ~is~ a dialog with potential questions, and nothing illustrates
this more than peer review.

The two audiences require different papers, and peer reviewers tend to fall
only into one of them.

~~~
ProstetnicJeltz
My parents have been subscribers to Nature for a while. They accomplish this
goal (communication to different audiences) very effectively.

In the early pages, there's a brief summary taking up about a quarter of the
page.

Later, a 1-3 page article. Often the issue will contain multiple papers on the
same subject; this article is a simple-English aggregation of those subjects.

The research papers themselves are published later on.

For example, Nature published two recent papers on the determination of the
Thorium nucleus's excited state. The papers themselves would have meant very
little without a physics degree or some serious determination, but the
accompanying article explains all aspects of the papers well. It explains the
context (why are these papers significant), how the two papers relate to each
other and the broader implications.

Having read the article, it's then much easier to comprehend the contents of
the paper. Not easy, but easier.

Nature's not cheap (£198 a year), and it comes weekly which may be faster than
most people can read it, but as a means of communicating the absolute cutting
edge of all sciences I commend it.

~~~
dekhn
Be aware though, Nature and Science both emphasize rapid publishing on "hot"
topics and many of the papers are just wrong. The work is accessible, but
misleading.

------
bo1024
This looks pretty good mostly, but I'd emphasize more that the most important
thing is to be understood. When it comes to a difficult mathematical concept,
this is actually really really hard. If you can nail that, then worry about
stylistic stuff, but you probably didn't nail it.

With that in mind, this advice is really problematic:

> Try to avoid jargon, buzzwords or overly technical language. And don’t use
> the same word repeatedly — it’s boring.

Boring is better than incorrect! I love when authors clearly define a term and
use it over and over, because I know exactly what they mean every time. Jargon
may be necessary to communicate precisely (but define your jargon, avoid
assuming it's familiar). E.g. mathematically, there is a difference between a
ball and a sphere. And if your L2 ball becomes an L2 balloon in the middle of
a proof, readers will be very confused.

Also, this is just dead wrong:

> Avoid placing equations in the middle of sentences. Mathematics is not the
> same as English, and we shouldn’t pretend it is.

I can see this being useful advice when a proof is full of dense lines of
inequalities, but the reasoning is still incorrect. Mathematical notation is
nothing more nor less than shorthand for English (or any other language). The
following is fine for example: If f(x) >= 1, then either x = 0 or x \in [1,2].
So is this: Noting that x < 5, Pr[f(y) + x >= 3] <= 0.1. Perfectly
grammatical.
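
As a rough sketch, here is how those two sentences might be typeset with
inline math in LaTeX. The document class is an assumption for illustration;
the symbols are just the ones from the examples above, and `\Pr` is the
standard LaTeX operator name for probability.

```latex
\documentclass{article}
\begin{document}
% Inline math sits inside the sentence and takes ordinary punctuation.
If $f(x) \geq 1$, then either $x = 0$ or $x \in [1,2]$.

% The formula acts as the main clause after the introductory phrase.
Noting that $x < 5$, $\Pr[f(y) + x \geq 3] \leq 0.1$.
\end{document}
```

Read aloud, both lines still parse as ordinary sentences, which is the point.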

~~~
failrate
Mathematical notation is not a shorthand for human language. It is a rigorous
grammar with properties that make it possible to express very unambiguous
qualities of a system.

~~~
rocqua
Mathematical notation can be spoken out loud. Hence, you can include it in a
sentence without breaking the flow of that sentence.

For the more complicated equations, it becomes harder to unambiguously
pronounce them, and the ideas are more separate anyway. Hence we need to break
them out.

Moreover, I disagree that mathematical notation is a rigorous grammar.
Notation is used to be concise, to represent ideas so we can work on them.
Precision is only added to resolve ambiguities that are hard to resolve based
on the surrounding text.

One can make notation fully unambiguous, but that often comes at the cost of
conciseness, and thus clarity.

------
suchire
One of my PhD classes was a paper reading seminar that spanned most of the
history of modern molecular biology, from 1959 to the present. It was starkly
apparent how much worse the writing quality became around the mid-1980s and
beyond. Compare the classic [“genetic code” paper by Francis
Crick]([https://moscow.sci-
hub.tw/1024/2cfa4b970e4a1fd816ee18afa95aa...](https://moscow.sci-
hub.tw/1024/2cfa4b970e4a1fd816ee18afa95aae98/crick1961.pdf)) with most modern
papers

More science authors should read writing advice like this.

~~~
dwaltrip
That sounds intriguing, although it strikes me that you might have been
observing a survivorship bias.

Older papers that are considered notable today would probably be well-written
compared to a random paper published last year, even if there weren't any
group-level differences.

To be fair, given that the papers-published-per-scientist-per-year metric has
been increasing (somewhat unhealthily, so it seems), I wouldn't be surprised
if there has been at least some decrease in writing quality, simply due to
less time spent editing and polishing.

~~~
suchire
It’s definitely possible, but I think there’s still a palpable difference
today. As an example, this work from 1989 on how telomeres work was awarded a
Nobel in 2009. It’s not unreadable, but it has a completely different (and I
would say worse) style compared to the Crick paper I linked earlier:
[https://moscow.sci-hub.tw/1036/810d9cdd2ea2e1a07d887dc432061...](https://moscow.sci-hub.tw/1036/810d9cdd2ea2e1a07d887dc432061543/greider1989.pdf)

It’s also interesting how formats have evolved. Older journals like Nature and
Science used to have a “letters” section--essentially letters to the editor.
They were brief (200-300 words) and started with “Dear sirs”, like an actual
letter. Today, a “letter” in the same journals is essentially a whole (short,
2-3 page) paper, like this one:
[https://www.nature.com/articles/s41586-019-1578-4](https://www.nature.com/articles/s41586-019-1578-4)

------
abathur
I'm happy Cormac does this work. I'm even happier that he pointed out a few
rules to break. This time invested to improve science writing by one of our
greatest living writers is a friendly/magnanimous gesture from a discipline
that is on the rocks. I'm not sure if Cormac teaches out of necessity or the
goodness of his heart, but count me even more despondent if he needs the
salary despite his commercial and critical success.

Nothing can single-handedly fix scientific (or technical, or academic)
writing. Reading your own writing like a stranger is harder than reading your
own code like a stranger. Aside from a few polymaths, the best
technical/science writing takes careful, understanding partnerships.

That said, I think the bigger problem for science writing is how miserably
disconnected pop-press coverage is from what is morally justifiable given the
research design and results. Some of this is the fault of the scientists (and
their reviewers), who don't always appear to see the limitations of their own
results. More often, it seems to be about poorly-paid writers churning out
clickbait from journal articles.

Unless (or until) we see fit to ensure our scientists (and inventors, and
developers, and economists, ...) are polymaths capable of communicating their
work lucidly, I think there's a lot of forsaken social value.

If it's worth giving a grant to study or build something, it's worth bolting a
communication budget onto (not into) the grant to make sure the work is
understood.

~~~
thanatropism
> Unless (or until) we see fit to ensure our scientists (and inventors, and
> developers, and economists, ...) are polymaths

Well, are we going to (a) increase barriers to entry to the scientific
professions until most scientists are also poets and warrior sages, or (b)
punish bad paper writers with mandatory writing camps (that set back their
scientific agendas), or (c) ???

Science papers are mostly made to be read by scientists.

~~~
abathur
You forgot to quote the part where I didn't advocate for harsh authoritarian
barriers or punishments for non-poet scientists:

> ... capable of communicating their work lucidly, I think there's a lot of
> forsaken social value.

~~~
thanatropism
Which leaves us option (a): bar bad communicators from the scientific
profession.

~~~
abathur
Is there any defensible reason you're determined to create an authoritarian
fetish out of my assertion that scientific research worth funding in the first
place is worth bundling with a communications budget to ensure it is
appropriately understood?

You're tilting at windmills, here.

------
sb057
For a slightly more lofty version, see Orwell's 'Politics and the English
Language'[1], and in particular its six rules:

i. Never use a metaphor, simile, or other figure of speech which you are used
to seeing in print.

ii. Never use a long word where a short one will do.

iii. If it is possible to cut a word out, always cut it out.

iv. Never use the passive where you can use the active.

v. Never use a foreign phrase, a scientific word, or a jargon word if you can
think of an everyday English equivalent.

vi. Break any of these rules sooner than say anything outright barbarous.

[1]:
[https://www.orwell.ru/library/essays/politics/english/e_poli...](https://www.orwell.ru/library/essays/politics/english/e_polit)

~~~
dllthomas
[https://languagelog.ldc.upenn.edu/nll/?p=992](https://languagelog.ldc.upenn.edu/nll/?p=992)

------
charlysl
_Science is often hard to read. Most people assume that its difficulties are
born out of necessity, out of the extreme complexity of scientific concepts,
data and analysis. We argue here that complexity of thought need not lead to
impenetrability of expression; we demonstrate a number of rhetorical
principles that can produce clarity in communication without oversimplifying
scientific issues. The results are substantive, not merely cosmetic: Improving
the quality of writing actually improves the quality of thought._

From "The Science of Scientific Writing", 1990; there is a related video,
[Judy Swan, Scientific Writing: Beyond Tips and
Tricks](https://youtu.be/jLPCdDp_LE0)

------
lewis500
I write little academic papers for a living as a professor. I think this is
good advice on how to write like Cormac McCarthy. More important is to pick an
academic writer whose journal articles you love and copy them. One of my
favorites is Paul Krugman. He’s not in my field but apparently he’s good
enough that people want to read him. He uses lots of semicolons and
punctuation. I understand where this advice is coming from, though. Lots of
academics are truly horrible writers and believe complication itself makes
their writing erudite.

~~~
froh
TIL > erudite

educated, cultured, cultivated, learned, literate

In case you wondered like I did :-)

~~~
booleandilemma
I first learned this word (and many others) from EverQuest, back in the day.

------
anonymousDan
Nice article.

>> Minimize ... transition words — such as ‘however’ or ‘thus’ — so that the
reader can focus on the main message <<

However, I'm not sure about this advice :) I remember reading a study that
appropriate use of transition words is helpful in signposting how the coming
sentence relates to the previous one, thereby reducing effort for the reader.
What do others think?

~~~
jessriedel
This was also the point that stood out to me as most wrong. If you have two
sentences, the meaning changes completely depending on whether the second
starts with "However" vs. "Thus". It does not seem like good writing practice
to me to eliminate them and force the reader to infer.

More generally, words like "however", "thus", "although", "therefore",
"indeed", and "notwithstanding" are crucial for helping the reader follow the
thread of the argument.

~~~
dmurray
This stood out to me reading the notes from Knuth's _Mathematical Writing_
seminar, linked elsewhere on this comment page. An example from page 21:

> He recently spent four hours looking through the collected works of Lagrange
> trying to find the source of “Lagrange’s inequality,” but he was
> unsuccessful. Considering the benefit to future authors and readers, he’s
> not too unhappy with the new law.

The second sentence contrasts with the first, and with the rest of the
paragraph, which I have omitted here. I had to read it twice to grasp its
meaning. A "however" or a "nonetheless" is a nice hint to the lazy reader.

~~~
jessriedel
Agreed, though I don't think it's a matter of laziness. Language is ambiguous
(garden path sentences, etc). Redundancy is necessary to ensure clarity,
assuming your goal is to be clear, rather than to entertain.

------
dmckeon
Know your primary audience. Allow for other audiences, but know who you are
writing to. The audience for a science paper may be other researchers in your
field, researchers in associated fields, or other fields, other scientists, or
a more general audience. Audiences for hackers might be other hackers,
programmers, computer scientists, users, investors, salespeople, technicians,
translators, etc.

Pick one main audience, and a very few related audiences, and write so that
any of them can understand without confusion or distraction. The red herring
for an IPO does not benefit from leet-speak, nor does the FAQ for an online
service benefit from legal disclaimers.

------
YeGoblynQueenne

      • Don’t slow the reader down. Avoid footnotes because they break the flow of
      thoughts and send your eyes darting back and forth while your hands are
      turning pages or clicking on links. Try to avoid jargon, buzzwords or overly
      technical language. And don’t use the same word repeatedly — it’s boring.
      
      • Don’t over-elaborate. Only use an adjective if it’s relevant. Your paper is
      not a dialogue with the readers’ potential questions, so don’t go overboard
      anticipating them. Don’t say the same thing in three different ways in any
      single section. Don’t say both ‘elucidate’ and ‘elaborate’. Just choose one,
      or you risk that your readers will give up.
    

"Don't use the same word repeatedly" and "don't say the same thing in three
different ways in any single section" are contradictory.

In particular the first bit of advice is not good advice, to my bitter, bitter
experience. I used to do this in my papers: I would vary the terminology I
used to refer to the same concept throughout a paper, to make the text more
interesting. This was criticised very strongly and caused untold confusion
among reviewers, and the confusion caused my work to be rejected with very
strongly negative comments. And that's not just papers- I kept having to re-do
the reports on the progress of my research required by my university, because
the internal examiners were so confused by the way I was writing.

Then I had a single session with a woman from the university's center for
academic English and that was one of the first things she said to me: "use the
same term to refer to the same concept, throughout the text". It was like a
lightbulb went off and the next set of comments I got for my work was
overwhelmingly positive (with respect to my technical writing anyway). It was
just This One Simple Trick, right? But it made a world of difference.

And here's a more general bit of advice: if you ever enjoyed literature, or
indulged in writing your own, woe is you, o damned soul. Do not even consider
trying to write a technical paper as if it was literature. You will be damned
for ever to the eternal flames of damned damnation. Really- don't do it.
Literary writing is evocative, it creates fleeting impressions and conjures
emotions. Done right, no two readers will get the same impression from the
same literary text. But the same is death for technical writing. Technical
writing is precise and unambiguous. Everyone who reads it must understand it
and they must all understand the same thing. Don't mix the two kinds of
writing up or you'll make a mess. Like I have.

------
INGELRII
> And don’t worry too much about readers who want to find a way to argue about
> every tangential point and list all possible qualifications for every
> statement. Just enjoy writing.

Trying to defend against bad reading is impossible, and trying to do so in the
original article is useless.

Bad reading is a huge problem for social media and internet forums. Bad readers
are likely to be the first to comment on an article in internet forums and
comment sections. They destroy discussion.

------
EdwardDiego
Is step 1 "Be as depressing as possible"? Full respect to McCarthy but damn
his books are hard work.

~~~
3stripe
Yes they’re dark but that just makes the glimpses of light shine brighter.

~~~
EdwardDiego
I think I really struggled with The Road because it really touched on every
primal insecurity in a parent - being unable to provide enough food or warmth,
unable to protect them.

------
save_ferris
This is some of the best writing advice I’ve read: straightforward and
incredibly digestible. All while totally vindicating me for hating so many of
the arbitrary style games I had to learn in school :)

------
mjw1007
« And don’t use the same word repeatedly — it’s boring. »

This is dangerous advice, and not just because (as others have noted) it can
add ambiguity.

Finding alternatives to the simplest word for a concept will often narrow the
audience. For a simple example, if I decide I've written "Venus" too many
times and substitute "the second planet", I'm unnecessarily making it more
difficult for people who don't happen to know the order of the planets.

~~~
herendin2
Their guidance might sound clear, but in fact "Venus" is not the kind of
repeated word they are talking about, because there's no unambiguous
substitute for it.

If you're referring to Venus, then please ignore that advice and say "Venus"
every time; don't annoy your readers by calling it the second planet,
Aphrodite, the planet of love, the evening star, etc. etc. ...

That is an example of the many severe issues with the advice in this article.

------
BlueTemplar
"Colloquial expressions can be good for this, but they shouldn’t be too
narrowly tied to a region."

Avoiding the elephant in the room that is how it's becoming mandatory to write
in English for scientific communication. Won't anyone think about how this
presents a risk of stagnation for scientific thought? (Starting with native
English speakers, who never really _have_ to learn another language...)

~~~
vharuck
There are some sayings that translate meaning well. For example, the Japanese
idiom "Even monkeys fall from trees" (i.e., even experts can make a mistake).
With context, it works with any audience who know monkeys live in trees.
That's a wide audience.

~~~
BlueTemplar
Well, sure, but my concern is _specifically_ about concepts that translate
_poorly_ (and for which, therefore, speaking multiple languages is
interesting...)

------
VarFarYonder
This lecture contains the best advice I've come across for academic writing:
[https://www.youtube.com/watch?v=vtIzMaLkCaM](https://www.youtube.com/watch?v=vtIzMaLkCaM)

------
a_imho
Would be great if he presented a paper rewritten according to his advice.

------
mordymoop
I wish there was some acknowledgement of the inevitable tension between
readability and thoroughness. Sometimes it is actually more important to be
thorough at the expense of readability.

------
blt
> _Limit each paragraph to a single message. A single sentence can be a
> paragraph._

I agree with this. Newspaper style. But NeurIPS format will eat you alive if
you follow this advice.

------
scelerat
This echoes much of the theme of Strunk & White "Elements of Style": write
concisely and directly.

------
moneil971
This is so good, great advice for making scientific or technical papers more
readable

