
The readability of scientific texts is decreasing - gfredtech
https://elifesciences.org/articles/27725
======
payne92
I liked PG's tweet about this: "Possible explanation: papers are becoming
less how you communicate ideas, and more how you register work to get credit
for it."

[https://twitter.com/i/web/status/906075608181915649](https://twitter.com/i/web/status/906075608181915649)

~~~
ahartmetz
The old problem "when you turn a metric into a goal, it ceases to be a good
metric". Scientists are rewarded for churning out a large number of
fragmentary (if you have a _really_ good idea, it's good for more than one
paper), shoddily written papers to collect citation points. "The stuff between
the formulas" is not something authors really care about.

I noticed the awful writing in new (physics) papers when I was still in
university and had to read a paper occasionally. I was always relieved to find
a paper from before roughly 1980. They are so much better written. Older
papers are often better written than most textbooks while recent papers are
much, much worse. It is _not_ just due to the subject matter changing.

Reportedly, journals also used to have paid copy editors. Today - well, you
have heard about Elsevier and other publishers, right? They don't only seem to
increase prices to increase their profits.

~~~
Spooky23
Writing ability is on the wane in general. In professional settings, IMO this
is mostly due to the disappearance of secretaries and clerical staff, lousy
education and more non-native speakers.

A long time ago as an intern, I had to re-typeset and format some reports
published by a .gov since the 1910s for reprint and web publishing. You would
see variations in style, but the reports started getting worse in the early
80s, and are almost incomprehensible today.

~~~
ahartmetz
It may also be because university education used to be for those with a great
interest in science, and I guess for some reasonably intelligent upper class
folks with nothing better to do. The first group had the motivation, the
second group the habit and education to express themselves well. Today it is
for those who want to get a good job. The education-for-a-job majority lowers
the standards for everyone.

Btw I'm a middle class guy who did it mostly for job prospects and such, so...

------
trevyn
This seems inherently linked to the increase in complexity and precision of
scientific ideas, particularly over the time scale investigated (back to
1881).

I certainly applaud efforts to manage this complexity (e.g. the article
mentions possibly adding "lay person summaries" in addition to abstracts), but
I think that increased complexity and depth of scientific results is the
intended outcome.

It seems analogous to these insane computer-generated proofs in mathematics --
maybe we need new tooling and approaches to make sense of them, but the fact
that they exist is proof that we're discovering things and moving forward.

~~~
FabHK
I don't buy that explanation - there's also just a lot of bad writing out
there today. And, I'd wager, you'll find (more frequently than before)
attempts to impress (and even obfuscate), rather than communicate and
elucidate.

Note also that your hypothesis was briefly addressed in the article:

> An alternative explanation for the main finding is that the cumulative
> growth of scientific knowledge makes an increasingly complex language
> necessary. This cannot be directly tested, but if this were to fully explain
> the trend, we would expect a greater diversity of vocabulary as science
> grows more specialized. While accounting for the original finding of the
> increase in difficult words and of syllable count, this would not explain
> the increase of general scientific jargon words (e.g. 'furthermore' or
> 'novel', Figure 6B). Thus, this possible explanation cannot fully account
> for our findings.

~~~
kemerover
Is "furthermore" scientific jargon? I thought it is just a professional
version of "moreover". Like, I would say "moreover" to a friend, but write
"furthermore" in a letter.

------
brudgers
I suspect that today scientific texts are more frequently written by non-
native English speakers and that technical jargon can be more precisely
defined and understood in terms of non-English languages. It is also worth
noting that scientific terms of 100 years ago are often more mainstream, e.g.
quantum mechanics, relativity, uncertainty, etc.

~~~
krick
I'm pretty sure this is not the case. Scientific texts are usually awfully bad
in any language I speak, including my two native languages. In fact,
"scientific paper style" language is so distinct, that it can be easily
applied to a paper that is not "scientific" per se, essentially obfuscating
it.

It seems to be a very unpopular opinion here on HN, but it seems clear to me
that academia has become some kind of pathological structure, which exists to
exist: essentially an organism. It feeds on government and students' money,
as long as it can seem credible and necessary to the outside world; and
people within this structure can live as long as they show that they are not
"slacking" somehow (publishing papers, giving lectures), but they don't
actually need (nor often want) to produce any _result_. Form has become more
important than the essence.

So, here we have it: the form of a "scientific paper".

~~~
spaceseaman
The casual nature with which you dismiss thousands of people who have devoted
their lives to the pursuit of academic knowledge is frustrating and incredibly
childish.

Like seriously? Do you think all of us PhDs just sit around and laugh about
how dumb the government is while planning how we can make more useless papers?
Do you think grad students just train to become better bullshitters? Like what
kind of hole do you live in where any of that even appears close to reality?

Seems to me you just tried reading a couple papers, got confused, and decided
to justify your own intelligence by claiming their hard work and advancements
in the field were pointless.

> but they don't actually need (nor often want) to produce any result

Stop talking out of your rear end. I want to produce a result. My advisor
wants to produce a result. We all want to have meaning in our lives.

EDIT: As just a casual example of the importance of academia, look at the Deep
Learning boom. First researched around 1960, only now becoming practical and
useful. It's almost as if the things academics study need time and a lot of
work to actually come to fruition. Research level study is highly expensive
for very little real gain. This has always been the case. It takes a long,
long time for the investments into research to make sense, but when they do,
the advancements are completely game-changing.

~~~
jsharf
I mean you're totally 100% right, but I don't think he wrote that with the
intent of harm. I think in his post is a more serious issue. Academic papers
are incredibly hard to understand. I received an undergrad degree in CS and
spend a lot of time teaching myself theoretical CS. I often struggle to read
CS papers, and find it hard to imagine many of them being useful to anyone
without a PhD. For instance I was reading a paper on computer vision which
used a Bayesian Network to predict a depth map from a regular 2D color image.
After spending several weeks, I felt like I understood the article almost well
enough to implement it. Thing is, there were some steps in the learning
algorithm which relied on a specific kind of optimization (I forget which),
but they didn't specify which parameters were used, so it wasn't obvious how
to implement it to me. My friend has a PhD in EE (on the theoretical side, his
education focused heavily on statistics and optimization) and he only looked
at it for 30m-1hr, but he wasn't sure what they meant either.

I can always just fall back to gradient descent instead of the optimization
parts of the algorithm I don't understand, but that impacts performance, and I
got the feeling I'd spend hours implementing this only to have something which
can't perform in real-time so I gave up. It was just a side project. It's
frustrating after devoting weeks to this. Reading and attempting to implement
a research paper in my own field is incredibly hard.

I feel like if research papers were written in a more accessible, less dry
manner and specified details important to implementation, the world could be a
better place. If it were easier to implement these things as side projects,
you might have people experimenting with these things at home and then
starting businesses out of them.

~~~
jampekka
As a PhD student doing computational stuff, I agree with this fully. I'd say
that in many, probably the majority, of cases it is not possible to implement the
algorithm based on just the article. Many necessary corner cases are omitted from
the paper, and if pseudocode is given, it may have huge steps handwaved with
just a line like "optimize this functional".

It's one of the most mindboggling things (the publishing racket is perhaps
worse) about academia that CS papers introducing an algorithm aren't required
to publish their implementation, even though there necessarily seems to be one
for doing e.g. simulation studies.

I don't really understand the rationale behind omitting the implementation.
Maybe people write such crappy code that they're ashamed of publishing it. Or
it's the more sinister scenario that the algorithm is actually crappier than
the paper claims.

~~~
AstralStorm
The latter has happened quite a few times in my experience. The algorithm
performed as designed on the training or example set, only to fail terribly on
real-life data.

Or someone handwaved important things like having an information function be
available (impossible in practice; it is used only in proofs of correctness, so
the way of estimating it is critical).

Or a key assumption on input was just mentioned somewhere in the depths of the
paper.

Or a very specific way of measuring the result hides the deficiencies.
(Similar to p-hacking or misusing stats in medicine.)

------
SubiculumCode
Once upon a time a top scientist could contribute to biology, chemistry, and
physics, yet today this is extremely unlikely. All the knowledge we accumulate
and build on needs names, structures, and nuance. You can't just throw that all
away and describe what you are doing or thinking limited to a basic high school
level vocabulary.

The problems we face are harder, more complex, and more esoteric than ever
before, and it is amazing.

~~~
AstralStorm
Albeit things like "managerial summary" (not ELI5 but quite close) are very
much needed. Abstracts tend to not deliver this. More importantly, often key
limitations are overlooked in both to make the research look way more
groundbreaking than it really is.

Caveat lector: speaking mostly for computer science and medicine.

------
coliveira
It is interesting to finally realize that a large part of the audience on this
web site is really anti-science. News items that in some way attack mainstream
science are commented on very positively, as if scientists were secretly
trying to make their own work less available and obscure on purpose.

~~~
coldtea
Or maybe some are defensive of bad science and can't stand to see news that
attacks bad practices in mainstream science being lauded?

Being anti- the bad versions/practices of something is not the same as being
anti- that something. (The same way a whistleblower cop is not anti-police).

There's this story Brecht wrote about a guy telling another: "I'm an enemy of
newspapers. I want them closed down". To which the other guy replied, "I'm an
even bigger enemy of newspapers. I want better newspapers".

> _as if scientists were secretly trying to make their own work less available
> and obscure on purpose._

If that gives them an advantage (e.g. publishing lots of BS papers and
advancing their careers) then they are (and we know they are, meta studies and
experienced academics say and show so). There's nothing of the "Area
51/Illuminati" kind of conspiracy thinking about this, if that's what you
imply.

Rather it is the classic self-advancement BS that goes on since the world
started, where people exploit loopholes and cheat to get ahead. That includes
scientists, especially in today's publish or perish climate.

~~~
FabHK
This. I am against bad science and bad scientific writing. There is plenty of
both (just follow @RealPeerReview on twitter if you need convincing, but be
prepared to suffer).

That doesn't mean at all that I'm against science. In fact, the opposite is
the case – I'm against bad science, pseudo-science, and bad science writing
because I'm so ardently in favour of good science.

~~~
coliveira
And who is to define bad science? People who, by their own words, were not
able to read scientific papers because they are "difficult to understand"? I
am not saying that the specific language used in scientific literature is an
advantage, but it also doesn't constitute a problem in itself. Unless you are
able to read those papers and point out where the "bad science" is, this is
just a vacuous statement.

Also, pointing at failure points in the peer review process is ridiculous.
This is not religion where you need to uphold every word. Science is made with
lots of ideas that, looked at from far ahead, are incorrect. "How did that ever
clear the review process?" Well, those reviews only cover the minimum
necessary for something to be published, it is not a guarantee that the
contents are correct. It is the social process of science that takes care of
that.

------
chwahoo
"Furthermore", "novel", "distinct" are scientific jargon that reduce
readability? Seems like a non-problem to me.

~~~
AstralStorm
Usually it is the omissions that hurt. Rarely bad sentence structure.

Harder jargon sometimes does so too, especially custom, underdefined
notation.

------
Kepler-431c
I'm building a site (not launched yet) to try to help with this problem:
[https://www.wikipaper.org](https://www.wikipaper.org)

It's not just readability that's a problem, it's also that relevant code,
data, etc are scattered all over the internet.

Another problem is that many people struggle with English, which is the de facto
language for research. This introduces two problems: first, one must learn to
speak English; then, one must learn how to trawl through academicese.

There have been various similar projects to wikipaper in the past, but one of
the reasons that they fail is that not many researchers have the spare time to
contribute to a project like this. I spoke to the guys running Google Scholar
and they told me this is the biggest problem. I believe I have an innovative
way to solve this problem, however; anyone who wants more details, please get in
touch.

I believe that making it easier to understand research going on in neighboring
fields will dramatically increase the rate at which research is performed.

------
fghtr
The concept of Research Distillation could be a solution to make the texts
more accessible to scientists.

[https://distill.pub/2017/research-debt/](https://distill.pub/2017/research-debt/) via
[https://news.ycombinator.com/item?id=13932806](https://news.ycombinator.com/item?id=13932806)

------
karmakaze
If you consider that current research is both broader overall and
narrower/deeper for a given publication, it follows naturally that the
language of the text targets a smaller audience. Is it conceivable that we can
maintain general comprehensibility and continue to expand and refine
knowledge?

I could see that it's possible to write in a style with more analogies or
illustrative language but I can't tell from the article if this factors into
the observed trend.

------
rdlecler1
There's strong selection for obfuscation. If you make your research easy to
understand it's easier for peer reviewers to poke holes in it. If you make it
difficult to understand then peer reviewers don't want to look dumb by asking
too many questions. Just make sure you sound smart.

------
bitL
We need a neural network to translate from a scientific to normal writing
style. Only half-joking here... It can train itself on latest Deep Learning
papers too!

~~~
lstamour
With what source material for the normal writing style? If it trains based on
news coverage all it would do is pick phrases from the abstract and maybe use
a thesaurus, with no accuracy guarantees...

------
gaius
As is the reproducibility. Mere correlation or is there a causal link?

[http://www.bbc.com/news/science-environment-39054778](http://www.bbc.com/news/science-environment-39054778)

------
bocklund
It looks like they only looked at abstracts. I'd argue that they should be
more complex and that abstract readability doesn't correlate with the
readability in the text.

You can communicate the big ideas and importance in the necessary context in
the abstract. These are built on top of many small simple ones. Back in the
day 'We measured this thing' was plenty for a journal paper because there was
so much disagreement that each result was novel and interesting.

Now I still believe that the quality of writing has decreased, but my point is
that you can have a complicated abstract for an idea that is extremely clear
in the text.

~~~
jlg23
> It looks like they only looked at abstracts.

From the text: "We then validated abstract readability against full text
readability, demonstrating that it is a suitable approximation for comparing
main texts."

~~~
AstralStorm
And they perpetuate bad science by not saying how they validated it...

(Either they just ran the algorithm on some full papers, which is just as
worthless, or did "by inspection", which requires additional qualification and
exact results they assigned.)

------
seanwilson
Are readability scores generally a good metric to follow when authoring
content? For example, would you pay attention to it for a landing page or a
blog post?
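
For context on what such a score measures, here is a minimal sketch of Flesch Reading Ease, one widely used readability formula. The function names and the vowel-group syllable counter are assumptions made for illustration; real tools use dictionaries or better heuristics, so treat the numbers as rough.

```python
import re


def count_syllables(word: str) -> int:
    """Crude estimate: count runs of vowels (an assumed heuristic, not exact)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


simple = flesch_reading_ease("The cat sat.")
dense = flesch_reading_ease(
    "Furthermore, novel methodologies demonstrate considerable heterogeneity."
)
print(simple, dense)
```

The score only sees sentence length and syllable counts, so it rewards short words and short sentences and says nothing about omitted steps or undefined notation.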

------
gervase
Discussions of the relative merit of academia's current state, and
specifically the publication process, pop up from time to time around here, so
I won't rehash those points here.

Additionally, I think the article's author (and other neighboring posts here)
bring up valid points regarding the escalating complexity of science and
potential correlations between that complexity and the written complexity
required to communicate it. I think the article about the ABC conjecture [0]
posted earlier today [1] is a perfect example of this.

However, I would like to pose another suggestion that may play a role in this
effect.

It is easy to see how a paper's acceptance in a journal or conference serves
as an evolutionary pressure on the author's style; in other words, one of the
reward functions for a paper's style is defined by its ability to be published
(since higher publication count correlates with higher funding availability,
for better or for worse).

With such a function in place, it makes sense that papers will start to
exhibit evolutionary traits (styles) that promote survival irrespective of
their practical or functional benefits. Let us also consider the committee
review process as part of our environment: several humans must decide whether
your paper will be published or not, based on its domain novelty. There are 4
possible outcomes:

1) Paper is novel, reviewers understand it; outcome, publication (weight=1).

2) Paper is novel, reviewers don't understand it; outcome, possible
publication (weight=0.5).

3) Paper is not novel, reviewers understand it; outcome, no publication
(weight=0).

4) Paper is not novel, reviewers don't understand it; outcome, possible
publication (weight=0.5).

Therefore, if you're publishing something, and either (A) you know it's not
very novel, or (B) you're not sure how novel other people will think it will
be, it's in your interest to obfuscate your paper as much as possible.
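
The arithmetic behind that conclusion can be sketched as follows. The weights are the ones from the four cases above; treating novelty and reviewer understanding as independent probabilities is an added assumption, and the names are purely illustrative.

```python
# Weights from the four cases above, keyed by (novel, understood).
WEIGHTS = {
    (True, True): 1.0,    # case 1: publication
    (True, False): 0.5,   # case 2: possible publication
    (False, True): 0.0,   # case 3: no publication
    (False, False): 0.5,  # case 4: possible publication
}


def publication_weight(p_novel: float, p_understood: float) -> float:
    """Expected weight, assuming novelty and understanding are independent."""
    total = 0.0
    for (novel, understood), weight in WEIGHTS.items():
        p_n = p_novel if novel else 1.0 - p_novel
        p_u = p_understood if understood else 1.0 - p_understood
        total += p_n * p_u * weight
    return total


# A paper the author thinks has a 20% chance of being judged novel:
clear = publication_weight(0.2, 1.0)       # reviewers always understand it
obfuscated = publication_weight(0.2, 0.0)  # reviewers never understand it
print(clear, obfuscated)
```

Under these assumptions the expected weight simplifies to p_novel * p_understood + 0.5 * (1 - p_understood), so being understood only helps when p_novel is above 0.5; below that, obfuscation raises the expected outcome.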

Additionally, for Cases 2 & 4, the weights probably trend even higher. Human
vanity may produce an outcome closer to "I don't understand it, therefore it
may be over my head; I will therefore convince myself it is a good paper. Weak
recommend!" at a higher rate than "I don't understand it; I will ask for
clarification from the author or the rest of the committee, at the risk of
appearing foolish in front of my colleagues."

If these interpretations are true, then the parent article's results are not
particularly surprising, just depressing (from the perspective of "academia as
human progress engine").

[0]:
[https://en.wikipedia.org/wiki/Abc_conjecture](https://en.wikipedia.org/wiki/Abc_conjecture)
[1]:
[https://news.ycombinator.com/item?id=15206540](https://news.ycombinator.com/item?id=15206540)

~~~
SubiculumCode
The simplest explanation is that we've built up large bodies of specialist
knowledge that can't just be reduced to "Rock falls down, gravity. huh."

Take biology, and look at this depiction of metabolic pathways:
[https://img.scoop.it/B6Bp6lXGshGZEPRnL8_P4YXXXL4j3HpexhjNOf_...](https://img.scoop.it/B6Bp6lXGshGZEPRnL8_P4YXXXL4j3HpexhjNOf_P3YmryPKwJ94QGRtDb3Sbc6KY)

How do you bring that complexity down so that someone coming out of high school
can avoid having to learn a lot about it first, and still transmit something
useful to other specialists? Otherwise, what we'd get is another episode of
NOVA.

~~~
ahartmetz
Give a big picture overview, categorize similar things, focus (omit
unnecessary detail), mention similarities to known systems, don't assume
highly specialized knowledge just to save a little space - the usual
techniques to explain things!

Edit: If you look at a text, you can easily see whether or not the author can
or wants to make themselves clear. Papers from after roughly 1980 just don't
give that impression anymore. You also know the difference between good and
bad software documentation when you see it, if that is closer to your area of
expertise. I think your explanation is plausible, but wrong.

~~~
SubiculumCode
That is fine for reviews. I'm not sure that is appropriate or even possible
in empirical reports.

