
Evidence-based Software Engineering – book beta - todsacerdoti
http://shape-of-code.coding-guidelines.com/2020/06/30/beta-evidence-based-software-engineering-book/
======
Mekantis
It makes me wonder. Is there enough research and sufficient good data to make
evidence-based software engineering possible? It's not something that comes up
much, and whenever I do see research, it usually feels disconnected from the
realities of software development. That said, I'm going to read this and see
if there's anything valuable to learn.

~~~
agentultra
I don't think there is a lot of data from what I've seen. This book claims to
pull together some 620 data sets for analysis. I have only glanced over a few
sections that interest me and I don't see citations to some papers that I've
found interesting.

It'll be interesting to see if the author can pull it together in an engaging
way.

There is some effort in IEEE and ACM circles to pull together research like
this as well. Worth trialing a membership to find out.

------
doctor_eval
It’s not a complete work yet.

FTA:

> The aim of only discussing a topic if public data is available, has been
> slightly bent in places (because I thought data would turn up, and it
> didn’t, or I wanted to connect two datasets, or I have not yet deleted what
> has been written).

> The outcome of these two aims is that the flow of discussion is very
> disjoint, even disconnected. Another reason might be that I have not yet
> figured out how to connect the material in a sensible way. I’m the first
> person to go through this exercise, so I have no idea where it’s going.

~~~
pieterk
One can keep writing forever, but I think... If you can't capture it in 314
pages, you might be writing less of a universal book than intended.

~~~
65536
Writing a book is not just adding pages, it’s also revisiting and revising
what you’ve already written.

So it’s entirely possible that at some point the author ends up capturing what
they wanted to capture, and the total page count is then even lower than it is
today.

------
m0rc
First of all, I think that the idea and the effort behind the book are
fantastic and worth undertaking.

The issue for me is that the connection between the individual topics and
their relevance for Software Engineering is not clear or present (I have no
doubt that the author sees the relation as obvious).

I would have liked more elaboration of the relation of each topic (in a given
section there can be several) with respect to SE, and its positive and
negative implications for its practice.

In a similar line, the following two sources are worth reading:

* Facts and Fallacies of Software Engineering. I'm typically surprised when
some IT person tells me that he does not even know of its existence.

* IEEE Voice of Evidence Articles. They are reviews of existing evidence on
various topics.

------
cmehdy
It seems like a daunting task to take on so much, so props for the work put
into this. It does kind of read like a beta, with the exception of the bigger
thread (which makes it feel a bit more like an alpha). If that's okay, I'd
like to offer my thoughts on this.

What I'm reading from the chapters and the first few pages (dense in
information!), is that there are initial assumptions and an over-arching
structure that would benefit from being highlighted.

The undocumented assumptions are about both the audience and your own
approach: Where are your target readers at, and who are they in broad terms?
What prerequisites are necessary for this book to be manageable? Since it
covers multiple fields, it would be possible to point to other sources
(whether online or physical books or classes) that offer the fundamentals, up
to perhaps even a parallel to what you're putting together.

The unannounced structure seems (as of right now) to be:

(1) Notes about stuff

(2) We're dealing with humans

(3) We're dealing with humans in a system

(4) Related and/or other relevant systems

(5) What do we do as an engineer in this mess?

(6-7-8) How does that practically unravel?

(9-10) Here's a crash course in mathematical stuff we'll need later

(11) Regression. A lot of it.

(12) Some other stuff than Regression to think about too

(13-14) Practical applications of the last 4 chapters we've been going through

(15) Let's introduce R now!

I think it would read better if there was some progression along those lines:

(1) Tells you what this book is at its core, then very briefly what's going to
hit you, and which assumptions you will make and what prereqs your readers
might need/rely on

(1-bis) Marked optional: Historical stuff, contextual stuff not directly
related to the book's core but useful nonetheless. Perhaps the reader will
want to take a look at that at their own leisure.

(2) Bring in the practical tool you have a preference for (R), since it's both
useful for data manipulation/visualization and programming concepts. Nothing
big, the learning can and will happen throughout the book.

Now you can leverage that tool to illustrate (and low-key practice) the
concepts that you introduce about various fields. The next few chapters don't
need to change too much but they can be made much more interactive by using R,
jupyter notebooks, etc.

(3) Who do we build stuff for? (Humans)

(4) Humans don't live in a vacuum, so let's consider their system

(5) (Perhaps marked optional) Those systems coexist and interact with other
systems. Why it's relevant to know about a few.

(6) What do we do as an engineer? (Build, design, creatively, etc)

(7) Basically your 6-7-8 with some fun R stuff

--- And I'd honestly end there, and separate the rest into its own book.

That book can be about introducing the maths, with more or less optional
parts/prereqs, leveraging R, introducing the bits of data manipulation that
are deemed useful, and so on.

------
thesz
A quote from the book:

 _Some languages support the creation of executable code at runtime, e.g.,
concatenating characters to build a sequence corresponding to an executable
statement and then calling a function that interprets the string just as-if it
appeared in a source file._

I took the time to search for any mention of higher-order functions and
combinators (as in "parsing combinators"). There are none.
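
For readers unfamiliar with the distinction, here is a minimal Python sketch (not from the book; all function names are hypothetical) contrasting the two styles: building a statement as a string and evaluating it at runtime, versus getting the same behaviour from a higher-order function and a small combinator.

```python
# Hypothetical illustration; none of these names come from the book.

def make_adder_via_eval(n):
    # Concatenate characters into source text, then ask the interpreter
    # to evaluate that text as if it had appeared in a source file.
    source = "lambda x: x + " + repr(n)
    return eval(source)

def make_adder_via_closure(n):
    # Higher-order function: returns a new function directly,
    # without generating or interpreting any source text.
    def add(x):
        return x + n
    return add

def compose(f, g):
    # A simple combinator: compose(f, g)(x) is f(g(x)).
    return lambda x: f(g(x))

add_two_eval = make_adder_via_eval(2)
add_two_closure = make_adder_via_closure(2)
add_four = compose(add_two_eval, add_two_closure)
```

Both adders behave the same when called; the difference is that the closure version never passes through a string, so there is nothing to interpret at runtime.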

The book is way out of date right now, even before publishing.

~~~
disgruntledphd2
Note that he's requiring there to be published data/research before including
something, so that may account for what you believe is missing.

~~~
thesz
Should I do his work here then? Search for publications, etc.

~~~
kqr
Publications with proper controlled experiments of adequate sample size. I'm
fairly sure you'll find none.

~~~
thesz
"Of adequate sample size," of course.

Let me quote the book again:

 _A study by Iivonen analysed the defect detection performance of those
involved in testing software at several companies. Table 2.8 shows the number
of defects detected by six testers (all but the first column, show
percentages), along with self-classification of seriousness, followed by the
default status assigned by others._

Six testers. Six!

Here's a small study on the expressiveness of different programming
languages:
[http://www.cs.stir.ac.uk/~kjt/techreps/pdf/TR141.pdf](http://www.cs.stir.ac.uk/~kjt/techreps/pdf/TR141.pdf)

Also contains six entries, if you include Haskell.

Expressiveness of a language is measured by the Halstead metric, which
predicts the density of defects in a linear fashion:
[https://ieeexplore.ieee.org/document/8447959](https://ieeexplore.ieee.org/document/8447959)
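
As a rough illustration of what the Halstead measures look like, here is a minimal Python sketch using the standard textbook definitions. It assumes the operator and operand tokens have already been extracted by some tokenizer (not shown), and the function name `halstead` is hypothetical. The bug estimate uses the commonly cited B = V / 3000; whether the linked paper uses that exact model is not checked here.

```python
import math

def halstead(operators, operands):
    """Compute basic Halstead measures from pre-extracted token lists.

    operators, operands: lists of token strings (tokenization is assumed
    to have been done elsewhere).
    """
    n1, n2 = len(set(operators)), len(set(operands))  # distinct counts
    N1, N2 = len(operators), len(operands)            # total counts
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 > 0 else 0.0
    effort = difficulty * volume
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
        # Commonly cited estimate, linear in volume (an assumption here).
        "estimated_bugs": volume / 3000,
    }

# Tokens for the single statement `a = b + c`.
metrics = halstead(operators=["=", "+"], operands=["a", "b", "c"])
```

Because the bug estimate is a constant multiple of volume, a language that expresses the same program in fewer tokens gets a proportionally lower estimate under this model.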

The book quotes neither Personal Software Process nor Team Software Process
research. It also does not mention at all a very basic fact about the cost of
software development: the cost to fix a defect is proportional to the time
between the defect's introduction and its discovery. I am sure there are
plenty of publications with research results on this:
[https://www.researchgate.net/figure/Cost-of-Fixing-a-Defect-...](https://www.researchgate.net/figure/Cost-of-Fixing-a-Defect-10_fig4_228658128)

I have a strong feeling I am doing the job of the author here. It is very sad.

------
baxter001
This is a _very_ heterogeneous text, pulling in the psychology of perception,
economic theory, a description of how regressions work and static analysis of
a whole host of programming languages along with R snippets on how to perform
the analysis.

Strikes me more like a manic episode than a useful text.

~~~
jointpdf
You could make your point a lot more constructively if you omitted the
denigration of the author’s substantial effort and people who suffer genuine
manic episodes (which tend not to be this pretty).

~~~
iratewizard
It might come off as an attack on the author, but to me it's a succinct
description. To the author, maybe it will be constructive criticism (if they
are even reading). In general, people who set out to create something grand or
complex fall short. Successful complex projects I've seen have all started off
simple and grown complex slowly out of necessity.

------
andrewcooke
this is excellent, except that it doesn't read so well.

~~~
jonpurdy
The topic is directly related to my interests as a PM but it goes way over my
head quickly. Looking forward to seeing the iterations though!

