

What makes software engineering for climate models different? (2010) - dodders
http://www.easterbrook.ca/steve/2010/03/what-makes-software-engineering-for-climate-models-different/

======
greenyoda
Make sure to read the thoughtful criticism in the comments below the article,
which adds a lot of perspective. The people who wrote
the comments seem to know more about software engineering than the original
author, and argue convincingly that climate model code is no different from
other code: it needs to be modular and well-written so that it can be
understood and maintained; it needs to be tested to ensure that it's working
correctly; etc.
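
As a concrete illustration of what the commenters mean by testing, here is a
toy regression-test sketch in Python; the advect step and its tolerance are
hypothetical stand-ins, not taken from any actual climate model:

    # Toy regression test for a model component. `advect` is a
    # hypothetical stand-in, not code from any real climate model.
    import numpy as np

    def advect(q, u, dx, dt):
        """One upwind advection step for a periodic 1-D tracer field q."""
        return q - u * dt / dx * (q - np.roll(q, 1))

    def test_advect_conserves_mass():
        q = np.random.rand(100)
        q_next = advect(q, u=1.0, dx=1.0, dt=0.5)
        # With periodic boundaries, the step should conserve total tracer mass.
        assert abs(q_next.sum() - q.sum()) < 1e-12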

The author seems to have a rather cavalier attitude about the correctness of
code, which one of the commenters (George Crews) picked up on:

 _" Then there is the statement: 'The software has huge societal importance,
but the impact of software errors is very limited.' I don’t see how it can be
both ways. How can something be of great importance whether or not it is
correct? IMHO, the most serious consequence of a climate software being
defective would be to then use it to make a defective political decision
costing trillions of dollars to society."_

~~~
msandford
The author of the article seems to think that those involved are all unique
butterflies simply because they write software in an academic environment.
Sure, writing software in academia is a lot different than in a business
setting. But guess what?! There's a TON of software written in academia!

The vast majority of models that ever get written, like nuclear simulation
models or geological models, are written in academia. In college I worked in
high performance computing, which led me to write academic software for blast
simulations and for genetic sequence alignment. Both shared lots of the traits
that the author outlined, but neither had anything to do with climate.

Just because your situation isn't a business doesn't mean you're unique.

~~~
derefr
The nice thing about working within _science_ specifically, though, is that
one paper standing on its own is pretty meaningless. If you get your model
wrong, that'll show up when a meta-analysis of your paper (with its model)
against several others (with their own models) drops yours as an outlier.
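
A toy sketch of that mechanism, with invented numbers: given several models'
estimates of the same quantity, a robust spread statistic flags the one that
disagrees.

    # Toy outlier check across an ensemble of model estimates.
    # All numbers here are invented for illustration only.
    import numpy as np

    estimates = {"model_a": 2.9, "model_b": 3.1, "model_c": 3.0,
                 "model_d": 3.2, "model_e": 7.5}

    values = np.array(list(estimates.values()))
    median = np.median(values)
    mad = np.median(np.abs(values - median))  # robust spread estimate

    for name, v in estimates.items():
        # 1.4826 scales MAD to a standard-deviation equivalent.
        if abs(v - median) > 3 * 1.4826 * mad:
            print(name, "is an outlier:", v, "vs median", median)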

------
sfrechtling
The context is also highly political. The results can be misconstrued and used
to fit a storyline. The author states that the software has a huge societal
impact, which is true, but I feel like climate change models matter more to
politicians than to the majority of citizens.

------
EpicEng
“developers are domain experts – they do not delegate programming tasks to
programmers, which means they avoid the misunderstandings of the requirements
common in many software projects”

This also likely means that these systems are poorly designed, poorly
documented, poorly implemented, and poorly understood.

This is only my assumption based upon my reading of this article, but this
assumption does come with some experience. I have worked in biotech my entire
career, often alongside domain experts who write code. This code is typically,
to be kind, not very good. It is rife with bugs, makes too many assumptions,
is difficult to understand (and, as a consequence, its limitations and
assumptions are not well understood by its users), full of copied code, etc.
etc. etc. I have rewritten a few such systems.

Also, comments like the following are scary, and I can only hope that they are
not indicative of the general attitude in the field (though, based upon my own
experience, they may very well be):

"The software has huge societal importance, but the impact of software errors
is very limited."

Yikes. The entire statement seems like a non sequitur to me, but that attitude
leads down a dangerous road. So we have potentially buggy systems which output
data used in studies which have "huge societal impact"? How can the author
make the claim that software errors don't have an appreciable impact on the
result if the system was not developed using standard, accepted engineering
practices? How do the users know how to correctly interpret the data?

As an analogy, I recently rewrote an imaging and image processing system used
by my company. This system was designed and implemented by academics, and it
exhibited all of the problems we in the software industry typically associate
with such code.

While rewriting it from scratch, I had no documentation to rely upon. I found
many implicit and explicit assumptions that the users were not aware of. Most
importantly, the system was originally designed for enumeration of certain
types of cells, but not for any sort of quantitative interpretation.

However, down the road, the users in the lab realized that the raw output of
the image analysis process could contain useful information. So, they started
mining it. They began comparing samples using various measurements taken
during analysis. They began making even more assumptions about what that data
meant, but they were often wrong.

On the surface, it seemed as though their work made sense, but only if one did
not understand _how_ those numbers were gathered and _under what
circumstances_ their interpretation was valid. Some of the statements made by
the author show a striking resemblance to the opinions of the original authors
of the system I had to rewrite.
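
To make the failure mode concrete, here is a deliberately simplified Python
sketch of the pattern; the names and the threshold are invented, not the
actual system:

    # Simplified, invented sketch of the pattern described above.
    # The threshold was tuned so that cell COUNTS come out right; the
    # intermediate intensity values were never calibrated, so comparing
    # them across samples (as the lab later did) is not valid.
    INTENSITY_THRESHOLD = 0.42  # implicit assumption: tuned for counting only

    def analyze(image_pixels):
        cells = [p for p in image_pixels if p > INTENSITY_THRESHOLD]
        return {
            "cell_count": len(cells),  # what the system was designed for
            # Raw, uncalibrated by-product that later got mined:
            "mean_intensity": sum(cells) / len(cells) if cells else 0.0,
        }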

These people were smart, very smart, but not engineers. They didn't have the
discipline, training, or experience required to write a system that would
stand up to scrutiny. It was a research vehicle, and it did what it was
originally intended to do, but as time passed, warts appeared.

I find it very hard to believe that climate model programming has even a
single characteristic that would cause an engineer to think that a different
engineering model was required or even warranted. To me, this sounds like
people in the research/academic camp making statements about an aspect of
engineering that they do not understand.

