Why bad scientific code beats code following “best practices” (2014) (yosefk.com)
283 points by ingve 329 days ago | 261 comments



I think there is a growing rebellion against the kind of software development "best practices" that result in the kind of problems noted in the article. I see senior developers in the game industry coming out against sacred principles like object orientation and function size limits. A few examples:

Casey Muratori on "Compression Oriented Programming": https://mollyrocket.com/casey/stream_0019.html

John Carmack on inlined code: http://number-none.com/blow/john_carmack_on_inlined_code.htm...

Mike Acton on "Data-Oriented Design and C++" [video]: https://www.youtube.com/watch?v=rX0ItVEVjHc

Jonathan Blow on Software Quality [video]: https://www.youtube.com/watch?v=k56wra39lwA


For ages now, I've been telling people that the best code, produced by the most experienced people, tends to look like novice code that happens to work --- no unnecessary abstractions, limited anticipated extensibility points, encapsulation only where it makes sense. "Best practices", blindly applied, need to die. The GoF book is a bestiary, not an example of sterling software design. IME, it's much more expensive to deal with unnecessary abstraction than to add abstractions as necessary.

People, think for yourselves! Don't just blindly do what some "Effective $Language" book tells you to do.

(For starters, stop blindly making getters and setters for data fields! Public access is okay! If you really need some kind of access logic, change the damn field name, and the compiler will tell you all the places you need to update.)


>the best code, produced by the most experienced people, tends to look like novice code that happens to work

Could it be that best practices are designed to make sure mediocre programmers working together produce decent code?

After all, actual novice programmers write code similar to the best programmers except that it doesn't work.


"It took me four years to paint like Raphael, but a lifetime to paint like a child." - Picasso


> Could it be that best practices are designed to make sure mediocre programmers working together produce decent code?

Yes, it is, but the issue is that the industry should move away from the idea that software can be done on an assembly line. It is better to have a few highly qualified people capable of writing complex software than a thousand mediocre programmers that use GoF patterns everywhere.


Why? The assembly line doesn't need elite hackers, and there aren't that many elite hackers around. There is a place for cost-effective development.


I'm a security analyst, not a developer so please forgive my ignorance, but how do you become one of the highly qualified coders without first spending some time being a mediocre developer?


You don't. There's no way around having to spend some time in the trenches --- but we can at least minimize the amount of time developers spend in the sophomoric architecture-astronaut intermediate phase by not glorifying excess complexity. Ultimately, though, there's not a good way to teach good taste except to provide good examples.


"Best" practices may be a misnomer, but I don't believe it's possible to execute large projects without some kind of standardization. It is inevitable that in some cases that standardization will hinder the optimal strategy, but it will still have a net positive impact. Perhaps if we started calling it "standard practices" people would stop acting like it's a groundbreaking revelation to point out that these practices are not always ideal.


I think it's more that the best practices were adopted for specific reasons, but they are not understood or transmitted in a way that makes those reasons very clear. That is, a 'best practice' tends to solve a specific kind of problem under a certain set of constraints.

Nobody remembers what the original constraints are or if they even apply in their current situation, even if they are actually trying to solve the same problem, which they might not be.


This spirals as well: I've spent quite a lot of time recently helping people at the early stages of learning programming, and it's taken me a while to stop becoming frustrated with their misunderstandings. Sometimes it is just because something basic isn't clicking for them, but a lot of the time it's down to me trying to explain things through a prism of accumulated patterns that are seen as best practice/common sense, but which, stepping back and viewing objectively, are opaque and sometimes nonsensical outside of very specific scenarios. There's a tendency to massively overcomplicate, but you forget very quickly how complicated; then you build further patterns to deal with the complexity, ad infinitum.


Yep, the biggest issue we have in software is that every developer knows the Single Responsibility Principle, and no developer knows why. Everyone knows to decouple and increase cohesion, but few, if any, know what cohesion is.


That's it. "Best practices" is, essentially, coding bureaucracy. That's not a pejorative; bureaucracy is quite necessary.

I have a rough idea of "why OO?" but in practice, it can be pretty hard on things like certain kinds of scaling, projects that require a goodly amount of serialization/configurability and the like.


There is a spectrum from "it just works" to "I've applied every best practice". Bad programmers who spend huge amounts of time on best practices will end up wasting a lot of time over-optimizing, but it may have the benefit of reducing risk where they lack deep understanding, or it may shine a light on that risk; that is, if they apply the practices correctly.


You're ignoring how "best practices" frequently add negative value. Design style isn't a trade-off between proper design and expedience. It's a matter of experience, taste, and parsimony.


Those "Effective $Language" books make arguments for their advice, which you are free to find compelling, or not. Back when I first read Effective Java, maybe about a decade ago, I thought it was over-complicated hogwash that I knew better than. When I read it again starting about a year ago, I found the arguments for most of its advice very compelling, based on problems I've run into time and again in my own, and other people's, code, and not just in Java. YMMV I suppose!

The backlash against "engineered" software definitely seems real, and I think that's great - questioning assumptions is critical - but I think a lot of the insinuations about peoples' motivations and talents are unnecessary and honestly kind of silly. Most of us are just trying to find ways to avoid issues we've seen become problems in other projects in the past, it's not some nefarious conspiracy against simple code.


Are you seriously saying that the best code is an untestable mess of big God classes? Because in my experience this is by far the type of code written by inexperienced programmers. Abstractions and interfaces are the best way to make a system testable and extensible, and it has nothing to do with using a pattern just because you read about it in the GoF book 5 minutes ago. And using public fields in a non-trivial project is a sure recipe for disaster.


> using public fields in a non-trivial project is a sure recipe for disaster.

This is just dogma. Every Python project in existence has 100% public fields. Some are disasters, some are beautiful. Only a Sith deals in absolutes.


  Better than ugly, beautiful is.
  Better than implicit, explicit is.
  Better than complex, simple is.
  Better than complicated, complex is.
  Better than nested, flat is.
  Better than dense, sparse is.
  Counts, readability does.
  Special enough to break the rules, special cases are not.
  But beaten by practicality, purity is.
  Silently passed, an error should never be.
  Unless explicitly, is it silenced.
  In the face of ambiguity, the temptation to guess, refuse you must.
  One, preferably only one, way to do it, there should be.
  Not obvious, it might be. 
  Better than never, is now.
  But often better than right now, is never.
  If hard to explain, bad it is.
  If easy to explain, good it may be.
  Namespaces are a honking good idea - more of them we should do!
-- The Zen of Python, Yoda


> Namespaces are a honking good idea - we should do more of them!

That should be:

   english.namespaces english.verbs.are english.articles.a american.english.vernacular.adjective.honking ...


No, that's Java. Flat is better than nested: no self-respecting Python programmer would write a hierarchy that deep.


No, Java looks like the famous quotation about the nail and the horse from:

http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom...


So which rule takes precedence here? Namespaces are obviously not flat; at least, they're more nested than not having namespaces at all.

That's the problem I have with people praising the Zen of Python as if it means anything. It's like a bible: you just pick the verse you like to justify your actions, even if it conflicts with other verses. Then you praise the whole thing for being so wise.


It means a lot: good python code follows it. However, yes, some of the verses conflict, because it turns out that good advice sometimes contradicts itself. All I can say is don't go too far in either direction.


Much of the point of the Zen of Python is that it's self-contradictory.


Balanced, you must be.


You would prefer PHP[1] where the standard library has no namespaces?

[1] I refer to "classic" PHP. No clue if anything PHP5+ fixed this, though I doubt that they would make such a breaking change even across major revisions.


Yes, I would. And though PHP is widely agreed to be a piece of shit (it seems; I don't work with PHP, so I'm only relaying this popular sentiment), that doesn't tarnish the idea by association (which is what I sense you might be trying to do).

ISO C and POSIX also have a flat library namespace, together with the programs written on top. Yet, people write big applications and everything is cool. Another example is that every darned non-static file-scope identifier in the Linux kernel that isn't in a module is in the same global namespace.

Namespaces are uglifying and an idiotic solution in search of a problem. They amount to run-time cutting and pasting (one of the things which the article author is against). Because if you have some foo.bar.baz, often things are configured in the program so that just the short name baz is used in a given scope. So effectively the language is gluing together "foo.bar" and "baz" to resolve the unqualified reference. The result is that when you see "baz" in the code, you don't know which "baz" in what namespace that is.
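To make the run-time cut-and-paste concrete, here's a small Python sketch of the same complaint (Python chosen just for illustration): an unqualified import hides which namespace a short name came from, while the qualified form keeps it visible at the call site.

```python
from os.path import join   # "join" now floats free of its namespace;
                           # a reader must trace the import to see which join

import os.path             # the prefixed style argued for above,
                           # analogous to sem_open vs. plain open

# Both resolve to the same function; only the second says so at the call site.
assert join("a", "b") == os.path.join("a", "b")
```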

The ISO C + POSIX solution is far better: read, fread, aio_read, open, fopen, sem_open, ...

You never set up a scope where "sem_" is implicit so that "open" means "sem_open".

Just use "sem_open" when you want "sem_open". Then I can put the cursor on it and get a man page in one keystroke.

Keep the prefixes short and sweet and everything is cool.

I was a big believer in namespaces 20 years ago when they started to be used in C++. I believed the spiel about it providing isolation for large scale projects. I don't believe it that much any more, because projects in un-namespaced C have gotten a lot larger since then, and the sky did not fall.

Scoping is the real solution. Componentize the software. Keep the component-private identifiers completely private. (For instance, if you're making shared libs, don't export any non-API symbols for dynamic linking at all.) Expose only API's with a well-considered naming scheme that is unlikely to clash with anything.


PHP namespacing in many ways ruined the language. I'm not just talking about the poorly chosen use of the backslash '\' path separator, or the fact that namespaces aren't automatically inferred, which forces me to write "use" endless times at the top of the file and destroys my productivity when working outside an IDE.

I'm talking about the heart of PHP which is stream processing. Why in the world would you destroy the notion of simply including other source files in order to wedge in this C++ centric notion of a namespace? Before "namespace" and "use", the idea was that all of the files included together can be treated as one large file, and sadly that conceptual simplicity has been lost.

Also the lost opportunity of having objects be associative arrays like Javascript, combined with namespacing, have convinced me that perhaps PHP should be forked to a language more in-line with its roots. I haven't tried Hack or PHP7 yet but I am apprehensive that they probably make things worse in their own ways.

I think of PHP as a not-just-write-only version of Perl, lacking the learning curve of Ruby, with far surpassed forgiveness and access to system APIs over Javascript/NodeJS. Which is why it's still my favorite language, even though the curators have been asleep at the wheel at the most basic levels.


The standard library isn't namespaced. If you want to use a stdlib function inside an NS, you can without issue. If you want to use a stdlib class, or any top level class, it needs backslash before its name or you need to import it.

There is a community move towards a standard namespacing with PHP-FIG. This is useful and we are seeing lots of progress on internals thanks to the work by the community.

Like a lot of things, PHP namespaces are or were a big mess. Lots of progress is being made to improve them, though I think a lot of mess must remain.

I've thought about creating a project that packages up various categories in the stdlib into namespaces with consistent inputs and outputs. That would be nice but the process isn't a lot of fun.


(an attempt to translate to standard english grammar, for the benefit of other non-native English readers, who may also struggle to parse this)

  Beautiful is better than ugly.
  Explicit is better than implicit.
  Simple is better than complex.
  Complex is better than complicated.
  Flat is better than nested.
  Sparse is better than dense.
  Readability counts.
  Special cases are not special enough to break the rules.
  But purity is beaten by practicality.
  An error should never be silently passed.
  Unless it is silenced explicitly.
  You must refuse the temptation to guess in the face of ambiguity.
  There should be one, preferably only one, way to do it.
  It might not be obvious. 
  Now is better than never.
  But never is often better than right now.
  It is bad if it is hard to explain.
  It may be good if it is easy to explain.
  Namespaces are a honking good idea - we should do more of them!


Instead of offering your own translation, we can go back to the original English. Run 'python -m this' and you will see:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren't special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one-- and preferably only one --obvious way to do it.

Although that way may not be obvious at first unless you're Dutch.

Now is better than never.

Although never is often better than right now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!


Now those are some best practices I can live by.


I think the advice is more relevant in the context of the specific language; it's not universal.

In python you can go from attribute access to using a property (getter/setter) without breaking anything.

The same is not true in a language like Java where obj.foo is always a direct field access distinct from calling a method like obj.getFoo(), so going from public fields to getters is not backwards compatible and can be painful.
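As a small sketch of that difference (the `Account` class here is invented for illustration), Python lets access logic be retrofitted behind an existing attribute name without touching callers:

```python
class Account:
    """`balance` began life as a plain public attribute; the property was
    added later, and callers still read and write `acct.balance` unchanged."""

    def __init__(self, balance):
        self.balance = balance  # routed through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value
```

In Java, the same change would turn every `obj.balance` field access into an `obj.getBalance()` call, which breaks existing callers.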


>In python you can go from attribute access to using a property (getter/setter) without breaking anything.

True, but that should be avoided if possible. Python 'properties' violate the principle of "explicit is better than implicit". Once you realize "Oops, I need an accessor function here", the lazy programmer says "Aww, grepping for all uses of .foo and replacing them with .getFoo() will take 20 minutes. Instead, I'll just redefine it as a property and no one will notice." If you care about quality, go the extra mile: make it clear to the people reading your code that a function is being called.

Properties are a kinda nice language feature, but they are so frequently misused that I think the language would have been better off without them. They encourage bad habits.


However, it's still a simple enough change that you shouldn't build getter/setters unless you're already pretty sure it'll be changed. OTOH, if you're using a language with generated getter/setters (ruby, smalltalk, lisp), just do it.


Using find & sed is not painful.

I use getters and setters, but still think you will waste more time arguing about this issue than leaving it be and finding out you have to change them down the line.

Edit: also, especially for getters, I like my accessors to be simple accessors. Hiding too much code behind them can be unpleasantly surprising, so as they deviate further from accessors I like to rename them - e.g. CalculateXxx rather than GetXxx. Fewer surprises. Given that, I potentially have an issue with continuing to call it GetXxx or SetXxx in the face of certain changes.


> Using find & sed is not painful.

Using an IDE with an understanding of the language you have is even less painful. Having the IDE automatically refactor a public field to use getters and setters is a breeze with managed languages like Java and C#.


The same can be said about internal interfaces with only one implementation. If there really is a need for an interface, why not add it later, when it's actually needed, instead of creating additional overhead for something we may never need?


If there are inexperienced programmers working on it, as in my preamble, then almost surely it is in a language that doesn't prohibit mutating objects indiscriminately. In F#, Haskell, and other languages that enforce immutability it obviously isn't a problem; in Python it most surely is.


Eh. Inexperienced programmers will find a way to screw things up somehow. Probably by over-engineering, like the original article describes.


Code can be underabstracted, but it can also be overabstracted - and abstracted with the wrong abstractions. And fixing the latter sometimes involves a temporary stay at "untestable mess of big God classes" when you remove the bad abstractions to clear the way for creating better ones. Not because it's the best code - far from it - but because it's slightly less terrible code.

> And using public fields in a non-trivial project is a sure recipe for disaster.

Ergo, all non trivial C projects are disasters? Well, maybe, but I disagree on the reasons.

Language-enforced encapsulation is a useful tool, but some people take it to the deep end and assume that if their math library's 3D vector doesn't hide "float z;" in favor of "void set_z(float new_z);" and "float get_z() const;" (one of which will probably have a copy+paste bug because hey, more stupid boilerplate code), they'll have a sure recipe for disaster. Which I suspect you'd agree is nonsense - but would also follow from reading your words a little too literally.
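Translated into Python for brevity (a hypothetical `Vec3`), the contrast being parodied is roughly:

```python
class Vec3:
    """Plain public fields: this is the entire class."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

class Vec3Encapsulated:
    """The boilerplate alternative, shown for just one of the three fields."""
    def __init__(self, x, y, z):
        self._x, self._y, self._z = x, y, z

    def get_z(self):
        return self._z

    def set_z(self, new_z):
        self._z = new_z
    # ...plus four more near-identical methods, each a copy+paste hazard
```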


In my experience, in quite big projects, people had a tendency to mutate objects in the wrong place and for the wrong reasons. A field encapsulated behind a getter without a setter certainly helps.
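A minimal Python sketch of that idea (names invented here): a getter with no setter keeps the field readable everywhere but makes unplanned mutation an error.

```python
class Order:
    def __init__(self, items):
        self._items = tuple(items)  # immutable defensive copy

    @property
    def total(self):
        # readable anywhere; with no setter defined, `order.total = x` raises
        return sum(self._items)
```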


But if it has to be able to be mutated in some cases, you're stuck. This is a place where Scheme's parameterize may be useful: Define a closure with the setter in scope, and write it so that when it's called with a lambda as an arg, it will raise an error unless those certain conditions are met, in which case, it will use parameterize to make the setter available in the lambda's scope. Or I suppose it could pass it in, which would be slightly simpler...


That would be a final field. But generally I agree with you, and it's by far the pattern I use most.


The issue is if you have, for example, a class encapsulating a bunch of flags, allowing the user either to set each flag separately or to set a bitfield representing all of them.

In C, you might be able to do some magic, but in Java, you’ll need setters and getters there – you can’t even do final fields there.

Luckily, in C#, you can just use accessors, and maybe in Java with Lombok, too.
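Sketched in Python for concreteness (flag names invented), properties over a shared bitfield give both views without explicit getFoo/setFoo pairs, which is roughly what C# accessors provide:

```python
class Flags:
    READ, WRITE = 1, 2  # bit masks

    def __init__(self, bits=0):
        self.bits = bits  # the whole bitfield, settable at once

    def _get(self, mask):
        return bool(self.bits & mask)

    def _set(self, mask, on):
        self.bits = (self.bits | mask) if on else (self.bits & ~mask)

    # each flag is a property view over the shared bitfield
    read = property(lambda s: s._get(s.READ), lambda s, v: s._set(s.READ, v))
    write = property(lambda s: s._get(s.WRITE), lambda s, v: s._set(s.WRITE, v))
```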


I think he is rather referring to cascades of function calls/types and "custom solutions". Of course abstractions are a great tool when used properly. Thinking of a best practice from Object oriented design: low coupling but high cohesion.

When you use abstractions, you get low coupling. But like everything in life, this has a price: you may need an extra line of code to instantiate the abstraction, and sometimes you have to write an extra getter because it's not perfect yet. That's a fine price to pay unless you use way too many abstractions and they are nested deeply. It may have some aesthetics, but it can be hell to debug and make code overly complicated to extend.

So that's why one must also focus on high cohesion. I really like the modern JavaScript way, the imports/requires are for low coupling and the build automation is for cohesion. Anyways, these things are best practices as well but not sure if the author took those into account... ;)


I don't think that's seriously what he's saying. You've over-interpreted to project the "logical" extreme of the POV expressed on the author. This is a trope in online debate that needs to die.


I've returned recently ( for domain-specific reasons ) to sets of (opaque) god-classes with good interfaces when needed. No state in a god-class is visible except through the interface.

Since the domain requires a great deal of serialization, the interfaces are usually strings over that. In some cases, it's even easier to just open a socket to the serialization interface.

So far, I'm able to pub/sub periodic data streams, but it'd be pretty easy to add "wake me when" operators and such.

It forces the use of one central table mapping names to callbacks (which can be grown dynamically), but it's very, very nice to work with.

YMMV. The domain is instrumentation and industrial control, which just fits this pattern nicely. All use cases can be, and are, specified as message sequences.


I've been using the delegation pattern a lot lately as a nice way to combine the best bits of god classes (few dependencies) with the best of SRP (small easy to test parts).

This way A, B, C (and many others) only depends on X, the delegator (which exposes a number of interfaces for practically everything), but X depends on everything and the kitchen sink, X contains no real functionality.
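A minimal Python sketch of that arrangement (class names invented): each small part stays independently testable, while A, B, and C would depend only on the delegator's facade.

```python
class Storage:
    def save(self, item):
        return f"saved {item}"

class Mailer:
    def notify(self, msg):
        return f"mailed {msg}"

class App:
    """The delegator: one facade that owns the parts but holds no real logic."""
    def __init__(self):
        self._storage = Storage()
        self._mailer = Mailer()

    def save(self, item):          # pure pass-through
        return self._storage.save(item)

    def notify(self, msg):         # pure pass-through
        return self._mailer.notify(msg)
```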


"Delegator". Now I know what to call it :)


>And using public fields in a non-trivial project is a sure recipe for disaster.

Why, though? Surely it's the programmers' job to access what they need and leave alone what they don't. As someone else said, Python has no concept of public/private and it works okay.


If you make the statically typed field public you lose the ability to change the implementation without breaking the clients. (Getters/setters have the same problems, but not as pronounced). And no, you don't always have the ability to recompile everything that's using your code.


Python doesn't really "work okay" for ongoing projects. Framework updates become a 6-month migration job.


An argument for properties is that they hide the implementation details. They keep idiots from changing variables they shouldn't (if they really want to, there's sometimes reflection), and they hide implementation-detail variables from the autocomplete box.


Make your code testable, but public fields are fine.


Blind application needs to die in general. SOLID is a good idea (it should just be SID, IMHO, but anyways), and so are design patterns... when they make sense. When you think "I need to delegate object construction to different subclasses contextually", use a factory. Don't use a factory when you don't think that. When you need to choose an approach that should be given by the caller, no, for the love of god, don't use the Strategy pattern. The Strategy pattern is a hack that belongs in the past, and you should use lambdas instead because it's the 21st freaking century, not 1996, and we all have lambdas now. So don't use Strategy, unless you're stuck behind, and if you do, feel bad about it.

/rant

Anyways, yeah, just use your brain, insist on defined interfaces, and if something might be suboptimal, allow for it to be changed, and you'll be fine.


> The strategy pattern is a hack that belongs in the past, and you should use lambdas instead

In that case, your lambda object is your strategy. One of the good points that the "Design Patterns" authors make and that everyone else ignores is that these patterns are names for recurring structures that pop up independently of implementation language and environment. Some silly functor object is one way of implementing the more abstract "Strategy" pattern. Your lambda is a better one in many cases. It's the same high-level concept.
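For example, in Python the whole pattern collapses to a function parameter; the lambda passed in is the Strategy object:

```python
def total_price(items, pricing):
    """`pricing` is the strategy: any callable from price -> price."""
    return sum(pricing(p) for p in items)

prices = [10.0, 20.0]
regular = total_price(prices, lambda p: p)        # identity strategy
sale = total_price(prices, lambda p: p * 0.9)     # 10%-off strategy
```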


Well, yes, but if you ever write a class for C#, modern Java, or pretty much any language save C++ or Python (and even then you should write functions) that has the word "strategy" in it, you're doing it wrong, and should be punished for reinventing language features using OO methodology.

Was it Steve Yegge who mentioned the Perl community calling Design Patterns "FP for Java"?


I don't know, but this Yegge essay is one of my favorites ever: http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom...


I hadn't read that one. The line about the meaning of lambda is hilarious. Anyways, yeah the line was in "Singleton Considered Stupid."

https://sites.google.com/site/steveyegge2/singleton-consider...


There is a name for "strategy pattern" that existed long before the GoF book: higher-order function. But what's worse, in that it causes confusion, is when names are repurposed, "Functor" is a useful abstraction in programming, but is not related to your usage above.


Well, a Functor isn't an HOF, and an HOF needn't be an implementation of Strategy.


I never said a Functor was a HOF, but it's not far wrong. The implementation of the morphism-mapping part of a functor is a higher order function. Show me an example of Strategy that is not essentially a HOF.


All Strategy is HOF; not all HOF is Strategy. Observe:

  (define (make-counter n)
    (lambda (c)
      (set! n (+ n c))
      n))
This is a HOF, but not an instance of Strategy.


I asked for an example of Strategy that wasn't just a HOF, but I now see you do agree with this. Yes your example is technically a HOF as it returns a function. So I guess your point is that HOF is not specific enough, although one might argue that in general usage it usually does imply function-value arguments.


I mean, that's not the general use I see, so we clearly have different social circles. The thing is, a HOF is a mechanism, the Strategy pattern is an intent. I would in fact argue that there is at least one HOF that takes functions as arguments and isn't an instance of strategy: call/cc.

call/cc does not actually use the passed in function to determine how any action should be done. All it does is provide a capture of the current continuation as an argument. That's it. So it's not a strategy, it's just a HOF.


Well, Java didn't add lambdas to write make-counter; it was map and fold that got them envious. In fact, IIUC, mathematically make-counter is first-order, counting the nesting of function types. In other words, I don't understand why many describe it as a HOF at all. Map and fold are second order. Callcc, ignoring its Scheme implementation as a macro, would be third order. One could argue that the second-order argument to callcc is a strategy, with the continuation being the strategy.


Call/cc isn't a macro. It's actually a special form, although in CPS-based schemes, it could probably be implemented as a function if there was a lower level continuation primitive.

And semantically, a continuation is pretty much never a strategy.

As for make-counter not being higher-order, that's just not true. A higher-order function takes and/or returns a function. make-counter returns a function: it's higher-order.


No, it's disputed, for the reasons I gave. But I've already accepted that many regard make-counter to be a HOF. I was careful and said make-counter is not a second-order HOF, which Strategy is.

Callcc can be implemented as a function, no continuation primitives needed. Haskell has examples in its libraries.

Lastly, regarding semantics, I see no difference between Strategy and a second-order HOF. Yes the scope of Strategy is supposed to much more limited, but I don't see value in this. I concede that others might do.


Having thought about this further, I think it is probably only correct of me to talk about the "order" of "functionals" as in functions to scalars in mathematics. The term "rank" is better suited here. Category theory supports the popular definition of higher-order function and gives a different meaning to "order". So make-counter is rank-1 but still higher-order. Apologies for the previous post.


Automatic factories are extremely useful to kill the ServiceLocator anti-pattern.


The ServiceLocator anti-pattern seems a bit silly: either just depend on the object directly, or if there may be different objects used throughout the system, pass one in.


One of my guiding principles as a programmer is to never add code because I might need it some day.


Still, apply common sense. Sometimes the probability of adding something is so high and the cost of making an extension point so low that it makes sense to design with extensibility in mind. These extension points are rare, though, and usually occur at major module boundaries (e.g., plugins), and don't need to be scattered through random bits of your code.


You also have to factor in the cost to maintain your extension which you currently have no use for, which most people forget.

It also might be easy to include now, but your extension might complicate a new feature request. Or the new feature might complicate your previously simple extension. If it's still unused, don't be afraid to throw it away then either.


I find that unneeded functionality also complicates refactoring efforts.


Still, why not wait to add it?


Because sometimes those choices have a large effect on the amount of work/mental tax in the future - for example, if you know that a new feature will have to interop with yours in the near future and the cost is low to implement the right extensibility point now, it would be absolutely stupid not to - I would call that bad engineering that costs time & effort.

Obviously one doesn't want to go down the rabbit hole too early, but the other extreme is just as bad.


Waiting is good when the cost model for the added code is much more in focus later.


The cost to add it is often much higher, once the code has high fan in (lots of code paths depend on it).

I am definitely a fan of not over engineering, but I'm more of a fan of thinking. Think about your problem and your use cases, and about your own ability to predict the future. If you can predict future changes with high probability, then go ahead and design for those. If you're not a domain expert, you probably can't predict the future at all (and you probably grossly underestimate how bad your predictions are), so you should stick to the bare minimum.


It's a slippery slope.


The GoF book is a dictionary. When it came out, we all got names for patterns we use when appropriate. The thing that must die is over-reliance on formalisms where they don't add value. (Because there absolutely are places where they do.)

Having the taste to use the right approach for each job is what sets experience apart!


Sorry, what is the GoF book?


Design Patterns: Elements of Reusable Object-Oriented Software, whose four authors are sometimes called the Gang of Four (GoF). It established a vocabulary of common patterns.

https://en.wikipedia.org/wiki/Design_Patterns


Thanks very much!


For ages now, I've been telling people that the best best code, produced by the most experienced people, tends to look like novice code that happens to work --- no unnecessary abstractions, limited anticipated extensibility points, encapsulation only where it makes sense.

I love this.

"Perfection is reached, not when there is nothing left to add, but when there is nothing left to take away." -- Antoine de Saint-Exupery

Have to disagree a little on getters and setters though. They're tremendously useful as places to set breakpoints when debugging. Well, setters are, anyway; I guess it's rarer that I'll use a getter for that. Anyway, perhaps we can agree that the need for these is a design flaw in Java; C# has a better way.


For ages now, I've been telling people that the best best code, produced by the most experienced people, tends to look like novice code that happens to work --- no unnecessary abstractions, limited anticipated extensibility points, encapsulation only where it makes sense.

In other words, the best code is the simplest code that works. It usually tends to be very flexible and extensible anyway, because there is so little of it that understanding it all and modifying it becomes easy. The most experienced programmers are the ones who can assess a problem and write code to capture its essence, and not waste time doing that which isn't necessary.

I've observed that there is a "spectrum of complexity" with two very distinct "styles" or "cultures" of software at either end; at one end, there is the side which heavily values simplicity and pragmatic design. Examples of these include most of the early UNIXes, as well as later developments like the BSD userland. The code is short, straightforward, and humble, aptly described by "novice code that happens to work".

At the other end, there's the culture and style of Enterprise Java and C#, where solving the simplest of problems turns into a huge "architected" application/framework/etc. with dozens of classes, design patterns, and liberal amounts of other bureaucratic indirection. The methodology also tends to be highly process-driven and rigid. I don't think it's a coincidence that the latter is heavily premised on and values "best practices" more than anything else.

Here's another one of the "rebellion" articles against "best practices": http://www.satisfice.com/blog/archives/27


^^^This. The tendency for a certain type of software engineer to go around telling everyone else they're doing it wrong reminds me strongly of this scene from Justified: https://www.youtube.com/watch?v=LG4hOjJ9tEs.


The problem with their advice is that they are not master Foo. They could tone it down a little, for their wisdom is not absolute. Some rules work for some people in some situations, and other rules are just practical conventions.

One good piece of advice I've found is to just plain ignore the status quo and follow common sense when the context demands it.


The getter/setter problem is solved very nicely by C#. You can change a field to a property at any time without changing the rest of the code, and a lot of the use cases for get/set can be handled compactly like public int X { get; private set; } instead of having to have two variables.


Getters/setters are important when the data structure is used from a different linkage unit than where it is defined, and binary compatibility across versions is important.

That does happen, but usually in fewer cases than most intermediate programmers realize. Instead, they see cases where it is used for real (like WinForms or Direct3D) and cargo-cult it into all the code they write.


> they see cases where it is used for real (like winforms or Direct3D) and cargo cult it into all code they write.

And a lot, a lot, of my CS prof colleagues believe it is deeply important, for reasons most of them are unable to articulate, that in an intro CS1 class all instance variables be private and accessed only through getters and setters. This is actually baked into the College Board's course description for AP CS A, and a point or two (out of 80) often hinges on it in every exam, so it is near-universally taught in high-school level CS classes (in the US). Sigh.
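
For concreteness, here's a hypothetical Java sketch (all names made up) of the accessor boilerplate the exam rewards, next to the plain-field version the earlier comments advocate:

```java
// The boilerplate the AP exam rewards: a private field plus a
// getter/setter pair that adds no logic whatsoever.
class Boxed {
    private int x;
    public int getX() { return x; }
    public void setX(int x) { this.x = x; }
}

// The plain-field alternative. If access logic is ever needed later,
// rename the field and the compiler flags every use site to update.
class Plain {
    public int x;
}

public class Demo {
    public static void main(String[] args) {
        Plain p = new Plain();
        p.x = 3;                 // direct access, no ceremony
        System.out.println(p.x);
    }
}
```

The two classes expose exactly the same capability; the first just costs four extra lines per field.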


So I have to write BS for AP CS A, and I have to do it in a bad language, and we'll never get Python or Lisp?

Okay, College Board, I warned you...

smack

this is for the above.

smack

and this is for teaching CS poorly in general.

smack

and this is for making us all buy overpriced graphing calculators, thus keeping demand high...


The kinds of things you need to do in order to maintain a stable ABI are not the kinds of things you should apply to all parts of your program. That way lies madness.

BTW, you don't need accessors even for public ABIs. stat(2), for example, doesn't need "accessors" for struct stat and it's been stable for decades. The same idea applies to Win32 core APIs.


Not a disagreement, but it's interesting that at least 3 of the 4 people you mentioned are game programmers (not sure about Mike Acton because his name doesn't ring a bell). Some, like Carmack, are definitely brilliant programmers. But game programming has very specific constraints, doesn't it? Speed and size are comparatively more important than in business/enterprise software, and maintenance is comparatively less important.

That said, I welcome anyone trying to knock OOP off its pedestal.


Mike Acton is also a game programmer. Maybe it's just because of the sources I tend to read, but I do think the game industry is leading the way here and I hope others will follow. Perhaps it's because of the focus on performance, but I think any industry could benefit from that. I curse the lack of attention to performance in modern software development every time the Twitter app takes multiple seconds to load on my 2 GHz smartphone.

> maintenance is comparatively less important

I don't agree with this. Certainly maintainability is less of a concern for the gameplay code specific to each game, but game development also encompasses engines and tools which span multiple titles and are used for many, many years. Also, the trend toward free-to-play and subscription games is making maintenance more of a concern even for single titles. World of Warcraft, Team Fortress 2, League of Legends, Clash of Clans; these titles are going to be maintained for years to come. Valve Corporation recently transplanted an entire game (DotA 2) from one game engine to another, while people continued to play.


>I curse the lack of attention to performance in modern software development every time the Twitter app takes multiple seconds to load on my 2 GHz smartphone.

I strongly agree with the sentiment, but also it should be noted that the load time of applications is related more to the storage speed, which is often abysmally bad in phones.


If they didn't have so much code (and data) to load in the first place, a lot of it probably unnecessary, applications would certainly load much faster.


People who aren't performance obsessed programmers enjoy detailed graphical interfaces, and those tend to require lumps of code and data. It's fine to decry it I guess, but it won't be changing any time soon.


The point is that you can have those without the fluff and layers of inefficiency that modern programming introduces.


They're allowed to do it because of the emphasis on "shipping" and the complete lack of "maintaining" they have to do afterwards. Most games are complete shit on release for a reason, with very few modern exceptions. Hell, it isn't uncommon for a studio to outsource ports and expansions. Making your code someone else's problem is not something non-game devs should aspire to.


Except there are these things called game "engines", which can have lifecycles of 10+ years (Unreal and id come to mind).


"Give a man a game engine and he delivers a game. Teach a man to make a game engine and he never delivers anything."

Most game developers do not build game engines.


All of the cited developers do, however.


Maintenance as in "changing the behavior of code on demand" in games is not solved by modifying the code but the data. As a game programmer I'd be terrified if I had to change the code every time requirements change, because the requirements in games change a dozen times a day. I'd rather give the tools to the game designers so they can implement their requirements themselves.

Maintenance as in "keeping the code around and reusing it for other projects" is quite important in games. We go to extra lengths to make code reusable by isolating it from 3rd-party libraries, compiler features, and such.


But this goes to reinforce the idea that the principles involved in coding games are very different from those for other software, doesn't it? Therefore, what works in games is not necessarily a good idea for business software, and vice versa.


I am not sure that modifying code instead of data is a good principle for anybody.

I do agree that it depends on the goal you are trying to achieve. E.g., if I had been running a custom software shop, or an embedded IT department developing some internal software, I too would make maintenance as complicated as possible so I could bill for more engineers/QA, or have more reports to increase my profit/bonuses/political weight, etc.


> I am not sure that modifying code instead of data is a good principle for anybody.

In my experience, for in-house software, modifying (or adding) code happens all the time. It's so common I'm puzzled that you don't consider it a good principle. In fact, the alternative -- making software so flexible every possible behavior can be customized by modifying data alone -- leads to the "enterprise software" antipattern (as often mocked in The Daily WTF), where everything is needlessly flexible and complicated. Or maybe even the "inner platform effect"!

Even for game development, I've read that teams (often?) hack the engines they buy, as was famously the case for Half-Life, which used a heavily modified Quake engine.


Look at how many people make a game and how many people make in-house software (if you don't know how many people make a game, go to mobygames.com and look up the credits for one). Consider the functional complexity of one and the other, then consider the actual complexity of the code. Jonathan Blow, quoted above, gives a few examples too (e.g., Facebook's client requiring a change in the OS because of hitting the class limit).

Game teams sure hack the engines and modify the code all the time but these modifications are not in response to changing requirements. It's 99% bug fixes, optimizations and planned feature implementation and 1% behavior changes requested by design.


I honestly don't understand where you're going re: number of people. Care to elaborate?

I won't argue with you about the 99% data, 1% behavior changes thing. I can't argue with unreferenced statistics :) I seriously doubt your percentages, though. In any case, for business/enterprise software development it's definitely NOT the case. Software changes are both very common and a reasonable activity, so common in fact that if you disagree with this I have to ask (if you don't mind me asking): do you work in software development, and if so, what kind of software?

Edit: and to answer a previous remark of yours: writing new code for feature/change requests has little to do with "making software as complicated as possible". It's likely the opposite: software that does just one thing is easier & faster to write and maintain than needlessly "flexible" and "customizable" systems. If I wanted to make a mess -- and myself indispensable -- I'd definitely go the "extremely customizable, this thing does everything you want sir!" route ;)


There are several orders of magnitude of (man hours)/(functional complexity) difference between game companies and enterprise. Changing code is incredibly expensive compared to changing data.

>Software changes are both very common and a reasonable activity, so common in fact that if you disagree with this I have to ask (if you don't mind me asking): do you work in software development, and if so, what kind of software?

I do work in software development, as I said above I am a game programmer.


Thanks for the reply!

Changing code is indeed expensive, but not prohibitively so. It's so reasonable an activity, in fact, that it's what I do in my day job, and what many others in the business/enterprise software industry also do (especially for in-house tools!). It's making "flexible" software that often turns out to be the costly option. YAGNI and other mottos apply (not blindly of course, but they often do apply). I've never seen in-house software that could be controlled entirely by data. Change requests most often require code changes, and it's not the end of the world.

I suspect your opinions are colored by the fact you're a game programmer. This ties back to my initial assertion: that game programming is different to other kinds of software development; that ease of maintenance, modularity and changing code are comparatively less important, and therefore some practices of the software engineering world are less relevant for game development, while others (raw speed, memory footprint, clever hacks to produce an interesting effect, etc) are more important. This means one has to take the advice of developers from the games industry with this in mind: that what works for games is not always best for the software industry at large, because the constraints & requirements are very different!


As I said above, requirements for games change a dozen times a day, every day. This is a solved problem in the games industry. Good for you if you can afford code changes, but that doesn't make such a solution better.

As of now there is no financial back pressure against the enterprise practices. It may stay like this forever. However, it's not entirely unlikely the situation will change in the future. For instance, in the '80s and '90s people used to be paid handsomely for developing things like an in-house email client or spreadsheet. Now those jobs are gone, because it's much cheaper to configure an off-the-shelf office suite than to keep developers on the payroll.


Let me ask you this: have you ever worked in anything but games development?

The rest of the industry is very different, and it's naive to dismiss it as caused by "lack of financial back pressure". The actual goals and constraints are different, even the program's life cycle, and it's only natural the development principles differ in turn! You'll notice a vast amount of literature about modularity, patterns, programming paradigms, encapsulation, etc. -- some of it misguided, some not. All of this literature is an attempt to cope with change and complexity in the software world; if the answer was "just modify data and never touch the code" I think someone might have noticed.

Do you by any chance consider modifying game scripts to be "modifying data"? If so, that would explain our disagreement on this matter :)


>Let me ask you this: have you ever worked in anything but games development?

I sure did; I wrote and supported in-house software. Though I don't see how it's relevant: 99% of the programming information available is written in reference to enterprise and custom software development anyway, since those industries employ so many programmers.

> The rest of the industry is very different,

I am in complete agreement with this.

> and it's naive to dismiss it as caused by "lack of financial back pressure"

How do you explain the difference in cost then?

> The actual goals and constraints are different, even the program's life cycle, and it's only natural the development principles differ in turn!

Let me ask you this: have you ever worked in games development? If not then how do you know it's different? From my experience the only difference is in the financials. Game studios sell their programs and have to make profit from it to stay afloat. In-house developers don't sell anything and are financed by the actual main business' profits. Custom software developers are one step removed from the in-house: they need to sign a customer first but it's done by sales people, after the customer has started paying it's the same smooth sailing. The actual programming in any case is the same - specs go in, code comes out.

> Do you by any change consider modifying game scripts as "modifying data"?

No, scripts are also code. I personally oppose scripting on principle and see the need for scripting as a failure of the programmers, but even teams relying on scripts do not spend much effort on the scripting because, as I said above, modifying code is incredibly expensive.


I asked the question because what you're arguing is completely at odds with the reality of software development outside the games industry. Surely you agree your position (I'll restate it here, just so we're in the same page: that it is a bad idea to modify code and a good idea to implement most changes as "changes in data") is completely non-mainstream? Can you grant me that?

I've never worked in the games industry (though I wrote my own naive videogames, starting with my C64; like many of us, I got into computers because I wanted to make games). However, I have many friends who either work or worked in that industry, and they told me how it is. I know enough about death marches to know I mostly don't want to work in videogames (other jobs I'll never take if I can help it: consulting / "staff augmentation" companies). I also know many games programmers don't write automated tests -- I'd be scared of changing anything too if that was the case!

I'm very curious now about your position. I'm sure I must have misunderstood it. If you don't believe in changing code and you don't believe in scripting, then how do you propose stuff like changes in unit behaviors in an RTS are implemented? Say you have to change the enemy AI, or add a new unit that behaves differently, or even fix the path-finding algorithm. How do you change that by modifying only data? I understand tweaking your game by changing data (new sprites, changing the max speed of a unit or the geometry of a 3D level, etc), but actually changing behavior? And what if you're the one who's actually building the game's engine?

> How do you explain the difference in cost then?

I don't follow. Are you arguing games are less or more expensive? The economy of making games is different to business software. Games are hit based. If I read all those articles correctly, most games sell a lot of units near release, then taper off and are forgotten. Yes, some games have multiplayer and the most successful of them may last many years, and some others get add-ons, but still. Business software is completely different, especially if it's in-house: you don't need a "hit", you don't get "sales" and because it's usually not a product in itself, it gets modified constantly as the end-users (who may or may not be programmers) discover new features they need. Software like this usually lasts years, and must therefore be designed using engineering principles which will help a team of (possibly changing) programmers to alter its source code over the course of many years.


>Surely you agree your position (I'll restate it here, just so we're in the same page: that it is a bad idea to modify code and a good idea to implement most changes as "changes in data") is completely non-mainstream?

It's been mainstream in the games industry for the past 10 years or so. Obviously it's not in the enterprise software.

>If you don't believe in changing code and you don't believe in scripting, then how do you propose stuff like changes in unit behaviors in an RTS are implemented? Say you have to change the enemy AI, or add a new unit that behaves differently, or even fix the path-finding algorithm.

You are conflating two things. Bug fixing obviously requires code changes to repair the defective code. Implementing a new unit or AI is better done by setting flags or adding components from a set. It's no different from business software recording transactions or entities in a database. Hopefully you don't write new code for each new order or each new SKU in the warehouse?
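
To illustrate the "new unit = new data, not new code" idea, here is a hypothetical Java sketch (all names invented): unit behavior comes from a set of components, so a designer adds a unit by adding a table row rather than a class.

```java
import java.util.EnumSet;
import java.util.List;

public class Units {
    // The fixed vocabulary of behaviors the engine implements once.
    enum Component { MOVES, FLIES, ATTACKS, HEALS }

    // A unit type is pure data: a name, some stats, and a component set.
    static class UnitType {
        final String name;
        final int hp;
        final EnumSet<Component> components;
        UnitType(String name, int hp, EnumSet<Component> components) {
            this.name = name; this.hp = hp; this.components = components;
        }
    }

    public static void main(String[] args) {
        // This roster could just as well be loaded from a data file,
        // so new units require no engine code changes at all.
        List<UnitType> roster = List.of(
            new UnitType("grunt",   50, EnumSet.of(Component.MOVES, Component.ATTACKS)),
            new UnitType("medic",   30, EnumSet.of(Component.MOVES, Component.HEALS)),
            new UnitType("gunship", 80, EnumSet.of(Component.MOVES, Component.FLIES, Component.ATTACKS))
        );
        for (UnitType u : roster) {
            System.out.println(u.name + " heals: " + u.components.contains(Component.HEALS));
        }
    }
}
```

The engine codes each component's behavior once; everything after that is configuration.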

>I don't follow. Are you arguing games are less or more expensive?

As I said above, games show orders of magnitude less cost per unit of functional complexity. A game with many more complex behaviors than a typical enterprise system takes far fewer man-hours to code and test.


It is, but every once in a while, when a game engine or library gets posted on, say, HN, you get the usual arguments about how the code is not unit-testable, or how functions pack a boatload of different behavior into themselves or take 20 arguments - because if it works for me in my Rails app, why shouldn't it work for a game engine?


A poor programmer uses the first abstractions and ideas to come into their head, and runs with it.

A mediocre programmer uses ideas and abstractions they've heard are good for this scenario, and just runs with it, occasionally rewriting as needed.

A good programmer carefully figures out what abstractions and ideas are appropriate for the job at hand, studying and rewriting until they're sure they've gotten them right, and uses them.

A master programmer uses the first abstractions and ideas to pop into their head: they've been at this long enough to know the right approach.


Yes. Unfortunately, as with driving, the vast majority of us think we're better than we are. It's good to have confidence, but it's near impossible to know when we're overestimating our own capabilities ... until it's too late.

In this case it's also more difficult because one doesn't always see the end results of their output, which may also be years down the road (no pun intended).


That's why becoming a master requires experience: you need experience to know what's good and bad down the line.


Ridiculously long functions are a maintainability problem but so is a ton of really small functions that do not provide a logical separation of concerns.

OO code can provide modularity, which can greatly improve the ability to make changes without breaking other code. On the other hand, when applied poorly it can have the opposite effect.

It's not the concepts, it's how they are applied.


A swarm of small functions is a worse maintainability problem --- it's not obvious how they interact to solve a particular problem, and the amount of plumbing you need to ship state between these different functions is frequently brutal. Sometimes it's easier to stick things in local variables and just have a long function.

Languages that make it easy to define "local" functions that operate on implicitly captured state can help --- e.g., C++, Lisp, sometimes Java --- but only where it makes sense. I don't believe in splitting functions solely because they're too long.
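
A minimal Java sketch of that style (hypothetical names): a lambda acts as a "local function" that reads the enclosing function's state directly, instead of a separate method fed through parameters.

```java
import java.util.function.IntSupplier;

public class Capture {
    public static void main(String[] args) {
        int base = 10; // local state the helper needs

        // A "local function": no parameter plumbing required, because
        // the lambda implicitly captures base from the enclosing scope.
        IntSupplier bumped = () -> base + 1;

        System.out.println(bumped.getAsInt());
    }
}
```

The same helper as a separate top-level method would need base passed in explicitly at every call site, which is exactly the plumbing the parent comment complains about.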


> Languages that make it easy to define "local" functions that operate on implicitly captured state can help

You don't even need that. In many languages, you can have {}-delimited blocks which cause variables inside of them to go out of scope when control flow exits them. I've used that to great effect in Perl to keep intrinsically large functions maintainable.
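
The same trick, sketched in Java for illustration (names made up): a bare block fences off the scratch variables of one phase of a long function.

```java
public class Phases {
    public static void main(String[] args) {
        int total = 0;

        { // phase 1: its temporaries are invisible past the closing brace
            int a = 2, b = 3;
            total += a * b;
        }

        // a and b are out of scope here; only total survives,
        // so later phases can't accidentally depend on them.
        System.out.println(total);
    }
}
```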


>- it's not obvious how they interact to solve a particular problem, and the amount of plumbing you need to ship state between these different functions is frequently brutal

this is a problem with the architecture, not a problem with small functions.


Not to mention that the more small functions you have, the worse your locality of reference (in terms of programmer cache, not CPU cache). In absurdum, software composed of 1-2 line functions which are then composed into higher and higher level 1-2 line functions is no better than software composed of one giant function with internal gotos for flow control.

Agreed that you should never split functions due purely to length, but a super long function smells bad because it suggests poor separation of concerns (if a function's super long then it's probably doing a lot more than one thing). Sometimes this is a problem, sometimes (like the case of Carmack's big main loop function) there's just a lot of small things to do sequentially and one big function is as good a way to represent that as any other.


Small functions (fewer dependencies) are easier to test.


Here's another bit of heresy: testing isn't everything. Very fine-grained test suites frequently break when the structure of code under test changes even when the new code still does its job. In the limit, it's the equivalent of just breaking if SHA256(old_code) != SHA256(new_code).

Very short functions, I've found, encourage this kind of over-testing and just add friction to code changes without actually improving system reliability.

You're better off testing at major functional boundaries, and if you do that, the length of functions matters less than the interface major modules provide to each other.


> You're better off testing at major functional boundaries, and if you do that, the length of functions matters less than the interface major modules provide to each other.

Otherwise known as integration testing, which is far more useful than unit testing because bugs, especially regression bugs, more often occur in the system, than in specific functions.

You can have as many unit tests as you want, but until you have integration test, you have zero coverage for the really complex part of your code.

Unit tests are great for algorithms though.


Small functions are generally uninteresting to test. As smallness approaches "one liner", you're just verifying the compiler. Yes, 1+1 == 2.


It depends on how small, and how swarm-y. If the function is hilariously simple, and improves readability (like a premade getter for an alist or cons based structure in lisp), just do it. If it's pretty big, and you'll never reuse it... No. If you must, keep it local.


Even if you don't reuse a function it still encapsulates certain things. By looking at the name you know what it does (self-documenting-code). The interface tells you what variables it depends on. And finally it logically segregates your code.

I think the real reason people don't like small functions is simply code navigation - which honestly is a poor excuse. That's an editor/IDE problem.


Yes, but encapsulation isn't always a good thing: it lessens your awareness of what's happening within the encapsulated environment. This can lead to bugs.


Agreed, but really long functions are an indication that the code could benefit from some judicious refactoring.


And yet invariably it's refactoring into a ton of small functions. Round and round we go...


Why? I've been recommending the middle ground all along.


The 90's and 00's fetishization of OO as magic incantation that makes your code better did generate a lot of anti-patterns, some built into the languages. Multiple/deeply-multilevel inheritance, singletons, constructors with side effects, IO objects, and all the more trivial code-bloating boilerplate like "put everything in a class" and getters and setters.


In regards to the length of functions, to me it comes down to whether it is preferable to have the entire content of the function visible to the programmer at the time any changes need to be made, or whether there are multiple things going on that can be assessed independently of each other. The idea of setting a hard rule that a function can only be as long as your screen height ignores the context of what is being done within the function, and encourages the programmer to make breaks in places where it may not make sense to do so.


Was it Atwood who said, modifying someone else's quote: "The two hardest problems in programming are cache invalidation, naming things, and off-by-one errors?"

Aside from separation of concerns when you make many, many functions, you have to come up with So Many Names.


A lot of those names will end up being the equivalent of nasty old assembly comments.

  add 10   ; add 10 to accumulator.
becomes

  int AddTen(int x) { return(x+10);}


Muratori's compression-oriented programming and Acton's data-oriented design have really helped me in writing HPC code. Carmack's arguments make sense for apps that have a clear main function, although they're less applicable to libraries.

I consider these "best practices" too; they are just better for performance than object-oriented practices applied to many small objects.


One other important thing to keep in mind when considering that crowd is that all of them are game programmers, who face a very different set of constraints than web developers (which is what I think a lot of people here are). That being said, I do like a lot of what they have said in the past.

Not to say that the above aren't all examples of skilled programmers, and likely much more practical than a lot of people, just that they have a very different experience of the world than, say, Uncle Bob or Martin Fowler (some of the more "best practices" developers).

I think an overarching trend is that programmers in general are realizing that "best practices" like the OOP design patterns (flyweight, adapter, etc.) are better when they fit into the language well, rather than when you have to go out of your way to accommodate them.

The movement of languages like Rust, Go and Elixir (what I've been able to investigate lately) away from class-based OOP by splitting it up into its various pieces (subtyping, polymorphism, code sharing, structured types) is a good trend for the programming industry IMO. I'm looking forward to more improvements in the ability to statically verify code a la Rust. Also exciting is the improvements that C# is getting from Joe Duffy's group to help it reduce allocations and GC pressure.

It's an exciting time to be in software development and to be following PL development; some meaningful progress seems to be happening.


I think a lot of this could be covered by two principles that are often quoted but overlooked. The first is KISS: keep it simple, stupid. The second is single responsibility.

When I design software, I apply both of these to every facet of the system (though I admit sometimes not as well as I should). The end result is I might not have a ton of interfaces and hierarchies. It might not handle curve balls as well as an abstract MachineFactoryFactory could. It does handle everything that we've thrown at it however.


I do so agree with that. Me, I am an old-school programmer. Started with BASIC, Pascal, COBOL, Clipper, dBase, VB, C++, PHP, JavaScript, Java. Always created the frameworks and libraries I needed. Straightforward pyramid-structured software. Everything was functions (it's coming back). Hardly any testers other than the client and yourself. Lots of that stuff is still running. Now I'm often lost in the complexity of the frameworks and the use of endless classes. Debugging takes ages because some class in a totally different environment is badly written. My advice: KEEP IT SIMPLE.


A sony dev on "Pitfalls of Object Oriented Programming": http://harmful.cat-v.org/software/OO_programming/_pdf/Pitfal...


Casey Muratori's article really hits home for me. That's how I've felt for a long time, and I'm glad to have a coherent article to point to, to explain this to others.


"The code has bugs and segfaults" is a clear, objective problem.

"I don't understand the code" isn't ... quite the same type of problem.


This article is anecdotal and ranty but I will respond anyway. I've spent the last 15 years working on various projects involving cleaning up scientific code bases. Messy unengineered code is fine if only a very few people ever use it. However, if the code base is meant to evolve over time you need good software engineering or it will become fragile and unmaintainable.

That said, there are many "programmers" who apply design concepts willy-nilly without really understanding why. They often make a bigger mess of things. There is an art to quality software engineering which takes time to learn and is a skill which must be continually improved.

The claim in the article that programmers have too much free time on their hands because they aren't doing real work, like a scientist does, is obviously ridiculous. Any programmer worth their salt is busy as hell and spends a lot of thought on optimizing their time.

Conclusion, scientists should work with software engineers for projects that are meant to grow into something larger but hire programmers with a proven track record of creating maintainable software.


I've had similar experience with scientific software. When I'm told that the existing software is "OK because it works", I ask "how do you know it works?" because typically there are no unit tests or tests of any sort of individual stages for that matter.

I've found that scientists tend to assume "it works" when they like the results they see such as R^2 values high enough to publish.

Recently I converted some scientific software that was using correlation^2 (calling it R^2) as a measure for model predictions, as opposed to something more appropriate like PRESS-derived R^2s (correlation is totally inappropriate for judging predictions because it's translation and scale independent on both observed and predicted sides). Nobody went looking for the problem because results seem good and reasonable. Converting to a proper prediction R^2, some of the results are now negative, meaning the models are doing worse than a simple constant-mean function. Yikes.
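To make the parent's point concrete, here is a synthetic sketch (made-up data, not the code base being described; it uses a plain prediction R², not a full leave-one-out PRESS statistic). A perfectly correlated but badly shifted and scaled prediction looks flawless under correlation² yet scores worse than a constant-mean model under prediction R²:

```python
import random
random.seed(0)

observed = [random.gauss(0.0, 1.0) for _ in range(200)]
predicted = [5.0 + 3.0 * x for x in observed]   # tracks the signal, badly shifted and scaled

def mean(xs):
    return sum(xs) / len(xs)

def corr(xs, ys):
    # Pearson correlation: invariant to translation and scaling of either side.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

corr_r2 = corr(observed, predicted) ** 2   # == 1.0: looks publishable

# Prediction R^2 = 1 - SS_res / SS_tot: actually compares predictions to observations.
ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
ss_tot = sum((o - mean(observed)) ** 2 for o in observed)
pred_r2 = 1 - ss_res / ss_tot   # strongly negative: worse than predicting the mean
```

The first number says "perfect model"; the second says "worse than guessing the average" — for the same predictions.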


Yes, I work on a mixed team of physicists, engineers, and computer scientists, and the most frustrating part is trying to work with some of the physicists' code. For the most part it is fairly functional, but the problem is that it is almost unreadable. It is quite clear that they write it as fast as possible so they can do what the OP would call real work, without regard for others who will need to work with and maintain that code later on.


What most people seem to forget is that "best practices" are not universal: Depending on the size and scope of the software project, some best practices are actually worst practices and can slow you down. For example, unit testing and extensive documentation might be irrelevant for a short term project / prototype while they will be indispensable for code that should be understood and used by other people. Also, for software projects that have an exploratory nature (which is often the case for scientific projects) it's usually no use trying to define a complete code architecture at the start of the project, as the assumptions about how the code should work and how to structure it will probably change during the project as you get a better understanding of the problem that you try to solve. Trying to follow a given paradigm here (e.g. OOP or MVC) can even lead to architecture-induced damage.

The size of the project is also a very important factor. From my own experience, most software engineering methods start to have a positive return-on-investment only as you go beyond 5,000-10,000 lines of code, as at this point the code base is usually too large to be understandable by a single person (depending on the complexity of course), so making changes will be much easier with a good suite of unit tests that makes sure you don't break anything when you change code (this is especially true for dynamically typed languages).

So I'd say that instead of memorizing best practices you need to develop a good feeling for how code bases behave at different sizes and complexities (including how they react to changes), as this will allow you to make a good decision on which "best practices" to adopt.

Also, scientists are, from my own experience, not always the worst software developers, as they are less hindered by most of the paradigms / cargo cults that the modern programmer has to put up with (being test-driven, agile, always separating concerns, doing MVP, using OOP [or not], being scalable, ...). They therefore tend to approach projects in a more naive and playful way, which is not always a bad thing.


Steve Ballmer on "KLOCs" [1]. Not saying you're taking that extreme but LOC value is certainly debatable...

[1]: https://www.youtube.com/watch?v=kHI7RTKhlz0


Complexity is related to size, but is also related to coding style. Comparisons of LOC are meaningless outside of a context, but surprisingly useful inside of one (and as long as you don't use them as metrics, because they can be gamed too easily).

If you want to see this in action, write a script that will trawl your code base and count the total number of uncommented lines of code every day. Draw a graph. Even without knowing anything about your project, I think you will find a very interesting thing -- namely that the code base grows consistently and that the amount it grows per day is a random variable with a normal distribution. (Obviously this only works if you have a consistent number of developers.)
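A minimal sketch of such a script, assuming Python sources with `#`-style comments (the log filename and format are made up; adapt the glob and comment predicate to your language):

```python
import datetime
import pathlib

def count_loc(root="."):
    """Count uncommented, non-blank lines under root."""
    total = 0
    for path in pathlib.Path(root).rglob("*.py"):   # adapt the glob to your language
        for line in path.read_text(errors="ignore").splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                total += 1
    return total

if __name__ == "__main__":
    # Run daily (e.g. from cron) and graph the accumulated log afterwards.
    with open("loc_log.csv", "a") as log:
        log.write(f"{datetime.date.today()},{count_loc()}\n")
```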

If you then do a rolling average (say every 2 weeks), I think you will find something even more interesting: the rate of change will be going in one direction or another -- either higher or lower -- and it will be doing it consistently (normalizing for the number of developers is a bit easier here).

Once you have verified that, you can ponder about what it all means.


Amen!


Disclosure: I'm a recent astronomy grad who specialized in computational astrophysics. Definitely biased.

The issue is that at least for many scientists and mathematicians, mathematical abstraction and code abstraction are topics that oftentimes run orthogonal to each other.

Mathematical abstractions (integration, mathematical vernacular, etc) are abstractions hundreds of years old, with an extremely precise, austere, and well defined domain, meant to manage complexity in a mathematical manner. Code abstractions are recent, flexible, and much more prone to wiggly definitions, meant to manage complexity in an architectural manner.

Scientists oftentimes have already solved a problem using mathematical abstractions, e.g. each step of the Runge-Kutta [1] method. The integrations and function values for each step are well defined, which results in scientists wanting to map these steps one-to-one onto their code, oftentimes producing blobs of code with if/else statements strewn about. This is awful by software engineering standards, but in the view of the scientist, the code simply follows the abstraction laid out by the mathematics themselves. This is also why it's often correct to trust results derived from spaghetti code, since the methods that the code implements are themselves often verified.

Software engineers see this complexity as something that's malleable, something that should be able to handle future changes. This is why code abstractions play bumper cars with mathematical abstractions: mathematical abstractions are meant to be unchanging by default, which makes tools like inheritance, templates, and even naming standards poorly suited for scientific applications. It's extremely unlikely I'll ever rewrite a step of symplectic integrators [2], meaning that I won't need to worry about whether this code is future-proof against architectural changes or not. Functions, by and large in mathematics, are meant to be immutable.

Tl; dr: Scientists want to play with Hot Wheels tracks while software engineers want to play with Lego blocks.

[1]: https://en.wikipedia.org/wiki/Runge–Kutta_methods

[2]: https://en.wikipedia.org/wiki/Symplectic_integrator


> The issue is that at least for many scientists and mathematicians, mathematical abstraction and code abstraction are topics that oftentimes run orthogonal to each other.

Excellent observation. I'm an ex-physicist and on the few occasions that I had to use computers the only thing I cared about was how computer functions mapped into the mathematical abstractions that I cared about. Everything else was just noise.


>mathematical abstractions are meant to be unchanging by default

Let's say today I am doing RK2, and tomorrow I want RK4; how do I easily make my change? In my codes, it's a change of a single line and I get higher order convergence, etc. It is not a week- or month-long project, as it would be for many codes because of some of those abstractions you deride.

Also, computational math is an active area of research; the method you mentioned is not hundreds of years old, although yes, it was developed in the early 1900s. To this day, people are developing new methods that give higher order accuracy (orders above O(err^10), to abuse notation)... but as you can guess, no one uses them because changing the current codes is so difficult they just don't.[0] Of course, I agree O(err^4) is often enough, so the motivation to change codes now isn't that overpowering, but it again is something we lose by learning things a little bit outside our field which could be helpful.

[0] Instead we choose smaller and smaller mesh sizes and timesteps to deal with small-order error, and request millions of CPU hours, use electricity, kill trees and contribute to global warming.
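One way the "single line" claim can hold up is a stepper driven by a Butcher tableau; the sketch below is a hypothetical minimal version (not the commenter's actual code), where switching RK2 to RK4 is just a different `method` argument:

```python
# Butcher tableaus as (A, b, c) coefficients. "rk2" is the explicit midpoint
# method; "rk4" is the classic fourth-order Runge-Kutta scheme.
TABLEAUS = {
    "rk2": ([[0.0, 0.0],
             [0.5, 0.0]],
            [0.0, 1.0],
            [0.0, 0.5]),
    "rk4": ([[0.0, 0.0, 0.0, 0.0],
             [0.5, 0.0, 0.0, 0.0],
             [0.0, 0.5, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0]],
            [1/6, 1/3, 1/3, 1/6],
            [0.0, 0.5, 0.5, 1.0]),
}

def rk_step(f, t, y, h, method="rk4"):
    """Advance y' = f(t, y) by one explicit Runge-Kutta step of size h."""
    A, b, c = TABLEAUS[method]
    k = []
    for i in range(len(b)):
        # Stage value: y plus the weighted, already-computed stage slopes.
        yi = y + h * sum(A[i][j] * k[j] for j in range(i))
        k.append(f(t + c[i] * h, yi))
    return y + h * sum(bi * ki for bi, ki in zip(b, k))
```

Integrating y' = y over [0, 1], changing `"rk2"` to `"rk4"` at the call site drops the error by several orders of magnitude without touching the integrator.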


Higher order integration methods are not always more accurate or more power efficient. They are typically worse in terms of stability if the step is too big, they require more computations per step, and they may have a higher error constant, so they actually often require smaller steps than low order methods. That's why in circuit simulators only methods of order up to 4 are really useful, and most of the time simple schemes of order 2 are used.


It sounds like you want a language like Haskell. Abstractions are based on mathematical (algebraic and category-theoretic) abstractions with well-defined laws. The language has immutable semantics and admits equational reasoning. Using libraries like Dimensional has made me better at physics; many fields of physics play fast and loose with units and dimensions and aren't even aware of it.


I'm glad someone picked up on it! I've definitely picked "FP is the truth for astrophysics programming" as my hill to die on, but the issue with Haskell is its combination of vernacular and tools that make it hard to approach for the novice scientific programmer. Almost all dedicated astrophysics programmers wind up using Fortran 90, which is sort of the de facto standard due to its imperative nature.


Modern Fortran is pretty awesome though; it has pure functions, it's fast, scales well, is easy to read and understand. Probably much more suited to most scientific fields than Haskell.


meeeeh, come on. You can't say the sloppy code can be trusted because the clean math it is based on is verified. The sloppiness of the code prevents validation that it properly implements that precious math of yours.

The problem is that you want to treat the code as not your "real" job. Your real job is getting correct answers into published papers, and providing a proof of that correctness. If your code, on which your results rely, is too sloppy for anyone else to understand (and note that "anyone else" can include "you, in 6 months"), then you've not proven correctness at all.


>you want to treat code as not your "real" job

I'm not treating anything, it's because coding isn't my job. The job of a scientist is to do research, and coding is nothing more than a tool towards that goal.

>your code, on which your results rely, is too sloppy for anyone else to understand...then you've not proven correctness at all

No, my results rely on my experimental methods, my mathematical models, and my code. Correctness can be proven in spite of sloppy code. Would you dispute a claim on the basis that calculations done on a calculator can't be seen by others?

Furthermore, the burden of proof after peer review in academia is on the person disproving it. If my code is wrong at a basic level, what good does it do for anyone? If someone is to disprove my paper, they should reimplement the code in order to account for errors.

Does this excuse spaghetti level code that often accompanies papers? Of course not. Scientists have a lot to learn from software engineering about proper programming skills, but programming is simply another tool in the repertoire, not something that should be put on a pedestal.


> coding is nothing more than a tool towards that goal.

That's an important insight that most devs need to understand at some point in their career, but don't. It's not even exclusive to business goals, but sanity and complexity ones as well.


Chances are, coding isn't your "real" job in a lot of cases (including most software engineers and programmers). Your real job is solving a problem for someone, using code. Good software architecture, coding style, etc. are there to help you achieve this goal, but the end user of the software doesn't care about them.


No more than a home owner cares whether or not their house's blueprints were printed on a napkin. If such a thing were to happen, it would be indicative of an incompetent contractor.


Unless you're the first owner of the house, you probably don't even have meaningful engineering drawings for it. And even if you did, they are not what you care about. (Well, it'd be nice to have some documentation about the wiring, plumbing, which walls are load bearing, etc. but that's another rant.)

What you care about as a homeowner is that your house is solid, watertight, safe to live in, acceptable to look at, and meets your needs as a tenant. Whether your builder designed it down to the last tack and cable-tie in SolidWorks, or sketched it on a napkin, or made it up as they went along, makes no difference to you as the homeowner.

Quality of process is only ever a proxy for quality of results.


But it's an extremely good proxy. You don't see good results coming out of bad tools on a regular basis.


If the contractor's blueprints had been meticulously peer reviewed, why does the blueprint medium matter?


You're speaking in absurdities.

There are things you just don't see together. You don't see quality blueprints printed on napkins, and you don't see quality code that is well-suited to its task written in a sloppy way. Technically speaking, you can lose weight eating all your meals from McDonald's. But the person who eats all their meals from McDonald's isn't going to exercise the portion control necessary to do it. It's just not a thing that happens often enough to bother considering it.

Sloppy code is an excellent indicator that the program is a buggy piece of junk. I don't care if it's technically "possible" to write a "good" program in a sloppy way. If your code is sloppy, you aren't that person who makes that program. The person who is capable of making good programs doesn't write sloppy code, even if they are technically capable of doing it.


So is something like Haskell a potentially better fit? When you use terms like "immutable" and "unchanging", makes me think of functional programming.


"Crashes (null pointers, bounds errors), largely mitigated by valgrind/massive testing"

Once upon a time I had lunch with a friend-of-a-friend whose entire job, as a contractor for NASA, was running one program, a launch vehicle simulation. People would contact her, give her the parameters (payload, etc.) and she would provide the results, including launch parameters for how to get the launch to work. Now, you may be thinking, that seems a little suboptimal. Why couldn't they run the program themselves; they're rocket scientists, after all?

Unfortunately, running the program was a dark art. The knowledge of initial parameter settings to get reasonable results out of the back end had to be learned before it would provide, well, reasonable results. One example: she had to tell the simulation to "turn off" the atmosphere above a certain altitude or the simulation would simply crash. She had one funny story about a group at Georgia Tech who wanted to use the program, so they dutifully packed off a copy to them. They came back wondering why they couldn't match the results she was getting. It turns out that they had sent the grad students a later version of the program than she was using.

Anyway, who's up for a trip to Mars?


Here's the thing that grinds my gears. Let's see scientists apply that same attitude toward papers. Let them label a bunch of equations poorly, and not label a few, have them explain concepts out of turn in different places in the document, have them produce shitty, unreadable figures, let's see how that turns out.

The issue is that the code which eventually leads to their results isn't public, their reputation isn't riding on it, and so they can pretend they understand what they're talking about when they come to publishing; one or two looks at their code would let you know how much they're bullshitting. But when it comes to a paper, well, they will be judged on that, so they can't be messy there.

It's okay if it's a one off code for one group, that's fine. But when a code is vital for so many people, for it to be that terrible and inaccessible?

Simple solution: if you are funded by the tax payer, what you produce should be accessible by the tax payer (absent defense restrictions). Demanding accessibility for gov't funded papers is good but I feel the same restriction should apply to code.


> Let them label a bunch of equations poorly, and not label a few, have them explain concepts out of turn in different places in the document, have them produce shitty, unreadable figures, let's see how that turns out.

This is what they already do, though...


That's incredibly common in enterprises. As the fad moves enterprises to an ITIL model, where work is done by a mix of matrixed in-house and outsourced teams, the overhead of "best practices" becomes important.


His first list really, really hand-waves the problems that style of coding can cause. Just use better tools or run valgrind? It never is that simple.

One aspect of scientific coding is that it can have very long lifetimes. I sometimes work on some code > 20 years old. Technology can change a lot in that time frame. For example, using global data (common back then) can completely destroy parallel capability.

The 'old' style also makes the code sensitive to small changes in theory. Need to support a new theory that is basically the same as the old one with a few tweaks? Copy and paste, change a few things, and get working on that paper! Who cares if you just copied a whole bunch of global data - you successfully avoided the conflict by putting "2" at the end of every variable. You've got better things to do than proper coding.

Obviously, over-engineering is a problem. But science does need a bit of "engineering" to begin with.

Anecdote: A friend of mine wanted my help with parsing some outputs and replacing some text in input files. Simple stuff. He showed me what he had. It was written in fortran because that's what his advisor knew :(

Note: I'm currently part of a group trying to help with best practices in computational chemistry. We'll see how it goes, but the field seems kind of open to the idea (ie, there is starting to be funding for software maintenance, etc).


> there is starting to be funding for software maintenance

Any reference concerning this point? I am interested!


Here is a starting point for one big movement in my field:

https://www.nsf.gov/news/news_summ.jsp?org=NSF&cntn_id=18934...

It's not quite "maintenance", but is definitely a step away from just writing software to get an answer and then abandoning it.

Also, anecdotally, there is a movement towards more open-source software. Slowly but surely, things are moving in the right direction.


I think some of the author's criticisms are misplaced.

Long functions — Yes, functions in scientific programming tend to be longer than your usual ones, but that's often because they cannot be split into smaller functions that are meaningful on their own. In other words, there's simply nothing to "refactor". Splitting them into smaller chunks would simply result in a lot of small functions with unclear purposes. Every function should be made as small as possible, but not smaller.

Bad names — The author gives 'm' and 'k' as examples of bad variable names. I think this is a very misplaced criticism. Unless we are talking about a scientific library, many scientific programs are just implementations of some algorithms that appear in published papers. For such programs, the MAIN documentations are not in the comments but the published papers themselves. The correct way to name the variables is to use exactly the symbols in the paper, but not to use your favourite Hungarian or Utopian notations. (Some programming languages such as Rust or Ruby are by design very inconvenient in this respect.) As for long variable names, I think they are rather infrequent (unless in Java code); the author was perhaps unlucky enough to meet many.


This is so true:

"Many programmers have no real substance in their work – the job is trivial – so they have too much time on their hands, which they use to dwell on "API design" and thus monstrosities are born"

It also explains proliferation of "cool" MVC and web frameworks, like Node.js, Angular, React, Backbone, Ember, etc.


I agree, except (to be pedantic) I think Node is misplaced: it's just a runtime with a tiny standard library, nothing to do with MVC. I've actually found Node able to be simple and nice by mostly using streams and a couple of small libraries, something most Node programmers ignore.

I actually think another problem is that programmers spend too much time following what the "big players" do and mistakenly apply that stuff to their so-called trivial work. I've wasted hours trying to sift through code from companies who thought they needed Facebook/Google-tier infrastructure with stuff like Relay/GraphQL. A simple CRUD Rails/Django/Phoenix/Node app would've been fine.


As much as it's not an MVC framework, I'd argue Node fits right in: it was created by devs with waaaay too much time on their hands.


Or enterprise environments like JavaEE? Or would complexity in that area, be explained by attempted vendor lock-in?

Node using Javascript at least has first-class functions, which makes the code more expressive and readable than a lot of Java + Spring XML.

And React + flux eliminates a huge class of bugs due to state side-effects by getting rid of stateful object graphs altogether. The motivation is more profound than bored programmers tinkering with the API because they have too much time on their hands.


Still, I think that these frameworks are important, if not for their successes, then for their failures.

The thing that's more disturbing is that we (as a profession) can't seem to learn from our failures. There is too little structured research on best practices.


Our profession is younger than most of our grandparents. I want to see what programming practices will be like in 200 years (assuming we're still programming in some way by then).


Mostly I agree: bad naive code is better than bad sophisticated code.

Also, science very frequently only requires small programs that are used for one analysis and then thrown away. It's OK to have a snarl of bad Fortran or Numpy if it's only 400 lines long.

BUT: scientific projects are often (in my old field, usually) also engineering projects. Such experiments are complex automated data-gathering machines, hardware that takes roughly similar data runs tens of thousands of times.

There should be some engineering professionalism at the start to design and plan such a machine. Especially the software, since it is mostly a question of integrating off-the-shelf hardware.

But PIs think:

(A) engineering is done most cheaply by PhD students -- a penny-pinching fallacy.

(B) that their needs will grow unpredictably over time.

B is true, but it is actually a reason to have a good custom platform designed at the start, so that changes are less costly. Your part-time programmer is going to develop many thousands of lines of code no one can understand or extend. (I've done it, I should know.)


Even B is false a lot of the time. Just look at most of this 'big data': it can all fit on my mobile phone.


I believe this post is fundamentally misguided, but I can see how the author got there. In fact I see it as a sort of category error. When you talk about a style of programming being "good" or "bad", I always want to ask "for what?". I wonder if the author has thought about what would happen if everyone adopted the "scientific" style they are alluding to.

Most of what the author describes as the problems of code generated by scientists are what I would call symptoms. The real problems are things like incorrect abstractions, deep coupling, and overly clever approaches with unclear implicit assumptions. Of course this makes maintenance and debugging more difficult than they should be, but the real problem is that such code does not scale well and is poor at managing the complexity of the code base.

So long as your code (if not necessarily its domain) is simple, you are fine. Luckily this describes a huge swath of scientific code. However, system complexity is largely limited by the tools and approaches you use... all systems eventually grow to become almost unmodifiable.

The point is, this will happen to you faster if you follow the "scientific coder" approaches the author describes. Now it turns out that programmers have come up with architectural approaches that help manage complexity over the last several decades. The bad news for scientific coders is that to be successful with these techniques you actually have to dedicate some significant amount of time to learning to become a better programmer and designer, and learning how to use these techniques. It also often has a cost in terms of the amount of time needed to introduce a small change. And sometimes you make design choices that don't help your development at all. They help your ability to release, or audit for regulatory purposes, or build cross-platform, or ... you get the idea. So these approaches absolutely have costs. You have to ask yourself what you are buying with this cost, and do you need it for your project.

The real pain comes when you have people who only understand the "scientific" style already bumping up against their systems ability to handle complexity, but doubling down on the approach and just doing it harder. Those systems really aren't any fun to repair.


It's an interesting discussion, and as the article points out, "software engineer" code has some issues as well.

There's also an issue that code ends up reflecting the initial process of the scientific calculation needed, which might not be a good idea (but if you depart from that, it causes other problems as well).

Also, I'm going to be honest: a lot of software engineers are bad at math (or just don't care). In theory a/b + c/b is the same as (a+c)/b; in practice you might be near some precision edge that you can't deal with directly, and hence you need to calculate this in another way.

Try solving a PDE in C/C++ for extra fun
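The precision point is easy to demonstrate with IEEE 754 doubles (a standalone sketch, not tied to any particular PDE):

```python
import math

# Floating-point addition is not associative: algebraically equal forms
# of the same sum round differently.
a = (0.1 + 0.2) + 0.3    # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)    # 0.6

# The classic accumulation example, and a compensated sum that fixes it:
naive = sum([0.1] * 10)        # 0.9999999999999999
exact = math.fsum([0.1] * 10)  # 1.0
```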


It's worse than you say (and think?). For example: in general, floating point equality isn't transitive, and addition isn't even associative.

Not only do those "bad at math" software engineers get this wrong; most of the scientists do too. These two groups often make different types of errors, true, but nearly everybody who hasn't studied numerical computation with some care is just bad at it.


I'm 80% "software engineer" and 20% "researcher" and have to play both roles to write supercomputer code (I'm the minority, most peers are more researchers). These issues are important right now, as the govt is investing in software engineering due to recent hardware changes that require porting efforts. We recognize the pitfalls of naive software engineering applied to scientific code, and would like to do things more carefully. I don't think we should have to choose one or the other; with proper communication we can achieve a better balance.


In his excellent book [1], Andy Hunt explains what expertise is with a multi-level model [2], where a novice needs rules that describe what to do (to get started) while an expert chooses patterns according to his goal.

So, "best practices" are patterns that work in most situations, and an expert can adapt to several (and new) situations.

[1] https://pragprog.com/book/ahptl/pragmatic-thinking-and-learn...

[2] https://en.wikipedia.org/wiki/Dreyfus_model_of_skill_acquisi...


The title of this article should really be "Why bad scientific code beats bad software engineer code."

It contrasts a bunch of bad things scientific coders do, and a bunch of bad things bad software engineers do. There's no "best practices" to be seen on either side.


The article overlooks a massive source of problems: the problems he describes in engineers' code usually start to become annoying at larger scale. The problems he describes in scientists' code rarely happen at scale, because such code can't be extended significantly. I feel it's weird to compare codebases that probably count in the thousands of lines with codebases that count in the hundreds of thousands or millions of lines of code.

Also it is worth noting that every single problem he has with engineers' code is described at length in the literature (Working Effectively with Legacy Code, the DDD blue book, etc). Of course these problems exist. But this is linked to the fact that hiring bad programmers still yields benefits. I believe this is not something that we can change, but if the guy is interested in reducing his pain with crappy code, there are solutions out there.


> Long functions

This isn't the worst thing, as long as it gets refactored when parts of that function need to be used in multiple places.

> Bad names (m, k, longWindedNameThatYouCantReallyReadBTWProgrammersDoThatALotToo)

I can live with long-winded names; while slightly annoying, they at least still help with figuring out what's going on.

What I can't stand are one or two letter variable names. They're just so unnecessary. Be mildly descriptive and your code becomes so much easier to follow, compared to alphabet soup.

What annoys me about stuff like this is that it just feels like pure laziness and disregard for others. Having done code reviews of data scientists' code, I find they just don't want to hear it. They adamantly don't care, compared to my software engineer compatriots, who would at least sit there and consider it.

But this is just my own anecdotal experience.


As a poster above pointed out, a lot of scientific code is an implementation of a mathematical device, and the scientist is trying to make their equations come to life. In math, many equations are simplified to single-letter variables in order to avoid insane complexity. Many of the scientists actually are thinking in terms of 'S', 't' and 'v', etc. What are the particle's x, y, t coordinates, and how do they get me v, p and l? So that they can write out:

v = ((x2 - x1)^2 + (y2 - y1)^2) ^ (1/2) / t

rather than:

velocity = sqrt(pow((locationX2 - locationX1),2) + pow((locationY2 - locationY1),2)) / duration

The latter is AWFUL mathematics, and very real code. (and that is an easy equation. I've had to implement very very complicated calculus into objective-c code and it is absolutely horrid what comes out as 'code', as clean as that code might be. It in no way whatsoever resembles the elegance of the math that birthed it.)

When I first started, I naively tried to write math code with the natural Objective-C objects and ended up on the very wrong side of the language. I realize the mistake now, but it's very awkward to ask the (scientist) programmer to go along programming with the language's tutorialed objects, then to tell them, "btw, that 'NSNumber' you have, can't be used as an exponent, along with that 'float' over there. And you can't add NSNumbers and 'integers'. Oh, you want to multiply two NSNumbers together? You want to write an equation with NSNumbers on one line!? Go for it. Oh, and you want to do a cross-product on a matrix? Ha!".


It is tradition in mathematics (and physics, and maybe other sciences) to use single letter names. A function is f, a variable is x, a parameter is a. These short names are intuitive for the scientist who wrote the code, even if programmers have different conventions.


The meat is in the footnote, as always.

> (In fact, when the job is far from trivial technically and/or socially, programmers' horrible training shifts their focus away from their immediate duty – is the goddamn thing actually working, nice to use, efficient/cheap, etc.? – and instead they declare themselves as responsible for nothing but the sacred APIs which they proceed to complexify beyond belief. Meanwhile, functionally the thing barely works.)

It seems the author has been plagued with programmers who avoid taking responsibility. One strategy for creating job security is to build a system too complex for anyone else to maintain it. Perhaps the author's colleagues are using this strategy.

It's hard to take complaints about "best practices" seriously when the practices described are not best.


Working in this area (and coming from a math background), the biggest issues that I have with most scientific and engineering code are:

1) lack of version control

2) lack of testing

Everything else (including the occasional bad language fit) is usually a distant 3rd.


    > Simple-minded, care-free near-incompetence can be
    > better than industrial-strength good intentions 
    > paving a superhighway to hell.
Love this line.

I think the thing about bad scientific code that makes it good is that you can often put really good walls around what goes in and what comes out, to the point that you can confine the danger of the bad code to just that component.

Software architects, on the other hand, often try to pull everything into a single "program", so that in the end you sum all of the weak parts. All too often, I have seen workflows where people who used to postprocess output data get pulled into doing it in the same run that generates the data.


As always, the right way is somewhere down the middle.

I recently inherited a blob of "scientific code" with basically no abstraction. Need to indicate the sampling period? Just type .0001--that'll never change. Need to read some files? Just blindly open a hardcoded list of filenames and assume they're okay--it'll always be like that, right? And of course, these files are in that format and there's no need to check. Of course, after this code was written, we bought new hardware. It gathers similar data, but samples at a completely different frequency, has a different number of channels, and records the data in a totally different way.

We could fork the code, find-and-replace the sampling rates, and all that, and maintain a version for each device we buy. Or we could write a DataReader interface, some derived versions for each data source, and maybe even the dreaded DataReaderFactory to automatically detect the filetypes.

Guess which approach will work better in a few years?


In my experience, there is a middle path. Hard-code the sampling period, but put it in a constant `SAMPLING_PERIOD`. Then when the hardware changes and things break, refactor the I/O code into a DataReader object. If and only if you need to support several formats, either implement your DataReaderFactory, or write a class for each filetype.


Statistically, you aren't going to make it a few years.


I recently followed a course on "Principles of Programming for Econometrics", and although I knew a lot about programming already, I learned a lot about being structured and about documentation. The professor ran some example code which he wrote 10 years ago! He wasn't really sure what the function did anymore, and BAM, it was there in the documentation (i.e. the comment header of the function).

I used to just hack stuff together in either R or Python but that course really got me thinking about what I want to accomplish first. Write that down on paper. And then and only then after you have the whole program outlined in your head start writing functions with well defined inputs and outputs.


Why not use the computer to help you define and understand the problem? It will be much faster to iterate quickly at a REPL and then write the cleaned-up version later, rather than trying to model the whole thing in your head first.


I know a lot of math majors thrown into c++ jobs that write unreadable code almost forgetting they are allowed to use words and not just single letters (though they would probably be fine in the functional programming scene). There's a learning curve either way, write like your co-workers unless you have the experience to know your co-workers suck.


This has nothing to do with scientific programming and everything to do with "best practices" being mind blowingly awful. Coupling execution and data is good for data structure initialization, cleanup, and interface. Everywhere else they should just be kept separate. Data structures should be as absolutely simple as possible, not as full and generic as possible.

Where people get into trouble many times is thinking that every transformation or modification of data should be inside one data structure or another, when really none of them should be except for a minimal interface.


I was introduced to the concept of "Don't hide power" by Tanenbaum's OS book (although it seems that Lampson is actually the original source [1]).

It always seemed to me a good design rule, but, even after 10 years of professional programming, it never actually clicked until last year.

I had always rigorously applied encapsulation, decoupling and information hiding, exposing only the minimal interface necessary to do the job [2].

While this leads to elegant designs which may even be efficient, in my experience it makes them hard to extend. You might need access to some implementation detail of a lower-level layer to either simplify the system or improve performance, but then you either violate encapsulation, break existing interfaces (violating the open-closed principle), implement the higher level directly inside the lower level (this is very common), or simply live with the inferior implementation.

I've now given up on complete encapsulation. I expose as many implementation details as possible, hiding only what's necessary to preserve basic invariants, and pushing abstractions only to the consumer side of interfaces.
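A minimal Python sketch of exposing a plain field first and adding access logic only once an invariant actually needs protecting (the `Sensor` classes are made-up examples, not from any real system):

```python
# Day one: a plain public attribute. No getters, no setters.
class Sensor:
    def __init__(self):
        self.reading = 0.0

# Later, if an invariant appears, the attribute becomes a property --
# every existing caller of `sensor.reading` keeps working unchanged.
class CheckedSensor:
    def __init__(self):
        self._reading = 0.0

    @property
    def reading(self):
        return self._reading

    @reading.setter
    def reading(self, value):
        # the invariant that finally justified access logic
        if value < 0:
            raise ValueError("reading must be non-negative")
        self._reading = value
```

In languages with properties, the "what if we need validation later" argument for day-one getters and setters largely evaporates.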

Paraphrasing Knuth, premature generalization is the root of all evil.

[1] http://www.vendian.org/mncharity/dir3/hints_lampson/

[2] these are common rules for OO designs, but by no mean restricted to it. In fact very little of what I've worked on could be called OO.


As a former scientist and now professional software developer I can confirm some of the observations of the article. This is because enterprise developers do premature flexibilization. And "Premature Flexibilization Is The Root of Whatever Evil Is Left", see

http://product.hubspot.com/blog/bid/7271/premature-flexibili...

But on the other side, most scientific code I've seen is simple (not in readability, but in that it uses simple abstractions), highly optimised, and delivers irreproducible results.

Why? Because most scientists don't write any kind of test. Nobody teaches scientists test-driven development, and most don't know about unit or integration tests, so there is no way to make sure a program generates consistent results. Scientists are happy if a program runs on their machine and produces a nice graph for their paper. If you ever want to reproduce the results of a non-trivial scientific simulation, good luck. You will discover that the results are highly dependent on the type and version of CPU, GPU, compiler, operating system, time zone, language, random generator seed, version of the programming language(s), versions of self-written libraries (which most often don't even have a version number), version of the build system (if one is used at all), etc. And that's why you will never see scientific code running outside of the scientist's computer.
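The random-seed item on that list is the cheapest one to fix. A minimal Python sketch (the `simulate` function is purely illustrative):

```python
import random

def simulate(n, seed):
    # A dedicated, explicitly seeded generator: the same seed produces
    # the same "random" draws on every run. This is the smallest step
    # toward reproducibility (platform and library versions still matter).
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Recording the seed alongside the output, and pinning library versions (e.g. in a requirements or lock file), covers several more items on the same list.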

TLDR; Both (scientist and professional developers) can learn from each other.


I'd love to hear about the tools which almost completely mitigate parallelism errors.

The author's list of things that are wrong with "software engineers"' code is 50% "things that are just language features" and 50% "bad ways to use language features that nobody thinks is best practice in software engineering".

Part of the irony is that a lot of the more hairy software engineering techniques he decries are used by the people writing the platforms and libraries that scientist programmers rely on, to make it possible for their "value of everything, cost of nothing" code to actually run well.

There is a big difference in attitude between scientist programmers and software engineers.

Often, a scientist already has the solution to the problem, and is just transcribing it into a program. The program doesn't need to be easy to understand in isolation, because a scientist doesn't read programs to understand somebody else's science, she reads the published peer-reviewed paper. After all, if you wanted to understand Newtonian dynamics, you wouldn't start by reading Bullet's source, even if it's very well written. (I don't know if it is.)

Conversely, for a software engineer the program is a tool for finding the solution. Even though they're in a scientific field, if it's accurate to call them software engineers they'll be from a background where the program itself is the product, rather than the knowledge underlying the program.


I think the fundamental problem is that programmers have been taught that "abstract = good" in all things.

How often do you hear someone say they "abstracted" a piece of code or "generalized" it, without anyone asking why? Or how often do people "refactor" things by taking a piece of code that did something specific, and giving it the unused potential to do more things while creating a lot of complexity? The problem with "abstracting" things is it means behaviors that were previously statically decidable can now only be determined by testing run-time behavior, or the key behaviors are now all driven from outside the system (configuration, data, etc.)

Also by making things more flexible, your verbs suddenly become a lot more general and so readability suffers.

Kind of an aside, but whenever I see code where a single class is split into one interface and one "impl" I've taken to calling it code acne (because Impl rhymes with pimple). If you're only using an interface for ONE class it's a huge waste of time to edit two files! The defense is always something like "well what if we need a mock version for tests". Fine, write the interface when you actually do that.
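A hypothetical Python rendering of the "code acne" pattern next to the plain version (all the names here are invented for illustration):

```python
from abc import ABC, abstractmethod

# The "acne" version: an interface with exactly one implementation,
# written up front "just in case".
class ReportGeneratorInterface(ABC):
    @abstractmethod
    def generate(self, data): ...

class ReportGeneratorImpl(ReportGeneratorInterface):
    def generate(self, data):
        return ", ".join(str(d) for d in data)

# The plain version: one class. Extract an interface only when a second
# implementation (a test mock, a new backend) actually exists.
class ReportGenerator:
    def generate(self, data):
        return ", ".join(str(d) for d in data)
```

Both do the same work; the first just costs an extra type (and in Java, an extra file) before any second implementation has earned it.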


This article itches me on so many levels. It is not wrong directly but it is definitely not the truth either. I expected more from someone who claims to be a scientist.

The main issue I have with the piece is the oversimplification of the equation, to such an extent that important variables are removed without mention or explanation of their removal.

An example would be project size. Yes, for FizzBuzz globals are probably fine, and FizzBuzz Enterprise shows beautifully that overengineering is a thing (https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...). All of the author's statements would hold here. We all agree and smile. But the same architectural choices make sense in many large enterprise projects. Take the comment on large numbers of small files, for example. This gives fewer merge conflicts (among many other things). Yes, working alone on your tiny project you won't notice, but try working in a single large file with 100 devs committing to it. Good luck with the merge conflicts! Large methods? Same issue: everybody has to change code in that one method, merge conflict. Inheritance? A nice thing if you build an SDK for others to use and want to hand them a default base version. They can extend and override your virtual methods to get custom behaviour. No code duplication which you have to maintain and keep in sync! Wow!

Next up I would like to address the difficult naming. Everybody nods that that was bad. Nice to write it down. However, from a scientist I would expect a disclaimer that this was based on personal experience with programmers and not the ground truth for programmers, or a citation of a credible source. I'd say there is only a small fraction who do that. Disclaimer on this: both programmer and scientist should work together if one side does not understand the naming conventions for the project.

Simple-minded, care-free coding can give you a prototype, which is the scientist's job. Enterprise programmers (who often are computer scientists) give you your product.

Tl;dr: stop comparing apples and oranges. Or, as a true scientist, at least describe the context and the omission of various variables. Oh, and share your goddamn ugly code so we don't need to read your papers and implement it ourselves from scratch. That's the true waste here ;).


I am sure that the person who wrote this article did it for a reason and has been frustrated by "programmers". However, this is very anecdotal and, to be honest, doesn't deserve more than a mere acknowledgement: yes, blindly applying software practices and adding more indirections is not always good, but creating robust, maintainable, non-ad-hoc software requires abstractions, indirections, and programmers.


The worst thing to ever happen to "best practices" was when managers found out about them. Suddenly, we were not allowed to think for ourselves and solve the problem at hand, we also had to figure out what "best practice" to use to implement our solution.

And it's not like you can argue against "best practices". They're the "best" after all. So that makes you less than best, to oppose them!


The inexperienced-CS-grad errors he describes are a maintenance nightmare, but those non-programmer errors cast a lot more doubt on the accuracy of the results. The importance of correctness depends on the problem I guess.


I think the article makes a decent case for "simple bad code" for small projects. In a bigger project this approach collapses, but in small to medium-sized ones you can do fine, and the ugliness of the code is "shallow", as I like to call it. That is, the problems are local and handled in simple, straightforward ways.

The "software engineer" code he describes sounds like the over engineered crap most of us did when getting out of the clever novice stage and learned about cool and sophisticated patterns which we then applied EVERYWHERE.

I guess some never come out of that phase, but the code of real master programmers is simple and readable, only uses complex patterns when truly needed, and has no need to show off how clever the author is in the code.

You know, the people who made it necessary to invent "POJO" (http://www.martinfowler.com/bliki/POJO.html).


The part where scientific code gets nasty is when the "simple bad code" from a proof-of-concept suddenly gets abruptly promoted to the core of some pipeline. This happens a lot.


Previous discussion: https://news.ycombinator.com/item?id=7731624 (2 years ago, 168 comments)


If the URL is exactly the same, I wish HN would just surface the old comment thread automatically at the top of the page.


I know what you mean by 'messy scientific code', hairy stuff. I deal with it almost on a daily basis. 10-element tuples, weird names, etc. Makes you wanna puke at the beginning. But then, as I get to understand what they are trying to say (i.e. the business purpose), things get easier. Somehow I remember what the 6th element in the tuple is and where, approximately, in a 2000 LOC function I should look for something. BUT... when it comes to a 'properly engineered' piece of infrastructure OOP shit filled with frameworks and factories, I have no idea. No matter how hard I try, I cannot remember nor understand what the fuck they are trying to say. My guess: this is because they have got nothing to say, really.


The examples he gives seem like using complex features of programming languages for the sake of it rather than best practices.


Remember that algorithms, data structure design and API experience are also crucial parts of coding. These are not necessarily things that will be learned by iterative hacking.

Scientific data sets can be huge, and there are all kinds of ways to write code that doesn’t scale well.

If the scientific code is trying to display graphics, then you really have to know all the tricks for the APIs you are using, how to minimize updates, how to arrange data in a way that gives quick access to a subset of objects in a rectangle, etc.


This is describing two stages in the growth of programmer skill.

The researchers are at beginner stage and make classic beginner-stage mistakes. The developers are at intermediate stage, and they make classic intermediate-stage mistakes.

There is a later stage of people who can avoid both, but the author probably hasn't worked with anyone in that stage. Which is not surprising, because once you're that experienced there are big financial incentives to get out of academia.


I think you can reframe this debate pragmatically and widen its applicability significantly: at what point is "bad" code more effective than the alternatives?

If you get down into a debate about "best practices", you'll have to concede that anyone writing the code the author is talking about might be using "best practices" in some explicit way, but isn't "following best practices", which are designed to avoid precisely the difficulties he outlines. On the other hand, it's true that most code out there is bad code, and that heavily architecting a system with bad code can be even more of a nightmare than more straightforward bad code.

The real question is: when should scientists favor bad code? I'm a huge fan of best practices and of thoughtful and elegant coding, but I could see an argument being made that in most circumstances, scientific code is better off being bad code, as long as you keep it isolated. I'd love to see someone make that argument.


From the comments:

> Of course, design patterns have no place in a simple data-driven pipeline or in your numerical recipes-inspired PDE solver. But the same people that write this sort of simple code are also not the ones that write the next Facebook or Google.

> post author: Google is kinda more about PageRank than "design patterns".

wut


What's the problem?

Google's initial success can be attributed to the effectiveness of the PageRank algorithm, not the quality of the code that implemented it. It doesn't matter if the first implementation (or even the current implementation) was a horrible mess of gotos in a single ten-thousand-line-long function, from the point-of-view of its users.


Obviously if code works it doesn't matter what it looks like to users, but the whole point of code design is making code easier for programmers to work with and maintain so that it stays working for users. Writing and maintaining Google scale software without design principles would be a nightmare.


OK - not to defend all professional programmers - but it seems quite reasonable that the tasks where people are hired specifically to write code are bigger and more complicated than the programming tasks completed by people who do programming only as a small part of their job.


Two points

1. All his developer errors are not best practices.

2. Writing domain logic is sooo much easier than writing the plumbing/integration logic that comprises most enterprise development. One of the hardest things in software is defining the right abstractions and names. But in domain logic 80% of those names and abstractions have already been created.

Oh what am I gonna call this thing my company owns that generates profit and losses for us and generally resides at a single location. Maybe I'll call it a "Store". Versus what do I call this thing that decides whether we pull information from bing or google based on complicated rules around performance, cost, and time of day. BingGoogleApiDecider?


They're talking about a different kind of best practices, but I highly recommend taking a look at the Core Infrastructure Initiative's Best Practices Project [0], which was created partially in response to the Heartbleed disaster. It's a list of 66 practices that all open source software, including scientific software, should be following.

[0] https://github.com/linuxfoundation/cii-best-practices-badge/...

(Disclosure: I'm a co-founder of the project. It's completely free and open source, and the online BadgeApp itself earns the best practices badge.)


Both sides of this argument are correct because both sets of practices are used for different purposes.

A mid-sized or large software project (say 100k+ LOC) with single letter variables all over, global variables, etc. would be an absolute maintenance nightmare. So the software engineering perspective is correct there. And in large projects it really is helpful to split projects up into multiple directories, use higher level abstractions, etc.

At the same time, most scientific code bases are not in that category. They don't have dozens (or hundreds) of people working on them, they're not going to be expanded much beyond their original use case, and they're mostly used by the people writing the code and/or a small group around those people.


This is a debate I engage in often. You can write "prototype" code to solve an "algorithmic" or "scientific" problem and it can be sloppy, but if you are planning on integrating it into a large project your team will run into problems unless the code is extremely contained.

It's true that there is a growing rebellion against best practices and design patterns, and I think in many cases some practices are dogmatic. However, the part that disturbs me is that inexperienced programmers are using it as an excuse to not apply basic principles they don't understand in the first place.

I've seen experienced software engineers that are lazy and spend more time criticizing the work of others than actually producing anything themselves, and I've seen novices that have poor fundamentals but grind for weeks to solve difficult "scientific" problems albeit with horrendous code that proves to be not maintainable in the long run. I find that in the latter case (I'll call them "grinders"), the programmer takes much longer to solve their problem because they have such limited coding experience (I've been asked many times to help debug trivial problems that result from not understanding basic concepts like how recursion works).

The author of this article does a good job of identifying the characteristics of this low-quality "scientific" code, especially that it uses a lot of globals, has bugs from parallelism, and has other bugs and crashes that are not understood. The author seems to insinuate that testing is the way to mitigate the bugs and crashes; this is partially true, but it's better to write code you understand in the first place, instead of relying on testing to fix everything so you don't continually introduce new bugs.

Grinders can benefit from understanding best practices and learning programming and computer science fundamentals. That way they can make their code more robust, code faster, and truly understand when they should and shouldn't apply a best practice. Software engineers can improve by matching the work ethic of the grinders and explaining where the grinders are making mistakes.


"try rather hard to keep things boringly simple"

Good engineering does mean keeping things boringly simple. You should only make things complex to hit a performance target, match complex requirements, or avoid greater complexity somewhere else.

Some types of complexity are subjective. If you need to parse something, bison/yacc is often a great choice; but for a simple grammar I could see how someone who doesn't know it could say it introduces needless complexity.
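To make the "simple grammar" case concrete: for a grammar as small as "integers joined by +", a parser generator buys nothing. A hedged Python sketch (the function name is invented):

```python
def eval_sums(expr):
    # "1 + 2 + 3" -> 6; for this grammar, split() is the whole parser.
    # int() tolerates the surrounding whitespace left by split("+").
    return sum(int(term) for term in expr.split("+"))
```

The moment the grammar grows precedence, parentheses, or error recovery, the bison/yacc "complexity" starts paying for itself; the judgment call is about where that line sits.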

Programming is writing, and like all writing, you are communicating with some audience (in the case of software, it's other developers). If you lose track of who you are writing for, you'll not succeed.


This is a mess of an essay, and it does little to persuade me that giving domain experts free rein to make software messes is in any way a good idea.

One of the criticisms applied to software engineers -- the one about bad abstractions like "DriverController" and "ControllerManager" etc. -- is a huge pet peeve of mine because it's basically a manifestation of Conway's Law [0]. It indicates that the communication channels of the organization are problematically ill-suited for the type of system that is needed. The organization won't be able to design it right because it is constrained by its own internal communication hierarchy, and so everyone is thinking in terms of "Handlers" and "Managers" and pieces of code literally end up becoming reflections of the specific humans and committees to which certain deliverables are due for judgement. This is not a problem regarding best practices at all -- it's a sociological problem with the way companies manage developers.

Domain specific programmers aren't immune to this either. You'll get things like "ModelFactory" and "FactoryManager" and "EquationObject" or "OptimizerHandler" or whatever. It's precisely the same problem, except that the manager sitting above the domain-specific programmers is some diehard quadratic programming PhD from the 70s who made a name by solving some crazy finite element physics problem using solely FORTRAN or pure C, and so that defines the communication hierarchy that the domain scientists are embedded in, and hence defines the possible design space their minds can gravitate towards.

There is definitely a risk on the software development side of over-engineering -- I think this is what the essay is getting at with the cheeky comments about too much abstraction or too much tricky run-time dispatching and dynamic behavior. But this is part of the learning path for crafting good code. You go through a period when everything you do balloons in scope because you are a sweaty hot mess of stereotyped design ideas, and then slowly you learn how only one or two things are needed at a time, how it's just as much about what to leave out as what to put in. The domain programmers who are given free rein to be terrible, and are never made to wear the programming equivalent of orthopedic shoes to fix their bad patterns, will never go through that phase and never get any better.

[0] < https://en.wikipedia.org/wiki/Conway%27s_law >


This is funny because the author is exactly right, but I think he's misidentified the poor coders. The folks he's complaining about are academic coders without a lot of commercial experience, which tend to make all of those errors.

He also nails it when he says "idleness is the source of much trouble"

In the commercial world, you code to do something, not just to code (hopefully). So you get in there, get it done right, then all go out for a beer. You don't sit around wondering if there's some cool CS construct that might be fun to try out here (At least hopefully not!) Clever code is dangerous code.

Good essay.


I've seen tons of over-engineered code in the real world. People with the title of 'architect' abstracting every last bit of code so that it's impossible to make sense of. Everyone starts off wanting to make a 'powerful framework' that can do everything, but end up with an over complicated mess of configuration that makes it too difficult to do anything with it.

I've seen this happen multiple times at multiple companies.


What can I say. I was trying to be kind.

It's a noob error -- and the worst part? People make these mistakes, create huge messes, then go on to other companies and do it all over again.

The general point is this: imperative, OO code has a known set of anti-patterns which good practitioners become familiar with and avoid. If you're fresh out of school, making these mistakes is just how you learn the ropes. Using those folks as an example was much kinder than complaining about the lousy state of some programmer/architects in general. At least the college kids have a good excuse :)

ADD: I'll say this a different way. Yes, I've seen tons of it in the wild. I do not consider these people to be professional programmers. I consider them folks who got technically stuck at some point in their education.


I saw a really good / fun post about writing a Hello World app by years of experience.

The first example was

print "Hello world!".

It gradually started adding functions, inheritance and other features with each year of experience. Then after ten years it went back to print "Hello World".

That's been the reality for me. Understand the complex features of a language, but more importantly, learn when they are appropriate.



Almost. I have tried to find it a few times but never been able to since I first stumbled upon it. The funniest part was the end result (the most experienced programmer) being exactly the same as the first - which this version is missing.



Yes, thats the one!


Actually it's not exactly the same: he added a README.


This truly cannot be overstated. These 'architects' have it in their minds that they can create an app framework that can handle _any_ change that comes along, and they sell the umpteenth rewrite of the software to management based on this fallacy.


In the commercial world, especially at the enterprise level, coders are only allowed to follow beautiful designs of UML layers created by business analysts and solution architects who haven't written a single line of code in the last 15 years.


Do places like that actually still exist?


Yes, but now the business analysts and architects call themselves scrum masters and product owners.


Yes, lots of them, especially at DAX level in projects with offshoring components.


> the products of my misguided cleverness.

To me this is the take-home. For a long time I would try to find clever solutions to problems, or just try to be clever in general, and it is not just other people but your own future self that has to deal with it. This also applies to other parts of academic life, such as grant writing. Code is also about communication with other people, and if you are clever then you had better be able to explain your cleverness in a way others can understand. KISS.


It's because the programmers aren't involved in the science being undertaken. They're put in a position where they are just programming for programming's sake.


Might have been true a decade ago. When simulations performed on a laptop in Matlab were enough for dissertation quality research. But data set size has exploded. If you are currently in school, learn how to move your research to the cloud, and learn some best cloud practices. Best prep for the future to come. And if you decide to leave academia you can possibly nab an interview at Netflix ;)


Flaws uncovered in the software researchers use to analyze fM.R.I. data: https://news.ycombinator.com/item?id=12378791

Wonder whether or not the software followed "best practices"...


Not sure how this article offers constructive criticism... Comparing the hardly avoidable issues caused by the specific scope and priorities of scientific work against plain "bad practices" has little value to me...


programming is subject to fashions just like everything else.

every few years something comes along and eventually gets recognition, then a following and then it becomes the 'one true way of doing things' and those that don't do it are mocked as out-of-date, old fashioned or clueless.

and then in another bunch of years, after the 'one true way' has been applied everywhere it shouldn't be, people point out the flaws and the cycle starts again with a new thing.

oo, patterns, corba/com, factoryfactoryfactory.

I'm personally waiting for Agile to finish its run


> so they have too much time on their hands, which they use to dwell on "API design" and thus monstrosities are born.

In their free time, they mostly go for refactoring the code, don't they?


Adding unnecessary abstractions is refactoring.


As someone who feels like I always complain about quality, I feel like I don't know how to actually write quality code. All code eventually turns into a nightmare. A lot of the code I see by coworkers and myself is super hacky. I really wonder if we're all just terrible programmers or if that's the natural evolution of code.

Apart from having a mentor, what are the best ways to learn about code quality? Books to read, for example, that I could then use to look at my own code and fix it? I really have no idea, when making decisions, what ends up being best over the long run.


Well, yeah, because Software Engineers are trained for building large projects and those "best practices" are aimed at exactly that, too.

Long functions, bad names, accesses all over the place, and reliance on complex libraries: those are errors which are acceptable at a small scale, but become horrendous when you build a larger project.

Many abstraction layers and a detailed folder structure might add a lot of complexity in the beginning, but there's not much worse than having to restructure your entire project at a later date.


This person has obviously never worked on a project of any scale. See where your ad-hoc practices get you when you have millions of LOC.

Can we all agree that there is good code and bad code and the difference between the two is often contextual, then move on. Geez.


I sometimes give a talk to startup companies, in which I tell them why their code should be horrible. It's an intentionally provocative thing to say, but there is reasoning behind it, and some of the same reasoning applies to a lot of scientific code. The linked article has a few comments that tangentially touch on my reasoning, but none that really spell it out. So here goes...

Software development is about building software. Software engineering is about building software with respect to cost. Different solutions can be more or less expensive, and it's the engineer's job to figure out which solution is the least expensive for the given situation. The situation includes many things: available materials and tools, available personnel and deadlines, the nature and details of the problem, etc. But the situation also includes the anticipated duration of the solution. In other words, how long will this particular solution be solving this particular problem? This is called the "expected service lifetime".

Generally speaking, with relatively long expected service lifetimes for software, best practices are more important, because the expected number of times a given segment of code will be modified increases. Putting effort into maintainability has a positive ROI. On the other hand, with relatively short expected service lifetimes for software, functionality trumps best practices, because existing code will be revisited less frequently.

Think of the extremes. Consider a program that will be run only once before being discarded. Would we care more that it has no violations, or would we care more that it has no defects? (Hint: defects.) That concern flips at some point for long-lived software projects. Each bug becomes less of a priority; yes, each one has a cost (weighted by frequency and effect), but a code segment with poor maintainability is more costly over the long term, since that code is responsible for the cumulative costs due to all potential bugs (weighted by probability) that will be introduced over the lifetime of the project due to that poor code.
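The break-even intuition above can be sketched with a toy cost model. All the numbers here are mine and purely illustrative (the commenter gives none); the point is only that the cheaper option flips as the expected service lifetime grows:

```python
# Toy cost model for the "expected service lifetime" argument.
# Numbers are made up for illustration; the crossover is the point.

def total_cost(upfront, cost_per_change, changes_per_year, years):
    """Upfront effort plus the cumulative cost of later modifications."""
    return upfront + cost_per_change * changes_per_year * years

# "Quick" code: cheap to write, expensive to modify later.
quick = lambda years: total_cost(upfront=1, cost_per_change=5, changes_per_year=4, years=years)
# "Maintainable" code: expensive to write, cheap to modify later.
clean = lambda years: total_cost(upfront=10, cost_per_change=1, changes_per_year=4, years=years)

for years in (0.25, 1, 5):
    winner = "quick" if quick(years) < clean(years) else "clean"
    print(f"{years:5.2f} yr: quick={quick(years):6.1f}  clean={clean(years):6.1f}  -> {winner}")
```

With these (arbitrary) numbers the quick version wins for lifetimes of a few months and the maintainable one wins after roughly half a year, which is the shape of the trade-off described above, whatever the real coefficients are in a given shop.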

So, short expected service lifetimes for software, prioritize correct behavior over maintainability; long expected service lifetimes for software, prioritize maintainability over correct behavior. The source code written by a brand-new company will be around for six months (maybe) before it gets factored away, or torn out and rewritten. During that time, less-experienced coders will be getting to know new technologies with foreign best practices, and those best practices will be violated frequently but unknowingly. Attempting to learn and retroactively apply best practices for code that will likely last a short period of time is simply more expensive (on average) than just making things work. The same applies to scientific code, which gets run for a graduate degree or two before being discarded. If the code wasn't horrible, I'd think that effort was being expended in the wrong places.

In my experience, most "fights" about best practices (whether a technique should be considered a best practice, or whether a best practice should be applied) usually boil down to people who have different expected service lifetimes in mind. (One of those people is probably considering an expected service lifetime of infinity.)




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: