
Harmful Gotos, Premature Optimizations, and Programming Myths - nkurz
http://videlalvaro.github.io/2015/02/programming-myths.html?
======
Isamu
Once I had to port a large legacy FORTRAN program to a Unix box. It was
structured entirely with goto, no procedures, functions, or control structures
- just thousands of lines of code with gotos. It was doing some complicated
iterative engineering calculations, goto loops inside of goto loops, gotos
jumping outside of the loops, jumping back in, etc. It was obvious somebody
kept tinkering with it over the years, adding new gotos to make some new
features work.

Very hard to analyze what was going on, very hard to refactor, very hard to
reason about this program.

I was able to break out some parts into functions, and turn some gotos into
loops, but for the most part it resisted my efforts.

This is what Dijkstra was complaining about - programs that resisted proofs
because they were hard or impossible to decompose. He was not pedantic about
goto.

You shouldn't worry about a goto here and there when it makes sense.

~~~
Animats
_Once I had to port a large legacy FORTRAN program to a Unix box. It was
structured entirely with goto, no procedures, functions, or control structures
- just thousands of lines of code with gotos._

I had to do something like that once. I printed out the program on fanfold
paper, took over a conference room with a long table, and traced out the
control flow with colored markers. I was able to fix the program. That was not
fun.

I have never coded a goto in any language since the days of FORTRAN and
assembler. If I need to bail out of the middle of something, I'll use a break
or a return, even if this requires encapsulating a loop in a function.

Note that languages with nested functions facilitate programming without
"goto". When you need to encapsulate something which may require a bailout,
it's easier.

Rust does nested functions well. The compiler can tell whether a nested
function references an outer scope and thus needs to be a closure. If the
function doesn't, an ordinary function is generated.

~~~
kstenerud
The trouble comes when you need to clean up resources while bailing out.
Exception handlers and RAII are great for this, but when you don't have such
features available, goto is pretty handy:

[https://github.com/kstenerud/KSCrash/blob/master/Source/KSCr...](https://github.com/kstenerud/KSCrash/blob/master/Source/KSCrash/Recording/Sentry/KSCrashSentry_MachException.c#L340)

If I ever had to add more resources or steps into the mix here, I'd be in a
world of hurt pretty quick if I were using a complicated nest of breaks and
if-else. One could forcefully encapsulate each step into a separate function,
but then you start to trend towards a forest of tiny little trees that you're
not even sure what to name anymore, and the overall structure of the function
and what it's doing become muddled.

~~~
Animats
It's hard to do right. RAII handles initialization fine, but error handling in
the destructors is hard. Raising an exception in a destructor seldom ends
well. Go's "defer" has the same problem - what do you do if there's an error
in a deferred call? Most languages with exceptions have problems in this area.

Python's "with" clause seems to be the best solution so far. The "with" clause
calls __enter__ at the start and __exit__ at the end. The __exit__ gets called
no matter how you leave the "with" clause. This includes exceptions and
returns. Importantly, there are parameters to __exit__ which provide
information about what's going on when an __exit__ is executed because of an
exception. Exceptions in __exit__ can be handled and passed on, even for
nested "with" clauses. Writing a proper __exit__ is a bit tricky, but that's
typically done only for standard resources such as files, connections, and
locks.

So nested error handling via exceptions can work well if the scoped open
mechanism ("with") and the exception mechanism play well together. It took a
decade to get that retrofitted into Python, but in the end, it worked well.

Rust's error types are sound, but result in rather bulky code. I've written on
this before.

------
numeromancer
The author would have done much better if, rather than poking sticks at those
who wish to forbid gotos, and posting vague and generic quotes about memes and
cargo-cults, they had provided several examples of code wherein the use of
gotos was a decided improvement, in terms of speed and clarity, over the best
that could be done without it. Anyone who has read code that has been twisted
out of shape, or entangled with unnecessary "done" flags and the like, and who
is open to such remonstrations, will recognize the virtue in this point of
view.

~~~
coldtea
> _The author would have done much better if, rather than poking sticks at
> those who wish to forbid gotos, and posting vague and generic quotes about
> memes and cargo-cults, they had provided several examples of code wherein
> the use of gotos was a decided improvement, in terms of speed and clarity,
> over the best that could be done without it._

He would only have done better by doing that if his goal was merely to educate
people that goto can be good too.

Which is very far from being his goal.

~~~
KedarMhaswade
Would you think [http://bit.ly/recurse-iterate](http://bit.ly/recurse-iterate)
does an okay job of working toward that goal?

------
keenerd
There was a similar blog post a while back about the standard deviation in
statistics. About how when the SD was discovered, the original paper was full
of disclaimers that the SD should only be used by statisticians working with
perfect distributions. The squaring would amplify the natural outliers present
in real-world data and make the SD too dangerous to trust. And yet we use the
SD for everything, even though the original reason for it (a computation
shortcut to save hand-labor) is pretty much meaningless.

I have a copy of the original paper but the paper lacks the historical
perspective of the post. I've never been able to find that blog post again.
Does anyone remember what I'm talking about?

~~~
rdtsc
I kind of remember SD being used because it has the same units as the samples
in the distribution.

~~~
keenerd
Nope. The alternative (mean absolute deviation) also has the same units. The
problem with the MAD was that it used an absolute value and so introduced a
discontinuity. If you are a mathematician working with a system of equations,
each discontinuity doubles the size of the system when taking
integrals/derivatives. The SD was a clever hack to hide the discontinuity. It
makes the mathematician's job of juggling equations much easier by changing
the workload from O(n^2) to O(n). If you are not doing calculus on the system
then the SD has much less utility. Negative utility even, because the
square/square-root hack used by the SD is identical to the MAD only for
perfectly normal distributions. For real-world distributions it tends to give
wrong results.

------
kstenerud
I've always stuck to my guns with goto, and will continue to do so. Any
sufficiently large C codebase will have some areas where goto makes the code
cleaner, less error prone, and easier to refactor (most commonly in error
handling code).

~~~
Liru
goto is merely a tool that got a bad rap simply because it has its roots in
basic assembler, back before structured programming was a thing. I agree with
you regarding its use, but I'm still surprised by just how many seemingly
educated programmers jump onto the "never use goto" cargo cult because of a
paper title that they misattribute to Dijkstra.

Ironically, from the code I've seen, I don't think I've ever seen anyone
actually _misuse_ goto, probably because the people who use it actually know
what they're doing. That's just anecdotal on my part, but still.

~~~
rbanffy
> I don't think I've ever seen anyone actually misuse goto

I've lived through the 80's, when 8-bit Microsoft BASIC was a popular
programming language. Misuse of GOTOs was quite common, albeit unavoidable.

~~~
jrochkind1
If it was unavoidable, what made it "misuse"?

~~~
spc476
Worst case, imagine a GOTO at the end of each line of code.

There was no lexical scoping in BASIC [1] (and all variables were global but
that's another topic) so one could literally GOTO any point in the program.

[1] I'm talking about the BASIC one found on most 80s home computers at the
time.

------
jarcane
I grew up coding in line-numbered, Disk Extended Color BASIC on a Tandy Color
Computer 3 for a number of years, before graduating to QBasic.

The thing I look back on about "goto" and "gosub" is that, within the confines
of what those simple control structures offered, much of my code wasn't
actually _structured_ that differently from what I would go on to write in
more procedural languages down the road.

If you factored out most of my programs into pseudocode (allowing for
limitations like global-only scope), the result would pretty much look like
any QB or C program seeking to tackle the same problems, with main loops,
clear procedural calls, etc. It was even idiomatic in those days to reserve a
var or even a point in memory to take that gosub's "arguments".

The result of which is that when I did move up to QBasic, it was a pretty
simple, easy transition, because it basically just allowed for a cleaner way
of writing in the style I was already hacking together with DECB's primitive
tools.

Good programming patterns emerge because they're the most efficient way of
doing things. Goto is just another tool to do what a body needs to do.

~~~
JoeAltmaier
I don't think BASIC is even capable of expressing what programmers do these
days. Maybe top-level app cut-and-paste logic. But not anything complex like
libraries and data structures etc.

~~~
coldtea
Have you looked at modern BASICs?

------
rblstr
Wow, the comments on this post really do sum up the author's points. The
article isn't about goto at all, but about programming myths, how they
perpetuate, and how HN is just letting us be lazier by only reading the titles
of posts. Why is nobody discussing the actual intent behind the article, which
is how myths in general, and programming myths in particular, begin and
propagate throughout our education and culture?

Instead, ITT, gotos are bad and here's why, gotos are ok and here's why, gotos
are good and here's why.

------
5outh
Suddenly, I feel like I should spend a heck of a lot more time reading classic
papers, books etc.

------
taeric
This was a fun read, though I am unsure of what the conclusion really is. Even
Knuth ended by going on about how he probably wouldn't use goto statements. He
just likes building a foundation of really fast/efficient techniques.

So sure, many of these things are mythical in scope and lore. But... the gist
of many of them is not so misunderstood as to be wrong.

~~~
wongarsu
I guess the point is that discussions like the linked github issue
([https://github.com/igorw/retry/issues/3](https://github.com/igorw/retry/issues/3))
should be over after two comments, not after a 196-comment flamewar. If we
criticise some programming technique we should first ask ourselves if that
technique is actually harmful in that context.

And then there's of course the greater narrative about the state of the
industry, people chasing easy mantras instead of computer science, programmers
ignorant of programming history, cargo cult programming etc.

------
jnazario
a good discussion on "goto considered harmful" here:
[http://stackoverflow.com/questions/46586/goto-still-consider...](http://stackoverflow.com/questions/46586/goto-still-considered-harmful)

in short, it's not the golden rule that some people seem to think it is.

------
probablybroken
I'm curious about the statement: "For example, more often than not we see blog
posts by people claiming to have beaten the CAP Theorem. A mathematical
theorem…"

I've never seen any actual mathematical proof of CAP - only a great deal of
(mostly well-founded) engineering conjecture.

Has anyone actually produced any mathematical proof of CAP?

~~~
DanielRibeiro
_Has anyone actually produced any mathematical proof of CAP?_

Great question!

This presentation, linked by the article, explains the misunderstanding:
[https://www.youtube.com/watch?v=Wp08EmQtP44#t=1273](https://www.youtube.com/watch?v=Wp08EmQtP44#t=1273)

The paper you are looking for is this:
[http://dl.acm.org/citation.cfm?id=564601](http://dl.acm.org/citation.cfm?id=564601)

Also available here:
[http://webpages.cs.luc.edu/~pld/353/gilbert_lynch_brewer_pro...](http://webpages.cs.luc.edu/~pld/353/gilbert_lynch_brewer_proof.pdf)

------
mncolinlee
I have cited Dijkstra's paper several times of late. Usually, it's because of
asynchronous callbacks and the new paradigm of nesting callbacks many, many
layers deep until state, thread-safety, and side effects become almost
unmaintainable.

Many believe the problems are solved when nesting is concealed using named
functions. It alleviates deep indenting, but it may make the code jump around
while still dealing with shared state, propagating errors, and threading
issues.

I have argued that callbacks are becoming the new GOTO statement-- useful in
some contexts, but frequently overused in ways that were never intended. The
result is "spaghetti code" that is difficult to follow and maintain. This
paradigm has led to the popularization of the term "callback hell."

------
swatow
I think there are extremely few cases when goto is justified in C++. Unlike C,
C++ provides plenty of mechanisms to cleanup when exiting a function early.
And as for breaking out of multiple loops, a "done" variable is not that bad.

------
Marazan
Also, the premature optimisation quote is not taken out of context, unless he
thinks "premature" means something different from what it means to everyone
else.

~~~
SomeCallMeTim
It's the extreme that people take that quote to that causes the problem. He
explicitly mentions library development as a place that you always want to
optimize, because libraries are often used in inner loops.

When developers totally ignore optimal code practices, bad things can happen.
Things like 25000 (!) string allocations for _each character typed_ in the
Chrome "omnibox". [1] That's not likely one small bit of code with a few
std::string's being allocated, but probably dozens or hundreds of (albeit
minor) fixes spread throughout the code.

It is absolutely worthwhile to code optimally as a habit, when the optimal
coding practice isn't significantly harder to implement or understand than the
suboptimal practice. (In the above case, a "pass all strings as const
references" convention would likely have prevented most of the allocations.)

Similarly, the way you architect a product can strongly influence how easily
it can be optimized. I've cut processing time down from tens of minutes to
under a second on some tasks, just by improving how the code was written --
and that approach means that when I write servers, I can handle all the likely
traffic I'll ever see on a handful of servers instead of having to create
additional layers of complexity to handle dozens of servers working in
parallel.

So I disagree that "premature" optimization is a bad thing, by the typical
definition of "premature."

[1]
[https://groups.google.com/a/chromium.org/forum/#!msg/chromiu...](https://groups.google.com/a/chromium.org/forum/#!msg/chromium-dev/EUqoIz2iFU4/kPZ5ZK0K3gEJ)

~~~
kbenson
I think the point is that you can optimize while writing code, but be leery of
optimizing while _designing_. To a greater or lesser degree depending on the
person, task or goal, designing and coding may end up being close to the same
thing, which confuses the issue.

In real terms, writing performant code is good, but packing data structures in
interesting ways and optimizing in other manners when you aren't even sure you
know all the data you need to store yet is often counter-productive.

~~~
SomeCallMeTim
I think optimizing while designing is totally appropriate. In fact you can
make some of the most powerful optimizations at the design stage, with almost
no additional effort required.

Should you go down a rabbit hole, spending hours optimizing something because
you think it might help? No, of course not. To my reading, _that_ is what
the warning is about.

But when you're designing a system, especially one that you _know_ needs to be
fast, should your design be optimized from the ground up? Yes, in some cases
at least. [1]

When I code, I choose to use languages (C++, Go, LuaJIT, JavaScript/Node) that
allow me to write code that ends up faster. Some people will cite "premature
optimization" based solely on that decision, and they'll write their code in
Python, or PHP, or Perl, or Ruby. [2] Since my servers will handle 3000 user
interactions per second with no caching, I still contend that this is an
appropriate stage for optimization.

I've seen projects _fail_ because the server requirements were too high: 100
concurrent users per Python-based server on a free-to-play game meant that if
the app took off, it would be losing money because of the costs of running
hundreds of servers. You can't just "optimize" the app if the problem is you
need to rewrite it in a completely different language. Computer time in the
cloud may seem cheaper than programmer time, but in practice if you pay the
programmers up front to do it _right_, you can cut ongoing operating costs
_indefinitely_. And sometimes that can mean the difference between a project
that succeeds and one that fails.

[1] [http://gamesfromwithin.com/data-oriented-design](http://gamesfromwithin.com/data-oriented-design)

[2] It looks like at least Ruby has a server that uses a similar strategy to
Node or Nginx+LuaJIT:
[http://www.akitaonrails.com/2014/10/19/the-new-kid-on-the-bl...](http://www.akitaonrails.com/2014/10/19/the-new-kid-on-the-block-for-ruby-servers-raptor)
-- but Ruby and Rails are ugly for a million other reasons, in addition to
being slow, so I still would shun it.

~~~
kbenson
I would contend that in the successful cases where you are optimizing from the
beginning, you are actually doing a two stage process, one design stage where
you figure out what you need, and another where you figure out how to do that
efficiently and fast. To some degree, it's semantics (when is optimization
_not_ design related?), but my point was that optimization before nailing down
the requirements for that component is what leads to problems. I think that's
the _premature_ in _premature optimization_, when you spend time optimizing
something you aren't sure will even survive a later point in the design stage.

~~~
SomeCallMeTim
> I think that's the premature in premature optimization, when you spend time
> optimizing something you aren't sure will even survive a later point in the
> design stage.

That's true enough, though it doesn't seem to be the original point, since the
context refers to optimizing parts of the code that aren't taking most of the
time.

Which _is_ a danger, but I think the danger is in wasting time on such
optimizations, either up front or through increased complexity.

------
ww520
Rules are meant to be broken. Once you sufficiently understand the
reasoning behind the rules, feel free to break them when appropriate. People
being dogmatic about some rules just see all problems as the same nail that
can be hammered by a single hammer. It's intellectually lazy to not analyze
the problem within the present context and to see how the rules are applied or
not applied.

------
KedarMhaswade
Oh, this is a timely discussion. I just wrote an article on this topic:
[http://bit.ly/recurse-iterate](http://bit.ly/recurse-iterate). What I thought
was good about the goto statement is that it helps us understand how recursion
works and how one can eliminate it in favor of iteration.

------
rumcajz
To decide whether to optimise or not, or whether to use goto or avoid it, you
need some amount of experience. Thus, people with no such experience resort to
rules of thumb (don't optimise, don't use gotos). From there it's a small step
to trying to impose the rules dogmatically on others.

------
oxalo
Hopefully the article doesn't fall prey to exactly what it discusses.

------
samsaga2
I feel the same about syntactic sugar. Every feature added to a programming
language is a new source of potential errors. I'd rather write two lines of
nice code than one line of magic tricks.

------
sparaker
I don't think using goto decides whether a person is a good programmer or not.
Goto statements make a lot more sense at times, and they are often faster than
the alternatives.

------
nawitus
I wouldn't implement goto in a new programming language. Not because it's
never useful, but because it's only very rarely needed, and its very
existence means that it's easy to misuse. Programming languages should be
minimalistic and rarely needed features should be cut away. I've read similar
reasoning from Eric Lippert when he was still designing C# features (which is
not the best example, as C# is rather bloated).

Of course, the "should we use goto" question is different from "should we
implement goto in new programming languages".

~~~
kstenerud
`its very existence means that it's easy to misuse it`

However, few people actually do:
[https://peerj.com/preprints/826v1](https://peerj.com/preprints/826v1)

~~~
nawitus
In C at least. Anyway, that's probably due to the significant amount of
discussion about goto. Perhaps "ironically" goto is a good feature only
because it's criticized enough so that it's typically used correctly?

~~~
eridal
> Perhaps "ironically" _goto is a good feature_ only because it's criticized
> enough so that it's typically used correctly?

there are no good/bad features, there are only good/bad programmers.

------
Marazan
The author states that Big O notation is to measure algorithmic complexity;
that's simply not right. It is to measure runtime - a completely different
metric.

~~~
svisser
The article states that Big O notation is used to analyse algorithm complexity
and that's correct - it's to analyse the time complexity of an algorithm.

~~~
srdev
It can also be used to analyze space complexity. So generically saying
"complexity" is also correct.

