Follow Up to Crappy Programmer Thread

nostrademons · on Nov 24, 2007

I'm curious: how many of those single exit/entry bugs were caused because a function mucked with global state?

Again, I haven't seen this problem in the wild, and I have trouble seeing how it could exist. My functions tend to be actual functions - they don't touch state outside of the function (and lexically enclosing scopes, for closures) - and so if a function exits early, it just gives the wrong answer. This is caught immediately by the unit tests; there's no place for bugs to hide and cause problems later.
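To make the distinction concrete, here is a minimal Java sketch. The class, the methods, and the `busy` flag are hypothetical names invented for illustration; nothing here comes from code discussed in the thread. The first method is pure, so its early return can at worst produce a wrong value that a unit test compares directly. The second sets shared state, and its early return skips the cleanup, which no return value reveals.

```java
import java.util.List;

// Hypothetical example: names and the 'busy' flag are illustrative assumptions.
class EarlyExitDemo {
    // Pure: reads only its argument and touches nothing outside itself.
    static int firstNegativeIndex(List<Integer> values) {
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) < 0) {
                return i; // early exit; the worst case is a wrong return value
            }
        }
        return -1;
    }

    // Shared mutable state that is meant to be cleared on every exit path.
    static boolean busy = false;

    // Stateful: same search, but it also flips the shared flag.
    static int firstNegativeIndexStateful(List<Integer> values) {
        busy = true;
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) < 0) {
                return i; // early exit skips the reset below, so busy stays true
            }
        }
        busy = false;
        return -1;
    }
}
```

Both methods return the same index for the same list, so a test that checks only the return value passes for each; the stale flag shows up only if the test also inspects `busy`.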

The one exception is UI-intensive code, which often requires state because users expect their UI to change in response to distant changes. For this, any "hidden" state is clearly documented and included in the tests, just as if it were the return value of the function. All other destructive updates touch the UI, so they're fairly quickly visible when screwed up (and I have individual tests for subwidgets, that test the GUI components in isolation).

I see a lot of the tenets of structured programming as holdovers from when it was okay to mutate variables. It's state that's the true cause of bugs, not single-entry-single-exit.

edw519 · on Nov 25, 2007

Good point. I failed to point out that a lot of this history was in technologies where state was often global (COBOL, FORTRAN, BASIC). How times have changed.

gruseom · on Nov 25, 2007

"We have a backlog of 495 bugs on the bug list. The original cause of about one third of these was [...]"

How could they know the cause of bugs that hadn't been fixed yet?

nradov · on Nov 26, 2007

The single entry issue has apparently been resolved; I can't think of a single modern language that allows calling into the middle of a routine. But it's interesting that after so many years there is no general consensus on single exit versus multiple exits. The PMD static analysis tool for Java will (by default) generate a warning for multiple exits http://pmd.sourceforge.net/rules/controversial.html . In Steve McConnell's book "Code Complete, Second Edition" http://www.cc2e.com/ he recommends minimizing exits, but using multiple exits when that improves reliability. Since he has reviewed work by thousands of developers using many languages and styles his recommendations carry significant weight. I understand that Martin Fowler also makes a similar recommendation in his book "Refactoring: Improving the Design of Existing Code" http://martinfowler.com/books.html#refactoring , although I haven't gotten around to reading it yet.

Based on my own experience I do it either way depending on what keeps the code simpler and easier to understand. One common problem with using a single-exit design is that it often forces you to use a mutable variable, or at least separate variable declaration from initialization. For most of the code I write now I force myself to use immutable ("final" in Java) variables in 95%+ of cases and have found this delivers huge benefits in reliability and maintainability. Managing state well is just as important as managing control flow. I haven't seen multiple-exit designs cause any later maintenance defects that a single-exit design would have prevented.
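As a rough illustration of the trade-off described above, here is a small Java sketch. The shipping-fee rules, the class name, and the method names are all made up for the example. The single-exit version has to declare its result before it knows the value and then reassign it; the multiple-exit version can make every local `final` and initialize it where it is declared.

```java
// Hypothetical fee rules, invented only to contrast the two styles.
class ShippingFee {
    // Single exit: the result is declared with a placeholder and mutated later.
    static int feeSingleExit(int weightKg, boolean express) {
        int fee = 0; // mutable; cannot be final because it is reassigned below
        if (weightKg > 0) {
            fee = express ? 20 : 5;
            fee = fee + 2 * weightKg;
        }
        return fee;
    }

    // Multiple exits: a guard clause returns early, so every local is final
    // and is initialized at the point of declaration.
    static int feeMultipleExits(final int weightKg, final boolean express) {
        if (weightKg <= 0) {
            return 0;
        }
        final int base = express ? 20 : 5;
        final int fee = base + 2 * weightKg;
        return fee;
    }
}
```

The two methods compute the same result for every input; the difference is only in how much mutable state a maintainer has to track while reading them.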

I'm not sure what problems edw519 has seen over the years, but perhaps many of the maintenance defects he saw were caused more by long routines than by multiple exits. If you keep most routines short enough to fit on a single screen, maintenance programmers can easily follow the flow regardless of the number of exits.

edw519 · on Nov 26, 2007

"...keep most routines short enough to fit on a single screen..."

Absolutely.

What invariably seems to happen is that 30 lines of code turns into 300 in a few years. Seems like most enhancements are inserted "black box style" so as to not "mess around with that which we don't understand and don't have enough time to learn".

Goladus · on Nov 25, 2007

I very much enjoyed the discussion; I'm sorry if it got heated. In any case, I think single-exit is fine for a corporation that needs to use clearly-defined, easy-to-follow policies. It's a reasonable baseline.

However, there are cases where trying to work around multiple exits does lead to more errors. A function that returns 'right', 'left', or 'both' is probably better off having 3 returns rather than assigning the result to a local variable and carrying it to the end of the function. You might declare the local as the wrong type, you might forget an 'else' and accidentally overwrite the return value, etc. "No early exits unless your function has no side-effects" is a lot less clear to most programmers.
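A minimal Java sketch of the 'right'/'left'/'both' case, with hypothetical names and the assumption that at least one of the two flags is true. The first method uses three returns. The second carries the result in a local variable to a single exit and contains the forgotten-'else' mistake described above, so the "both" case gets overwritten.

```java
// Hypothetical example; assumes at least one of 'left' and 'right' is true.
class Sides {
    // Three returns: each case exits as soon as it is decided.
    static String sideMultipleReturns(boolean left, boolean right) {
        if (left && right) {
            return "both";
        }
        if (left) {
            return "left";
        }
        return "right";
    }

    // Single exit with the bug left in on purpose: the second 'if' should be
    // an 'else if', so a "both" result is overwritten with "left".
    static String sideSingleExitBuggy(boolean left, boolean right) {
        String result = "right";
        if (left && right) {
            result = "both";
        }
        if (left) { // missing 'else'
            result = "left";
        }
        return result;
    }
}
```

With both flags set, `sideMultipleReturns(true, true)` gives "both" while `sideSingleExitBuggy(true, true)` gives "left". The single-exit form is of course correct once the 'else' is restored; the sketch only shows how easily that slip happens.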

Of those 495 bugs on the bug list, how many were caused by broken side-effects?

edw519 · on Nov 25, 2007

No need for anyone to apologize. We should all have pretty thick skin here.

My only regret: I took a 5 minute break and posted a top 10 list for a few chuckles. I ended up spending 2 hours here and didn't finish my day's work until after midnight. Pretty soon I may need a 12 Step program for news.ycombinator.com.

nostrademons · on Nov 25, 2007

There's noprocrast. Lifesaver (well, timesaver) for me...

edw519 · on Nov 25, 2007

No need for that. I can handle it. Sure, I can. Really, I can.

mynameishere · on Nov 25, 2007

"About half the bugs I've seen were caused either by poor variable naming or single entry/single exit violations"

I honestly can't remember seeing a bug caused by either. Sorry. Obviously, experiences vary quite a bit in this respect. From what I've seen, the most common causes of bugs are:

1. Off-by-one errors.
2. Platform incompatibility.
3. Allocation issues (C/C++).
4. Shared state problems.
5. Not releasing/closing resources.

Function returns? Variable names? They just don't figure.

edw519 · on Nov 25, 2007

Single entry/single exit refers to ANY process, not just functions. Don't underestimate the hell you can go through with poor variable naming. Case in point:

2 months ago, a client needed significant changes to a major process in their app. (I had 2 weeks.) This process was a BASIC subroutine called by 16 other processes. Now get this: It was 2800 lines of code with one entry and 16 different exits. It was written in 1991 and had been modified 68 times before I came along. Here are a few of the variables: A, AA, AAA, AAAA, B, BB, BBB, BBBB, C, CC, CCC, CCCC, INFO, FORM1, FORM2, FORM3, HOLD.FORM2, OLD.FORM2, ORIG.FORM2, STUFF, DUMMY1, DUMMY2, DUMMY3... You get the idea. I spent 4 days renaming variables before I could start refactoring. There were some variables I never did figure out. I rewrote the program (550 lines) and hit the deadline. I uncovered over 20 undocumented bugs. I wonder what it will look like 5 years from now.

Everything I write new is either Javascript or PHP. Now that my startup is ramping up, soon I won't have to deal with any of this old stuff any more.

(Aside: I wonder how much production code in the world today is over 10 years old. And how much has been modified over 10 times?)

Tichy · on Nov 25, 2007

Sounds as if multiple exits were the least of the problems of that code.

edw519 · on Nov 25, 2007

Welcome to my life.