Spartan Programming (codinghorror.com)
50 points by bdfh42 on July 8, 2008 | 72 comments


Interesting concept. Certainly struck a nerve with me.

I've found one of the easiest ways to minimalize my code is to require less code.

How? Better program design. Tighter structure, more efficient algorithms to accomplish the same thing. Reverse the order of two functions and suddenly that conditional is no longer necessary. Adding a parameter to a function may make it larger, but useful in more places. Only handle 90% of cases and send the outliers off to custom-land. You get the idea.

It then becomes a whole lot easier to minimalize 50 LOC than 100 LOC.

[EDIT: The timing of this post couldn't have been better. I struggled with some pretty complex code all day yesterday. I woke up at 5 a.m. and thought, "Just switch the order the functions are called." I did and it simplified beautifully, working perfectly, and eliminating about 50 LOC.]


This can also sometimes be achieved with math. For example, I was writing code that handled velocity and animation based on the arrow keys. The character had to change animations depending on 8 directions - nesw & diagonals. At first I had an ugly if-elseif statement handling all the keystates. But then I realized I could come up with a cool scheme to simplify and beautify the code. I also needed to solve the problem of having a constant max-velocity that stayed the same even when going in a diagonal.

	//which direction to go
	dirX = 0;
	dirY = 0;
	if( keystate[SDLK_LEFT] ) dirX--;
	if( keystate[SDLK_RIGHT]) dirX++;
	if( keystate[SDLK_UP] )   dirY--;
	if( keystate[SDLK_DOWN])  dirY++;
	
	if( dirX == 0 && dirY == 0){
		//stand still
		heroVelX = 0;
		heroVelY = 0;
		
		heroAnimation = heroStill[oldHeroDirIndex];
	} else {
		//run in a direction
		angle = atan2(dirY,dirX);
		heroVelX = heroSpeed * cos(angle);
		heroVelY = heroSpeed * sin(angle);
		
		oldHeroDirIndex = (dirX+1)*3 + (dirY+1);
		heroAnimation = heroRunning[oldHeroDirIndex];
	}
	
	//step to next frame in animation
	if( ++heroAnimation->curFrame >= heroAnimation->numFrames ) heroAnimation->curFrame = 0;
	
	heroX += heroVelX;
	heroY += heroVelY;

- removes conflict from pressing left and right or up and down at the same time
- uses a formula to calculate which animation to use, instead of using a switch statement
- uses a little bit of trig to keep max-speed constant

Programming like this is very rewarding, if only for the feeling you get when it works. I guess this isn't really a big deal to any of you, but this sucker used to be 70 repetitive and buggy lines, rather than the pretty 29 it is now.
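
The scheme above can be sketched outside SDL. This is a hedged JavaScript illustration, not the poster's code: `heroMotion`, the `keys` object and its `left`/`right`/`up`/`down` flags are hypothetical stand-ins for the keystate array.

```javascript
// Hypothetical stand-in for the SDL snippet: `keys` is an object of booleans.
function heroMotion(keys, speed) {
  // Opposite keys cancel out, so left+right (or up+down) yields 0.
  const dirX = (keys.right ? 1 : 0) - (keys.left ? 1 : 0);
  const dirY = (keys.down ? 1 : 0) - (keys.up ? 1 : 0);

  if (dirX === 0 && dirY === 0) {
    return { velX: 0, velY: 0, dirIndex: null }; // standing still
  }

  // atan2 gives the heading; cos/sin keep the speed constant on diagonals.
  const angle = Math.atan2(dirY, dirX);
  return {
    velX: speed * Math.cos(angle),
    velY: speed * Math.sin(angle),
    // Maps (dirX, dirY), each in -1..1, to a slot 0..8 of a 3x3 lookup table.
    dirIndex: (dirX + 1) * 3 + (dirY + 1),
  };
}
```

Slot 4 corresponds to (0, 0) and is never used while running, so a nine-entry animation table would carry one spare entry.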


One of my most productive days was throwing away 1000 lines of code.

--Ken Thompson


I like a lot of the concepts, but I'm dubious about terse naming. The long e-mail example simplifies "msg" to "m", for example. Now, while I would agree that veryLongVariableNames can get tedious and annoying, and while I'm also in favor of standardized generic loop indexes (i,j,etc..), I think this may go to the other extreme. I could see going from "message" -> "msg", perhaps, but I'm not sure "msg" -> "m" is a win. Why is saving two characters that important?


The Pike essay linked elsewhere in the comments puts it this way:

[snip] Finally, I prefer minimum-length but maximum-information names, and then let the context fill in the rest. Globals, for instance, typically have little context when they are used, so their names need to be relatively evocative. Thus I say maxphysaddr (not MaximumPhysicalAddress) for a global variable, but np not NodePointer for a pointer locally defined and used. This is largely a matter of taste, but taste is relevant to clarity. [/snip]

One could argue that the variable m, within the context of that method, is easily understood as a Message.

All that said, I'd prefer msg--two extra characters per line aren't really going to clutter the code, and they likely make it easier for other programmers to understand what's going on.


> I could see going from "message" -> "msg"

There are dozens of possible abbreviations for "message" but only one correct spelling of "message". Abbreviating a word forces the reader to remember both what it is, and what its abbreviation is. One of the seldom-mentioned gifts that Java has given the world is killing the horrendous C practice of insane abbreviations. I can't tell you how many times I've had to look up itoa(). It just wouldn't stay in my brain. But I guess it keeps my teletype's ink from running out too fast.


I'd upvote you twice if I could.

You don't even have to look at the screen if you follow a consistent rule of not abbreviating.

If you do abbreviate, does your buddy use the same scheme? Consistently? Why optimize for typing speed when memory and "variable lookups" are the bottleneck?


While it's true that there are many abbreviations of "message," surely some are inherently easier to cognitively process than others. Compare:

"eag", "sg", "msa" -- randomly chosen letters (in orig order) vs.

"age", "meg","mag" -- purposefully misleading substrings, vs

"msg", "mes", "mssg" -- closer to semantic content

My point is not so much to quibble, but to suggest that we may have something to learn from cognitive psychology on the subject. Does anyone recall the recent study that showed taht as lnog as the fsrit and lsat lteters of wrods wree the smae, poplee culod gnerlaly mkae out the manenig? (whew! hard to type that.) This suggests to me that at the very least, retaining the first letter is important, and possibly the last as well.

My qualm about single letter variable names is that they may require more "lookup time" (on average) because so much of the semantic content has been stripped from the name. I'm curious if anyone has actually researched this? I'm only theorizing here.


I think once you load the context of any given program into your head, the variable names become irrelevant. Programming on my TI-83 in elementary school, all I got was A-Z, but when I spent every class making a checkers game in TI-basic, I never forgot that C was the X-dimension of the current selection, Q was the number of black pieces left, and matrix [B] was the last saved game


I go for abbreviations that are pronounceable - but I do try to abbreviate. My soft rule is six characters or more.

I understand your concern, but I consider horizontal space a resource. There's a point at which long names just add to the clutter and get in the way of the concepts.


I found that terse naming has several advantages:

* Most obvious one: it's easier to type. Especially in Erlang - the language I write most code in these days, where variables must start with a capital letter. The fewer times I have to reach for the shift key to make a capital (if I used CamelCase) or to make an underscore, the better.

* Short names help you fit code in 80 columns.

* Terse variable names force you to pay close attention and make no assumptions about what's inside that variable. Also, I found that a short name + an inline comment to explain what's there works better than an overly descriptive name.

I do the same for function names too, with the exception of public functions -- those get proper names. So I'd call the public function `integer_to_string` but if I were to use it only privately, it'd be `itostr`.


itostr takes time to mentally process (or pronounce), if I had to read your code. integer_to_string takes no effort and can't be confused. To make the point:

itostr - itoaster?

itostr - int to stream?

itostr - internationalize OString?

itostr - The string representation for indium tin oxide?


> itostr takes time to mentally process (or pronounce), if I had to read your code. integer_to_string takes no effort and can't be confused.

The idea is that the first time you saw `itostr` you'd do something like this:

* Jump to the definition of `itostr` (a couple of keypresses in Emacs).

* Read the Edoc comment for the function ("Convert integer to string").

* Jump back to where you were (one key combo).

Sure, if the function was named `integer_to_string` you wouldn't have to do the above. But this was a trivial example. Many functions would take more than three words to describe what they are doing precisely, especially when your functions are granular and do exactly one thing. This can quickly lead to long lines, which I find hard to read and ugly.

At the end of the day, it's all down to personal preferences and style. I like things this way, and you like them the other. It's just the way it is, and there are arguments for both approaches. For me, the benefits of terse naming outweigh its disadvantages, and the advantages of long descriptive names.


I guess, to me, if those lookup keystrokes resulted from saving a few keystrokes in a function name, and developers might look it up many times, it doesn't seem like you come out ahead. But, as you say, to each their own.


Personally I would rather have descriptive variable names such as "this_goes_to_db" or something to that effect. But then the question is: how long is too long? Practising that, I find my code is readable but at the same time unnecessarily verbose.


No, it's Iguana to Stretchpants.


As an emacs user who is only slightly familiar with vim, I've wondered whether this may reflect a weakness of vim. With emacs + SLIME or even just dabbrev (included in emacs by default), you don't type the whole word anyway - you type an abbreviation and expand it. Does vim not have an equivalent capability? If not, that would explain why pg thinks abbreviated names are a win.


Word completion is available and it would be possible to implement something like this in vim script. But, as far as I know, vim users don't typically use this; it doesn't generally fit in with the philosophy of the editor.


So you let the philosophy of your editor get in the way of writing good code? Sounds like a plan...


lol. No, you let it get in the way of extending your editor. You can write good code without abbreviation expansions. Vi hardliners would see it as unnecessary feature creep as it is rather complicated and doesn't fit in as well as other commands.



yes, there's abbreviation expansion. i do well enough with keyword completion though


I agree. With IDE auto-completion, measuring characters becomes less meaningful. Using tokens would be better.


IDEs can't read your code for you. So VeryLongIdentifiers should still be replaced with something that's easier to scan.


As far as I'm concerned, VeryLongIdentifiers are easier to scan (for small values of "very"). VeryLongIdentifier is actually a reasonable length, though any more than this does get unwieldy.


Hmm... It's amusing to me that spartan programming has a name like this. To me, it's simply about efficiently using resources. I am reminded of the time I asked Miguel de Icaza about design patterns. He said he never saw the point of having them in a book because people would just invent them when they needed them.


I agree, this could have been described as good programming. :)

There's a quote by a well-known author whose name I cannot recall at the moment, concluding a lengthy letter: "Forgive me for writing such a long letter, I did not have time to make it brief."

One of the reasons that I've almost invariably been more productive than most other coders I've worked with is that I write a lot less code than they do. Another side effect of that is that my code has never been a bottleneck, so I've never had to do any further optimization.

It also takes significantly less time to debug :)


"I have made this letter longer than usual, because I lack the time to make it short" - Blaise Pascal (French mathematician)


A point for having design patterns in a book is that everyone can use the same names for the same patterns. That helps somewhat in communication.


I'm not convinced patterns are that complicated.


They are not complicated. They are common - that's the whole thing behind them. They are so common that it's useful to use the same names to talk about them because you will see them over and over in the code of different people. And certainly defining them exactly and having good implementation examples might also occasionally help a programmer as we're not all perfect ;-)


Some kill-OOP rules in addition:

* Rewrite your code so that you have more static functions that only deal with their arguments and possibly return something (no access to global data or class data). In other words, localize and isolate parts of your code that have minimal side effects. Such parts would be easier to write, debug and maintain.

* A class that contains only ctor/dtor and one method should be rewritten to a single static function.

* A class that returns a reference to a single instance (singleton) is a sign of terrible design. Rewrite.

* If the number of lines in your unit tests is comparable with that of the original code, it means you expose too much in your interfaces. Rewrite your classes by making their interfaces smaller and simpler. This will reduce your unit tests and make them more meaningful as well.
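
To illustrate the second rule, here is a hedged JavaScript sketch. `TaxCalculator` and `taxFor` are made-up names for illustration, not from any library.

```javascript
// Before: a class whose only job is to carry one argument to one method.
class TaxCalculator {
  constructor(rate) {
    this.rate = rate;
  }
  taxFor(amount) {
    return amount * this.rate;
  }
}

// After: a plain function that depends only on its arguments.
function taxFor(amount, rate) {
  return amount * rate;
}
```

Both forms compute the same value; the function version has no state to set up, which is the property the first rule asks for.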


Yeah, there's a few things here that set me on-edge.

For one, early returns mean you have to go poking around for the returns. That highly-indented style is a much clearer "bad code smell" that makes it easier to clean up the routine.

And their "reduce curly braces" is just SCREAMINGLY awful. If you have a one-line block, you STILL want curly braces around it. The eye just loves to skim over this mistake:

    if (condition)
        i++;
        applyTaxes(order);
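
A runnable illustration of why the snippet above misleads, sketched in JavaScript with hypothetical names (`processOrder`, `taxesApplied`): the indentation suggests both statements are guarded, but only the first one is.

```javascript
// Hypothetical example: indentation implies both lines belong to the `if`.
function processOrder(condition) {
  let i = 0;
  let taxesApplied = false;

  if (condition)
    i++;
    taxesApplied = true; // not part of the `if`; runs unconditionally

  return { i, taxesApplied };
}
```

Calling `processOrder(false)` still sets `taxesApplied`, which is the mistake the eye skims over.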


I am a reformed indentaholic. Now I adore early returns and use them often. If you follow the first cardinal rule of OOP, keep methods short, then early returns are easy to spot.

This kind of short-sighted nesting annoys me:

  function foo(bar) {
   if (bar == 1) {
      doThis();
   } else {
      var x = doSomething();
      if (x == 2) {
         doSomethingElse();
      } else {
         var y = doThat();
         // ... more code here ...
         for (var m=0; m<y; m++) {
            doBoo(m);
         }
      }
   } 
  }

Yuck! I much prefer:

  function foo(bar) {
     if (bar == 1) {
        doThis();
        return;
     }

     var x = doSomething();
     if (x == 2) {
       doSomethingElse();
       return;
     }

     var y = doThat();
     // ... more code here ...
     for (var m=0; m<y; m++) {
        doBoo(m);
     }
     return;
  }

The difference here is not drastic in this short example but in functions with 50+ lines of code it's very clear.

The average person's short term memory is only 5 - 7 items deep. 3 of those slots are already occupied with what they are doing, so you have 2 slots left, maybe 4 slots if they've had their coffee and are really paying attention. Every time you make a code indent, whether it be a loop or nested condition, then you're asking the reader to remember that context switch. Once you have indenting more than two tabs deep then you're in hazy territory where the reader is unlikely to remember all of the conditions that lead to those lines of code being executed.


One other heuristic I follow that's not mentioned in the essay is: If one branch of an "if" statement is much shorter than the other, change the condition so that you can put the short branch first.

My reasoning also involves memory: While the reader is reading the first branch, he needs to hold the possibility of an "else" branch in his head, along with the condition. By minimizing the length of the first branch, you minimize the amount of time he needs to remember this.
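
A small sketch of that heuristic in JavaScript. The function `describeUser` and the shape of `user` are hypothetical, chosen only to give one short and one long branch.

```javascript
// Hypothetical example: the one-line branch comes first, so the reader can
// discharge the condition before starting on the longer branch.
function describeUser(user) {
  if (!user) {
    return "anonymous";
  } else {
    const name = user.name.trim();
    const role = user.admin ? "admin" : "member";
    const tag = `${name} (${role})`;
    return tag.toLowerCase();
  }
}
```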


I disagree. I think it depends on context. Generally, one branch is the more common case, and I prefer to put that one first. It's the main flow of the code.


Yes but, say you've got a longish chunk of code with several error conditions; you come across an 'else' after a lot of closing braces. What the heck does it match with? What other error conditions need handling and got forgotten in the forward rush of common cases? Error handling is easy to forget; the common case remembers /itself/ you might say, and can be put off for a moment while you get the clutter out of the way. My MO, anyway.


I stress the common case for the same reason that when I construct sentences, I try to put the main point up front, and save the side comments for later. Example:

"I ran into Bob while walking to the store."

"While walking to the store, I ran into Bob."

The point I'm trying to get across, in this case, is that I ran into Bob. I try to do the same with code.


Does this really throw people off? I see code like this from time to time, and my brain mentally ends that first "if" statement right at the ; after i++. It even did when I was reading the above example.


Yes, it throws people off, especially when it's buried in the middle of other code.


I agree with you. I dislike returns scattered throughout a single method. Makes tracking them a bit difficult.


does your editor not highlight returns properly?


Most of my editors do but I end up doing a lot of work SSHed from other computers without windows so I end up doing a good amount of work in raw text files.

Also, it's easier for me to process a single return line than to take a look at the logic behind each of the other returns. If I end up returning an undef at the beginning of a function and then edit it a few months later I may just skip that step and not see where that case is being handled.

It just seems a better way to structure my code, for me at least.


I do 99% of my work in a terminal and usually over ssh. that's why I use Emacs.

Since you use a single return statement I'm assuming you keep track of the value to return by assigning it to a variable. How is keeping track of multiple assignments to the variable you will return easier than keeping track of multiple return statements? I've always found it easier to keep track of return statements since they're highlighted.


Yea I would prefer to use Emacs but I don't get access to that all the time. Thus my returns don't always get highlighted so it's as if I am using a basic text editor.

In addition, I tend to read code from the bottom up (literally), so I end up tracing that return value upwards through the code. Also, having some simple branching lets me quickly identify what each block of code is doing.

Clearly there is not one way to do things and there are times where it's significantly easier to just return in the beginning but for more advanced cases I like having variables keep track.

Also, when there are many steps involved in the computation, it's easier to keep track of the individual pieces and then combine them at the very end than have that logic dispersed throughout the function.

I have also recently started favoring returning hashes so in case I ever need to add some additional results, backwards compatibility is preserved and I only need to work on one return statement.
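
The hash-return idea might look like this in JavaScript, where a plain object plays the role of the hash. `summarize`, `total` and `count` are hypothetical names; the assumption is that `count` was added after callers already relied on `total`.

```javascript
// Assumed history: version 1 returned { total }; version 2 adds `count`.
// Because the result is a keyed object, the new field does not disturb
// callers that only read `total`.
function summarize(values) {
  let total = 0;
  for (const v of values) {
    total += v;
  }
  return { total, count: values.length };
}

// An older caller that only knows about `total` keeps working unchanged.
function oldCaller(values) {
  return summarize(values).total;
}
```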


One thing I think is missing from this exposition is a discussion of tradeoffs. You can't blindly apply these rules, because they're to some degree contradictory. Example: you can make all your variables private to better "minimize accessibility of variables", but that requires elaborate accessor methods which increase "vertical complexity".

Which is the right path? It depends, and the solution is a sense of aesthetic. That, not the rules themselves, seems to me what this technique is all about. It's really just a different, more specific way of rephrasing PG's essay about taste in software engineering.


I programmed in many styles until I read Martin Fowler's "Refactoring" and it taught me the best style is simple READABILITY! The only thing that matters is "Is it readable?" For example, don't be afraid of replacing 5 lines of code with a descriptively named function whose only content is those 5 lines of code. It may seem pointless to have a function that is only called once but if it makes the code more readable then do it.
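
A minimal sketch of such an extraction, in JavaScript. Every name here (`isEligibleForDiscount`, `priceFor`, the `customer` fields, the 50% discount) is invented for illustration.

```javascript
// Hypothetical single-use helper: its name documents what the lines do.
function isEligibleForDiscount(customer) {
  const longTimeMember = customer.yearsActive >= 2;
  const goodStanding = customer.unpaidInvoices === 0;
  return longTimeMember && goodStanding;
}

// The caller now reads as one sentence instead of inline conditions.
function priceFor(customer, basePrice) {
  return isEligibleForDiscount(customer) ? basePrice * 0.5 : basePrice;
}
```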


Extracting a single-use function can get in the way of future refactorings that cross the function's boundary. I inline single-use functions whenever I see them, and almost always after that, other, more useful possible refactorings become visible. This can happen with multiple-use functions too, but much less often.


"Example: you can make all your variables private to better "minimize accessibility of variables", but that requires elaborate accessor methods which increase "vertical complexity"."

No. Don't create accessors and don't access the data in private variables outside of the scope in which they are defined.

Accessors, in most cases I have seen, simply outright defeat the entire purpose of having a private keyword to begin with.

EDIT: replaced CAPS with italic emphasis, in order to reduce decibel level.


Chill. That was exactly my point. Clearly you have a strong aesthetic sense. :)

You might want to lose a few of those CAPITALS though. Someone might think you were screaming about software.


Sounds quite a bit like what Rob Pike has to say about programming style:

http://www.lysator.liu.se/c/pikestyle.html


Nice reference. Thanks.

> I don't expect you to agree with all of them...

I don't.

Even index variable names should be meaningful. Just drop down a couple of nested levels, use "i" again, forget to make it local, and see how much fun you'll have.

But I love the disclaimer at the beginning. Made me feel a whole lot better, no matter what he said. I'll have to remember that one.


Not that I like it, but a lot of algorithms in papers use shorthand subscripting so they can write equations that aren't 300 columns long. Say you're doing neural network stuff like w_1,w_2,w_3,...,w_n for weights, inputs, outputs, etc.

If you're reproducing those algorithms in a program, you could do worse than copying their naming structure so that the correspondence between the program and the paper is as obvious as possible.
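
As a hedged illustration in JavaScript: the weighted sum y = w_1*x_1 + ... + w_n*x_n below is an assumed example formula, since the comment only mentions weights and inputs. `neuronOutput` is a hypothetical name.

```javascript
// Assumed formula: y = sum over i of w_i * x_i.
// The short names w, x, y and i are kept so the code lines up with the notation.
function neuronOutput(w, x) {
  let y = 0;
  for (let i = 0; i < w.length; i++) {
    y += w[i] * x[i];
  }
  return y;
}
```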


Code Complete discusses this. When only at a single loop level, follow convention and use i.

However, as soon as you drop to 2 iteration levels, you should rename your index variables so that they are meaningful. Do not use j or k.


Right. Problem is that it probably won't be you adding the next level. Here's what really happens:

- You code a proper loop iterating with "i".

- Programmer 2 adds 50 LOC.

- Programmer 3 adds 100 LOC.

- Programmer 4 drops a level, but since he doesn't see the whole routine (too many LOC for one page or one screen), he uses a global "i" again. But nothing breaks because "i" is always 0 in both levels, both in testing and in production.

- Data changes and it breaks. You get the call, "What's wrong with your program?!?"

That's why you should have never used "i" in the first place. Call it "SkuNdx", "RecCtr", "ClassNbr", "Ctr1", anything but "i".
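
The failure mode in that story can be reproduced in a few lines. This JavaScript sketch uses a function-scoped `var i` shared by both loops as a stand-in for the global "i"; the function names are hypothetical.

```javascript
// Both loops share one `i`. With 1 row and 1 column the result happens to be
// right, which is how the bug survives testing. For some other sizes the
// outer loop can even fail to terminate, so call this only with small,
// known inputs.
function countCellsShared(rows, cols) {
  var visited = 0;
  for (var i = 0; i < rows; i++) {
    for (i = 0; i < cols; i++) { // reuses the outer loop's `i`
      visited++;
    }
  }
  return visited;
}

// Each loop gets its own block-scoped, descriptively named index.
function countCellsSeparate(rows, cols) {
  let visited = 0;
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      visited++;
    }
  }
  return visited;
}
```

For 3 rows and 2 columns the shared version visits 2 cells and the separate version visits 6.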


Wow, different strokes. I wouldn't want anything to do with code that used variables like "SkuNdx" over a simple i, j, or k.

That said, ideally I prefer using languages with each or foreach or a similar construct. If you want to iterate over a set of objects, you should only have a variable for an object. Who cares about the index?

    foreach( sku in skus )

Or

    skus.each { |sku| .... }

If you reserve i, j, k for cases where you actually need an index, I think you'll find that i has even more meaning: i is an index and its presence means the index is actually important. That's a lot of meaning for a single letter.
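
The same distinction sketched in JavaScript, with hypothetical functions `labels` and `numbered`: iterate over the objects when the position is irrelevant, and bring in `i` only when the position is part of the result.

```javascript
// No index needed: the loop variable is the object itself.
function labels(skus) {
  const out = [];
  for (const sku of skus) {
    out.push(`SKU ${sku}`);
  }
  return out;
}

// Here the position matters, so `i` appears and signals exactly that.
function numbered(skus) {
  const out = [];
  for (let i = 0; i < skus.length; i++) {
    out.push(`${i + 1}. ${skus[i]}`);
  }
  return out;
}
```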

With respect to what other programmers might do, I think "defensive" programming is a waste of energy. A bad programmer will screw up your code regardless of what you name a variable. The best you can do is avoid working with such people. If that doesn't work, write unit tests. Seriously, at least then you know that a bad programmer broke your code before you get a call.


> With respect to what other programmers might do, I think "defensive" programming is a waste of energy. A bad programmer will screw up your code regardless of what you name a variable. The best you can do is avoid working with such people.

Exactly. I had a really hard time working with normal programmers... I would explain why I did something that wasn't brute force, they wouldn't get it and their way also worked... so the codebase was one big festering pile of shit.

Now I work somewhere where my coworkers are arguably some of the best programmers in the world... and I don't have this problem.

Life is too short for bad jobs. As a programmer with a clue, you are a very hot commodity. So go somewhere where you won't want to kill yourself at the end of the day. (Or start your own company, of course. Then you really get to pick who you work with.)


> The best you can do is avoid working with such people.

If only. I actually laughed out loud when I read that. Exactly what we just talked about over lunch. (How can someone be so smart, personable, and pleasant, and still write code like that.)

OTOH, another great argument for starting your own.


Oftentimes a meaningful name suggests itself, as the other replier to this comment pointed out: "for each post in posts..."

One has to wonder if that function in question really has to be >= 150 LOC and absolutely can't be broken up. One has to wonder whether, after programmers 1, 2 and 3 have worked on it, each adding 20-100 LOC, the function doesn't really have 3 functions within it. In mainstream languages functions/subroutines/methods are like paragraphs: they should express one thing or action.


Index variables with names like i, j, k are meaningful, through convention. Also, index variables are usually used to index things, and long names clutter up those expressions.


This is madness.


No.. This is Sparta!


Geez. Please tell me these pun threads are not spreading here from reddit.


Quoting is hard, let's go fight in the shade!


Madness? This is programming.


Programming? This is Web surfing!


I guess Adobe would agree with you, I mean, today's Adobe.


This reminds me of my 9th grade "Computer Science" class. Mr. Brown taught us some of the same principles with Applesoft Basic. I'm always surprised when I work with other programmers who don't think it's important to write spartan code. Which isn't to say I haven't been guilty of the same thing a time or two - but this article reminded me of my first exposure to programming in the 9th grade. Here's to you, Mr. B.


I now have a name for my coding style! The email-send-function-case-study he links to pretty much follows the way I write code (when forced to write Java here at work).


Hot damn I've been practising Spartan Programming for years before I read this article. People thought I was crazy then :-)


Isn't this a fanciful expression to label DRY programming?


Uh... More like DRY is equally a new, fancy acronym given to a whole bunch of techniques that have been in use for years. This stuff isn't new at all, it's just experiencing something of a renaissance after the dark ages of the late 90's Big Design Madness.

The "Spartan Programming" stuff Atwood talks about is focused mostly on C, Java and other low level imperative languages. "DRY" tends to be most used in the context of web development. Other than that, I agree: they're mostly the same idea.



