Hacker News new | past | comments | ask | show | jobs | submit login
How to reduce the cognitive load of your code (chrismm.com)
304 points by ingve on Mar 29, 2016 | hide | past | favorite | 232 comments

There are so many similarities between writing code and writing English.

- Thinking of paragraphs as functions with one purpose

- keeping sentences short to reduce load on working memory and increase comprehension

- create visual breaks to help the reader by grouping common stuff together as mini-functions

- reduce intimidation factor of reading by removing convoluted stuff

- remove cognitive noise (dead code, unnecessary comments, variables declared out of context or too soon etc.)

- keeping terminology consistent across the code (domain language)

- not using double negatives, e.g. a = !notLocked

- using automated systems to simplify expressions (e.g. weird boolean conditions: http://www.wolframalpha.com/input/?i=a+%26%26+b+||+c )
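In the same spirit as the Wolfram Alpha trick, you can sanity-check a boolean simplification by brute force over the truth table. A minimal Python sketch (the expressions are illustrative, not from the article):

```python
from itertools import product

def equivalent(f, g, nvars):
    """Check that two boolean functions agree on every input combination."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=nvars))

# e.g. verify that (a and b) or (a and not b) simplifies to just a
original   = lambda a, b: (a and b) or (a and not b)
simplified = lambda a, b: a
print(equivalent(original, simplified, 2))  # True
```

Cheap insurance before committing a "simplified" condition that isn't.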


Unlike English, code can more easily be reorganized by a program to make it more readable, based on a set of cognitive principles.

But, beyond the readability and understanding of a single function, we should also learn from other engineering fields. For example, systems thinking would help tremendously in organizing code, if only refactoring weren't so risky in dynamic languages.

I agree with everything you said. As an old Perl guy, I found it hilarious that Perl has a reputation for illegibility, while I find it easier to read than most Java, because in Perl I can express my intent, and in Java it's buried in the noise.

One cognitive problem I've not yet found a good solution to:

So I have a high level routine to, say, sort the items in a display. Said sorting has some cascading effects, so I'll have a high level function that calls some mid level functions that call some lower level functions.

But at some point I get a name clash between a high level function and a low level one. Like...I'm calling the high level sortDisplayedItems, (or sort_displayed_items) and there's a low level function that ACTUALLY sorts them (as opposed to handling the other bits). Sure, I can give it a distinct name (e.g. sort_items), but when skimming the code, it's unclear which is which. If I have to mentally parse the code to understand what is happening, I've failed at readability/maintainability.

How have others handled this?

> I'm calling the high level sortDisplayedItems, (or sort_displayed_items) and there's a low level function that ACTUALLY sorts them (as opposed to handling the other bits).

You're absolutely right: naming things is hard. When I was writing Scheme long ago [in university], our convention was to name helper functions with `-helper` in the name:

Now that I write mainly in Python, the convention seems to be to use underscores to indicate things that you should ignore. Usually that's done at a method level, where your consumers are calling the object's public API (sort), so the object's namespace helps reduce such name collisions. The addition of triple-quote docstrings for method comments helps a lot at clarifying purpose as well:

    class FooSorter(object):

        def sort(self, foos):
            """Sort an iterable of Foo objects"""
            # ... sanity-check inputs ...
            self._sort(foos)  # do the sort operation

        def _sort(self, foos):
            """Sort things we already know are Foo objects.

            This is a helper for sort(), which already
            sanity-checked our inputs.
            """
            # do the actual sorting

Wrt Perl being readable / legible; I think most find Perl hard to read because there are 3-4 different ways to do the same thing. And many of the more "advanced" ways of doing things rely on short, terse single characters that act essentially like magic and behave differently in different situations. At least this is what I remember as I climbed the Perl ladder.

Contrast this with, say, Python, where readability is highly valued and there's a one-way-to-do-certain-things mentality (sorts, for example); code is generally easier to grok, debug, and extend.
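The sorting example made concrete: customizing a sort in Python is idiomatic in exactly one way, the key= parameter:

```python
# sorted() returns a new list; list.sort() sorts in place.
# Either way, the one idiomatic customization point is key=.
names = ["banana", "Apple", "cherry"]
print(sorted(names, key=str.lower))  # ['Apple', 'banana', 'cherry']
```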

Thus, readability becomes especially important when you have to deal with someone else's big ball o' mud and you need to fix a high stress, high visibility, lines down issue at one in the morning because the original programmer is long gone, didn't leave any comments in their code, and decided to play "look at me, I'm so clever with how I use this operator", but forgot to account for a certain scenario that manufacturing decided to roll out the other day without telling anyone. Can I get an amen?

I won't disagree with most of what you said (though I will say that having one way to do it does NOT mean that said one way is particularly readable/maintainable - looking at you, Java).

That said, for every time I've hated that someone got terse and clever, I've loved that I wrote something that _read_ well, particularly when returning to previous code I don't remember. Coding is like a language, you can use that expressiveness to be terse, to be rambling, or to be clear. Learning to do so is a skill that has to be learned (cost), but can be more expressive (benefit). Languages (or patterns) that restrict that reduce the cost, and reduce the benefit.

Consider, for example, that Python takes great pride in its explicit readability, and yet it chose "lambda" as a keyword (making no one happy outside of higher math), that "def" was chosen instead of "define", and that one of the most powerful parts of the language (comprehensions) is highly prone to relying on secret knowledge. [] works differently than (), for example.
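The [] vs () point in concrete terms: a list comprehension builds the whole list eagerly, while the same expression in parentheses builds a lazy generator, which is exhausted after one pass:

```python
squares_list = [x * x for x in range(4)]   # eager: a real list
squares_gen  = (x * x for x in range(4))   # lazy: a one-shot generator

print(squares_list)        # [0, 1, 4, 9]
print(list(squares_gen))   # [0, 1, 4, 9]
print(list(squares_gen))   # [] -- already exhausted
```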

So, what you've described are all valid reasons that Perl has a bad rep...but I don't blame Perl for them anymore than I blame JS for the hideousness that is the DOM, or Python for the bad scripts that have been written in it. Regardless of the language, cleverness and/or terseness without regard to future people (including your future self) is just not a good practice.

  though I will say that having one way to do it does NOT mean that
  said one way is particularly readable/maintainable - looking at you,
  Java.

  So, what you've described are all valid reasons that
  Perl has a bad rep...but I don't blame Perl for them
  anymore than I blame JS for the hideousness that is the
  DOM, or Python for the bad scripts that have been written
  in it.
Will you be willing to extend the same charity to the bad Java code you have seen? Or do you have specific complaints for your harsh comments singling out Java?

If you're looking for me to say I'm biased and occasionally say some exaggerated complaints....yeah, that's true. But as I still feel the basis has reasoning, I'll continue to answer:

Yes and no.

For the bad Java code I've seen, Java does not bear the responsibility. (And I've seen my fair share of bad Java code because I spent 5 years as a Java dev in a place that had a collection of bad coders and worse code). I've also (since) seen great coders with quality code (in Java), but I definitely retain some bias from my earlier experience.

BUT, when your language design promotes certain things, then yes, that language takes the hit. For example, some of the Perl options are just plain terrible (e.g. changing 0 indexing) and Perl takes the blame. Perl also moved away from those, recognizing the errors. Books like Higher Order Perl (and associated efforts) came out to promote best practices.

Java expressly sought verbosity, and thus takes the blame for verbosity at the expense of clarity.

Other practices are encouraged by the community. That's a gray area - it's not the language at fault, but...it's common. These can range from small and almost petty (Perl promoted underscores in variable names for REASONS, while Java promotes camelCaseBecauseIGuessSomePeopleLikeToSquintToParseVariables.) Not really the LANGUAGE fault, but definitely an expectation. (and one I've grown accustomed to in the name of working with others). Other problems are larger (Kingdom of Nouns) - still not the Language fault, but you can expect it if you're dealing with the code.

Many a time I've tried to trace through some Java code...weaving through "Impl" classes and factories, trying to find where some piece of logic is implemented...only to find myself in an empty class definition. That's a fault of both encouraged practices and poor practices.

When people complain of poor Perl code, it's usually code that was either written by someone that didn't do that as their main job, code from newbies, or code that has changed hands repeatedly with no one trying to make it maintainable. Most any code will suffer in those conditions. Java, thus far, has had the worst overall quality of code in "professional" collections that I've experienced. But then again, I've really only seen code in any quantity in 3 or 4 languages, so....anecdotal experience is anecdotal and subjective.

(I have to give props to Python here...while I have a few nits about a few things, and I'm not well-versed enough to discuss code quality of anything I skim, I've seen a lot of code that was probably "bad" but nonetheless avoided some mistakes that are commonly made in other languages, particularly with new coders.)

One approach that can help is to name things based on what the functions actually do.

   validateSortDisplayedItems(); // validation logic; actually sorts items
This can be harder to maintain, but really long names end up a useful code smell.

I find it a bit...incorrect.

I mean, your above code LIES. If I call validateSortDisplayedItems, I don't validate, I validate AND sort. Plus, what do you do if you have "validateItems" and "sortItems", and then one function that calls them each in turn? call it "validateAndSortItems"? Yuck.

Depends on the Validation Logic. I like the style that basically works like this:

   sortDisplayedItems(); //Actually sorts items.
AKA validate means try to make valid, not verify that the data isValid. So you can't just do if(isValid) sort; the bonus is that unrecoverable errors end up at leaf nodes vs. on the happy path.

At the high level, your function might be sortClicked, which can then respond by calling a wide range of functions (userCanSort, SortData, UpdateDisplay).

PS: I find the validate > correct loop is generally the important and error-prone part of code, so I give it priority. The happy path where everything works is more or less an addendum.

The problem is that validateSortDisplayedItems is some sort of public function. The caller doesn't care about the validation, only that the items are sorted.

All kinds of functions everywhere validate their arguments; that doesn't deserve to be in their name.

The question is why you need a helper function at all. IMO, public functions are a special case, and you're generally better off using different naming schemes for internal and external functions.

Granted, if it's pure pass-through then reusing the name is a non-issue. Foo(x){foo(x);} is not confusing. Foo(x){junk; foo(x);} is.

One option is Foo(x){validateForFoo(x); ValidatedFoo(x);}

Alternatively for private functions JunkFoo(x){junk(x); foo(x);} And replacing JunkFoo with a meaningful name if possible.

Again though, public function Foo(x) should really call internal function Bar(x) otherwise it's really easy to couple things on both sides of an interface.

The computation represented by "junk" could actually not be contributing any new semantics to "foo". It could just be some boilerplate needed to extract and isolate various resources out of context "x", so that these pieces can then be passed to the low-level "foo", which knows nothing of the aggregate "x". "junk" could be some locking and unlocking, or debugging, or preparing the display in some way, or whatever.

If someone wants to know how to name the low-level helper which does the real work and the interface over it, we just have to take it for granted that the separation is necessary and that it makes sense for either function to have the name "displaySortedItems" or whatever.

If you're stripping boilerplate then you're operating on different things. Which seems more readable?

  SortGuiElement(){junk; SortUserNames(); junk;}

  SortNames(){junk; sortNames(); junk;}
It might seem obvious and the code might be identical, but I see the second case very frequently in other peoples code.

Re-using an identifier with just a case difference is criminal.

You want:

  SortNames(){junk; SortNamesImpl(); junk;}
or whatever: SortNamesGuts(), InternalSortNames(), DoSortNames(), LowLevelSortNames(). Anything but just flipping the case of a letter or two in the "SortNames" identifier.

I agree that reusing the same name with different capitalization is bad, but I have seen this from several developers across many different teams and several languages. These same people often have Foo(x){junk; Foo(x,y,z); more junk;} - sure, if it's a pass-through that's fine, but if you have 10 foo()'s that all have internal logic then please come up with some actual names.

Anyway, IMO, SortNames() and Internal/Do/LowLevel/SortNames() are almost as bad, because just looking at the names it's not obvious what's going on. Yes, visually they look different, which helps, and consistency can make things even more readable. But even just SortNamesHappyPath() gives some idea of what's going on.

A convention I've seen in Java libraries and that I'm fond of is:

  public void sort() {
    try {
      doSort();
    } catch (Exception ex) {
      // handle or report the error
    }
  }

  private void doSort() throws Exception {
    // core impl
  }
Make them separate functions each with their own local state or use namespacing. The problem sounds like too much state in a single algorithm. Break it down into subalgorithms each with their own state.

I always call the low level functions with some clear name related to the high level one by convention:

- sort_displayed_items_guts

- sort_displayed_items_impl

- ll_sort_displayed_items (ll == low level)

- do_sort_displayed_items

And I'd say that the most important similarity is that good code and good prose can't be written in vacuum. "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." - Martin Fowler

Anybody who has written a book knows the feeling of slowly going crazy because you lose touch with how readers will take it. People writing long works in English have a host of tricks to overcome this. First, they'll have editors. Generally, they have more than one. They'll have sample readers, often quite a number of them. And then they iterate, going over something repeatedly to make it more readable. Writers I know will spend 2-3x the time in the revision process than they did in writing the first draft.

For me, the best way to get that same feedback is pair programming. I was recently looking back at a code base produced by a small team I was on a few years back; the 4 of us did it with pair programming and frequent pair rotation. It is a great code base, one I'm entirely proud of. Clear, readable, well factored, intellectually coherent, and with amazing unit testing coverage. I think that's because every line, every change had two pairs of eyes on it, which meant that we were constantly evaluating readability, constantly testing and reducing cognitive load.

I agree. I find myself making this comparison often. But, I would add a slight caveat to your point because writing takes many forms. For example, the comment I'm writing at this moment does not feel cognitively taxing (loosely speaking). However, writing a manuscript for publication feels very similar to writing code, at least to me. All of this is to say that I think technical writing is similar to writing code.

Based on your list of how you like to write English, I agree. But I think your list is way off-base when it comes to human languages.

The beauty of human language is that it allows us to express our individuality as humans. There are an infinite number of ways to write the same thing, and each author can have a unique style based on how they choose their words, structure their sentences, etc. This is fantastic, and is truly one of the pillars of a free society. Consider "Newspeak" in Orwell's 1984 - its entire purpose was to eliminate free thought by eliminating choices in writing and speech.

Now in computer programming, I think unique styles are to be strictly avoided. A Newspeak-ish programming language that minimized the ability to write "creatively" would actually be pretty nice.

Different guidelines for different writing. I will readily agree that this list would result in awful poetry, but then again, uniqueness and individuality would result in equally awful technical writing. So it's not that this list is off-base for human languages. Rather, it's off-base for art.

Wouldn't want a novelist to strictly follow that list, but I wouldn't mind a technical writer doing so.

Yes, with technical writing I (generally) just want the author to help me get the information into my mind with minimal cognitive effort. There are times where being poetic can help, such as selling me on a new programming model, but most technical writing out there is too long winded for my taste.

Refactoring is not more risky in dynamic languages if you have unit tests to back you up.

On the other hand, in a static language you will likely end up finding out that your initial architecture did not take into consideration this use case or that API, and you may spend a considerable amount of time fixing what would be trivial to do in a dynamic language.

Also remember that the optimal width for text readability is around 4 inches, or 60 characters. So unless you have many levels of indentation, stick to the 80 column rule.
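That rule is easy to enforce mechanically; a toy checker (most linters ship this as a max-line-length rule):

```python
def overlong_lines(source, limit=80):
    """Return (line_number, length) pairs for lines exceeding the limit."""
    return [(n, len(line))
            for n, line in enumerate(source.splitlines(), start=1)
            if len(line) > limit]

print(overlong_lines("short line\n" + "x" * 100))  # [(2, 100)]
```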

Great. Now my English has improved but my coding hasn't.

Stop drawing parallels you aren't helping anybody! :D

Code clutter is much more consequential than it gets credit for. Poor formatting, inconsistent whitespace, snips of unused code, and misleading filenames are speedbumps (or worse, spike strips!) that a developer is going to hit every time they sit down to code.

And unlike bad abstractions -- which you can "learn" about and mentally model -- the friction of clutter is a constant drag on your productivity.

Building features in a cluttered codebase is like being asked to install plumbing in a hoarder's basement.

Understanding a cluttered codebase is like reading an advanced math textbook by candlelight that was written with broken typewriters by a group of modernist poets.

De-cluttering is the first thing you should do, and you should stick to it above all else. Before you add tooling, before you refactor, before you abstract and framework-ize, get rid of the clutter. That means format your code, and use a linter whose rules are based on widely adopted community conventions.

> Building features in a cluttered codebase is like being asked to install plumbing in a hoarder's basement.

Hah, yes! That's a great analogy. It doesn't mean you can't get the job done, it just slows you down every time you move around the code base.

What's a Linter?

It's a tool that finds things that aren't strictly errors in your code, but are likely to cause problems.

see also: https://en.wikipedia.org/wiki/Lint_(software) (this article refers to one particular (original?) lint tool, but there are numerous others for various other languages)
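To make the idea concrete, here's a toy lint check in Python, flagging names that are assigned but never read. Real linters like pyflakes do this properly (handling scopes, imports, and much more); this is just a sketch of the principle:

```python
import ast

def assigned_but_unused(source):
    """Toy lint pass: names stored to but never loaded anywhere."""
    tree = ast.parse(source)
    stored, loaded = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    return stored - loaded

code = "x = 1\ny = 2\nprint(x)\n"
print(assigned_but_unused(code))  # {'y'}
```

Nothing here is an error the interpreter would catch, but the unused `y` is exactly the kind of cognitive noise a linter surfaces.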

I used to think that a lot of bad code out there was made by lazy, incompetent programmers...

But then, after a certain job, I realized that this is probably not the case.

Now I believe that most bad code out there is made by overworked and tired programmers in a rush to deliver something that works.

The main thing most programmers overestimate is their own working memory, i.e. the number of things you can retain while writing a piece of code.

Problem is: working memory is short term. After a few days, you revisit that code and you realize how much it costs to load it all in your head again.

Basically, we need code reviews of code we wrote last week as opposed to code we wrote yesterday.

The code I'm working on now, and the code I worked on before that, was designed to be memorized. I suspect the only reason the team is productive is that they are working from long term memory most of the time.

The insidious things about this are twofold. First, it makes all new team members look like idiots. These other guys are getting work done, what's your problem? Our problem is we can't figure out wtf is going on without breaking things to do it.

Second, everyone doing work to fix the problem is a threat, because they are moving code that is memorized.

I honestly don't know how to fix this problem, I used to just walk away but I feel like I should be able to do better than that. Boiling that frog is a long term commitment.

Unit tests tell you what that small thing you did ages ago was meant to do, and what it does in different circumstances. If you have the discipline to thoroughly test your code(and the knowhow to not write brittle tests), it can really pay off.
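For instance, a couple of small tests double as documentation of intent (slugify here is a made-up helper, not from the thread):

```python
import unittest

def slugify(title):
    # hypothetical helper written "ages ago": lowercase, hyphen-separated
    return "-".join(title.lower().split())

class TestSlugify(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self):
        # records what it does in this circumstance, deliberate or not
        self.assertEqual(slugify("  Hello \t World "), "hello-world")
```

Run with `python -m unittest <file>`. Months later, the second test answers "what does it do with messy input?" without anyone re-reading the implementation.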

Yeah people who rely on rote memorization tend not to be too keen on unit tests.

Thankfully the Old Ways are dying, but there are still pretty huge pockets of holdouts. Sometimes it can be hard to tell from the interview process where your prospective employer is at on the continuum.

Both [of the last two, big enterprise] places asked me about testing, and I wrongly assumed that meant they knew and cared. At the former they neither knew nor cared (in fact their dependency graph was so very broken that I couldn't write tests even though I wanted to, and I wasn't going to be able to unwind 250k lines of Big Ball of Mud in the time I had. Hated it). At the latter they care a bit, but are only beginning to understand what kind of trouble they've gotten themselves into.

>If you have the discipline to thoroughly test your code

This isn't the issue. I, and I suspect many others, simply don't have time to thoroughly test code.

It's the age-old issue of "Business doesn't rely on good code. It relies on delivering the product of that code." Up until the ship is both on fire and sinking - nobody cares about sailing a good ship. They'll settle for the shoddy lake boat and ride it out as long as it will last.

I believe that even in the short term, tested code gets done faster than untested code, simply because you waste so little time on trivial mistakes when you test. But it takes time to get to that point, and you don't get to practice testing because it's a new thing; learning it takes time you simply don't have, so it looks like it will slow you down.

The trick is to get off the boat before its final voyage. Or if you're really smart, the penultimate voyage, so nobody thinks you were involved.

I think that's one area where executives and salespeople have acumen that us poor plebs lack in spades. I'm not saying it's a good thing, I just think we get left holding the bag.

Walk away. Life is far too short to have to deal with that crap.

About a year ago, I came across an off-topic quip on Reddit that was supposed to be a humorous reference to the phenomenon where beginner programmers first discover they're unable to understand the code they'd written weeks or even days before. It made me realize that I haven't had a WTF moment like that in something like 5+ years, and I'd forgotten the phenomenon even existed.

Some time before, I set out to outline a disciplined approach for writing code that resembles the practices in the article and commit to following it, regardless of whatever strain of laziness I'm afflicted with in the moment telling me to just break the rules and let it slide.

Contributing to Mozilla in a time before GitHub was a big part of my coming-of-age story, and I credit exposure to the Mozilla code review process and a lot of the other good practices as the number one reason for this—as well as the source of my annoyance when I try digging into some project that I'd like to contribute to, only to find that the maintainers are basically flying the thing by the seat of their pants.

This is why programming is one of the hardest professions. You always carry your work home inside your head. Even doctors and lawyers don't have that much of a memory load (for example, in the case of doctors: medical records are mostly sufficient).

It's both. Some of my worst code was to look at someone else's code after a manager shouted "FIX THIS NOW!!!" with some kind of quick and dirty hack, and then never going back to get it done right. I'm pretty sure this happens independent of whether or not I am any good. :-)

When you're really good, you learn to FIX IT NOW while also not making a hack. Your code becomes flexible enough to handle that kind of change.

You can't anticipate every emergency change required to the codebase. Sure, you can make some intelligent predictions about certain points in the code and encapsulate possible changes but it's impossible in the general case. Trying to code to future possible requirements is often far more harmful because of over-abstraction than occasionally having to perform a quick hack and then fix it later.

Yeah. It's a balance between "build as little 'framework code' as possible;" and "build the framework as you go."

I say 'framework code' but it could be called structural code, or glue code or something, too.

That's assuming it's your code. Most of the time it's not your code. While one would hope that the rest of the team writes flexible code as well, that's not always the case.

Then settle for making the code better than it was when you started working on it.

Usually it's not my code :/

I'm not that good to pull that off with other people's code. :-)

Nah, you're good enough. If the code is bad, change your goal to "make it better than it was when I got here." That's something you can do.

Happens to the best of us! Also, it's very difficult to keep entropy at bay in something that requires so much abstract thought and effort as programming, especially with how broad software's reach has gotten.

I was having this dilemma on my walk home. I try to write good idiomatic Django code, and know how to make it efficient in the ways described in Two Scoops (a book about Django best practices). I have inherited an app, parts of which are exactly the opposite of Django best practices (thin models, fat controllers being the most obvious one - the application's documentation says that's its philosophy, while Django encourages, and works better with, the opposite).

So do I spend more time refactoring and bringing this application to a decent state, or do I just keep on hacking more crap on to it? No one except for me seems to give a shit as long as it seems to be working.

A rewrite is a different animal than writing new code. The problem with rewrites is the marginal utility is a diminishing return. Unless the code is really bad, or will require very frequent revisits.

I think the best practice is to just refactor the bits that you need to touch. At a minimum avoid leaving them worse, and in an ideal world leave each piece you touch better for your having looked at it.

That's my plan, assuming I get the chance to.

> Now I believe that most bad code out there is made by overworked and tired programmers in a rush to deliver something that works.

I would agree. But I think putting up with those conditions is a different sort of laziness, and agreeing to produce garbage is a different sort of incompetence. Looking back on the times I've done that myself, I deeply regret it.

Programmers have a lot more power to shape process than they realize. There is a giant deficit of good programmers right now. Few managers understand what we're doing anyhow, so if we say, "Nothing gets marked as done until the code is up to professional standards, which includes sufficient unit tests and a well factored design", we can frequently make it stick.

Sure, they will still push you to go faster, but they will always push you to go faster. I think there are better ways to respond to that. E.g., by redirecting their "go faster" energy to breaking units of work down into smaller lumps so they can do better scope control. Or by having them get feedback on new systems early and often, so their decisions about what to make next are informed decisions, not just executive fantasies expressed in bloated 300-page Word documents.

"so if we say, "Nothing gets marked as done until the code is up to professional standards, which includes sufficient unit tests and a well factored design", we can frequently make it stick."

Or don't even say it, just do it and don't tell the manager, while incorporating into your estimates.

Overall time to deliver working software with acceptable performance and bug count will still be faster, anyway.

While we do have a good amount of power, they still have the ultimate power in that they write the checks.

If that's how you measure power, the ultimate power is in the hands of the customers, as they're the ones footing the bill.

And a given check only has power over you to the extent that you let it. Most programmers will have little trouble finding a job if they have good professional networks or are otherwise willing to work at having options. If you are confident that a just-as-good job is easily available, you can be much braver in the one you have.

In my experience, most bad code is written by dogmatic cargo cult programmers that are more interested in writing code that adheres to their pet development philosophy or framework instead of programming to solve a problem in the simplest way possible.

Think we all got different experiences there. My main problems have been with people who insist on just churning out features in the most straightforward way without doing any kind of upfront thinking. You end up with reams of copy-paste coding, code duplication, and so much code that it is very hard to keep track of what is going on.

Like, e.g., if you are parsing a file and it is complicated, you can save a lot of time and gain a lot more clarity if you use some form of state machine, or look into a little bit of the theory of textual transformations and representations.

I mean, I have seen people managing a forest of state variables who have never heard of regular expressions or any kind of methodology for dealing with text formats.

You end up with a crap load of messy unmaintainable code.
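A tiny illustration of the parsing point: a regular expression replaces a hand-rolled forest of state variables for a simple (made-up) key = value line format:

```python
import re

LINE = re.compile(r"^\s*(?P<key>\w+)\s*=\s*(?P<value>.*?)\s*$")

def parse_config(text):
    """Parse 'key = value' lines; anything that doesn't match is skipped."""
    result = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            result[m.group("key")] = m.group("value")
    return result

print(parse_config("host = example.com\nport=8080\n# comment\n"))
# {'host': 'example.com', 'port': '8080'}
```

One declarative pattern, instead of per-character flags tracking "seen key yet?", "inside value?", and so on.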

But of course it is also a problem when people turn every operation into a ceremony involving factories, command objects, and any design pattern you can think of. In the Java world that seems like a common problem. I think my own philosophy aligns more with how Go libraries are written. They are pretty simple and straightforward. They use smart approaches where needed but don't insist on adding indirection, encapsulation, etc. everywhere.

I am jealous of you. It is fairly straightforward to clean up quickly written, first version code. (In fact, this is probably the best way to develop, if you don't shirk the responsibility to refactor it yourself.) But if you have an over-architected, obfuscated mess, where do you even start? You should appreciate your fellow developers more. Maybe buy them a fruit basket or something?

A fashionable framework on your CV will get you an interview, while claiming that the code was simple won't. It's as immoral as objective reality is.

Good point. It's similar to the fallacy of measuring coder productivity by "lines of code written."

Reminds me of the anecdote about Bill Atkinson: his manager required a "productivity" form that asked "how many lines of code did you write this week?" Having just refactored QuickDraw, with vast gains in speed and simplicity, he answered "-2000."


That's quite poetic. But I assume you mean "amoral"?

Being a pessimist I consider it immoral.

Isn't that just another philosophy? Lets be more charitable. Engineers are given limited time to do any job (time == money). So they do what they can. Mostly on a budget.

The bad code I see isn't because someone was in a rush - it's because somebody spent a lot of time and effort to make it a mess. I swear 80% of developers don't know the difference between clear, elegant, readable, and modular code and a giant pile of mess. This is why I've stopped encouraging team members to refactor; the end result is most often worse.

Part of that is that any 3 developers will have 4 different opinions on what is clear, elegant, and modular, and what is a giant pile of mess.

Have you tried doing code reviews? If a refactoring sucks, you should be able to catch it before it goes into master (or whatever your main branch is called), and educate the programmer. Teaching our coworkers to write better code is part of our jobs.

The difference between two programmers, only one of whom produces code that is great to read and maintain, is not laziness, incompetence, or one being overworked. It is discipline.

The disciplined developer will produce better code.

That's like saying the more fit person will run faster, regardless of how tired they are. Discipline will help you write better code, but it is far from the only factor.

I've seen code readability decrease linearly since the 90s. This makes sense, since back then there were around 3-4 developers to do the work that one does today.

But, I think this is just a symptom of a lack of capacity and resource management.

Contrary to popular belief, it was actually possible to do iterative and incremental 'agile' development prior to 1999. Most engineers back then used the term 'waterfall' to describe a well-known anti-pattern.

There are many reasons for bad code.

Much of it is decent code that was then altered several times, without anyone taking a fresh look to clean it up.

But I'd guess the biggest factor is that most programmers can't tell good code from bad very well, when writing it.

Where I work, all the devs are lazy and incompetent... So I have to rush and overwork to get something that works.

Get out of there as soon as you can.

...did you just call yourself lazy and incompetent?

Most of the bad code was made by very productive developers.

This leads to the question of what 'productive' really means. If a productive programmer writes a lot of code that isn't maintainable at all, can you really call that person productive? Maybe... but the poor sob tasked with maintaining the code later will certainly not be called productive.

Maybe productivity should not be regarded isolated from other metrics and/or certain style-considerations.

What is the true measure of productivity? If you include things like maintenance and other developers' time are they really being productive? There's individual productivity and team productivity. Also a quick one time job is different than something that needs to be maintained. Everything is relative to the goals.

One of my co-workers likes to say that 10X engineers produce tech debt 10X faster than the average engineer :) Definitely been true in my experience.

Wouldn't they just be doing everything code-related 10x faster?

It seems to make sense:

N * (debt factor) = debt

10N * (debt factor) = 10 * debt

Because productive developers write more code or because productive developers write worse code?

Perhaps being a "productive" programmer requires understanding the requirements well enough to quickly write working code, then moving on to the next problem. Personally, I usually find that it takes two refactorings/rewrites to go from initial working code to something that I honestly believe is understandable/maintainable, and I don't always have time to do that.

It's more subtle than that. People act as if developers are all a clean slate, which isn't true.

Developers have personalities with differing values. That super productive developer may just not value the flexibility of the code, due to his personality coupled with that developer's personal experience.

That developer will absolutely shine in some environments and be abysmal in others since there are absolutely environments where flexibility isn't an overriding concern.

As you get more experience you learn to be more explicit about your decisions, but you'll still have your preferences and your defaults and it takes a certain threshold for you to move away from said preferences and defaults.

That's simply because the non-productive programmers don't write very much code, good or bad.

Less code is better code.

Not in the context of "I don't build anything."

I think the reality is a mixture of both, the lazy, incompetent programmers are under pressure to ship too!

I constantly tried to get the same problems (a "temporary solution") fixed in my previous job, but it was always put off until next month, as something more urgent came up. I wasted around 1/4 of my time cleaning data as a result, and that led to really low productivity because of all the interruptions it caused - I hardly ever got into the flow. Three years on, I realised it was time to leave. It would have taken me only 3-4 weeks to replace the crap code with something better, but instead my productivity was probably around 1/4 of what it could have been for three years, because of constant firefighting and the cleanup required by unvalidated data going into the database.

So common! My friend gave notice. In his remaining time he rewrote the data-onboarding-and-sanitizing code for performance (something there was never time to do) and reduced it from 2 days to 2 hours. Which probably had a larger impact than anything else he'd done in his entire time there.

It is probably about half. Some programmers are new, or don't care, or a mix.

Not all programmers are "good" programmers.

As is often the case with this kind of article, it barely scratches the surface.

The claim that "null != variable" will confuse people is downright silly. People confused by this won't have an inkling of what any non-hello-world program does.

The rest has some validity, but it focuses on syntax and programming in the very small. It might take a bit of effort, but I can make sense of a tangled function (that's not an excuse to code sloppily though).

The real challenges are architectural: understanding how an application is structured is often a daunting task, and yet there are some very simple things one can do to combat this, starting with grouping things thoughtfully, generous pointers in the documentation, and a few paragraphs of architecture overview in the readme file.

Re: "null != variable":

I don't think it's necessarily confusing, it's just that usually we tend to think of the elements we're working with (variables, objects, functions, etc...) as taking on values, and so linguistically, we ask "is my thing null?" Not, "is nullness something that applies to my thing?" Hence "thing operation value" is arguably cognitively cheaper than "value operation thing."

So you could argue that it's one of those things that increases cognitive load if you're not used to it. That last bit is important, because almost anything, no matter how convoluted, can become acceptable once it becomes a convention. In Java I used to wonder why people did this:

    if ("".equals(str)) {...
rather than

    if (str.equals("")) {...
until I got more familiar with NPEs. Now the former seems quite natural.
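For what it's worth, since Java 7 there's also java.util.Objects.equals, which is null-safe in either operand order, so the reversed-receiver trick isn't strictly necessary anymore. A quick sketch:

```java
import java.util.Objects;

public class NullSafe {
    public static void main(String[] args) {
        String str = null;
        // The classic trick: flipping the receiver avoids the NPE.
        System.out.println("".equals(str));          // false, no NPE
        // Objects.equals reads in the natural order and is also null-safe.
        System.out.println(Objects.equals(str, "")); // false, no NPE
    }
}
```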

I think the point was that as soon as you are aware that typos of "=" and "==" are common, and hard to catch mechanically, the _habit_ of using the (constant == myvar) pattern can suddenly be seen as having more value, as a hedge against human error.

I'd rather write it that way, and then later change it in code review to `(myvar == constant)`, than risk writing it as `(myvar = constant)` and have it sneak through. Granted unit tests might catch this, but perhaps the error is in the unit test. ;)
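A small illustration (mine, not the parent's) of the typo being hedged against. Even in Java the hazard isn't entirely gone: with boolean variables the assignment typo still compiles, and the Yoda order is the version that turns it into a compile error:

```java
public class YodaDemo {
    public static void main(String[] args) {
        boolean done = false;
        // Typo: "=" instead of "==". This compiles in Java, because
        // (done = true) is itself a boolean-valued expression.
        if (done = true) {
            System.out.println("branch always taken; done was clobbered");
        }
        // The Yoda form rejects the same typo at compile time:
        // if (true = done) { }   // error: unexpected type
    }
}
```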

Java if statements only allow boolean expressions, so the thing that the Yoda condition is supposed to protect against isn't even possible. I think this is precisely what the author is talking about, as this style is a totally unnecessary holdover from C.

For me the _habit_ of using (constant == myvar) is way harder to get used to than using "=" vs "==", which the compiler will warn me about if I mess it up.

I would disagree that they are hard to catch mechanically. GCC will warn about it with -Wparentheses, which is included in -Wall.

The example you show is just a bad workaround, though: your real problem there is that you have code that deals with input that can be null, but whose job is not just to get rid of the nullability. Keeping around fields that could be null and looking for ways to do it safely is like forcing everyone to wear radiation suits because you refuse to have laws about where you can keep nuclear materials.

This is why in modern Java, or in Scala, we have Option types (and in old school java, Nullable annotations), which makes trickery like that unnecessary.
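A sketch of that Optional style, using java.util.Optional from Java 8; the lookup and its names are invented for illustration:

```java
import java.util.Optional;

public class OptionalDemo {
    // Hypothetical lookup that makes absence explicit in the type,
    // instead of returning null and hoping callers remember to check.
    static Optional<String> findNickname(String user) {
        return "alice".equals(user) ? Optional.of("ally") : Optional.empty();
    }

    public static void main(String[] args) {
        // No null checks, Yoda or otherwise: absence is handled in one place.
        String nick = findNickname("bob").orElse("(no nickname)");
        System.out.println(nick);
    }
}
```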

I think you're reading too much into 'confusing'. It won't leave someone without understanding; it -may- cause them to momentarily pause, going "WTF?...uh...probably does ~this~. But why would someone do things this way?" That's still delaying them, that's still distracting them, that's still putting up an obstacle. Not one a future maintainer can't overcome, but -why do it-? That's the point.

It seems like a reading micro-optimization to me, though. Surely most of the delays in understanding code lie in things other than "Yoda" conditionals. I'm going to guess complexity elsewhere in the code (even conceptual difficulties, such as "what is the purpose of this at all?") dwarf minor details like how one writes conditionals. If so, what is the point of this piece of advice?

> It seems like a reading micro-optimization to me, though.

It's a "reading micro-optimization" in the same way that not littering your code with comments is a "reading micro-optimization".

It makes you pause and interrupts your flow of thought as you contemplate something not directly related to what you were modeling in your head 2 seconds ago.

I don't think it's the same. I'm not arguing against code cleanliness or good practices. Code readability matters. I favor concise, expressive languages over verbose ones for precisely this reason.

Littering your code with comments IS a huge readability problem. You are forced to filter them out to find the real code, since they are mostly garbage, but you can never be sure you aren't skipping some vital bit of information. Likewise, boilerplate and needless ceremony obscure the intent of the code, so they are obstacles to be reduced in good, clean code.

Contrast this with writing Yoda conditions. They are no big deal. They interrupt your flow of thought exactly once: the first time you encounter them. If your train of thought is interrupted every time you encounter them, you are way too novice a programmer. So I think this is a really minor issue in the sea of software complexity; so minor, in fact, that it seems bizarre to me to mention it.

I think whether to use Yoda conditions, much like the placement of braces or the number of whitespaces for indentation, are the stuff of flamewars and endless argument because they are the kind of things we programmers love to obsess about, but they are really not very important.

> If your train of thought is interrupted every time you encounter them, you are way too novice a programmer.

You're unimaginative.

This conversation is finished.

This comment breaks the HN guidelines. If you don't have something civil and substantive to say, please don't post.


please stop following me around dang. One would think you would be warning the person who implied I was a novice programmer, but that would require consistency and fairness.

either ban me or leave me alone until I do something that's actually ban worthy please.

And to be clear, so there is no confusion here.

Your guidelines state the following:

> When disagreeing, please reply to the argument instead of calling names. E.g. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

According to your guidelines, his comment:

> If your train of thought is interrupted every time you encounter them, you are way too novice a programmer.

Should have been something such as:

> If your train of thought is interrupted every time you encounter them you probably don't see yoda conditionals very often.

Please, do me a favor, go warn him as well.

consistency and fairness.

Or just ban me since we both know that's what you want to do.

I apologize if I offended you. I didn't mean "you" as in "you, mreiland", but as in a generic "you". I can no longer edit my post, but I've no problem if an admin edits that part to make it more explicitly general.

My use of "you" was the same as yours when you said "it makes you pause and interrupts your train of thought"; I didn't take it to mean you were referring to me specifically! Unfortunately, written language is prone to this kind of misunderstanding :(

I have no issues with the way you worded that, I would have defended you had dang stepped in to complain about it (and in fact I've done exactly that in the past, which is when dang started harassing me). I was more pointing out the hypocrisy of dang.

I ended the conversation for exactly the reason I cited, a distinct lack of imagination. Anyone who immediately reaches for "you must be a novice programmer" isn't really someone I'm interested in conversing with.

Especially since anyone who stopped to think about it for more than two seconds would realize a "novice programmer" would find the yoda conditionals easier specifically because they're still learning to read code.

No one in their right mind would claim an adult reader would be better at having random bits of text in their novels read from right to left in the middle of their left to right text. Most reasonable people are going to agree that it's easier for a young person just learning to read to pick up on that.

Yet there you were, making exactly that claim for programming.

It indicated a reactionary comment with a complete lack of critical thinking on your part and I just have no interest in spending my time speaking with a person whose thought processes work in that manner.

And if that offends you, then so be it. I personally do think you were simply being passive aggressive and that's why you didn't stop to think about what you were actually saying.

Which is your right to do, and I would defend you for it. But it doesn't mean I'm willing to continue engaging you.

Your comments suggest that you feel this is personal. It isn't. You just need to follow the rules like everyone else.

Inconsistencies in HN moderation are random side-effects of the impossibility of reading all the threads. You're welcome to bring them to our attention, but please don't take them personally.

> Inconsistencies in HN moderation are random side-effects of the impossibility of reading all the threads.

Unless it's one of mine, of course. Then you seem fairly consistent.

ban me or leave me alone.

Ok, we've banned you. When you're ready to follow the rules like everyone else, you're welcome to email hn@ycombinator.com and get unbanned.

I agree that big, deep stuff like architecture and abstractions can be daunting, but I wouldn't be quick to dismiss style and formatting as mere nitpicks. Code clutter can have an outsize effect downwind of those who create it.

A codebase is like a home. We all have to live in it. Architecture and abstractions are like the furniture and appliances. Code formatting and syntax are the knickknacks, paper bills, decorations, cat toys, board games, etc.

Does the toaster really belong on top of that highly flammable mattress? This is a big, deep question about the overall functioning of the home. But throw a pile of ratty blankets on top, put some boxes of moldy half-eaten pizza around the floor, and maybe wind that kite string over the doorknob and around the teetering bookcase. Can you still see the toaster? How do you plan to get in there to move it to a proper place?

Small issues can block, conceal, misrepresent, and make big issues even more dangerous. Which is why I say: fix all the small things first and stop new clutter from appearing (lint your code), then talk about rearranging the furniture.

Oh yes, I agree. But it's a shame it's all people ever talk about, when there's this much bigger thing that looms beneath the surface.

> "null != variable" will confuse people is downright silly. People confused by this won't have an inkling of what any non-hello-world program does.

I feel this is one of those things someone new might get confused about the first time they read it. After that, no. IOW, understanding this is part of learning and once it is learned, it is OK.

Yeah that struck me as an exceptionally minor thing to care about. Seriously we got way bigger problems than this in most code: reams of code duplication, 10 classes deep inheritance hierarchies, monster classes, 2 pages long methods.

In fact one of the most counterproductive things I have seen at work is people arguing at length about tiny details in the code standard and wasting hours work and goodwill.

My take on this is: you give people advice on why a code standard is good to follow. But if they insist on a tiny little quirk, let it be. It is not worth fighting over. The whole development community is full of people with all sorts of little quirks.

At the end of the day we got to get stuff done, and you have to weigh the benefits of stepping on somebody's toes against what you gain from it.

I completely agree. While the small things do make things easier or more difficult, the architecture can make an order of magnitude more difference.

The last two applications I have inherited are horrendously overcomplex, and they are performing relatively simple tasks. Far too many joins done at the application level. The front page of the current application I am working on makes over 1000 calls to the database. I don't think it needs more than one for the main part. (And if the logic is really so complex that it can't be done in the database, that's a sign that the database schema probably needs to be updated.)

Replying to myself, to say that it is interesting that most people replied to this to comment about the "null != variable" thing.

Kinda makes my point.

I suppose it could be explained by the fact that it's much easier to say something about the intuition of "null != variable" than to share some architectural insight.

What if the order depended on the check condition?

`if (null == variable) or if (false == variable)` reads well that you are checking for null/false.

`if (variable != null) or if (variable != false)` reads well that you want non-null/false conditions.

Yeah, I'm not even sure what's supposed to be confusing? That the "constant" comes first, before the comparison operator?

I am not sure, but here's my guess.

If I say (A != B), people may think of this as being equivalent to saying that there's a property we care about- being not equal to B- and we are checking whether A has that property.

If I say (Joe != null), then we are saying that there is a property "being not null" and we are checking whether Joe has that property. Feels pretty natural.

If I say (null != Joe), then we are saying that there is a property "being not Joe" and we are checking whether null has that property. This is a pretty amusing way to think. In the lines after one writes something like

    if (null == myArgument) throw new ArgumentNullException("myArgument")
one may proceed with assurance that some properties of null must hold :-)

The distinction between null and not-null is so generally useful that I could imagine making a partially-applied function like NotNull, and applying that to Joe (and many other objects). Would NotJoe get much use other than applying it to null? Probably not.
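Amusingly, Java 8 ships exactly that "partially applied" not-null property as Objects::nonNull, usable anywhere a Predicate is expected:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NonNullDemo {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Joe", null, "Ann");
        // Objects::nonNull is the reusable "is not null" property,
        // applied here as a stream filter.
        List<String> present = names.stream()
                                    .filter(Objects::nonNull)
                                    .collect(Collectors.toList());
        System.out.println(present);
    }
}
```

There is, as predicted, no Objects::nonJoe.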

Yoda speak, it is confusing like. Most people think "if the variable is not null" rather than "if null, the variable is not". It's easy enough to adapt to the style if it's used consistently throughout a code base, but that effort yields no benefits in most languages.

> Most people think "if the variable is not null"

Only because this is how people are traditionally taught to think. This can be untaught easily enough.

So you can unteach your developers to think... for what purpose?

I think he's complaining that it doesn't read like English. You aren't testing null, you are testing the variable. There is an implication that the subject of the "sentence" will come first. Of course English order is idiomatic in most programming languages, but I don't think that means you can't have other idioms if there is a reason. I find it amusing that he complains about abusing fluent interfaces (making it read like English for no real benefit), but doesn't realize that he's doing the same thing here.

That's why you have coding standards on your team. People agree on the idioms they use. Some people need to be told that it isn't an idiom if only one person likes it, though ;-)

For me it's not that it's confusing, it's annoying to read, because it indicates that the programmer didn't actually take the time to learn the language, and is instead writing it as if it were C/PHP or something else where expressions in if statements don't need to be explicitly boolean (the examples in the article are Java). If they wrote that, what else do they not understand about the language?

What if it is 0 but not null? I write in a few different languages throughout a typical week. It is nice to have some explicit statements when juggling between them. I know them very well, but it is still tricky when going back and forth.

That also wouldn't even compile. I also write in multiple languages, and for languages for which this is an issue I use a linter. This isn't even an issue in C if you compile everything with the appropriate flags (-Wall or -Wparentheses).

Slightly harder to understand, it is. But incomprehensible, it is not.

"Avoid using language extensions and libraries that do not play well with your IDE. The impact they will have on your productivity will far outweigh the small benefit of easier configuration or saving a few keystrokes with more terse syntax."

This is the main reason to avoid Spring.

Having so much critical code in XML and properties configuration files, annotations, and magical interfaces that somehow sprout implementations with no source code, make it impossible to debug using the normal techniques of Java development.

"Show me all of the implementations of this interface method"

"Show me where this object is instantiated"

"Show me the code being invoked when I call this method"

In my day to day work, Spring's only purpose often seems to be hiding the code that will actually execute when my program runs.
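To put the complaint in code: here's a toy sketch (all names invented) of plain constructor injection, where every one of those questions is answered by an ordinary text search, because the wiring is just code:

```java
// All names are invented for illustration; the point is only that
// the object graph is assembled in plain, greppable Java.
interface Mailer { String send(String msg); }

class SmtpMailer implements Mailer {
    public String send(String msg) { return "smtp:" + msg; }
}

class Signup {
    private final Mailer mailer;
    Signup(Mailer mailer) { this.mailer = mailer; } // constructor injection
    String welcome(String user) { return mailer.send("welcome " + user); }
}

public class WiringDemo {
    public static void main(String[] args) {
        // "Show me where this object is instantiated" is answered right here,
        // not in an XML file or an annotation processor.
        Signup signup = new Signup(new SmtpMailer());
        System.out.println(signup.welcome("ann"));
    }
}
```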

Intellij IDEA is remarkably Spring-aware if you set it up properly. Static checks across schema-specified XML and Java is tight. That said, I couldn't imagine working with Spring if I weren't using IDEA.

Because it is much better to let misspellings get caught during run time than compile time. /s

Same with a lot of the DI frameworks that were so popular for a while. New guy doesn't wire something up correctly, checks it in and everyone spends the next 3 hours debugging a rabbit hole. (OK, slight exaggeration.)

I spent 3 weeks in a Spring rabbit hole as a new dev. Not exactly the easiest thing to get the hang of.

Interesting. Are you familiar with IntelliJ? My experience has been that its Spring integration provides most of what you're asking for.

IntelliJ is amazing, and the only way I can function at all with our code base.

But there are definitely still times when Spring manages to elude even IntelliJ. How anyone thought "magically appearing interface implementations with no code" was a good idea still baffles me.

If I was to describe the essence of clean code in one word, I would say "balance".

Yes, yes, readability and problem separation and stuff is important, but so is efficiency and usability and security and deadline and everything. Bottom line: balance.

And it's the hardest thing to achieve.

I've always worked through the idea that "Extremism is harmful".

Literally anything taken too far is a bad thing. And just about everything in life is a matter of finding the right balance.

No one ever mentions formatting.

I really like aligning multiline blocks, adding whitespace, and useless braces here and there. E.g. having just a single space between function name and arguments makes it look less like a call. Yet almost all lint presets/defaults forbid this. Typography is all about the whitespace between letters forming easily recognizable shapes.

I like to break lines and use indentation to line up repeated text, so you can see there is repetition, and so the parts that are different are obvious. A simple example:

    if ((mouse.x == 0) &&
        (mouse.y == 0)) {
scores more points than:

    if ((mouse.x == 0) && (mouse.y == 0)) {

maybe if(mouseIsAtOrigin()){

The point wasn't to come up with the best api, but to illustrate how to format code that has repetitions and regular variations in it, to make it easy to visually identify which parts repeat and which parts vary. That makes it easier to read the code and spot errors.

For example:

    sqrt((x * x) +
         (y * y) +
         (z * z));
You can run your eyes up and down each column to verify it's squaring x, y and z.

That reflects the structure and symmetry of the expression better than:

    sqrt((x * x) + (y * y) + (y * z));
Did you spot the error?

    sqrt((x * x) +
         (y * y) +
         (y * z));
How about now?

Here's some code that has a lot of examples of that style, a JavaScript implementation of a weird hybrid Margolis cellular automata neighborhood, which has a lot of two-dimensional patterns:


That really only makes sense if you've got a 40-year-old code base that supports every platform ever conceived, some of which have mouse origins that are -42.333f or can only be determined at run time.

Take a look at the source for tcsh, this is exactly how it is done there. Unless you're interested in supporting a long dead platform - it makes things unnecessarily complex. I love tcsh, but so many ifdefs in so many static functions...

All this is solved by a linter, though; there's no point even trying to remember this. Just define your linter rules and let the linter deal with it.

It might not be the default linter styles, but set up your linter for your project and give your mind more important things to focus on.

A linter is also a great way to jump start learning a new language. If you encounter a warning that doesn't make sense to you, look it up. You'll start to get familiar with common language pitfalls without actually getting burned by any of them.

I didn't know linters had become so advanced. Can you recommend any which work on DSLs (including custom ones), and warn about 2D alignment issues? For example, here's some Nix code I have open right now:

     annotateAsts    = import ./annotateAsts.nix    { inherit stdenv annotatedb;    };
     runTypes        = import ./runTypes.nix        { inherit stdenv annotatedb jq; };
     dumpAndAnnotate = import ./dumpAndAnnotate.nix { inherit downloadAndDump;      };
It would be nice to have a tool rate various equivalent arrangements and warn if it finds one with a significantly better score, e.g. showing me the above if I'd given it something more confusing like:

     annotateAsts = import ./annotateAsts.nix { inherit stdenv annotatedb; };
     runTypes = import ./runTypes.nix { inherit stdenv annotatedb jq; };
     dumpAndAnnotate = import ./dumpAndAnnotate.nix { inherit downloadAndDump; };
Of course, as well as formatting it would be nice for equivalent representations of the same expression to be compared, e.g. using an SMT solver or genetic programming. For example, in Nix the variable names after "inherit" can be in any order, so it's easy to find permutations which highlight common elements (like "stdenv" and "annotatedb" above); if I'd written these in a different order (e.g. "inherit jq annotatedb stdenv;" on line two), it would be nice to be shown rearrangements which score more highly.

It's not just linters either. I can't even find an indenter which handles 2D alignment. For example, indenting something like (random bash code I have open at the moment):

    jq -n --argfile asts        <(echo "$ASTS")                       \
          --argfile cmd         <(echo "$CMD"         | jq -s -R '.') \
          --argfile result      <(echo "$RESULT"      | jq -s -R '.') \
          --argfile scopecmd    <(echo "$SCOPECMD"    | jq -s -R '.') \
          --argfile scoperesult <(echo "$SCOPERESULT" | jq -s -R '.') \
          '{asts: $asts, cmd: $cmd, result: $result, scopecmd: $scopecmd, scoperesult: $scoperesult}'
Emacs wants to put the second '--argfile' directly beneath '-n', which is clearly confusing compared to the above. If linters solve all typography issues, are there any which can be queried for the local-optimal indentation on a line-by-line basis?

Typography is much more than that, of course, but with fixed width plain text, you don't have many options...so ascii art it is. I'm pondering a language that includes formatting abstractions so that you can prepare code for reading along with its functionality (sort of like literate programming, but still starting from code). I would love to see comments in a side bar, long monotonous calls organized into tables, proper spacing and even lines to delimit sections, and so on...

But with our current programming systems, we are still very much in the dark ages of code typography.

> I would love to see comments in a side bar

This is brilliant. A neat way to bootstrap getting this sort of thing implemented in most code editors would be to write a plugin that can do this for existing code and make it good enough to turn heads. It would extract documentation blocks to be presented as prose in a vertically split pane to the right and present the file itself with those blocks hidden—as if automatic folding were turned on. It could even apply some fairly simple heuristics to automatically link to other relevant comments. The goal should be an experience indistinguishable from an embedded iframe showing human-generated API docs from the Web.

Another thing I'd like to kill is the file tree that you see in most VCS Web frontends that tell you the last commit message that touched the file/directory, rather than about the structure of the code.

Netscape's old Bonsai tool tried to do something like this. (When Netscape open sourced Mozilla, they also opened up a lot of their internal tools. This is where Bugzilla came from, but there were others, too.) When you were looking at a directory listing, if a file contained what looked like a short description of its purpose in the comments near the top, Bonsai would grab the description and present it alongside the file name.

In my own projects today, I try to always include a file overview containing a short, single line description and then write a paragraph or two going into further detail, documenting the whys of the code, and generally explaining its overall role in the project/justifying its existence. I'm basically writing for a tool that doesn't exist but that I'd like to see get created and gain widespread acceptance.

There is a tool for literate coffeescript that formats the output in a similar way - you have the english text in a pane on the left and the code in a pane on the right. I quite like the effect -- especially since there are potentially no comments in the code at all, which I often want if I'm just trying to read it quickly.

Unfortunately, literate coffeescript is not really mature enough to be used in a large project, IMHO. However, I hope that more people think about separating human language commentary from computer code. I think the idea behind literate programming is a good one and I hope that it gains some traction some day.

It doesn't really sound like the same thing I'm talking about, to be honest.

And I find the idea of a tool that automatically rewrites machine readable code into a natural language to be of dubious value beyond use cases where someone is first picking up the language. Similar to those tools that exist to generate comments in the form "Set global position" based on a method named setGlobalPosition. It just creates redundancy, and if you're committing the output to your source tree, then it's redundancy in the form of clutter, too.

What I'm thinking of, as I said, should shoot for parity with having a half-screen browser window open to the right containing the relevant docs. Only in this instance the docs are "live", and the lookup process is context-sensitive, requiring very little manual effort to perform it. I know that a basic attempt at something bearing minimal similarity is available in most editors that try to implement Intellisense, but generally I find the helpfulness of the small popup in most implementations to be limited to helping you get the method signature right, and not much else.

One of the nice side effects of the design I'm talking about would be a system that encourages keeping the docs up-to-date and useful as much as it encourages their consumption.

I think you might be misunderstanding what "literate programming" is. It's a method of programming where you embed code inside English language documentation. The idea is to be able to present the documentation in a way that is useful to a human, but have the compiler extract the computer code and reassemble it in the way that the computer would like to see it.

Literate coffeescript does not have the tools for extracting and rearranging text, so it's really just a way of embedding markdown text into your coffeescript code. However, it's useful because you can embed html hyperlinks which can do things like enable you to click to get to the tests, etc.

Here is a small example of something I wrote in literate coffeescript: https://github.com/ygt-mikekchar/react-maybe-matchers/blob/m...

Now imagine that you have the English text on the left hand side and the source code on the right hand side. Ideally you would have tools that would allow you to make the hyperlinks (possibly automatically) and keep the documentation in sync. Such tools do not yet exist at the moment, unfortunately.

Edit: I should admit to being embarrassed about my fluent interface abuse in this code ;-)

Sorry. I'm familiar with literate programming, but I mistook one of your statements:

> formats the output in a similar way - you have the English text in a pane on the left and the code in a pane on the right

... to be a description of a system that didn't really sound like what I had in mind, and didn't really sound like literate programming, either. After reading this comment and a reread of your original one, I understand I was wrong in my interpretation of what you meant. Sorry about that.

This is one thing I do too, and what really bugged me when I first tried Go. As an example I will align equals signs like this:

    loggedIn = foo
    isAdmin  = bar
It doesn't look like much, but when you are glancing over code it really helps to quickly read it.

And then you add `isCurrentlyAllowedToReadPosts = baz` and all your nifty formatting breaks.

This. I can't stand formatting conventions that cause a change to needlessly spill over into the neighbouring lines of a git-blame or diff. It might look good in the 2 dimensions here and now but history is important, and extra lines touched make it harder to follow.

Yes, very important. I love editors that do most of this at a button press. It makes other people's code easier to understand. NetBeans has lots of settings so you can get the braces and spaces where you want them.

'Avoid using language extensions and libraries that do not play well with your IDE'

I'm of the opinion this should be extended to "does not play well without an IDE". Because even in projects that said "everyone, use Eclipse(/IntelliJ/whatever)", and tried to share project files, there was constant pain in ensuring that everyone had the same development environment ("Oh, yeah, I made a local change to my project file, but then I also made a change that needed to be shared, and whoops, I broke everyone" and the like). Not to mention trying to work with code deployed onto boxes without an IDE. I've gotten to where "if I can't make sense of it and be productive in it with just vim, grep, and find, your code is too complex".

Well, sure, but that's why you don't share project files.

I think that if you do things correctly, your concern is mostly mitigated. My last job was Java/Spring. As someone noted above, you really don't want to be working with Java/Spring without IntelliJ.

But we never had issues with anyone breaking builds due to screwing up project files, because those weren't shared and the project would run/build/test immediately after import into the IDE without any additional tweaks.

I think it's sort of odd that many folks complain about IDEs, but then use vim with a million plugins. Isn't vim + plugins = IDE? Or, if the overall point is "java sucks, don't use java" then fine, but if you have to use it, I couldn't even imagine using plain old vim without any language support...

Sure. Does it require a specific IDE then? Does it cleanly import, from scratch, with working tests and etc, regardless of whether they're using Eclipse, IntelliJ, Netbeans, etc? Or is there One True IDE? Is that IDE available everywhere you have to touch code? Is it straightforward to integrate it with your CI/CD tools, despite conforming to that IDE convention? Will it continue to cleanly import after you've left the code to rot for a couple of years in production, and then have to come back in to maintain it? Etc. I've been bitten by all of these.

Whereas I've gone to projects that were written without all of this, and even without getting the code to compile locally (because of dependency hell that I didn't want to suffer through if I didn't have to), I was able to diagnose and fix issues, because the code was written to be understood with only a text editor.

The idea is that you still conform to the convention of the project (Rails, Spring, etc.), not the IDE, and the FOSS CLI tools etc. can still understand the project, but so does the IDE (because it is made to understand the convention, e.g. how the annotations/XML files work, how to get a list of all your Rake tasks).

By not sharing IDE project files, not having any IDE-specific machinery in the project, but still expecting CI to be able to run it, being able to theoretically hack on the project and get results (build, test) even with just bash and grep and nano, you get the best of both worlds.

Modern IDEs are better at this than they ever used to be. I was really anti-IDE for a long time, but JetBrains' products made it make sense finally.

Someone wrote a post a while back about this topic and called it the grep test. See: http://jamie-wong.com/2013/07/12/grep-test/

It's funny that you mention the Java ecosystem as one of the worst offenders, since the nature of the language itself and the culture around best practices in the early days should have put it in a particularly favorable position here. This is covered partly in another great post titled "Java for Everything": http://www.teamten.com/lawrence/writings/java-for-everything...

Unfortunately, I too find most Java projects are developed in a way that make them completely unapproachable without the assistance of an IDE. I'm secretly hoping that Microsoft announces first-class support for Android app development in VSCode, given that the team's overall gestalt is on producing a nimble, code-oriented editor. They may not otherwise have any incentive to make that kind of thing happen, except that I think it would be a real boon for VSCode adoption. Anxiously waiting to see if there will be any interesting announcements to come out of their developer conference this weekend, although I'm not counting on it.

Even more amusing, I didn't mention Java, just IDEs popular with Java (but which support other languages too). I was certainly thinking it, though. As were you, as soon as you read the description of the problem. Yay, validation.

I agree completely and totally. A good programmer's text editor (e.g. vi or emacs) is far more general than an IDE, more easily extensible and rather more future-proof (folks will still be using emacs & vim in twenty years; will anyone be using Eclipse?).

Requiring an IDE is, to my mind, a symptom of a far-too-complex environment which will lead to breakage.

People have been using eclipse for almost 15 years, and if they stop using it, it will be because there is a better IDE (see: IntelliJ). Things like smart auto-complete, click-to-definition, history/diff merge views and integrated build support/dependency management are a huge productivity win.

If something requires an IDE in order to run it, edit it, etc, that's kind of a code smell. The IDE might be masking complexity for you, but it is creating complexity on the server side or for developers who prefer to use something like vim. Getting something to build for the first time in Eclipse is a pain. So many settings that keep changing which menu they are under in different version and you have no idea what you are doing wrong. If I have to click through dozens of menus and type in values in order to get it to build for the first time that is another code smell. The ideal scenario is to check it out, run a single command for package bundling; setup; etc, and then run.

That reminds me of when I first came across autocomplete. I thought, "Wow, this is really nice, I like it." Shortly thereafter, I had a horrifying worry that someday I would come across projects that couldn't be understood without autocomplete.

And soon enough, my next job the code was impossible to understand without autocomplete and a debugger. I am still sad about that to this day.

I didn't appreciate how much of a difference it would make until I tried it, but now I know that one of the best ways of making code more comprehensible is to eliminate any questions about interactions between components by using a language with referential transparency. The results of functions should be determined solely by the values of their arguments, with no contamination by shared state and no side effects.
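A minimal Python sketch of the contrast (function and variable names invented for illustration):

```python
# Impure: the result depends on hidden shared state, so a call site
# can't be understood (or safely refactored) in isolation.
discount_rate = 0.1  # module-level global, mutated elsewhere

def net_price_impure(price):
    return price * (1 - discount_rate)

# Referentially transparent: the result is determined solely by the
# arguments, so any call can be replaced by its value without
# changing the meaning of the program.
def net_price(price, rate):
    return price * (1 - rate)
```

Refactoring `net_price` is exactly the "obviously correct" case described below: nothing else in the system can change what it does.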

Yes, yes, yes.

Refactoring code without side effects is so incredibly liberating: I barely have to compile many changes, much less carefully test them, because they're so obviously correct, because I can SEE the flow. I don't have to speculate on what the rest of the system thinks is going on.

In other words, a pure function.


In mathematical words, a function.

What languages are you referring to? What languages do not support referential transparency?

There's a significant difference between a language which permits it and a language which embraces it.

My tongue-in-cheek programming language selection rule: avoid languages that require punctuation to call a function -- like f(x,y), x.f(y), (f x y), or x f: y.

Biggest thing for me: have a small number of fundamental, composable abstractions. Be able to reason about how their properties interact when you combine them.

This is the main reason that Haskell is such a force multiplier for me. All the small minutia is well defined and it frees you to think about the big picture. Sure, it's a big learning curve, but so worth it :-)

The variable name you choose when writing a function may not make any sense when reading it a month later, or to a third party. These rules are as useful as feng shui -- if you have a good design sense, you can produce good results with FS.

The readability wins I've seen come from using common tools; I'm going to be much more productive on a codebase written in a library ecosystem I understand. More generally, a powerful standard library with consistent calling conventions can help a language be useful. It's not productive for me to spend 10 minutes reading a function only to realize 'oh, this human wrote their own string split'.

Tests help too because they make clearer the dependent typing of a function's call signature; stuff like 'don't pass both of these variables' or 'a must always be > b' is never clear (and I've never used a design-by-contract language). Tests are sometimes better than documentation because you can sometimes get alerted when they're wrong.

Most important rule for readable code: hire programmers who know how to read. Reading a large project is a skill, and fewer people can read code than can write it. Every project is going to have a quirk of its evolution that's hard to understand without a close reading. (Every large C project contains a buggy partial implementation of LISP.) Hire programmers who can survive that.

I spend a lot of energy discussing this at work, and mostly that has been "continue talking while people look at you like you have two heads"

It feels like maybe people are almost ready to listen now. I don't know how this relates to the number of libraries we use now, or the StackOverflow culture, but I have my suspicions that they are related. Maybe we should start talking about Library Fatigue...

Not sure I can say I always write clean code, but when I do, in non-trivial cases, I often start to write a comment, then delete it and replace the code I wrote with something that does not need a comment... I suppose it only works for people who (think they) know what good code looks like, but (I think) it works as a good guiding principle (for me).

This is one of my guiding principles too. If I find myself wanting to comment the code to make it clearer or understandable, I refactor the code.

The one time I let comments slide is when they document unexpected and unavoidable behavior that I can't fix in code. For example working with an inconsistent data set and having to add some convoluted code because what is a string in 95% of the rows might parse as a tuple in the remaining lines, so I need to account for that. This is so that I, or the next guy, knows why the code was written as such and won't go off trying to fix it.
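A hedged Python sketch of that kind of "why, not what" comment in action (the data-set quirk and field handling are hypothetical):

```python
import ast

def parse_cell(raw):
    # NOTE: ~95% of rows store this field as a plain string, but a few
    # legacy rows serialized it as a tuple literal like "('a', 'b')".
    # We can't fix the upstream data set, so we detect and unwrap those
    # here. Don't "simplify" this away.
    try:
        value = ast.literal_eval(raw)
        if isinstance(value, tuple):
            return ", ".join(str(v) for v in value)
    except (ValueError, SyntaxError):
        pass  # not a Python literal: treat it as an ordinary string
    return raw
```

The comment exists precisely so the next person doesn't rip out the convoluted branch and reintroduce the bug.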

Developer cognitive load is one thing that kills projects, teams, and, eventually organizations. In the various companies and projects I've worked on, you can usually spot the culprit: a single person, or small group of people who take pride in making things more difficult for others to understand, or actively creating insider systems that serve to entrench themselves and make them seem important and the center of attention.

He advocates prefixes on variable names. I would like to see good examples of this, as I have never seen them as helpful (especially the tblUsers and intUserId type that can be common).

Christian points to Joel Spolsky's example of using prefixes to add meaning, not type information like your examples. From Joel's essay:

All strings that come from [user input] must be stored in variables (or database columns) with a name starting with the prefix "us" (for Unsafe String). All strings that have been HTML encoded or which came from a known-safe location must be stored in variables with a name starting with the prefix "s" (for Safe string).

    us = Request("name")

    ...pages later...
    usName = us

    ...pages later...
    recordset("usName") = usName

    ...days later...
    sName = Encode(recordset("usName"))

    ...pages or even months later...
    Write sName
The thing I want you to notice about the new convention is that now, if you make a mistake with an unsafe string, you can always see it on some single line of code, as long as the coding convention is adhered to:

    s = Request("name")
is a priori wrong, because you see the result of Request being assigned to a variable whose name begins with s, which is against the rules.

[1] http://www.joelonsoftware.com/articles/Wrong.html

(edit: formatting, citation)

In modern type systems (Haskell, etc.) you can define a new type that wraps an existing type with zero runtime cost. In Haskell you could write

    newtype UnsafeString = Unsafe String

(UnsafeString is the name of the type. Unsafe is the constructor you use to create an UnsafeString from the String).

With this and other techniques you can lift these characteristics of your data into the type system, which is IME much more useful than using Hungarian notation.
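For readers more at home in Python, a rough analogue of the newtype trick using `typing.NewType` (enforced only by a static checker such as mypy, not at runtime; all names invented):

```python
from typing import NewType
import html

UnsafeString = NewType("UnsafeString", str)
SafeString = NewType("SafeString", str)

def encode(s: UnsafeString) -> SafeString:
    # The one sanctioned path from unsafe input to safe output.
    return SafeString(html.escape(s))

def write(s: SafeString) -> None:
    print(s)

user_input = UnsafeString("<script>alert(1)</script>")
write(encode(user_input))   # OK
# write(user_input)         # mypy error: expected SafeString
```

As with the Haskell version, the point is that the "us"/"s" convention becomes a checked property rather than a naming discipline.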

I don't think the Haskell type system is powerful enough to completely replace prefix labels.

Marking something unsafe is the best-case scenario. Separating lines from rows would require a huge amount of boilerplate to preserve the numeric operations, and one still cannot create a library that will check something like:

    l = 5 :: Meter
    w = 4 :: Newton
    in l * w :: Joule

Representing constraints in the type system is always an engineering tradeoff---how much type level machinery do you want to build vs the effort required to build that?

Not sure what you mean by separating lines from rows. In the example you give of computing with units, I've seen examples of such systems in Scala (e.g. http://www.squants.com/) and F# comes with something built-in for this (IIUC). You might be interested in checking them out; it's quite neat.

> Not sure what you mean by separating lines from rows.

It's the example Joel Spolsky used in the blog post about prefixes that everybody keeps pointing to. He showed code that worked on rows and columns (not lines, sorry) of a table, using prefixes to avoid mixing them.

Haskell types do support this use, but you'll have to declare both types as Integral and Numeric, with the resulting boilerplate. I'm not sure exactly how to solve this; maybe a way to "lift" typeclasses out of a newtype declaration would be general enough. It is probably possible to do with Template Haskell, but that is already outside of the language.

That Scala library is interesting, although it looks like lots of the work is done at run time. Also, it looks like Haskell is able to do it [1]. (It's a recurring theme of mine, to complain that something cannot be done in Haskell, just to discover somebody who did it.)

[1] http://hackage.haskell.org/package/dimensional

Joel advocates for correct Hungarian notation. Unfortunately, people took it to mean putting the language type of the variable in its name (the compiler will check for that, obviously), but Hungarian notation actually means giving the variable a subtype, which helps with reading the code.

You're not supposed to write

    iLength = iCount * iSize

but rather (something like)

    meterLength = unitCount * meterSize

Yeah, I don't like abbreviations in code, so I would have unsafeName = Request("name"), etc., which does not have that problem... unless 'unsafe' is a thing in your business domain, in which case things could get complex :)

The article goes on to advocate prefixing the methods that return safe and unsafe strings with 's' or 'us'. I would hate that! I can easily see you ending up with string methods like safePadLeft and unsafePadLeft that do exactly the same thing but are named for where in the code they are used...

Coding is hard, and as other comments have stated all these blog posts and books need to be condensed into 'balance' and that balance point is different for each coder, project and team.

I can easily see you ending up with string methods like safePadLeft and unsafePadLeft that do exactly the same thing but are named for where in the code they are used...

I don't follow. Since the argument type to either method would be the same (a string), why are two methods necessary? Or are you implying that we define subtypes of String: SafeString and UnsafeString? And wouldn't subtypes inherit behavior from their parent?

Last time I inherited a project that used prefixes like this extensively, it was the most frustrating and productivity draining coding time of my life. Sure, it could work, I guess. If there was actually any consistency to it. So many hours lost trying to find out why a prefix was defined in two different ways....

Advocating for prefixes like this is just going to result in badly obfuscated code where we get the exact opposite of, "JavaBeanFactoryConstructorXML_dom_to_json".

I wonder how much of the later trend is due to the combination of early autocompletion and a codebase the size of Windows or Office, in a language where types can't be determined for sure without running each source file through the preprocessor. And most source files include the massive windows.h or some equally large hairball.

I assume that global variables would be horrible in such an environment, but typing out the prefix and maybe one letter of the actual name would cut the completion list down to a manageable size.

Perhaps Apps Hungarian started because it helped avoid common errors in complex GUI logic, then similar practices were carried over to the systems side, where they were useful for mostly unrelated reasons, eventually reaching the world in general through the header files produced by the systems folks but lacking any of the context or history needed to go beyond "Microsoft does it that way, there must be a good reason". Hopefully I'm just being overly pessimistic and extrapolating from massively insufficient information, though...

This old chestnut from Joel should be helpful (or at least entertaining): http://www.joelonsoftware.com/articles/Wrong.html

He eventually gets to "hungarian notation" (those redundant prefixes you don't like) towards the end of the piece. tl;dr: misinterpretation of a good idea, cargo-culted down through the ages.

I've thought about this quite a bit from a language design standpoint. I've come to realize the following:

1) Reduce the number of variables that need to be kept track of in order to make a function easier to understand.

2) Avoid metaprogramming unless there is a clear need for it (doing something over and over).

3) DRY isn't always good. Sometimes being more verbose is easier to read than being clever so you don't have to type as much. Concentrate on readability first.

4) When there are a lot of interacting components consider the DCI pattern. It will save the developer from bouncing from file to file, module to module, just to understand the flow of an algorithm. Each time a developer needs to look up a different file more cognitive load is introduced. An algorithm should be easy to follow in a single file with code in sequential order. The opposite of this is message passing and having components pass messages to other components.

5) Syntax matters. Unfortunately, once you choose your stack there isn't much you can do about this. Some syntaxes are much noisier than others. Every little bit of noise adds more cognitive load. Don't believe me? Try doing long division with Roman numerals.

6) Compress complex concepts into shorter ones. This builds on the Sapir-Whorf hypothesis. This might be moving a part of an algorithm into its own function or storing an intermediate state in its own variable (as opposed to function composition 4 levels deep). `map` is much simpler to understand than `for (var i=0; i<arr.length; i++) { doSomethingWith(arr[i]) }`.

7) Spend time getting to know your editor. When you can reduce the amount of muscle movement required to perform an action it generally reduces the cognitive load as well. Not to mention making you more productive.
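Point 6 in a quick Python sketch (the doubling operation is just an arbitrary example):

```python
arr = [1, 2, 3, 4]

# Index-juggling loop: the reader must track i, the bound, and the
# increment just to see that every element is doubled.
doubled_loop = []
for i in range(len(arr)):
    doubled_loop.append(arr[i] * 2)

# The same idea compressed into one concept: "map doubling over arr".
doubled_map = [x * 2 for x in arr]

assert doubled_loop == doubled_map
```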

One thing I like which nobody ever seems to mention, is that when you divide your code into smaller functions it helps to have your functions actually take inputs and give outputs. It is very hard to read code which is just line after line of functions taking no arguments and returning nothing. No matter how descriptive your method names are, it will just end up being more confusing than if you simply wrote out all the code contained in those functions instead.

People need to see how the code flows and how things are logically connected. That is impossible to do with reams of methods operating exclusively on member variables.

I guess this is just another way of stating the benefits of functional programming. People do modularization at the function/method level so badly so often that I often think talking about modularization at the class level is beginning at the wrong end.

So many struggle with modularization at small scale.
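A small Python sketch of the contrast (report/transform names are invented):

```python
# Opaque: every step communicates through member variables, so the
# reader can't see what flows where without reading every method.
class ReportBad:
    def run(self, text):
        self._rows = text.splitlines()
        self._transform()
        return self._render()

    def _transform(self):  # reads and mutates self._rows
        self._rows = [r.upper() for r in self._rows]

    def _render(self):     # reads self._rows
        return "\n".join(self._rows)

# Transparent: each function's inputs and outputs are visible at the
# call site, so the data flow reads top to bottom.
def transform(rows):
    return [r.upper() for r in rows]

def render(rows):
    return "\n".join(rows)

def report(text):
    return render(transform(text.splitlines()))
```

Both produce the same output, but in the second version the logical connections are on the page instead of hidden in `self`.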

Honestly, while I love these articles as food for thought, they really aren't the solution to the problem.

In my experience, programmers generally have a sense of what parts of their code need to be cleaned up. If they are a newer developer their ideas might be a little screwy. They might not see the root problems, but they know which closets are producing ghouls.

The issue I see is that developers don't think refactoring is a good a use of their time, or someone up the food chain doesn't think it's a good use of their time, or there's just so much institutional inertia making people question whether it's a good use of their time.

If you want your code to be better though, don't read about architectural problems, just take some time to fix the stuff you know is bad in whatever way you know how.

That's better than any article or class and it pays you money and it makes your code more fun.

I've found that diagrams help. Young developers (in my experience) tend to laugh at the thought of making UML diagrams. At the very least, a class diagram will tell you which classes reference what. Using some custom stereotypes can help give further meaning.

Ideally, sequence diagrams (or a similar type) can help shed light on just how the project comes together. Explaining how a project works from a high level can leave unintended knowledge gaps. The great thing about these diagrams is that they can largely be automated with very few inaccuracies.

Diagrams are definitely not a silver bullet. I have found that they can easily grow unwieldy when attempting to incorporate too much information into a single diagram. Much like separation of concerns or the single responsibility principle, I believe that diagrams work best when they're focused on a single piece of major functionality.

What is a stereotype?

Stereotypes are an extension of UML to include extra information about the class. This[1] is a class diagram (albeit, overly detailed) I created for a library that we used. Above the class names are stereotypes. I know which classes came directly from the library, which are service classes, etc.

[1]: http://i67.tinypic.com/33xyaz4.png

One programming style that I think makes things worse is the "how many calls can I make on a single line" programming style.

  MyObjecter.GetObjectRelativeThingy(ThingyHolder.WhatsMyThing(IndexKeeper.Get(), OtherWhatsIt.BuildOther(MoonCheese.Intensify(true))))
It's probably a result of the letterbox effect of widescreen monitors and IDEs with horizontal toolbars and status bars.

Is there anything wrong with expanding the idea of 'one purpose per function' into 'one purpose per line' and splitting it all up into multiple lines with temp vars in between?

This gives you the benefits of:

- short lines and simple ideas

- easier to set break points

- allows for adding logging between calls.

- allows for setting debug watches on the temp vars

- the compiler tidies any temp vars away so there shouldn't be any extra copying.
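A sketch of the transformation in Python, with trivial stubs standing in for the deeply nested calls above (all names and return values invented):

```python
def get_index():              return 3
def intensify(flag):          return 10 if flag else 1
def build_other(level):       return level * 2
def whats_my_thing(i, other): return i + other
def relative_thingy(thing):   return f"thingy:{thing}"

# One line, five calls: hard to skim, and one breakpoint covers everything.
nested = relative_thingy(
    whats_my_thing(get_index(), build_other(intensify(True))))

# One purpose per line: every temp var is a place to break, log, or watch.
index = get_index()
intensity = intensify(True)
other = build_other(intensity)
thing = whats_my_thing(index, other)
split = relative_thingy(thing)

assert nested == split
```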

I like Raymond Hettinger's idea of one concept per line. Sometimes that means having two or even three functions on a single line, sometimes one.

Would recommend reading 'Code Complete' over this article

He does reference it in both the first and last paragraph!

My fave is Clean Code... though Uncle Bob seems divisive these days for some reason.


The article is well-intentioned but misses the point in a few places. For example, the suggestion to have, in an MVC project, three top-level directories: one each for models, views, and controllers. This works fine for small projects, but larger projects can see significant benefit by keeping related code together. As with everything, it's a judgment call. The simple "one folder per type of thing" rule may not be applicable in all situations.

It's been coined the LIFT Principle - aka "Folders-by-Feature" vs. "Folders-by-type".


Unfortunately, since the simple examples use folders-by-type, and many people are unable (or unwilling?) to think for themselves, they just continue that way of doing things, even in giant projects. Hell, EmberJS bakes folders-by-type into the framework.

Yeah, where do you add queues, events, observers, helpers, traits, interfaces, etc.?

Each of those should be a separate module uploaded to NPM. That way the versions can vary independently (for max compatibility).

This technique is named 'carbo-loading' (because of all the spaghetti).

It's not about understanding a line of code - anybody can do that. It's about absorbing kilo-lines of code and gaining an understanding of the whole thing, without having to give close attention to every line.

I can read code by the page, hitting next page at about a 1Hz rate. If the code is not overheated. That means, avoid lots of syntactic bloat, keep it concise, keep it modular, with low branching. Just about what the OP says.

One thing I appreciate about .NET is the conventions around project structure. When digging into an ASP.NET project I know where I can reliably find assets, models, views, and controllers. I also like conventions like prefixing interfaces with "I". Little things, but they make diving into random codebases easier as a Java developer.

+1 for avoiding frameworks and extensions that don't play along with the IDE. I've only recently realized just how much the supposed productivity gains of those tools evaporate when your workflow gets bogged down.

Very good

Until you have some code reviewer that thinks otherwise because of some "stupid reason" and you can't get around their hard heads.

One example: breaking an 81-char line because it goes over the limit and getting two shorter lines that are awful to read

So yeah I'll go for this when I'm working with reasonable people

So, what do you do if you get a job with Java or iOS and 30++ char method names?

(Not being snarky; I would like to do some iOS development for fun but want lines under 78 chars, so they don't go over 80 in a diff.)

Sorry, it's just not practical to keep lines under 80 characters in some languages.

I do iOS development in Swift and I generally don't exceed 80 characters. It was much harder to stay under 80 with Objective-C but still usually possible.

Everyone else on my team is in love with writing 120+ character lines, which makes me sad.

Good advice, but it still seems like just cosmetics compared to the real brainkiller: (incidental) complexity.

It's kind of how legible handwriting is desirable, but not really the most important factor, when it comes to the amount of brain required to solve a math problem.

> There are four main concepts I will talk about here.

There are 5 concepts in the article. Off-by-one error? :)

Cognitive load is a harder problem than we give it credit.

Is there a program that rates other programs in terms of Flesch-Kincaid score or the equivalent? The Old Man and the Sea has an F-K score of 4. Code isn't much different from other writing. I prefer to read code written like Hemingway.

Some good tips in this article but for me the problems usually start when I have to tie two or three APIs with different coding conventions together.

Nothing about division of concerns and organization?

Even the first example is a bit silly. Nobody is going to not understand that (null == foo) is the same as (foo == null).

It's easy to call out trivial examples because they're trivial, but you're missing the forest for the trees. His point is that the cognitive load adds up. One (null == foo) probably won't slow you down noticeably, but the more of these tricks and workarounds are scattered throughout, the more time you'll need to spend reading and understanding the code.

Sometimes the workarounds are necessary for performance reasons, or they're just good practice to avoid common pitfalls. But in a lot of languages they aren't, so unless you have a good reason for increasing the overhead, it's better to lean towards readability.

Edit: Also, for this particular example, a good linter is all you need to warn you about an accidental assignment. If you don't have a linter, sure, then put null first, but it's important to recognize that this sacrifices a small amount of future scannability for the more immediate avoidance of a bug.

It does increase the cognitive load for no particular reason. Which is what the author is trying to avoid.

It's the difference between, say, length(foo) and foo.length() - any difference in cognitive load should be lost in the noise next to "this variable can be null".

And given most languages in use today use = to mean assignment instead of equality, it's hardly "no reason".

Except that if you do if(foo = null){} it won't compile in Java, which is exactly what you want to happen. So using it in Java is just cognitive load and provides zero benefit.

Habits are hard to break. Switching back and forth between habits leads to mistakes. There's a trade-off here. I'm not going to say one is definitely worth it vs. the other, but if we pretend the disadvantages of one option don't exist, we lose the ability to make appropriate judgements.

I guess I didn't think about people using C/++ and Java together regularly.

Also, it can still catch bugs even in JavaScript. It's not related to C at all, as the original article states.

The equality operator is commutative (sans operand side effects). It's not a quirk.

It's easier to use tools or an IDE.

  $ jshint test.js
  test.js: line 3, col 10, Expected a conditional expression and instead saw an assignment.

  1 error

Agreed. I understand the desire to make the variable of interest more prominent, by putting it first, but that particular bug is well worth defending against.
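For what it's worth, the accidental-assignment bug being defended against is easy to reproduce in JavaScript (hypothetical variable names; without a linter, the Yoda order is the only thing that turns the typo into an immediate error):

```javascript
// The bug: `=` where `==` was meant. This parses and runs, silently
// clobbering the variable; the branch is simply never taken.
let foo = "loaded";
if (foo = null) {          // assignment, not comparison
  console.log("unreachable");
}
console.log(foo);          // null — the "comparison" destroyed the value

// The Yoda order turns the same typo into a hard failure at parse time:
//   if (null = foo) { ... }   // SyntaxError: invalid assignment target
```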

Isn't this just a fancy phrasing for "make your code easy to read"?

+1 This whole blog post reeks of elitism and overconfidence.

The blog post overstates its own importance, as if clean code would be easy after learning these. Like fad diets, magic paradigms, etc., it's another example of someone believing, or trying to convince others, that there is a "magic path". "Just remember these four concepts", "get a six pack with just 20 minutes a day"...

There are a few good tips in here but really nothing new. A nice reminder that there are some simple steps to help improve your own code.

Much of the devil of clean code is not in the nitty-gritty "have one line per logical action" but rather in the deconstruction of a problem into easily followable steps.

In that way this is definitely not an exhaustive description.

The best way to minimise the cognitive load is to always use DSLs.

Especially if there are objects that correspond to something real.
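As a hypothetical sketch (invented names, not from any real library), an internal DSL in JavaScript can make a sort specification read like the domain language instead of comparator boilerplate:

```javascript
// A tiny fluent builder: "order items by key, descending" reads as a sentence.
function order(items) {
  const keys = [];
  const api = {
    by(key) { keys.push({ key, dir: 1 }); return api; },
    descending() { keys[keys.length - 1].dir = -1; return api; },
    run() {
      // Stable multi-key sort driven by the accumulated spec.
      return [...items].sort((a, b) => {
        for (const { key, dir } of keys) {
          if (a[key] !== b[key]) return a[key] < b[key] ? -dir : dir;
        }
        return 0;
      });
    },
  };
  return api;
}

const people = [{ name: "Ann", age: 30 }, { name: "Bob", age: 25 }];
const sorted = order(people).by("age").descending().run();
```

The point is not the sorting itself but that the call site mirrors how a domain expert would describe the intent.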
