> “Good code is simple” doesn’t actually say anything. [...] What we call “simple” depends on our experience, skills, interest, patience, and curiosity.
I like Rich Hickey's stance on this: "simple" is objective (antonym: "complex"), whereas "easy" is subjective (antonym: "hard"). Easy depends on skills, interest, patience, curiosity - but simple does not. Simple is about lack of intertwining of concerns. About writing code that can be understood without "bringing in" the entire codebase into your mind. Like, you can understand a pure function just by looking at it (it's simple). If it modifies or uses a global variable, now you have to look at all the places that "touch" that global variable, in order to understand what your function does; the code is thus much more complex.
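To make the intertwining point concrete, here's a minimal sketch (Python, with made-up names):

    # The pure version can be understood in isolation: inputs in, value out.
    def total_price_pure(items, tax_rate):
        return sum(items) * (1 + tax_rate)

    # The global-state version cannot: to know what it returns you must
    # track down every other place that touches TAX_RATE.
    TAX_RATE = 0.2

    def total_price_global(items):
        return sum(items) * (1 + TAX_RATE)

    assert total_price_pure([10, 20], 0.2) == total_price_global([10, 20])

Both compute the same thing today, but only the first stays simple as the codebase grows.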
With that definition - it's absolutely correct to say that the code should be as simple as possible. And that writing simple code is typically very hard.
This is one of the things that get all weird the more I think about it.
Trying to define an objective term like "simple" is of no use if you can't map it to something in human perception. Say we have some all-knowing oracle that can tell us which algorithm for problem X is the simplest one. If nobody agreed with it, what would be the point? Or, if 51% of programmers said A is simplest while the other 49% considered algorithm B simpler, should we just go "yay, majority" and call it a day? In reality we have no way of knowing which one is objectively simplest, because in this domain at least, we as humans are unable to make an objective judgment.
Is the simplest solution always the one with the fewest lines of code? The fewest classes? Singletons vs. static classes? A recursive approach vs. an iterative one? C-style linked lists (next-pointer in the data type) vs. a "proper" list class? In general: could we formalize this, i.e., come up with an algorithm to determine how simple code is?
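(If we did want to formalize it, a crude, hypothetical sketch might count structural features of the parse tree; note it completely misses the intertwining-of-concerns dimension:

    import ast

    def crude_complexity(source: str) -> dict:
        """Count AST nodes and branch points as a naive 'simplicity' score."""
        tree = ast.parse(source)
        nodes = sum(1 for _ in ast.walk(tree))
        branches = sum(isinstance(n, (ast.If, ast.For, ast.While, ast.Try))
                       for n in ast.walk(tree))
        return {"nodes": nodes, "branches": branches}

    print(crude_complexity("def f(xs):\n    return sum(x * x for x in xs)"))

Metrics like this are easy to compute and hard to trust.)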
Often, code that is easier to understand might be harder to work with, while more elaborate code with indirections and abstractions is harder to grasp (simply because it's more LoC/classes, just more reading) but, once you've gotten familiar with the code base, much easier to work with. So when you say objectively simple, do you mean "simple to understand so you can tell what it does", or "simple to understand w.r.t. maintaining, extending, and fixing"? And which kind of simple should I go for when writing code?
Python's PEP 20 takes this view of simple vs. complex, and also covers the idea of complicated.
> Simple is better than complex.
> Complex is better than complicated.
I've stuck by those two lines of PEP 20 since I learned about them. The object is to write simple code, but it's okay to write complex code if your goal is to avoid complicated (unreadable) code.
That is a strange and "unnatural" (for me) use of the word "complicated". Cynefin[1] defines 'complicated' as "known unknowns" and 'complex' as "unknown unknowns". The idea is that complicated things are non-obvious, but they are understandable (with expertise, you get the relationship between cause and effect). With complex, the full cause and effect cannot be known a priori; it's only observable after the fact.
That somewhat resonates with the definition that complex (code) is about intertwining - 'complex' is problematic because as things get more and more complex, you can have only a local understanding, and can't possibly hope to grasp all the implications of a code change. This suggests that "complicated" is something that is "somewhat complex, but still manageable with sufficient expertise/domain knowledge". I tend to use "complicated" to denote something that has inherent complexities (due to the nature of the real world), and "complex" to denote something that is artificially complex, due to... well, a lack of thought into making it as simple as it could be. In that sense, I view "complex" as definitely worse than "complicated".
I see it the opposite way. Some things are inherently complex, like heart surgery or launching a rocket. There are many things that have to work together perfectly or the whole thing goes south.
Complicated to me means that complications have been added where they weren’t needed. A Rube Goldberg machine solves a simple problem that has been complicated.
I've come across these meanings too, and they're the ones I use.
Complexity is a property of the problem; complication is a property of the solution. Whilst over-complication is bad (e.g. your Rube Goldberg example), so is being simplistic (not enough complications to handle the complexity).
Examples of simplistic solutions can be found in all of those "What programmers don't know about FOO" articles (human names, timezones, distributed systems, etc.)
That's a great point. I've given approximately zero thought to this, but would you say TDD (or any other requirements-driven design process) would avoid that kind of oversimplification?
> would you say TDD (or any other requirements-driven design process) would avoid that kind of oversimplification?
I haven't thought too much about how such things relate, but my first thought is that TDD seems biased towards oversimplification, since we do 'the simplest thing which makes the tests pass'. If we're testing that Bob's user profile shows the name "Bob", then the simplest thing is to have all profiles hard-coded to say "Bob". To fix this, TDD tells us to add more tests which check different names. That's before we've even begun to consider the hairy details of how names are used in different cultures.
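A sketch of that trap (hypothetical API, plain asserts standing in for a test framework):

    # Test 1: Bob's profile shows "Bob".
    # The simplest passing implementation ignores its input entirely:
    def profile_name(user_id):
        return "Bob"

    assert profile_name(1) == "Bob"  # green!

    # Only when a second test arrives are we forced to generalize:
    USERS = {1: "Bob", 2: "Alice"}  # hypothetical store

    def profile_name(user_id):
        return USERS[user_id]

    assert profile_name(2) == "Alice"

Each step is "the simplest thing", yet the real complexity of names hasn't even entered the picture.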
It's a reasonable approach, but it's not "solving" the oversimplification problem; it's just shifting it into the test suite. It's similar to the problem of over-fitting in machine learning models, where we may need millions of input/output examples to narrow down the intended behaviour.
I think the aggressive refactoring advocated by TDD manages to bias it away from overcomplication.
TDD should give you the simplest solution conforming to your spec (your tests). So you get all the necessary complexity, without any unnecessary complexity, to fulfill your test suite, plus ease of refactoring in case you need to change something more fundamental about your code.
At some point you need to specify which edge cases you need.
Hmmm... I looked up word origins. "com" + "plicare" - "to fold together". You seem to be right, it's closer to the original meaning to use them like you say ("complex" = inherent complexity; "complicated" = "incidental/ added complexity"). I can't reconcile it with Cynefin, but I'll have to update my usage of those terms in a coding context.
I've had this conversation with a few people - mainly peers with whom I'm discussing our craft. Most recently it was with a junior engineer, and I wanted to pin these definitions down precisely so that we had a common vocabulary for describing the ideas we were developing. I mention this because I haven't heard these terms commonly used in any particular context, so I don't think there's an agreed-upon "correct" definition of them. I think my definitions more closely jibe with the normal English meanings, though.
I'd say don't try to use Cynefin's definition to understand what the PEP authors mean. (Un)known unknowns is a pretty terrible definition of those words; it doesn't really match the common definitions at all.
Complex and complicated are, according to the dictionary, basically synonymous, and they mean something intricate or tricky that is composed of many interconnected parts.
I think the PEP authors were trying to say it's better for a piece of code to be clever and tricky to avoid complicated interactions with other code, if that's your only choice. I think they're trying to suggest complexity at a small scale is better than complexity at a large scale. But I'm just guessing, and I do think it could be better stated with two words that aren't synonymous.
In a uni course a while back they did define complex as something that can't be understood, and complicated as something that is understandable but hard.
So I agree with the above poster that the PEP makes no sense here.
Complex does not mean something that can't be understood. Like Cynefin, your uni course might also be using bad definitions, if they meant those definitions for the general context. If they were giving you domain-specific definitions, like in a math course or something, then your applying it to this situation might be your misunderstanding. We have dictionaries for just this purpose, so that we all agree on what words mean. Words also have multiple meanings, so while it may be true that what the PEP authors said makes no sense to you, it's best if we avoid egg on our faces by calling someone else wrong when the real problem is that we didn't take the time to understand what they meant, or assumed incorrectly that our understanding is the only one possible.
The common definitions of complex and complicated are that they're synonymous; there is no strong or fundamental distinction between them in the way you're trying to suggest. That's the main problem with the PEP statement. Trying to assert there's a difference is likely to result in miscommunication; this uni course and the PEP authors are both making the same mistake.
> complicated (adjective): 1. composed of elaborately interconnected parts; complex: "complicated apparatus for measuring brain functions."
It seems more to be based on personal agency. If you're going to make something, make something simple. But if you can't, it is better to make something complex (opaque by scope) than to make something complicated (opaque by architecture).
There is readability and there is understandability, which may or may not be the same. Just as with text: you may have text that is cramped, with very long lines and no line spacing, maybe no word spacing; it may be very hard to read, but once read it can be easy to understand. OTOH, you can have a nicely typeset text which is impossible to understand.
And if you talk about code there is also changeability, which is somewhat at odds with understandability. To make code easier to change you must move to abstractions, and more abstractions make code more difficult to understand, but easier to change.
> With that definition - it's absolutely correct to say that the code should be as simple as possible.
Yet, it is humans who need to read it. This is the end goal.
If you rewrite the code to satisfy a theoretical objective criterion of simplicity, but end up with something that people reading the code find harder to read, then you have failed.
But if you don't have an objective criterion of simplicity, how can you even hope to get universal agreement on what is "simpler"?
Humans are complicated beasts. Saying "people find the code harder to read" doesn't tell me what I should do. If I have no objective criterion, then it must be subjective, and then all "simple" means is "this is easiest for me right now" - which, from experience, I claim is a very unlikely path to code that multiple people agree is "simple". (Except if the problem was trivial to begin with - but then, who cares?)
"Which is the better painting, Van Gogh's Starry Night or Seurat's La Grande Jatee?" is a deeply subjective question.
"Which is the better painting, Van Gogh's Starry Night or a kindergartener's watercolor of their family?", on the other hand, appears to have a correct answer.
There are multiple competing definitions of "simple" with respect to code and multiple kinds of simplicity that often have to be traded off against one another, and reasonable programmers can and do disagree about which compromises to make and where. Code quality (including, but not limited to legibility) is in many regards a subjective endeavor.
That doesn't mean, though, that all decisions are equally justifiable. Trust your sense of aesthetics. It's usually trying to tell you something important, even if you don't know how to put it into words yet. With subjective endeavors there is no perfect, but there is usually better. Chase that.
In your first example, both are exceptional, so really "which is better" is irrelevant.
In the second, the difference is so big that it's obvious for anyone which is better, but that still doesn't make it useful for getting better at painting. "Paint more like a Van Gogh and less like a kindergarten kid" is not useful advice.
People do have art critics and manuals that attempt to write down exactly what made Van Gogh paintings great. Those writings are useful, and many artists do indeed study them.
Maybe I read too much into your original comment, but it reminded me of a pitfall that I see too many engineering types fall into, namely the idea that because something can't be formalized it must be arbitrary, unimportant, or both. In particular, this seems like a counterproductive way to frame the issue:
> If I have no objective criterion, then it must be subjective, and then all "simple" means is "this is easiest for me right now"
Here are a few other (actionable) things that "simple" can mean.
* seems to reliably be easiest for me every time I look at it
* was judged to be better by the three or four people I could get to give me an opinion
* follows a heuristic that a bunch of smart people all seem to agree generally makes things better
* mirrors whatever we did on this project the last time we were in this situation
Even if all it means is "easiest for me right now", that's still probably going to get better results than "the first thing that popped into my head that might work".
There is no objective definition of simple. On behalf of the maintenance programmers of the world, please don't take that to mean it's not worth trying anyway.
Oh, I agree with that. What I meant was that "this doesn't mean there can't be objective criteria about what is simple and what is not; or that it's futile to attempt objective definitions of 'this is simple/this is not' ".
"Easiest for me right now", in my experience, only leads to simplicity in trivial cases. Simplicity requires intense thought & explorations, it's never easy (except, again, in trivial circumstances).
What's the alternative though? Write your code such that people subjectively find it easy to read? Then you have to know the set of all people who are going to read your code AND how their internal (and possibly inconsistent) "easy" function works AND those people can never change personally AND no additional people can be added to that set unless they match someone else's "easy" function AND the sum of all the readers' "easy" functions have to result in at least some chunk of code being "easy".
At least with an objective criterion, anyone who doesn't understand your code has something concrete they can get better at in order to understand it. And that is something everyone can synchronize on, from now and here to infinity and everywhere.
I don't think you can solve complex problems and guarantee the solutions are easy to understand. (You've hit the nail on the head.)
But you can absolutely solve many complex problems without straying outside of the set of simple arrangements or simple configurations of purely simple solutions. It may still be hard to understand. It was probably hard to write, too.
It should not be hard to refactor, if at all possible, when the scope of the problem has predictably changed later on.
Well, humans reading your bit of code is not the only end goal.
My intuition tells me that when I've refactored something to be simple-but-hard it ends up making more maintainable code. There is less state being maintained, fewer lines of code to maintain, less coupling. Yes, it requires a bit more time to understand, but there is also less stuff to understand. Bad code and bad decisions snowball into bad projects.
Yes, simply optimizing for less state and fewer side effects, and letting the chips fall where they may, is 80% of the benefit. Most of what I do is realtime and/or robotics. Basically, variations on:
    power_on_self_test()
    while True:
        recover_from_everything_that_has_gone_wrong_so_far()
If the only rule you enforce is making sure every piece of state has a damn good reason for existing, life is improved.
OK, on re-read this sounds more like a rant and less like a philosophy of life. Oh well.
My intuition tells me that good code is when I reach a local minimum balancing coupling, amount of code written, expressive density of the code, clarity of purpose, and the ability to fail quickly under invalid conditions.
There are often times when I could make the code simpler but it would be longer and vice versa and my decision about which one to pick is purely personal.
https://www.sandimetz.com/99bottles/ is all about that. It starts with a few different ways to solve the same problem and then goes into comparing them using chosen criteria.
And then it goes on to implement the solution using TDD.
Very underappreciated book, imho.
Simplicity is about the cognitive overhead to understand code, the simpler it is the less effort it requires to reason about.
That doesn't mean you don't have to learn anything to understand it. Going the easy route usually means dumbing things down, and while it can help reading in the moment, it definitely doesn't help in the long run as the code evolves over time.
I agree. Idiom plays a huge role. Example: Python list and dictionary comprehensions can often make the result clear much more easily and succinctly than the equivalent loop. But you have to learn Python comprehensions and the common idioms. And it is possible to take it too far and write an overly complicated and obtuse comprehension that is difficult to reason about.
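For instance, a small sketch of that tradeoff:

    words = ["simple", "easy", "complex", "complicated"]

    # Loop version: more lines, more intermediate state.
    long_words = {}
    for w in words:
        if len(w) > 6:
            long_words[w] = len(w)

    # Comprehension: clearer and more succinct, once the idiom is familiar.
    long_words_2 = {w: len(w) for w in words if len(w) > 6}

    assert long_words == long_words_2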
The goal should be code that is easy to reason about, given a reader who is familiar with the popular idioms. I think that for most languages, thoughtfully choosing good names and writing idiomatic code goes a long way to reducing the cognitive load in reasoning about code.
I also find that Python's syntax helps readability, and that also decreases cognitive load. Over the holidays I spent some quality time reading up on Rust and starting to climb the learning curve. I'm still at the stage where I find Rust's syntax jarring. It looks like so much ASCII salad. I can't help but think that a little more attention paid to natural syntactic flow would make Rust code easier to reason about for the long haul. Of course, I'm only at the beginning stages of learning Rust idioms -- a fair analysis can only be done after I have learned them well. But still, it looks like Rust values screen real estate over readability, which is a misguided optimization IMHO.
We chose a lot of Rust's syntax to be familiar to those who use "curly brace and semicolon" languages, and some stuff from functional languages. It's not really about brevity.
Fair enough. I get twitchy-semicolon-pinky when I switch back and forth between Python and C. I have much sympathy for Python core devs... I don't really do C++ because most of my C is embedded real-time, so dynamic data structures are not a thing, out of an abundance of conservatism. If I did C++, I'm sure I'd find Rust less jarring.
My excitement about Rust is mainly the potential for using Rust for embedded real time -- but I see microcontrollers still have only 3rd-tier support. It seems like Rust the language is aimed at exactly that problem, but Rust's build system and overall ecosystem are not embedded-aware. Unless there is a plan for xargo to get tier 1 support?
So... this isn't the thread for it but I have to ask... the only Rust reference I've seen to list/hash comprehensions is somebody who did a macro. So... is that the idiomatic way to do a list comprehension in Rust? Or is Rust going to make comprehensions first-class syntax? It seems to me that if it can be done in a macro, it isn't a big leap to make it part of the language.
The plan is still to "unfork" xargo, that is, to put its functionality in Cargo. We'll see how long that takes. Many people are advocating for 2018 to be focused on embedded in some way, so we'll see!
I'd imagine most people reach for iterators when it comes to list comprehensions. If the crate gets popular enough, there might be a thing, but I don't even know the name of the crate you're talking about, so I don't think it is.
Pulling xargo functionality back into tier 1 mainline would be an excellent piece of enabling infrastructure for all embedded work.
I think a challenge with embedded is that the platform landscape is highly fragmented. I would advocate for picking one target CPU and making it tier 2, let the rest stay at tier 3. Pick one that has a cheap development board to use as a reference platform. Don't even try to test everything in the cpu/platform matrix -- just pick one to be the pilot, and the rest will benefit from at least having that one big snowplow clear the road in front.
My specific suggestion would be an Arm Cortex-M4 with hard float of some flavor. That clears the road ahead for a lot of hardware of general interest. ST has quite a few easily available dev boards at reasonable prices, but that is just what I am most familiar with -- there are certainly others. I am also a big fan of Micropython, and the reference board for that is the Pyboard, which might be a good inexpensive choice also, not to mention a good way to spread the open source and open hardware love. It would be easy and reliable to flash a Pyboard back and forth between Micropython or some Rust-on-bare-metal image, I would personally be a fan (that's only one data point, but it's mine :)
I've understood "simple" in coding to mean "with the fewest abstractions possible."
This. Is. Hard.
It does help when the language is simpler. But as I code, when I find the need to deal with more exceptions, more functions, and more words just to get something to work... I just assume I should do it differently.
Unfortunately, that's most of the time I spend coding.
The author consistently compares code language style to literature language style, but I don't think it's a helpful comparison at all.
Literature operates by an intractably complex interaction between the words written on the page and the mind of the reader. Exchanging a single word for another may totally alter the effect of a passage, based on factors like the meaning of the word, the associations the word has for each reader, and the aesthetic appearance and sound of the word.
Moreover, we have no objective way of measuring the effect of literature. We don't have a test that can measure how much more empathetic I become after reading a novel, or how much more I understand the human experience.
Computer code typically has measurable outcomes. While "readability" may still be a relatively subjective measure, we can at least say "these two versions of this function are equivalent" if the outcome is the same - the array is sorted correctly, the transaction amount is calculated correctly, the graph is rendered with the same pixels.
This means that editing computer code is nothing at all like editing literature. The cliff-notes version of Macbeth can never produce the same effect in the mind of the reader as reading Macbeth -- but a function that is subjectively easier to read can be shown objectively to produce the same effect as the original function.
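For example, a rough sketch of checking that a more readable rewrite preserves behavior (random spot-checks, not a proof):

    import random

    # A deliberately verbose sort, standing in for "the original function".
    def insertion_sort(xs):
        out = []
        for x in xs:
            i = len(out)
            while i > 0 and out[i - 1] > x:
                i -= 1
            out.insert(i, x)
        return out

    # The "easier to read" rewrite here is just the built-in sorted().
    for _ in range(1000):
        xs = [random.randint(-50, 50) for _ in range(random.randint(0, 20))]
        assert insertion_sort(xs) == sorted(xs)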
Good point, I agree with everything you wrote. I didn't intend to compare code language style to literature, maybe I failed to make that clear.
I tried to make the more subtle comparison that some programmers react reflexively to code that doesn't look familiar to them and call it unreadable, the same way that a person unfamiliar with Shakespeare and unprepared (or unwilling) to make the effort to read it will dismiss Macbeth as hard to read.
A person new to Shakespeare picking up Macbeth will find the book challenging because of the antique language, the poetic form and dramatic devices, maybe the historical setting and allusions. A person new to a body of programming code may react the same way, for similar reasons. A reader new to Shakespeare probably knows that the problem lies in their own abilities and patience, not in the book. A programmer new to a body of code will tend to blame the code, in my experience.
> A programmer new to a body of code will tend to blame the code, in my experience.
The key thing I disagree with about this essay is that when it comes to code, there isn't really any reason ever to have anything other than The Hungry Caterpillar or Puppy Peek-a-boo.
There will obviously be differences between projects, but within a single API or whatever, there's no reason why every endpoint shouldn't be coded in exactly the same way. E.g.:
1) Validate user permissions
2) Validate user input
3) Validate business logic
4) Persist results
5) Return data or an error
Within a given project all variables and functions should be named consistently, there should be a consistent style of error handling, etc. Once you've read the style guide and understand how one endpoint works, you should be able to understand how every endpoint works. IMHO saying "I don't understand this, rewrite it" is always the most valuable code review one can give or receive.
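A framework-free sketch of that five-step shape (every name here is invented for illustration):

    def update_widget(user, payload, db):
        # 1) Validate user permissions
        if "widgets:write" not in user["permissions"]:
            return {"status": 403, "error": "forbidden"}
        # 2) Validate user input
        if not isinstance(payload.get("name"), str) or not payload["name"]:
            return {"status": 400, "error": "name is required"}
        # 3) Validate business logic
        if payload["name"] in db["widgets"]:
            return {"status": 409, "error": "name already taken"}
        # 4) Persist results
        db["widgets"][payload["name"]] = {"owner": user["id"]}
        # 5) Return data or an error
        return {"status": 200, "data": db["widgets"][payload["name"]]}

Once one endpoint reads like this, every endpoint should.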
If you hired someone to advertise your product on TV and viewers weren't able to understand the ad, you'd probably fire your ad team immediately. I don't see why developers should be held to any lesser standard.
Good writing is not necessarily readable writing. Shakespearean works are good writing, but they are most certainly not easily readable. There is code that is great code for many reasons other than readability (like performance, correctness, robustness).
Easy to read writing is simple and concrete. Code that is easy to understand is also simple and concrete.
Complexity and abstraction repel understanding like the matching poles of two magnets. But sometimes readability is worth sacrificing to achieve a separate goal with the code.
They would have been easy to understand, when performed on stage, by folks who were alive at the time they were written. The only unfamiliar words would have been the ones he coined, but most of those were easy cognates from Latin. If Shakespeare wrote the same works today he'd be a terrible writer.
His comparison to literature falls short in another way: he only considers the great writers. Sure, Dostoevsky is hard to read. But most writers aren't Dostoevsky: some writing is just hard to read because it's bad.
I think a better comparison would be to undergraduate essays. Most of the time the essential points are there, but if the author doesn't know how to write coherently the result is confusing.
I don't think that software and literature are inherently different, but I do agree that this post tries to stretch this metaphor beyond its limits.
The first 4 of 6 proposed meanings of "unreadable" are some variant of "It's objectively fine but I don't happen to like it". I wonder if his experience is one where one sees a lot of high-quality but foreign-looking source code.
The article says: "By analogy, plenty of people find reading Homer, Shakespeare, or Nabokov difficult and challenging, but we don’t say “Macbeth is unreadable.”"
When I say some code is "unreadable", I'm very much not saying "It's like that Scottish Play". Shakespeare had a reason to create a convoluted plot for his story. He specifically edited it to be like that. Unreadable source code is usually (IME) a historical accident, often caused by multiple programmers on the same work. Give 3 writers a copy of one of Shakespeare's plays and tell them to each, concurrently, add a new character to the story, and you're not going to end up with a play of equal quality as the original.
I'm not sure I could name a significant program in use written by a single author (Redis might be the closest, in spirit), and some have dozens or hundreds, yet the average book on my shelf here has between 1.0 and 1.1 authors. It's not surprising, to me, that the average program is an incoherent mess, compared to the average book.
Perhaps "Literate Programming" is a good idea, but we've been missing the point. What qualifies a work as Literature is not great typesetting and hardbound publishing and complete sentences, but something simpler: single authorship.
"Should we strive to satisfy the Shakespeare for Dummies demographic?"
My understanding (and I admit it's been a while since I've studied Shakespeare) is that Shakespeare was written for the commoner. The study guides I've seen for it are almost entirely spent defining words that haven't been in common use in 100+ years, and the occasional cultural note. (When's the last time you spoke a word that ended in "-'st" or "-eth"?) If you gave me some FORTRAN-IV code I would struggle at first to understand it, as I do with Shakespearean English, not because any of the concepts are hard but simply the dialect is unfamiliar.
I can tell you my experience. I see a lot of foreign code, since I work almost exclusively on legacy code (not necessarily old, just not supported by the original developers). Some I would call low quality, most not that bad. I have to spend time and effort understanding the code, even the code I would call high quality.
I get my customers after their original developers have gone, and after multiple other programmers have deemed the code unreadable and unmaintainable. My value proposition comes down to putting in the effort to understand and work with code that every programmer who looked at it before me wanted to rewrite. In ten years doing this kind of work I've only told one customer to start over from scratch.
Fred Brooks talks about "conceptual integrity" and the detrimental effect of too many designers and programmers in The Mythical Man-Month. If you read about the development of Unix and C at Bell Labs you come away with the same impression: a small group of like-minded programmers who achieved conceptual integrity. When only one programmer writes code that integrity usually shows. The more programmers involved the harder maintaining conceptual integrity becomes.
It's useful to hear about your background, and based on your article, it's not too surprising. I'm beginning to believe that there's no such thing as a "typical" programmer (no offense!). I know people who write software for aerospace, political dissidents, major web services, healthcare, weekend hackathons, embedded systems, etc., and all of these have completely different kinds of requirements.
Brooks' book is on my shelf (it's one of the N=1 cases!), but I've come to the realization that I'm the only one who actually believes what he says. More than one manager has had it sitting on their desk when they told my team "This project is at risk of running late, so we're adding some additional people..." Never have I ever worked for a manager who agreed to implement anything in it.
Conceptual integrity is an attribute we really have no metrics for (AFAIK), which means it will naturally suffer as a project grows. That's probably why so many managers have no problem sacrificing it for short-term speed. It's a tragedy of the commons. You get credit for adding features, and for creating a new project to replace an old one, but you never see the downside from the gradual removal of conceptual integrity which caused the creation of the replacement project in the first place.
It's also a dramatic difference from literature. When George R. R. Martin is late with the next "Game of Thrones" volume, nobody suggests adding more authors to hit an arbitrary deadline. We'd rather let the schedule slip by a few years than sacrifice any conceptual integrity.
My experience with Brooks’ Law matches yours: More honored in the breach than in the observance. I also have worked for managers who display TMMM in their office but add people to projects to meet a deadline.
I think decisions like adding more people and sacrificing conceptual integrity and quality reflect a human bias to favor short-term results and to discount or ignore possible future costs.
It's easy to make measurements of literature. It just turns out that word length and sentence structure and number of chapters aren't that interesting. Most of what gets measured when it comes to code is similar, it's just that it is easy to automate the measurement.
The sorts of things that are hard to measure for literature, like impacts on the lives of individuals, are also hard to measure in code. So code measures what isn't hard rather than what is important (and tends to elevate what is easy to measure to importance as best industry practice). To put it another way, we don't go around measuring how much more empathetic I become after checking in on Yelp, snapchatting my BFF, or listening to fifteen minutes of Ginger Baker Radio on Pandora.
Maybe it is just because our expectations for code are so much lower. And maybe our expectations are so much lower because we don't treat code as literature. We accept code that doesn't change us and our view of the world as good code. Sorting an array quickly is held up as good code despite not having much impact on how we look at the world.
> Exchanging a single word for another may totally alter the effect of a passage
I think that describes code perfectly.
> but a function that is subjectively easier to read can be shown objectively to produce the same effect as the original function.
I think you lost me here.
I've rarely seen two implementations of an array sort that have the exact same effects. Things like how fast it is and how much memory it uses are important to me.
How many characters are on the screen is also important, although certainly in more of a subjective way.
> The cliff-notes version of Macbeth can never produce the same effect in the mind of the reader as reading Macbeth
And the bubble sort doesn't produce the same effect as quicksort.
You seem to be talking about algorithmic differences. Most of the time, when I see someone talking about code readability, they're not talking about the algorithm. The same exact algorithm can be implemented with varying degrees of readability.
>> Exchanging a single word for another may totally alter the effect of a passage
> I think that describes code perfectly.
I wasn't very clear here: what I meant was that exchanging a single word even for its closest synonym will always change the effect of a passage to some degree, because the intrinsic aesthetic qualities of the word inevitably alter the effect of the word.
This isn't true for code, where refactoring a line of code might change how readable that line is, while conceivably having an identical effect when the line is interpreted, compiled or executed.
That the computer cannot receive joy from code does not mean that I can't; that the computer doesn't experience joy from Macbeth doesn't mean that I don't.
Reading joyful code is a treat. I pity anyone who has not read some code that brought them joy. And yet, what brings me joy[1], [2] is what the peanut gallery may call unreadable. What then?
[1]: www.nsl.com/papers/origins.htm
[2]: cr.yp.to/qmail.html
Surely something else must be going on.
Kernighan and Plauger's book, while otherwise dated, has one of the best rules for programmers:
Say what you mean, simply and directly.
Assume the best of the writer of the code you are reading; assume this was them saying what they meant, as simply and directly as they could.
Was it clear in their mind?
Or does it feel like they were muddling through the problem?
If, as I am decoding this foreign thing of other people's code, I find my mind would repeatedly put it simpler, cleaner, and clearer, I can complain about the tedium and mindless repetition in the code, but I'm still not complaining about its readability.
This is a thought-provoking and well-written blog post about programmer biases when it comes to reading, and judging, other people's code.
The author is making the point that code readability is ultimately in the eye of the beholder. I've come to share the author's views, and I have to say I don't hear it said much in programming culture. At most places I've worked, there's this culture of constant refactoring under the guise of "continuous improvement," when really, if you look closely, it's really motivated out of disdain for the last developer's programming style, and in my opinion, a general aversion to reading code.
Reading code is about 10x as hard as writing it. It takes more concentration, it's less fun, it's harder, and it doesn't impress anyone. You have to know the language better than the person who wrote it, because not only do you have to understand why the code does what they intended it to, but you also have to understand why the code does other things they didn't intend (a.k.a. bugs). But in my experience, you save your team a lot more time and energy in the long run by preferring to read and understand existing code.
Alternatively, we refactor because the right abstraction for the previous phase of the project is not the right abstraction for the current or next phase of the project.
While I do have a respect for Chesterton's Fence as a concept, sometimes the answer to "why is it this way" is "we were learning as we went, and if we did it again, we'd do it another way."
I look at it this way: When you look at an older city built before the age of the car, they were built to be tiny to start, not more than a set of shacks. As the town built wealth, the buildings went to more sturdy, to multi-story, and to more ornate structures. Similarly, our code should start simple and dirty, then cleaned up as it has proved its worth, and then refactored to more robust patterns as the code has built the wealth (and demand) to justify it.
We should consider rewrites, then, as a sign of value, rather than as a sign of the previous programmer's failure.
> sometimes the answer to "why is it this way" is "we were learning as we went, and if we did it again, we'd do it another way."
This is true! In fact there's a lot of that, in my own experience. Rewrites are probably most useful on code you wrote, rather than on someone else's, and right when you realize what went wrong, while you're still intimately familiar with the old code.
I've watched two different companies follow the "things will be so much better if we rewrite" logic and rationalize large scale rewrites that cost millions, and years, and failed to achieve the aims of doing it better the second time.
Rewrites should be considered a sign of value only if we actually learned from our mistakes, only if both everything wrong with the old code and everything right with the old code are well understood. If you rewrite anything substantial before that, you're just guessing, and you're most likely (in my experience) going to suffer taking longer than you want and making the same mistakes again. I've seen that happen to many very smart people.
So there's a balance. Rewrites are sometimes valuable, but not automatically valuable. Sometimes rewrites are very harmful. The best chance you have of knowing which one is to read lots and lots of code before you start, to make absolutely sure that the code you're replacing is never being replaced only because the readers didn't understand it or didn't like its patterns.
OTOH, if you have complete test coverage in place before a rewrite, you can freely annihilate and redo large portions of code without having to study too hard.
I've seen smart people fall into the following trap:
1. Previous developers did X and X is bad, therefore, their code needs to be rewritten without doing X.
2. Oh crap - turns out, the new code has all these requirements to match the old code's functionality in ways we didn't expect (letters A-W).
3. Okay, so we're doing Y and Z in the rewrite, knowing it's pretty bad, because we didn't know we'd have to do A-W and now we're short on time. Oh, and parts of the code still do X.
Now, wait a year for one third of the team to leave because they're rushed, overworked, burnt out, going in a different direction... and another third to get laid off because the project went way over time and budget...
4. Previous developers did X, Y, and Z, and X, Y, and Z are bad, therefore, their code must be rewritten...
That's the point of Chesterton's fence analogy: Don't rip down the fence until you know why the fence was put up.
I think we're in agreement on this - if you don't have the tests, you don't know what the system does. The Michael Feathers approach is my favored path forward in these cases. Rewrites are more valuable in the small (class-level) than in the large (application-level) in the vast majority of cases. And if you absolutely need to replace an application (say, your company standardized on Oracle and Tcl and you can't hire any new developers because they laugh when you tell them your stack...) you do it piecemeal, building tests in your old system so that you can reliably replicate the functionality in a way that functions as a living, reliable spec.
> sometimes the answer to "why is it this way" is "we were learning as we went, and if we did it again, we'd do it another way."
True, and that's believable when it's people rewriting their own code, but sometimes you'll have people who didn't even try to understand the existing code express a desire to rewrite it.
I suspect this is as culturally determined as the differing American, European, and Japanese attitudes to rebuilding buildings. European cities typically retain medieval street layout and as many buildings from that time as possible, sometimes even rebuilding to the original style after destructive events. Whereas Japanese houses have a ~30 year lifespan and are routinely rebuilt. And I'm looking up the hill at a castle that accreted between the 12th and 17th centuries.
Just as the street layout will outlive the buildings, APIs tend to be extremely durable and intolerant of destructive rewrites. Consider the Python transition.
There is a lot of context you've built up in writing that code that could never fit in the comments (even if those thoughts could easily be expressed in words, they would dwarf the code and not directly correspond to it; comments can actually make code reading harder in this way). It really isn't about the language either, at least not the programming language, but about how the problem was defined and understood in the first place, and how this understanding was encoded in the software.
Reading code is basically trying to reverse engineer the thinking of the programmer by looking at second order output. Of course it will be hard! It isn’t just style, nor would I say mainly just style. Continuous improvement is often just a matter of rebuilding the context that was lost with the last programmer.
> There is a lot of context you've built up in writing that code that could never fit in the comments
Isn't that precisely what Knuth was trying to resolve when he came up with the idea of literate programming[1]? The fact that you might end up with more words than code isn't a really problem if the end result is better (for some value of 'better') than just the code.
Yes. But those words don’t come for free, they could be much more expensive than writing the code itself, it’s like trying to teach something rather than just doing it.
> it's like trying to teach something rather than just doing it
If you work on a team a lot of your time is spent 'teaching' (explaining) your code to other developers. Or teaching yourself about it when you come back to something you wrote 6 months ago. Or 'teaching' a QA person your logic to understand where a bug is coming from. Or using your code to literally teach a concept to a junior developer.
Writing documentation can feel like teaching rather than just doing, but that's not necessarily a bad thing if you're working on something that other people need to understand.
This is not the right model for assessing the cost of documentation. If the program is in any sense designed (as opposed to being assembled and modified on the basis of hunches until it appears to work), then the ideas expressed by those words must have been known no later than the completion of the work. Therefore, the cost of documentation is that of writing down these ideas, and the cost of not documenting is the cost of repeatedly reverse-engineering them from the code.
It is my experience that you can markedly improve the speed and accuracy with which a newcomer can understand a code base with supplementary documentation of considerably fewer words than are in the code itself - so long as those words are well-chosen, and focus on the programmer's intent.
You mean waterfall, right? Yeah, then I guess. If the program is understood before it is written (waterfall), then you merely write down these ideas alongside the code. If programming plays any part in evolving the design (as you say, "hunches" that go into a feedback loop), then this will break down quickly.
Edit: never mind, you mean afterwards. But is the assumption that design can occur independently of programming really true in practice?
It depends on the complexity of the problem you are working on. Something well understood before programming starts has a better chance of being well documented with short prose (because it is well understood, a lot of shared universal context can be relied on). There are lots of things out there that don’t meet this criteria, however.
My description of blind trial-and-error programming is not a veiled reference to agile development, if that is what you are thinking - it is, rather, an anti-pattern in development that is neither agile nor waterfall. There is nothing in agile that says you should just try things until something seems to work. No line of code gets written without the programmer of having an expectation of it making some contribution to the solution, and the issue is how well-founded that expectation was.
Update: Perhaps the canonical example of non-agile trial-and-error programming without well-founded expectations is the programmer who is putting delays in various parts of his program in an attempt to fix a concurrency error.
One's understanding of the program does not necessarily break down under iterative development, as, if you are doing it right, each iteration improves your understanding.
The usefulness of documentation does depend on the complexity of the problem, but in the opposite sense: programs solving simple, well-defined problems do not benefit much from additional explanation (there's not much to say that is not obvious), but the more complex things get, the more it helps.
There are plenty of cases where programming helps you explore the design space, where you have little knowledge about the APIs you are using, so you poke them a bit here and there, gaining experience in how to use them in the way that you need (because, let's be honest, even the best frameworks have deficiencies in their documentation, if you decide to read the docs at all). Likewise, it isn't that weird to write some code that you know is broken so you can fix it in the debugger, where live values and feedback are available. Heck, many people code from interpreters these days, which is as exploratory as you can get!
Of course, we can argue about different kinds of programming have different needs. Prototyping doesn’t require documentation and so can move much faster than product development, for example. The cost of not documenting is a huge win for the prototyper, allowing them to try out and throw away designs while worrying less about sunk costs.
The design has to come from somewhere, after all. A design team with prototyping resources really values those resources.
Your example is a case where small amounts of additional documentation can be useful. If you are working with a poorly-documented API and you find something that is unintuitive and non-obvious (it might be as simple as an arbitrary choice between equally valid design options, where exactly one needed to be chosen) but which matters in what you are using it for, then a note of that fact could save a lot of time in the long run, depending on what you are coding for: if it is just for yourself, then you only have to consider what is best for you, but if it is an actual product where it is likely that others will have to understand it, that note might pay for itself many times over.
I sometimes describe programming as a one-way hash operation on requirements. A lot of information and context gets lost when writing software, and I haven't seen a workable solution to that problem yet.
This is closer to the root of the problem than saying code is unreadable... the information isn't lost because of the code being produced, it's lost because the developer leaves. Documentation won't work because requirements change faster than they can be documented. I don't know a solution other than just trying to convince that developer not to leave.
IMO that's part of why readable code is so important. If I can look at a piece of code and understand its behavior then I can know something. I might not know what stated requirement it was trying to solve, but I can know for sure what requirements it implements.
Compare that to sloppy code bases with side effects everywhere where you don't know what it's supposed to do or what it does.
The biases are real, but readability is perhaps less of an ultimate concern than maintainability and reusability. Both of these depend on the ability of yourself (and others) being able to understand and adapt the code. A bit of foresight in making the life of future you (or colleague) easier goes a long way. Technical debt is real too and the interest can be high.
I would say that readability is a facet of maintainability. If a good developer looks at code the first time and thinks, "WTF," it could use some better readability. That's about as nailed down as I can make it (because it's so objective).
Throughout most of my career reading and maintaining code got fobbed off on the new programmers and the less skilled. Maintenance programming has a bad reputation, partly because it requires reading and figuring out someone else's code.
Now I make a living reading and maintaining code no one else will touch. Lots of companies can't afford to rewrite a mostly-working system, or they can't take the risk. I found a niche doing maintenance work and now I enjoy fixing what other programmers have said they can't maintain. Freelance maintenance work pays just as well as green-fields development and has fewer customer hassles, too.
Only if you write shitty code is it 10 times easier to write it than to read it.
When I write my code I put much, much more effort into clearly expressing my intent and what the code is doing, and in that case it's much easier to read than it was to write.
I think that we can easily say that good code is very difficult to write and very easy to read, bad code is very easy to write and very difficult to read.
The code is written once and read n times by m different people, so there is a huge gain in spending more time in making it simpler to understand rather than spending the minimum time in writing code and leave all the effort to the readers.
Readers that, very often, after they become too frustrated, want to rewrite it (and for very good reasons I'd say).
We all think that we write our code so other programmers can read it, and we believe that because while we have the code in our heads we don't have any problem reading it.
I don't question your intentions or abilities, but the fact remains that a whole lot of programmers find almost every piece of code they didn't write hard to read. Or they say that -- I think they mean they just don't like the look of it, or they can imagine writing it differently.
I'm of the opinion that companies should rewrite as much of their codebase as often as possible. Given the difficulty of trying to understand code written by someone else, why shouldn't production code be considered immutable?
This reminds me of the discussion of "quality" in Zen and the Art of Motorcycle Maintenance, where the class finds that while nobody can exactly define quality, when confronted with high and low quality writing everybody could recognize which was which.
As OP realizes, nobody thinks Knuth's ACP is poorly written even though it's hard. The elegance is obvious, even if the algorithm is hard to understand. On the opposite end, I've come across FizzBuzz implementations from job candidates that I've struggled to follow. I've often seen a 10-1 difference in code sizes for different implementations of the same problem, and while there's edge cases where smaller code can start to become confusing, generally every dev on the team recognizes which of the two is more readable.
I have no problem with the idea that there can be subjectivity in discussions of readability. But the idea that readability is purely subjective, or only has to do with differing styles is ridiculous. In fact, I don't believe for a second that OP really believes this. He wants to make a point about subjectivity, which is fine, but for some reason has chosen to write it in this exaggerated clickbait style.
I had Zen and the Art of Motorcycle Maintenance in mind when thinking about readability, because that describes one kind of quality we care about.
I don't think I wrote that "readability is purely subjective," I certainly didn't mean to imply that. I did mean to make a point about subjectivity.
Sorry you didn't like the style. I write articles because I enjoy writing and I think I have ideas worth sharing. I don't have ads or affiliate links on my site, so I have no incentive to post clickbait.
I think you're right. And the OP is right, too. There are contexts, which the OP clearly lays out, where they think programmers are too dismissive and stroke their egos at every opportunity, and so the excuse of readability needs to be called out for what it actually is. It's an interesting piece; if nothing else, the quotes embedded in it are worth their weight in gold. I wouldn't call it clickbait, though.
I once had a developer who consistently wrote unreadable code while trying to be clever. To give him useful feedback I needed to come up with a black and white heuristic for knowing if code is readable or not.
The heuristic is this:
Readable code can be explained in English sentences in one pass.
Your code tells a story. When you read it do you introduce the characters in the right order? Halfway through a complex passage do you reference something out of context without explaining it? Give concepts valid names to help the reader build context on your intent.
The best part of this heuristic is that you can get the code author to attempt to read back their own code, and trip over their own convolutions, and realize where there is opportunity to simplify.
We all worked on the same codebase so a reasonable shared understanding of the concepts was assumed.
This blog post throws various situations into the mix to try to be contentious or something but misses the point.
That's called the telephone test. From The Elements of Programming Style (referenced in my article):
"Use the telephone test for readability. If someone could understand your code when read aloud over the telephone, it's clear enough. If not, then it needs rewriting."
I agree that the telephone test heuristic gives you an idea of readability, but in 1974 Kernighan and Plauger assumed someone on the other end of the phone who knew Fortran or PL/I.
Trying to explain a moderately complex class hierarchy or a factory function or a closure over the phone would prove challenging. I could explain the purpose of the code at a level someone else could understand, but they might not recognize the implementation when looking at the code.
A few years ago, doing maintenance on an abandoned production system, I came across a function with the comment "Produce a unique six-digit identifier that doesn't start with 0". The original programmer could have explained that to me over the phone, but his implementation (in PHP) was not clear at first or second glance. I expected this:
    $id = rand(100000, 999999);
Instead the function started with a loop to get the first digit:
    while (true) {
        $first = rand(0, 9);
        if ($first != 0) break;
    }
I'm not making this up.
Then another loop to pick five more digits in the range 0..9. Then a database query to see if that ID was used, and if it was the function called itself recursively. I'm not kidding.
So while I would have understood the problem described over the phone, I didn't recognize the implementation at first sight. If the programmer had read the code to me over the phone I might have assumed I was missing some requirement that led to this overly-complicated implementation. Not all of the code looked like this so I don't know how the original programmer got wound around the axle on this function.
1. Write in English what you intend the code to do, as comments.
2. Progressively write code to do what you said you would do above until it is all done.
3. Leave the comments there to explain what it does.
I've worked this way since I first got to use comments (originally I coded in hex, computing negative branch offsets in my head as I went -- possibly this explains why I'm such a curmudgeon about requiring comments that explain any non-obvious code's purpose and intentions).
(Yes there is an issue with consistency between comments and code, especially if code is changed over time. Better to fight that problem I believe than to have no information from the original developer as to wtf they thought they were doing.)
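A trivial sketch of the result (hypothetical example):

    // Normalize a phone number: strip non-digits, then require
    // exactly ten digits; return null otherwise.
    function normalizePhone($raw) {
        // Strip everything that isn't a digit.
        $digits = preg_replace('/\D/', '', $raw);
        // Require exactly ten digits.
        if (strlen($digits) != 10) {
            return null;
        }
        return $digits;
    }

The comments were written first (step 1), the code filled in under them (step 2), and the comments stay (step 3).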
> By analogy, plenty of people find reading Homer, Shakespeare, or Nabokov difficult and challenging, but we don’t say “Macbeth is unreadable.”
What of Joyce's Ulysses?
The thing is, most authors do put considerable effort into readability, because it helps you get read, published and paid. There's an interesting parallel with this article - writer:programmer :: critique group:code review.
> Programmers, prone to investing their ego in their code, worry about criticism from a lot of people they don’t know. Slurs about competency abound in programmer forums. Code samples out of context get critiqued and subjected to stylistic nit-picking. That process can prove helpful for programmers with thick skins who can interpret criticism as either useful mentoring or useless insults. Less experienced programmers can get misled and brow-beaten. Public code reviews create a kind of programmer performance art, when programmers write code to impress other programmers, or to avoid withering criticism from self-appointed experts
s/programmer/author/ and it still holds. As does much of the rest of the article.
I would say that programmer education really does underplay reading code. Students are taught to write small pieces and may never be shown a large novel-sized lump of code unless they go looking for it.
Ulysses is not unreadable, it just takes a lot of effort. For programming, the criterion is not whether it is hard to read, but whether the same thing could be expressed in a way that would be simpler and easier to read. But I'm not sure to what extent this could be applied to poetry, since the expression in language is in itself "the thing".
"Unreadable" in this whole discussion should be taken as meaning "requires effort that the reader is not willing to expend", not an absolute. There are very few things that are impossible to read at all, even the notoriously and deliberately unintelligible stunt languages like INTERCAL and brainfk.
I'd even go so far to say that the only way something can be literally unreadable is if it is physically impossible to do so (e.g. unrecoverable data) or there is not yet any known way of determining the meaning of a given text (e.g. the Voynich manuscript).
But of course that isn't what is meant when we say something is unreadable, because humans don't deal in absolutes (though programmers sometimes do, which is why programmers are so fond of using overly noncommittal language like "should" and "most likely").
Yeah. If you wrote a comment in the style of Ulysses (at least, in one of several of the styles used there), I'm not going to bother trying to read it. It may be something profound said very well, but I still am going to err on the side of assuming that it's a waste of my time.
And if you write your code in the equivalent of the style of Ulysses, well, let's just say that having PhD dissertations several decades from now that try to explain exactly what your code means is not the same as code readability.
This is a fantastic post that asks a question that I think many people just jump over on their way to "being right".
What exactly does the thing we are talking about mean?
In this case "code readability".
I would add to this that "code readability" only means something to me in the context of its intended audience. Code is readable when it is simple to decipher by the people the author intended to decipher it. For anyone else to hold it to standards outside of those agreed upon by the author and the intended audience seems silly.
A problem arises if there exists no group of people that the author had in mind when writing the code. Some programmers really just code "so that it works" and don't care about readability at all. That's why we have code reviews - a code review is an attempt to read the code. If at least one reviewer understood the code, then we can assume it was not completely unreadable.
I often refactor code as a way to understand it; working through the logic and rewriting it in my own style helps me get a good grasp of what the original code was doing.
Sometimes I throw away that refactoring if I realise that the code I was having trouble understanding was actually fairly well written and I just need to sprinkle in some comments.
Doing a lot of code maintenance, what I find worst are the "smart" things. Every time I hit some indirection added to be smart, and I can't just ctrl-click on some line to get to the definition, I'm angry.
I don't want to check what is there at runtime once the configuration has been loaded. I don't like going through 10 get("serviceName")->handleSomething() calls and having to check where those services are defined. I really don't like having to go through 20 abstractions in my debugger trying to find who fucked up some data along the way.
I know, your 3-LoC-per-method codebase felt good to write. But the useless junk added around the code (function definitions and {} on their own lines) means that even with a 4k monitor I can't see your full algorithm on one page.
Using reflection magic? Please don't. I like dumb code which shows what it does and can easily be followed.
And that's coming from someone who likes the ${whatever returns a string} method of defining dynamically named variables in PHP: it's always fun to write. But it is a hindrance when debugging. And when debugging, the less "small shit" you have to deal with, the more brainpower you can focus on finding the problem.
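For anyone who hasn't run into it, it looks like this (names hypothetical; computeTotal is made up):

    $field = 'total' . ucfirst($type);  // e.g. "totalRefunds"
    ${$field} = computeTotal($type);    // silently defines $totalRefunds

    // Months later, grepping for "$totalRefunds" finds nothing, because
    // the variable name never appears literally anywhere in the source.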
I think it's necessary to expand on the author's definition of "complex" vs "simple"... Things can be "complex", and thus hard to read, for many different reasons:
At one extreme, the reader is definitely responsible for things like understanding an underlying algorithm - the algorithm chosen might be a good choice, but it might also be very complex. It's not always possible to clearly convey an efficient implementation of a complex algorithm in code. At most, it could be argued that the author should name the algorithm in a comment, but littering the implementation with comments describing the algorithm would make it harder to read for someone who is already familiar with it.
At the other extreme there are all kinds of unnecessary complexity that make it hard to read, some by poorly suited forced design patterns, some by cruft that accumulates and some by un-refactored growth (big ball of mud).
One of my pet hates is highly layered object-oriented code where someone has gone to painful lengths to generalise and modularise the fuck out of every single routine, to the point that the overhead of traversing the hierarchy - to determine what actually happens, in what order, with what side effects, when you call one of the outer layers of that ginormous onion - is humanly impractical. I'd call that unreadable code due to complexity that the authors are responsible for.
Readability is definitely not the same as understanding. The problem I have with reading code from other codebases is grasping the coherence. Naming of variables matters a great deal here; one poorly chosen name can totally confuse me. And still it is highly subjective, because another developer might instantly see the relation just by chance.
If you're unfortunate enough to code full-stack web apps then you'll likely be dealing with hundreds of npm libraries you've never heard of! How can you know all possible effects and side effects of one simple call to a method in some library? That alone would be quite a study in some/most cases.
It would be great if for any project you could find at least an overview explaining how the files and modules relate to each other. But unfortunately that is often missing to start with, so you end up with a vast and often unsolvable puzzle.
Programming computers is still in its infancy. We definitely need better tools and structures to understand what's going on IMHO. It makes me laugh when people tell me they solve this with a linter.
I work on a system where arbitrary property access results in database round-trips.
So the following code:
foreach(var b in a.Bs)      // one round-trip to fetch a.Bs
{
    if(b.Data != null)      // one round-trip per b
    {
        ThisIsInteresting(b);
    }
}
results in a.Bs.Count + 1 round-trips to the server and then the database, if these fields have not been accessed before (fetching a.Bs costs one, and each b.Data access costs one as well).
This is a fine example of what I'd call "unreadable" code because there is so much activity that is implied by actions that in other systems are completely innocent and well isolated.
Also there is the point of cognitive load. If your entire application were one function, it would be technically "readable", but the cognitive load of trying to work out what is in scope and what might change a given variable is too high to read without extraordinary effort.
When working in such systems (as I have before) you HAVE to rewrite it to even start reading it, because human memory cannot hold all that information at one time.
I think brushing all of this off as "the reader isn't reading well enough" is disingenuous.
To read code is to understand it. It has little to do with style and more to do with program structure and abstraction layers. I've maintained a PHP codebase with some of the worst code I've ever seen. For example:
1. A function with a few thousand lines of code, accepting dozens of parameters. The body is made up of nested switch-cases and if-elses.
2. Instead of using constructors, every class has a static method that does nothing but create and initialize an instance.
3. A class which queries the DB and generates HTML, with nearly a hundred public variables. A few levels of subclasses with functions to update those variables.
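Pattern 2, for instance, looked something like this (names hypothetical):

    class Invoice {
        public $total;

        // Not a constructor - just a static method that news up the class:
        public static function create($total) {
            $obj = new Invoice();
            $obj->total = $total;
            return $obj;
        }
    }

    // A plain constructor would have said the same thing directly:
    // public function __construct($total) { $this->total = $total; }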
When a piece of code is not obvious in what it's doing, I'll call it unreadable. In some cases this is acceptable, for example in some clever algorithm (https://en.wikipedia.org/wiki/Fast_inverse_square_root), but it is still what it is, unreadable.
There is often a world of difference between code that is hard to read because it tries to solve a complex problem or is written in a paradigm I haven't learned yet, and code that is hard to read because its author was just plain sloppy and didn't properly care about naming or decomposition. Distinguishing between these two cases is a matter of enough experience.
Then who are all these commenters in mailing lists, forums, blog posts and source control systems who are proclaiming code they cannot read as "unreadable" or "unmaintainable"?
Clearly the author is aware these comments fall into the categories he mentions; they are reframing "I can't read this code" into some sort of pseudo-objectivity.
If typical programmers are aware of this practice, then who are these commenters labelling things they cannot read as "unreadable" or "unmaintainable"?
Are they "typical programmers"?
Can we respond to those labelling code as "unreadable" or "unmaintainable" by pointing out that they really mean "I cannot read this code"?
Will they acknowledge, "Yes, what I meant was I do not know the language well enough to understand that code, but others might be able to read and edit it"?
The article hints at it, but your sense of the readability of a codebase changes the longer you are exposed to it. Patterns that make it easier to become moderately competent in a particular codebase can actually make it harder to become expert. So if you want to maximize readability, you need to estimate how long future maintainers will live with your code.
Will they drop in and do a single commit and never look at it again? Then make sure everything you need to understand a piece of code is right there next to it.
Will they work on it for a long time, changing many parts? Then be very consistent with the patterns and idioms you use across the project, so that their previous experience changing other parts of it helps them understand how to make the next change.
I don't like the Shakespeare analogy, because not every written piece of code is supposed to be a work of art. You wouldn't write an instruction manual the way you would write Macbeth. Most of the time there's really no value in writing cryptic code.
You are saying that programming is the same as an instruction manual? Programming is art. Nearly every objective in programming will have over a million potential implementations. Personality, wisdom, and mood are in every program (obviously there are some Hello World exceptions), just like artistic writing and nothing like an instruction manual. Machine code is an instruction manual compiled by a computer; a program is a piece of art created by humans.
Well, no. You can write artistic code - people do that all the time (in programming contests, sample code, your own code, etc.) - but the real goal of writing code is to perform a certain task. The execution of those instructions is what matters, not the style in which the code is written.
There's a sense of "unreadable" that I sometimes use that means "the code does not contain the information that you need to understand how it will typically function". E.g. things that have internal control flow determined by complex input processing where the input is ill-specified. It can be essentially impossible to put together a mental picture of the operation of the program, because key parts are essentially unspecified.
And there's a related sense that I occasionally use to describe code that's so difficult to follow (in the sense that the control flow or structure is needlessly obfuscated) that the process of trying to understand it doesn't really resemble simply "reading".
Some examples:
A function is called in many places in es6 code. I want to see its definition, my editor can't figure it out. I look at the import statement, it's imported from a file. I open that file, and all it does is export * from a dozen other files - so now all I know is that it's in one of these dozen files. I am reduced to doing something like grepping for the function name and looking for the case that looks more like a definition than a call. At one point I was reading the code, now I'm doing something related and more involved than just reading.
Or: in this system, components emit message objects with a "title" field. The messages are dispatched to dozens of functions across files that inspect the title field to determine if they do anything with it. Only they don't just do a simple comparison against a string literal; they check whether the title matches a regex. There are many of these different regexes.
I read the code that emits the message and I want to know what behaviors the message being sent out triggers - how do I track that down? I can't even grep for the message title. In order to make a list of things this message might do, I have to look through everything that might possibly handle the message and try to figure out which regexes are triggered.
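To make the pattern concrete, here is its shape (sketched in PHP for consistency with other examples in this thread, rather than the original es6; every name is invented):

    // A component emits a message:
    $bus->emit(['title' => 'user.profile.updated', 'payload' => $data]);

    // Far away, a handler decides by regex whether it cares:
    function maybeHandle($message) {
        if (preg_match('/^user\..*\.updated$/', $message['title'])) {
            refreshCache($message['payload']);
        }
    }

    // Grepping the handlers for 'user.profile.updated' finds nothing;
    // you have to evaluate every regex in the codebase by hand.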
In this case, reading is involved in the same way that reading through a math book and solving the exercises does involve reading - that's not where most of the work comes from.
The biggest helper for me when trying to read other peoples' code is having a document (or documents) which describe what the heck the code is supposed to be doing in human terms. This is especially true when I'm not super familiar with the business history, long forgotten bug fixes, and nuance which was previously worked around or implemented.
Software documentation almost always sucks, in my experience. When I do find well written documentation that reflects the code, it becomes much easier to understand the code quickly so that I can make the changes I want to make.
Time is a commodity. The extra time spent reviewing or maintaining code that is difficult to read would be better spent writing the code well the first time.
In open source development, a maintainer’s time is often extremely limited. If you can’t make your code easily readable for that maintainer, then your contribution is worthless. You have created an asset with a transaction fee that prices you out of the market.
The author is spot on for 'standard' code styles. However, when you enter the realm of crazy code styles (say, all code of the program on one single line, obfuscated code not meant for humans to read, and so on), wouldn't we all agree that is, in fact, objectively unreadable? Because actually reading it is simply too hard no matter how skilled and knowledgeable you are? (Note that none of the OP's points cover this, except maybe 'The code offends my sense of aesthetics', and that's a stretch.)
And extending this somewhat further, I have already seen code written by humans which I am confident the vast majority of programmers would consider unreadable. Unfortunately, that includes the guy who wrote it. We're talking mixing indent widths, seemingly mixing all known code styles, a complete lack of consistency, lines of hundreds of characters, big chunks of >20 newlines... I think there is a point where this becomes unreadable, and it's not the reader who is to be 'blamed'.
> wouldn't we all agree that is in fact, objectively, unreadable
No, because there's a rich body of counterexamples in the array-oriented languages (APL, K, J). Everyone who doesn't know these languages immediately dismisses programs written in them as unreadable, but that is far from the truth.
If you spend some time in the alternate programming universe that such languages inhabit, you'll get into your bones the knowledge that readability is relative both to language and to reader, and you'll start to think about simplicity in a deeper way. (By "you" I don't mean you personally, of course, I mean everyhacker.)
He doesn't need to say "btw there's still some actually really terrible code out there" for us to know he knows it's true. He's making a point that wouldn't benefit from going there.
Some good points. I think code readability should be measured by people who are as accustomed as possible to the code base. I.e., if you wrote all the code, how quickly can you refresh your memory (6 months later) and track down bugs?
There is a difference between reading legacy code without empathy (sympathy? pretty sure it's empathy) and calling the code equivalent of Macbeth unreadable when Cliff's Notes would do...
Code readability is when you are able to pick up the source of a program that you haven't looked at in 10 years and instantly understand it without having to study it intently.
Over time the first thing I've come to desire more than 'intrinsic' readability when joining a new project is a decent, up to date, hyperlinked glossary with all of the relevant acronyms and terms that will be used in the code. I've had variations on this conversation many times before:
* New coworker (me): What's an XYZ? (some core thing in the system)
* Coworker 1: An XYZ is [blah blah blah].
* Coworker 2: Actually, an XYZ may not be [blah blah] it could be [blah].