There is a lot the context you’ve built up in writing that code that could never fit in the comments (even if thoughts could easily be expressed in words, they would dwarf the code and not directly correspond to it; actually comments can actually make code reading harder in this way). It really isn’t about the language either, at least the programming language, but how the problem was defined and understood in the first place, how this understanding was encoded in the software.
Reading code is basically trying to reverse engineer the thinking of the programmer by looking at second order output. Of course it will be hard! It isn’t just style, nor would I say mainly just style. Continuous improvement is often just a matter of rebuilding the context that was lost with the last programmer.
There is a lot the context you’ve built up in writing that code that could never fit in the comments
Isn't that precisely what Knuth was trying to resolve when he came up with the idea of literate programming[1]? The fact that you might end up with more words than code isn't a really problem if the end result is better (for some value of 'better') than just the code.
Yes. But those words don’t come for free, they could be much more expensive than writing the code itself, it’s like trying to teach something rather than just doing it.
it’s like trying to teach something rather than just doing it
If you work on a team a lot of your time is spent 'teaching' (explaining) your code to other developers. Or teaching yourself about it when you come back to something you wrote 6 months ago. Or 'teaching' a QA person your logic to understand where a bug is coming from. Or using your code to literally teach a concept to a junior developer.
Writing documentation can feel like teaching rather than just doing, but that's not necessarily a bad thing if you're working on something that other people need to understand.
This is not the right model for assessing the cost of documentation. If the program is in any sense designed (as opposed to being assembled and modified on the basis of hunches until it appears to work), then the ideas expressed by those words must have been known no later than the completion of the work. Therefore, the cost of documentation is that of writing down these ideas, and the cost of not documenting is the cost of repeatedly reverse-engineering them from the code.
It is my experience that you can markedly improve the speed and accuracy with which a newcomer can understand a code base with supplementary documentation of considerably fewer words than are in the code itself - so long as those words are well-chosen, and focus on the programmer's intent.
You mean waterfall right? Ya, then I guess. If the program is understood before it is written (waterfall), then you merely write down these ideas along side the code. If programming plays any matter into evolving the design (as you say, “hunches” that go into a feedback loop), then this will break down quickly.
Edit. Never mind you mean afterwards. But is the assumption that design can occur independently of programming really true in practice?
It depends on the complexity of the problem you are working on. Something well understood before programming starts has a better chance of being well documented with short prose (because it is well understood, a lot of shared universal context can be relied on). There are lots of things out there that don’t meet this criteria, however.
My description of blind trial-and-error programming is not a veiled reference to agile development, if that is what you are thinking - it is, rather, an anti-pattern in development that is neither agile nor waterfall. There is nothing in agile that says you should just try things until something seems to work. No line of code gets written without the programmer of having an expectation of it making some contribution to the solution, and the issue is how well-founded that expectation was.
Update: Perhaps the canonical example of non-agile trial-and-error programming without well-founded expectations is the programmer who is putting delays in various parts of his program in an attempt to fix a concurrency error.
One's understanding of the program does not necessarily break down under iterative development, as, if you are doing it right, each iteration improves your understanding.
The usefulness of documentation does depend on the complexity of the problem, but in the opposite sense: programs solving simple, well-defined problems do not benefit much from additional explanation (there's not much to say that is not obvious), but the more complex things get, the more it helps.
There are plenty of cases where programming helps you explore the design space, where you have little knowledge about the APIs you are using, so you poke them a bit here and there, obtaining experience in knowing how to use them in the way that you need (because, let’s be honest, even the best frameworks have defficiencies in their documentstion, if you decide to read the docs at all). Likewise, it isn’t that weird to write some code that you know is broken so you can fix it in the debugger where live values and feedback are available. Heck, many people code from interpreters these days which are as exploratory as you can get!
Of course, we can argue about different kinds of programming have different needs. Prototyping doesn’t require documentation and so can move much faster than product development, for example. The cost of not documenting is a huge win for the prototyper, allowing them to try out and throw away designs while worrying less about sunk costs.
The design has to come from somewhere, after all. A design team with prototyping resources really values those resources.
Your example is a case where small amounts of additional documentation can be useful. If you are working with a poorly-documented API and you find something that is unintuitive and non-obvious (it might be as simple as an arbitrary choice between equally valid design options, where exactly one needed to be chosen) but which matters in what you are using it for, then a note of that fact could save a lot of time in the long run, depending on what you are coding for: if it is just for yourself, then you only have to consider what is best for you, but if it is an actual product where it is likely that others will have to understand it, that note might pay for itself many times over.
I sometimes describe programming as a one-way hash operation on requirements. A lot of information and context gets lost when writing software, and I haven't seen a workable solution to that problem yet.
This is closer to the root of the problem than saying code is unreadable... the information isn't lost because of the code being produced, it's lost because the developer leaves. Documentation won't work because requirements change faster than they can be documented. I don't know a solution other than just trying to convince that developer not to leave.
IMO that's part of why readable code is so important. If I can look at a piece of code and understand its behavior then I can know something. I might not know what stated requirement it was trying to solve, but I can know for sure what requirements it implements.
Compare that to sloppy code bases with side effects everywhere where you don't know what it's supposed to do or what it does.
Reading code is basically trying to reverse engineer the thinking of the programmer by looking at second order output. Of course it will be hard! It isn’t just style, nor would I say mainly just style. Continuous improvement is often just a matter of rebuilding the context that was lost with the last programmer.