

Refactoring wars and how to avoid them: what is "simplicity" in programming? - MikeTaylor
http://reprog.wordpress.com/2010/03/28/what-is-simplicity-in-programming/

======
Chris_Newton
We are far too willing to dismiss things as mere "personal preference" in this
industry. There may be competing theories all supported by some empirical
evidence and sound logic, but some claims are directly contradicted by hard
data. Those claims are not just a personal preference for the best way to do
things. They are measurably, objectively wrong, and failing to say so is just
being nice because we don't want to criticise someone.

On the evidence I have found to date, I am coming around to the view that the
style of writing many very short functions (say up to 5-6 lines) with little
complexity in the logic (say just a single level of nesting) is one approach
whose claimed superiority is directly contradicted by empirical data. For
example, McConnell discussed the number-of-lines issue in _Code Complete_
years ago, citing multiple studies. Anyone can find still more by investing a
few minutes in Google Scholar searches.

Alas, that does not stop bloggers, consultants, trainers and book authors from
advocating this programming style, even though it invariably results in the
kind of incohesive "spread" that the article mentioned.

~~~
dkarl
Code that falls in line with your expectations and caters to your mental
toolkit is easy to understand. It also helps when code works well with your
existing programming environment. I work with Java programmers and C++
programmers, and they both have ways of jumping quickly to class and method
definitions, but the C++ programmers' way -- incremental search -- only works
within a single file. The Java programmers barely notice that they're hopping
around in the filesystem like Q-bert. The C++ programmers, working in emacs
and vi, come to a grinding halt when they need to hop around to many different
files in different directories, probably because they almost never need to do
that.

(It may seem like the Java programmers simply use more capable tools, but most
of them don't use incremental search at all, so they're much slower at finding
something that they don't have a hop-to-X button for. Why? Because they get by
fine without it, just like the C++ programmers get by without learning
whatever emacs or vi extension would help them navigate Java-style source
directory hierarchies.)

Obviously some kinds of tools and programmers will be more common than others,
so one style can be empirically better than another yet be worse for a
particular group of programmers. There won't be one better way of writing code
until everybody has the same expectations and the same tools. I don't think we
ever want to reach that kind of consensus.

~~~
Chris_Newton
I'm certainly not advocating complete uniformity in how everyone does
everything, but I think we have to be careful about judging programming style
based on artificial limitations. Static code navigation, on the level of your
examples, is essentially a solved problem. If the developers you work with are
using tools that can't do it, or aren't using the tools they have to best
effect, then they have a basic problem that has little to do with coding
style.

------
aaronblohowiak
this entire blog post reduces thusly: "hiding complexity with abstraction and
encapsulation is not simplification, but further increases cognitive load as
the abstractions must be undone to fully comprehend the function of the code."
It is an interesting point, sure. I think the example as given makes his point
fairly well.

The question becomes: when does hiding complexity improve your ability to
reason about the code?

I agree with the author that this is ultimately a personal preference. One
guideline that I appreciate is that your code should operate at a consistent
level of abstraction. For instance, in the routine that defines the business
logic, the details of the protocol used for persistence should be "somewhere
else." This complicates tracing the execution of the program, but it makes
understanding the intent of the business logic easier as there are fewer
"implementation details" to ignore.

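A minimal Python sketch of that guideline, under stated assumptions: `Order`, `OrderStore`, `InMemoryStore` and `place_order` are hypothetical names invented for illustration, not anything from the article. The business routine speaks only in domain terms, and whatever protocol persistence uses sits behind a small interface.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Order:
    customer: str
    total: float


class OrderStore(Protocol):
    """Hypothetical persistence boundary; the storage protocol lives behind it."""

    def save(self, order: Order) -> None: ...


class InMemoryStore:
    """Stand-in implementation so the sketch runs without a database."""

    def __init__(self) -> None:
        self.saved = []

    def save(self, order: Order) -> None:
        self.saved.append(order)


def place_order(store: OrderStore, customer: str, total: float) -> Order:
    # Business rule only: no SQL, sockets or serialization at this level.
    if total <= 0:
        raise ValueError("order total must be positive")
    order = Order(customer, total)
    store.save(order)
    return order
```

Reading `place_order` tells you the intent without any storage detail; tracing what `save` really does means following the call into whichever store was passed in, which is the trade-off described above.
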
~~~
DougBTX
_The question becomes: when does hiding complexity improve your ability to
reason about the code?_

When you trust the abstraction. For example, on the whole we trust the
filesystem abstraction, we rarely try to inspect the physical location of the
bits on the disk - the filename is sufficient, we can treat the filesystem as
a black box.

On the other hand, if you don't trust the code, you're going to want to read
every line, and that's easier to do if all the code is in one method. An
abstraction is a liability if you can't trust it.

Re testing: it's easier to trust a well tested function, so splitting up code
to make it easier to test, and then testing it, will allow you to ignore its
implementation. Suddenly it isn't logic spread across six classes, but one
class using some utility methods.

~~~
chousuke
What kind of abstractions are the most trustworthy?

In my experience, the ultimate in abstraction is anything that works solely
with immutable data. This includes, obviously, pure functions, but also
objects which have no mutable state. Such things are trivial to test and
perfectly composable. It's easier to gain trust in them as you rarely need to
know exactly what they do, only whether they appear to work or not.

Obviously immutability alone is not enough for many abstractions, but I would
still like to see it used more. The current object-oriented models seem to
encourage encapsulating state, rather than encapsulating data, which seems to
lead programmers to use mutable things even when they are unnecessary.

Unless the abstraction requires it (e.g. I/O), I think any state held within an
object is, in a way, a leaky abstraction. As long as there is state within an
object, it is not obvious whether you can, at any one time:

1) call its methods
2) hold a reference to it
3) delete it, or your reference to it (mostly applies to non-GC languages)
4) pass it as a parameter to something

All these worries disappear with immutable values.
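
As a rough illustration in Python (the `Money` class is a made-up example, not something from the article), a frozen dataclass gives a value that can be shared, passed around and called freely, because nothing can change it after construction:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Money:
    """Hypothetical immutable value object."""
    amount: int      # minor units, e.g. cents
    currency: str

    def add(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        # Returns a new value; neither operand is modified.
        return Money(self.amount + other.amount, self.currency)


price = Money(500, "EUR")
total = price.add(Money(250, "EUR"))

try:
    price.amount = 1          # any attempt to mutate is rejected
except FrozenInstanceError:
    pass
```

Holding on to `price` after computing `total` is safe no matter who else has a reference to it.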

~~~
wisty
Pure functions have one problem - they don't have state, and sometimes you
need state.

What's needed is pure functions for logic, and dumb objects holding data, and
function pointers.
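
One way to read that, sketched in Python with invented names (`Counter`, `increment`, `double`, `apply`): the data holder has no behaviour, the logic is pure, and functions are passed around as values.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Counter:
    """Dumb data holder: state, no behaviour."""
    value: int = 0


def increment(n: int) -> int:
    # Pure: the result depends only on the argument.
    return n + 1


def double(n: int) -> int:
    return n * 2


def apply(counter: Counter, step: Callable[[int], int]) -> None:
    # The only place where state changes; the logic itself stays pure.
    counter.value = step(counter.value)


c = Counter()
apply(c, increment)
apply(c, double)
```

The pure functions can be tested with plain inputs and outputs, while the mutation is confined to one small, obvious spot.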

------
Khaki
Surely one long function is very much harder to test than several shorter
ones? So in the context of unit-testing at least, testability is in harmony
with Fowler's definition of simplicity. Seems like a nice goal to me.

~~~
olliesaunders
Yes.

And with testability comes flexibility and maintainability. I think the author
is focusing too much on the simplicity of familiarization and conceptual weight
where, very often, the real challenge with imperative code lies in
maintenance.

------
sgoranson
Personally my brain has an easier time grasping {TaskA(); TaskB(); TaskC();}
rather than a 750loc block of code. Yes, multiple source files are annoying,
but there are plenty of tools out there to help (ctags and vim are enough for
me).

But the more important reasons to strive for high cohesion/low coupling are:
Future changes are generally easier when you have well defined blocks instead
of a bird's nest of code, unit testing practically writes itself, and you have
a better chance at isolating a problem to a module if its responsibilities
are few and well defined.

------
fauigerzigerk
I think you are wrong. It's not just individual taste. What matters is how
well functions are aligned with units of change and reuse.

Simplicity is having to think about a smaller number of items and dependencies
when I want to change/extend/reuse code. If there are many functions and
classes but I have to touch all of them whenever I want to make a typical
modification, then it's worse than having everything in one function.
Conversely, if a more granular design allows me to ignore most of the code and
just change one simple function or even just parameterise it differently,
that's simplicity.

But there is a snag. Even if the design aligns well with units of change and
reuse, and I would have to make just one small change in one small function in
order to have the desired effect, I don't necessarily know which of a large
number of functions it is. I might not even know whether or not such a
function exists. So I have to understand how everything works together in
order to benefit from well designed code.

That leads me to the conclusion that better programmers who do understand the
system and its interdependencies benefit from small units provided they align
well with units of change and reuse. Bad programmers have to look at
everything every time anyway, so they might find it easier to look at one
large chunk of code.

------
daakus
I think the point is avoiding complexity until absolutely necessary. Splitting
code and creating abstractions "because it will make doing X easier in the
future" is often the wrong thing to do. The right thing imho is to split it
when you need X now, not the future.

~~~
zmmmmm
That's the point that seems to be most missed here.

It makes total sense to split things into 7 different classes when you
actually _need_ different implementations of all those parts so that the
abstractions you've made are useful.

The problem with demonstrating these things in books (or in general) is that
your examples have to be simple to be comprehensible. But the presumption is
that the techniques are being applied in reality to a much more complex
system. I'm sure if the book had injected 20,000 lines of code into the
example he would have written a blog post about how the book should have used
a simpler example to demonstrate the point while having no complaint about the
fact that it used 7 classes to do it.

I think because of this tendency in books and courses to demonstrate complex
OO techniques with simple examples many people come away with the attitude
that you should do all this stuff pre-emptively rather than "on demand". I'm
not sure that was ever really the intention, but it has resulted in a lot more
overly abstracted code being produced in the world than necessary.

------
trunnell
The author's criticism of Fowler's refactoring:

 _But look at the cost: to understand how rentals are calculated, we now have
to read six classes instead of one method... Fowler evidently finds it easier
to read many small methods than a few larger ones; I find the opposite._

This raises the question, do we all have our own definition of good code?

Several books have tried to formalize what is good code (he mentions
_Refactoring_ , _Code Complete_ also comes to mind). I enjoyed those books,
but I'm often reminded of what my CS professor once told me: good code is a
matter of taste.

In the example he cites from _Refactoring_ , my taste is more like Fowler's. I
think it's easier for bugs to hide in long methods than short ones. Plus,
Fowler extracted some distinct concepts into their own classes-- things like
prices. Price formulas are likely to change, so I say the cost of extracting
that class is well worth it.

It seems to me that ultimately, Fowler and others are not claiming to have
found the secret to "good code;" rather, they are trying to influence people's
taste for what _they_ consider good code.

------
drp
Fowler's book is about simplifying maintenance and testability, not
(necessarily) simplifying the actual code. He doesn't even claim his
techniques are the best way to start programs, hence the "Refactoring to"
portion of the title. Also, it's specifically about object oriented design
patterns so the examples aren't intended to apply to other paradigms.

------
digitallogic
From the first chapter of 'Refactoring': "Was it worth it? The gain is that if
I change any of price's behavior, add new prices, or add extra price-dependent
behavior, the change will be much easier to make."

Massive functions start out small, then grow large. If a business logic
function has to grow every time there's a new rule that could live behind an
abstraction, it's on the path to becoming massive.
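
A small Python sketch of a rule living behind an abstraction, loosely modelled on the rental example. The class names and rates here are illustrative assumptions, not quoted from the book.

```python
from abc import ABC, abstractmethod


class Price(ABC):
    """Hypothetical abstraction for a pricing rule."""

    @abstractmethod
    def charge(self, days_rented: int) -> float: ...


class RegularPrice(Price):
    def charge(self, days_rented: int) -> float:
        # Flat fee, plus a daily rate after the first two days.
        return 2.0 + max(0, days_rented - 2) * 1.5


class NewReleasePrice(Price):
    def charge(self, days_rented: int) -> float:
        return days_rented * 3.0


def rental_charge(price: Price, days_rented: int) -> float:
    # A new pricing rule means a new Price subclass; this function does not grow.
    return price.charge(days_rented)
```

With a single `if`/`elif` chain instead, every new rule would add another branch to the same business logic function.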

------
DanielRibeiro
Well, I like two definitions.

1: From the Agile Manifesto (<http://agilemanifesto.org/principles.html>):

 _Simplicity--the art of maximizing the amount of work not done--is
essential._

2: From Kent Beck on Extreme Programming:

 _Simplicity is the most intensely intellectual of the XP values. To make a
system simple enough to gracefully solve only today's problem is hard work.
Yesterday's simple solution may be fine today, or it may look simplistic or
complex. When you need to change to regain simplicity, you must find a way
from where you are to where you want to be._

And a related one from Antoine de Saint-Exupery:

 _Perfection is achieved, not when there is nothing more to add, but when
there is nothing left to take away_

------
barrkel
There's a spectrum with respect to abstraction. Too little, and you end up
with a brittle blob of code where the multiplication of each axis of change
and parameterization causes an explosion in state changes and maintenance
costs. Too much, and you end up with too many indirections, lost performance,
and an additive cost of change and parameterization in each axis that dwarfs
the complexity of the underlying operation.

The happy middle is "just right", but it's hard to recognize or achieve
without some experience of either extreme. And even then, the chosen point
might be biased towards less abstraction for performance reasons or more
abstraction for composability and reuse reasons. But really good use of
abstractions (ideally including at the language level) ought to reduce the
amount of compromise needed.

------
yason
Simplicity and good code are what we _seek_ by practicing the art of
programming. It's not meant to be an end.

The question of simplicity and good code is not a problem to solve but,
rather, something that compels us to reach for more beautiful solutions to
different problems — more beautiful than those we already know.

Suppose that Leonardo, after having painted "Mona Lisa", had concluded: "This
is the most beautiful painting I can draw, therefore I'll just stick to the
style and draw slight variations of her from now on because this is the most
beautiful painting ever." He might have sought for something more, too.

------
johnwatson11218
Does anyone know if there was ever a project that takes a really complicated
OO system and a unit test, runs the unit test with some AspectJ stuff, and
produces one big method that does the same thing that all the objects and
methods just did? Not as a replacement, but rather as a tool to help you
quickly get the gist of something you don't really care about?

Maybe even comment (or color code) the big method to show which objects it was
ripped from?

Then maybe something to try and compress this big method and remove lines of
code that are only picking out impls at runtime or doing reflection or
something?

------
chadaustin
I wish I had the "How do you stop people from refactoring too much?" problem.
:)

~~~
jacabado
Be careful what you wish for. Ill-motivated attempts at refactoring more
often than not disrupt an entire team's rhythm. I have come to appreciate the
simplicity of lazy spirits.

