Readability, hackability, and abstraction (benkuhn.net)
92 points by jeremynixon on Apr 9, 2015 | 18 comments


> In fact, I wonder if it’s even possible to get all three of readability, hackability and abstraction.

I'm not sure this is the right way to say it, because abstraction is a means to an end, not an end in itself, right? The other two are ends in themselves. Usually abstraction is intended as a means to flexibility/maintainability (is that what OP means by 'hackability'?), but it is sometimes counter-productive, as the OP explains.

I think there is a definite tension between flexibility and simplicity, inherent to software engineering, and that tension is generally expressed via fighting with abstraction -- the right or wrong abstraction, over-engineering (too much abstraction), inflexibility (not enough abstraction), etc.

I think the better you understand your business domain, the better you can do at managing the tension. I think it's not a good thing that these days it seems to be assumed that there's no need for domain knowledge, that a good programmer is a good programmer and can do well in any domain.

I would phrase the 3 tradeoffs as Understandable (readability), Reusable (abstraction), and Malleable (hackability). The article tries to demonstrate that:

U+M = a short script is easy to understand and can be quickly modified in small ways to test things or meet new requirements, but lack of reusable abstractions makes big or numerous changes harder and more time consuming

U+R = a longer program that uses an abstraction (classes, functions, modules, etc) is understandable and has many reusable components, but accounting for changes requires adding new components. This means the code isn't very malleable because you have to write all the interactions and containers and other stuff that lets new code connect and work in the abstraction.

R+M = finally, a fully abstract and malleable system, full of components and connectors and design patterns and such, can be readily adapted to changing requirements because it has high abstraction yet is factored deeply enough to allow small changes without lots of code. However, making changes requires understanding the entire system, damaging readability/understanding.

I don't think the problem is hopeless though. Usually when I see this, it's either because the problem domain is just that complicated, or because the system is actually too flexible for the scope of the problem. Sometimes a different abstraction can clear things up, because the old system was using too much of a poor abstraction to make it work.

In this instance, the last paragraph contains the real problem and hints towards a solution:

> If I’m trying to run five different deploy recipes across four different hardware/OS configurations, that’s 20 different potential interactions to take care of.

I forget the name of the problem now, but Peyton-Jones talks about this with regard to how OOP and FP tend to hit opposite sides of these problems. OOP makes it easy to add new objects but hard to add methods, because you have to add each method to every object. FP makes it easy to add pattern-matching functions, but hard to add new objects, because each function has to account for the new objects.

Something like multiple dispatch is the usual solution: one function for each combination of things, with shared functions factored out as needed. The OOP version, the visitor pattern, falls afoul of being hard to understand, imho.
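To make the "one function for each combination" idea concrete, here is a minimal sketch of hand-rolled multiple dispatch in Python. Everything in it is hypothetical (the `Recipe`/`Target` classes, the `handles` decorator, the `deploy` function); it is not the article's code, just one way the idea could look, with a shared fallback registered on the base classes.

```python
# Hypothetical names throughout: these classes stand in for deploy recipes
# and hardware/OS configurations; none of them come from the article.
class Recipe: ...
class WebApp(Recipe): ...
class Database(Recipe): ...

class Target: ...
class Linux(Target): ...
class BSD(Target): ...

# Registry mapping (recipe type, target type) -> handler function.
_handlers = {}

def handles(recipe_type, target_type):
    """Register the decorated function for one (recipe, target) combination."""
    def register(func):
        _handlers[(recipe_type, target_type)] = func
        return func
    return register

def deploy(recipe, target):
    """Dispatch on the runtime types of BOTH arguments.

    Each argument's MRO is walked from most to least specific, so a handler
    registered on the base classes acts as the shared fallback.
    """
    for r in type(recipe).__mro__:
        for t in type(target).__mro__:
            handler = _handlers.get((r, t))
            if handler is not None:
                return handler(recipe, target)
    raise TypeError(
        f"no deploy handler for {type(recipe).__name__} on {type(target).__name__}"
    )

@handles(Recipe, Target)
def _generic(recipe, target):
    # Shared behavior, factored out once.
    return "generic deploy"

@handles(WebApp, BSD)
def _webapp_on_bsd(recipe, target):
    # Only the combination that really differs gets its own function.
    return "webapp deploy with BSD-specific steps"
```

With this shape, adding a new recipe or a new target is a new class plus handlers for only the combinations that need special treatment, and the function that runs for a given pair can be found by reading the registry rather than tracing a class hierarchy.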

Yeah, I like the way you describe the three qualities.

> I don't think the problem is hopeless though. Usually when I see this, it's either because the problem domain is just that complicated, or because the system is actually too flexible for the scope of the problem.

I don't think it's _hopeless_, but I do think it is one of the intrinsic challenges in software engineering, and one that I think deserves more attention, including in training.

I think you are absolutely right about one of the main pitfalls being when "the system is actually too flexible for the scope of the problem" -- the trick is understanding the true scope of the problem, and how much flexibility is really required -- and it's not usually really a "how much" question, but a question of flexible _where_ and _how_.

In open source software, it sometimes can mean saying "No, we can't make this flexible to your desires exactly, because those desires are outside of our principal goals, and we haven't yet figured out how to accommodate them without ruining understandability."

Although sometimes you can find a clever way to introduce the right hook point to support that flexibility without turning your software into an over-engineered monstrosity... sometimes you can't. Which is about the problem domain, about your understanding of the problem domain, and about your craftsmanship.

Another paradox or irony or tension is that 'design patterns' are intended as discovered templates for flexibility with simplicity, but 'design patterns' applied willy-nilly, cargo-cult, carelessly, without sufficient grasp of the true problem domain, or to excess -- result in lost understandability, and sometimes lost reusability and malleability too!

Many of us have a natural inclination to maximize flexibility always as an unalloyed good, instead of dealing with the inherent tension as constraint to catalyze good design.

The problem you are referring to is called the expression problem [0]. There are solutions, such as the Visitor Pattern, but these tend to increase complexity substantially. Functional programming has approached this from the other side by introducing something called protocols [1].

[0] http://c2.com/cgi/wiki?ExpressionProblem

[1] http://www.ibm.com/developerworks/library/j-clojure-protocol...

I find that applying the minimum amount of abstraction necessary to reduce excessive duplication and keeping the call graph/class hierarchy relatively "flat" yields the best results. Making the design more table-driven, where most of the complexity/diversity of cases is data and the code is a simple interpreter, instead of directly encoding them in the code, also works well.

“Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.” -- Fred Brooks
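A rough sketch of what "table-driven" could look like for the deploy case, assuming a hypothetical `PLATFORMS` table and `plan` function (neither is from the article). The per-platform differences live in data, and the code is a tiny interpreter over it. Nothing is executed here; the function only builds an argument list.

```python
# Hypothetical table: one row per platform, one column per action.
# "{name}" is a placeholder filled in by plan() below.
PLATFORMS = {
    "ubuntu": {
        "install": ["apt-get", "install", "-y", "{name}"],
        "restart": ["systemctl", "restart", "{name}"],
    },
    "freebsd": {
        "install": ["pkg", "install", "-y", "{name}"],
        "restart": ["service", "{name}", "restart"],
    },
}

def plan(platform, action, name):
    """Look up one table cell and fill in the package/service name.

    Returns the command as a list of arguments; it does not run anything.
    """
    template = PLATFORMS[platform][action]
    return [part.format(name=name) for part in template]

# Example: the same call shape works for every platform/action pair.
commands = [plan(p, "restart", "nginx") for p in PLATFORMS]
```

Supporting a new platform is then a new row in the table rather than a new branch or subclass, which keeps the call graph flat.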

"I haven’t found any pattern that accomplishes all three [readability, hackability and abstraction] at once."

It is called "elegance", and it's not a pattern. It's the result of years of experience, of time and attention paid to grok formal systems and their interactions with reality.

"the author was confused about which subtype of Machine they were operating on"

One could hence argue for strong types. If I got €1 for each time I've seen careless typing break people's Python and Ruby scripts, I could retire rich right now.

You probably mean static, not strong typing.

D'oh. Of course ;)

> and each method call could mean one of many different things depending on the runtime class of the object it’s being called on. That makes it much harder to look at code and know what path will be executed, so the code becomes much harder to keep in my head.

This is precisely the problem DCI solves. It puts system level behavior (collaboration between objects) into contexts, so that the code path is readable within a single method, rather than spread out across the network of objects.

You get to keep the abstraction and malleability of your objects, while still being readable. If you're not familiar with it, this is a great talk:


Sometimes I would combine all functions back into one and rethink how to refactor them.

Obviously it's pretty hard to produce easy to understand code with wrong abstractions. Objects or other data abstractions cannot help with control flow. Anonymous functions are for that.

Sorry if this sounds too basic but I'd love to learn more about how to apply anonymous functions to improve this kind of control flow - would you mind recommending some resources/examples?

Try the "Out of the Tar Pit" paper first.

I've taken a quick look at it and it looks very interesting. Can't wait to read it fully this evening. Thank you!

Oh, I think it was great until the classes got added. Can't he just pass the bsd parameter in the ip variable? No state, and immutable.

Did you read the text of the article or just the code?
