The Brittleness Of Type Hierarchies (codegrunt.co.uk)
88 points by singular 1907 days ago | 73 comments

As I've become more experienced over the years I've come to believe that the proliferation of patterns to "fix" OOP problems (dependency injection, immutability, builder patterns, events, etc.) is a symptom of inherent flaws in the object-oriented model.

It feels like we're bending over backwards to fix a model that promotes lots of problematic designs while not doing much to resolve them (or to support basic things like concurrency/parallelism). We can never predict all possible problems, or predict the future, so no language can be "perfect", but a language could inherently be more agile and flexible. I wish I could say what a better model would be. I'd hazard a guess that it's something based more on composition and functional programming than on inheritance and classes. Perhaps even metaprogramming and/or code generation.

For now it seems OOP is the worst paradigm for programming, except all other paradigms of programming

It's a good, if fairly old [1] insight. The fact is, there are two kinds of patterns: architectural patterns (e.g. "message bus", "web service", etc.) and there are language deficiency patterns (e.g. factory, double dispatch/visitor, etc.). A good way to gauge the power of a language is to see how many of these patterns you need.

For example, when I was exposed to Smalltalk I fell in love with it. So many of the patterns I was used to needing were simply not necessary there. That's because Smalltalk is more expressively powerful than, say, Java. You see this directly in the GoF patterns book, as several of the patterns only provide C++ code and say "Smalltalk doesn't need this as it does X".

Eventually, though, I began to see that pg was right and single dispatch is a (in this case, inferior) subset of generic functions. The issue was double dispatch. Smalltalk needs it, Lisp doesn't. I never have to write boring double dispatch wiring in Lisp because of how generic functions work.
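A minimal sketch of the "boring double dispatch wiring" the comment refers to, here in Python with hypothetical names: in a single-dispatch language, behaviour depending on the types of *two* arguments forces each class to bounce the call back to the other.

```python
# Classic double-dispatch wiring: collide_with must select behaviour
# based on BOTH operand types, but a method dispatches only on its
# receiver, so each class forwards to a type-specific method on the
# argument. Every new type means a new method on every other class.

class Asteroid:
    def collide_with(self, other):
        return other.collide_with_asteroid(self)

    def collide_with_asteroid(self, other):
        return "asteroid/asteroid"

    def collide_with_ship(self, other):
        return "ship/asteroid"

class Ship:
    def collide_with(self, other):
        return other.collide_with_ship(self)

    def collide_with_asteroid(self, other):
        return "asteroid/ship"

    def collide_with_ship(self, other):
        return "ship/ship"

print(Ship().collide_with(Asteroid()))   # -> ship/asteroid
```

With generic functions, none of this forwarding boilerplate exists: the dispatch mechanism considers all arguments at once.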

So I don't think it's OO itself that is flawed (though it's not as big a deal as it's made out to be; it's a form of code organization, specifically global-variable demarcation and code reuse), but rather that many OO languages are weak enough to require programmers to hand-code these common patterns instead of either providing them in the language or being powerful enough to let us solve the problem once.

[1] http://blog.plover.com/2006/09/11/

Agreed. I'll go ahead and burn the karma on an overwrought historical analogy right now:

C++ did to object-oriented languages in the 90s what the Church did to Europe in the medieval period. Thankfully functional languages (the Islamic world) preserved a lot of knowledge through that period so that it can now be re-introduced.

Smalltalk is, of course, Constantinople.

Well done everybody who didn’t reply to this.

I agree that single dispatch seems to be a pretty obvious problem with most OOP languages: if you have a method that takes two different types as parameters, to which does it belong? Take your pick and hold your breath... or write some unwieldy manager class :) Smalltalk and Lisp seem (zealously) loved by many, and it's easy to like the simple elegance of languages like that.

For me they always felt a wee bit too messy and academically oriented. I want a language that reads more like a book and less like a formula, perhaps that's just me :)

This problem is resolved by noticing that you assume the method has to 'belong' to one type or another, instead of existing as a first-class object of its own. In practice this was done first in Lisp (the 'generic function' works better with typical Lisp environments than endless (send object message) calls, in particular with functions like map), and then it was noticed that the generic function treated one and only one argument specially. From here we get multiple dispatch in a very natural way, a concept that is quite awkward in a language like Smalltalk.
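A toy sketch of the idea, in Python: the function exists on its own, and behaviour is selected from a table keyed on the types of *all* arguments. This is roughly the shape of multiple dispatch, not CLOS semantics (no inheritance-aware method selection, no method combination); all names are hypothetical.

```python
# A toy "generic function": behaviour lives in a table keyed on the
# types of every argument, not in any one class.

_methods = {}

def defmethod(*types):
    """Register an implementation for an exact tuple of argument types."""
    def register(fn):
        _methods[types] = fn
        return fn
    return register

def collide(a, b):
    """The generic function: dispatch on (type(a), type(b))."""
    fn = _methods.get((type(a), type(b)))
    if fn is None:
        raise TypeError("no applicable method")
    return fn(a, b)

class Ship: pass
class Asteroid: pass

@defmethod(Ship, Asteroid)
def _(a, b):
    return "ship hits asteroid"

@defmethod(Asteroid, Asteroid)
def _(a, b):
    return "rocks bounce"

print(collide(Ship(), Asteroid()))   # -> ship hits asteroid
```

No class owns `collide`, so the "which type does it belong to?" question never arises.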

The concept is quite awkward in any single dispatch language. The biggest WTF ever is Racket adding single dispatch OO! You actually have to say (send object 'message).

> I want a language that reads more like a book and less like a formula, perhaps that's just me :)

Perhaps you shouldn't look so much for a new language, but rather a new way of using the language. Have a look at http://www.literateprogramming.com/adventure.pdf, a showcase of literate programming using the example of the venerable Colossal Cave Adventure.

What's the alternative though? The minute you try to treat any group of data in a general way this problem arises. The minute you publish any kind of interface you've essentially signed a contract with all client code.

All the common solutions to this problem (structural typing, COM-style versioned interfaces etc) create problems of their own. I think it's just a genuinely hard problem.

Some day there'll probably be a next-gen language that addresses at least some of these issues. One way to resolve some of them today, however, is code generation.

While not appropriate to all kinds of solutions, I've had some success building solutions, especially "enterprise solutions", where you model the problem descriptively (in, say, XML) and then generate the appropriate code.

This way you can glue together "architecture" code with "meat code" (i.e. substantial methods) and have the agility to refactor and regenerate. The bulk of the code is often some kind of yak-pattern shaving, so you can save quite a lot of time.

You're basically designing your own DSL (Domain Specific Language) and then writing the "parser/compiler".

Beat me to this, but I like the Objective-C approach: you can save lots of specialized classes by extending existing ones with categories. And instead of a rigid class structure you usually check the availability of properties at runtime. It might be more verbose and in general more code, but the code you have can be quickly adapted and is quite stable. I'd love Objective-C to become more multiplatform (which is of course the major downside for now).

> For now it seems OOP is the worst paradigm for programming, except all other paradigms of programming

I remember a piece about how most of the common OOP 'patterns' are invisible or trivial in languages with first-class functions.

Apparently most of the readers have missed the point. He says up front that what he's describing doesn't really happen seriously in small pieces of code. The code example is an illustration, one that I thought was very clear.

As for a solution? The only purpose of inheritance or subtyping is polymorphism. You may be doing polymorphism in a very roundabout way (if (isa(X)) { ...get a field from X... }), but it's still polymorphism under the hood. There's actually a very good argument against inheritance for polymorphism: you can't straightforwardly write a statically typed, polymorphic max function. You have to introduce generics to the language. That way lies the Standard Template Library and generic functions a la Common Lisp or Dylan (which is a pretty wonderful world).
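The comment's point about a statically typed polymorphic max can be made concrete even in Python's type-hint notation (a sketch; `Comparable` is a hypothetical protocol name): without a type variable you would need one max per type, or a common supertype carrying a compare method.

```python
# A generic max: one definition covers every ordered type, via a
# type variable bounded by a "supports less-than" protocol.

from typing import Protocol, TypeVar

class Comparable(Protocol):
    def __lt__(self, other) -> bool: ...

T = TypeVar("T", bound=Comparable)

def my_max(a: T, b: T) -> T:
    return b if a < b else a

print(my_max(3, 7))       # -> 7
print(my_max("a", "b"))   # -> b
```

The static guarantee (`T` is the *same* type for both arguments and the result) is exactly what inheritance-based polymorphism cannot express without generics.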

Now, in implementation you may want some of the polymorphisms to be due to the same fields being at the same memory offset in all subtypes, which seems different, but why must it be? Why shouldn't it be a declaration about a family of types? I may have to go play with that...though I think it's equivalent to how it's done in Forth. So much seems to be.

There are compelling uses for inheritance polymorphism - GUI frameworks are generally examples of this technique put to good use.

I suspect that OOP systems would be a lot less likely to go haywire like the author describes if the Liskov Substitution Principle were better-known among programmers.

I'd even like to see it baked into a language. Get rid of overriding base methods. Instead the superclass's version is always called, and the subclass is only allowed to tack on some additional code that runs after the base method returns. Yes, returns - the subclass's code shouldn't be allowed any chance to modify the result. It shouldn't be allowed to directly modify non-public fields that belong to the base class, either.
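The restriction described above isn't baked into any mainstream language I know of, but it can be sketched by convention with a template method (Python, hypothetical names): the base behaviour always runs, the subclass only appends code afterwards, and the result is withheld from it.

```python
# Sketch of "extend-only" inheritance: subclasses may register code
# that runs AFTER the base method returns, and never see the result.

class Base:
    def save(self):
        result = self._do_save()   # base behaviour always runs
        self._after_save()         # subclass hook; result not passed in
        return result              # subclass cannot alter this

    def _do_save(self):
        return "saved"

    def _after_save(self):
        pass                       # subclasses extend here

class Audited(Base):
    def __init__(self):
        self.log = []

    def _after_save(self):
        self.log.append("audit entry")

a = Audited()
print(a.save())   # -> saved  (unmodified by the subclass)
print(a.log)      # -> ['audit entry']
```

Nothing stops a Python subclass from overriding `save` itself, of course; the point is what a language could enforce.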

I suspect that a language with those kinds of restrictions on inheritance polymorphism would encourage developers to be a lot more thoughtful about how they design class hierarchies. Which they should be, since someone might get stuck with the results of those decisions for decades.

Taking the C# example - Microsoft decided to push mutability in collection classes all the way down to the root of the object tree. Which creates a lot of pain for conscientious developers. Mutability isn't just a non-essential feature of most collections, it's also undesirable in a great many cases.

While I agree with you about the Liskov Substitution Principle, I don't think that would have helped in this example. If your assumptions about the behavior of the base classes are wrong, your code is wrong. Substitute a wrong super-class for a wrong sub-class, and it's still wrong.

What the LSP does do, though, is help you know when you really shouldn't be subclassing things. If your implementation leaks beyond the class interface, it shouldn't be a subclass.

Agreed that it isn't a magic fix. I was only hoping that bringing such a restriction to the forefront of programmers' minds would encourage them to be a little bit more diligent about deciding what's really an essential feature of a category before they start to cut code.

It's all too easy to fall into the trap of automatically pushing things up to the superclass without thinking first. "I might want this elsewhere" is a common way to look at it. Following LSP encourages one to think, "I might get stuck with this" instead.

Here's the thing: while you do not have to do a Big Design Up Front, there's no reason in the world why you can't have a lot of conversations around future behavior of the system as you go about working through your first few sprints.

While good Agile teams can do whatever is put in front of them, there is an implicit assumption in project work: if you start out building a securities system you're not going to be changing over to a system to feed and care for circus elephants in the middle of the project. That is, there is a fixed and limited set of nouns and relationships which comprise 90-95% of the problem domain that can easily be discovered simply by talking about the problem.

I'm not trying to disparage the author: this is a real problem. I'm just pointing out that mature teams cover the domain fairly completely in an informal fashion (perhaps a few hours of conversation spread out over a week or two) before writing anything. That's not design, that's just understanding the world of the customer. [Insert long rant here about how most programming teams have forgotten or hardly use any sort of analysis techniques]

Of course, the best of these teams still run into the same problem down the road, but it should be a pretty long ways down the road. Like years. If not, you probably never really understood what the hell you were doing in the first place. (Not the programming part, the part about fully understanding the user)

Type hierarchies can allow for flexibility easily. It's up to the team to spot where flexibility is going to be needed and put it in there. Brittleness is a risk just like any other project risk.

I wish we could find a way to make non-programmers understand this. Just because programming doesn't involve moving lots of heavy stuff and putting it together physically, doesn't mean that there are no rules about how software needs to be built.

People tend to focus on the happy fact that you actually can change your fundamental assumptions in software without loud, ugly demolition sounds, but they tend to stay blind to the fact that it still involves a lot of work.

It's partly our fault, too. We're the ones who have been promising that "this time it'll all work out with this fancy new methodology we've discovered". I've seen a lot of people assume that "agile" means "clients can change their mind as often as they want and we'll take it in stride" and that we'll do so at zero cost.

I commented on the author's blog, but I thought people here might be interested as well:

I agree with Chris Parnin (in the comments of the author's blog)--this isn't a type hierarchy problem. It's an incremental design problem. It's true that inheritance should be used with caution, and this example (intentionally) overuses it, but the deeper problem seems to be that the author doesn't understand how refactoring and incremental design work.

Let's stipulate that your initial guesses about a domain will almost always be wrong. In this example, the author assumed that all securities will have an Isin, but it turns out they don't. Options are a type of security that doesn't have an Isin.

One solution is to hack Option as a subtype of Security. As the author shows, this leads to a big mess. A much better solution is to refactor as soon as you notice that the domain is wrong.

Here's how it works:

Step 1: Notice that Options are securities, but they don't have Isins. Observe that the domain model is wrong. Smack yourself on the forehead.

Step 2: Realize that Security is not in fact representative of all Securities. Rename it IdentifiedSecurity (or IsinSecurity, if you prefer). This is an automated refactoring in C# and Java, and will automatically rename all uses of the class as well.

Step 3: Create a new superclass called Security and move Description and Exchange to that superclass, if desired.

Step 4: Create Option as a subclass of the new Security superclass.

Step 5: Enjoy your improved design. Some parts of the application (such as Trade) will be too conservative and use IdentifiedSecurity when they could use Security; those are easily fixed on a case-by-case basis as needed.
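The five steps above can be sketched as the resulting classes (Python, with hypothetical fields mirroring the example):

```python
# After the refactoring: the old Security is renamed, and a new,
# smaller superclass holds only what ALL securities share.

class Security:                       # new superclass (step 3)
    def __init__(self, description, exchange):
        self.description = description
        self.exchange = exchange

class IdentifiedSecurity(Security):   # the old Security, renamed (step 2)
    def __init__(self, description, exchange, isin):
        super().__init__(description, exchange)
        self.isin = isin

class Option(Security):               # step 4: no isin field at all
    def __init__(self, description, exchange, strike_price):
        super().__init__(description, exchange)
        self.strike_price = strike_price

opt = Option("Call on ACME", "XNYS", 42.0)
print(isinstance(opt, Security))   # -> True
print(hasattr(opt, "isin"))        # -> False
```

The type system now states the truth: only identified securities carry an Isin, so no code path can reach a null one.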

For more about incremental design, see Martin Fowler's _Refactoring,_ Joshua Kerievsky's _Refactoring to Patterns,_ or the "Incremental Design" chapter of my book (http://jamesshore.com/Agile-Book/incremental_design.html). You can also see me aggressively apply incremental design in my Let's Play TDD screencast, here: http://jamesshore.com/Blog/Lets-Play .

i had a great experience with OO design in an internship in grad school. i was sole coder on a java project, a networked whiteboard app, and i had the time and flexibility to scrap the whole thing halfway through and rewrite it from scratch once i had some clue about the problem domain.

the best thing is that since the main functionality was a vector graphics editor, i actually got to implement one of the classic blackboard OO examples in practice, and see how it really worked out:

    Interface Shape
        AbstractShape implements Shape
            TwoPointShape extends AbstractShape
                Rectangle extends TwoPointShape
                Ellipse extends TwoPointShape
                Line extends TwoPointShape
                    Arrow extends Line
            Text extends AbstractShape
            Icon extends AbstractShape
            Raster extends AbstractShape
(or something like that)

obviously this was the product of a lot of refactoring (i think the TwoPointShape "insight" came to me relatively late in the project)

of course, i ended up spending most of my time chasing ConcurrentModificationException and jumping back and forth between two or three boxes tracking down network synchronization bugs (client A begins moving an icon, client B begins moving the same icon, client C begins moving the same icon, client B releases the mouse, client C releases the mouse, client A releases the mouse: what happens?), but that taught valuable lessons too:

specifically, all they really ever teach in school is what i call the data paradigm: your app starts, takes input, does stuff, stops. gui- and network-event-based programs, and how to design, test, and debug them, barely come up.

of course once i got into industry, i got turned on to array programming and never looked back :-)

If I were solving this problem in C# I'd probably lean towards defining specific properties of objects in interfaces, rather than just through a type hierarchy.

So we might have the interfaces ISecurityWithIsin, IOption, etc.

This has the advantage of allowing

a.) easy use of mocking, dependency injection for testing.

b.) Classes can implement more than one of these interfaces.

The question then becomes- where do you put your base, shared functionality (e.g. a method that is common to all stocks with Isin numbers). Possibly this becomes another set of classes...

The problem with interfaces is that you have to reimplement all the code for the basic functionality in every class. Scala-like traits, which are basically interfaces with optional implementations, should help here.
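Python mixins approximate what the comment wants from traits (a rough sketch with hypothetical names): an "interface" that also carries a default implementation, so each implementing class need not redo the basics.

```python
# A trait-like mixin: declares the Isin capability AND supplies its
# storage and validation, so Stock and Bond don't reimplement it.

class HasIsin:
    isin = None

    def set_isin(self, isin):
        if len(isin) != 12:                     # ISINs are 12 characters
            raise ValueError("ISIN must be 12 characters")
        self.isin = isin

class Stock(HasIsin):
    pass

class Bond(HasIsin):
    pass

s = Stock()
s.set_isin("US0378331005")
print(s.isin)   # -> US0378331005
```

Unlike Scala traits, Python gives no composition-time conflict checking; two mixins defining the same method silently resolve by MRO order.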

Also, if you're using a flat type hierarchy where classes never inherit, but only extend interfaces, you're better off using a different language which natively supports a structural typing style, such as Go, OCaml, or Objective-C.

In the Moose / Perl 6 world, we use roles for this. (I know the idea is borrowed from elsewhere, but I don't know where.) As a Perl 6 programmer, I have almost completely given up on traditional inheritance, because roles capture everything I want from inheritance with greater robustness.

To be fair to original article, this does mean I'm doing my best to avoid type hierarchies...

> I know the idea is borrowed from elsewhere, but I don't know where.

It probably originated in Flavors, an object system for Lisp (see http://en.wikipedia.org/wiki/Flavors_%28computer_science%29)

It was a combination of dissatisfaction with multiple inheritance (see Perl 5, C++), interfaces (see Java), and mixins (see Perl 5, Ruby), as well as an acknowledgement that aspects (see Java) and multimethods (see Common Lisp) solve part of the problem well and part of the problem poorly.

Then Allison and I saw Dr. Black present the Smalltalk traits paper and she realized that their formalism was exactly what I'd been talking about informally, so we borrowed that instead.

How are roles different from mixins?

The only difference that I can think of is that mixins are declared at class declaration, is that what you mean?

Roles have specific composition rules which forbid name collisions; there's no last-in wins rule. Roles also provide type allomorphism which is much less ambiguous than duck typing.
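The "no last-in wins" rule can be sketched in Python (hypothetical helper, not how Moose or Perl 6 actually implement it): composing two roles that define the same method fails loudly at composition time, instead of one silently shadowing the other as with mixins.

```python
# Role composition that forbids name collisions: conflicts are an
# error when the class is built, not a silent override at runtime.

def compose_roles(name, *roles):
    methods = {}
    for role in roles:
        for attr, value in vars(role).items():
            if attr.startswith("_"):
                continue                      # skip dunder machinery
            if attr in methods:
                raise TypeError(f"method conflict: {attr!r}")
            methods[attr] = value
    return type(name, (), methods)

class Walker:
    def move(self): return "walk"

class Swimmer:
    def move(self): return "swim"

class Barker:
    def speak(self): return "woof"

Dog = compose_roles("Dog", Walker, Barker)    # fine: no overlap
print(Dog().move())    # -> walk

try:
    compose_roles("Duck", Walker, Swimmer)    # both define move()
except TypeError as e:
    print(e)           # -> method conflict: 'move'
```

The class author must resolve the conflict explicitly, which is the robustness the comment is pointing at.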

True- but in this case the classes looked relatively simple (i.e. they were carrying data, but not operating on it)

Taking the interface approach further, your classes that operate on your data would take in objects by interface e.g.:

    public class IsinValidator
    {
        public bool Validate(IIsinSecurity security) { ... }
    }
Just use polymorphic associations (weak linking) wherever you want to enable code reuse and the implementation is not tightly linked to the class hierarchy. It is about how you think in terms of OO.

Right. Smaller, atomic classes help with this. But then you have a snowstorm of classes, all with more-or-less helpful names. That's complicated too. Probably the better answer though.

Sometimes the code is complex because the problem is complex.

    We are faced with the dilemma - a lot of the code is now reliant on Isin, and NullReferenceExceptions are getting thrown all over the place because the field isn’t getting populated
Sounds like it's catching bugs. You get to fix all the Isin places in a single unit test pass. Or you change the types and it's all fixed by getting it to compile.

The examples provided by the author are absolutely horrible code. A type hierarchy can be more than two deep. How about adding another base type for securities with ISINs?

Moreover, if you ever see code like security is Option or a switch based on the name of the type it is a great sign of poorly architected code.

The solutions provided aren't really solutions at all. How would functional programming solve the problem? If anything a lot of functional languages are even more rigidly typed than C#.

(Author here) I agree that the code is horrible :-) that's kind of the point - if you choose to hack up your code, then you end up with horrors like if(foo is Bar) { ... } else { ... }.

The type declarations are, I'd argue, not horrible at the outset, but once you introduce the new requirements then it becomes incorrect. I even suggest possible solutions including adding depth to the hierarchy - the point is doing that means yak shaving, not doing it results in a definitely horrible object-orientated design because it no longer fits the problem.

You can argue that you should have a better model from the outset given the coarse change which comes into play, which is potentially where my example starts to break down (I think it would be hard to find an example that would not break down in some way here, given the need to compress reality into a blog post), so take it as read that in reality the changes are often a lot more subtle than the example I give.

The point is that you're having to make far-reaching, rigid decisions up front. I have encountered this over and over again.

I am happy to admit it's an inadequacy in object orientated design on my part, and if somebody were to suggest approaches that helps mitigate this problem I'd be very happy, but I do wonder whether it's an inherent property of the whole approach.

I agree that hierarchy-based OO can introduce real overheads. Favour composition over inheritance, and all that.

Having said that, I believe the problem is exacerbated because developers tend to worry too much about the structure of data from some arbitrary point of view. Put another way, we don’t tend to focus on how the data will be used in specific contexts, and therefore we don’t actively structure our data to be helpful in those contexts, even if each context is well defined and known to be relevant, unlike whatever “natural” structure we instinctively impose.

In your particular example, my first questions would be about how ISIN data is going to be used: where is the value for each security originally determined, where is it looked up later, what kind of decisions are made based on it, and so on. My next questions would probably be about why securities that are used in that way are being stored and manipulated with securities that aren’t. If there is no need to use both kinds of data at the same time, do the representations of the different types of security need to have related types in the code, or is that just an arbitrary decision we have made because it feels natural and in accordance with our view of the real world? Does a security even need to wrap up all of these properties in a single object, or are we really dealing with two or more distinct ideas that just happen to relate to the same real world activity at some point?

It’s not possible to answer questions like these without context, but so often with OO, we instinctively dive in and start identifying the nouns first, without considering the verbs that go with them.

This isn't really an OO problem. You can have similar troubles in a language with no static typing and no type hierarchies because you still have clients with expectations about the contents of a particular field.

A published API is a contract. Changing contracts is painful. There is no silver bullet.

You might have been able to deflect more of this if you had made up a new type of security that broke the rules (something nobody could have expected), however I thought your example illustrated the issue very well. For me it is especially common to run into the field that could never be null but now is scenario.

I liked your article a lot, very nice write up. I tend to lean towards the idea that there may not be much of a solution for this. But it's better to look for one than to just give up!

In my experience modifying existing C#/Java in such a manner is not a big deal since the type system and the language enable extremely powerful refactoring tools like Resharper.

But your point is fair. It is a tradeoff between flexibility and guarantees of strong, static typing -- a tradeoff that's a no-brainer in my opinion.

I feel you have missed the point of the article by criticizing the example. The suggestions you mention are discussed as possibilities, and his 'security is Option' is shown to give an example of bad code that arises during the quick fix process.

The point is that these situations are common and it would be nice if your language made it easier to adjust your object structure when needed.

> How would functional programming solve the problem?

In a language like ML or Haskell, it simply wouldn't compile until you fixed all the uses of Isin. By itself, it wouldn't solve the problem so much as force you to solve it before running it again.

Haskell and ML lack intrinsic subtyping. I don't know what you'd wind up with, but it would definitely feel a lot more explicit. I suspect if you followed a no-inheritance/many interfaces regime in C# you could port it across fairly straightforwardly to a typeclass-based implementation in Haskell.

I'm studying Smalltalk now and I actually am more interested in what the solution would look like without accessor methods. I do OOP professionally and FP on the side, but I don't think Java encourages good practices; everything winds up being a bean, which is essentially a struct. I think if OO principles were truly and seriously applied there would not be direct access to the Isin field (or any other field) from all over the codebase, those accesses would instead be somehow made into responsibilities of the owning class.

It's interesting to me because I think OO without encapsulation ports over fairly nicely to ML and Haskell, and OO with encapsulation should port over fairly nicely to Haskell with type classes. The trouble is always inheritance. But that turns out to be trouble in OO languages as well, so maybe it's a wash.

One of the fundamental principles of OO is to go for interface inheritance (design) as opposed to class inheritance (implementation). A better way to enable code reuse is through association. That is why languages like Java do not allow multiple class inheritance, but allow you to inherit from multiple interfaces. You are doing it the wrong way!

> One of the fundamental principles of OO is go for interface inheritance (design) as opposed to class inheritance (implementation).

That wasn't a "fundamental principle of OO" until halfway through the 2000s.

Wrong for some problem spaces, very right for others. There are many problem domains, and no tool right for all of them.

Very right. It is about how you think in terms of OO. If you feel the problem space cannot be modeled easily, just don't go for OO. That said, many problem spaces are amenable to OO thinking. But some are not (especially the ones that deal with serial hardware interfaces), just go for procedural thinking in these cases.

The problem is you really want to inherit the implementations as well -- otherwise every one of the classes that uses the Isin interface must implement its storage and accessors itself.

You can just do both, though. Use interfaces for polymorphism, to define the interactions between your objects -- and then, if you want, use inheritance as one possible strategy for code reuse.

That works great if you're only using one interface in your class. What if you need to include two?

Then you use a different reuse strategy, like composition. The language is not ideal in this regard, certainly.
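The "different reuse strategy" looks something like this in practice (Python sketch, hypothetical names): instead of inheriting two bases, hold the shared behaviours as components and delegate to them.

```python
# Composition: Stock satisfies both "interfaces" (Isin and pricing)
# by delegating to components, rather than inheriting two bases.

class IsinComponent:
    def __init__(self, isin):
        self.isin = isin

class PricingComponent:
    def price(self, quantity):
        return quantity * 1.5

class Stock:
    def __init__(self, isin):
        self._isin = IsinComponent(isin)
        self._pricing = PricingComponent()

    @property
    def isin(self):                       # delegate identity
        return self._isin.isin

    def price(self, quantity):            # delegate pricing
        return self._pricing.price(quantity)

s = Stock("US0378331005")
print(s.isin)        # -> US0378331005
print(s.price(10))   # -> 15.0
```

The delegation boilerplate is the syntactic weight the next comment alludes to; languages with first-class delegation make it lighter.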

As always: It Depends (tm).

Type hierarchies have their place, but can be overused and abused. My typical approach is to be fairly conservative with base classes, and rely on them more for shared implementations rather than polymorphism. Gosu also supports composition (See http://lazygosu.org/ search for delegates) for shared implementations, but it is syntactically heavier-weight, even if it can be cleaner.

For polymorphism, I'm more inclined to use interfaces. I think there is a place for explicit (java-style) as well as go-style (implicit) interfaces.

The real culprit here is overdesign/premature abstraculation: you can go batshit early on in a project with almost any language feature and compromise your flexibility. Broadly, write as little code as possible, balanced with readability (e.g. don't go ape-shit with obscure macros) and using standard idioms, and let the underlying abstractions emerge when they are ready.

The older I get, the more I feel like less code is the most important thing by a long shot.

Um, aren't shared implementations exactly what you're not supposed to use inheritance for? According to the Liskov Substitution Principle (http://www.objectmentor.com/resources/articles/lsp.pdf), if it's not transparently substitutable, it shouldn't be a subclass.

Probably. But I don't care what the OO academics say: look how they designed JUnit.

I find class-based inheritance most useful for reusing base implementations, and interfaces (explicit or implicit) for conceptual encapsulation. I wish Liskov the best of luck in her software writing.

Here I've translated the example to haskell:

    data Exchange = Exchange { bic :: String, name :: String }
    data Security = Security { description :: String, exchange :: Exchange, isin :: String }
    data Stock = Stock { security :: Security }
    data Bond = Bond { security :: Security, expiry :: EpochTime }
    data Trade = Trade { price :: Decimal, quantity :: Decimal, security :: Security }
Now to add Option:

    data Option = Option { security :: Security, call :: Bool, lotSize :: Decimal, maturity :: EpochTime, strikePrice :: Decimal }
And in the example, the problem is that the Option uses Security, which has an isin, which doesn't make sense for Option. In haskell, this is a sort of problem which is typically fixed by adjusting the data types. There are many ways they could be changed, some will model the domain better than others. Let's just make the same quick fix used in the example, of allowing isin to not be set:

    data Security = Security { description :: String, exchange :: Exchange, isin :: Maybe String }
This means that isin is Nothing or Just a String. As soon as this change is made, every place in the program that directly accessed the isin will fail to compile. Fixing the compilation errors will involve adding a case to handle isin-less Securities.

    - foo (Security { isin = i }) = ...
    + foo (Security { isin = Just i }) = ...
    + foo (Security { isin = Nothing }) = ...
The code does become somewhat ugly with these cases, but you know every case has been covered, and that it will work.

Maybe later it's decided to go back and fix it to use the separation between physical and derivative securities that was originally considered but not done due to lack of time. It could then look like this:

    data Security = PhysicalSecurity { description :: String, exchange :: Exchange, isin :: String } 
                  | DerivativeSecurity { description :: String, exchange :: Exchange }
Again this type change would drive a pass through the code, fixing it up to compile.

    foo (PhysicalSecurity { isin = i }) = ...
    foo (DerivativeSecurity {}) = ...
Again you'll know when you're done because the program will successfully compile. In this case, splitting the data type seems to have led to better, clearer code. It might be worthwhile to factor out a helper type to simplify the Security type:

    data SecurityBase = SecurityBase { description :: String, exchange :: Exchange }
    data Security = PhysicalSecurity { base :: SecurityBase, isin :: String }
                  | DerivativeSecurity { base :: SecurityBase }
Although you may find this complicates other things as you "follow the types" and change the code to match. There are surely other approaches; so far this has stuck with simple data types, but typeclasses could also be used. You may want to constrain Bonds to using a PhysicalSecurity, and Options to using a DerivativeSecurity, and there are various ways that could be enforced. And so on.
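As a standalone sketch of the split type (again stubbing Exchange as a String, with a hypothetical summary function to show a total match over both constructors):

```haskell
type Exchange = String

-- Physical securities carry an ISIN; derivatives don't.
data Security
  = PhysicalSecurity   { description :: String, exchange :: Exchange, isin :: String }
  | DerivativeSecurity { description :: String, exchange :: Exchange }

-- With -Wincomplete-patterns the compiler warns if a constructor is
-- unhandled, so adding a third kind of security later drives another
-- fixup pass through the code.
summary :: Security -> String
summary (PhysicalSecurity { description = d, isin = i }) = d ++ " (" ++ i ++ ")"
summary (DerivativeSecurity { description = d })         = d ++ " (derivative)"
```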

What was surprising to me coming to Haskell from a background in loosely typed languages (and low-level languages like C) is that the types are not a straitjacket set in stone from the start; they ebb and flow as you refine your understanding of the problem domain. What well-chosen types in Haskell do constrain are the mistakes you want to be prevented from making. These days, if I find myself repeatedly making a mistake in my code, I adjust the types to prevent that sort of mistake in the future.


Side note: The above code will not compile as written, because it exposes an annoying problem in Haskell's record syntax. There are several fields named "security" that conflict with one another. This is typically dealt with by using ugly field names (stockSecurity, bondSecurity, tradeSecurity, optionSecurity), or more advanced things like lenses, or by putting the data types in separate modules and using module namespacing.
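Since GHC 8.0 there is also the DuplicateRecordFields extension, which allows identically named fields across types within a single module; a minimal sketch (the types here are simplified stand-ins for the article's):

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

-- Two record types may now share the field name "security"
-- in one module; the constructor disambiguates.
newtype Security = Security { description :: String }
data Trade  = Trade  { security :: Security, quantity :: Int }
data Option = Option { security :: Security, strike :: Double }

-- Bare selector functions become ambiguous, but record
-- patterns still work, since the constructor pins down the field.
tradeDesc :: Trade -> String
tradeDesc (Trade { security = Security { description = d } }) = d
```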

The original blog post wonders whether functional programming will give a solution to the problem via immutability and the lack / tagging of side effects. Haskell is a purely functional language, but the type-system feature that actually helps solve the problem here, algebraic data types, would be perfectly feasible to add to a language like C.

It reminds me of other features, like garbage collection, that were initially introduced in languages like Lisp, but turn out to be useful in the wider world.

I find module namespacing to be a more elegant solution than prefixed field names. (I haven't played around with lenses enough for this kind of use.) It's just a shame that Haskell doesn't have an in-file syntax for modules like, say, OCaml; instead each module needs to be put into a separate file.

I do too. I also tend to find that once I have a module for a particular record type, I eventually find enough other things to put in it, like instance declarations, that it's not a hardship to have a separate file.

Yes, it's not a hardship, but it does raise the threshold for doing the Right Thing. Quite unnecessarily, in my humble opinion, since OCaml provides an example of how to add this feature to the language without colliding with any of Haskell's established concepts. (I don't know about collisions with the existing syntax, but you'd enable the modules like any other extension, with a pragma.)

Do you know whether Template Haskell could be contorted to provide a syntax?

I spent a few years using type hierarchies intensely in the early 00s and found the experience excruciatingly bad. The crystalline structure of your types quickly shatters on the shoals of reality and you are left taping the pieces together. After a particularly bad experience I generally stopped writing OO beyond simple structs.

Around 2010 I started reading rpg's writings on Lisp and software development; that opened me up to a different way of thinking about how to design software with objects, one that I haven't really finished working out.

I do agree with you: the C++ modality of inheritance doesn't really work in many cases. It's a tool, but a tool that works badly often. I think a more CLOS or Haskellian viewpoint will yield better results in the long run.

I'm in favor of decoupling data structure from interface entirely, via records + protocols. Hierarchies have their place, but ultimately can't deal with cross-cutting concerns. Mixins with structural typing is one way to approach the problem, but for formal contracts I prefer Clojure's approach: http://www.ibm.com/developerworks/java/library/j-clojure-pro...

I'm surprised he didn't mention typeclasses as a solution to this general problem. In many cases it's an unambiguously better solution than inheritance. A particular strength is that typeclasses can be defined or overridden at call sites as easily as where the data types are defined. OOP forces an uncomfortably close complecting of data and operations on that data, leading to the difficulties enumerated in this blog post.
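As a hypothetical sketch of what that might look like in Haskell (the types and class here are made up, standing in for the article's Stock and Option), the "has an ISIN" capability can live in a typeclass, decoupled from any hierarchy:

```haskell
-- Hypothetical stand-ins for the article's Stock and Option.
newtype Stock  = Stock  { stockIsin :: String }
newtype Option = Option { underlying :: Stock }

-- The capability is a typeclass, not a base-class field.
class HasIsin a where
  isinOf :: a -> String

instance HasIsin Stock where
  isinOf = stockIsin

-- Option deliberately has no instance: asking for an Option's
-- ISIN is a compile-time error, not a runtime null.
reportIsin :: HasIsin a => a -> String
reportIsin x = "ISIN: " ++ isinOf x
```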

I stopped when I read this:

  Functional programming is enjoying a great upswing in interest
  and popularity these days. I wonder whether the stronger type
  systems of these languages...
Functional programming == stronger type system? I thought I could use JS in a functional way without a lot of safety nets. Clojure isn't Scala. Is he right? What am I missing?

Whoever wrote that was obviously referencing ML/Haskell

"Or do we find some other, less salubrious way around the problem?"

Since salubrious means "good" or "healthy", this statement doesn't make much sense. If you're going to use words your readers are likely going to have to look up, at least use them correctly.

EDIT: at least, it doesn't make sense insofar as I understood the intent of the sentence.

I meant less salubrious as in more seedy, less wholesome, a usage I've heard before, e.g. 'a less salubrious bar'. The idea was to imply an evil, dirty hack - I was trying to add colour to the post, but perhaps didn't succeed :-)

I do endeavour to stick to the simplest possible expression, but sometimes still fall foul of using a word which I enjoy but which in fact reduces clarity.

Could someone who understands CLOS well weigh in on how this problem might be approached from that point of view?

How can he write such an article, stating that he used C# because he knows it, and not tackle the problem using the main resource for such issues in C# / Java: interfaces? Using interfaces, you can decouple all of those classes from each other and never have this issue in the first place. If the stated model is the way he would typically tackle a problem in C#, then there are fundamental issues with his choices, which is not a failing of the type system.

Yeah, because it's so hard NOT to use inheritance.....

I'm not quite sure what the issue is here. It turns out that the author modeled the domain incorrectly. At least that incorrect model is completely explicit in the code. If it weren't spelled out explicitly, the coupling that the author speaks of would be insidiously spread throughout the code. In order to make the change that the author wants, it's as easy as introducing a new abstract class. In fact, if all the existing code correctly assumes the existence of an Isin, we can create the following set of classes:

    abstract class BaseSecurity
    {
        public string Description { get; set; }
        public Exchange Exchange { get; set; }
    }

then modify Security to derive from BaseSecurity:

    abstract class Security : BaseSecurity
    {
        public string Isin { get; set; }
    }

Then you are done, except for two issues: first, any serialized data needs to be regenerated; second, you can't trade BaseSecurities. However, this trading functionality can be written separately without disturbing the existing ecosystem of software. This is what your type hierarchy buys you.

On the other hand, if we insist that this is not correct, and Security should have Isin removed, then we can add a new PhysicalSecurity between Security and the various implementations, and Stock/Bond/Trade can inherit from PhysicalSecurity.

In that case, the problem is that a lot of code was written with the incorrect assumption that an Isin exists in every security. Now we have to take a step back and ask how to fix that code on a case by case basis. No matter what language you use, it's always possible to write bad code with incorrect assumptions, and in that case you must pay the price. Hopefully you would be clear with your client on the delays required.

Now we can ask ourselves how a static language treats the situation differently than a dynamic language. The author seems to think a dynamic language would help, providing only praise in his description of them.

With a static language, we can simply remove Isin from the definition of Option. This will cause a lot of compilation failures. However, every place where there is a compilation failure is a place in the code which had an incorrect assumption. Each of these incorrect assumptions must be considered individually. After all, this represents the model for a trading system, and any bugs would likely result in severe financial consequences.

In a dynamic language, the definition could be changed, but there would not be any inherent mechanism to catch the now-incorrect calls. Instead, we would just get the NullPointerExceptions that the author complains about and which jeopardize the viability of the financial trading system. Perhaps the coders would have written beautiful unit tests that would help, but that could be the case in any static language as well. Of course, it's also possible that the coders would have created trivial unit tests or no tests at all.

In any case, I see this situation as a win for static type systems rather than a loss.

Hey, author here.

Indeed the domain is modelled incorrectly, the problem being that this realisation has come late in the development process. Perhaps in my simplified example that's something you could pick up on before work commenced, but in reality there are many little (and sometimes big) things which you either miss no matter how hard you try, or which change later on.

The problem with your suggestions for fixing the hierarchy is that, yes, this is something you ought to do and it resolves the problem, but it takes time and is in essence yak shaving, or at least feels that way - are you really working towards making your software better, or are you just pandering to the abstraction you've created?

In any case, this is one of the two options I suggest to resolve the situation, and definitely the correct one (I cover the idea of separating out ISIN and non-ISIN instruments as e.g. physical/derivative). However, it takes time, and so often time pressures result in you hacking around the problem, making the hierarchy not only incorrect, but actively misleading.

I didn't mean to suggest dynamic languages were a silver bullet here, nor did I mean to put particular emphasis on them.

No matter what you will have to refactor, but the question is whether a rigid hierarchy makes that more or less difficult, and whether that rigidity tempts you into hacks which make this structure misleading.

I think the problem with putting any particular example in the article is that it will inevitably be inadequate for its purpose. The problem really bites when you've experienced a large code base have some small incongruity that doesn't quite fit the model, but I don't think I could clearly and simply express that in code.

There are always time trade-offs. When the overall software design becomes tightly coupled with the type system, you simplify the code. You could abstract the way you handle messages, but at some point you need to get the ISIN for some security, and no amount of abstraction lets you avoid that.

Now, as you say, doing it the right way can take significant amounts of time. But so do the hacks; it's just a question of when to spend the time. IMO, front-loading the costs is much better because you get a clear understanding of what adding each feature actually costs.

it's a problem if you don't know your domain ...

you have no business designing type hierarchies if you don't have a clue about the domain you are modeling.

The OP clearly has more than a clue about the domain.

If you work in a complex domain for any length of time you realise that domains change. I don't just mean that requirements change but the actual real world domain changes.

As it happens, I have been babysitting a large system in this domain since 1996. ISINs were hardly used then, and many types of derivative financial instruments hadn't been invented. Government regulations too keep changing, often bringing into existence completely new data points.

'Knowing a domain' is a meaningless concept in a long-lived business system.

"Government regulations too keep changing"

Excellent point. This puts some of the comments above about how agile methods could quickly identify these problems and address them by refactoring into doubt. You could build a very big system, understanding the domain well, and then have a regulation change introduce new issues that you could not have anticipated.

Beyond finance, health care would be another setting where this could be important (HIPAA must have caused a lot of hasty alterations).

That's kind of a cop-out, don't you think? After all, when we embark on from-scratch coding projects, we must not be perfect experts in the domain or we'd already have code we can use, right? How are we supposed to become domain experts without writing the code? Attend night classes? :) The design the author winds up with has plenty of technical debt, but it's not hard to imagine winding up there, even with partial knowledge of the domain that is constantly increasing.

How do you design a type hierarchy that incorporates all future change requests? Can I borrow your crystal ball?

I think one of the fundamental issues here is that programmers view OO as a programming methodology alone. OO is more a collaboration tool which helps large teams build complex functionality. The architect or lead designer comes up with system-level abstractions and module contracts. The module designer then comes up with module-level abstractions and interfaces. Finally, the programmer is supposed to code to the interface given to him. Thus large projects can be managed better, as each person knows their roles and responsibilities and the system can be thought of as composed of black boxes.

This works only when the architect knows his job and the module designers are good. Good programmers often do not make good architects (though good architects are often good programmers too). In OO, design comes first, second and third; implementation comes last. This creates a situation where programmers do not have enough work towards the beginning of the project. As with anything, this textbook version works only 80% of the time. The remaining 20% are situations where we do not know the abstractions to begin with, or where implementation feasibility is questionable.

This is where I use the programming resources to prototype that 20% of the functionality while the design goes on in parallel. In cases where abstractions may change, keep them at a very high level and evolve the design over time. By providing hooks to refactor and evolve the design, you can insulate against future change to some extent.

As long as every programmer is not forced to think in OO design terms and is given the simple contract of coding to the interface, it works. That said, good architects are rare; the job requires experience and expertise in abstract thinking, and most programmers do not end up as one.
