How Enums Spread Disease – And How to Cure It (2012) (codecraft.co)
57 points by gpvos on Apr 15, 2015 | 36 comments

Seems to me that the whole point of OOP is to encapsulate this information. An enum is basically just a degenerate class. If you find that you are doing a case on your enum all over the place, you should probably be defining them as objects (or possibly subclasses).

The solution of creating a bunch of tuples (or structs)...well, that's just creating a bunch of objects. The language and paradigm already has a way of dealing with these that's a bit more flexible and powerful than tuples.

> The solution of creating a bunch of tuples (or structs)...well, that's just creating a bunch of objects.

Essentially in Java that's actually what you're doing, and if you override the constructor of the enum you get a nice version of what the author is describing (but with more flexibility than a straight up Map)

(oh, bthornbury covered this in his comment)

> An enum is basically just a degenerate class

My recent facebook post is, verbatim: Java enums are shitty classes.

That's how I thought about it when reading. I kept wondering whether I was missing something, but: the things to be modeled by programs have a continuum of complexities, while there are only a small number of language constructs to use for modeling, each more or less suitable depending on the complexity of the thing to be modeled (for one thing). The 'language constructs' from simple to complex are something like: enum, struct, class. It sounds like the author wants something that's between struct and class, though personally, I don't see why he doesn't just use a class/objects.

The meaning is basically different. An enumerated value doesn't always carry around additional state, neither is it expected to carry around implementation information.

Applying polymorphism often creates a nasty inversion. If the situation is such that you need the algorithm to mutate "outside" data based on "inside" (the object) information, you end up calling a method on the object that "passes in the world." This is a particularly terrible form of coupling since it obscures the straight-line path and allows new code to break old code when it gets used out of context.

The trick the article describes is more like what you would do if you were modeling this problem in SQL and decided to normalize the data into an additional table: Each enum now points to a static "program" containing various true-false properties - e.g. trucks are large, cars are not. Instead of checking the enum directly across all business logic, you can use it to look up an associated property. Subsequently, your test doesn't have to be updated to account for new types of vehicles.
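A minimal sketch of that lookup-table idea, to make it concrete. The enum values, the `VehicleProps` struct and the `needs_wide_lane` helper are all made up for illustration and are not taken from the article; the wheel counts are placeholders.

```cpp
#include <cstddef>

// Hypothetical vehicle kinds; the names are illustrative, not the article's.
enum class VehicleType { Car, Truck, Motorcycle, Count };

// One row of static properties per enum value.
struct VehicleProps {
    const char* name;
    bool is_large;
    int wheel_count;
};

// The single place that has to change when a new vehicle type is added.
constexpr VehicleProps kVehicleProps[] = {
    {"car", false, 4},
    {"truck", true, 18},
    {"motorcycle", false, 2},
};

// Fails the build if the enum and the table drift apart.
static_assert(sizeof(kVehicleProps) / sizeof(kVehicleProps[0]) ==
                  static_cast<std::size_t>(VehicleType::Count),
              "every VehicleType needs a row in kVehicleProps");

constexpr const VehicleProps& props(VehicleType t) {
    return kVehicleProps[static_cast<std::size_t>(t)];
}

// Business logic asks about the property, never about a specific enum value.
constexpr bool needs_wide_lane(VehicleType t) {
    return props(t).is_large;
}
```

Adding a new vehicle means adding one enumerator and one table row; `needs_wide_lane` and its tests stay untouched.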

Algebraic data types (such as Haxe's enum) that can do a compile-time coverage check and allow parameterization are another way to attack this problem.

I was nodding in agreement throughout the article until the solution was presented, or at least the implementation of it. Maybe it's truly genius and I am just not understanding it properly, but the fact that I had to go over it multiple times to grasp wtf is going on isn't exactly a good sign, I think.

The idea behind the solution is definitely ok: you define a type with a bunch of immutable properties describing it. But surely there must be more elegant/easier to read/easier to understand ways to do this than to have a bunch of macros/includes inside enums and whatnot? Moreover the TUPLE macro is declared twice with the same arguments (and then the arguments are repeated once more) so that means extending the type has to be done in 2 places which is exactly one of the things warned about earlier.

This is exactly why I am really happy to be working in a language which doesn't support macros. Are they performant? Yes. But they absolutely ruin code readability and make debugging harder also (as you're stepping through "code" which only exists post-compile).

I felt like the OP made a great point about enums and I agree.

They do have downsides, but you gloss over the main reason to use macros; they're an extremely powerful tool.

Meh. People are expensive, compute is cheap. If code is twice as hard to debug, you'd better have a darn good cost-saving reason to be working in an inefficient language like C/C++.

Macros really aren't about performance, though (or at least they shouldn't be; inlining and/or constexpr are a better solution to that problem in basically every respect). They're either about simple, potentially extremely unsafe, syntax extensions (as in the article) or about using metadata that wouldn't otherwise be available (e.g. filenames/line numbers in the assert macro).

Not that I think macros are a good thing (99% of the time they're best avoided), but dismissing them because 'performance isn't important' is basically a non sequitur.
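For the metadata case, here is a rough sketch of why a macro is involved at all. `CHECK` and `report_failure` are hypothetical names, not the standard `assert`; the point is only that `__FILE__`, `__LINE__` and the expression text are captured at the place where the macro is expanded.

```cpp
#include <cstdio>

// Hypothetical helper: reports where a failed check was written.
inline bool report_failure(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "%s:%d: check failed: %s\n", file, line, expr);
    return false;
}

// The macro exists only to capture the caller's __FILE__, __LINE__ and the
// expression text; a plain function would see its own location instead.
#define CHECK(cond) ((cond) ? true : report_failure(#cond, __FILE__, __LINE__))
```

`CHECK(x > 0)` evaluates to `true` when the condition holds; otherwise it prints the file, line and the text `x > 0` to stderr and evaluates to `false`.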

Plenty of languages provide filenames/line numbers in assertions without using macros.

Furthermore, `__LINE__` and `__FILE__` style macros are a pretty safe subset of macros that aren't really representative of the problems with macros. Compare:

    #include <stdlib.h>
    #include <stdio.h>
    #include "my_macro_library.h"

    int main(void)
    {
        int x = 0;
        printf("%i", increment(x));

        printf("%i", __LINE__);

        return EXIT_SUCCESS;
    }

There are a ton of different implementations of an `increment` macro which are going to cause undefined behavior here. Contrast with the `__LINE__` macro: it's implemented by the compiler and has unsurprising, well-defined behavior.
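A macro doesn't even need to reach undefined behavior to surprise the caller. A small sketch, assuming a hypothetical `INCREMENT_MACRO` of the kind a header like `my_macro_library.h` might define, next to a function doing the same job:

```cpp
// Hypothetical macro, standing in for something a macro library might define.
#define INCREMENT_MACRO(x) x + 1

// The function equivalent: the argument is evaluated once, as a value.
constexpr int increment_fn(int x) { return x + 1; }

// Expands to 2 * 3 + 1, so the multiplication binds first.
constexpr int via_macro = 2 * INCREMENT_MACRO(3);

// The call is a single expression, so this is 2 * (3 + 1).
constexpr int via_function = 2 * increment_fn(3);
```

The caller cannot tell which result to expect without reading the macro's definition, which is the readability cost being argued about.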

Well, that's not always true (thinking of the numerous times I have had to optimize code written by people with this attitude), but I wasn't speaking about performance. Macros allow you to do things in e.g. C which cannot be done via any other mechanism in the language.

> Macros allow you to do things in e.g. C which cannot be done via any other mechanism in the language.

This isn't really an argument in favor of macros so much as an argument against minimally-featured languages like C.

To be clear, I'm not against using macros in every case. I think a lot of the danger of macros in C comes from C's syntax: macros become a much more powerful tool when you're using a homoiconic language like Lisp. There are a few different projects which have attempted to write languages with C-like semantics and Lisp-like syntax, but sadly none have gained traction.

2 places is better than N places. At the cost of slightly heavier enum declarations (2 places), you get rid of tests throughout the code for particular enum values.

Also consider inheritance and enums. Derived classes have to jump through hoops to extend the enum space, e.g. enum { tractor = truck+1, crane, bulldozer }; where truck was the last enum value in the base class.

With the tuple method, you just add some more tuples.
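Roughly what those hoops look like, as a sketch. The class names and the `Kind`/`ExtraKind` split are assumptions made up for the example:

```cpp
// Hypothetical base class: its vehicle kinds are an unscoped enum.
struct Vehicle {
    enum Kind { car, truck };  // truck is the last base value
};

// The derived class cannot reopen Kind, so it declares a second enum that
// starts where the base one stopped.
struct ConstructionVehicle : Vehicle {
    enum ExtraKind { tractor = truck + 1, crane, bulldozer };
};

// Code that stores "any kind" has to fall back to a plain int, and nothing
// stops a later addition to Kind from colliding with ExtraKind.
constexpr int any_kind = ConstructionVehicle::crane;
```

If someone later appends a value after `truck` in the base enum, it silently gets the same number as `tractor`.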

Yes, of course inheritance (if you can even call it that) is problematic, and indeed one point of the solution, if not the whole point, is that adding an entry means doing it in one place only. But my question remains: does it have to be done with a macro? I threw something together, and this seems sort of equivalent but (imo) with less of the ugliness:

    #include <map>
    #include <string>
    #include <tuple>

    enum class VehicleType { Car, Bicycle };

    using tuple = std::tuple< std::string, int, int >;
    using map = std::map< VehicleType, tuple >;

    const map VehicleTypes = {
      { VehicleType::Car, tuple( "car", 4, 10 ) },
      { VehicleType::Bicycle, tuple( "bicycle", 3, 20 ) }
    };

    // Returns the entry for the given type, or a default for unknown types.
    const tuple& get_it( VehicleType type ) {
      static tuple def( "unknown", -1, -1 );
      auto it = VehicleTypes.find( type );
      return it == VehicleTypes.end() ? def : it->second;
    }

    std::string getVehicleTypeName( VehicleType type ) {
      return std::get< 0 >( get_it( type ) );
    }

So in other words, only use Enums to declare a set of 1-dimensional items that can't be broken down further (i.e., item is a property, not an object). Otherwise, you are going to run into trouble.

Isn't that kind of the definition of an Enum vs. Struct vs. Class/Object? And the problem lies with what enums are being used for in some cases, rather than enums being a bad practice?

Depends on the language. Java enums can hold arbitrary attributes (the "cure" is built-in), and algebraic data types go significantly further as variants can have completely different fields.

This is a good look at the dangers of diffusing semantic knowledge in general. His approach is much more true to OOP paradigms.

In Java, you can take this a step further, and use enums basically as enumerated instances of a class. Example:

    public enum Stuff {
        Stuff1(param1), Stuff2(param2);

        private final Object param;

        private Stuff(Object param) {
            this.param = param;
        }

        public Object getParam() {
            return this.param;
        }
    }

Now you can call Stuff.Stuff1.getParam();

> In Java, you can take this a step further, and use enums basically as enumerated instances of a class.

Isn't it half the point of Java's enums? (The other half being type-safe enums.)

The solution presented here basically consists of implementing a VehicleType class with a bunch of static instances without using the class syntax. I can't decide if this was a deliberate decision from the author or if he just didn't realize this.
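For comparison, a sketch of what that could look like if written with class syntax. This is not the article's code; the class layout and property names are assumptions for illustration.

```cpp
// Hypothetical class; the article itself uses macros and tuples instead.
class VehicleType {
public:
    // The only instances that can ever exist.
    static const VehicleType Car;
    static const VehicleType Truck;

    constexpr const char* name() const { return name_; }
    constexpr bool is_large() const { return is_large_; }

    // No copies: every VehicleType is one of the named instances above.
    VehicleType(const VehicleType&) = delete;
    VehicleType& operator=(const VehicleType&) = delete;

private:
    // Private constructor keeps the set of instances closed.
    constexpr VehicleType(const char* name, bool is_large)
        : name_(name), is_large_(is_large) {}

    const char* name_;
    bool is_large_;
};

// Definitions of the static instances; adding a vehicle means adding one
// declaration above and one line here.
constexpr VehicleType VehicleType::Car{"car", false};
constexpr VehicleType VehicleType::Truck{"truck", true};
```

Callers write `VehicleType::Truck.is_large()` and never switch on a raw value, which is the same effect the tuple table is after.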

Maybe to be C-compatible?

I guess the way to cure this disease is just to use enums for what they are supposed to be used for, which is storing a state. That's it.

This article really lacks OOP concepts. In the first example the enum is a property on the 'vehicle' object; it would be way better to add all the properties needed to that class instead of creating a weird 'VehicleTypeTuple' struct...

In my experience, most of the evils of enums can be traced back to the lack of exhaustive pattern matching in C++/Java. When programming in Rust or Haskell, enums are used much more often, and the exhaustiveness enforcement by the compiler means that these kinds of use cases tend to be encapsulated inside helper functions, which end up being equivalent to the members of the "tuples" presented by the author. Any additions or changes to the enum will automatically make the compiler force you to update these functions.
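C++17 can approximate this with `std::variant` and `std::visit`, although it is clunkier than a Rust or Haskell `match`. A sketch under that assumption; the `Car`/`Truck`/`Bicycle` structs and the `is_large` helper are invented for the example:

```cpp
#include <variant>

// Hypothetical variants; each alternative can carry its own fields.
struct Car { int seats; };
struct Truck { double payload_tons; };
struct Bicycle {};

using Vehicle = std::variant<Car, Truck, Bicycle>;

// Helper that merges several lambdas into one visitor (a common C++17 idiom).
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

// If a new alternative is added to Vehicle and no lambda accepts it,
// std::visit fails to compile, which plays the role of an exhaustiveness check.
constexpr bool is_large(const Vehicle& v) {
    return std::visit(overloaded{
        [](const Car&) { return false; },
        [](const Truck&) { return true; },
        [](const Bicycle&) { return false; },
    }, v);
}
```

The helper function ends up holding the same knowledge as one column of the article's tuples, but the compiler, rather than a test, notices a missing case.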

The title is a bit misleading; maybe it should say 'plain enums'. In Java, for instance, the solutions can all be encapsulated within the language's 'enum' type.

I second that; Java has solutions for all problems described by this post.

- constructors, private fields, methods, addressing the "separations of concerns"

- how about Enum.valueOf(enumClass, text.toUpperCase()) to avoid the "classic shadow array of string literals"?

Here is an example of what can be achieved by Java enums

  enum Type {
    MAMMAL(null),
    DOG(MAMMAL) {
      public void makeNoise() { /* ... */ }
    },
    CAT(MAMMAL) {
      public void makeNoise() { System.out.println("Meow"); }
    };

    private Type parent;
    Type(Type parent) {
      this.parent = parent;
    }
    public boolean isA(Type type) {
      if(this == type) return true;
      if(parent != null) return parent.isA(type);
      return false;
    }
    public void makeNoise() { }
  }

  assert Type.DOG.isA(Type.MAMMAL) == true;
  assert Type.CAT.isA(Type.DOG) == false;
  Type.CAT.makeNoise(); // "Meow"

This demonstrates a different problem: enums are closed. If I want to add an Owl, I can't do it without editing Type. This might break the API contract (if you own the library) or not be possible (if it's a third-party library).

There is very little reason to use an enum over classes in this example. It's rare that I've seen a Java class style enum that wouldn't have been better served by classes (including some that I've written myself and regretted!)

The main point of an enumerated type is that it should be closed. You control all instances of the type, so that you and the compiler can benefit from the fact that it's closed.

If the fact that enums are closed ever presents a problem, then it's a good indication that you shouldn't be using enums. That's not a problem with enums, it's just a bad design decision that needs to be reversed.

Agreed that Vehicle is a poor choice for an enum, because it's something that isn't really closed. But if you're modelling something like states of a TCP connection, there's nothing dangerous about it being closed.

For this example, I agree that it makes very little sense. The question you really need to ask yourself whenever you are making an `enum` is: Does it make sense that this type is restricted to a predefined selection of instances? There are times when the legitimate answer to this question is "yes", and in those cases go ahead and make an `enum` and enjoy its benefits.

Historically, I've used rich enums in Java to be able to use switch statements to reason about a system state and as poor man's type pattern matching. Sometimes the switch is handy for making a state machine clearer, and sometimes it makes it easier to explain the relationships of these states as an ordered list. Another important use of enums is if you're trying to create a rich set of annotations that take parameters, which would force you into dropping a lot of the Java OOP typing available. I can't have a closure or even an anonymous class as an argument to parameterized annotations - and this is definitely by design.

I got bit pretty hard before trying to design an annotation-driven convention for a library I was writing (I had been writing a lot of Python for a while) and realized I'd have to squash a bunch of my carefully arranged classes into an enum. You run into similar problems when marshalling your objects to, say, a Thrift structure definition.

This does not read to me like good advice from an experienced programmer.

It has been a while since I wrote Java, but if my memory is correct, Java has a much cleaner solution to this. In Java an enum is just a class with a fixed number of instances, each with a constant name. You can make the instances private and expose all of your behavior through the interface of the class, using a "factory" method to find the correct instance (I say factory in quotes because it doesn't create the instance, it just finds the correct instance, which already exists). There's nothing stopping you from doing a switch-case against the instances, but at least if someone does that it will be limited to the enum class. And with some discipline you can define all the properties in the enum's constructor so no switch-cases are necessary. This mitigates all the OP's concerns, I think.

In C# you can use extension methods to keep all your switch-cases in one place, but it takes a little more work to enforce this with private visibility. I haven't thought through this completely, but my first attempt might be to replace the enum with a class with a private constructor and expose public static readonly instances, or maybe even hide the instances behind a factory method. I haven't tried this in practice, however, so I'm not sure what the implications of that approach are.

Why not simple subclasses? E.g. class Truck extends Vehicle. From the article, the argument would be that "everything in one file" is desired, so you can e.g. parse it from or for other languages.

There are also the downsides of the table approach. What does avg_km_per_liter mean for e-cars? With multiple classes they can have different fields.

Everything is a tradeoff.

This is the standard style of programming in functional programming languages. While it isn't object-oriented (being its dual), I don't see why it is so bad (I wouldn't like a single `Vehicle` class to have methods handling lane use in each state).

One of the antipatterns I see in db schemas is over-use of just one field to indicate status or action, if not both. It's the same basic deal of overloading enums as the article demonstrated -- every instance increases your pain of complexity.

So, I guess, that's to say, this is a little bigger than just enums.

Shouldn't this all be in a config file or vehicle database in the first place and not hardcoded?
