
> And one that I can't get confirmation on if it's in 7 or got pushed, index-access for List and Maps

I just tried it out, it didn't get added to 7.

> String-in-switch (NOTE: check on performance implications of this for tight loops)

I haven't checked the performance, but I tried decompiling a String switch statement:

  String test = "asdf";
  switch (test) {
    case "sss":
      System.out.println("sss");
      break;
    case "asdf":
      System.out.println("asdf");
      break;
  }

That decompiles to:

  switch(test.hashCode()) {
    case 114195:
      if(test.equals("sss"))
        byte0 = 0;
      break;
    case 3003444:
      if(test.equals("asdf"))
        byte0 = 1;
      break;
  }
  switch(byte0) {
    case 0: // '\0'
      System.out.println("sss");
      break;
    case 1: // '\001'
      System.out.println("asdf");
      break;
  }

So it's one switch on the String's hash code, with calls to equals to verify that the string really is a match (and an if/else if/else block when several strings in the switch share a hash code). The result is assigned to a temp variable, and a second switch on that temp variable executes your code.
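
To make the collision case concrete, here is a hand-written approximation of that two-stage pattern. It is a sketch, not compiler output: the class and method names are made up, and "Aa"/"BB" were picked only because they happen to share the hash code 2112.

```java
// Hand-written approximation of the two-stage String switch.
// StringSwitchSketch, caseIndex and dispatch are hypothetical names.
final class StringSwitchSketch {

    // Stage 1: map the string to a small case index via hashCode() + equals().
    static int caseIndex(String s) {
        int index = -1; // -1 means "no case matched"
        switch (s.hashCode()) {
            case 2112: // "Aa" and "BB" both hash to 2112
                if (s.equals("Aa")) {
                    index = 0;
                } else if (s.equals("BB")) {
                    index = 1;
                }
                break;
            case 3003444: // "asdf".hashCode()
                if (s.equals("asdf")) {
                    index = 2;
                }
                break;
        }
        return index;
    }

    // Stage 2: an ordinary int switch selects the case body.
    static String dispatch(String s) {
        switch (caseIndex(s)) {
            case 0:  return "matched Aa";
            case 1:  return "matched BB";
            case 2:  return "matched asdf";
            default: return "no match";
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("Aa"));
        System.out.println(dispatch("BB"));
        System.out.println(dispatch("asdf"));
        System.out.println(dispatch("zzz"));
    }
}
```

The equals calls are what keep a colliding or unrelated string with the same hash code from landing in the wrong case.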

It looks like it would be fairly quick.




Really appreciate you going through the trouble to do that and post the results so we don't need any hand-waving.

This looks the same as the original proposal -- using the hashCode and final verification via equals.

So in the typical case you have the one-time calculation of the String's hashCode (which is then cached by java.lang.String) and the two method calls, hashCode() and equals().

.equals() is the only thing that could get away from you performance-wise, because unless string1 == string2 or the two strings have different lengths, you fall into a char-by-char comparison loop.

That could be a shitty surprise if you have a bunch of constants like "SAVE, RMVE, DELT, INST, UPDT" and don't realize what you are causing to happen under the covers.
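
For reference, the order of checks being described looks roughly like this. It is a simplified sketch written for this thread, not the JDK source, and sketchEquals is a made-up name.

```java
// Simplified sketch of the check order in a String comparison.
// Not the real java.lang.String.equals; names are hypothetical.
final class EqualsSketch {

    static boolean sketchEquals(String a, String b) {
        if (a == b) {
            return true;  // same reference: cheapest exit
        }
        if (b == null || a.length() != b.length()) {
            return false; // different lengths: no loop needed
        }
        // Same length, different objects: compare char by char.
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // new String(...) forces a distinct object so the loop actually runs.
        System.out.println(sketchEquals("SAVE", new String("SAVE")));
        System.out.println(sketchEquals("SAVE", "RMVE"));
        System.out.println(sketchEquals("SAVE", "SAVED"));
    }
}
```

With four-letter constants like the ones above, the length check never short-circuits, so any comparison that gets past the reference check ends up in the loop.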

I'd be curious (now) what actually happens with using Enums in switch statements to see what kind of hidden code is actually getting executed.

## ASIDE: An interesting alternative to switch statements with Enums: http://francisoud.blogspot.com/2008/02/better-way-to-use-jav...

Seems really atypical though.

## UPDATE: It looks like using Enums ends up getting translated to:

  switch(enum.ordinal())

calls, followed by goto statements to the case labels. So enums should be practically as performant as constant numeric values, except for the method call.

If the JVM is smart enough to replace the calls to 'ordinal' with the constant return value, then it should be right on par with using numerical constants, which would be sweet and would make the strategy outlined in that link a nice, performant alternative to normal Enum use.


The .equals() still shouldn't be much of a problem. If you've got a bunch of constants like you gave, they will probably all have different hashcodes so you don't have to compare against all of them.

I checked what happens when you use an enum, and it generates some funky code. It creates an anonymous inner class with a static initializer for an array of ints. It uses the ordinal value to assign a sequential number starting with one. Then the switch statement is on intArray[theEnum.ordinal()], and the case statements start with 1. Here's the code:

  class Test2{
    public static void main(String[] args) {
      MyEnum en = MyEnum.ONE;
      switch (en) {
        case ONE:
          System.out.println("one"); break;
        case TWO:
          System.out.println("two"); break;
        default:
          System.out.println("default");
      }
    }
  }

  enum MyEnum {
    ONE, TWO, THREE;
  }

Decompiled (after being compiled with jdk1.7.0):

    public static void main(String args[])
    {
        MyEnum myenum = MyEnum.ONE;
        static class _cls1
        {

            static final int $SwitchMap$MyEnum[];

            static
            {
                $SwitchMap$MyEnum = new int[MyEnum.values().length];
                try
                {
                    $SwitchMap$MyEnum[MyEnum.ONE.ordinal()] = 1;
                }
                catch(NoSuchFieldError nosuchfielderror) { }
                try
                {
                    $SwitchMap$MyEnum[MyEnum.TWO.ordinal()] = 2;
                }
                catch(NoSuchFieldError nosuchfielderror1) { }
            }
        }

        switch(_cls1.$SwitchMap$MyEnum[myenum.ordinal()])
        {
        case 1: // '\001'
            System.out.println("one");
            break;

        case 2: // '\002'
            System.out.println("two");
            break;

        default:
            System.out.println("default");
            break;
        }
    }

I wonder why it doesn't just switch on the ordinal directly. It's also worth noting that it has to allocate an int array of length MyEnum.values().length. This means that if you have an enum with a lot of values, you're wasting a lot of memory (especially since it creates this array in every class where you switch on that enum). I tried with an enum with 40 values, and it didn't switch to a map. Maybe there's some higher threshold where it would switch to a map instead (but again, I don't understand why it wouldn't switch on ordinal() directly).


Good point about equals. I'm just thinking that in a tight render loop (like in a game engine or something equally sensitive), even with the different hash codes, you are dropping through to a char-by-char comparison on the off chance that your two strings are the same length.

I'll be the first to admit those are pretty slim odds, but it's always fun to dig up these little performance things hidden in the JVM and store them away for later use in some interesting project.

Nice decompile with the enums... no idea what's with the array either; another hidden gem to at least be aware of.

This is a lot like when autoboxing came out. New Java people thought int i and Integer ii were the same when used in different settings, not realizing that under the covers calls to Integer.valueOf(i) and ii.intValue() are getting generated.

It's great if valueOf is returning a cached value (-128 to 127) but beyond that you are just object-creating all over the place.
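
A quick way to see the boxing cache at work is to compare boxed references. This is a small sketch with made-up names; the language spec guarantees caching for values from -128 to 127, while what happens outside that range depends on the JVM and its settings, so no result is claimed for the second call.

```java
// Demonstrates autoboxing going through Integer.valueOf and its cache.
// BoxingCacheDemo and sameBoxedInstance are hypothetical names.
final class BoxingCacheDemo {

    static boolean sameBoxedInstance(int value) {
        Integer a = value; // autoboxing: compiles to Integer.valueOf(value)
        Integer b = value;
        return a == b;     // reference comparison, on purpose
    }

    public static void main(String[] args) {
        // Inside the guaranteed cache range: both boxes are the same object.
        System.out.println("127:  " + sameBoxedInstance(127));
        // Outside it: usually two separate objects on a default JVM.
        System.out.println("1000: " + sameBoxedInstance(1000));
    }
}
```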

Working on SJXP (an XML parser layer that sits on top of Pull Parser to make it like XPath, but super fast) I did some HPROF profiling and found my use of path hashcode lookups in a Map<Integer, String> was generating millions of Objects.

So I moved to an Integer cache that holds the Integer object generated from the hashCode() and re-uses it as needed.
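
One way such a cache could look is sketched below. This is not SJXP's actual code: PathKey, key() and the sample path are all invented for illustration. The idea is just to box the hash code once per path and hand back the same Integer on every lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: box the path's hash code once and reuse the Integer.
final class PathKey {
    private final String path;
    private Integer boxedHash; // created on first use, then reused

    PathKey(String path) {
        this.path = path;
    }

    Integer key() {
        if (boxedHash == null) {
            boxedHash = Integer.valueOf(path.hashCode());
        }
        return boxedHash; // no new Integer after the first call
    }

    public static void main(String[] args) {
        Map<Integer, String> handlers = new HashMap<>();
        PathKey title = new PathKey("/rss/channel/title"); // made-up path
        handlers.put(title.key(), "title handler");

        // Repeated lookups reuse the same Integer instead of boxing each time.
        for (int i = 0; i < 3; i++) {
            System.out.println(handlers.get(title.key()));
        }
    }
}
```

Passing path.hashCode() straight to map.get() would box a fresh Integer on most calls, which is the kind of garbage a profiler would report.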

Ended up dropping the overhead (CPU and memory) of the library below everything in native XPP to where it adds almost no overhead.

Anyway -- thanks for diving deep with me on that, we have two new tricks up our sleeves ;)


> I wonder why it doesn't just switch on the ordinal directly.

I don't know how JVM bytecode works, or if it's even possible to assign discontiguous ordinal values, but in C compiled to x86 assembly language, a switch statement on (mostly-)contiguous values can be converted into a jump table. If there are large gaps in the switch values, either you have a lot of wasted space in the jump table, you have multiple jump tables, or you fall back to if/elseif/etc-style code.

It would make more sense to me to assign the contiguous integer values during the construction of the Enum, but maybe that doesn't work when Java 7 code tries to use an Enum compiled with Java 6 or earlier.


> I don't know how JVM bytecode works, or if it's even possible to assign discontiguous ordinal values

That was my first thought as well, but I double-checked and the ordinal() method is final, so you're guaranteed not to get anything other than contiguous values for your enums.

> in C compiled to x86 assembly language, a switch statement on (mostly-)contiguous values can be converted into a jump table

Ah, this is probably the intention. The JVM can probably optimize the switch. I just checked it, and when there are multiple switch statements in a class that each use only a few values of an enum, it tries to make them all sequential in the switch; that's why it uses the array. I guess it's just a mistake (or lazy coding) that it always makes the array size TheEnum.values().length rather than how many it's actually going to use.
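
Written out by hand, that remapping looks something like the sketch below. The enum and class names are made up, and this is an approximation of the decompiled pattern rather than real compiler output. One thing it suggests, as an inference and not something confirmed here: because the array is indexed by ordinal(), it needs a slot for every constant even when the switch only names a few.

```java
// Hypothetical enum with more constants than the switch uses.
enum Shade { RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET }

// Hand-written approximation of the per-class switch map.
final class SwitchMapSketch {

    // One slot per constant, because lookups are indexed by ordinal().
    static final int[] SWITCH_MAP = new int[Shade.values().length];

    static {
        // Only the constants named in the switch get a dense label (1, 2, ...).
        SWITCH_MAP[Shade.GREEN.ordinal()] = 1;
        SWITCH_MAP[Shade.VIOLET.ordinal()] = 2;
        // Every other slot stays 0 and falls through to default.
    }

    static String describe(Shade s) {
        switch (SWITCH_MAP[s.ordinal()]) {
            case 1:  return "green";
            case 2:  return "violet";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        for (Shade s : Shade.values()) {
            System.out.println(s + " -> " + describe(s));
        }
    }
}
```

The ordinals used here (3 and 6) are sparse, but the case labels the switch sees are 1 and 2.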



