
Java Reflection, 1000x Faster - norswap
http://norswap.com/fast-java-reflection/
======
btown
> The simple approach is to simply add a children() method to the interface
> and implement it in every kind of node. Very tedious, and boilerplaty as
> hell.

The best approach without boilerplate would be to generate this at compile
time, not runtime. You could use an approach like in
[http://hannesdorfmann.com/annotation-processing/annotationpr...](http://hannesdorfmann.com/annotation-processing/annotationprocessing101)
to create at compile-time, using
reflection, a ChildWalker class that accesses the relevant properties for
every class annotated with @Node. At runtime, the JIT should get this down to
a vtable lookup and field lookups - just about as optimized as you can get.

EDIT: Always remember [https://xkcd.com/1205/](https://xkcd.com/1205/) - if
it would take 5 seconds to write each children() method, and you write no more
than 5 Node classes per day (amortized), then you shouldn't spend more than 12
hours writing the code for this blog post ;) Unless, of course, that was the
point all along!

~~~
specialist
Developing, debugging, and using an annotation processor is the opposite of
productive. Not recommended.

~~~
wzsddtc
It really depends on the annotation processor and what kind of code you are
trying to generate. In the author's case it shouldn't take that long nor would
it be hard to develop and debug.

~~~
flukus
I think it depends more on your IDE playing ball with the code generator and
custom build steps. Most do an awful job.

I'd love one that I could just point to a build tool/script instead of
creating its own.

~~~
quickthrower2
Also, I've worked at places with this kind of magic and no induction,
documentation, or time for anyone to explain anything. The more hard-to-
discover, behind-the-scenes stuff you implement, the better the docs and
onboarding need to be. It can't just be done to fulfill an architecture
astronaut fantasy.

~~~
flukus
I don't care so much about the documentation, as long as it's part of the
build script (its own documentation) and not checked into source control. By
not checking it into source control it won't be edited, there won't be a mix
of generated and custom code. By being part of the build process you avoid the
generation tool and scripts being lost to the sands of time.

~~~
quickthrower2
Personally, I like the build process to be committed, for build
reproducibility. It can be in a separate, more secure repository though.

------
jtmarmon
Seems to me this is approaching this totally the wrong way. First off, the
current implementation is just looping through all the methods of the
implementing class and the first one that matches a Node or [Node] return type
is supposedly its children? That seems extremely brittle.

The fact that you have 100 classes implementing this is a good sign you should
turn the design on its head. Why not just have a single tree structure built
using generics?
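For illustration, a minimal sketch of what a single generic tree could look like. The `Tree` class and its method names are made up here; nothing like it appears in the article, and whether it fits the author's node types is an open question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical generic tree: one class for every node kind, payload in T.
final class Tree<T> {
    private final T value;
    private final List<Tree<T>> children = new ArrayList<>();

    Tree(T value) { this.value = value; }

    // Returns this so calls can be chained while building a tree.
    Tree<T> add(Tree<T> child) {
        children.add(child);
        return this;
    }

    T value() { return value; }

    List<Tree<T>> children() { return Collections.unmodifiableList(children); }

    // Traversal is written once, with no per-type code and no reflection.
    int size() {
        int n = 1;
        for (Tree<T> child : children) n += child.size();
        return n;
    }
}
```

The trade-off is that the node kind moves into the payload `T` instead of the Java type, which may or may not be acceptable for a typed AST.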

edit: also I would point out that you're not really making reflection any
faster, just your code

~~~
dragandj
On top of that, he acknowledges that there is a much, much simpler solution,
known from the dawn of Java, and this is to have a polymorphic children()
method.

But that would (gasp!) require adding those implementations. Which is tedious
(it is not with a bit of inheritance or some other technique but ok). OK, so,
it requires a bit of work. But those methods would take a few (or a dozen)
nanoseconds to dispatch; you can't go faster than that. What's the
alternative? To build your brittle infrastructure? Hmmm, I must be missing
something...
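A rough sketch of the polymorphic version, with "a bit of inheritance" taking care of the leaves. All names here (`TreeNode`, `LeafNode`, `IntLiteral`, `BinaryOp`, `TreeWalk`) are hypothetical stand-ins, not the article's types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

interface TreeNode {
    List<TreeNode> children();
}

// Leaves inherit the empty answer, so only inner nodes have to write anything.
abstract class LeafNode implements TreeNode {
    @Override public List<TreeNode> children() { return Collections.emptyList(); }
}

final class IntLiteral extends LeafNode {
    final int value;
    IntLiteral(int value) { this.value = value; }
}

final class BinaryOp implements TreeNode {
    final TreeNode left, right;
    BinaryOp(TreeNode left, TreeNode right) { this.left = left; this.right = right; }
    @Override public List<TreeNode> children() { return Arrays.asList(left, right); }
}

final class TreeWalk {
    // Ordinary virtual dispatch on children(); nothing is looked up by name.
    static int count(TreeNode node) {
        int n = 1;
        for (TreeNode child : node.children()) n += count(child);
        return n;
    }
}
```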

~~~
SOLAR_FIELDS
This is assuming you have full control over all the implementations. I see
this frequently in large enterprise Java and Scala code bases: you have a
generic library you built a while back that now has all sorts of users in all
sorts of places, and you may not even have access to their source code, only
jar access through Nexus or Artifactory. Then you need to do some operation
like the one OP describes, but you can't expect everyone to commit your code
immediately, or even at all. Thus OP is left with the only reasonable
solution, the one detailed.

~~~
dtech
So your complaint is that it is hard for users if you make a breaking change
in your library, big surprise. This is why breaking changes in libraries
should only be done with utmost consideration, preferably after first
deprecating the usage.

And if you make a breaking change to the interface, you better be damn well
breaking it at compile time and not hiding it with reflection shenanigans.

In the example of the OP, it's easy to think of an example that would totally
be broken by the author's reflection approach, while the user has no warning
at compile time and no reasonable way of knowing that what he's doing isn't
allowed:

    
    
        class MyWeirdNode extends Node {
            private final int i = 1;
            public IntNode asIntNode() {
                return new IntNode(i);
            }
        }

~~~
eecc
That's where client provided implicit converters in Scala are actually useful
magic

------
hota_mazi
If you are using reflection and encountering performance problems, the correct
approach is pretty much always to write an annotation processor and generate
that code instead.

~~~
avodonosov
Or generate code without using annotation processor

~~~
specialist
ReflectASM is the fastest, most general purpose reflection solution. It uses
runtime bytecode generation.

[https://github.com/EsotericSoftware/reflectasm](https://github.com/EsotericSoftware/reflectasm)

When I needed something more custom, I've done the bytecode strategy a few
times, using ASM library or equiv. But the cost of change (brittleness,
aggravation) is pretty high.

Now I just write Java source and use the JDK's built-in javac to generate the
bytecodes.

Some people like StringTemplate.
[http://www.stringtemplate.org](http://www.stringtemplate.org) For whatever
reason, I still prefer Velocity. Template languages are mostly evil, and
variations of terrible, so I just stick to what I know.

Annotations are never the correct answer. For any use case, purpose.

Edit: Whoops. May have spoken too soon. JDK 8's MethodHandle and some other
features (inlining simple getters) may have mooted ReflectASM.
[https://github.com/EsotericSoftware/reflectasm/issues/23](https://github.com/EsotericSoftware/reflectasm/issues/23)
I may have to retool some of my older code.

~~~
wtetzner
JavaPoet is a relatively nice library for generating Java sources:
[https://github.com/square/javapoet](https://github.com/square/javapoet)

~~~
specialist
Thanks. I tried JavaPoet, once upon a time.

I'm just a simple bear. I can't keep all the details straight.

One strategy I use to cope is round tripping, as a way to confirm I get the
expected results.

My current approach to code generation: painstakingly hand craft the desired
source code output. Then turn that into a template (progressive refinement).
Then compare (diff) generated output to manual. Iterate. Many times that means
adjusting target to make it easier to generate.

Using bytecodes (class file) as the target output adds complexity and more
steps. I've been reversing bytecodes for a long time, manually using javap,
the decompilers (which are now quite good), and using ASM's code generator
plugin for Eclipse (super clever). But it's too fussy, compared to the Java
source code strategy.

------
kodablah
Didn't look too deep, but this might be due to the fact that call sites are
memoized for invoke calls as part of the invokedynamic improvements (but
really MethodHandle::invoke and MethodHandle::invokeExact are where the magic
is as is mentioned by the JVM spec IIRC).

Regardless, MethodHandle invocations are always going to be faster than
reflection Method invocations. A primary reason is that the parameters for
invoke and invokeExact are not subject to the same boxing and runtime-
validation requirements as reflection.
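To make the boxing point concrete, here is a small side-by-side sketch. `HandleDemo` and `Leaf` are made-up names, and nothing here measures speed; it only shows the two call shapes.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class HandleDemo {
    // Hypothetical class with an int getter, standing in for a node accessor.
    public static class Leaf {
        private final int value;
        public Leaf(int value) { this.value = value; }
        public int getValue() { return value; }
    }

    // Core reflection: invoke() takes Object arguments and returns Object,
    // so the int result comes back boxed as an Integer.
    static int viaReflection(Leaf leaf) {
        try {
            Method m = Leaf.class.getMethod("getValue");
            return (Integer) m.invoke(leaf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Method handle: access is checked once, when the handle is looked up.
    static final MethodHandle GET_VALUE;
    static {
        try {
            GET_VALUE = MethodHandles.lookup()
                .findVirtual(Leaf.class, "getValue", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // invokeExact is compiled against the exact signature (Leaf)int,
    // so neither the argument nor the result is boxed.
    static int viaHandle(Leaf leaf) {
        try {
            return (int) GET_VALUE.invokeExact(leaf);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Keeping the handle in a `static final` field is deliberate: as far as I know, that is what lets the JIT treat it as a constant.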

------
JackFr
> I had an interface (representing a tree node)

No you didn't. An interface representing a tree node would have a children()
method.

~~~
yipopov
Why? I see why it might be a good idea depending on what problem you are
trying to solve, but I don't see why anything has to have any references to
its children in order to be a tree node.

------
xenadu02
I assume the JIT isn't optimizing until the functions execute a certain number
of times. Given the speed difference maybe it isn't even natively compiling at
first and using the interpreter.

I don't know about Java but in .NET there's a call you can make to force the
dynamically generated lambda expression to be compiled immediately.

Of course if you're being really adventurous you can use IL emit to build a
function up from IL opcodes.

~~~
ShabbyDoo
Yes. The author didn't note his testing methodology. JMH (Java Microbenchmark
Harness) is the gold standard for executing Java benchmarking tests:

[http://openjdk.java.net/projects/code-tools/jmh/](http://openjdk.java.net/projects/code-tools/jmh/)

It takes care of details like JVM warm-ups, ensuring sufficient invocations
for JIT compilation to have occurred, etc.

~~~
nalllar
Although sometimes the performance when interpreted before warm-up is
important.

Testing with CompilerControl.Mode.EXCLUDE in JMH is still more reliable than
testing manually, when you don't know if it got compiled or not.

Interpreted code is a case where method handles should be much faster than the
generated accessor classes used by reflection - lambdas and method references
are faster than an inner class implementing a functional interface before jit
compilation. Afterwards, performance tends to be sameish.

------
joemag
In the past, I've used Introspector API [1] for property reflection. It's part
of the JDK, and provides a lot of the caching this article is talking about.
Doubt this would provide the same level of improvement as adding the direct
children() method, or even ASM based approaches, but it's certainly faster
than naked reflection.

[1]
[https://docs.oracle.com/javase/7/docs/api/java/beans/Introsp...](https://docs.oracle.com/javase/7/docs/api/java/beans/Introspector.html)
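
A minimal sketch of reading properties through the Introspector, for anyone who hasn't used it. `IntrospectorDemo`, `Pair` and `readProperties` are illustrative names only; the `java.beans` calls are the standard JDK ones.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class IntrospectorDemo {
    // Hypothetical bean; the Introspector picks up public getX()/isX() methods.
    public static class Pair {
        private final String left;
        private final String right;
        public Pair(String left, String right) { this.left = left; this.right = right; }
        public String getLeft() { return left; }
        public String getRight() { return right; }
    }

    // Reads every readable property of a bean into a name -> value map.
    // The Introspector caches BeanInfo per class, so the method scan is
    // not repeated on later calls for the same class.
    static Map<String, Object> readProperties(Object bean) {
        try {
            // Stop at Object so the inherited "class" property is skipped.
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            Map<String, Object> values = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                Method getter = pd.getReadMethod();
                if (getter != null) values.put(pd.getName(), getter.invoke(bean));
            }
            return values;
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that each read still goes through `Method.invoke`; what gets cached is the lookup, not the call.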

------
solomatov
There's another, and IMO, a better way to make Java reflection faster. It's to
disable checks on reflection calls. Just call setAccessible(true), and you
will notice substantial improvement.
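
Roughly what I mean, as a sketch (`AccessibleDemo` and `Node` are made-up names, and the size of the improvement is not measured here): look the member up once, call setAccessible(true) once, and reuse it.

```java
import java.lang.reflect.Field;

class AccessibleDemo {
    // Hypothetical class with a private field we want to read reflectively.
    static class Node {
        private final String name;
        Node(String name) { this.name = name; }
    }

    // Look the field up once and switch off the per-call access checks once.
    static final Field NAME;
    static {
        try {
            NAME = Node.class.getDeclaredField("name");
            NAME.setAccessible(true);
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String readName(Node node) {
        try {
            return (String) NAME.get(node);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

One caveat: on Java 9 and later, setAccessible can itself throw InaccessibleObjectException when the target lives in a module that does not open its package to you.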

~~~
dmux
Isn't setAccessible required before you can interact with fields?

~~~
solomatov
If they are private, yes.

------
agentgt
What I do for these situations is do the reflection as a unit test to confirm
that the hand written children() is correct.

When the unit test fails for a node type, it prints what children() should
be, and you just copy and paste that code in.

This is also nice for the case where nodes have some sort of special children().
You can make the unit test ignore those.
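
A stripped-down sketch of the check itself. Everything here (`ChildrenCheck`, `Ast`, `Num`, `Add`, `Neg`) is hypothetical, and the reflective rule (public no-arg methods returning a node) is just an assumption for the example. It returns a boolean rather than printing the suggested code.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ChildrenCheck {
    public interface Ast { List<Ast> children(); }

    public static final class Num implements Ast {
        @Override public List<Ast> children() { return Collections.emptyList(); }
    }

    public static final class Add implements Ast {
        private final Ast left, right;
        public Add(Ast left, Ast right) { this.left = left; this.right = right; }
        public Ast left() { return left; }
        public Ast right() { return right; }
        @Override public List<Ast> children() { return Arrays.asList(left, right); }
    }

    // Deliberately wrong: the hand-written children() forgets the operand.
    public static final class Neg implements Ast {
        private final Ast operand;
        public Neg(Ast operand) { this.operand = operand; }
        public Ast operand() { return operand; }
        @Override public List<Ast> children() { return Collections.emptyList(); }
    }

    // What reflection thinks the children are: results of the public no-arg
    // methods declared on the class whose return type is an Ast.
    static Set<Ast> reflectedChildren(Ast node) {
        Set<Ast> found = new HashSet<>();
        try {
            for (Method m : node.getClass().getDeclaredMethods()) {
                if (Modifier.isPublic(m.getModifiers())
                        && m.getParameterCount() == 0
                        && Ast.class.isAssignableFrom(m.getReturnType())) {
                    found.add((Ast) m.invoke(node));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    // The condition a unit test would assert for each node type.
    static boolean childrenMatch(Ast node) {
        return new HashSet<>(node.children()).equals(reflectedChildren(node));
    }
}
```

The reflection only ever runs in the test, so its cost and brittleness stay out of production code.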

------
fulafel
I wonder if this could be used by JVM languages that sometimes fall back to
reflection, like Clojure. Transparently profile the reflection calls and
recompile the hot ones?

~~~
divs1210
Clojure provides a var called *warn-on-reflection* that can be set to true
during dev, and all (or a vast majority of) reflective calls can be optimized
away by adding type hints wherever it warns. This is to say that Clojure devs
get rid of reflection instead of trying to make it faster.

~~~
fulafel
Sounds like doing it automatically, instead of requiring code changes, would
save a lot of work and speed up many programs where the programmers haven't
gone through the trouble of doing this.

------
agumonkey
I wonder if the clojure devs use these techniques. IIRC clojure relies a lot
on reflection.

~~~
divs1210
Clojure devs get rid of reflection by adding type hints where required. The
compiler has a flag to help with this, called *warn-on-reflection*.

------
DonHopkins
C# reflection is also slow, and even slower in Unity3D with the IL2CPP ahead-
of-time compiler, where there is no just-in-time compiler or
System.Reflection.Emit to dynamically generate code at runtime (which is
impossible and prohibited on iOS, for example).

Here are the limitations of the Mono runtime:

[https://developer.xamarin.com/guides/ios/advanced_topics/lim...](https://developer.xamarin.com/guides/ios/advanced_topics/limitations/)

However, there are ways of compiling the little snippets of the code you need
ahead of time, then plugging them together with delegates at runtime, which
avoids much of the overhead.

Much of the cost is the overhead of converting from unknown types to specific
types. So you can pre-compile loosely typed adaptor thunks that take generic
"object" arguments, convert them to the required type, and directly call
strongly typed delegates (which you can plug into the adaptor, so adaptors can
be used for any number of compatible delegates).

Check out how the following code uses FastInvoke, ToOpenDelegate,
Delegate.CreateDelegate, Expression.GetActionType, and Expression.GetFuncType.

It may be that some of these general techniques could be applied to Java.
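
As a guess at what the adaptor-thunk idea might look like in Java (this is not the FastInvoke API, and `Thunk` is a made-up name): a loosely typed wrapper takes an `Object`, casts it to the required type, and calls a strongly typed delegate that was compiled ahead of time, here a method reference.

```java
import java.util.function.Function;

// Loosely typed adaptor: takes an Object, converts it to the required type,
// then calls a strongly typed delegate. The same adaptor class can be
// plugged with any compatible delegate.
final class Thunk<T, R> {
    private final Class<T> argType;
    private final Function<T, R> typed;

    Thunk(Class<T> argType, Function<T, R> typed) {
        this.argType = argType;
        this.typed = typed;
    }

    // Throws ClassCastException if arg is not an instance of argType.
    Object call(Object arg) {
        return typed.apply(argType.cast(arg));
    }
}
```

Usage would be along the lines of `new Thunk<>(String.class, String::length).call("abcd")`, with callers holding only `Thunk` references and untyped arguments.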

Faster Invoke for reflected property access and method invocation with AOT
compilation:

[https://web-beta.archive.org/web/20120826100615/http://whydo...](https://web-beta.archive.org/web/20120826100615/http://whydoidoit.com/2012/04/18/faster-invoke-for-reflected-property-access-and-method-invocation-with-aot-compilation)

The bane of the iOS programmer's life, when working with reflection in Mono,
is that you can't go around making up new generic types to ensure that your
reflected properties and methods get called at decent speed. This is because
Mono on iOS is fully Ahead Of Time compiled and simply can't make up new stuff
as you go along. That, coupled with the dire performance of Invoke when using
reflected properties, led me to construct a helper class.

This works by registering a series of method signatures with the compiler, so
that they are available to code running on the device. In my tests property
access was 4.5x faster and method access with one parameter was 2.4x faster.
Not earth shattering but every little helps. If you knew what you wanted ahead
of time, then you could probably do a lot better. See here for info.

You have to register signatures inside each class I'm afraid. Nothing I can do
about that.

So to register a signature you use:

    
    
        static MyClass()
        {
             //All methods returning string can be accelerated
             DelegateSupport.RegisterFunctionType<MyClass, string>();         
             //All methods returning string and taking an int can be accelerated
             DelegateSupport.RegisterFunctionType<MyClass, int, string>();    
             //All methods returning void and taking a bool can be accelerated
             DelegateSupport.RegisterActionType<MyClass, bool>();             
        }
    

Then when you have a MethodInfo you use the extension method FastInvoke(object
target, params object[] parameters) to call it. FastInvoke will default to
using normal Invoke if you haven't accelerated a particular type.

    myObject.GetType().GetProperty("SomeProperty").GetGetMethod().FastInvoke(myObject);

    myObject.GetType().GetMethod("SomeMethod").FastInvoke(myObject, 1, 2);

You can download the source code for FastInvoke from here.

[https://web-beta.archive.org/web/20120826100615/http://www.w...](https://web-beta.archive.org/web/20120826100615/http://www.whydoidoit.net/Downloads/DelegateSupport.cs)

Newer version from unityserializer-ng:

[https://gitgud.io/TheSniperFan/unityserializer-ng/blob/maste...](https://gitgud.io/TheSniperFan/unityserializer-ng/blob/master/Assets/Plugins/OpenUnityTools/unityserializer-ng/Radical/System/Delegates/DelegateSupport.cs)

