Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Writing a Profiler in 240 lines of pure Java (mostlynerdless.de)
156 points by mfiguiere on March 29, 2023 | hide | past | favorite | 29 comments


(Author here) This blog post is actually part of a series to write profilers from scratch (https://mostlynerdless.de/blog/tag/writing-a-profiler-from-s...) and I'm looking forward to writing the next installment, in the next few days.

I write a blog post on this and related in-depth profiling topics every two weeks (you can follow me on social media https://twitter.com/parttimen3rd and https://mastodon.social/@parttimenerd to get notified).


Hi.

I believe the code of the constructor of Profiler is not thread safe, it has a publication problem. The method onEnd can be called by the profiler thread while the fields options and store are not fully initialized.

From another thread POV, a final field is only fully initialized after the call to the constructor not inside the call to the constructor.


doesn't `Runtime.addShutdownHook` have a synchronisation point. like all writes need to have occurred before `ApplicationShutdownHooks.add` has been called so there is no publish problem.

          https://github.com/openjdk/jdk/blob/2fa09333ef0ac2dc1e44292f8d45d4571cb22cca/src/java.base/share/classes/java/lang/Runtime.java#L69
          https://github.com/openjdk/jdk/blob/2fa09333ef0ac2dc1e44292f8d45d4571cb22cca/src/java.base/share/classes/java/lang/ApplicationShutdownHooks.java#L65
maybe I'm missing something that is java specific with the way constructors work. I'm guessing even without the synchronisation point java lets you share objects in a way that would normally be considered unsafe as long as all the fields are final.


Thanks. But this should only be a problem if the JVM terminates too fast, or am I wrong?


For the context, I'm re-reading Java concurrency in Practice now so I see publication issues everywhere.

I believe the issue is between the profiler thread and the shutdown hook thread. Both will run concurrently because the profiler thread is marked deamon. So the shutdown hook thread can see the Profiler fields not fully initialized. The call to addShutdownHook() should be done outside of the Profiler constructor.


You are correct. The solution is to use a static constructor method, like this:

    public static Profiler newInstance(Options options) {
        Profiler profiler = new Profiler(options);
        Runtime.getRuntime().addShutdownHook(new Thread(profiler::onEnd));
        return profiler;
    }

    private Profiler(Options options) {
        this.options = options;
        this.store = new Store(options.getFlamePath());
    }
In principle you can also use chained constructors:

    public Profiler(Options options) {
        this(options, null);
        Runtime.getRuntime().addShutdownHook(new Thread(this::onEnd));  // okay to leak this here
    }

    private Profiler(Options options, Void dummy) {
        this.options = options;
        this.store = new Store(options.getFlamePath());
    }
(cf. https://stackoverflow.com/a/35169705/623763)


Thank you. I fixed the code in the GitHub repository but kept the code on my blog the same, with a disclaimer regarding its problems.


It would have to be very quick as it would need the thread to be started before your object has finished being constructed. So it seems unlikely. What's not clear to me from a quick reading of the code is whether the store is actually correct. Although it is only being filled by a single thread that isn't the one that is going to read it on shutdown and I'm not clear on whether there is anything forcing a barrier before the onExit thread has starts reading the hash maps on the nodes.

Actually, is there anything stopping the sampler from taking another sample while onExit is running?


Would `volatile` help with the barrier?

Re the multi-threading issue, it should be possible to stop the profiler thread in the hook, if we keep a reference to the thread, no?


The fields are final, so they can't be volatile. :-) Using one of the fence methods on VarHandles would ensure the correct ordering. To stop the profiler thread it would be best to provide a controlled method of doing it, there's a lot of way it can be implemented but you probably want to signal it in some way and then do a join() to wait while it shuts down cleanly.


Thanks, I implemented it using a boolean flag.


An alternative to avoid safepoints (only samples executing threads)

    var r = new RecordingStream();
    r.enable("jdk.ExecutionSample").withStackTrace().withPeriod(Duration.ofMillis(1));
    r.onEvent("jdk.ExecutionSample", e -> {
        store.addSample(e.getStackTrace().getFrames());
    });   
    r.startAsync();
    ...


You get JFR's precision but inherit its blind spots as described in [1]:

> demo1: ... JFR will not report anything useful at all, since it cannot traverse stack traces when JVM is running System.arraycopy()

I'd rather run jstack in a loop than lose System.arraycopy() (or, in fact, any native code be it JVM's or JNI).

[1] https://github.com/apangin/java-profiling-presentation


I would rather use Andrei Pangin's async-profiler than doing something custom with jstack.


This of course relies on Java's reflection ability, in this case, an API to retrieve stack traces for all threads dynamically. An interesting question is how this is implemented and how regular invocation might affect the runtime (e.g. global locks).


(Author here) It's not about Java's ability to do reflection, many languages expose the ability to get stack-traces. It's usually used for logging and exceptions. But I could certainly write a blog post on how stack-walking is implemented for the likes of async-profiler, as I'm currently working on a new stack-walking API for Java (JEP 435).


> It's usually used for logging and exceptions.

That's why I was querying it's suitability for profiling. The time to get a lock for a global "safepoint" and then release it must be significant no?


It's not as significant as one might think (the JVM uses it internally for GC and other things). The main problem is the bias it leads to in profiling and that the time-to-safepoint can sporadically be in the tens of milliseconds with is not great.


>Profilers are not rocket science...

They are not. There are orders of magnitude more rocket scientists than good profiler writers. And no, bottle rockets with flight recorders do not count.


The number of profiler writers is not too large (if you ignore all the people writing the UIs), but it's the same with compilers: You only need a few to create a useful enough product. There are many more people working on tracing right now.

There are even fewer people working on the underlying profiling APIs. I'm one of the few working regularly on the stack walking API that powers most Application Platform Monitors (and async-profiler). I'm the person who writes all the code to test AsyncGetCallTrace currently. If anyone is interested: I wrote a blog post on it https://mostlynerdless.de/blog/2023/03/14/validating-java-pr....

I'm writing blog posts on the profiling topic every two weeks (usually publishing them on Monday or Tuesday).


Such a task is usually thankless work so, thank you. :)

I don't doubt my favorite profiler, YourKit also uses your code.

Profilers are probably the class of tool that most drastically improved the code I write so if anyone reading hasn't really felt the need to use one, you should.

Just a cursory glance can quickly verify assumptions about runtime performance. Even if the code is currently acceptable it helps to be mindful of where cycles are being spent (and lock contention, network blocking, etc) so if you end up doing some refactoring anyway that additional context and knowledge can guide improvements that aren't strictly "performance work".


> Such a task is usually thankless work so, thank you. :)

It's just a game of whack-a-mole, finding new segmentation faults and other bugs is quite easy when you know how to write good test code. Fixing the actual bugs less so. But it's fun and I do this for a living in the SapMachine team at SAP SE.

> I don't doubt my favorite profiler, YourKit also uses your code.

They surely do profit by work improving the profiling APIs.

> Just a cursory glance can quickly verify assumptions about runtime performance.

Yes. But there aren't enough people educating people on profilers. So I started blogging and talking (https://youtu.be/Fglxqjcq4h0) on this topic. Besides fixing all the bugs and implementing my own profile viewer based on FirefoxProfiler (https://plugins.jetbrains.com/plugin/20937-java-jfr-profiler).


That is awesome, I will be sure to checkout your viewer.


"It is not rocket science" is so 1970, I'd say nowadays it is "It is not semiconductor eng."


What is your criteria for 'good profiler writer'?


I know one when I see one


Someone correct me if I'm wrong but this looks like another failed attempt at profiling Java code.

This approach forces a safepoint and therefore all results will be safepoint biased. That's a very good way to mislead oneself about where the cpu time goes.

The correct way is to employ asynchronous stack probing like async profiler or any other modern profiler does.


(Author here) I want to quote from my article:

"Writing a Java profiler in 240 lines of pure Java is possible and the resulting profiler could even be used to analyze performance problems. This profiler is not designed to replace real profilers like async-profiler, but it demystifies the inner workings of simple profilers."

And you'll find a large disclaimer on top of the article explaining the safepoint-bias and linking to relevant blog posts by other people on this topic.

Please don't use this profiler in production, but you use it to further your knowledge of profilers.

> The correct way is to employ asynchronous stack probing like async profiler or any other modern profiler does.

Guess what my other articles on my blog are about :) But as you'll see, using the asynchronous APIs is far harder and requires knowledge on POSIX signals, C programming and more that many Java developers do not have.


Thanks for the clarification, I skimmed through the content faster than I should have. I stand corrected : - )




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: