Hacker News
A Look at the Internals of 'Tiered JIT Compilation' in .NET Core (mattwarren.org)
84 points by matthewwarren 5 months ago | 52 comments



I heard so many times that JVM is much better than CLR, especially in the JIT area (Hotspot?).

If it's true, do the new .NET Core internals allow .NET Core to beat the JVM? If not, are there any additional steps planned to beat the JVM at last?

The .NET ecosystem is much smaller than Java's (for instance all the Apache projects) and Linux + .NET Core happened too late to make up for it in my opinion. Hopefully, at least performance can be the point that turns other people's attention back towards C#/.NET.


> I heard so many times that JVM is much better than CLR, especially in the JIT area (Hotspot?).

I wouldn't disagree with this (I work at Oracle on VM research), but I've been told by other people that Microsoft, or at least the .NET team, just generally prefers a simpler approach, without the profiling, dynamic compilation, deoptimisation, and other complexity that the JVM's approach brings. It's much more what-you-see-is-what-you-get from their JIT, rather than the mind-bending things the JVM manages to do with some code, and maybe that makes life easier for their use cases.

I'm not sure they want to add the same kind of complexity that the JVM has, even if it did bring them some better performance. But as I say, that's from second-hand conversations.


I definitely agree with this stance, and would say that languages like C# and golang have a weaker runtime but try to make up for it in other ways, often surpassing Java in certain areas. For instance, both languages have stack-allocated types to help stress the GC less. C# also has a much better native interop story than Java, as well as the unsafe keyword.

At the end of the day, if you want amazing performance in Java, you have to write C-like code in Java, and that often involves things like managing your own off-heap memory, sticking with arrays of numbers, etc. Java's biggest areas for improvement in the performance arena are going to come from allowing stack-allocated types and fully contiguous arrays, reducing the overhead of calling native code, and providing official APIs for the other things you can currently do with sun.misc.Unsafe.


The .NET runtime is weaker on the JIT side, but the lack of type erasure on generics is very complicated all on its own. Java doesn't have to deal with this. In fact, .NET is the only VM I know of with this feature.

Fortunately, it permits a lot of interesting manual optimizations you can't easily express when runtime types aren't available.
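
For readers who have not run into erasure directly, here is a minimal Java sketch of what "runtime types aren't available" means on the JVM side. The class and method names are made up for illustration. It only shows that two differently parameterized lists share a single runtime class, which is the information .NET keeps and the JVM discards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not taken from any project mentioned in the thread.
class ErasureDemo {
    // Returns true when a List<String> and a List<Integer> have the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        // After erasure both are plain ArrayList instances; the type argument is gone.
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

In C#, the equivalent comparison of `List<string>` and `List<int>` would report two distinct runtime types, which is what makes per-instantiation tricks possible.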


The lack of erasure is a huge usability win (types don’t disappear at run time), as well as a performance win (I’ve been able to do great things with C# generics + value types). I also like the way it naturally deals with static fields (useful for factory routines).


I think it's a huge loss. Generic type erasure is the main thing that makes simple polyglot interop and code-reuse on the JVM possible. Even things like Python become clunky on .NET because of runtime generics. I gladly give up on the tiny usability gain in exchange for having a rich language ecosystem.

Value types are a different matter, and they are now being added to Java as well. Specialized generics over value types are different from specialized generics for reference types, because they are always invariant.


I use C# in my day job and try to learn as much as I can about cool ways to use its type system. Genuinely curious to see some examples if you're able to point me in the right direction?

> I’ve been able to do great things with C# generics + value types


You can see a few tricks in my project here: https://github.com/naasking/Dynamics.NET

You can use static classes to cache information about types. That project caches delegates to perform deep copies, to invoke constructors, to check for deep immutability, and whether a type can contain any cycles. This is all done structurally via reflection, but the results are executed only once and cached in static fields by exploiting runtime types.
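
As a rough illustration of the caching idea above: in C# the cache lives in the static fields of a generic class, one copy per instantiated type. Java erases generics, so that exact trick is not available. The closest standard analogue I am aware of is `java.lang.ClassValue`, which is used in the sketch below. This is not the code from Dynamics.NET. The `TypeInfoCache` class, the `ANALYSES` counter, and the shallow "all instance fields are final" check are assumptions standing in for the project's deep-immutability analysis.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: compute a per-type fact via reflection once, then reuse it.
class TypeInfoCache {
    // Counts how often the reflective analysis actually runs (illustration only).
    static final AtomicInteger ANALYSES = new AtomicInteger();

    // ClassValue lazily computes one value per Class and caches it afterwards.
    static final ClassValue<Boolean> ALL_FIELDS_FINAL = new ClassValue<Boolean>() {
        @Override
        protected Boolean computeValue(Class<?> type) {
            ANALYSES.incrementAndGet();
            for (Field f : type.getDeclaredFields()) {
                int mods = f.getModifiers();
                // Shallow check only: any non-static, non-final field fails the test.
                if (!Modifier.isStatic(mods) && !Modifier.isFinal(mods)) {
                    return false;
                }
            }
            return true;
        }
    };

    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static final class Counter {
        int n; // mutable on purpose
    }

    public static void main(String[] args) {
        System.out.println(ALL_FIELDS_FINAL.get(Point.class));   // true
        System.out.println(ALL_FIELDS_FINAL.get(Counter.class)); // false
        System.out.println(ALL_FIELDS_FINAL.get(Point.class));   // true, served from the cache
        System.out.println(ANALYSES.get());                      // 2 in this single-threaded run
    }
}
```

The lookup still goes through a `Class` key here, whereas the C# static-generic version resolves the cache slot at JIT time, so treat this as an approximation of the idea rather than an equivalent.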

The cached delegate trick can be used whenever you need some efficient dynamic dispatching. I use it here [1] in another project to cache the Dapper methods that would be invoked so I can easily compose smaller queries into larger ones that return very complicated objects.

A more complicated example would be the fastest immutable dictionary available for .NET in my Sasa library [2]. Generic methods and types defined on structs are JITted separately, so the dispatching overhead is low. I exploit this to create a very efficient hash array mapped trie.

[1] https://github.com/naasking/Dapper.Compose/blob/master/Dappe...

[2] https://sourceforge.net/p/sasa/code/ci/default/tree/Sasa.Col...


> but I've been told by other people that Microsoft, or at least the .NET team, just generally prefers a simpler approach, without the profiling, dynamic compilation, deoptimisation, and other complexity that the JVM's approach brings. It's much more what-you-see-is-what-you-get from their JIT, rather than the mind-bending things the JVM manages to do with some code, and maybe that makes life easier for their use cases.

Interesting, it's clear after a bit of research that this is the case (i.e. the .NET JIT is simpler than HotSpot), but it had never occurred to me that it was a conscious decision.

I'd always thought that .NET was just playing catch-up and/or didn't need to do all the things that HotSpot does (for instance, .NET has value types or 'structs', so some optimisations are not needed because the programmer can just use them instead of classes if they need the perf boost).


It has been a deliberate decision. The Book of the Runtime documentation states "The goal of the CLR is to make programming easy" and acknowledges there are faster alternatives out there. When I have asked insiders why there isn't a bigger focus on performance, the response is essentially that if performance is really important you should use C++.


That point of view is outdated.

Hence Roslyn, .NET Native with VC++ backend, value types improvements and safer pointers in 7.x.

There are even plans to slowly move parts of the runtime from C++ into C#, now that .NET Native is a thing.

"So, in my view, the primary trend is moving our existing C++ codebase to C#. It makes us so much more efficient and enables a broader set of .NET developers to reason about the base platform more easily and also contribute"

https://www.infoq.com/articles/virtual-panel-dotnet-future

https://www.infoq.com/articles/high-performance-dotnet


Java seems to have a similar initiative called "Project Metropolis" to (re)write some stuff in Java instead of C++: http://mail.openjdk.java.net/pipermail/discuss/2017-Septembe...


> The Book of the Runtime documentation states "The goal of the CLR is to make programming easy"

In case anyone else is interested, that quote comes from https://github.com/dotnet/coreclr/blob/master/Documentation/...


Also says

> The .NET runtime supports a wide variety of high performance applications. As such, performance is a key design element for every change.

https://github.com/dotnet/coreclr/blob/master/Documentation/...


>> I heard so many times that JVM is much better than CLR, especially in the JIT area (Hotspot?).

> I wouldn't disagree with this

I would wholeheartedly disagree with this though. There's not even any way to write, say, a generic class (even one as simple as ArrayList<T>) and then let the user instantiate it with a primitive type like <int> in Java. To use generics in Java is to pay the speed penalty of an indirection for every single primitive type on top of the extra per-object bytes of space overhead. There's just no way Java is going to fix these problems without massive breaking changes in the type system or an extremely in-depth & expensive form of static analysis unlike anything I've seen in any real-world compiler.
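
To make the cost concrete, here is a small Java sketch (class and method names are made up for illustration). `List<Integer>` cannot hold `int` directly, so each element is autoboxed into an `Integer` object and the list stores references to those objects. The `int[]` version stores the values inline. The sketch makes no timing claims. It only shows where the boxing and unboxing happen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: same computation with boxed and unboxed storage.
class BoxingDemo {
    // Sums 0..n-1 via a generic list; every element is an Integer object.
    static long sumBoxed(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i); // autoboxing: int -> Integer
        }
        long sum = 0;
        for (Integer v : list) {
            sum += v; // unboxing: follows the reference to read the int
        }
        return sum;
    }

    // Sums 0..n-1 via a primitive array; the ints are stored contiguously.
    static long sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));     // 499500
        System.out.println(sumPrimitive(1000)); // 499500
    }
}
```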


You're not disagreeing with me, because we're talking about different things!

You're talking about Java, the language, I'm talking about the JVM, really HotSpot, really C2 actually, the implementation and the techniques it uses, whatever the language is, compared to the techniques that the .NET CLR uses.

The points you bring up are interesting though, because some of these complexities (which HotSpot isn't responsible for or in a position to change!) are as someone else has said, probably partly what has motivated the extra work in C2 compared to the .NET CLR JIT.

> To use generics in Java is to pay the speed penalty of an indirection for every single element on top of the extra per-object bytes of space overhead.

And this is just false - the extra indirection only applies for elements that are boxed primitives, not 'every single element'. There is no extra indirection for things that are references anyway.


> You're not disagreeing with me, because we're talking about different things! You're talking about Java, the language, I'm talking about the JVM, really HotSpot, really C2 actually, the implementation and the techniques it uses, whatever the language is, compared to the techniques that the .NET CLR uses.

No, we are disagreeing. I'm talking about how repercussions of decisions in the language affect the implementation. My comment was not at all limited to just the language itself. If you read my comment, you'll notice I mentioned that fixing these issues would require changes in either the type system or an "expensive form of static analysis unlike anything I've seen in any real-world compiler". Because when the language dictates that there must be an indirection, the implementation can only remove it when it can prove that doing so would be transparent to the code, which is impossible to do in general and extremely difficult in practice. This is neither just about the language nor just about the implementation; it's about the consequences of language decisions in the implementation.

> the extra indirection only applies for elements that are boxed primitives

Yes, I just assumed since I had already mentioned primitives it was clear that I'm talking about the use of primitives in generics. But I'll edit it in to repeat myself.


> I'm talking about the repercussions of decisions in the language that affect the implementation

Well, fine, I welcome and respect your opinions on this topic, and you're not even wrong, but that's not remotely what anyone else is talking about in this thread!

We're talking about what the two main JIT compilers on the two teams have decided to implement, given the constraints they have from their input. We're not talking about how generics were implemented, or language design changes, because we're talking about how the JIT compilers work.


In that case, it seems we have a different metric for a JIT compiler being "better"? I was going by the quality of the final output, but you seem to be going by the complexity of the intermediate optimizations it goes through? Which I admit also makes sense, but it's not what I expected: when someone says "is X better than Y", the obvious implication is they should choose X over Y since it would yield a better result, and yet that generally wouldn't be the case here?


Think about it like this:

Two chefs.

Chef A gets really easy to use ingredients and produces a good dish with them.

Chef B gets rubbish ingredients but spends time doing amazing techniques with them anyway and produces a good dish as well, maybe even a bit better!

If this is all we know about the chefs, I'd say it looks like Chef B is the 'better' chef. It's not their fault they got worse ingredients, and they managed to do a great job with them.

You didn't need Chef B to handle Chef A's ingredients, because they were simple. Perhaps we should thank the supplier for that! But Chef B is still the 'better' chef.

It still might be nice if Chef A could do some of the amazing techniques that Chef B did, even though their ingredients mean they aren't required all of the time. They may help in some cases.

The .NET CLR JIT is Chef A, the JVM's HotSpot C2 JIT is Chef B.

I mean, if we judge a JIT just by the quality of the output and don't consider the input, then, reducing to absurdity, an identity JIT that maps already-optimised machine code to the same machine code is the best JIT in the world. Clearly it isn't.


What you say would make sense in a hypothetical world where every JITter could work with every language, but that's not the world we live in. You can just hand chef B better-quality ingredients and have him do his thing, but you can't just hand HotSpot MSIL code and have it do its thing. Half of this is due to differences in the representations. The other half is because MSIL has features that (as far as I know; please correct me if I'm wrong) the JVM doesn't have, such as by-ref variables, which it would still need to support, but which might make optimizations such as escape analysis harder.

This is why people judge chefs independently from the ingredients they are given, but don't judge languages and implementations independently from each other. In the first case they are actually (mostly) independent in the real world, whereas in the second case they are frequently dependent on each other, and it is not possible to mix-and-match languages with compilers. (But I certainly do hope we get to a point where the latter would be possible, too.)


Your points do make sense.

I would argue that it just isn't the case in practice that MSIL includes anything that would make optimisations that the JVM does significantly harder, but I only have my academic and professional experience working on both systems, nothing that I can cite or prove to you.


Is it not that Chef B doesn't have a frying pan (i.e. decent generics) so has to do a lot of tricks?


Thanks for this analogy, I love it!!


> If you read my comment,

That was not necessary.


Are there standard benchmarks used to measure and compare the two approaches (jvm vs .net)?


Pinch of salt with all benchmarks, but "The Computer Language Benchmarks Game"?

http://benchmarksgame.alioth.debian.org/u64q/csharp.html

.NET does better in most of them, but falls down vs Java on Regex and k-nucleotide


It's crazy but I'm not really aware of any good benchmarks, at least in the research community.

But you can show the differences qualitatively, in that .NET simply doesn't have large swathes of optimisations which I think are considered pretty fundamental in the JVM, such as escape analysis.


The JVM jit _has_ to do a lot more for performance.

Lack of value types means greater allocation by default - so it needs escape analysis to avoid it.

Type erasure for generic types means speculative optimisation and deoptimisation steps for types.

etc

Doesn't mean the .NET JIT can't improve in these areas (e.g. now it's doing devirtualization for .NET Core 2.1, and escape analysis would be great) - but it does start in a better place.


I think that the "lack of value types" is probably an overrated difference these days.

Empirically, I've found that value types generally only offer a tangible perf benefit when they're the size of a pointer or smaller. There aren't a whole lot of spots where that comes into play. Primitives, sure, but Java also has those. Beyond that, maybe you'd see some difference in code that does a lot of date calculations, maybe. I've done some slightly obscene and shameful optimizations around packing data into 64-bit structures in the past, but those are pretty special cases.


How about vectors or coordinates? This post from a few years ago comes to mind:

http://www.minecraftforum.net/forums/mapping-and-modding-jav...

(Allegedly, a Minecraft update added significant strain on the garbage collector by switching the codebase from using separate variables/parameters for x, y, and z coordinates to using a single “BlockPos” object.)
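
A hedged sketch of the pattern being described, in Java. The `BlockPos` class below is a stand-in written for this example and is not Minecraft's actual code. Each call to `up()` creates a new heap object unless the JIT manages to remove the allocation, while the separate-variables version never allocates.

```java
// Hypothetical demo: immutable coordinate object versus bare ints.
class BlockPosDemo {
    // Stand-in for a coordinate class; every "move" returns a fresh instance.
    static final class BlockPos {
        final int x;
        final int y;
        final int z;

        BlockPos(int x, int y, int z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }

        BlockPos up() {
            return new BlockPos(x, y + 1, z); // one allocation per step
        }
    }

    // Object style: allocates a new BlockPos on every iteration.
    static int climbWithObjects(int steps) {
        BlockPos pos = new BlockPos(0, 0, 0);
        for (int i = 0; i < steps; i++) {
            pos = pos.up();
        }
        return pos.y;
    }

    // Separate-variable style: only a local int changes, nothing is allocated.
    static int climbWithInts(int steps) {
        int y = 0;
        for (int i = 0; i < steps; i++) {
            y++;
        }
        return y;
    }

    public static void main(String[] args) {
        System.out.println(climbWithObjects(1000)); // 1000
        System.out.println(climbWithInts(1000));    // 1000
    }
}
```

With a C# struct, the first style could keep its readable API without creating per-step heap objects, which is the gap value types are meant to close.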


If you're just using local values, or storing them in an array, then, yeah, value types might help. That's a likely scenario for something like storing arrays of vectors.

The spot where you take a hit is when you've got a scattering of individual objects that are being passed into and out of functions. There, the extra data copying has a tendency to outweigh the other benefits. And that's a more likely scenario for most of the kinds of business apps that .NET and Java are typically used for.


> I think that the "lack of value types" is probably an overrated difference these days.

Well, seems that the JVM designers disagree, there's been a lot of work put into fixing this, see 'Project Valhalla' for instance

https://www.infoq.com/news/2016/11/valhalla-Implementation-p...

http://www.jesperdj.com/2015/10/04/project-valhalla-value-ty...


No, they are a real issue, and that is why Oracle is pouring millions into improving the JVM in this area.

Oracle is not a company that gives money away for free.

Fintech is one of the areas that is slowly moving from C++ into Java, and it doesn't do so faster because not all of the kinds of code it has can be moved into Java, hence Oracle's efforts.

Or why we are starting to see languages like Pony and Chapel pop up.


> Fintech is one of the areas that is slowly moving from C++ into Java

Are you referring to various trading and middleware engines? Or mostly FE projects?


I am talking in general, based on talks from Martin Thompson and other guys working in the field.


My experience having managed a medium-scale SaaS app’s slow transition from Java to C#:

1) The CLR has very few “nerd knobs” compared with the JVM. It mostly auto-tunes itself and just works.

2) Despite this “lack of features” we’ve never experienced long pauses, GC storms, or our other JVM pains despite the same high user load and cloud instances.

3) Performance as measured by median and 99.9% response times is better in the C# bits on the same hardware, but there are also fewer DB round trips due to eliminating Hibernate in favor of Dapper and EF.

4) Tuning the JVM for high load is still poorly documented whack-a-mole of switches and profiling. G1 didn’t help the “auto-tuning” aspects of the JVM much in our experience; we still had STW pauses of 10+ seconds under load occasionally.

5) This comparison is likely totally unfair, as the C# code is clearly “better-written”, having been actually designed, and drags in far fewer bloated dependencies than the Java code did.


Can you say whether you moved to .net Core or .net framework?


Framework for now, Core was too young when this started. But we’ve been careful to make sure it’s a switch-flip and not a major effort to move to Core.


The JVM simply has a lot more resources poured into it. C# and the CLR are a more modern clone of Java, and they shine in the great usability of C# and to some extent the more sane op-codes of the CLR. Transforming between the two is actually easy enough that several projects offer 100% interop between C# and pure Java libraries.

I expect performance of the JVM to beat C# for a long time, but since the designs are so similar the difference isn't important in most cases.

Having spent time in both ecosystems, I'll admit that C# is easily a superior language to Java, honestly the best designed language I've ever used. Not surprising since it has the same designer as Delphi and TypeScript. It's all syntactic sugar, but so much so that programming C# is significantly faster and easier.

What really hurts the .NET ecosystem right now is an insanely large gap in open source support. Something MS is fostering now but spent more than a decade trying to kill. This will not be easy to fix, as it's more a people problem than technical. MS actively tried to hurt the open source community for over a decade and to get support they will need to change the mind of countless people. It remains to be seen if this is viable.

If I was Microsoft I would remedy this by adding first-class support for Java libraries in C# by throwing their support behind existing projects to do so. This might close a gap that could otherwise take decades to fill


> If I was Microsoft I would remedy this by adding first-class support for Java libraries in C# by throwing their support behind existing projects to do so. This might close a gap that could otherwise take decades to fill

Apparently you never heard of J#.

https://en.wikipedia.org/wiki/J_Sharp


I hadn't, but the primary problem is the inverse case. Java programmers have no problem switching to C# and vice versa. The languages are really similar. The problem is that many millions of lines of open source software have already been written in Java and developers are not going to spend thousands of man-years converting them. Especially because they know the languages are so similar.

Developers are well aware that it would take minimal effort to make the languages compatible so they make little effort to port from one to the other. And Microsoft's problem is that the vast majority of these projects already target Java. If the languages were vastly different ports would be more interesting and fun to do, but since they're not porting is basically a boring and vastly annoying syntax conversion. Something annoying that nobody wants to do for free.

If Microsoft officially supported interop with native Java libraries I believe C# would take off like a rocket. Unfortunately, so far, Microsoft hasn't really sent out an olive branch to the massive open source Java community


There is IKVM for those that care about using Java libraries in .NET.

Quantity does not matter, if the relevant libraries are there.

I do consulting mostly using JVM and .NET languages, Web and C++.

Never was a .NET project where I missed having access to a particular Java library.


I disagree strongly. I used .NET for years before jumping ship to Java. Guava and Apache commons dwarf anything available for C# at any price. I know because I went looking many times for libraries with similar algorithms and Bloom filters. .NET is pay to play and in most cases something similar in Java was free.

I also greatly miss native database libraries when I'm using .NET. A large number, maybe a majority, of open source databases have an outdated buggy port to C#, if anything. I will never forget the hell of integrating Lucene with C# because the port is bugged and close to a decade old. This is par for the course for .NET.

MS has a huge problem with lack of open source support unless you're stuck on the (IMO archaic) SQL Server + Windows Server platform.

Microsoft has fallen many years behind the times. Having to get a multi thousand dollar license for a server OS and SQL database is a joke. Their VM support outside of Azure is severely lacking and support for containers is junk because they're not based on Linux cgroups like everyone else. Windows as a server OS is as dead as OS/2 was 5 years ago. Linux won, by a landslide, and MS is too proud to just deal with that. Their software is severely crippled in server land as a result.


I never bothered to use Guava in production code.

The type of customers I work with don't waste any second thinking about Oracle, WebSphere, Enterprise Architect or Visual Studio Ultimate licenses. They are a tiny water drop in the ocean of overall project budgets.

Outside the HN bubble, Windows servers are doing pretty well.

Containers are the new buzzword.

I could also say that Linux containers are junk, because they have quite a bit to learn from the experience of proprietary UNIXes and from what IBM and Unisys mainframes are capable of.


Your customers are many years behind the times then, which isn't unusual if you're doing consulting. Windows Server only does well in the consulting bubble and in backwards industries like finance, it's basically dead everywhere else.

Containers are hardly a buzzword, I would say more than half the Fortune 500 run their systems exclusively in containers and that's only increasing. The last two companies I worked at were 100% containers for pretty huge fleets of machines. I don't know of a single large tech company that isn't exclusively containers or at least KVM virtualized machines.

Mainframes are so past dead that they've resorted to haunting us. Mainframes (and SQL Server and Oracle Db) are so behind that they make it illegal to benchmark their systems in license agreements. From the leaked benchmarks I've seen, a power mainframe CPU is maybe 10x slower than a single X86 core from Intel or AMD. Oracle Db is also about an order of magnitude slower than Postgres and MySQL which you can get for free.

I'm not trying to be a dick, just convince you that based on your beliefs it is you that is in a bubble :) . In fact the same bubble I used to be in (used to do consulting and shared a lot of the same opinions until I moved to different parts of the industry). That's fine if you're still making $, but it's something to be aware of if you're looking around for other opportunities in different areas of the space.


> Oracle Db is also about an order of magnitude slower than Postgres and MySQL which you can get for free.

I work on postgres. And still: I wish.


> Windows as a server OS is as dead as OS/2 was 5 years ago.

Take a look at the typical server room at a typical small-to-medium business, and most servers (virtual or physical) you are going to find run Windows.

And a fair amount of those machines run Windows because they run some third-party application that is only available on Windows.

Much as I would like to agree with you on this, outside the cloud arena, Windows Server has a significant market share and will continue to do so, even if it is only because of inertia.

> the (IMO archaic) SQL Server

Also, as much as I like Postgres, SQL Server is the one piece of software from Microsoft I really do like. It is very stable, offers good performance, and the fact that T-SQL offers a unified query and procedural language makes working with it fairly pleasant. It has its pitfalls (I once managed to paralyze our ERP system with a stuck transaction), but I strongly suspect other RDBMS have their own pitfalls.

The price tag on SQL Server makes me want to faint, when I think about it, but keep in mind that from Microsoft's perspective they are not competing with MySQL or Postgres, but with Oracle and DB2, which to my knowledge are at least as expensive, if not more so.

> MS has a huge problem with lack of open source support

Okay, this we can easily agree on. ;-)


I once worked at a hybrid IT/consulting shop. A lot of companies still run Windows Server, but the space is being rapidly cannibalized by cloud services. In the few years I worked there we had already lost maybe 20% of our IT contracts to customers moving to cloud services like Office365 and Google for work. Custom software as a business is still thriving but basically everything new is running on cloud (and not Windows) at the client's request. Which means no Windows Server and related service contract revenue. Quite a large percentage of our IT contracts were conversion contracts to move companies to cloud services, and I assume it's only gotten worse in the years since then.

Windows Server has decent market share because of inertia but I would bet money that it's in an irreversible decline. MS has been taking the unusual step of making their server software like SQL Server and .NET Core run on native Linux, so it seems they've seen the end coming as well.

I agree with you that SQL Server is a good product. Far better than other 'legacy' databases like Oracle and DB2. The problem is that the paid contract model is dying. There's tons of mature and free SQL databases, and for the average project you only need so many features. SQL is such a mature tech that most projects could use any SQL database. The differentiation is gone, so there's no longer a reason to pay.


> that several projects offer 100% interop between C# and pure Java libraries.

What are the leading/mature libs for this?


IKVM is the best I know of, but I heard rumors last year that unfortunately its main maintainer is moving on to other projects. There are others that cost money but I don't really pay attention to them.


> If it's true, do the new .NET Core internals allow .NET Core to beat the JVM?

I'm hoping that 'Tiered Compilation' will help, but it's early days. We will have to wait until it is 'officially' in .NET Core before it can be properly tested



