If that's true, do the new .NET Core internals allow .NET Core to beat the JVM? If not, are there any additional steps planned to beat the JVM at last?
The .NET ecosystem is much smaller than Java's (consider, for instance, all the Apache projects), and Linux + .NET Core happened too late to make up for it, in my opinion. Hopefully performance, at least, can be the point that turns people's attention back towards C#/.NET.
I wouldn't disagree with this (I work at Oracle on VM research), but I've been told by other people that Microsoft, or at least the .NET team, just generally prefer a simpler approach, without the profiling, dynamic compilation, deoptimisation, and other complexity that the JVM's approach brings. Their JIT is much more what-you-see-is-what-you-get, rather than the mind-bending things the JVM manages to do with some code, and maybe that makes life easier for their use cases.
I'm not sure they want to add the same kind of complexity that the JVM has, even if it did bring them some better performance. But as I say that's second hand conversations.
At the end of the day, if you want amazing performance in Java, you have to write C-like code in Java, and that often involves things like managing your own off-heap memory, sticking with arrays of numbers, etc. Java's biggest areas for improvement in the performance arena are going to come from allowing stack-allocated types, fully contiguous arrays, less overhead when calling native code, and providing official APIs for the other things you can currently do with sun.misc.Unsafe.
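To make the "C-like code" point concrete, here's a minimal sketch (my own example, not from any project mentioned here) of the difference between a boxed collection and a plain primitive array:

```java
import java.util.ArrayList;
import java.util.List;

public class PrimitiveVsBoxed {
    // Boxed: each element is a separate Integer object on the heap,
    // so every read chases a pointer and pays per-object header overhead.
    static long sumBoxed(List<Integer> xs) {
        long total = 0;
        for (Integer x : xs) total += x; // unboxing on every iteration
        return total;
    }

    // "C-like": one contiguous block of ints, no per-element indirection.
    static long sumPrimitive(int[] xs) {
        long total = 0;
        for (int x : xs) total += x;
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000;
        List<Integer> boxed = new ArrayList<>();
        int[] prim = new int[n];
        for (int i = 0; i < n; i++) { boxed.add(i); prim[i] = i; }
        System.out.println(sumBoxed(boxed));    // 499500
        System.out.println(sumPrimitive(prim)); // 499500
    }
}
```

Both loops compute the same result, but the primitive version is the cache-friendly layout that high-performance Java code ends up hand-rolling.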
Fortunately, it permits a lot of interesting manual optimizations you can't easily express when runtime types aren't available.
Value types are a different matter, and they are now being added to Java as well. Specialized generics over value types are different from specialized generics for reference types, because they are always invariant.
> I’ve been able to do great things with C# generics + value types
You can use static classes to cache information about types. That project caches delegates to perform deep copies, to invoke constructors, to check for deep immutability, and whether a type can contain any cycles. This is all done structurally via reflection, but the results are executed only once and cached in static fields by exploiting runtime types.
The cached delegate trick can be used whenever you need some efficient dynamic dispatch. I use it here, in another project, to cache the Dapper methods that would be invoked, so I can easily compose smaller queries into larger ones that return very complicated objects.
A more complicated example would be the fastest immutable dictionary available for .NET, in my Sasa library. Generic methods and types defined on structs are JITted separately, so the dispatching overhead is low. I exploit this to create a very efficient hash array mapped trie.
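The static-class caching trick above depends on the CLR's reified generics, so it doesn't translate directly. The closest Java analogue I can sketch (this is my own illustration, not the original project) is a per-type cache keyed on `Class`, using `ClassValue`, which the JVM provides for exactly this kind of compute-once-per-type lookup:

```java
public class TypeCache {
    // Small demo type; getDeclaredFields() sees its two fields.
    public static class Point { int x, y; }

    // ClassValue computes a value once per Class and caches it
    // thread-safely: the Java-side analogue of stashing reflection
    // results in a static field of a generic static class on the CLR.
    private static final ClassValue<Integer> FIELD_COUNT = new ClassValue<Integer>() {
        @Override
        protected Integer computeValue(Class<?> type) {
            // Expensive reflective work runs once per type.
            return type.getDeclaredFields().length;
        }
    };

    public static int fieldCount(Class<?> type) {
        return FIELD_COUNT.get(type); // cached after the first call
    }

    public static void main(String[] args) {
        System.out.println(fieldCount(Point.class)); // 2
    }
}
```

The difference is that the CLR does this dispatch statically per instantiation, while the JVM pays a (fast, but real) per-class lookup at runtime.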
Interesting. It's clear after a bit of research that this is the case (i.e. the .NET JIT is simpler than HotSpot), but it had never occurred to me that it was a conscious decision.
I'd always thought that .NET was just playing catch-up and/or didn't need to do all the things that HotSpot does (for instance, .NET has value types, or 'structs', so some optimisations aren't needed because the programmer can just use them instead of classes if they need the perf boost).
Hence Roslyn, .NET Native with the VC++ backend, value-type improvements, and safer pointers in 7.x.
There are even plans to slowly move parts of the runtime from C++ into C#, now that .NET Native is a thing.
"So, in my view, the primary trend is moving our existing C++ codebase to C#. It makes us so much more efficient and enables a broader set of .NET developers to reason about the base platform more easily and also contribute."
In case anyone else is interested, that quote comes from https://github.com/dotnet/coreclr/blob/master/Documentation/...
> The .NET runtime supports a wide variety of high performance applications. As such, performance is a key design element for every change.
> I wouldn't disagree with this
I would wholeheartedly disagree with this though. There's not even any way to write, say, a generic class (even one as simple as ArrayList<T>) and then let the user instantiate it with a primitive type like <int> in Java. To use generics in Java is to pay the speed penalty of an indirection for every single primitive type on top of the extra per-object bytes of space overhead. There's just no way Java is going to fix these problems without massive breaking changes in the type system or an extremely in-depth & expensive form of static analysis unlike anything I've seen in any real-world compiler.
You're talking about Java, the language. I'm talking about the JVM (really HotSpot, really C2, actually): the implementation and the techniques it uses, whatever the language, compared to the techniques that the .NET CLR uses.
The points you bring up are interesting though, because some of these complexities (which HotSpot isn't responsible for, or in a position to change!) are, as someone else has said, probably partly what has motivated the extra work in C2 compared to the .NET CLR JIT.
> To use generics in Java is to pay the speed penalty of an indirection for every single element on top of the extra per-object bytes of space overhead.
And this is just false - the extra indirection only applies for elements that are boxed primitives, not 'every single element'. There is no extra indirection for things that are references anyway.
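To illustrate exactly where the indirection does and doesn't apply, here's a small sketch (my own example): after erasure, a `List<Integer>` holds references to heap-allocated `Integer` boxes, while references were references all along.

```java
import java.util.List;

public class BoxingDemo {
    // After erasure, a List<Integer> is just a List of Objects: each
    // primitive stored in it becomes a heap-allocated Integer box.
    static Class<?> storedType(List<Integer> xs) {
        Object raw = ((List) xs).get(0); // the erased view the JVM actually sees
        return raw.getClass();
    }

    // Large values get a fresh box per autoboxing (the default Integer
    // cache only covers -128..127), so == compares two distinct objects.
    static boolean sameBox(int value) {
        Integer a = value, b = value; // autoboxing via Integer.valueOf
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(storedType(List.of(1_000_000))); // class java.lang.Integer
        System.out.println(sameBox(1));         // true (cached box)
        System.out.println(sameBox(1_000_000)); // false (two separate boxes)
    }
}
```

This is the cost being debated: it hits primitives used as type arguments, not reference-typed elements, which are behind a pointer either way.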
No, we are disagreeing. I'm talking about how the repercussions of decisions in the language affect the implementation. My comment was not at all limited to just the language itself. If you read my comment, you'll notice I mentioned that fixing these issues would require changes in either the type system or an "expensive form of static analysis unlike anything I've seen in any real-world compiler". When the language dictates that there must be an indirection, the implementation can only remove it when it can prove that doing so would be transparent to the code, which is impossible to do in general and extremely difficult in practice. This is neither just about the language nor just about the implementation; it's about the consequences of language decisions in the implementation.
> the extra indirection only applies for elements that are boxed primitives
Yes, I just assumed since I had already mentioned primitives it was clear that I'm talking about the use of primitives in generics. But I'll edit it in to repeat myself.
Well, fine. I welcome and respect your opinions on this topic, and you're not even wrong, but it's not remotely what anyone else is talking about in this thread!
We're talking about what the two main JIT compilers on the two teams have decided to implement, given the constraints they have from their input. We're not talking about how generics were implemented, or language design changes, because we're talking about how the JIT compilers work.
Chef A gets really easy to use ingredients and produces a good dish with them.
Chef B gets rubbish ingredients but spends time doing amazing techniques with them anyway and produces a good dish as well, maybe even a bit better!
If this is all we know about the chefs, I'd say it looks like Chef B is the 'better' chef. It's not their fault they got worse ingredients, and they managed to do a great job with them.
You didn't need Chef B's techniques to handle Chef A's ingredients, because they were simple. Perhaps we should thank the supplier for that! But Chef B is still the 'better' chef.
It still might be nice if Chef A could do some of the amazing techniques that Chef B did, even though their ingredients mean they aren't required all of the time. They may help in some cases.
The .NET CLR JIT is Chef A, the JVM's HotSpot C2 JIT is Chef B.
I mean, if we judge a JIT just by the quality of the output and don't consider the input, then, reducing to absurdity, an identity JIT that maps already-optimised machine code to the same machine code is the best JIT in the world. Clearly it isn't.
This is why people judge chefs independently from the ingredients they are given, but don't judge languages and implementations independently from each other. In the first case they are actually (mostly) independent in the real world, whereas in the second case they are frequently dependent on each other, and it is not possible to mix-and-match languages with compilers. (But I certainly do hope we get to a point where the latter would be possible, too.)
I would argue that it just isn't the case in practice that MSIL includes anything that would make optimisations that the JVM does significantly harder, but I only have my academic and professional experience working on both systems, nothing that I can cite or prove to you.
That was not necessary.
.NET does better in most of them, but falls down vs Java on Regex and k-nucleotide
But you can show the differences qualitatively, in that the .NET JIT simply doesn't have large swathes of optimisations which I think are considered pretty fundamental in the JVM, such as escape analysis.
Lack of value types means greater allocation by default, so it needs escape analysis to avoid it.
Type erasure for generic types means speculative optimisation and deoptimisation steps for types.
That doesn't mean the .NET JIT can't improve in these areas (e.g. it's now doing devirtualization for .NET Core 2.1, and escape analysis would be great), but it does start in a better place.
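As a hedged sketch of the kind of allocation escape analysis targets (my own example): the `Point` below never leaves `distance()`, so HotSpot's C2 can scalar-replace it into two locals with no heap allocation at all, whereas on the CLR the programmer would simply declare it a `struct` and avoid the allocation by construction.

```java
public class EscapeDemo {
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // 'new Point' never escapes this method: after inlining, C2's escape
    // analysis can replace the object with its two fields in registers
    // (scalar replacement), eliminating the allocation entirely.
    static double distance(double x1, double y1, double x2, double y2) {
        Point d = new Point(x2 - x1, y2 - y1);
        return Math.sqrt(d.x * d.x + d.y * d.y);
    }

    public static void main(String[] args) {
        System.out.println(distance(0, 0, 3, 4)); // 5.0
    }
}
```

Whether the optimisation actually fires depends on inlining and compilation tier, which is exactly the speculative machinery the CLR sidesteps by having value types in the type system.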
Empirically, I've found that value types generally only offer a tangible perf benefit when they're the size of a pointer or smaller. There aren't a whole lot of spots where that comes into play. Primitives, sure, but Java also has those. Beyond that, maybe you'd see some difference in code that does a lot of date calculations. I've done some slightly obscene and shameful optimizations around packing data into 64-bit structures in the past, but those are pretty special cases.
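For anyone curious, the "packing data into 64-bit structures" trick looks roughly like this hypothetical sketch (my own example, not the poster's actual code): two ints squeezed into one `long`, so the pair travels in a single register with no object or pointer.

```java
public class Packed {
    // Pack two 32-bit ints into one long: no heap object, no indirection.
    static long pack(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFF_FFFFL); // mask avoids sign-extension of y
    }

    static int unpackX(long p) { return (int) (p >> 32); } // arithmetic shift restores sign
    static int unpackY(long p) { return (int) p; }         // truncation restores low word

    public static void main(String[] args) {
        long p = pack(-7, 42);
        System.out.println(unpackX(p)); // -7
        System.out.println(unpackY(p)); // 42
    }
}
```

It's effectively a hand-rolled two-field value type, which is why proper stack-allocated types would make this kind of code unnecessary.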
Well, it seems that the JVM designers disagree; there's been a lot of work put into fixing this. See 'Project Valhalla', for instance.
(Allegedly, a Minecraft update added significant strain on the garbage collector by switching the codebase from using separate variables/parameters for x, y, and z coordinates to using a single “BlockPos” object.)
The spot where you take a hit is when you've got a scattering of individual objects that are being passed into and out of functions. There, the extra data copying has a tendency to outweigh the other benefits. And that's a more likely scenario for most of the kinds of business apps that .NET and Java are typically used for.
Oracle is not a company that gives money away for free.
Fintech is one of the areas that is slowly moving from C++ into Java, and it doesn't move faster because they cannot move all the kinds of code they have into Java; hence Oracle's efforts.
Or why we are starting to see languages like Pony and Chapel pop up.
Are you referring to various trading and middleware engines? Or mostly FE projects?
I expect performance of the JVM to beat C# for a long time, but since the designs are so similar the difference isn't important in most cases.
Having spent time in both ecosystems, I'll admit that C# is easily a superior language to Java; honestly, the best-designed language I've ever used. Not surprising, since it has the same designer as Delphi and TypeScript. It's all syntactic sugar, but so much so that programming C# is significantly faster and easier.
What really hurts the .NET ecosystem right now is an insanely large gap in open source support. Something MS is fostering now but spent more than a decade trying to kill. This will not be easy to fix, as it's more a people problem than technical. MS actively tried to hurt the open source community for over a decade and to get support they will need to change the mind of countless people. It remains to be seen if this is viable.
If I were Microsoft, I would remedy this by adding first-class support for Java libraries in C#, by throwing their support behind existing projects to do so. This might close a gap that could otherwise take decades to fill.
Apparently you never heard of J#.
Developers are well aware that it would take minimal effort to make the languages compatible, so they make little effort to port from one to the other. And Microsoft's problem is that the vast majority of these projects already target Java. If the languages were vastly different, ports would be more interesting and fun to do, but since they're not, porting is basically a boring and vastly annoying syntax conversion. Something annoying that nobody wants to do for free.
If Microsoft officially supported interop with native Java libraries, I believe C# would take off like a rocket. Unfortunately, so far, Microsoft hasn't really sent out an olive branch to the massive open source Java community.
Quantity does not matter, if the relevant libraries are there.
I do consulting mostly using JVM and .NET languages, Web and C++.
Never was a .NET project where I missed having access to a particular Java library.
I also greatly miss native database libraries when I'm using .NET. A large number, maybe a majority, of open source databases have an outdated, buggy port to C#, if anything. I will never forget the hell of integrating Lucene with C#, because the port is bugged and close to a decade old. This is par for the course for .NET.
MS has a huge problem with lack of open source support unless you're stuck on the (IMO archaic) SQL Server + Windows Server platform.
Microsoft has fallen many years behind the times. Having to get a multi-thousand-dollar license for a server OS and SQL database is a joke. Their VM support outside of Azure is severely lacking, and their support for containers is junk because it's not based on Linux cgroups like everyone else's. Windows as a server OS is as dead as OS/2 was 5 years ago. Linux won, by a landslide, and MS is too proud to just deal with that. Their software is severely crippled in server land as a result.
The type of customers I work with don't waste any second thinking about Oracle, WebSphere, Enterprise Architect or Visual Studio Ultimate licenses. They are a tiny water drop in the ocean of overall project budgets.
Outside the HN bubble, Windows servers are doing pretty well.
Containers are the new buzzword.
I could also say that Linux containers are junk, because they have quite a bit to learn from proprietary UNIXes' experience and from what IBM and Unisys mainframes are capable of.
Containers are hardly a buzzword; I would say more than half the Fortune 500 run their systems exclusively in containers, and that's only increasing. The last two companies I worked at were 100% containers for pretty huge fleets of machines. I don't know of a single large tech company that isn't exclusively on containers, or at least KVM-virtualized machines.
Mainframes are so past dead that they've resorted to haunting us. Mainframes (and SQL Server and Oracle DB) are so far behind that they make it illegal to benchmark their systems in license agreements. From the leaked benchmarks I've seen, a POWER mainframe CPU is maybe 10x slower than a single x86 core from Intel or AMD. Oracle DB is also about an order of magnitude slower than Postgres and MySQL, which you can get for free.
I'm not trying to be a dick, just to convince you that, based on your beliefs, it is you who is in a bubble :). In fact, it's the same bubble I used to be in (I used to do consulting and shared a lot of the same opinions, until I moved to different parts of the industry). That's fine if you're still making $, but it's something to be aware of if you're looking around for other opportunities in different areas of the space.
I work on postgres. And still: I wish.
Take a look at the typical server room at a typical small-to-medium business, and most servers (virtual or physical) you are going to find run Windows.
And a fair amount of those machines run Windows because they run some third-party application that is only available on Windows.
Much as I would like to agree with you on this, outside the cloud arena Windows Server has a significant market share and will continue to do so, even if only because of inertia.
> the (IMO archaic) SQL Server
Also, as much as I like Postgres, SQL Server is the one piece of software from Microsoft I really do like. It is very stable, offers good performance, and the fact that T-SQL offers a unified query and procedural language makes working with it fairly pleasant. It has its pitfalls (I once managed to paralyze our ERP system with a stuck transaction), but I strongly suspect other RDBMS have their own pitfalls.
The price tag on SQL Server makes me want to faint, when I think about it, but keep in mind that from Microsoft's perspective they are not competing with MySQL or Postgres, but with Oracle and DB2, which to my knowledge are at least as expensive, if not more so.
> MS has a huge problem with lack of open source support
Okay, this we can easily agree on. ;-)
Windows Server has decent market share because of inertia, but I would bet money that it's in an irreversible decline. MS has been taking the unusual step of making their server software, like SQL Server and .NET Core, run natively on Linux, so it seems they've seen the end coming as well.
I agree with you that SQL Server is a good product. Far better than other 'legacy' databases like Oracle and DB2. The problem is that the paid contract model is dying. There's tons of mature and free SQL databases, and for the average project you only need so many features. SQL is such a mature tech that most projects could use any SQL database. The differentiation is gone, so there's no longer a reason to pay.
What are the leading/mature libs for this?
I'm hoping that 'Tiered Compilation' will help, but it's early days. We will have to wait until it is 'officially' in .NET Core before it can be properly tested