Performance-Wise, C# Trumps Java (datacenteracceleration.com)
25 points by tkellogg on Feb 18, 2013 | 63 comments



As a recovering C# programmer, I can safely say the cost of licensing and tooling is the main detriment, along with cross platform issues.

Working on C# OSS means stealing licenses from your employer, buying an MSDN subscription on your own dime, or evangelizing enough to get an MVP award so you get all the tooling for free. The free Express editions of the tools have built in limitations that don't make them viable for building Big Data projects. Other options exist, but pale in comparison to Visual Studio so they don't get much use.

As for mono, the uptick for the OSS stuff just isn't there and even Xamarin is leaning on commercial offerings to become a viable business. Maybe as they get more successful they can sponsor some C# server projects, but that may be a ways off.

This is all unfortunate, as C# really is a great language.


I'd love to know what you've moved on to. I'm still 'addicted' to C# these days and it pays the bills very well. For me, before C# it was PHP & Java and I have no interest in ever going back to those.


For me there was not one identical fit to cover C#; however, a mixture of Java, Scheme, C++(11) and Python covers pretty much everything. Some will argue that having to know more languages, libraries, tools, etc. is a pain in the ass, and to an extent that is true, but I much prefer using a technology that is better tuned for a task than forcing C# into lots of different domains like I used to do.


I've personally moved to Ruby and I'm slowly bringing Javascript and other JVM languages like Clojure and Scala into the fold.

In addition, I've moved to vim as a text editor and tried to avoid IDEs where possible.

The biggest thing to do is break your addiction to Visual Studio and the luxury tooling. That's harder for some programmers to do, but it opens a lot of doors with other stacks.


I'm also an ex-C# dev, now doing Python, haven't looked back once.


I have friends telling me to try python. I don't know how I will survive without curly braces. :)


Afaict, the only real option is to ignore .NET entirely and only use Mono, just like one would do with any other proprietary runtime.


IIRC, C# and the .NET compiler and runtime are free (as in beer). Outside of Monodevelop, I'm surprised nobody else has tried to re-roll a C#/.NET toolchain.

With that said, most .NET developers probably prefer their .NET OSS projects to have Visual Studio files so it's easier to link into their projects.


Actually, the VS project file format is standard. Mono supports both Visual Studio formats (.sln, .csproj, .fsproj, etc.) as well as Makefiles.


I'd actually be very interested in this. Recently went back to C# and have quickly remembered how painful .sln and .csproj files are to merge >_< Would love a solution that used something easier to merge, lol!


I have a license for the Professional edition, but for the most part I just use it as a text editor. So I have to ask, what limitations of the Express editions make them non-viable for big projects?


* Static code analysis, profiling, and HLSL editing and debugging

* Multi-unit testing framework and refactoring support

* Third-party extensibility support

Source: http://stackoverflow.com/a/13263786/204927


Honestly, all of those things are still secondary to actually developing. You can write great code without doing static analysis and profiling it. And with the most recent version, NuGet is included even in Express.

The biggest roadblocks to .NET development/adoption are not toolchain related IMO, they are platform related. If you want to bring up a farm of windows boxes, you're gonna pay.


* You get enough static analysis so that broken code won't compile.

* Testing can be done through other tools such as Nunit.

* With VS2012, you get extensions developed by Microsoft, such as NuGet.


It answers it at the end. When it comes to big data, if you don't run on Linux, you don't run.

Tech people who have to handle that kind of data and that kind of volume generally don't want to pay a Microsoft tax. In theory there are a lot of other alternatives. However all of the popular ones are Unix based and either are native to, or ported to, Linux.

There are exceptions. For example eBay decided for political reasons to go the Windows/Java route some years ago. (And reportedly paid a factor of 2 cost differential for it.) But Google, Facebook, Amazon, etc see no reason to be dependent upon Microsoft, or to pay that level of tax to a potential competitor.


Structs and erasure-free generics definitely make C# faster in principle, but I don't think that's necessarily enough to overcome the huge amount of work that has gone into making the JVM JIT fast code and giving it a world-class garbage collector. I don't think it makes sense to assume that it would be faster for 'big data' use cases unless you've tested it.

Furthermore, I doubt straight-line execution performance is really the biggest concern when choosing a language to use for a task like this. You probably want to prioritize being able to find people who know the language - or, if you're using existing staff, you pick something they either know or can learn easily. C#'s not hard to learn but the odds are better that your typical CS graduate knows Java. In particular, if you want high performance, high performance C# isn't particularly easy to write - you'll have to know the language well, just like in any other environment.


Memory usage trumps all other optimizations for tasks that use a lot of memory, which is to say for almost all tasks. C# (both Microsoft and Mono) uses a lot less memory than Java in principle and in practice.

Not having structured value types was the most consequential mistake in programming language design since 1970-01-01 00:00:00. It's probably the most important reason why Java never took off on the desktop and why C++ is still widely used.
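For anyone who wants a feel for the size of the effect, here is a rough sketch in Python rather than C# or Java. It is only an analogy and assumes CPython: a list of tuples stands in for an array of references to separately allocated objects, and a packed `array('d')` stands in for an array of structs stored inline. The helper names `boxed_size` and `packed_size` are made up for illustration.

```python
import sys
from array import array


def boxed_size(n):
    """Approximate bytes used by n (x, y) points stored as separate objects."""
    points = [(float(i), float(i + 1)) for i in range(n)]
    total = sys.getsizeof(points)  # the list itself: one pointer per point
    for p in points:
        total += sys.getsizeof(p)  # tuple header plus two pointers
        total += sum(sys.getsizeof(c) for c in p)  # each float is its own object
    return total


def packed_size(n):
    """Approximate bytes used by the same points stored inline as raw doubles."""
    flat = array("d", (float(v) for i in range(n) for v in (i, i + 1)))
    return sys.getsizeof(flat)  # includes the contiguous buffer


if __name__ == "__main__":
    n = 100_000
    print("boxed :", boxed_size(n), "bytes")
    print("packed:", packed_size(n), "bytes")
```

The exact numbers depend on the interpreter version, but the per-element headers and pointers are what dominate the boxed layout.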


Java never took off on the desktop because people had to download a large VM to run it. Now it doesn't matter but in the beginning it was a huge deal. Secondly most Java developers concentrated on Enterprise Java and could not write nice GUIs that performed well.


> I don't think that's necessarily enough to overcome the huge amount of work that has gone into making the JVM JIT fast code and giving it a world-class garbage collector.

The CLR doesn't have a well-tuned JIT compiler and a fast garbage collector? Give .NET 4.5 a try -- it included some improvements to the garbage collector that were quite well-received (and the JIT was already quite good from .NET 4).


I didn't say that the CLR's collector is bad, I just said I doubt that MS has put enough energy into it to compete on an even footing with the technologies being used on the JVM.

Last I checked the JVM is more aggressive about inlining, can do escape analysis on reference types, and has a more scalable parallel collector. There are also alternative JVMs that go even further - Azul has a theoretically pauseless collector, which is certainly not something .NET offers (though pause times are pretty short for gen 1/2 collections, at least!)

The lack of aggressive inlining in many scenarios is arguably a big issue for .NET due to the more pervasive use of IEnumerable and properties, both of which add the overhead of additional method calls. In particular, when those additional method calls don't get inlined and you're passing structs around, each one involves a full copy of your struct, because the people designing the standard library didn't use 'ref' anywhere.


Simple properties are nearly always inlined, in my experience. But you're generally right about IEnumerable, for which the biggest culprit is LINQ. The problem is that the JIT doesn't inline across code which is referenced through a delegate, even though the actual method being invoked is the same each time. So code like "foo.Select(x => f1(x)).Where(x => f2(x)).Etc()" ends up making two function calls for every element in foo.


> So code like "foo.Select(x => f1(x)).Where(x => f2(x)).Etc()" ends up making two function calls for every element in foo.

Which is okay because Linq queries are easily refactored into proper classes and methods.

Make it work, then make it fast.
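As a rough illustration of that refactor, sketched in Python rather than C#: `map` and `filter` stand in for `Select` and `Where`, and `double` and `divisible_by_three` are hypothetical stand-ins for f1 and f2. The `counted` wrapper exists only to make the per-element calls visible.

```python
def counted(fn, counter, name):
    """Wrap fn so every invocation is tallied in counter[name]."""
    def wrapper(x):
        counter[name] = counter.get(name, 0) + 1
        return fn(x)
    return wrapper


def double(x):
    return x * 2


def divisible_by_three(y):
    return y % 3 == 0


def via_pipeline(items, f1, f2):
    """Select-then-Where analogue: f1 and f2 are each called once per element."""
    return list(filter(f2, map(f1, items)))


def via_loop(items):
    """Hand-fused equivalent with both steps written inline, no callables."""
    out = []
    for x in items:
        y = x * 2
        if y % 3 == 0:
            out.append(y)
    return out


if __name__ == "__main__":
    calls = {}
    data = list(range(10))
    result = via_pipeline(
        data,
        counted(double, calls, "f1"),
        counted(divisible_by_three, calls, "f2"),
    )
    print(result, calls, result == via_loop(data))
```

Both versions produce the same result; the loop simply removes the two indirect calls per element, which only matters once profiling says it does.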


You're right. But honestly I'd like to see these improvements trickle down to Mono. There are two big things that combined could change this trend - (1) performance enhancements like this for Mono and (2) a lot of work being done on MonoDevelop.

I've been doing my part to work on #2. I've sent in some improvements to VI mode (https://github.com/mono/monodevelop/pull/247) because I think it's critical to convince myself to work in MD. It also needs a lot of work on the debugger end. If we can successfully get people to work on Linux, C# might actually become prevalent.


My guess is that a lot of big data is deployed on Linux, and thus the development environment as well as the well-known deployment routines on Linux revolve around tools that traditionally work well on Linux, like the headless JRE (for the servers) and Eclipse/Netbeans/IntelliJ (for the development environment).

Can you even set up a bunch of windows server nodes without running into a licensing headache?


Pretty much this in my experience. C# (and .NET in general) is pretty nice to work in but it is expensive from pretty much every angle. From Visual Studio (which runs over $10k for the Ultimate license) to needing the Enterprise version of Windows Server to do any kind of serious clustering you need a pretty big budget for software if you are going to commit to Windows for your production big data platform.

Also a lot of big data farms have some hardcore kernel optimisations which are not as easily done on Windows (if at all).

Also Java runs pretty much everywhere and is supported. C#/.NET is not (and Mono is not a decent response, if it isn't first party supported it isn't supported).


The $10k for Ultimate, though, is for the entire Microsoft ecosystem. Every version of Windows, every version of Office, every version of products I've never even heard about.

It's not for C#.

Also, to say that it's not first party supported is a little disingenuous, as one would never reject a messaging queue that implemented AMQP just because it wasn't first party. C# is an ECMA standard. AFAIK the Mono compiler has no issues implementing it as well as the Microsoft msbuild/csc.

For me, as to why I wouldn't, it's simple. Mono isn't as fast as the JRE on Linux for most operations, at least I've not heard that it is, and it's so far removed from my interests and work that I haven't tested it. http://reverseblade.blogspot.co.uk/2009/02/c-versus-c-versus... I know the Mono team have done a lot of work over the last 4 years, but still. I think that is the crushing blow.

Whilst F# support, or TPL or LINQ might make development nicer, I don't think the performance concerns can be addressed.

However, C# has one big thing going for it: unlike Java, it's never tried to install the chuffing "Ask toolbar". Oracle are going to the special hell.


Yes, you get pretty much everything (current) with an Ultimate license, but that is only for development (with a minor exception for Office), not production. So while you can fire up a Windows Enterprise dev box, you still need a production license for production. This is when Windows starts to get crazy expensive. Price up a 50 physical node cluster running Windows Server 2012 Enterprise and see why people don't want to bother with it for such tasks. I have only ever seen a handful of large Windows server clusters and they were all in some way "sponsored" by Microsoft to show off the power of Windows Server.

I 100% agree with you on JRE performance.

The Ask Toolbar thing is a pain in the ass. Although it was Sun who started it so blaming Oracle is a bit unfair. It does not excuse Oracle not removing it ASAP though. They don't include it as part of the offline installer, only the stub installer so it can be avoided at least.


That's not quite fair to Mono ... Xamarin is a great steward of mono, and is supporting the platform quite well. I guess that statement is pretty subjective, but I can only point to the release history of mono, and related products (MonoTouch, etc.) to show that not only are the releases consistent, but they are quite substantial. The platform is evolving and maturing at a great pace.


Not to nitpick, but VS Ultimate should be unnecessary for the 99% of developers who do not need an all-singing, all-dancing IDE, particularly around the integration testing stuff, and also the bundled TFS license. I suspect the pricing for Ultimate also bears some relationship to the price of Mercury/HP LoadRunner.

Windows Enterprise is not needed for compute clusters; AFAIK Microsoft doesn't actively sell any OS products at all for building a compute cluster (only for failover), rather expecting you to do something at the application level.

Anyhow, I'm not arguing that Windows is a fit for folks who need to spin up more instances on demand without a nearly linear increase in OS license costs, but Windows Enterprise and VS Ultimate are pretty unrelated to the concerns of heavy computing.


Would you consider Linux or BSD to be "first party supported"?

I've had good experiences with Mono and MonoDevelop, and given that Xamarin provides commercial support for Mono, I don't think it makes sense to automatically discard that option.


Yes. IBM, Red Hat, etc. Also there are tens of thousands of users running distros such as Debian on systems much bigger than Windows Server can allow. While saying "first party supported" isn't like for like with Windows (if you are running, say, Debian), it has a proven track record. While Mono is very good, it is still very young. Obviously this is personal opinion, but if I am in discussions with vendors on what software stack we are using, saying Linux isn't going to get me fired, as it is pretty much known to the whole IT world, even PHBs. Mono, not so much. Reputation matters a lot. Linux has a good rep in the enterprise world.


I agree, Linux does have a good rep in the enterprise world. My point was simply that, by the open-source, distributed nature of Linux/BSD development there isn't a true "first party" that has control over the whole thing. IBM, Oracle, Red Hat, Canonical, et al., certainly do well supporting Linux, but by definition they provide third-party support. (Unless they're providing support for code they've developed and contributed -- that would be first-party support.)


Agreed, by definition, you are right. However the whole "Linux is free therefore it can't be good" attitude that did exist circa 10 years ago is pretty much dead these days except for some hardcore MS customers who are blind to the real world.


I'm not sure about FreeBSD, but Oracle supports Java on Linux, so yes, it's first party supported.


yes you can! if you have money


If I were working with mountains of data in the .Net ecosystem, I'd have my eye on F# more than C#. In fact, F# is the only thing that makes me jealous of that tool chain.


I can confirm, F# is awesome to work with. (I've been using it primarily for ~2 years now.)

If you want to check it out: http://tryfsharp.org


Yes, as a functional language it is nice, although a little weird sometimes;;


Well, the article's title is a bit misleading, since it does not try to prove that C# is actually faster than Java, it just argues that C# is better as a language. The reality is that C# and Java are pretty close and even if C# was twice as fast, it wouldn't be significant. C# may be a nicer language to work with, but Java's cross platform capabilities are a stronger argument for me. Mono on the other hand is a lot slower than Java on Linux.


I pity our industry. Bold claim, "Performance-Wise, C# Trumps Java", and no data.

TL;DR Don't look at data, look at the principles. Ignore reality, think about how the world should be. I think as an industry we can't aim lower than this quote:

"In fact, C# is a better fit than Java for high-performance requirements. I'm not referring to VM stats that change every year. C# has been designed from the ground up with efficiency in mind."

And a side note:

"Both C# and Java have generational mark and sweep collectors."

No Java hasn't.


I wonder how much of this performance penalty comes from the absolute overengineering of the Java classes and runtime.

It seems most of the things are done in a very convoluted way and using 6 levels of inheritance.

At least that's my impression when coding C# vs Java, and no, I'm not an expert in any of those, but trying to accomplish something in C# is usually easier and straightforward.


Although I've not extensively used it, I like the features C# has over Java, and that was especially true before Java borrowed back many of those ideas in its later versions.

However, I clicked on this article expecting to see some performance benchmarks to justify the title. Rather, I see some theory about why it MIGHT be faster and also some opinions about what features might make a developer more efficient in C# than Java. I worry that any actual benchmarks would not validate the title. Certainly the web application benchmarks I've seen elsewhere do not align with the title.

Ultimately, it seems to partially conflate performance with developer efficiency. Developer efficiency I am willing to concede: C# may be more developer friendly as a language, putting platform aside.


Maybe people will pick up C# for non-Windows projects in the future, but the past just wasn't favorable for it.

The first version of C# was a 'meh': Microsoft's Java (I already know Java; why bother?). C# got interesting (and a better Java IMO), but it took time. C# didn't have package management concepts until late (NuGet is still in its infancy), and had a small to non-existent open source ecosystem. For a long time, it didn't run on anything but Windows. Hobbyists who weren't willing to buy Windows (doesn't matter much if it comes with your system) and VS licenses (the Express edition came late and explicitly disallows commercial use) didn't try C#.


Express edition has never disallowed commercial use.


Nice to see Xamarin making its way in the mobile space. However, I have a theory that only if there were a Mono distribution (installers, packages, etc.) that favored file extensions other than ".exe" and ".dll", along with web frameworks designed for the command line (like Rails), would Unix/Linux developers see Mono as a serious Java alternative, at least on the server side. I'd support an effort to create a separate Mono distribution that strips down GTK#, ASP.NET and family to a minimal Base Class Library powered by an amazing Runtime/JIT technology. If only I could find more developers who think this way...


easy: having your machines cost 50% more due to the Microsoft tax makes the entire point of having bucket-loads of cheap hardware utterly irrelevant.


Why did the mods just edit the 'big data' detail out of the title? Now this just looks like a link to a benchmark?


Just to say, you can get a free copy of VS2012 Pro through the WebSiteSpark scheme, which has a tiny-weeny barrier to entry.

I've had legal copies of VS for years through this.


For 3 years to be precise, otherwise you would have been in violation of the program terms:

"Eligible Web development and design Companies and/or Individuals can participate in WebsiteSpark for up to 3 years."


C# is amazing as a language. I love the static typed as much as possible and dynamic where needed part that the OP mentions. My library leverages that aspect of it, if anyone is interested in having a look - https://github.com/manojlds/cmd

To the comparison with JAVA - C# is not really cross-platform. I wish for C# on the JVM. Hope that becomes true someday.


C# on the JVM wouldn't be much better than Java because the JVM doesn't have value types, pointers, or a full set of primitive data types. At best you'd benefit from a little bit of syntactic sugar here and there, but much of that wouldn't be useful either since the JVM's standard library is missing support for it (your enumerable methods wouldn't be consumed by any non-C# code, for example, and nobody would know how to call your properties).


True, my real wish is CLR/.Net framework on other platforms.

And when I wish for C# on JVM, I wish that JVM gets the things to support C# in its entirety as a first class language.


C# is essentially Windows only and Windows is not a good platform for heavy data analysis.

Windows has many problems but one of the worst is poor file system performance. On Windows you waste a lot of time moving large files around when you would hardly notice these times on Linux.


C# has been usable on other platforms like Linux for years and years now. Yes, Mono isn't as good of a runtime as the MS CLR but it's perfectly usable.

Moving a file is an O(1) operation on Windows just like on any other reasonable operating system. If you're referring to copying between volumes, there's nothing magical about how Linux copies 1GB of data between volumes - Windows does the same thing. It's true that the Windows I/O stack has some performance deficits (the stat() equivalent is slower, for one) but don't be ridiculous, moving files is not a bottleneck.
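A small sketch of the difference, in Python and assuming both paths live on the same volume (a temporary directory here). The function name `move_and_copy_inodes` is mine. A rename within a volume keeps the same inode (or file ID), meaning only directory metadata changed, while a copy creates a new file whose bytes all had to be written.

```python
import os
import shutil
import tempfile


def move_and_copy_inodes(payload: bytes):
    """Return the inode before a move, after the move, and of a fresh copy."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "a.bin")
        with open(src, "wb") as fh:
            fh.write(payload)
        before = os.stat(src).st_ino

        # Same-volume move: a rename, independent of file size.
        moved = os.path.join(workdir, "b.bin")
        os.replace(src, moved)
        after_move = os.stat(moved).st_ino

        # Copy: every byte is read and written again into a new file.
        copied = os.path.join(workdir, "c.bin")
        shutil.copyfile(moved, copied)
        after_copy = os.stat(copied).st_ino

        return before, after_move, after_copy


if __name__ == "__main__":
    print(move_and_copy_inodes(b"x" * 1_000_000))
```

Across volumes a move degrades into copy-then-delete on any OS, which is the case the paragraph above is separating out.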


Anyone who has actually spent much time working on Windows vs. Linux with large files, and probably more importantly large numbers of files, is aware of how much more of a pain this is on Windows. Theoretical issues aside, Windows FS implementations don't hold a candle to Linux performance-wise.

As for Mono the last thing I want as a developer is to be investing my time in a language with second rate support.

Note that I certainly don't advocate Java for big data either, C++ is the way to go until something better comes along (and I'm not holding my breath).


This sounds somewhat specious. What are you doing with the large or numerous set of files? How big or how many files are we talking about?


In two companies I've now compared development with Eclipse/Intellij and Maven between Linux and Windows. The Windows file system seems to be much slower (~2x) for Java development (working in IDE, building, unit tests etc). Identical hardware.


That's largely the stat() speed penalty. Linux-oriented software tends to call stat() way more often than it needs to since it's comparatively quite fast on Linux (not sure why - more aggressive caching?).

The most obvious example of this phenomenon is version control software like svn or git. If you profile them you'll see a ton of time spent just querying the attributes of files on Windows.
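To make the pattern concrete, here is a minimal sketch in Python of the kind of status check such tools perform. It is not how svn or git are actually implemented; `snapshot` and `changed_since` are hypothetical helpers that issue one `os.stat` call per file and compare size and modification time against a saved index.

```python
import os


def snapshot(root):
    """Record (size, mtime_ns) for every file under root: one stat() per file."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            index[path] = (st.st_size, st.st_mtime_ns)
    return index


def changed_since(root, index):
    """Return files whose metadata differs from the saved index, or are new."""
    current = snapshot(root)  # stats the whole tree again on every check
    return sorted(path for path, meta in current.items() if index.get(path) != meta)


if __name__ == "__main__":
    baseline = snapshot(".")
    print(len(baseline), "files stat()ed;", changed_since(".", baseline), "changed")
```

On a tree with tens of thousands of files, every status check repeats that many stat() calls, so any per-call penalty is multiplied accordingly.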


I love C# as a language, I am constantly finding new ways to use it, along with well-thought out restrictions that really help to avoid abusing it. The generics issue you mentioned, along with the lack of lambdas in Java (which is why I think languages like Clojure and Scala have gotten popular) are the two biggest reasons Java is a serious no-go for me. Java is a strictly typed language that makes dealing with types hard. C# is a strictly typed language that makes dealing with types easy.

The problem is that there is no community behind it. The vast majority of C# programmers are do-the-bare-minimum-contractors who hang up their keyboard at the end of the day and go home to their Top Chef and NFL on TV. Those people don't contribute to anything in any community, let alone the open source community.

There are a lot of ways to be able to do C# for free, both as in libre and as in gratis. Mono has about as big of an OSS community behind it as you will find in C#, and it even has support from Microsoft. The language itself is an ECMA standard, and much of the CLR VM and runtime library are, too. The parts that aren't OSS in the library have very reasonable alternatives. In fact, even if OSS wasn't a concern for you, those alternatives are so good you should probably be using them anyway.

And it's in almost every package repository out there, certainly all of the ones I've tried. I've personally found it a little easier to get set up with Mono than with Java; there is some fragmentation in Java that, while not insurmountable, is still a small annoyance.

Yes, you're not going to get a great IDE for it yet. MonoDevelop is buggy in weird ways. You could cheat and run VS Express in WINE, but I have a feeling some people are going to have a problem with that for no good reason other than to have a reason to complain.

I'd say, if you're the type of developer that is capable of putting together her entire toolchain from the ground up, C# is one of the best languages out there and it's a shame more people don't consider it. If you need hand-holding, then yes, it's really hard to do C# well on anything other than Windows. It's getting better, and will continue to grow, so don't discount it completely.


> The problem is that there is no community behind it. The vast majority of C# programmers are do-the-bare-minimum-contractors who hang up their keyboard at the end of the day and go home to their Top Chef and NFL on TV. Those people don't contribute to anything in any community, let alone the open source community.

Your problem is that you're not aware of "the community", not that there isn't one.



Are there any serious benchmarks that compare the performance of Java and C#? As someone who works with both languages, I can generally buy the qualitative arguments that the author is making, but I would really like to see some data...


Why?

A: MS Tax to run it.



