
Performance-Wise, C# Trumps Java - tkellogg
http://www.datacenteracceleration.com/author.asp?section_id=2933
======
hkarthik
As a recovering C# programmer, I can safely say the cost of licensing and
tooling is the main detriment, along with cross platform issues.

Working on C# OSS means stealing licenses from your employer, buying an MSDN
subscription on your own dime, or evangelizing enough to get an MVP award so
you get all the tooling for free. The free Express editions of the tools have
built in limitations that don't make them viable for building Big Data
projects. Other options exist, but pale in comparison to Visual Studio so they
don't get much use.

As for Mono, the uptake for the OSS stuff just isn't there and even Xamarin is
leaning on commercial offerings to become a viable business. Maybe as they get
more successful they can sponsor some C# server projects, but that may be a
ways off.

This is all unfortunate, as C# really is a great language.

~~~
vyrotek
I'd love to know what you've moved on to. I'm still 'addicted' to C# these
days and it pays the bills very well. For me, before C# it was PHP & Java and
I have no interest in ever going back to those.

~~~
dev360
I'm also an ex-C# dev, now doing Python, haven't looked back once.

~~~
vyrotek
I have friends telling me to try Python. I don't know how I will survive
without curly braces. :)

------
btilly
It answers it at the end. When it comes to big data, if you don't run on
Linux, you don't run.

Tech people who have to handle that kind of data and that kind of volume
generally don't want to pay a Microsoft tax. In theory there are a lot of
other alternatives. However all of the popular ones are Unix based and either
are native to, or ported to, Linux.

There are exceptions. For example eBay decided for political reasons to go the
Windows/Java route some years ago. (And reportedly paid a factor of 2 cost
differential for it.) But Google, Facebook, Amazon, etc see no reason to be
dependent upon Microsoft, or to pay that level of tax to a potential
competitor.

------
kevingadd
Structs and erasure-free generics definitely make C# faster in principle, but
I don't think that's necessarily enough to overcome the huge amount of work
that has gone into making the JVM JIT fast code and giving it a world-class
garbage collector. I don't think it makes sense to assume that it would be
faster for 'big data' use cases unless you've tested it.

Furthermore, I doubt straight-line execution performance is really the biggest
concern when choosing a language to use for a task like this. You probably
want to prioritize being able to find people who know the language - or, if
you're using existing staff, you pick something they either know or can learn
easily. C#'s not hard to learn but the odds are better that your typical CS
graduate knows Java. In particular, if you want high performance, high
performance C# isn't particularly easy to write - you'll have to know the
language well, just like in any other environment.

~~~
profquail
_I don't think that's necessarily enough to overcome the huge amount of work
that has gone into making the JVM JIT fast code and giving it a world-class
garbage collector._

The CLR doesn't have a well-tuned JIT compiler and a fast garbage collector?
Give .NET 4.5 a try -- it included some improvements to the garbage collector
that were quite well-received (and the JIT was already quite good from .NET
4).

~~~
kevingadd
I didn't say that the CLR's collector is bad, I just said I doubt that MS has
put enough energy into it to compete on an even footing with the technologies
being used on the JVM.

Last I checked the JVM is more aggressive about inlining, can do escape
analysis on reference types, and has a more scalable parallel collector. There
are also alternative JVMs that go even further - Azul has a theoretically
pauseless collector, which is certainly not something .NET offers (though
pause times are pretty short for gen 1/2 collections, at least!)

The lack of aggressive inlining in many scenarios is arguably a big issue for
.NET due to the more pervasive use of IEnumerable and properties, both of
which add the overhead of additional method calls. In particular, when those
additional method calls don't get inlined and you're passing structs around,
each one involves a full copy of your struct, because the people designing the
standard library didn't use 'ref' anywhere.

~~~
CurtHagenlocher
Simple properties are nearly always inlined, in my experience. But you're
generally right about IEnumerable, for which the biggest culprit is LINQ. The
problem is that the JIT doesn't inline across code which is referenced through
a delegate, even though the actual method being invoked is the same each time.
So code like "foo.Select(x => f1(x)).Where(x => f2(x)).Etc()" ends up making
two function calls for every element in foo.

~~~
noblethrasher
> So code like "foo.Select(x => f1(x)).Where(x => f2(x)).Etc()" ends up making
> two function calls for every element in foo.

Which is okay because Linq queries are easily refactored into proper classes
and methods.

Make _it_ fast then make it _fast_.
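
A rough sketch of that kind of refactoring, written in Python rather than C# purely for illustration. `f1` and `f2` are hypothetical stand-ins for the functions in the quoted snippet, and the example only shows the shape of the change (chained lambdas versus one explicit loop); it says nothing about what the CLR JIT actually inlines.

```python
# Hypothetical stand-ins for f1 and f2 from the quoted snippet.
def f1(x):
    return x * 2


def f2(x):
    return x % 3 == 0


def pipeline(foo):
    # Chained form: each element passes through two wrapper lambdas
    # (one for the map step, one for the filter step) in addition to
    # the calls to f1 and f2 themselves.
    return list(filter(lambda x: f2(x), map(lambda x: f1(x), foo)))


def fused(foo):
    # Hand-written form: a single loop that calls f1 and f2 directly,
    # with no intermediate wrapper calls per element.
    out = []
    for x in foo:
        y = f1(x)
        if f2(y):
            out.append(y)
    return out
```

Both functions return the same list for the same input, so the loop version can replace the chained one once profiling shows the pipeline is actually hot.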

------
0x0
My guess is that a lot of big data is deployed on Linux, and thus the
development environment as well as the well-known deployment routines on Linux
revolve around tools that traditionally work well on Linux, like the headless
JRE (for the servers) and Eclipse/Netbeans/IntelliJ (for the development
environment).

Can you even set up a bunch of Windows Server nodes without running into a
licensing headache?

~~~
ditoa
Pretty much this in my experience. C# (and .NET in general) is pretty nice to
work in but it is expensive from pretty much every angle. From Visual Studio
(which runs over $10k for the Ultimate license) to needing the Enterprise
version of Windows Server to do any kind of serious clustering you need a
pretty big budget for software if you are going to commit to Windows for your
production big data platform.

Also a lot of big data farms have some hardcore kernel optimisations which are
not as easily done on Windows (if at all).

Also Java runs pretty much everywhere and is supported. C#/.NET is not (and
Mono is not a decent response, if it isn't first party supported it isn't
supported).

~~~
profquail
Would you consider Linux or BSD to be "first party supported"?

I've had good experiences with Mono and MonoDevelop, and given that Xamarin
provides commercial support for Mono, I don't think it makes sense to
automatically discard that option.

~~~
ditoa
Yes. IBM, Red Hat, etc. Also there are tens of thousands of users running
distros such as Debian on systems much bigger than Windows Server can allow.
While saying "first party supported" isn't like for like with Windows (if you
are running say Debian) it has a proven track record. While Mono is very good
it is still very young. Obviously this is personal opinion but if I am in
discussions with vendors on what software stack we are using saying Linux
isn't going to get me fired as it is pretty much known to the whole IT world,
even PHBs, Mono not so much. Reputation matters a lot. Linux has a good rep in
the enterprise world.

~~~
profquail
I agree, Linux does have a good rep in the enterprise world. My point was
simply that, by the open-source, distributed nature of Linux/BSD development
there isn't a true "first party" that has control over the whole thing. IBM,
Oracle, Red Hat, Canonical, et al., certainly do well supporting Linux, but
_by definition_ they provide third-party support. (Unless they're providing
support for code they've developed and contributed -- that would be first-
party support.)

~~~
ditoa
Agreed, by definition, you are right. However the whole "Linux is free
therefore it can't be good" attitude that did exist circa 10 years ago is
pretty much dead these days except for some hardcore MS customers who are
blind to the real world.

------
phren0logy
If I were working with mountains of data in the .Net ecosystem, I'd have my
eye on F# more than C#. In fact, F# is the only thing that makes me jealous of
that tool chain.

~~~
profquail
I can confirm, F# is awesome to work with. (I've been using it primarily for
~2 years now.)

If you want to check it out: <http://tryfsharp.org>

------
naranha
Well the article's title is a bit misleading, since it does not try to prove
that C# is actually faster than Java, it just argues that C# is better as a
language. The reality is that C# and Java are pretty close and even if C# was
twice as fast, it wouldn't be significant. C# may be a nicer language to work
with, but Java's cross platform capabilities are a stronger argument for me.
Mono on the other hand is a lot slower than Java on Linux.

------
Uchikoma
I pity our industry. Bold claims like "Performance-Wise, C# Trumps Java", no data.

TL;DR Don't look at data, look at the principles. Ignore reality, think about
how the world should be. I think as an industry we can't aim lower than this
quote:

"In fact, C# is a better fit than Java for high-performance requirements. I'm
not referring to VM stats that change every year. C# has been designed from
the ground up with efficiency in mind."

And a side note:

"Both C# and Java have generational mark and sweep collectors."

No, Java hasn't.

~~~
raverbashing
I wonder how much of this performance penalty comes from the absolute
overengineering of the Java classes and runtime

It seems most of the things are done in a very convoluted way and using 6
levels of inheritance.

At least that's my impression when coding C# vs Java, and no, I'm not an
expert in any of those, but trying to accomplish something in C# is usually
easier and straightforward.

------
bhauer
Although I've not extensively used it, I like the features C# has over Java,
and that was especially true before Java borrowed back many of those ideas in
its later versions.

However, I clicked on this article expecting to see some performance
benchmarks to justify the title. Rather, I see some theory about why it MIGHT
be faster and also some opinions about what features might make a developer
more efficient in C# than Java. I worry that any actual benchmarks would not
validate the title. Certainly the web application benchmarks I've seen
elsewhere do not align with the title.

Ultimately, it seems to partially conflate performance with developer
efficiency. Developer efficiency I am willing to concede: C# may be more
developer friendly as a language, putting platform aside.

------
irahul
Maybe people will pick up C# for non-Windows projects in the future, but the
past just wasn't favorable for it.

The first version of C# was a 'meh': Microsoft's Java (I already know Java;
why bother?). C# got interesting (and a better Java IMO), but it took time. C#
didn't have package management concepts till late (NuGet is still in its
infancy), and had a small to non-existent open source ecosystem. For a long
time, it didn't run on anything but Windows. Hobbyists who weren't willing to
buy Windows (doesn't matter much if it comes with your system) and VS
licences (Express edition came late and explicitly disallows commercial use)
didn't try C#.

~~~
moron4hire
Express edition has never disallowed commercial use.

------
thepumpkin1979
Nice to see Xamarin making its way in the mobile space; however, I have a
theory that if there were a Mono distribution (installers, packages, etc.) that
favored file extensions other than ".exe" and ".dll", along with web frameworks
designed for the command line (like Rails), only then could Unix/Linux developers
see Mono as a serious Java alternative at least for Server-Side. I'd support
an effort to create a separate Mono distribution that strips down GTK#,
ASP.NET and family to a minimal Base-Class-Library powered by an amazing
Runtime/JIT technology. If only I could find more developers who think this
way...

------
blibble
easy: having your machines cost 50% more due to the Microsoft tax makes the
entire point of having bucket-loads of cheap hardware utterly irrelevant.

------
kevingadd
Why did the mods just edit the 'big data' detail out of the title? Now this
just looks like a link to a benchmark?

------
macca321
Just to say, you can get a free copy of VS2012 Pro through the WebSiteSpark
scheme, which has a tiny-weeny barrier to entry.

I've had legal copies of VS for years through this.

~~~
Uchikoma
For 3 years to be precise, otherwise you would have been in violation of the
program terms:

"Eligible Web development and design Companies and/or Individuals can
participate in WebsiteSpark for up to 3 years."

------
manojlds
C# is amazing as a language. I love the "statically typed as much as possible,
dynamic where needed" part that the OP mentions. My library leverages that
aspect of it, if anyone is interested in having a look -
<https://github.com/manojlds/cmd>

To the comparison with Java - C# is not really cross-platform. I wish for C#
on the JVM. Hope that becomes true someday.

~~~
kevingadd
C# on the JVM wouldn't be much better than Java because the JVM doesn't have
value types, pointers, or a full set of primitive data types. At best you'd
benefit from a little bit of syntactic sugar here and there, but much of that
wouldn't be useful either since the JVM's standard library is missing support
for it (your enumerable methods wouldn't be consumed by any non-C# code, for
example, and nobody would know how to call your properties).

~~~
manojlds
True, my real wish is CLR/.Net framework on other platforms.

And when I wish for C# on JVM, I wish that JVM gets the things to support C#
in its entirety as a first class language.

------
tgflynn
C# is essentially Windows only and Windows is not a good platform for heavy
data analysis.

Windows has many problems but one of the worst is poor file system
performance. On Windows you waste a lot of time moving large files around when
you would hardly notice these times on Linux.

~~~
kevingadd
C# has been usable on other platforms like Linux for years and years now. Yes,
Mono isn't as good of a runtime as the MS CLR but it's perfectly usable.

Moving a file is an O(1) operation on Windows just like on any other reasonable
operating system. If you're referring to copying between volumes, there's
nothing magical about how Linux copies 1GB of data between volumes - Windows
does the same thing. It's true that the Windows I/O stack has some performance
deficits (the stat() equivalent is slower, for one) but don't be ridiculous,
moving files is not a bottleneck.
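
A minimal sketch of the "move within a volume is just a rename" point, in Python and assuming source and destination sit on the same filesystem. The helper names `move_within_volume` and `demo` are made up for illustration. Comparing `st_ino` before and after is one way to see that the file's data was not copied, only its directory entry changed.

```python
import os
import tempfile


def move_within_volume(src, dst):
    # os.replace performs a rename; on the same volume this updates
    # directory entries only, so the cost does not depend on file size.
    os.replace(src, dst)


def demo():
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "big.dat")
        dst = os.path.join(d, "moved.dat")
        with open(src, "wb") as f:
            f.write(b"\0" * 1024)
        inode_before = os.stat(src).st_ino
        move_within_volume(src, dst)
        # Same inode after the move means the data blocks were not copied.
        same_inode = inode_before == os.stat(dst).st_ino
        return same_inode, os.path.exists(src)
```

Across volumes the rename is not possible and the data has to be copied, which is the case where file size starts to matter on any operating system.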

~~~
tgflynn
Anyone who has actually spent much time working on Windows vs. Linux with
large files and probably more importantly large numbers of files is aware of
how much more of a pain this is on Windows. Theoretical issues aside, Windows FS
implementations don't hold a candle to Linux performance-wise.

As for Mono the last thing I want as a developer is to be investing my time in
a language with second rate support.

Note that I certainly don't advocate Java for big data either, C++ is the way
to go until something better comes along (and I'm not holding my breath).

~~~
WayneDB
This sounds somewhat specious. What are you doing with the large or numerous
set of files? How big or how many files are we talking about?

------
moron4hire
I love C# as a language, I am constantly finding new ways to use it, along
with well-thought out restrictions that really help to avoid abusing it. The
generics issue you mentioned, along with the lack of lambdas in Java (which is
why I think languages like Clojure and Scala have gotten popular) are the two
biggest reasons Java is a serious no-go for me. Java is a strictly typed
language that makes dealing with types hard. C# is a strictly typed language
that makes dealing with types easy.

The problem is that there is no community behind it. The vast majority of C#
programmers are do-the-bare-minimum-contractors who hang up their keyboard at
the end of the day and go home to their Top Chef and NFL on TV. Those people
don't contribute to anything in any community, let alone the open source
community.

There are a lot of ways to be able to do C# for free, both as in libre and as
in gratis. Mono has about as big of an OSS community behind it as you will
find in C#, and it even has support from Microsoft. The language itself is an
ECMA standard, and much of the CLR VM and runtime library are, too. The parts
that aren't OSS in the library have very reasonable alternatives, in fact,
even if OSS wasn't a concern for you, those alternatives are so good you
should probably be using them anyway.

And it's in almost every package repository out there, certainly all of the
ones I've tried. I personally think setup is a sight easier than Java. I've
personally found it a little easier to get set up with Mono than with Java;
there is some fragmentation in Java that, while not insurmountable, is still a
small annoyance.

Yes, you're not going to get a great IDE for it yet. MonoDevelop is buggy in
weird ways. You could cheat and run VS Express in WINE, but I have a feeling
some people are going to have a problem with that for no good reason other
than to have a reason to complain.

I'd say, if you're the type of developer that is capable of putting together
her entire toolchain from the ground up, C# is one of the best languages out
there and it's a shame more people don't consider it. If you need hand-
holding, then yes, it's really hard to do C# well on anything other than
Windows. It's getting better, and will continue to grow, so don't discount it
completely.

~~~
sultezdukes
_The problem is that there is no community behind it. The vast majority of C#
programmers are do-the-bare-minimum-contractors who hang up their keyboard at
the end of the day and go home to their Top Chef and NFL on TV. Those people
don't contribute to anything in any community, let alone the open source
community._

Your problem is that you're not aware of "the community", not that there isn't
one.

------
Associat0r
Because F#:
<http://www.infoworld.com/d/application-development/microsoft-big-data-programmers-try-f-211914>

------
TeeDub
Are there any serious benchmarks that compare the performance of Java and C#?
As someone who works with both languages, I can generally buy the qualitative
arguments that the author is making, but I would really like to see some
data...

------
nvk
Why?

A: MS Tax to run it.

