
Why is C++ still a very popular language in quantitative finance? - BlackJack
http://quant.stackexchange.com/q/1764/1275/
======
wglb
One reason that I know of is that you can get better performance.

Now I know that there are claims that C# and Java are performance competitive
with C++. You see some of that in the benchmark game at
<http://shootout.alioth.debian.org>, which shows a wide variety of results for
simple problems. One interesting one is the n-body problem, which involves a
lot of hard-core computation. Is the winner C++ or Java or .NET? Well,
[http://shootout.alioth.debian.org/u32/performance.php?test=n...](http://shootout.alioth.debian.org/u32/performance.php?test=nbody)
says that the winner is . . . Fortran!! (Not my favorite language either.) g++
comes out at a factor of 1.3, the best Java at 1.5, and the best C# comes out
at 2.1. That is, it takes the best C# program 2.1 times as long to compute
this problem as the fastest program.

The stackexchange post said _But in this era, the performance of a program
written in a language based on frameworks such as C# and Java can be pretty
close to that of C++._ A factor of 1.6 (roughly the best C# at 2.1 against g++
at 1.3) is not pretty close, in my book. If you are in high speed markets and
all other things being equal, placing your order at 1.6 ms when the other guy
places it at 1.0 ms means what? It means you are further down in the order
book. You lose.

So we all know that these toy benchmarks don't really represent what happens
in a large, useful program. It would be interesting to build a more extensive
benchmark set, don't you think?

Having worked on a very high-performance stock options feed, I can share some
of my experience. The development goes something like this. You start out in
C++, because you get objects and other useful stuff. You see that during peaks
of the day, say around 0900 CST, you are beginning to fall behind. So you
begin to tune.

See, lots of people decry the C++/C combination saying that they are different
languages. Well, sort of. If you are on a theoretical or dogmatic bent, sure.
But if you have a C++ program that is taking too long, you can relatively
easily, bit-by-bit, turn it into a C program. So I am fine with the C/C++
designation in practice.

You tune this thing, making sure that you allocate your objects, say, at 0829
in the day and tend to leave them there until 1500. Then you up the -O level,
hoping you are not pulling a Heisenberg. If you still need a little headroom,
you learn that you can turn off exceptions in g++ (-fno-exceptions). Yes, even
though you don't ever throw an exception, having that not disabled costs you
CPU. (What were they thinking?) So while the compile flags and the source
program extension say C++, what you are executing more closely resembles C.
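
A minimal sketch of the allocate-early, leave-it-there idea above. The `Quote`
type and the `QuotePool` class are hypothetical names for illustration, not
anything from the actual feed, and a real system would likely need more care
around threading:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message type; the real feed's layout is not given in the thread.
struct Quote {
    long   instrument_id = 0;
    double bid = 0.0;
    double ask = 0.0;
};

// Fixed-capacity pool: all storage is allocated once, up front (the "0829"
// step), and slots are recycled during the day without touching the heap.
class QuotePool {
public:
    explicit QuotePool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) {
            free_.push_back(&slots_[i]);
        }
    }

    // Returns nullptr when the pool is exhausted instead of allocating.
    Quote* acquire() {
        if (free_.empty()) return nullptr;
        Quote* q = free_.back();
        free_.pop_back();
        return q;
    }

    // Hands a slot back. No reallocation happens as long as each slot is
    // released at most once per acquire, because of the reserve() above.
    void release(Quote* q) {
        assert(q >= slots_.data() && q < slots_.data() + slots_.size());
        free_.push_back(q);
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<Quote>  slots_;  // owns the storage for the whole day
    std::vector<Quote*> free_;   // stack of currently unused slots
};
```

Construct one `QuotePool` before the open, then call `acquire()` and
`release()` on the hot path; an exhausted pool shows up as a `nullptr` rather
than a surprise call into the allocator.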

But there are those who fervently say that in such an environment Java is
competitive with C/C++. If you look at the cost to build the program, I am
likely to agree that that part of the effort is faster with Java. But can you
tune Java as much as C++? Or .NET? I suspect not.

I think we should have a more broad-based real-life example. I am thinking
that a simulated financial exchange repeatedly implemented in competing
languages might be a more interesting example. In fact, I think I will go off
and give that a try. Say maybe Lisp (note that it whips Java server in some
examples, take that!), Java, C++ and certainly not Fortran (due to personal
prejudices).

I'll let you know how that turns out.

~~~
mey
I'm not a quant, and quite removed from the field. The transaction systems I
work with are much slower, and most likely deal with significantly less
information.

But is execution speed the be-all and end-all? Do flexibility and robust
recovery not matter?

Everything I'm hearing is that it's better to gut the car down to the frame,
with no safety measures, than to have even a little bit of coverage. If that
program crashes or cross-wires some data, how much damage can it do to your
financial position?

From what I understand, billions of dollars are on the line every day, but
everyone is racing towards the bottom of that nanosecond mark, generally at
the risk of instability.

Things that would never fly in other high-risk environments seem to fly in the
stock system, and I'm going out on a limb, but it seems that's because there
is an adult watching the kids play in the stock market sandbox, and a fuse
will blow if an HFT feedback loop triggers some real idiocy.
([http://www.zerohedge.com/article/hft-fat-digital-finger-brea...](http://www.zerohedge.com/article/hft-fat-digital-finger-breaks-citi-stock-shares-halted-circuitbreaker-triggered-stock-plungi))

~~~
sixtofour
I was only peripherally involved in trade systems. My understanding is that
the amounts of money to be made, and lost if you're second, completely drown
the arguable maintenance and personnel costs of C++.

~~~
dagw
That is true for a small subset of the stuff quants do. For a lot of other
stuff an hour this way or that makes hardly any difference at all. A normal
day for one of my quant friends basically looks like this: get into the office
and spend the morning putting together some calculations, run them and go to
lunch, get back from lunch and discuss the results with some colleagues, and,
if you decide to trade based on the numbers, call a broker. Seconds and
minutes are totally irrelevant to her.

------
chrisaycock
I am a _pro tem_ moderator on the Quant Finance Stack Exchange.

This question was pretty contentious when it was asked. Worse still is that
the current accepted answer came from someone who doesn't even work in
quantitative finance.

Any argument about _performance_ is totally incorrect. There are indeed a few
areas of quant finance that require performance, but those are in the
minority. Really, C++ is the top language because of culture. I.e., _it's what
everyone knows_.

Just about every major programming language is used in finance, and each firm
has its own preferences. But almost all of them will still interview in C++
because it's so widespread.

~~~
cHalgan
We were developing data stream processing software for companies in the
financial sector. We were unable to get the performance and
controllability/measurability of the system our customer wanted using Java. So
we ended up with C/C++.

~~~
chrisaycock
I do high-frequency trading. My feed handling and order routing logic is in
C++ for the performance benefits. But my backtesting is in q/kdb+, my loading
scripts are in Python, and my administrative tasks are in bash.

Some of my co-workers use Java because their models aren't as sensitive to
latency as mine. My best friend does options pricing in VB/Excel. And I know
tons of competitors who use R, MATLAB, OCaml, and Haskell.

There are _tons_ of languages used in finance.

~~~
ThaddeusQuay2
Could you please give me a specific example of how you use Q (something more
towards the language side, rather than the database side)? Also: How did you
learn Q? I've been playing with it, and I've read "Q for Mortals", but I'd
like to do more, although not in the field of finance. I find Arthur Whitney's
journey from APL to J to A+ to K/Q quite interesting, and I'm trying to figure
out just how powerful Q really is. Thanks in advance.

~~~
chrisaycock
> I find Arthur Whitney's journey from APL to J to A+ to K/Q quite interesting

Arthur Whitney didn't do APL or J; those were from Kenneth Iverson with Roger
Hui helping out on the latter. A+ was Arthur's implementation of APL, from what
I understand. K is entirely ASCII (none of the special APL characters) and q
added reserved words plus the integrated kdb+ database.

> How did you learn Q?

I learned q as a quant for a trading desk that used it for most tasks. I've
been using it ever since because it's very expressive and has great
performance.

~~~
ThaddeusQuay2
"Arthur Whitney didn't do APL or J; those were from Kenneth Iverson with Roger
Hui helping out on the latter."

I am familiar with the history. I meant "journey" as in "progression through
APL and APL-like languages". Iverson showed APL to Whitney when Whitney was
only 11 years old. Whitney created the first version of J, but then moved on,
leaving it to Hui.

"Work began in the summer of 1989 when I [Ken Iverson] first discussed my
desires with Arthur Whitney. He proposed the use of C for implementation, and
produced (on one page and in one afternoon) a working fragment that provided
only one function (+), one operator (/), one-letter names, and arrays limited
to ranks 0 and 1, but did provide for boxed arrays and for the use of the
copula for assigning names to any entity. I showed this fragment to others in
the hope of interesting someone competent in both C and APL to take up the
work, and soon recruited Roger Hui, who was attracted in part by the unusual
style of C programming used by Arthur, a style that made heavy use of
preprocessing facilities to permit writing further C in a distinctly APL
style. Roger and I then began collaboration on the design and implementation
of a dialect of APL (later named J by Roger) ..." - from Hui's "Remembering
Ken Iverson", referencing Iverson's "A Personal View of APL"
(<http://keiapl.org/rhui>)

In Appendix A, on that same page, you can find Whitney's code. It shows how
differently he thinks about coding, and is likely a good example of Q's roots.

------
mynegation
There are many reasons.

* QuantLib: aside from (proprietary, very expensive, and damn slow) Matlab, no other language has a library of quant-related functionality that is so vast

* A lot of 3rd party libraries and APIs do not have .NET, Python, Ruby, R (you name it) wrappers, and you do not have the time, expertise or resources to write them

* Like wglb mentioned, quants are obsessed with performance, even aside from obvious things like high-frequency trading where you try to squeeze out every millisecond. Let's say in the middle office you run a risk measurement calculation daily and it finishes in 9 hours for your portfolio. Well, if you happen to triple your portfolio (not unheard of in boom times), you cannot run it daily anymore, so a factor of 1.6 gets in your way here too.

* .NET programmers (on average, of course) tend to have less experience in dealing with algorithms and data structures
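
The arithmetic behind the daily risk run can be sketched as follows. This
assumes, which the comment does not state, that runtime grows linearly with
portfolio size; `scaled_hours` and `fits_daily` are made-up helper names:

```cpp
#include <cassert>

// Assumption: batch runtime scales linearly with portfolio size, and a slower
// language multiplies the whole runtime by a constant factor.
inline double scaled_hours(double base_hours, double portfolio_multiple,
                           double slowdown = 1.0) {
    assert(base_hours >= 0.0 && portfolio_multiple >= 0.0 && slowdown > 0.0);
    return base_hours * portfolio_multiple * slowdown;
}

// True when the job still finishes inside one day and can be run daily.
inline bool fits_daily(double hours) {
    return hours <= 24.0;
}
```

Under that assumption `scaled_hours(9.0, 3.0)` is 27 hours, which no longer
fits in a day. With a 1.6x slowdown the same job starts at about 14.4 hours,
so merely doubling the portfolio (about 28.8 hours) already breaks the daily
schedule.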

------
KirinDave
I don't know why everyone is so quick to dismiss the simple explanation,
"That's just the current culture of the industry."

Personally, I find this idea very compelling. The quantitative finance
profession isn't exactly open and taking a constant influx of new ideas from
the rest of the industry. They're a much more secretive bunch. Is it any
surprise, then, that their tools stagnate somewhat because of their reluctance
to engage openly with the rest of the industry?

I know that some people are engaged in high-speed trading applications where
they require a language with a close analogue to machine instructions for the
purposes of performance. I haven't seen any evidence that this is all quants
everywhere, or that these suites of quantlibs actually provide all their
functionality to that segment of the community.

Meanwhile, the CLR and JVM actually generate remarkably fast code from
remarkably high-level specifications, and LLVM is a real thing. Haskell and
OCaml take functional definitions and often generate better code than longer
C++ definitions. I suspect that there is an under-served market here that is
reluctant to adopt new tools for social reasons rather than for technical
reasons.

~~~
bd_at_rivenhill
I find it hard to use the term "stagnate" to describe an industry that is
developing its own specialized languages/systems (see Q/KDB) and is currently
in the process of exploring FPGA technology to improve performance. The
secretive nature of the industry means that people outside it don't know
what's happening on the inside, not that participants lack awareness of
developments in more open industries.

Just to answer your point about CLR/JVM code performance: one early comment in
the original article that stood out for me was "Pure computational performance
(ignoring memory allocation/deallocation) under .NET runtime (ignoring
vectorization) is pretty close to the performance of raw C++", which is all
well and good except for the fact that you are going to end up with worse code
than someone who doesn't ignore memory allocation and vectorization.

~~~
KirinDave
> I find it hard to use the term "stagnate" to describe an industry that is
> developing its own specialized languages/systems (see Q/KDB) and is
> currently in the process of exploring FPGA technology to improve
> performance.

That's fine. We disagree on what's the right direction for software
engineering. I welcome debate on the subject. I don't consider migrating to
customized hardware a cutting edge technique.

It seems like every second justification I hear for the toolchains I see
revolves around performance complaints that are only reasonable in
hard-realtime situations or in 2001.

> which is all well and good except for the fact that you are going to end up
> with worse code than someone who doesn't ignore memory allocation and
> vectorization.

Actually, that's exactly not what happens. Modern GC is good, man, really
good. The vectorization scene is even better for the FP world.

------
forkandwait
I think discussions on performance go south because we all instinctively use
Big-Oh ideas, but there are times when constant multipliers really matter:
when the computation is really, really long (big scientific computations), or
when microseconds matter (interactivity and real time stuff).

As an example, for big scientific computations, Fortran seems to be about 30%
faster than C, its next fastest competitor (at least this is what some
physicists who did huge jobs to process imagery data to look for planets told
me once). If you are running a job that takes 15 minutes in C, this doesn't
matter, but if you are running a job that takes 10 days it matters immensely,
especially when you consider that you still have to debug. 30% is three more
days you have to wait for output before tweaking your stuff and trying to beat
the other guys to publication.

With interactive programming, performance means you either get under the
perceived instantaneous threshold or not, which is often the difference
between "this app is cool" and "this app sucks". If you can get a 2-fold gain
using C versus Java or C#, you can do a LOT of stuff "instantaneously" that
would otherwise make the user tap their fingers impatiently. Not to mention
the high speed work of quants.

In the work I do -- processing moderately large datasets into summaries on 12
hour deadlines -- 20% here or there doesn't really matter. I think this is
kind of interesting.

------
michaelfeathers
I think that the strongest indicator for the choice of C++ as language for a
new project is the prior existence of a C++ project in the development
organization.

~~~
lawn
Don't forget the existing codebase.

------
ig1
I built high-performance (~1ms) trading/pricing systems in Java, and while Java
performance is comparable most of the time, there are key areas where it's
non-competitive, for example garbage collection and socket I/O.

I used various techniques to get around these such as forcing GCs to happen
during quiet periods and making kernel modifications to default socket
parameters, but it's definitely non-trivial.

~~~
bd_at_rivenhill
So basically, you start with Java, and then you rip out a lot of the features,
such as GC, which differentiate it from C++ in order to make it faster. Why
not just start with C++?

~~~
prodigal_erik
C++ has failure modes that should horrify anyone who can conceivably use
anything else. The slightest mistake is likely to make even correct parts of
your program fail in non-deterministic ways. Worst case in a managed runtime
like the JVM is an exception and/or a big performance problem, and even those
are a lot more likely to happen reproducibly during QA.

~~~
bd_at_rivenhill
I agree that C++, like C before it, has some dangerous failure modes, but Java
has them as well. I've been trying literally for years to get some of the
people that I work with to understand how dangerous it is to have a runtime
exception outright kill one (or many) of the threads in your program if it
reaches the top of the stack while leaving the rest of the program limping
along as best as it can. The safe default behavior in this case should be the
Java equivalent of abort, and I push people to use factories that produce this
behavior as much as possible, but I can't count the number of times that I've
been trying to make sense of bizarre behavior while testing a program only to
suddenly come to my senses and go to the log file where stderr has been
redirected and see a ton of stack traces. It's only recently that I've gotten
into the habit of checking for that first before I try to make sense of
anything else.

With that said, the sort of C++ failure modes that you're talking about tend
to mostly occur when working in optimizations, and if you program with a bunch
of STL containers and shared pointers (i.e. program C++ like it was Java) then
you don't tend to see these problems very much.

------
dagw
For what it's worth, all three of my friends who are quants (at three
different companies) spend most of their time in SAS and some proprietary
in-house language.

------
feydr
everyone likes to say speed -- but it's not just that -- it's memory
management as well

~~~
cHalgan
Exactly. All these claims that memory is "cheap" are misguided. The hardest
thing is optimizing your program for appropriate usage of non-CPU resources
(inter-processor communication, memory, disk, etc.).

------
gte910h
I think there is also a bit of "I work too much to learn new things" going on.

------
cHalgan
C/C++ is just faster. You can tune it. You know what is happening. Another
very important thing with C is that you have clear control over how many
resources your program uses. And the _most_ important resource is memory.

~~~
VMG
Memory? Really? At ~$7 per Gigabyte of RAM I doubt that.

~~~
neutronicus
I don't know how relevant this is to computational finance, but in scientific
computing you can use up an arbitrary amount of RAM in dimensionality-cursed
fields. For instance, in neutron transport theory, the governing equation is
in 7 continuous variables (or 6 in steady state), so that if you want your
mesh to be twice as fine, your memory usage increases by 2^7 (or 2^6).

And, just so we're clear, that's _multiplicative_ , not additive. So you're
using 2^7 _times_ as much memory as you were using before.
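
A small sketch of that scaling, assuming a uniform mesh with the same number
of points along every axis (`mesh_cells` is a made-up helper, not from any
transport code):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Number of cells in a uniform mesh with `points_per_axis` points along each
// of `dims` axes. Assumes the same resolution in every variable.
inline std::uint64_t mesh_cells(std::uint64_t points_per_axis, unsigned dims) {
    std::uint64_t cells = 1;
    for (unsigned d = 0; d < dims; ++d) {
        // Guard against 64-bit overflow before multiplying.
        assert(points_per_axis == 0 ||
               cells <= std::numeric_limits<std::uint64_t>::max() / points_per_axis);
        cells *= points_per_axis;
    }
    return cells;
}
```

With a hypothetical 10 points per axis in 7 variables that is 10^7 cells, or
about 80 MB at 8 bytes per cell. Going to 20 points per axis gives 1.28 * 10^9
cells, or about 10.24 GB: the factor of 2^7 = 128 from the comment.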

------
veyron
The question is about quantitative finance, which is much larger than just
HFT. In those spaces, the actual language is almost irrelevant. Hell, you can
eke out sub-100 microsecond tick-to-trade times using perl, 150 using python
and 200 using awk. And when your strategy has a 10-day horizon and predicts
returns of 5% on an asset, missing a few cents due to the system is almost
negligible.

That being said, development performance (speed, iterative ability) becomes
more important, and most quants are fluent in C++ or Java.

------
tomjen3
>Performance compared to .Net or Java. When each array element access checks
the bounds and throws exceptions, you know you're leaking CPU cycles there.

/facepalm

Has this person worked with modern JVM runtimes? JIT compilers? Does he know
that they don't actually need to test the access each time but are able to
move the test out of the loops? Or use the Unix signaling system to avoid
having to create an exception, except when something goes wrong?
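
To make the "move the test out of the loops" point concrete, here is the
transformation written out by hand in C++. This is only an illustration: a
real JIT does this on its own intermediate representation, and the details
vary by runtime. The function names are made up:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked on every access, like a naive reading of managed array semantics.
inline double sum_checked(const std::vector<double>& v, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v.at(i);  // at() throws std::out_of_range on a bad index
    }
    return total;
}

// One range check hoisted out of the loop; the body uses unchecked access.
inline double sum_hoisted(const std::vector<double>& v, std::size_t n) {
    if (n > v.size()) {
        throw std::out_of_range("n exceeds vector size");
    }
    assert(n <= v.size());  // loop invariant: every i below is in range
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v[i];  // safe because i < n <= v.size()
    }
    return total;
}
```

Both functions return the same sum for valid input and both throw for an
out-of-range `n`; the second simply pays for the comparison once instead of
once per element.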

------
cpswan
Let's not forget the education angle here. Most quants these days come through
similar sausage machines, and C++ is de rigueur.

~~~
rluhar
I agree. If you look at the curriculum of any MFE (Masters in Financial
Engineering) program (CMU being the most _reputed_ from what I hear), you will
find C++ there. Most quants I work with have used C++ while they were at
University. Even shorter courses like Wilmott's CQF (Certificate in
Quantitative Finance) offer courses in C++.

------
mathattack
For better or worse, many things happen in Wall Street as a continuation of
market convention.

Why do people use Bloomberg rather than email?

Why is Sybase the standard?

Why is so much done in Excel?

There are a half dozen reasons for each of these (and a dozen more why they
shouldn't be), but the common link is, "It's the market standard, and it's
hard to change the standard."

------
supahfly_remix
I'm familiar only with C++. Do Java or C# have libraries that are similar in
concept to the C++ Standard Template Library (STL)? Its error messages can be
inscrutable, but it is very powerful.

~~~
dkersten
Yes. The Java and C# standard libraries are actually much larger and broader
than the C++ standard library + STL (possibly even with Boost included).

(fyi, I mostly program in C and C++ these days, but used a lot of Java when I
worked in telecommunications)

------
RomP
This has been answered a million times: besides the historical reasons (which
are extremely powerful), it's performance. Performance not as in how long it
would take to calculate this, but performance as in what is the worst case
scenario for calculating this. Think Garbage Collector, mostly. But also think
ability to control exactly how the data is represented and stored in memory.
Tighter storage == fewer cache misses. Market moves away while fluffy managed
data travels from RAM -> cache -> CPU.
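
A rough sketch of what "tighter storage" can mean in C++. The two `Tick`
structs are hypothetical, the 64-byte cache line is an assumption that holds
on common x86-64 parts, and the exact sizes depend on the platform ABI:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

constexpr std::size_t kCacheLineBytes = 64;  // assumption: typical x86-64

// Same fields, different order. Padding makes the first layout typically
// 24 bytes on a 64-bit target and the second typically 16 bytes.
struct TickLoose { char side; double price; char flag; int size; };
struct TickTight { double price; int size; char side; char flag; };

static_assert(sizeof(TickTight) <= sizeof(TickLoose),
              "reordering fields should not grow the struct");

// Cache lines covered by a contiguous, line-aligned array of `count` items.
template <typename T>
std::size_t cache_lines_for(std::size_t count) {
    assert(count <= (std::numeric_limits<std::size_t>::max() - kCacheLineBytes)
                        / sizeof(T));
    return (count * sizeof(T) + kCacheLineBytes - 1) / kCacheLineBytes;
}
```

With the typical sizes above, 1000 ticks span 375 cache lines in the loose
layout and 250 in the tight one, so a scan over the array pulls a third less
data through the cache.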

------
rch
Are libraries reasons? If so, then I'd say Boost deserves at least a nod.

Edit: I wanted to add some evidence, but the job looks to have been filled.
The current opening lists C++ but doesn't mention Boost. Last week though...

[http://tbe.taleo.net/NA11/ats/careers/requisition.jsp;jsessi...](http://tbe.taleo.net/NA11/ats/careers/requisition.jsp;jsessionid=0DE3BBE908622D1C3B9989549E3E6D65.NA11_primary_jvm?org=QUANTLAB&cws=1&rid=196)

(these guys are across the street from my old office - good people, from what
I hear)

