In a co-located datacenter with high-performance network equipment this isn't true at all. The time from my CPU to a kernel-bypass NIC to a cut-through switch to the exchange's FPGA gateway is in the low single-digit microseconds.
Single-digit microsecond latency is also what you should expect from a well-written market data gateway or a FIX engine. Java definitely competes with C++ in this space.
If you hit a GC pause, or have to do heap allocation, or don't vectorize a calculation because you're not running native code, it all costs way more than the network stack. Again, I'm not saying that there aren't techniques to deal with that in Java. But what I am saying is that you can't just ignore those issues, as the author seems to suggest.
Tick-to-trade? The first version of our trading system, written in Java, had this level of performance.
> If you hit a GC pause or have to do heap allocation
We ended up running the code through a profiler on our CI system, which would fail the build if it detected any code paths that could allocate and trigger GC. Object pools and flyweights everywhere.
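A minimal sketch of the pooling idea described above (class and field names are illustrative, not from any real trading codebase): instances are pre-allocated at startup and recycled, so the steady-state hot path creates no garbage for the GC to chase.

```java
import java.util.ArrayDeque;

// Minimal object pool: acquire() reuses instances instead of allocating.
class OrderPool {
    static final class Order {
        long price;
        long quantity;
        void clear() { price = 0; quantity = 0; }
    }

    private final ArrayDeque<Order> free = new ArrayDeque<>();

    OrderPool(int size) {
        for (int i = 0; i < size; i++) {
            free.push(new Order());             // allocate everything up front
        }
    }

    Order acquire() {
        Order o = free.poll();
        return (o != null) ? o : new Order();   // pool miss: ideally never hit on the hot path
    }

    void release(Order o) {
        o.clear();                              // scrub state before reuse
        free.push(o);
    }

    public static void main(String[] args) {
        OrderPool pool = new OrderPool(1024);
        Order o = pool.acquire();
        o.price = 1_012_500L;
        o.quantity = 300;
        pool.release(o);                        // returned to the pool, not garbage
        System.out.println(pool.acquire() == o); // true: same instance recycled
    }
}
```

Sizing the pool at startup is the point: if `acquire()` ever falls through to `new`, the CI-style allocation check described above would flag that path.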
Pretty standard practice in HFT Java, but the GP correctly points out this is not idiomatic Java.
> In other words, it’s possible to write Java, from the machine level on up, for low latency. You just need to write it like C++, with memory management in mind at each stage of development.
That's the tl;dr for the whole article.
> or have to do heap allocation
FYI, heap allocation in Java is much cheaper (~11 cycles) than in C++, as it's just a pointer bump in the TLAB (thread-local allocation buffer).
It's absolutely not your everyday Java (I've heard the answer to "how do you handle GCs" is just "don't allocate memory"), but it's a Thing.
Edit: since I'm being down-voted for this comment: I worked at all of the above banks on low-latency Java projects. That you don't like the language/runtime doesn't change the facts.
Every time one of these articles rolls around it's always the same point. At the end of the day, some systems actually do need low latency in real time, and you will be steamrolled by competitors if you don't have it.
If you don't live and die by a profiler, then you're flying blind.
I've seen a lot of claims that "our system absolutely requires low latency" and then when you dig, it turns out that there is a whole lot of praying under there.
That doesn't invalidate your point, to be fair. But we keep talking about all these mythical hard real-time applications, and every time I explore, the actual space gets smaller and smaller.
edit: just to be clear, I brought up destructor chains because they are a potentially large, potentially unpredictable cost in the code that may profoundly depend on the heap layout that is not immediately visible looking at the code. They are something that you need to carefully think about and measure to understand their true cost.
That's it. That's what an RT system is. I don't care if your latency is 1us, 1ms, 1s or 1h. If you can't miss the deadline then your system is a real-time one.
Low latency then depends on how your strategy works and your particular needs. It's useless to have sub-ms timing of stock market data if your results take seconds to calculate.
Of course, your strategy that takes 1s might be more lucrative than one that takes 10ms to calculate. Then it's up to you and your algos.
I have never understood why people might think it ought to be faster.
Decoding video and interactive rendering is a great example of an RT system. If you can't construct a frame within (usually) 16 ms you skip it and start with the next one.
It's a good example of a problem where a realtime system would be useful but it's not a good example of a real-life realtime application (usually).
The time to market argument is really stupid imo.
But then it just restates the development-time point! (at least it acknowledges the absurdity this time)
> First, there’s the (slightly absurd) point that if you have two developers, one writing in C++ and one in Java, and you ask them to write a platform for high-speed trading from scratch, the Java developer is going to be trading long before the C++ developer.
C++ also allows expressing complex concepts in types without overhead (can you have a price abstraction in Java with no overhead? Can you write an efficient fixed-point numeric type? Can you post lambdas between threads without memory allocations?).
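For context on the fixed-point question: the usual Java workaround (pre-Valhalla value types) is to encode the price as a primitive `long` of ticks with static helpers rather than a class, so there is no object header and no allocation, at the cost of type safety. A hedged sketch with made-up names:

```java
// Fixed-point "price type" as a bare long: zero allocation, but the compiler
// won't stop you from mixing a price with a quantity (the C++ commenter's point).
class Px {
    static final long SCALE = 10_000;           // 4 implied decimal places

    private Px() {}

    static long fromDouble(double d) { return Math.round(d * SCALE); }
    static double toDouble(long px)  { return (double) px / SCALE; }

    static long add(long a, long b)  { return a + b; }

    // Fixed-point multiply: rescale after the raw product.
    static long mul(long a, long b)  { return a * b / SCALE; }

    public static void main(String[] args) {
        long bid = Px.fromDouble(101.2500);
        long ask = Px.fromDouble(101.2575);
        System.out.println(Px.toDouble(ask - bid)); // spread: 0.0075
    }
}
```

This works, but it illustrates the trade-off rather than refuting it: C++ can wrap the same representation in a distinct type with no runtime cost, while the Java version relies on discipline.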
Yikes, this article is terrible.
But no. Literally just "actually, C++ is better, but Java is easier so you should use that instead".
Having worked on a 'low latency' trading system written in Java, this is true. Ultimately you end up using a flyweight pattern over shared memory, and you have to be aware of alignment and cache-line sizes. Then there are other tricks like zero-GC coding, or pushing collection back to once a day.
You also spend a fair amount of time analyzing performance to the nanosecond looking for, and eliminating, jitter.
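The flyweight-over-shared-memory approach mentioned above can be sketched roughly like this (the layout and field names are invented for illustration; real systems use generated codecs over a defined wire format):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// One reusable view object over an off-heap buffer. Field "accessors" are
// just offset reads, so decoding a message allocates nothing.
class QuoteFlyweight {
    private static final int PRICE_OFFSET = 0;   // int64
    private static final int SIZE_OFFSET  = 8;   // int32

    private ByteBuffer buffer;
    private int offset;

    // Re-point the same flyweight at the next message: no allocation.
    QuoteFlyweight wrap(ByteBuffer buffer, int offset) {
        this.buffer = buffer;
        this.offset = offset;
        return this;
    }

    long price() { return buffer.getLong(offset + PRICE_OFFSET); }
    int  size()  { return buffer.getInt(offset + SIZE_OFFSET); }

    public static void main(String[] args) {
        ByteBuffer shm = ByteBuffer.allocateDirect(16).order(ByteOrder.LITTLE_ENDIAN);
        shm.putLong(0, 1_012_500L).putInt(8, 300);  // pretend a producer wrote this

        QuoteFlyweight q = new QuoteFlyweight().wrap(shm, 0);
        System.out.println(q.price() + " x " + q.size()); // 1012500 x 300
    }
}
```

A direct buffer stands in here for a real shared-memory mapping (e.g. a `MappedByteBuffer` over a file in /dev/shm); the flyweight itself is allocated once and re-wrapped per message.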
I've also written the same in C++, and it's a similar exercise, along with dashes of [[unlikely]], template specialization, and far too much time inside godbolt.org looking at assembler to see if gcc/clang has "optimized" something in the latest release.
Java. And I'm a career C++ programmer.
The whole system was built to be event-driven (messaging) from the ground up, too. It had a lot of benefits: performance-testing code without instrumentation, the whole JUnit/Mockito ecosystem, code completion in IntelliJ that actually worked. You only had to worry about code errors, not linker errors. We had over 100,000 unit tests.
We leveraged the whole JVM and ecosystem too. Outside of the hot-path code, we could just go and use JSON libraries, webservers, and so on. In C++ we'd have struggled (not so much these days) to throw together a JSON/REST/WS webserver in the same code base.
Though Java always feels like trying to unwrap a present whilst wearing mittens. It's excessively verbose. We didn't pick Kotlin because, at the time, it was known to emit various extra bytecode for seemingly trivial calls, and was only at v1.0.
Java is much more comfortable to work with when you need to rapidly deliver some latency-sensitive functionality; it doesn't blow up with segfaults if you do something stupid, and it has awesome profilers, debuggers, IDEs, dependency management, etc.
Where it loses compared to C/C++ is when you need absolute control over the emitted native code (this is getting very close to solved with Graal), or when you need to prevent deoptimization storms when hitting an uncommon branch in the middle of the trading day (there are ways to prevent or limit this, but it's ugly).
The thing about banks doing systems in Java doesn't really convince me. The times I was up against a Java engine, it didn't really matter how fast the thing was, due to the particular thing we were doing. (Not everything is a speed race.)
The thing about C++ is that you get ultimate control, but you have to use it. For instance, if you just write a dummy application where you populate a hashmap and pull out some values, it's easy to do this slower than a managed solution, because C++ allows you to leave out the allocator, meaning you end up using a default one that isn't so great. On the surface it will look like any old hashmap, because that kind of thing tends to look like `map<mytype>` in most languages, but there's actually a way to say `map<mytype, myallocator>` in C++.
I find it's actually easier to perf-debug C++ than managed languages. With a managed language, you end up having to think about how the GC works, and the GC is not a simple piece of software. You can sidestep the GC, but then you're no longer writing very idiomatic Java; there's a bunch of extra rules you have to adhere to.
If you do that, you lose a lot of the stuff in C++ that supports manual memory management. For example, destructors/RAII make managing memory resources a lot better, and tooling like Valgrind and ASan makes it easier to find issues in your memory management. With Java, you don't have those. The vast majority of Java is written with GC in mind, and when you go off the beaten path and do memory management on your own, you won't have the language features and tools to support you, making it harder to do than in C++.
I have coded both in Java and C++. I enjoy coding in C++, and I detest with a passion coding in Java.
Clearly the solution is Rust. :-P
Quite an exaggeration. I've never seen a C++ codebase that doesn't resort to bare references or pointers at some point. None of them exclusively use unique_ptr and shared_ptr.
Writing stuff like that isn't mandatory.
But this website appears to be a place for freelance writers to post articles of somewhat lesser quality. I wonder what happened? Appears the Engineering Blog is now here: https://stackoverflow.blog/engineering/
In one sense, the author is right that C++-based systems do not suffice to trade that order, but he is very wrong that Java ones do.
His justification that banks use Java is not compelling. Morgan Stanley is the only bank that is remotely competitive technologically. And his justification that dev time is faster on Java is just ridiculous. It’s even faster on Python, but we aren’t talking about dev time. We’re talking about building systems that can reliably execute ingress-egress in less than a microsecond. Even using kernel bypass cards and assembly doesn’t get you there. So, no, Java isn’t a good choice for low latency systems.
The JVM, and especially the GC, are the definition of undefined behavior. You don't know what happens, when it happens, or how it happens. When the GC is working fine you're likely doing okay; if not, good luck.
The articles from Azul are supposed to be a "goldmine". I've randomly opened two of them:
and laughably, they recycle the same (stock) diagram, which, by the way, has no unit on the Y axis.
I'd love to see a GC that is as high-performing as manual memory management. But all these articles are meaningless without numbers (and a rigorous methodology).
> Since IDE support for Java is much more advanced than for C++, most environments (Eclipse, IntelliJ, IDEA) will be able to refactor Java. This means that most IDEs will allow you to optimize code to run with low latency, a capability that is still limited when working with C++.
He cites IntelliJ IDEA as a powerful Java IDE, but JetBrains does have a quite advanced C++ IDE (CLion) whose refactoring has always impressed me. I'm sure Visual Studio has some pretty impressive refactoring capabilities by itself or with JetBrains plugins. Is it really that bad, and I've just not worked on large enough projects?
However, even though at heart I still feel like a C++ programmer, since I started my career before Java existed, I actually end up using C or C++ less and less. I started transitioning to C# (which I equate somewhat with Java) and Python for small projects, and I'm finding that I'm now spending most of my time there.
Things I still do in C or C++:
- embedded systems (small memory footprint, no MMU)
- OS-like work
- numerical algorithms, especially anything with pixel data or large matrices (the good stuff stitched together with Python code)
Unexpectedly, I don't miss manual memory management at all but still sometimes want something closer to RAII.
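On the "something closer to RAII" point: the nearest analog in the Java/C# world is scope-bound cleanup via try-with-resources (C# has `using` similarly). A minimal Java sketch with made-up names:

```java
// AutoCloseable + try-with-resources: deterministic release on scope exit,
// even if the body throws. Unlike C++ destructors, it only fires when you
// remember to write the try block.
class ScopedResource implements AutoCloseable {
    final String name;
    boolean closed = false;

    ScopedResource(String name) { this.name = name; }

    @Override
    public void close() {
        closed = true;                          // runs on scope exit, exception or not
    }

    public static void main(String[] args) {
        ScopedResource leaked;
        try (ScopedResource r = new ScopedResource("fd-42")) {
            leaked = r;
            System.out.println("using " + r.name + ", closed=" + r.closed);
        }                                       // close() invoked here
        System.out.println("after scope, closed=" + leaked.closed);
    }
}
```

It covers files, sockets, and locks well; what it doesn't give you is the implicit, composition-wide cleanup that C++ destructor chains provide for free.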
C# and Java are more like the pair.
I think C++ is like western (cowboys and open ranges) and Java is like country (the setting is closer to civilization but still far from progressive, was originally aimed for western music audience).
Why is sub 1ms latency so important in trading systems? For automated trading?
You can't be the first to react to price changes unless you can hear about them sooner, process them faster, and issue responses sooner than everyone else. If your business model is "buying just before everyone else buys and drives the price up" and "selling just before everyone else sells and drives the price down", you need to turn yourself into a paperclip maximiser using ever more resources to try to out-compete the other companies doing the same.
The speed of light is roughly 1 ft per nanosecond (and signals in cable or fiber travel slower still), so 1000 ft of network cable between you and the exchange server is on the order of a microsecond of latency, just on the wire.
Hence straight-line microwave towers, or co-located servers "in the same cabinet".
Then you have the latency from the wire to machine memory, so there's a whole industry of Solarflare-esque network cards that work entirely in userland to further reduce latency.
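The back-of-envelope arithmetic above can be made concrete; the 1 ns/ft figure is for vacuum, and a common approximation for fiber is about 2/3 c, i.e. roughly 1.5 ns/ft:

```java
// Wire propagation delay from distance and per-foot latency.
class WireLatency {
    static double microsOnWire(double feet, double nsPerFoot) {
        return feet * nsPerFoot / 1000.0;       // ns -> us
    }

    public static void main(String[] args) {
        System.out.printf("1000 ft, vacuum: %.2f us%n", microsOnWire(1000, 1.0)); // 1.00 us
        System.out.printf("1000 ft, fiber:  %.2f us%n", microsOnWire(1000, 1.5)); // 1.50 us
    }
}
```

Which is why shaving a few hundred feet of cable run, or switching from fiber to line-of-sight microwave, is worth real money to these firms.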
Thankfully there are more honorable examples for low-latency systems, like medicine, games, planes, rockets, communication systems :)
2. Create a String pool and use == for equality checks.
3. Avoid Java constructs that create temporary iterators.
for (int i = 0, size = myList.size(); i < size; i++)
creates no iterator whereas
for (Object obj : myList)
does. You employ a profiling tool to find the garbage-creating sections of Java code and rewrite them.
I know of a wall street trade matching engine that operates with no pauses for garbage collection that uses these techniques.
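One way to actually verify that such a section of code is allocation-free, rather than trusting inspection: HotSpot's `com.sun.management.ThreadMXBean` reports bytes allocated per thread. This is a HotSpot-specific extension (the cast below fails on JVMs without it), and the counter is approximate, but it is the same idea the CI-gated builds mentioned elsewhere in this thread rely on:

```java
import java.lang.management.ManagementFactory;

// Measure per-thread allocation around a code section (HotSpot-specific).
class AllocCheck {
    public static void main(String[] args) {
        com.sun.management.ThreadMXBean tmx =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long self = Thread.currentThread().getId();

        long before = tmx.getThreadAllocatedBytes(self);
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;                            // primitives only: creates no garbage
        }
        long after = tmx.getThreadAllocatedBytes(self);

        System.out.println("sum=" + sum + ", bytes allocated in loop: " + (after - before));
    }
}
```

Swap the primitive loop for an enhanced-for over a collection and the delta jumps, which is exactly the iterator-garbage effect the list above warns about.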
It's certainly fast when adding more useless template trickery to the language instead of fixing real problems with it. I wish people had never found out that templates are Turing-complete.
What "real" problems do you think it has, and how do you propose fixing them?
C++20 added niebloids. Not exactly template trickery, but still another concept introduced to fix problems caused by a complicated feature, which was itself introduced to deal with other features' problems.
edit: yes, first google hit. Apparently it is now a term of art :)
niebloids sound like a medical disorder.
Edit: checked the JSF C++ standard and templates are allowed but not really "encouraged"
Edit: the author’s entire point is his claim (never backed up) that Java is better at optimizing away less-used branches in code. OpenMP lets the programmer do exactly that with C++, if the C++ compiler isn’t already superior.