JITs are un-ergonomic (abe-winter.github.io)



An anecdote from a non-JS JIT, but similar: I once spent a summer working on a game engine with a couple others where the host language was LuaJIT.

It started out great. The iteration cycles were incredibly short, performance was competitive, the code was portable, and we could do wild metaprogramming nonsense that was still fast. If you haven't worked with LuaJIT, its C FFI is also incredible!

As we started scaling though, the wheels fell off the wagon. We'd add a new feature and suddenly the game wouldn't run at interactive framerates. One time, it was the 'unpack' function, which would trigger a JIT trace abort. We would drop from 12ms frames to 100ms frames. I wrote a length-specialized version that didn't abort and moved on.

Another time, it was calling Lua's 'pairs' method (iterator over a map). Okay, so we can't do that, or a few other things that made Lua productive before.

The other problem we hit was that GC behavior was impossible to predict. We tried to mitigate it by using native data structures through the C FFI, taking control of the GC cycle to run it once or twice per frame, etc. In the end, like the JIT problem, we weren't really writing Lua anymore; we were writing... something else. It wasn't maintainable.

That summer ruined dynamic languages for me. I didn't really want to be writing C or C++ at the time. I ended up picking up Rust, which was predictable and still felt high-level, and the Lua experience ended up getting me my current job.


Thanks for the story details, quite interesting. In the end this was unfortunately a case of picking the wrong tool for the job. Don't use JITed / GC languages when you've got hard realtime requirements.

Don't build a datastore on the JVM if you care about tail latencies, you'll be fighting the GC forever (see Cassandra). Don't rely on auto-vectorisation in your inner loop if possible, one tiny change could bring that house of cards crashing down.

I'd be interested in how your team ended up picking that tech stack. Was it a "rational" weighing of options with pros and cons? Was it "eh it'll be alright"? Was it personal preference and/or prior experience?


> Don't build a datastore on the JVM if you care about tail latencies, you'll be fighting the GC forever (see Cassandra)

OpenJDK's ZGC [1] is well on its way to having worst-case latencies of under 1ms (as soon as this year) on heaps up to 16TB in size.

[1]: https://wiki.openjdk.java.net/display/zgc/Main


That says nothing about the end result. You might end up with 1000 GC collections per request; each will be 1ms, but the total will be 1 second. It's still non-deterministic.


That's a question of throughput, not latency, and there's material about ZGC's throughput as well if you're interested. E.g. Go's GC has good latency but horrible throughput; ZGC is much better.


The Java developers have been promising to deliver a performant GC since the 1.0 days in the 1990s, and they've failed to deliver one 14 times in a row.

Also, 1ms is an eternity on modern hardware. That's enough time to handle thousands of disk I/Os or network packets. That means each GC cycle can cause a latency blip on thousands of other machines.


That seems like kind of an extreme position. They've delivered the most performant GCs around, many of which are performant enough for a huge number of users.

> Also, 1ms is an eternity on modern hardware

I hate to break it to you but malloc and free aren't instant either. Also ZGC can do less than 1msec pauses, often way less. At some point it just doesn't matter anymore for virtually all users.

> That means each GC cycle can cause a latency blip on thousands of other machines.

Linux kernel timeslice is longer than 1 msec, so good luck if your server runs >1 process of any kind at all.


> and they’ve failed to deliver one 14 times in a row.

Failed at what exactly? People are happy with what we have now, and generally happier with every release (there have been some issues around G1 vs. the now-defunct CMS collector in some special circumstances). Netflix, Amazon, Apple, Google, and Twitter run much or most of their infrastructure on OpenJDK, without even using the newest collectors (AFAIK).

> Also, 1ms is an eternity on modern hardware.

That depends on whether you're looking at latency or throughput. Java's GCs achieved good throughput a long time ago. But ZGC isn't a throughput collector; it is intended for online applications that aren't CPU bound. If you're serving transactions and your application pauses for no more than 1ms every few seconds, you're probably OK. There are kernel subsystems that may cause a similar or higher worst-case blip.

In any event, Java is not intended for those who must get 100% of the performance the hardware can support, and are willing to pay a high price for getting there, nor is it intended for specialized, niche usage, where perfect performance is easy to achieve. It is intended to get to 95-98% performance at the lowest effort, for a wide variety of applications, and its main runtime cost now is in memory footprint. It's not intended to replace C/C++; we've got Zig for that.


I can't speak for the others, but I wouldn't use Google as an example there. Java is something like half the LoC but only 1% of the compute cycles (hand-waved). It's very rarely used in the systems that do the heavy lifting. Though Go is growing on that metric, and it also uses a GC.


That's not exactly a fair characterisation. The Java GCs have been state of the art for a long time. Maybe they used to optimise too much for throughput, not latency, but that has changed.


That didn't exist yet when C* was started though. I'd be interested in how it compares to ScyllaDB, I suspect still not favourably.


Sub 1ms incremental GCs did exist, but they weren't common in production languages. They typically optimised for throughput rather than latency.


'pairs' was jitted recently by the RaptorJIT crew, fwiw.

I've hit fewer snags with LuaJIT, but they're definitely there. Really wish Mike Pall had written that hyperblock scheduler before retiring...


He did not retire; he is still quite active and improving LuaJIT [1].

[1]: https://github.com/LuaJIT/LuaJIT/tree/v2.1


Well, both are true:

https://www.freelists.org/post/luajit/Looking-for-new-LuaJIT...

It's more of an emeritus situation than anything, he's still active on the list as well.

I'd be ecstatic if hyperblock scheduling, or the quad-color garbage collector, were to drop onto the 2.1 trunk, but I'm not counting on it.


I'm working with Lua right now (gopherlua) as a scripting option for real-time gaming. I've done similar things to your story in the past with trying to make Lua the host for everything, and I'm well aware of the downsides, but I have a requirement of maintaining readable, compatible source (as in PICO-8's model) - and Lua is excellent at that, as are other dynamic languages, to the point where it's hard to consider anything else unless I build and maintain the entire implementation. So my mitigation strategy is to do everything possible to keep the Lua code in the glue-code space, which means that I have to add a lot of libraries.

I'm also planning to add support for tl, which should make things easier on the in-the-large engineering side of things - something dynamic languages are also pretty awful at.


You might still run into GC problems, but none of the Go-based Luas (built on Go, rather than binding to another Lua library) I am aware of have a JIT built in.


GP is talking about LuaJIT, you're talking about Lua. Lua has lower performance but should be completely predictable, GC aside (not sure what its GC scheme is), so it's a very different situation.


The OpenJDK JVM (aka Hotspot) addresses both issues: control [1] and monitoring [2] (there are built-in compilation and deoptimization events emitted to the event stream). You can also compile methods in advance [3], and inspect the generated machine code when benchmarking [4]. You can even compile an entire application ahead-of-time [5] to produce a native binary.

[1]: https://docs.oracle.com/en/java/javase/14/vm/compiler-contro...

[2]: https://docs.oracle.com/en/java/javase/14/jfapi/why-use-jfr-...

[3]: https://docs.oracle.com/en/java/javase/14/docs/specs/man/jao...

[4]: http://psy-lob-saw.blogspot.com/2015/07/jmh-perfasm.html

[5]: https://www.graalvm.org/docs/reference-manual/native-image/


The article [incorrectly] equates all JITs with the author's experience with V8. The article even states "faster than python, but slower than Java" which makes no sense because Java is a JITted language.


The GC is still nondeterministic though.


True, but few people care with ZGC [1] well on its way to having worst-case latencies of under 1ms (as soon as this year) on heaps up to 16TB in size. We're getting to the point where jitter due to GC is no larger than jitter introduced by the OS, so the only real cost is footprint.

[1]: https://wiki.openjdk.java.net/display/zgc/Main


This is great. Some practical downsides though: it's not available for all architectures (e.g. WebAssembly); the license is GPL, which makes this unsuitable for some applications; and finally the project is run by Oracle, which is reason for some to avoid this project.


The architectures, license (GPL + classpath exception; i.e. it is not viral), and Oracle's leadership of OpenJDK have worked well for many companies for years, like Amazon, Google, Netflix and Apple, that base much or most of their infrastructure on it.


> GPL + classpath exception

This seems only useful if you want to use the VM, not if you want to embed it into your own application.

But the whole classpath exception clause is confusing to me (and from what I see I'm not the only one). I just want to write code, as opposed to interpreting licenses ... I.e., exactly why I would avoid this project, regardless of any good intentions from the side of the project team.


It's OK, some things are not for everyone; some people avoid Linux, that has a similar license -- perhaps even more complex when you consider gcc and glibc -- for similar reasons. The GPLv2+CPE license has been around for over a decade and is used by one of the largest and most successful open-source projects that, in turn, is used by millions of developers and the world's largest companies.


You can embed the JVM into an app using the JNI APIs without triggering the GPL.

I agree the GPL+CE is a confusing license, however. But you're arguing that you should avoid Java because of the license. Think about all the companies that use it, when was the last time they had any issues because of the GPL?


The GC amortizes the cost of allocation. Allocation is expensive even in languages with manual memory management.

The real problem with the GC is that it leads people to believe that they don't need to be considerate about how they are using memory.
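For illustration, a minimal JS sketch of what "being considerate" can look like (the names are invented for the example): reuse a caller-supplied object in hot paths instead of allocating a fresh result per call.

    // Allocates a fresh result object on every call; in a hot path
    // this keeps the GC busy.
    function midpoint(a, b) {
      return { x: (a.x + b.x) / 2, y: (a.y + b.y) / 2 };
    }

    // Allocation-aware variant: the caller supplies a reusable object.
    function midpointInto(a, b, out) {
      out.x = (a.x + b.x) / 2;
      out.y = (a.y + b.y) / 2;
      return out;
    }

    const a = { x: 1, y: 2 }, b = { x: 3, y: 4 };
    const scratch = { x: 0, y: 0 };
    for (let i = 0; i < 1e6; i++) midpointInto(a, b, scratch); // no per-iteration allocation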


So is malloc


It is much easier to avoid malloc writing C than to avoid new in Java.


Nothing to do with GC though. It's also fairly easy to avoid `new` in Go and C#.


To some degree, probably.

Structs have many limitations though, and two different argument-passing semantics in the same language mean more complexity. Also, you may manage to eliminate allocations in your own code, but what about libraries? The existence of a GC shepherds programmers and library creators into heap allocation. You don't need to write new explicitly to allocate on the heap.

Also high level languages with no GC make manual memory management much more usable and provide way better ergonomics in this area, just because they have to. E.g. automated reference counting, ownership/lifetime control, move semantics etc.


Automated reference counting is a garbage collection algorithm, and it is possible to have ownership/lifetime control without giving up on GC; Chapel, ParaSail, Haskell, OCaml, Swift, and D are following exactly this path.


How do you avoid heap allocations in Haskell?

Automated reference counting is technically a GC, but a kind of GC that can be enabled for only a subset of objects. In languages which force GC on everything (even reference-counted GC) the incentive to provide abstractions that work without GC is much weaker. It is just hard to opt out of GC once you have it and once all your stdlib relies on it. I think D learned it the hard way.


Stack and register allocation in Haskell is under compiler control via escape analysis; however, you can allocate native heap outside the GC via mallocBytes/allocaBytes/free, and, if they get integrated into mainstream GHC, via linear types.

Besides, plenty of GC enabled languages offer the option to stack allocate, static global allocations, or native heap.

Some examples, including languages that for whatever reason failed on the mainstream market.

D, C#, Swift, Oberon, Oberon-2, Active Oberon, Component Pascal, Mesa/Cedar, Sing#, System C#, Nim, Modula-2+, Modula-3, VB, Xojo, C++ (via C++/CLI, C++/CX and Unreal C++), Common Lisp.


You might want to look at the new RTSJ spec [1] for how memory is divided into a GC heap and nested arena-collected regions in real-time Java.

[1]: https://www.aicas.com/cms/en/rtsj


As an addendum to my comment, D learned the hard way how not having a focused goal or a godfather company hurts adoption.

Go and .NET are doing quite alright.


True, but the bounds are much better.


This article has a number of issues. JS with JIT is waaay faster than Python, not “between python and java” as purported. Second, generalizing JITs as “un-ergonomic” seems silly given that what's specifically being looked at is benchmarking. But what makes this claim ridiculous is that nothing is easy to benchmark. Even native code is hard to profile, and this is literally my day job. If the JIT makes your code that much faster, this strikes me as a pretty suspect complaint.


I think that by “between python and java” they meant “faster than python and slower than java”. I think Java still beats JS unless you get lucky.

You’re totally right that benchmarking and profiling are hard even for native code. I think this post fetishizes whether or not a piece of code got JITed a little too much. Maybe the author had a bad time with microbenchmarks. There’s this anti-pattern in the JS world of extracting a small code sample into a loop and seeing how fast it goes - something that C perf hackers usually know not to do. That tactic proves especially misleading in a JIT, since JITs salivate at the sight of loops.
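To make the anti-pattern concrete, here's a hedged sketch (the function is invented for illustration): the tight loop is exactly what the JIT loves, and since the result is discarded, the optimizer may hoist or eliminate most of the work, so the timing says little about how the code behaves in a real program.

    function parsePrice(s) { return Number(s.slice(1)); } // toy example

    console.time('microbench');
    for (let i = 0; i < 1e7; i++) {
      parsePrice('$19.99'); // result unused: the JIT may optimize it away
    }
    console.timeEnd('microbench');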


> I think that by “between python and java” they meant “faster than python and slower than java”. I think Java still beats JS unless you get lucky.

Yeah, that is what they meant, but it is a little misleading. Javascript's performance is usually within a single digit multiple of java's, whereas python is often significantly slower. Javascript is somewhere between Java and an abacus too.


That still doesn't make sense to me.. how can JIT in general be slower than Java when Java is JITed?


The peer comment about static typing is correct, but there's more, err, flavor to be enjoyed in the challenges of JITing JS. Here are some deopt scenarios to keep you up at night.

To whet your appetite, what happens if you redefine Math.round? Java prohibits this for obvious reasons, but a JS joker may write:

    Math.round = () => Infinity;
JS engines really do inline Math.round, and must be prepared for this nonsense.

It gets worse. Maybe you check for a property which is usually not found:

    if (!val.uuid)
        val.uuid = uuidgen();
Hidden classes can almost reduce this to a pointer comparison. But what if someone adds a uuid property on Object.prototype? Every check is busted! v8 handles this with "validity cells", and it's ugly, requiring that every object know about every object which "inherits" from it.

Now if you are a monster you may choose to write:

    Object.defineProperty(Array.prototype, "42", {value: "lol"});
    console.log([][42]); // yes it prints "lol"
Every array gets a default value for 42. Think about how you would JIT numeric code in such an environment...


Python suffers the same issue. There's some discussion about dealing with it in this PEP and the pages it links: https://www.python.org/dev/peps/pep-0509/


Python does not seem to be even trying to be competitive in performance though so it's less of an issue for them.


They do, it's just not a priority. The PEP I linked (as well as most of the work of its author, Victor Stinner, in recent years) is motivated by JITing.

There's also the "frame evaluation API" PEP [1], whose purpose is to allow pluggable evaluators in CPython without forking the entire interpreter, like Unladen Swallow had to.

[1]: https://www.python.org/dev/peps/pep-0523/


Python is orders of magnitude slower than JS, though. It has the same problems, but doesn't even try solving most of them. PyPy would be a better VM to compare against.


Java deals with almost exactly the same issue. If you define an interface and a single implementing class, the code will be compiled to always call that class's method, and then deoptimized / recompiled if you load another class that implements that interface. JS could deal with these issues in a similar manner.


You can redefine Math.round. It's an ordinary method from rt.jar. You can change rt.jar or you can redefine it with a Java agent.

Not as straightforward as with JS, but the Java JIT still has to account for those things.

Maybe it will assume some things about standard classes for optimization, but for user classes that will be true anyway.


You can really redefine an existing Java method at runtime? Well TIL!


The Java JIT has static type information to work with, the JS JIT can only infer type information via heroic efforts. Static types do mean something, especially when working with primitives and other unboxed data types.


This. Javascript engines have to do stuff like https://mathiasbynens.be/notes/shapes-ics to assume many objects will have the same key names and types and create an optimized class layout, with some kind of expensive fallback if anyone ever randomly stores a key or value that violates the assumption. They can't even be sure you'll always be doing integer math (unless you resort to bit masking a la asm.js). Hinting (like __slots__ in Python) might have helped some.
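A small sketch of the idea, assuming V8-style hidden classes as described in that article: objects created with the same properties in the same order share a shape, and a call site that only ever sees one shape stays fast.

    function norm(p) { return Math.hypot(p.x, p.y); }

    norm({ x: 3, y: 4 });  // one shape: {x, y} in this order
    norm({ x: 6, y: 8 });  // same shape; the call site stays monomorphic

    norm({ y: 4, x: 3 });  // different insertion order = different shape
    const p = { x: 3 };
    p.y = 4;               // shape transition after creation
    norm(p);               // the call site is now polymorphic and slower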


This was one of the main motivations for writing Dart. The V8 implementers were tired of having to write these kinds of hacks. Even though Dart was initially a dynamically typed language, the shape of the objects was stable.


> The Java JIT has static type information to work with

I don't know whether the Java JIT has type feedback. Regardless, Java being statically typed means Java code is structurally closer to what a JIT wants and can analyse; it's much more difficult (and thus rarer) to fuck around with types and objects generated on the fly at runtime. For instance, you're not going to add new attributes to an instance, whereas that's just Tuesday in JavaScript.


Java JITs definitely do have type feedback. They profile the types of objects at virtual call sites and do guarded inlining, as well as removing expensive interface casts, etc. The JVM doesn't really have a distinction between classes generated on the fly and those that were on disk, as they all go through a classloader anyway. JVMs, for the most part, blow their brains out at the end of the day and start from zero the next VM run.


Well, you're right that JS JITs can only infer type information via heroic efforts, but AFAIK the Java compiler throws away any type information from the source code, which means that the JVM JIT needs to infer type information again from the bytecode.

Still, the JVM JIT is faster than JS due to reasons explained in the sibling comment of the parent one.


Java only throws generic type information away in the declaration of classes and methods. Code that accesses generic types is compiled with casts to the expected type: List<String>.get(1) will generate a cast to String, so nothing is actually missing in the generated code. It is only missing when you use reflection to e.g. deserialize a List via Jackson by passing List<String>.class. That unfortunately won't work because the generic type parameter is not part of the declaration, only the generated code.


> but AFAIK the Java compiler throws away any type information from the source code

You're thinking about generics. .class files preserve a whole bunch of type information (I'm building a .class decompiler in my free time, and I'm looking at that very same data in my debugger ATM).


You probably know this, but it's not obvious to the random reader.

Even generic types are available in the Java .class file and are accessible from the reflection API. Spring for example uses this quite heavily.


It depends on where you are at.

Type information is present in fields and class inheritance.

For example, a class like this

`class Foo implements Bar<String>`

Retains the fact that the generic type is a String.

That information is completely lost at method invocation. So a method that takes a `Bar<String>` ultimately compiles to a method that takes a `Bar` and knows nothing of the String.

To get that generic information down you have to engage in some fun tricks using either the class or field method I mentioned earlier. (Usually you do this with a second type parameter where it matters).


Java JITs don’t throw away type information. The bytecode carries strong types, though you have to do some work to get them out (some abstract interpretation). But that’s something you can do statically; no need for profiling or speculation.

In JS you can only get the types by profiling.


Generics are erased in Java. Not so in the CLR.


Yeah but the thread is about types, not generic types specifically.

Also it’s not really true that generics are erased. There’s that horrible thing javac does so that the VM can support reflection for generics. I get your point though.


Do you mean the passing of classes as additional arguments? I haven't seen the horrible thing.

Overall I think C# got this one right. Generics are right there in the binaries.

bUt ActuALLy, I think the rightest right thing is to do specialization up to representation at link time (or compile time), when the whole program is available, ala MLton. Virgil does this. Of course this is not possible in a dynamic code loading environment but I only have so many fucks to give in this life.


No, I meant that the Class format has metadata about what the generics were, so some java.lang.reflect thing can query what the generic type was. I don’t think it comes with soundness guarantees.

I agree C# got this right.

I agree that doing specialization up to link time is ideal from a certain standpoint.


All generics in Java are of type Object from the start; you can't call any methods on them that aren't implemented by Object. So it is wrong to say they are erased; they were never there to start with.


You can add a constraint to a generic declaration, which is an upper (or lower) bound on the allowable type arguments.

e.g.

  class A<Y extends X> {
    // in this scope, Y is known to be of at least type X,
    // so, we can call methods on expressions of type Y
    // that belong to type X (and not Object)
    Y m() { ... }
  }
Type arguments are omitted from usage sites. The technique is literally called erasure in papers and documentation.

  a = new A<Foo>();
  f = a.m(); // should return a Foo, in bytecode returns Y
             // and compiler inserts a cast from Y -> Foo
Generic code is slower in Java because of these extra casts. To get back (most of) the performance, the JVM has to inline enough methods to be able to track the types from start to finish. It can't always.


And that's why Java is slower than it could be. If I'm using a String[] array, it can't contain anything but String objects, so the JVM does not have to check the type every time I access that array. That's not true for ArrayList<String>, where the compiler must check the returned object type from `get(index)` (because it really is Object and can contain anything), but it could be true with better language design.


This is not true; bytecode stores fields and classes as fully qualified strings: https://stackoverflow.com/a/17406592


Yeah what you said.


You're not wrong, but I always wonder how the verbosity and crazy class hierarchy of Java code doesn't make things slower.

I know that compilers are smart, but the API doesn't make things easier at all


The Java standard library isn't particularly verbose or insane. It's the "enterprise java" world that deals with the sort of excesses you're thinking about.


Verbosity makes the programmer slower, not the program.


It depends. For example, you can verbosely write a loop to copy data from one array to another, or you can use memcpy. The latter is typically implemented as hand-optimised assembly - it's a pretty small number of operations on x86_64. The former - maybe the compiler optimizes it, maybe it doesn't. If it does optimize it, that's definitely more complexity in the compiler and slows it down.

In general, having some higher-level, well-optimized helpers can certainly reduce verbosity and increase speed. That said, some types of verbosity just make the programmer write what the compiler would translate to anyway. Or can end up being unnecessary - e.g. a strongly typed language with no inference certainly slows the programmer with no effect on the program (though maybe an effect on the type checker).


A fast memcpy used to be complicated to implement but these days I heard a simple

    rep movsb
will do the trick. This assumes that the source register points to source, destination register points to destination and counter register is set to number of bytes to copy.


I wish that was true but my data says it isn't:

- A verbose SIMD copy loop is often faster even on modern Intel CPUs that have the more modern rep implementation.

- A simple byte copy loop will beat both SIMD and rep when you're copying smaller amounts at a time.


Yes I understand some verbosity is structural and won't change anything, for example

    SomeClass x = new SomeClass();

won't make anything slower (ignoring auto for now)

Now, I read Java code and I can't help but wonder how things are harder

For example, setters and getters: what would be a simple memory write becomes a function call. Not complaining when you actually need it.

Reading the examples here (and those are not too bad) https://developer.android.com/reference/java/net/HttpURLConn... it seems you have to actually fight the API to get anything done

Why do you need to cast the return of a url.openConnection() to a HttpURLConnection? (I mean, how many connection types exist?)


So for getters and setters, a JIT can inline the function call, and in the end you have a direct memory access. In your example you have to cast because URLConnection could be an HttpURLConnection or a JarURLConnection. But again, a good JIT would speculate that it is always an HttpURLConnection and deoptimize if not.


The JVM also has full information about the class hierarchy at runtime. If you add a class that overrides a function, it will deoptimize that overridden function to a virtual function, but otherwise the JVM will just treat it as a static function and inline it if necessary.


In modern Java I don't see why you would directly use an HttpURLConnection. The stdlib provides an easy to use HttpClient.

https://docs.oracle.com/en/java/javase/11/docs/api/java.net....


> I mean, how many connection types exist?

Two? https://developer.android.com/reference/java/net/URLConnecti...


The design of JavaScript is not very friendly to high performance. Even though Java is itself not easy to generate efficient code from, producing efficient code from JavaScript is far more difficult.


The claim wasn’t about JIT in general, but rather about JS.


Then it’s problematic to be so JS and V8 centric. Also the post specifically talked about JS being faster than Python and slower than Java so that’s what this thread is about.


I’m not advocating for the post or anything else, I was just clarifying a misunderstanding.


That was not a general statement about Javascript performance. The entire article is about the unpredictability of the JIT. When the JIT hits a bad code path, it really does perform like Python; when it hits a good code path (most of the time), it performs like Java. This unpredictability is what is causing the issues, not the general performance.


Pypy JIT is way faster than CPython


It is, but that’s not saying a lot and it’s still far far slower than Go/Java/C#, and it also isn’t compatible with lots of existing Python code. Such as anything that would like to talk to a Postgres database. Pypy is great, but it can’t fix a broken ecosystem and general lack of leadership in the Python community. :(

(Broken ecosystem = lots of things don’t work with lots of other things; package managers are another example; Mypy another)


What's the problem with Postgres on PyPy? Running fine in production here.


There are no supported drivers for it as far as I could find. The common answer is to use psycopg2cffi, but that hasn’t had a commit in like 4 years. The alternatives I could find were in the same shape or worse.


There's a few pure python drivers. Check https://wiki.postgresql.org/wiki/Python and https://wiki.python.org/moin/PostgreSQL

E.g. pg8000 looks like it's being maintained?


I recall looking at pg8000 in particular, but it didn't inspire confidence for our production use case. It had only 20 commits in 2019 and fewer than 100 stars. I also didn't have a good understanding of the performance tradeoffs (pypy/pg8000 vs C library).


Poetry and mypy are fine. Postgres works on pypy. But generally.. yeah you're right.


My experience with Mypy has not been pleasant. It doesn't work with a whole host of packages, including zip packages (PEP 441) or those that aren't built for Mypy directly (you have to sprinkle empty py.typed files in various places in your project IIRC or mypy will ignore the type hints you provide; I ran into this with the `black` package). Further, getting it to find installed packages is tremendously difficult - it generally just fails with "couldn't find package" or something similarly useless, and the docs it links you to are not helpful in my experience. Mypy also barfs on recursive types, so you can't express things like JSON, nor can you constrain generic types based on protocols.

Similarly, Postgres doesn't work for Pypy unless you're willing to use libraries that are unmaintained or only very lightly maintained, and even then who knows what the performance tradeoffs will be. Those are pretty substantial caveats for a production use case.

I haven't used Poetry, but I've heard good things. I'm very cautious though, because I've also heard high praise about pip/virtualenv, pip-tools, pipenv, etc and every single one had spectacular problems on the happy path (30 minutes to resolve dependencies with pipenv is my favorite).


> who knows what the performance tradeoffs will be

Sure that's true, but performance is about measurement.

Poetry has some missing pieces still (I can list them if you're interested but I'm on my phone) but it's the first time I've felt ok recommending a packaging tool to beginners. The experience is way better than the alternatives like pipenv.


> Sure that's true, but performance is about measurement.

I have a very low degree of confidence that a pure Python solution (even JIT optimized) is going to compete with a native solution, not to mention that by all appearances it isn't sufficiently well-supported to meet our criteria. Measuring the performance for me feels like a waste of time; it's easier just to forego Pypy until the ecosystem matures.

> Poetry has some missing pieces still (I can list them if you're interested but I'm on my phone) but it's the first time I've felt ok recommending a packaging tool to beginners. The experience is way better than the alternatives like pipenv.

This is good to know. I'll have to give it a try.


>JS with JIT is waaay faster than Python. Not “between python and java” as purported.

You are correct that JS is not between Python and Java. Python is faster than Java, which is faster than JS. Though some people seem to think calling APIs written in C is "not Python" - if the ecosystem provides the library and I call it from Python, then it's Python enough for me!


By that logic, Python, Javascript, and Java are all the same speed, just call out to C, or assembly perhaps if you really want to tune some code.


No, not at all. Java FFI is extremely expensive.


Yeah. And so is JS’s. I think CPython’s design makes it a champ at native interop and a loser at pure compute perf. That’s a valid trade off for lots of important situations.


Are you referring to V8's/Node's C FFI? Why is that extremely expensive?


It's not a trade-off. There's no reason why you can't have a blazing fast and expressive language that also has good C FFI. A single good programmer could make a language that was as expressive as Python (that is, not very), faster, and with a good C FFI in under 6 months.


You couldn’t be more wrong. Java and JS have expensive FFI because of the specific costs of synchronizing the VM entry/exit with the GC and JITs.


Aren't you saying it's possible to make a perfect language with no tradeoff, and all that in just 6 months?


No I'm saying it's easy to make a language better than Python in six months.


Then where is it?


What you're implying is that the "speed" of a language is purely defined by the speed of the C FFI? This view is absolutely ridiculous and I don't think you would find a single other programmer on this green earth that would agree with you.


I think everyone agrees that when they want to do some data analysis they pip install numpy and get to work. It's only the navel gazers who are concerned about CPython's performance instead of treating Python as a 'race to C' wherever performance is concerned.

Get used to this approach since it crops up everywhere. If you target a GPU, you try to keep the work on the GPU. When we move to use BPF and io_uring, things will be a race to the next kernel program. If you target the CPU, then your code is a race to C or a race to the function that calls the relevant CPU instruction that will make your code fly.


As someone who's been writing a lot of JavaScript, Go, and a handful of other languages for a while, I feel this. In Go, I can basically know what's going to happen when I write a function. This operation will read from the stack, these instructions will be run, and I can take a peek at the assembly if I'm not sure (though I've developed a pretty good feel for what Go will do without needing that). I can benchmark it and know that the performance I see on my machine will be the performance when I ship this bit of functionality into production, barring hardware differences.

In JavaScript, it's a black box. I know some constructs might deoptimize functions when run on Wednesdays because I read them on a blog published in 2018 that's _probably_ still accurate. In my benchmark running on Node 12.14.1 on Windows this seems to be true. But then who knows if it'll be the same thing in production, and it might 'silently' change later on.

JavaScript in V8 is incredibly fast these days, but I find it much easier to write optimal code in Go.


It's really no different from native compiler optimizations, which are also mysterious and always changing, except for one key aspect: when recording timings to compare against other timings, you can control whether optimizations are turned on or off, to remove that variable from the comparison.


Thankfully, with a native compiler you don't have to deal with your JIT warming up, or some other code causing your function to get deoptimized, or subtle JIT trace aborts. :(


Instead you have to deal with a myriad of optimization flags, optimizations that take advantage of UB and change between C and C++ compiler versions, inlines being dropped because a bug fix made the function body bigger than the allowed threshold, ...


> or some other code causing your function to get deoptimized

Why doesn't the optimizer generate different compiled implementations of a function for different pieces of code?


I would imagine this to be due to the overhead of then having to track where EVERY variant of a JITted function can then be called from, when to deoptimize, etc.


> It's really no different from native compiler optimizations, which are also mysterious and always changing

Go’s compiler is kind of stupid in this sense.


Have you tried GCC go?


No, I have not. Does it do saner things than the “my way or the highway” standard Go toolchain?


Yes, I believe it is more flexible. The work is also funded by Google.


Javascript lets you work at a pretty high level, and tries to fill in a reasonable implementation, but you can't be sure how its choices will perform. Go (like C) forces you to dictate the exact implementation in painful detail without abstracting away much of anything, which might be a reasonable tradeoff for very hot paths but not the entire system.


This seems a little unfair, given while the compiler is an even bigger pedant than I am, Rust does allow you to abstract away quite a lot of stuff and still get performant results.


I like Rust for reusable low-cost abstractions, but using it professionally seems to require an employer with a lot more trust in their team than usual.


> In JavaScript, it's a black box.

You can actually look at the assembly generated by V8 and you can trace de-opts. The workflow is just so awful, you're not going to want to do it.


I really hate that one just isn't able to provide a basic -S or similar switch.
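Something close does exist, for what it's worth, though the output is rough and flag availability varies by Node/V8 version (treat this as a sketch, not a stable interface):

    // dump.js -- run with:
    //   node --print-bytecode dump.js   # Ignition bytecode
    //   node --print-opt-code dump.js   # optimized machine code, if the
    //                                   # build has the disassembler enabled
    function add(a, b) { return a + b; }
    let s = 0;
    for (let i = 0; i < 1e5; i++) s = add(s, i); // run hot so optimized code exists
    console.log(s);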


> if your economics are such that servers are a bigger cost than payroll

Sorry, and I may be oversimplifying the author's situation, but this really sounds like a case where you need to not be using JS for your server. On the client you don't have much choice, but on the client the pure-JS performance rarely gets tight enough to warrant this degree of micro-optimization work.

Author makes some good points - it would be great if the JIT were more profiler-friendly - but I have to question a little bit how important it actually is, the way the use-cases line up.


They are; Common Lisp, Java/Android and .NET all provide very developer-friendly graphical profiling tools.


The profiling and tooling integration of IntelliJ with Java has probably been the biggest thing keeping me from using another ecosystem for my personal hacking.


When someone realized you pay a lot for cycles at the server, but you pay nothing for cycles at the client, that was the moment the world-wide web began to die.


This is baseless mudslinging. The bottleneck on the client is virtually never the cycles consumed by the actual running of the actual app's JS code. It's usually, in order starting with the most common:

- Piles of ads/analytics scripts which have no motivation not to slow down the page

- Reflow; i.e. needlessly many elements on the page causing the browser to do extra work calculating layout

- The initial loading and JIT-ing time of a needlessly heavy JS bundle


Well this is hyperbolic.


> Interpreters, which read the program line by line as it runs

Byte code interpreters do that? That would be surprising. Programs are represented with 'lines' in byte code?

Things he wants, let's look at SBCL, a Common Lisp implementation (see http://sbcl.org ):

> compile short bits to native code quickly at runtime -> that's done by default

> identify whether a section is getting optimized -> we tell the compiler and the compiler gives us feedback on the optimizations performed or missed

> know anything about the native code that’s being run in a benchmark -> we can disassemble it, some information can be asked

> statically require that a given section can & does get optimized -> done via declarations and compilation qualities

> compile likely sections in advance to skip warmup -> by default

> bonus: ship binaries with fully-compiled programs -> dump the data (which includes the compiled code) to an executable


Yes, I thought of SBCL when I was reading this too.

A JS frontend for SBCL could be nice...


Or at very least improve browser developer tools to give us what Lisp compilers have been doing the last half century.

Why should we need to hand compile special versions of JavaScript VMs, just to get (decompile ....)?


>Byte code interpreters do that? That would be surprising. Programs are represented with 'lines' in byte code?

kdb+/q has a bytecode compiler and each line is turned into bytecode and run, line by line. Also the performance is as good as C, most of the time.


A JIT is just another cache, like memory. Yes, it’s hard to predict, but not fundamentally that different from caching in any language. It does mean perf tests have to be end-to-end and match real-world loads, but it doesn’t mean it’s “impossible” at all, it means you need to measure.

Is this a real problem? I’ve been profiling my JS for years and never actually run into a mysterious problem where some important code I profiled was way way slower in prod than when I was profiling. Has that happened for you? How often does this happen? I take it as an assumption that profiling is something you mostly do on inner loops & hot paths in the first place. I mean, I profile everything to look for bottlenecks, but I don’t spend much time optimizing the cold paths.

> Get notified about deopts in hot paths

Studying the reasons for de-opts helps you know in advance when they might happen. If you avoid those things, de-opts won't happen, and you don't need notifications.

For example, ensure you don’t modify/add/delete keys in any objects, make sure your objects are all the same shape in your hot path, don’t change the type of any properties, and you’re like 90% there, right?
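In code, those rules look roughly like this (a sketch of the convention, not an official guideline):

    class Particle {
      constructor() {
        // Initialize every property up front, always with the same types:
        this.x = 0;        // always a number, never later a string
        this.y = 0;
        this.alive = true; // present from birth, not added in a hot loop
      }
    }
    // In the hot path, read and write existing fields, but never add,
    // delete, or retype properties, so the object's shape stays stable.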

> statically require that a given section can & does get optimized [...] compile likely sections in advance to skip warmup

While these don’t exist in V8, it’s maybe worth mentioning that the Google Closure compiler does help a little bit, it ensures class properties are present and initialized, which can help avoid de-opts.


Hey, Node/bluebird person here: you want to run Node with --trace-opt and --trace-deopt and --allow-natives-syntax with %OptimizeFunctionOnNextCall before benchmarking.
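For anyone who hasn't used those flags, a minimal sketch (behavior differs across Node/V8 versions; newer V8 also wants a %PrepareFunctionForOptimization call first):

    // bench.js -- run with:
    //   node --trace-opt --trace-deopt --allow-natives-syntax bench.js
    function hot(x) { return x * 2; }

    hot(1);                           // warm up to collect type feedback
    %OptimizeFunctionOnNextCall(hot); // natives syntax: force the optimizer
    hot(2);                           // this call runs optimized code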


The high-level point that is of great importance is that if one wants a computer program to function in a reliable way, one needs simple and understandable algorithms/components. Nowadays even the processor is no longer simple, assembly has become a high-level language, and it got us 'nice' things like Spectre.

For practical day-to-day work, the KISS principle that has been the cornerstone of effective programming for half a century is now more important than ever. Yes, there is a nice library available to do such and such, and maybe you need it, but are you also thinking about the danger that any additional moving part may increase unpredictability?

Let me give one very stupid example that I ran into this very week. The agreed-upon answer according to https://unix.stackexchange.com/questions/29608/why-is-it-bet... is that it is better to do #!/usr/bin/env bash than #!/usr/bin/bash. I say: absolutely not! You have just increased the number of moving parts involved by one, for an immeasurably small benefit. And if you are worrying about bash versions, then just stop using any bash features that are less than a decade old.

I also say that unless it is necessary for the core problem that you are trying to solve, stay away from any features of anything that are less than 5 years old. And if you actually need the new and fancy stuff, and maybe you do, expect to pay a hefty price for it. Every new tool that you introduce has its own peculiarities that you will spend hours of debug time on, and one should err on the side of just saying no.


LuaJIT, true to form, has a sort of solution for this, in the form of a profiler cheap enough to run in production.

You do have to change how you think about performance analysis, but in return, you get to actually answer the question you're trying to reason about, namely, how does this run in production.

Pacifying the JIT is a bit of a dark art, but the whole thing is pretty transparent with good tooling. I've yet to regret building on LuaJIT.


Worth noting that the JVM has this feature as well, under the name of "flight recorder".


This article contains a logical error. The premise is JITs are hard to benchmark and keep good performance on. That's true.

But the alternative is bad performance all the time (the JITs fall back to interpretation, after all).

What's the value in having clearly understood bad performance? If you care enough about performance that you need to understand it, surely you care about the absolute level of performance.


An unreliable optimisation can be worse than no optimisation, because if you understand your performance you can take steps to prepare appropriately (e.g. maybe you scale horizontally, or put some time into manual optimisation). If your code suddenly gets 10x slower in production for no clear reason, that's a lot worse than your code having been running 10x slower from day 1.


False dichotomy. No one should be using shitty byte code interpreters like (c)python for real production work. Or at least stop whining about global warming. The real alternative is AOT.

Also, in many cases it's much better to trade off average performance for lower variance.


The vast majority of the time it doesn't matter. Most of us don't write performance critical code. Even games use bytecode VMs for scripting. You'd be better off spending the money saved on planting trees.

If you write a web app in Ruby using a similarly light weight frameworks as you would using Go, you'll be lucky to get 3x the throughput out of the Go app.

There are also ways to utilize hardware more efficiently with bytecode than by distributing binaries. Using bytecode verification rather than an opaque binary, you can pack thousands of different web applications into a single process.

This is an application of webassembly which IMHO is much more promising than in-browser use.


> The vast majority of the time it doesn't matter.

I keep hearing this all the time and keep seeing all the time that it isn't true. Do you know of a single comparable Go web app that's not 10-100x less wasteful than goddamn GitLab? Most of the pathological sources of CO2 cloud emissions are a combination of terrible architecture and slow-as-molasses scripting languages like Python or Ruby. But these are not orthogonal. In theory GitLab could probably be 100x less wasteful even when written in Ruby. In practice, many bad architectural decisions are far more likely to occur with Ruby and Python than with a statically typed language (although Java may serve as an interesting counter-example).

> Using bytecode verification rather than an opaque binary, you can pack thousands of different web applications into a single process

Can you expand on what problem this solves?

Anyway, the problem isn't the bytecode, it's the shitty interpreter. There are plenty of bytecode-based languages that are within 10x of C.


Yeah, absolutely. I've worked on some incredibly poorly performing Go services. The idea that a team struggling to deliver on a roadmap and keep up application performance while using a high level language like Ruby will magically be able to do it while using a lower level language is just a complete fallacy. It never happens!

People get bogged down. They don't write low allocation code. Nobody has any time to turn on the compiler warnings for escape analysis. Your benchmarks end up a year out of date and don't even run anymore. Architecture becomes an afterthought. Function/method calls end up becoming even slower gRPC calls as the team struggles to box up complexity and extracts more services.

WRT bytecode verification it's better just to read this: https://www.fastly.com/blog/announcing-lucet-fastly-native-w...

There's nothing shitty about the CRuby bytecode VM. It's all a question of resourcing. You're doing a huge disservice to all the people who worked very hard to make CRuby 2.7 5-15x faster than 1.8 and it detracts from a valid discussion about reducing the CO2 footprint of datacenters around the world.


The real alternative is using toolchains that offer both JIT and AOT.


Well, as long as we can agree that shitty bytecode interpreters are not the way to go ;)

I think AOT/JIT as a dichotomy is stupid, for what it's worth. Why shouldn't I have e.g. both profile guided optimization and JIT in the same program?


On that we surely agree; interpreters are only good when one is ramping up a language, or doesn't plan to use it beyond basic scripting or learning activities.

You can have both in the same toolchain, and plenty of languages do have them.

However in some scenarios, like constrained embedded devices, you are left with AOT only.

By the way, modern Android is all three, interpreter, AOT and JIT with PGO.


On the server side, the alternative is an AOT compiled language.


Many languages do offer both toolchains (AOT / JIT), including Java and .NET.

The alternative is to leave JavaScript to the browser, where it belongs.


Could you provide some more detail/links about AOT-compiled Java?


As mentioned in other comments.

Back in the 2000s, you had several commercial players; the most well-known ones were Aicas, PTC, IBM, Aonix, J/Rockit, ...

In what concerns Android, it went a couple of ways: Android 5 introduced AOT compilation, but that did not scale when done on-device with multiple applications installed, so with Android 7 Google introduced another approach.

Basically a mix of interpretation using a hand-written interpreter in assembly, followed by a JIT, and then AOT compilation with PGO information gathered from the JIT: https://source.android.com/devices/tech/dalvik/jit-compiler

There are also a couple of Google IO related talks.

Back to standard Java implementations, not only do you have those commercial vendors, but IBM and Oracle have finally made AOT a free feature, so you get it in OpenJDK and OpenJ9.

Besides AOT compilation, you also have the possibility of using a JIT cache between execution runs instead. This is sometimes a better approach, because you cannot use AOT on code that depends on reflection, unless you tell the compiler about all the classes that it doesn't see but that are also required to land in the executable.

Most well known Java related conferences (JaX, Devoxx, NDC, Java ONE, CodeOne, JVM Language Summit) have a couple of AOT related talks every now and then.



GraalVM native image, but it has some major limitations.


This post is extremely V8-centric. For example it uses terminology like “deopts” which means nothing in JavaScriptCore (we distinguish between exits, invalidations, jettisons, and recompiles). The post also assumes that there is only one JIT (JSC has multiple).

And that’s where you’ve lost me. Not sure how you expose anything about how the JIT is operating without introducing a compat shitshow since JIT means different things in different implementations.

If you really want to know that something gets compiled with the types you want, use a statically typed and ahead of time compiled language.

If you have to use a JIT but you find that it doesn’t do what you like then remember that it’s meant to be stochastic. The VM is just trying to win in the average. Which functions get compiled and with what type information can vary from run to run.

Probably the best thing that could happen is that developer tools tell you more about what the JIT is doing. But even that’s hard.

There are some specifics that I disagree with:

- I don’t think all JIT architects for JS claim that the perf is about competing with C for numerical code. I don’t explain it that way. I would say: JITs are about doing the best job you can do under the circumstances. They can make JS run 4x faster than an interpreter if things really go well. “Between Python and Java” is a good way to put it and that’s exactly what I would expect. So if that’s your experience then great! The JIT worked as expected.

- It’s usually foolish to want your code compiled sooner. Compilation delay is about establishing confidence in profiling. I’m pretty sure we’d JIT much sooner if it wasn’t for the fact that it would make the EV of our speculation go negative.

TL;DR. the JIT can’t unfuck up JavaScript.


The post is likely V8-centric because it's discussing server-side JavaScript. As far as I know, pretty much everyone uses V8 for server-side JS because that's what Node is based on. And compatibility seems to be much less of an issue there as well, at least if all the "dropped support for Node version X" changelogs I read are any indication.

Not to say server-side JavaScript is necessarily a good idea. But if people are determined to make it work, I could see these sorts of changes happening that wouldn't make sense in a web browser.


> Probably the best thing that could happen is that developer tools tell you more about what the JIT is doing. But even that’s hard.

Why? You have the exact same problem when writing SQL code but there you have lots of powerful introspection tools to make it easier to control performance. You can also use indices and hints to nudge the RDBMS into executing queries in the most optimal way. PyPy for example has a lot of support for introspection.


Also, most SQL-based languages are strongly typed, which helps a lot when it comes to generating native code.


Because JS JITs aren’t deterministic. Just starting the profiling or introspection tool could change internal behavior.

The nondeterminism can come from lots of places. In JSC it’s that the JIT polls the heap for some of its profiling and it profiles concurrently. So OS scheduling decisions affect what types the JIT sees.


Performance in general is non-deterministic in a multi-tasking OS because you have multiple processes competing for CPU time. My point is that there is nothing inherent to Javascript VMs that makes them harder to control or analyze. Runtimes for other dynamic languages and database engines show that proper tooling makes it much easier for developers to optimize for performance.


On some level of course that’s true but surely you’re not suggesting that the nondeterminism of JS is no worse than the nondeterminism of C. C at least runs with the same types every time.


C is weakly typed, so that it runs with the same static types means very little.


C has strong types for the purpose of optimization. A C compiler never wonders whether + means int addition or string concatenation for example. In JS the optimizing compiler doesn’t get such type information except by profiling and profiling is nondeterministic.


> Because JS JITs aren’t deterministic. Just starting the profiling or introspection tool could change internal behavior.

Are all JavaScript JITs this way?


All the good ones.

In all seriousness, I think I heard that FF doesn’t profile concurrently. Maybe that’s one of the reasons why they’re so slow.


If you were to get the information you were after it would be specific to a particular implementation, and a particular version of that implementation. The nature of how JIT is done for JavaScript is not defined in the spec and varies a lot (starting from totally nonexistant many years ago).

To have useful performance measurement tools that are going to help your code run across multiple implementations would require a) all of those implementations to work the same, forever, and b) the semantics of this to be specified in the standard.


The thing is, javascript performance is mostly "good enough", and honestly that's what really matters.

When you have workloads that are mostly io bound, having syntactic sugar like js async/await to avoid blocking is really a huge strength.

When we write systems, we operate under constraints, and frequently the largest constraint is time to market rather than pure performance.

Dynamic typing can be a huge strength for TTM.

If performance was the only concern, we'd all be in straight C with inline asm.


What’s TTM?


Time to market I'd guess?


Time To Market


I'm not convinced that dynamic typing helps productivity. In my experience, an expressive type system allows one to deploy code much more quickly. The key word, of course, is expressive. Neither Go nor Java, btw, qualifies as having an expressive type system.
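For a flavor of what "expressive" buys you, a minimal TypeScript sketch (the types and names are made up):

    // A discriminated union: the compiler checks every case is handled,
    // so adding a new variant turns forgotten call sites into compile errors.
    type UiEvent =
      | { kind: "click"; x: number; y: number }
      | { kind: "keypress"; key: string };

    function describe(e: UiEvent): string {
      switch (e.kind) {
        case "click":    return `click at ${e.x},${e.y}`;
        case "keypress": return `key ${e.key}`;
      }
    }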


My guess is that the productivity really comes from the ecosystem. It is easier to create and use modules in JS than in Java/C#. You do not need to mend your codebase to use modules. Once you have all these third-party modules it is easier to move fast with the product.

Structural typing fits really nicely with existing JS. TypeScript has all the expressiveness of JS plus granular control of type safety. Even given this, there are a lot of sharp edges.
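A minimal sketch of that structural fit (the names are made up):

    // Any value with the right shape satisfies the interface; no `implements`
    // declaration or wrapping is needed, so plain JS objects just work.
    interface Named { name: string }

    function greet(x: Named): string {
      return `hello, ${x.name}`;
    }

    const legacyJsObject = { name: "ada", extra: 42 }; // a plain object literal
    greet(legacyJsObject); // OK: the shape matches, extra fields are fine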


Definitely! And tooling. Java is verbose but IntelliJ writes a lot for me.

`main` turns into[1]:

    public static void main(String[] args) {
    }

Because of this and Lombok I don't find Java particularly taxing to write. I often prefer writing Java as my IDE can provide me better information about the code.

[1] https://www.jetbrains.com/help/idea/using-live-templates.htm...


And what is an example of an expressive type system? I am asking just out of curiosity.


What sweeneyrod said. Scala is also OK. It has advantages over both in some aspects, but loses in others. None of Haskell, Scala, or OCaml is _the best_ -- there just isn't a perfect language. Yet! :)

Features I care about:

* Type classes: Haskell, Scala

* Module system: OCaml, Scala

* structural subtyping / row polymorphism: OCaml

* Higher-kinded types: Haskell, Scala

Other important features, but not related to the type system:

* Powerful runtime system (multicore support, green threads, ...): Haskell, Scala (kinda, with the TypeLevel libraries, but still not 1st class)

* pure functional programming focus: Haskell, Scala (kinda, with the TypeLevel libraries, but still not 1st class)

* compiling to JavaScript (so that you can use 1 language for both backend and frontend): Scala, OCaml

... maybe PureScript would tick the most boxes ¯\_(ツ)_/¯


For the frontend, PureScript (or Reason, or OCaml with BuckleScript) work well; Rust inherits most of Haskell's type system as well.


Haskell or OCaml


I would say: Scala, Haskell or OCaml.


As a nodejs developer: what is the best compiled server language to learn right now? Is it Java still, or is it better to look at go or rust?


Maybe I'm missing the intended distinction, but in normal usage, Java is also a JITted language. It's almost the canonical one... JITs are much older, but HotSpot really popularized the concept in mainstream production.


Java’s bytecode is JITted, which is a subtle distinction since there is an AOT translation step capable of optimizations etc.


Technically yes, but the AOT step in practice does almost no optimizations and is just a direct translation, at least if you're using the mainstream javac. It's there just to simplify the VM/JIT, so the latter can read bytecode instead of textual source code.

There have been attempts at AOT compiling Java, but afaict they have very little usage. GCJ, now dead, was the main one for years. There's also a newer experimental compiler jaotc: https://openjdk.java.net/jeps/295.


AOT compiling Java has lots of usage, including on my smartphone since Android 5.

Outside Android, it has always been available in commercial JDKs since the early 2000's.

GCJ was never the main one, it never went beyond a toy AOT compiler.

Anyone that cared about AOT compiling in Java would be paying big bucks to Excelsior JET, Aicas, Aonix, PTC, IBM, J/Rockit, Gemalto and a couple of smaller players.

Naturally in 20 years quite a few things changed, and now PTC, Aicas, IBM, Gemalto are the survivors, while Maxine VM graduated into GraalVM and there is a free-beer layer for everyone not willing to pay for AOT compilers in Java.


Good history! But to reiterate for anyone missing it in there, look at GraalVM's AOT compiler today. It's OSS, well-funded, and mature: runs most large programs like Spring.


You seem to have more familiarity with Java than me :). So I guess V8 JITs some kind of p-code then? If V8 allowed full AOT compilation to the internal bytecode, would the perf theoretically be around Java’s average perf?


Avoiding the source parsing step would only provide a startup time performance win. JS engines tend to be faster than Java here already since it's something they've been heavily optimized for.

It's difficult for JS engines to match Java's performance without type information to work with. Types are only discovered at runtime, and the JIT'ed code for a function requires traps in case it is later called with parameters of a different type.
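A minimal sketch of that trap mechanism (the function and values are made up; the tracing flag is V8's, so this assumes Node):

    // The JIT optimizes `add` for numbers after warm-up; the optimized code
    // carries a type guard. A later call with strings fails the guard and
    // throws the optimized code away. Try `node --trace-deopt` to watch.
    function add(a: any, b: any) {
      return a + b;
    }
    for (let i = 0; i < 1_000_000; i++) add(i, i); // warm up: number + number only
    add("a", "b"); // new type at the same call site -> guard fails, deopt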


So the language itself is the limiting factor, being dynamically typed. I sometimes imagine what we’d get if TypeScript were natively supported. People could still write JS, or add the type information, which would then allow higher performance to be achieved, approaching native or Java/C# levels.


Interestingly, one of the cases where Java JITs can beat native code is the same as JavaScript’s: when code can be speculatively monomorphized.
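Roughly like this (a hypothetical sketch; the class and variable names are made up):

    // `s.area()` is polymorphic in the source, but if profiling only ever
    // sees Circle flowing through, the JIT can speculatively inline
    // Circle.area behind a cheap shape check, near-devirtualizing the call.
    interface Shape { area(): number }

    class Circle implements Shape {
      constructor(private r: number) {}
      area() { return Math.PI * this.r * this.r; }
    }

    const shapes: Shape[] = Array.from({ length: 100_000 }, (_, i) => new Circle(i % 10));
    let total = 0;
    for (const s of shapes) total += s.area(); // monomorphic in practice
    console.log(total);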


Depends on the JVM implementation. E.g. with Graal you can have faster startup in Clojure, Java, or another JVM language than in Node.


Nope, because Java is statically typed.


Java's bytecode can be JITted, AOT compiled to native code, stored in a cache and reused with PGO information, directly executed in hardware, or just plain interpreted.

There are plenty of Java implementations to choose from.

I wish professors would actually spend more than 5 minutes discussing how languages and implementations aren't the same thing.


Not sure why you're downvoted. It's definitely difficult to talk about performance, for example, without keeping languages and implementations separate.


I mean to say the bytecode is the input to the JIT.


Other than startup time, isn't AoT more of a pessimization, since runtime data can't be used to optimize performance?


That's true in JavaScript because its highly dynamic nature isn't amenable to static analysis.

But in general, AoT is usually a win, even though it doesn't have access to runtime data. C and Fortran are still the gold standard for performance, for example. One reason is that JITs are time-constrained, even in long-running processes. Another is that these languages are designed for AoT compilation.

And there are ways for AoT compilation to have some, albeit limited, access to runtime data. One is profile-guided optimisation. Another is inserting probes for different types of CPU.


RAM usage is a big one for AOT Java.


Python is also bytecode-compiled first.


If you are looking for a mature high-performance language to keep in your back pocket for the cases where you can't use Node, Java seems like a fair bet. Go would also be useful. Neither is too much fun IMO, but they would fit the bill.

There are newer languages that are more interesting, but picking one of those is more about either attempting to anticipate where the industry is going, or just plain wanting a more fun language. Rust is an example here. Personally I think Rust is worth learning, and it has a fair shot at taking over a large chunk of software development that would previously have gone to C or C++. Learning Rust might make you a better programmer because lots of poor-programming-hygiene things are compiler errors in Rust.

If you just want a way to wring maximum performance out of a piece of hardware, then I would caution you to consider your choice carefully. Writing high performance code requires a complete understanding of the stack underneath you. If you want to see my point in practice go check out the programming languages shootout. Specifically notice that there are benchmarks implemented in "fast" languages that are outperformed by better optimized code written in "slow" languages.

I would recommend against C++ unless you are planning on devoting a very large amount of time to learning it well. I also don't personally like C++. Too many footguns.


> notice that there are benchmarks implemented in "fast" languages that are outperformed by better optimized code written in "slow" languages.

I haven't observed that, at least not where one can't write a "better optimized" code in the "fast" language too. Can you give any specific examples?


Sorry, I wasn’t clear. On the shootout they have multiple implementations of each benchmark in the same language. The fastest implementations are always in fast languages, but there are lots of implementations in fast languages that are bested by better implementations in slow languages[1]. The point I was trying to make is that picking a fast language isn’t enough to write truly fast programs.

1. https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


But even that doesn't follow from the link you gave: the slowest C++ entry there takes 54 seconds, but I don't see any Python code, for example, that is faster. There is Node.js code that is fast, but we know Node.js depends on a really, really good JIT, which doesn't make it a "slow language" in my view. The Node.js implementation also uses all CPU cores, so we should at least compare among the implementations which also use all CPU cores.

So in this case it's less about "the algorithm" in the narrow sense and more about whether the program uses all CPU cores effectively and whether a JIT compiles the code to native. The other languages with good scores are also all ones that can do a good JIT and use all the CPU cores, which, yes, confirms that good JITs are really good.

That still doesn't mean I haven't known enough incompetent programmers who would, indeed, write even worse-performing C++ than any entry we see in the Benchmarks Game. I could write a whole article about all the things they would or could do wrong.

I only wanted to see the examples in "the Benchmarks game" specifically.


It's really hard to be slower than Python, unless you are using Ruby. Considering that the parent was asking about Node, that is what I was looking at. While V8's performance is hugely impressive, it is still significantly slower than other languages.

I'm not sure what you are trying to say though? My point was that you have to develop expertise before you can create high performance programs. If you are considering dropping node because you need more performance, be careful.


> If you are considering dropping node because you need more performance, be careful.

That's surely not surprising to me, as I measured the performance of V8 even before Node.js existed, and already knew of enough examples where the code executed at a speed comparable to C. I don't consider any modern JavaScript JIT implementation to be "slow": neither V8, nor the ones in Safari or Firefox. Moreover, all of them have more than one tier for achieving speed; they adapt to the needs of the code being executed. They are really works of art, not an accidental development.


Java has a JIT too…


And AOT as well.


It depends on a lot of factors. I’m a TypeScript dev and I’m really enjoying working in Kotlin. I certainly enjoy how expressive it is compared to Go, and I picked it up in a couple of days in an existing codebase, so it’s much easier than Rust. I think you can’t overstate how good the JetBrains IDE makes the programming experience with code completion, auto-imports, and corrections, or how great it is that Kotlin interfaces with Java seamlessly.


Funnily enough, Scala provides the same feature set. However, because it is even more powerful than Kotlin, there are also more ways to achieve the same goal. Doing any of the advanced stuff makes it hard to use that Scala code from Java, while using Java from Scala is easy. And still Kotlin appears to be gaining ground on Scala.


In my limited experience, Kotlin seems to have smoother Java interop than Scala. For example, last time I checked, Java getter/setter methods automatically become Kotlin properties, and vice versa, whereas Scala has its own convention for getters and setters. Is that not the case now?


Part of me wonders if ASP.NET will make a comeback. A lot of the historical reasons not to use it don't exist anymore (there's an officially-supported Linux port, for example), and a lot of the cool new JavaScript language features like async/await and arrow functions came from C#.

Besides, most JavaScript programmers are already dependent on Microsoft products like TypeScript, NPM, and Visual Studio Code. What's one more?


ASP.NET Core has been running fine on Linux and Mac OS X for years now; I have been building backends with it for those same years, mostly on Linux. The ecosystem is nice and stable (I've been running high-traffic APIs in production for years without any issues) and code written years ago simply works after updating libs/runtime to the latest version (even from 2.1 to 3.1 we had barely any work). Unlike the RoR or JS projects I maintain, which break quite a lot on update.


It starts, yes, but the experience is still as miserable as on Windows. They just brought the Windows idiocies with them.

The debugger (clrdbg) is still artificially restricted from running inside anything else than MS' IDEs.

The global certificate store is still a thing.

On that note, the cryptography support varies wildly between platforms, because they still insist on using SChannel on Windows. And I guess that means that they have to use SecureTransport on macOS, to avoid appearing to play favourites or something.

The action at a distance that it encourages would make you think you're in a haunted house.

Despite bringing the Windows mentality with them everywhere, there still isn't a competent migration story from Framework, so what's the point? Legacy code is still a pain to port, and you'd be crazy to actually want to write new code in this mess.


I've used it as well and it is really, really nice, especially on Linux, where it starts much faster, which was important for my work.

It is still not in the same league as Java when it comes to dev tools and the Java ecosystem keeps getting better as well, but at the moment I am torn between the two (or rather, I have two excellent options that I can choose depending on circumstances.)


I agree: I also work with Java and JVM languages, and I was a fulltime Java dev for 10 years before. But 7 years ago the first client (a bank) on our product insisted on .NET; we had some .NET expertise already and went all in.

.NET Core is nice enough to do very large projects in, though, so I cannot see many reasons to move away from it (especially from F#, which I mostly use now).


I imagine nobody knows what "ASP.NET" means nowadays as distinct from C# / .NET Core.


Comeback? It never went away.

From where I am standing it has always been either Java or .NET for the last 20 years, while we keep seeing other stuff coming and going every couple of years.


Is it open source?


Kind of? Vital parts are still closed (like the debugger), and the tooling still assumes a closed-source world (for example, with OmniSharp, jump-to-definition to dependencies just gives you a synthetic source file that only contains method signatures).

It's also not exactly run like a community project. The issue tracker feels more like a customer support system: "can you try updating to the newest release now?" abound, with no mention of what the actual problem or fix was.


I think so! They have a repo called "ASP.NET Core" on GitHub with an Apache license: https://github.com/dotnet/aspnetcore . ("Core" here means that it's the version of .NET that can run on Linux.) The "dotnet" organization also has repositories for the C# compiler and runtime.


Yes! Compiler, sdk, runtime and framework.


Java is still the largest and most-used server-side language, at 39%. It is followed by C# at 32%.

After that, you have Go at 9%, Kotlin (also a JVM Lang) at 6.6% and Scala (also a JVM Lang) at 4%.

Finally, you get Elixir and Clojure both at 1.5%

Source: https://insights.stackoverflow.com/survey/2019

So yes, I think learning Java and the JVM ecosystem is a good investment, considering Java dominates, and if you tally up JVM langs you get something like 51% of professional developers using a JVM based language for work.


This up-to-date one seems to be putting PHP in a large lead with C# then Java following: https://w3techs.com/technologies/overview/programming_langua...

We've drifted from just compiled languages here, but the most important server-side language to know is probably PHP. Personally I've really enjoyed working with Drupal.


Sorry, I assumed the parent meant backends and web services, like REST, GraphQL, or RPC APIs, and not server-side HTML rendering; that's why I excluded certain languages like PHP. I also skipped interpreted languages like Python and Ruby since they mentioned compiled.

Also, that doesn't look like a great metric, since it's just the number of websites using a particular technology. That's going to be massively skewed by prefab solutions like WordPress.

I liked that the Stack Overflow survey asked people what language they used professionally. I feel that's much more relevant.


PHP is used for REST and all that just fine. Why are you excluding it from "backend and web services"?


Well, in all honesty, I stopped using PHP at v5, when it was all running as CGI scripts. So I'm not sure I'm doing it justice, since I'm pretty outdated on its development.

I've never seen PHP being used for those use cases, so I probably made an assumption here that it isn't, or rarely is.

All the languages I listed generally run their own server and can handle the full request, down to its connection. Whereas I remember PHP only managing the payloads from an existing server, running as an embedded script which the server delegated to at the last step for final rendering of the response.

So, for example, it couldn't manage any type of long-running in-memory process, and it didn't have direct access to IO, as that's all managed by the containing server. Things like that, which not all REST APIs would need, but which I'd generally expect to be able to do for more generic backend work. I don't know if there are now ways to do all this.


I always thought php was just a templating language. Any recommended books or resources?


It can be a good investment if you look at your programming career in economic terms. But the Java culture is pretty uncreative and rife with overengineered enterprise stuff. If you can stay with Clojure, Scala, etc, then it's better.

(I'm not convinced it's long term profitable to work on boring platforms either, but of course looking for fun things is more risky)


The Java ecosystem is the size of a mid-sized country. I love Clojure but the small tip of Java (language) developers doing extremely interesting, unusual, creative stuff (like, say, controlling robot swarms [1] or marine robots [2], or sophisticated formal methods tools like for TLA+ [3] or Alloy [4], or computing orbits for NASA [5], or embedded hard-realtime for cars and manufacturing [6], or games like Minecraft) still outnumber all Clojure and Scala developers working on creative unusual stuff at least 10 to 1. It's just that in a small pond it's easier to spot a few islands of creativity than it is to spot the many such islands in an ocean.

[1]: https://www.infoq.com/presentations/java-robot-swarms/

[2]: https://en.wikipedia.org/wiki/Liquid_Robotics

[3]: http://lamport.azurewebsites.net/tla/tla.html

[4]: https://alloytools.org/

[5]: https://www.infoq.com/presentations/java-science-aerospace/

[6]: https://www.aicas.com/


I agree the absolute number of cool things done with Java is bigger, but the average Java project is very bland and as a Java programmer you are very unlikely to land one of these NASA projects.


Yes, but the selection of cool things in Java is bigger, and it's not like it's a random draw. If you want to work on this kind of thing and you're able to, Java still offers you more opportunities.


I don't know; I get to work on very interesting large-scale projects despite politics at the corporate level, work my 40h week, travel comfortably around the world, and play digital nomad when I feel like it.

Doesn't seem that bad to be invested into Java/.NET/C++/Web eco-systems.


Adoption rates matter too. I learned AngularJS when it was widely used, but only a couple of years later the world had moved on to Angular 2+, React, and Vue.


Best is highly subjective and entirely dependent on what you want to build.

I find a lot of joy in working in C# and .NET Core. It's very pleasant to work in and might be worth a look.


Very interesting. I have heard that from some other people as well. Any tips on how to get started in the *nix/FreeBSD/Mac ecosystem? The last time I tried was with Mono. Any tips appreciated.


It's super easy to download and get started on Linux/Mac:

https://dotnet.microsoft.com/download/dotnet-core/3.1

I even managed to get it to run on an Android tablet via Termux by downloading the ARM64 binaries.


In my experience Go is amazing for webserver development. Very easy to learn and you can build things very quickly.

However, I still use C++ for extremely complex networked applications where performance matters. Unfortunately, Rust doesn't have battle-tested crypto libraries, so I wouldn't personally use it for networked applications.


Go's lack of generics makes it better for write-once, run-forever systems like proxies. It's less ideal for application servers, which will otherwise need a lot of code duplication via code generation (to stay type-safe) or empty interfaces (throwing away static typing).

Code generation isn't so bad. Even jOOQ (Java's awesome ORM) will suggest you do it. And similarly, Java's generics aren't that great themselves. But you come to expect more from a newer language with so much history to learn from, and they dropped the ball on generics. (In contrast, things they got right: gofmt, go test, cross-platform builds, go get, etc.)


Just use the C libraries from Rust?


Or use rustls, if you just need modern TLS. I found this back and forth of rustls vs native-tls (which picks up openssl/your favorite system ssl) helpful: https://users.rust-lang.org/t/any-reasons-to-prefer-native-t...


There are a few wrapper libraries that provide a safe Rust interface for a limited feature set on top of the system's crypto libraries like OpenSSL, so applications can benefit from decades of testing and get security updates without requiring timely rebuilds. I'm maintaining one, for example, which has Rust bindings among 12 other platforms.


Kotlin is pretty neat to work with.

In the end it is all bytecode (both Java and Kotlin compile to JVM bytecode), but Kotlin removes a lot of the ceremony, boilerplate, and, altogether, ideas that have proven to not be that great in the years since Java's inception.


This is no longer true. Java has eaten a lot of the features Kotlin once had. The only things not provided by Java are nullable reference types and coroutines (coming).


Meh, Java has indeed eaten a lot of the features that Kotlin has (why do you use past tense??). However, the Java version is not necessarily as convenient, concise, or powerful as Kotlin's.

E.g. I strongly prefer data classes to records. No named parameters -> no default params -> way more boilerplate.


Some people in the Android bubble keep mistaking Android Java for the real deal.


Check out Scala! It has modern features and the tooling around it is mature.

If you like books, this should be a good intro: https://underscore.io/books/essential-scala/ (you should use IntelliJ IDEA instead of Eclipse, though)

And you can also use it for frontend development: https://www.scala-js.org/doc/sjs-for-js/es6-to-scala-part1.h... (having one language for both backend and frontend leads to great synergy)


To learn for what goal? For education, for enterprise, for speed?

To which my shortlist is Haskell, C#, and Rust.


C# is pretty decent. Has been since the get-go.


JS is also frequently a compiled language these days. What makes you qualify the question this way?


I would say Scala, or maybe F#.


I completely agree, it's possible to do some tracing of what the V8 JIT does, but the workflow is awful.

Microbenchmarks don't represent the real world. Instead of running a single microbenchmark, I suggest running a host of them, but all at the same time (not serially). They'll get in each other's way and total performance will be far worse.

This happens with AOT code as well, of course, because caches get trampled no matter what, but JIT code only exacerbates the issue because the generated code is larger.
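Something like this harness shape, as a rough sketch (the benchmark bodies are made up; assumes Node for `process.hrtime`):

    // Run benchmarks serially vs interleaved. Interleaving keeps the JIT,
    // inline caches, and CPU caches bouncing between workloads instead of
    // giving each benchmark a warm, specialized machine to itself.
    function benchNumbers(n: number): number {
      let acc = 0;
      for (let i = 0; i < n; i++) acc += i * i;
      return acc;
    }

    function benchStrings(n: number): number {
      let acc = "";
      for (let i = 0; i < n; i++) acc += "x";
      return acc.length;
    }

    function time(label: string, fn: () => void): void {
      const t0 = process.hrtime.bigint();
      fn();
      const ms = Number(process.hrtime.bigint() - t0) / 1e6;
      console.log(label, ms.toFixed(1), "ms");
    }

    time("serial", () => { benchNumbers(1e7); benchStrings(1e5); });
    time("interleaved", () => {
      for (let i = 0; i < 100; i++) { benchNumbers(1e5); benchStrings(1e3); }
    });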


Lancet [1] seems like it fixes at least some of the unpredictability. (Haven't used it though, only read about it on another site.)

[1]: https://github.com/TiarkRompf/lancet


Not a big deal if you're mainly writing client-side JS. You're never gonna know how fast something will run in a browser anyway.

