
Having worked in environments where GC was an absolute "no go", I'm always amazed that so many people have problems with a GC. Yes, there are types of software where using a GC'd language would probably be a bad thing. If you're talking about huge projects with heavy performance constraints (OS kernels, AAA games and browsers come to mind), I would probably try to avoid it.

But most likely - you simply do not need a language without a GC. If you look at the sheer number of applications written in interpreted languages, anything compiled straight to machine code is a win, even with a GC. The interpreter and runtime overhead is so much bigger that a GC does not really matter in them, unless you're talking about highly tuned precompiled bytecode that is JIT'ed, like Java and .NET, or natively compiled languages like Crystal and Go. So yes, when compiling to native code, the GC can become the "next" bottleneck - but only after you've removed/avoided the biggest one. And that "next" bottleneck is something most applications will never encounter. I initially thought of mentioning database engines in the above list of "huge projects with heavy performance constraints", but then I realized a good number of specialized databases actually use runtimes with a GC. The Hadoop stack, especially Cassandra and Elasticsearch? Java. Prometheus and InfluxDB? Go.

Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this. The popularity of Go, and the "I want to be cool so I hate it" backlash against it, prove this, but the devops space is getting new useful cool toys at a breakneck speed, pretty much exclusively written in Go.

So I really don't get the whole GC hate. If you don't want a GC, there are already many options out there, with Rust being the latest cool kid in town. But in reality there are huge opportunities and fields of application for languages like Crystal and Go. And most likely - you could use such a language, only you don't think you can because you have an "oh no, a GC!" knee-jerk reaction.




> But most likely - you simply do not need a language without a GC.

Absolutely. That doesn't mean I can't want predictable performance or deterministic destruction. I also think it's a shame that we waste so much electricity and rare earth minerals on keeping ourselves from screwing up (i.e. on the overhead of managed runtimes and GCs). Before, I'd have argued that it was just necessary. Having spent a bunch of time with Rust, I don't think so any more, and I'm really excited to see non-GC languages build on Rust's ideas in the future.

> The Hadoop stack, especially Cassandra and Elasticsearch? Java. Prometheus and InfluxDB? Go.

Cassandra has a drop-in-ish C++ replacement (Scylla, IIRC?) which supposedly blows the Java implementation away in performance. A magic JIT (and HotSpot is really magic) doesn't make everything better all of a sudden.

In a somewhat recent panel (https://www.infoq.com/presentations/c-rust-go), the CEO of InfluxDB basically admitted that if Rust had been more stable when they started they would have been able to use it instead of Go and would have had to do far fewer shenanigans to avoid the GC penalty.

> Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this.

Indeed. I'm not in denial of this. I made an offhand remark about my personal preferences and what I'd like to see from future languages. I still write a ton of Python for things where speed really doesn't matter.

> "oh no, a GC!" knee-jerk reaction

I don't think having a refreshing experience without a GC counts as a "knee-jerk reaction." I've thoroughly enjoyed not having to tune that aspect of performance, and I remarked on it. I think Crystal shows great promise, and certainly has the potential to offer easier ergonomics than Rust.


> That doesn't mean I can't want predictable performance or deterministic destruction.

Exactly. To add to your point: with Rust, there is no longer a reason to settle for GC pauses. It does require more thought while writing the code, but what you gain is firmly consistent runtime behavior. If your memory allocation is slow, you can create your own allocator or slab and use that for the hot memory paths, optimizing the allocation cost away.
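
To make that concrete, here is a minimal sketch of the pool/slab idea (a toy example of my own, not anything from a real project): recycle buffers on the hot path instead of hitting the allocator every time.

    // Hypothetical buffer pool: reuse fixed-size buffers on a hot path
    // instead of asking the global allocator every time.
    struct BufferPool {
        free: Vec<Vec<u8>>,
    }

    impl BufferPool {
        fn new() -> Self {
            BufferPool { free: Vec::new() }
        }

        // Hand out a recycled buffer if we have one, otherwise allocate.
        fn get(&mut self) -> Vec<u8> {
            self.free.pop().unwrap_or_else(|| vec![0u8; 4096])
        }

        // Take the buffer back; its capacity is kept for the next caller.
        fn put(&mut self, mut buf: Vec<u8>) {
            buf.clear();
            buf.resize(4096, 0);
            self.free.push(buf);
        }
    }

    fn main() {
        let mut pool = BufferPool::new();
        let buf = pool.get();    // allocates the first time
        pool.put(buf);           // recycled, not freed
        let _again = pool.get(); // no new allocation here
    }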

As a longtime Java geek who never understood the argument against GC, I've found this a mind-altering experience. I was a big C++ person before, but after one too many memory leaks and segfaults, I could never imagine not wanting a GC. Then Rust came along and taught me better.


Two reasons:

1) Rust appears to be significantly less productive than a true GC'd language. I see a lot of people talking about "fighting the borrow checker" with Rust, and I see a lot of articles describing basic patterns that would be simple in any other language, but are complex in Rust.

2) If you want to invoke code that assumes a GC you need to have one.

You can do manual memory allocation in Java by the way and a few high performance libraries do. It's just not common.

It's interesting to note that the Chrome guys have gone in the direction of deploying a GC into C++ whereas the Mozilla guys have gone in the direction of moving manual memory management into the type system. I've got nothing against Rust but I'm a Chrome user, personally.


> fighting the borrow checker

My experience is that this is an initial hurdle to clear. As an example, I've almost exclusively worked in GC'd languages for a while, and after learning Rust for a few months I very very rarely have borrow check errors.

The fact that it occasionally requires a complex pattern to do things right should get better with time (non-lexical lifetimes would help), and there's also discussion around GC integration, so that you could interact with a scripting language's GC when writing a plugin for it, or farm out GC'd objects when you need to have cycles (e.g. in graph algorithms).
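
For the cycles point, here's a small, hedged illustration (toy code, not from any real project) of why plain ownership/refcounting needs help there and an arena or GC becomes attractive:

    use std::cell::RefCell;
    use std::rc::Rc;

    // Two nodes that point at each other through `Rc` never reach a
    // refcount of zero, so this little graph is leaked when `main` ends.
    // `Weak`, an arena, or a GC'd heap are the usual ways out.
    struct Node {
        next: RefCell<Option<Rc<Node>>>,
    }

    fn main() {
        let a = Rc::new(Node { next: RefCell::new(None) });
        let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
        *a.next.borrow_mut() = Some(b.clone()); // cycle: a -> b -> a

        println!("strong refs to a: {}", Rc::strong_count(&a)); // prints 2
    } // neither node's destructor ever runs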

> the Chrome guys have gone in the direction of deploying a GC into C++

Interesting. I'm curious how much of the browser relies on it. I'm also curious whether it's an attempt to paper over C++ with a little memory safety, or whether it actually offers performance improvements. My original point was not that GC is bad, per se, but that I quite like being able to avoid it when it's reliable to do so, which is not the case in C++, IMO.


Mozilla has also put significant effort into improving their C++ GC in Firefox, e.g. switching from a non-generational one to a generational GC[0] and then to a compacting one[1]. Just like Google can both improve Chrome and work on Go, Mozilla both improves Firefox and works on Rust.

[0]: https://hacks.mozilla.org/2014/09/generational-garbage-colle...

[1]: https://hacks.mozilla.org/2015/07/compacting-garbage-collect...


Those are all about the JavaScript GC, right? Not using the GC for actual pure C++ objects.


I wasn't giving examples of Mozilla doing exactly what Chrome is doing, just counterexamples to your implication that working on Rust means no work on improving memory management in Firefox.

That said, it's not even like a single-GC approach is incompatible with Rust: https://blog.mozilla.org/research/2014/08/26/javascript-serv...


> I see a lot of people talking about "fighting the borrow checker" with Rust

It's also usually described as "at first, I fought the borrow checker, but then I internalized its rules and it's now second nature." You're not wrong that it's a hump to get over, but once you do, it's not a big deal.

> that would be simple in any other language

Any other _GC'd_ language. You still fight the same kinds of complexity when you don't have a GC.
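
As a toy example (entirely my own illustration) of the kind of error people hit, and the sort of fix that becomes second nature:

    fn main() {
        let mut names = vec![String::from("a"), String::from("b")];

        // This version does not compile: `first` keeps an immutable
        // borrow of `names` alive, so the mutable borrow needed by
        // `push` is rejected by the borrow checker.
        //
        //   let first = &names[0];
        //   names.push(String::from("c"));
        //   println!("{}", first);

        // One common fix: take what you need out of the borrow (clone,
        // index later, or restructure) before mutating.
        let first = names[0].clone();
        names.push(String::from("c"));
        println!("{} / {:?}", first, names);
    }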


The C++ advantage will not be so big after Java 10 comes out and finally gets the value types and reified generics the language should have had since the beginning.

Also, I have yet to see any large-scale production deployment of those Hadoop alternatives.

But it might still be like 5 years from now, so who knows how it will evolve.


Re: deployments, I expect it would take some time for a transition to occur. The first post on the ScyllaDB blog is dated February 2015 (http://www.scylladb.com/2015/02/20/seastar/), and it looks like it wasn't until September 2015 that they specifically started publishing benchmarks of the database itself, as opposed to the network I/O library they built for it (http://www.scylladb.com/2015/09/22/watching_scylla_serve_1m/).

I look forward to those changes coming to Java, and I think that stack-based value types could do a lot for the language. That said, the Scylla folks seem to have gotten a lot of their performance gains from CPU/thread affinity and async I/O (http://www.scylladb.com/2016/03/18/generalist-engineer-cassa...). NIO is pretty great in Java-land, IIRC, but CPU/thread affinity is, I imagine, hard to pull off with a garbage collector.

Another thing I'm curious about w.r.t. value types in Java -- hasn't C# had those for a while? If so, and if your claim that value types will provide large performance benefits is true, why isn't C# always blowing Java away in benchmarks? Perhaps it is and I'm just not seeing them. Perhaps Java's escape analysis is already pretty good and solves the 60/70/80% case? Perhaps I'm not well versed enough in the subject to understand the interactions here.


Regarding C#, Microsoft hasn't invested much in their JIT/AOT compilers' optimization algorithms.

NGEN was just good enough for allowing quick application startup.

Also they didn't invest too much in optimizations in the old JIT.

Especially since .NET always had good interop with native code via C++/CLI, P/Invoke and RCW.

There were some improvements like multicore JIT in .NET 4.0 and PGO support in .NET 4.5, but not much in terms of optimization algorithms.

Hence .NET 4.6 got a revamped JIT called RyuJIT, with SIMD support and lots of nice optimizations.

But this is only for the desktop.

.NET for the Windows Store is AOT-compiled with the same backend that Visual C++ uses. In Windows 8 and 8.1 they came up with MDIL from Singularity/Midori, but with 10 they improved the workflow into what is nowadays known as .NET Native.

With the ongoing refactorings, they plan to make C2 (the Visual C++ backend) a kind of LLVM for their languages, similar to the Phoenix MSR project they did a few years ago.

If you watch the Build 2015 and 2016 talks, most of them are making use of C# with the new WinRT (COM based) APIs, leaving C++ just for the DX related ones.

So they are quite serious about taking their learnings from Project Midori and improving the overall .NET performance.


From what I've seen, stack based value types are not necessarily the big performance win they're touted to be. The rule of thumb I've noticed is that, if the struct is much bigger than the size of a pointer, you start seeing a pattern where it's quicker to allocate in the first place but slower to pass around.

I think this is because, on a platform like Java or .NET that uses generational garbage collection, the heap starts to behave like a stack in a lot of ways. Allocations are fast, since you just put objects at the top of the heap. And then, since they're at the top of the heap, they tend to stay in the cache where access is fast, so pointer chasing doesn't end up being such a big deal. On the other hand, if you use a struct, every time you pass or return it you end up creating a shallow copy of the data structure instead of just passing a single pointer.

(Disclaimer: preceding comment is very speculative.)
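
For what it's worth, here's a toy Rust sketch of just the copy-vs-pointer half of that trade-off; it says nothing about how the JVM or CLR actually lay things out:

    // A "value type" big enough that passing it by value is semantically
    // a 4 KiB copy, while passing a reference copies only a pointer.
    // (The optimizer may elide copies; this is about the semantics.)
    #[derive(Clone, Copy)]
    struct Big {
        data: [u8; 4096],
    }

    fn by_value(b: Big) -> u8 {
        b.data[0] // the whole struct was passed by value to get here
    }

    fn by_ref(b: &Big) -> u8 {
        b.data[0] // only a pointer-sized value was passed
    }

    fn main() {
        let b = Big { data: [7; 4096] };
        println!("{} {}", by_value(b), by_ref(&b));
    }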


Kind of true, but you can minimize copies if the language supports ref types, which were already common in languages like Modula-3, D or even Eiffel.

Also a reason why C# 7 is getting them as return types in addition to ref/out parameters.


> value types and reified generics

I think you mean specialized generics (i.e. no autoboxing of primitives when used in generics)? Reified generics implies carrying around all generic type information at runtime, which will not be the case and also has nothing to do with performance. Non-value generics will still be erased I thought.


Have you seen the status update?

https://www.youtube.com/watch?v=Tc9vs_HFHVo&list=PLX8CzqL3Ar...

They will change the constant pool to have some kind of template information that gets specialized (what they call type species) into a specific set of types.

The plan is that even if Java cannot fully take advantage of all the possibilities, due to backwards compatibility with existing libraries in binary format, the JVM will support it for other languages not tied to Java semantics and backwards compatibility.


Java 10? I'm still waiting for Jigsaw (originally slated for 1.8, is it coming in 1.9???)



And Jigsaw has landed, build 116 if I recall (too lazy to look it up)


> That doesn't mean I can't want predictable performance or deterministic destruction.

You're assuming your compiler or operating system won't cause memory to be freed at different times.

> I also think it's a shame that we waste so much electricity and rare earth minerals on keeping ourselves from screwing up.

Wasting man hours on manufactured problems is far worse than wasting coal.


Memory isn't the only resource managed by deterministic destruction.

Manufactured problems? When what's now coastline is underwater, I'll be glad to see if you remain as smug.


GCs are complex and require lots of end-user tuning -- just look at the performance articles on Java.

Beyond that, however, there are many uses for ownership beyond controlling memory resources. Closing a TCP connection, releasing an OpenGL texture... there are lots of applications for having life cycles built into the code rather than the runtime.
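
A minimal sketch of what that looks like with Rust's Drop (toy code; if you actually run it, it assumes something is listening on 127.0.0.1:8080):

    use std::net::{Shutdown, TcpStream};

    // The socket is closed deterministically when `Session` goes out of
    // scope -- no finalizer, no GC pass, just `Drop`.
    struct Session {
        stream: TcpStream,
    }

    impl Drop for Session {
        fn drop(&mut self) {
            let _ = self.stream.shutdown(Shutdown::Both);
            println!("connection closed here, at a known point");
        }
    }

    fn main() -> std::io::Result<()> {
        {
            let _session = Session {
                stream: TcpStream::connect("127.0.0.1:8080")?,
            };
            // ... use the connection ...
        } // <- `drop` runs exactly here
        Ok(())
    }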

EDIT: fixed typo


> just look at the performance articles on Java.

Just look at the hand-tuning performance articles on C... People talk about it because you can do it, not because you have to do it.


And you can do it because someone found that it was necessary to do.

It's crazy I can't tell Java and NodeJS "use the memory you need". Instead I have to specify max memory sizes (and then watch as they inevitably consume all of it).


Of course they will consume all of it. That's how GCs typically work. They won't invoke a collection until there's no space left to allocate. Just because they "use all of it" doesn't mean all of that memory is actually live. It just hasn't done a collection yet.


> Of course they will consume all of it. That's how GCs typically work.

Perfect agreement.


The disagreement probably lies in your characterization of the GC behavior as "crazy" when it merely doesn't suit your tastes.


> It's crazy I can't tell Java and NodeJS "use the memory you need".

Define "the memory you need"? You know the computer doesn't have a crystal ball to know what latency vs. memory usage trade-off you want...


"Until the OS refuses to give you more"

Just like every compiled, non-GC program gets. It's annoying af to fiddle with interpreter/VM "maximum heap sizes".


Alas, "until the OS refuses to give you more" is also not a meaningful signal anymore.

https://en.wikipedia.org/wiki/Memory_overcommitment


Tracing garbage collectors have to have some metric for when to trigger a stop-the-world collection. IIRC, many GCs track heap usage and trigger stop-the-world when it gets above a threshold. GC'd language runtimes which don't have a tracing/compacting collector (e.g. CPython with refcounting) don't need to configure a heap size because the behavior doesn't vary as you use up your heap.


That's still no good reason to throw all kinds of "Out Of Memory" errors when you're hitting 1GB out of the 16GB of physical memory on my system.


That's what most GCs set the default max heap size to, assuming you don't have a swap file.


I wonder, could we just tell each JVM instance that it may use all of the memory on the system, and then let the OS kill the first VM that allocates more than the system has to offer? Would this get us the same semantics as those of a native application? Or does the JVM preallocate all of the memory that it is allowed to use?


The JVM allocates a portion at start, takes more whenever it needs it up to the max, but never releases memory back to the OS. So at a given moment a 6GB VM might be a 5GB process that is internally using just 3GB.


The standard JVMs do give memory back. The standard settings are not very conducive to it, but it does work.


As of one of the most recent Java releases, I believe the G1 GC does this on Windows. G1 is not the default but can be selected with a single command line flag.


A native application on what OS?


Most of the articles I see on C performance tuning are generalizable to any programming language, GC or no. Stuff like maintaining cache coherency, avoiding false sharing for threaded code, etc.

I don't write Java, but my impression is that the articles being talked about are much more Java-specific than the C ones are C-specific.


Cache locality, not coherency.


Sometimes GCs require a lot of tuning, but I'd bet that 90% of software written in GC'd languages works just fine, without touching the GC settings.

In my limited experience writing performance-critical Python code, the improvements always came from choosing better libraries (e.g. for faster serialization) or improving our own code. The GC never showed up in profiling as an issue for us.


Python is so slow its pseudo-GC is never the bottleneck anyway.


This is all true for some languages, but I think it's far from universal. It seems to me that in many of the most frequently used and taught modern languages, the nearest most devs will come to being concerned with GC is an awareness of why object allocation should be minimized. For their purposes, that is a much better use of time than giving any consideration to GC tuning.


Go's GC has a single setting that can be tuned: https://golang.org/pkg/runtime/debug/#SetGCPercent

And yet, it performs great, including predictable STW latencies, all with a relatively simple and straightforward algorithm. With that in mind, the Java GC's manifold ways of tuning every aspect of the collector for minimal performance improvements sound more like something purposely built so people can make a livelihood out of providing consulting services, rather than something built for the best possible performance for everyone.


"GCs are complex and require lots of end-user tuning --"

And they tend to be very memory-hungry. Often, the memory overhead is the difference between running a program and having a bunch of browser tabs open.


The GCs used in Go or Crystal tend not to eat more than about 10% more memory than the peak memory of a C implementation. The perception that GC == hundreds of MBs of memory usage comes from Java, where the GC aggressively preallocates and has the memory baggage of a whole VM too. I rarely see Crystal or Go programs use excessive amounts of memory.


I can't talk about Crystal and Go because I lack experience with them. But the described issue is not only in Java (e.g. I have seen it in Node too).

Since memory deallocation is not deterministic, there has to be a trade-off between lazy scheduling (which increases memory consumption) and frequent scheduling (which has a performance overhead).

You can fine-tune those variables, but that means that a highly performant system with a low memory footprint is a very challenging thing to make using tracing garbage collection (the kind used in Java and Node).


Java needs a sophisticated GC particularly because its OO model requires that it allocate an extraordinary number of small objects, especially as things are boxed and unboxed. Ruby, one of the few true "everything is an object" languages, also suffers from an explosion of tiny objects. Node.js/V8 seems to suffer from this to a lesser extent, although it also ends up being a very memory-hungry language.

Go has been able to perform well with a simple GC because it doesn't suffer from this problem.


"GCs are complex and require lots of end-user tuning -- just look at the performance articles on Java."

These articles are bullshit. Most settings are either obsolete or just force the default. The rest is useless.

I spent months doing performance tuning of application stacks that were using Java (for the app, the database, or both). Most of the settings are useless and barely change performance by +-1%.

The JVM has had good defaults for a while. The only thing one MUST configure is the -Xms and -Xmx options, to set the initial and maximum amount of memory allocated to the Java process (both settings to the same value).


> If you're talking about huge projects with heavy performance constraints (os kernels, AAA games and browsers come to mind)

Actually two of your three examples are no longer correct: game engines often use a core GC'd heap, because that's how Unreal Engine has worked since v3, and Chrome has switched to using garbage collection in the core Blink renderer as well. The GC project is called Oilpan.

The benefits of GC are so huge that it's used even for very latency- and resource-sensitive apps like browsers and AAA games.


In the era of cloud computing, memory usage and performance are as important as they have ever been. If you can rent a smaller instance to do the same job, that is real money saved.


> Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this. The popularity of Go, and the "I want to be cool so I hate it" backlash against it, prove this, but the devops space is getting new useful cool toys at a breakneck speed, pretty much exclusively written in Go.

Could the answer be lbstanza when it gets there? Lbstanza.org


> the devops space is getting new useful cool toys at a breakneck speed, pretty much exclusively written in Go.

This is why I made my peace with Go's way of life.

Way better to push for less code being written in C than to argue about the language's design decisions.



