From this post, Crystal appears to have some of the things many people have been lusting after in Rust: sophisticated metaprogramming, fewer sigils, a bigger standard library, fibers/coroutines/whatever-they're-called-now.
But it still has a GC :(. Rust has completely spoiled me by making it easy to minimize dynamic memory allocation and copies, and to know (almost always) deterministically when something will go away.
EDIT: I should also say that if you want to bash on Rust's lack of these things, 3 out of the 4 items I cited have solutions being actively worked on (either at planning, RFC, or implementation phase). I don't think Rust's sigils are going away any time soon, but I have no idea how you'd do that and preserve semantics anyway.
Having worked in environments where GC was an absolute "no go", I'm always amazed that so many people have problems with a GC. Yes there are types of software where using a GC'd language would probably be a bad thing. If you're talking about huge projects with heavy performance constraints (os kernels, AAA games and browsers come to mind), I would probably try to avoid it.
But most likely - you simply do not need a language without a GC. If you look at the sheer number of applications written in interpreted languages, anything compiled straight to machine code is a win, even with a GC. The interpreter and runtime overhead is so much bigger that a GC does not really matter in them, unless you're talking about highly tuned precompiled bytecode that is JIT'ed like Java and .NET, or natively compiled languages like Crystal and Go. So yes, when compiling to native code, the GC can become the "next" bottleneck - but only after you just removed/avoided the biggest one. And that 'next' bottleneck is something most applications will never encounter. I initially thought of mentioning database engines in the above list of "huge projects with heavy performance constraints", but then I realized a good number of specialized databases actually use runtimes with a GC. The Hadoop stack, especially Cassandra and Elasticsearch? Java. Prometheus and InfluxDB? Go.
Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this. The popularity of Go (and the "I want to be cool so I hate it" backlash against it) proves this, and the devops space is getting new useful cool toys at a breakneck speed, pretty much exclusively written in Go.
So I really don't get the whole GC hate. If you don't want GC, there are already many options out there, with Rust being the latest cool kid in town. But in reality there are huge opportunities and fields of application for languages like Crystal and Go. And most likely - you could use such a language, only you don't think so because you have an "oh no, a GC!" knee-jerk reaction.
> But most likely - you simply do not need a language without a GC.
Absolutely. That doesn't mean I can't want predictable performance or deterministic destruction. I also think it's a shame that we waste so much electricity and rare earth minerals on keeping ourselves from screwing up (i.e. on the overhead of managed runtimes and GCs). Before, I'd have argued that it was just necessary. Having spent a bunch of time with Rust, I don't think so any more, and I'm really excited to see non-GC languages build on Rust's ideas in the future.
> The Hadoop stack, especially Cassandra and Elasticsearch? Java. Prometheus and InfluxDB? Go.
Cassandra has a drop-in-ish C++ replacement (Scylla, IIRC?) which supposedly blows the Java implementation away in performance. A magic JIT (and HotSpot is really magic) doesn't make everything better all of a sudden.
In a somewhat recent panel (https://www.infoq.com/presentations/c-rust-go), the CEO of InfluxDB basically admitted that if Rust had been more stable when they started they would have been able to use it instead of Go and would have had to do far fewer shenanigans to avoid the GC penalty.
> Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this.
Indeed. I'm not in denial of this. I made an offhand remark about my personal preferences and what I'd like to see from future languages. I still write a ton of Python for things where speed really doesn't matter.
> "oh no, a GC!" knee-jerk reaction
I don't think having a refreshing experience without a GC counts as a "knee-jerk reaction." I've thoroughly enjoyed not having to tune that aspect of performance, and I remarked on it. I think Crystal shows great promise, and certainly has the potential to offer easier ergonomics than Rust.
> That doesn't mean I can't want predictable performance or deterministic destruction.
Exactly. Just to add to your point: with Rust there is no longer a reason to settle for GC pauses. It does require more thought while writing the code, but what you gain is a firmly consistent runtime. If your memory allocation is slow, you can create your own allocator or slab, use it for the hot memory, and optimize the allocation away.
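To make the "deterministic destruction" part concrete, here's a tiny Rust sketch (the Noisy type is purely illustrative): Drop runs at a point you can read straight off the code, not whenever a collector gets around to it.

    struct Noisy(&'static str);

    impl Drop for Noisy {
        fn drop(&mut self) {
            // Runs at scope exit, at a known point, with no GC pause involved.
            println!("dropping {}", self.0);
        }
    }

    fn main() {
        let _outer = Noisy("outer");
        {
            let _inner = Noisy("inner");
            println!("end of inner scope");
        } // "dropping inner" prints here, deterministically
        println!("end of main");
    } // "dropping outer" prints here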
As a longtime Java geek who never understood the argument against GC, this has been a mind altering experience. I was a big C++ person before, but after one too many memory leaks and segfaults, I could never imagine not wanting a GC. Then Rust came along and taught me better.
1) Rust appears to be significantly less productive than a true GC'd language. I see a lot of people talking about "fighting the borrow checker" with Rust and I see a lot of articles describing basic patterns that would be simple in any other language, but are complex in Rust.
2) If you want to invoke code that assumes a GC you need to have one.
You can do manual memory allocation in Java by the way and a few high performance libraries do. It's just not common.
It's interesting to note that the Chrome guys have gone in the direction of deploying a GC into C++ whereas the Mozilla guys have gone in the direction of moving manual memory management into the type system. I've got nothing against Rust but I'm a Chrome user, personally.
My experience is that this is an initial hurdle to clear. As an example, I've almost exclusively worked in GC'd languages for a while, and after learning Rust for a few months I very very rarely have borrow check errors.
The fact that it occasionally requires a complex pattern to do right should get better with time (non-lexical lifetimes would help), and there's also discussion around GC integration so that you could interact with a scripting language GC when writing a plugin for it, or you could farm out GC'd objects when you need to have cycles (i.e. in graph algorithms).
> the Chrome guys have gone in the direction of deploying a GC into C++
Interesting. I'm curious how much of the browser relies on it. I'm also curious whether it's an attempt to paper over C++ with a little memory safety, or whether it actually offers performance improvements. My original point was not that GC is bad, per se, but that I quite like being able to avoid it when it's reliable to do so, which is not the case in C++, IMO.
Mozilla has also put significant effort into improving their C++ GC in Firefox, e.g. switching from a non-generational one to a generational GC[0] and then to a compacting one[1]. Just like Google can both improve Chrome and work on Go, Mozilla both improves Firefox and works on Rust.
I wasn't giving examples of Mozilla doing exactly what Chrome is doing, just counterexamples to your implication that working on Rust means no work on improving memory management in Firefox.
> I see a lot of people talking about "fighting the borrow checker" with Rust
It's also usually described as "at first, I fought the borrow checker, but then I internalized its rules and it's now second nature." You're not wrong that it's a hump to get over, but once you do, it's not a big deal.
> that would be simple in any other language,
Any other _GC'd_ language. You still fight the same kinds of complexity when you don't have GC.
The C++ advantage will not be so large after Java 10 comes out and finally has the value types and reified generics the language should have had since the beginning.
Also, I have yet to see any large-scale production deployment of those Hadoop alternatives.
But it might still be like 5 years from now, so who knows how it will evolve.
Re: deployments, I expect that would take some time for a transition to occur. The first post date on the ScyllaDB blog is from February 2015 (http://www.scylladb.com/2015/02/20/seastar/), and it looks like it wasn't until September 2015 that they specifically started publishing benchmarks of the database itself as opposed to the network I/O library they built for it (http://www.scylladb.com/2015/09/22/watching_scylla_serve_1m/).
I look forward to those changes coming to Java, and I think that stack-based value types could do a lot for the language. That said, the Scylla folks seem to have gotten a lot of their performance gains from CPU/thread affinity and async I/O (http://www.scylladb.com/2016/03/18/generalist-engineer-cassa...). NIO is pretty great in Java-land, IIRC, but CPU/thread affinity is, I imagine, hard to pull off with a garbage collector.
Another thing I'm curious about w.r.t. value types in Java -- hasn't C# had those for a while? If so, and if your claim that value types will provide large performance benefits is true, why isn't C# always blowing Java away in benchmarks? Perhaps it is and I'm just not seeing them. Perhaps Java's escape analysis is already pretty good and solves the 60/70/80% case? Perhaps I'm not well versed enough in the subject to understand the interactions here.
Regarding C#, Microsoft hasn't invested much in their JIT/AOT compilers' optimization algorithms.
NGEN was just good enough for allowing quick application startup.
Also they didn't invest much in optimizations in the old JIT.
Especially since .NET always had good interop with native code via C++/CLI, P/Invoke and RCW.
There were some improvements like multicore JIT in .NET 4.0 and PGO support in .NET 4.5, but not much in terms of optimization algorithms.
Hence .NET 4.6 got a revamped JIT called RyuJIT, with SIMD support and lots of nice optimizations.
But this is only for the desktop.
.NET for the Windows Store is AOT compiled with the same backend that Visual C++ uses. In Windows 8 and 8.1 they came up with MDIL (from Singularity/Midori), but with Windows 10 they improved the workflow into what is nowadays known as .NET Native.
With the ongoing refactorings they plan to make C2 (the Visual C++ backend) a kind of LLVM for their languages, similar to the Phoenix MSR project they did a few years ago.
If you watch the Build 2015 and 2016 talks, most of them are making use of C# with the new WinRT (COM based) APIs, leaving C++ just for the DX related ones.
So they are quite serious about taking their learnings from Project Midori and improving overall .NET performance.
From what I've seen, stack based value types are not necessarily the big performance win they're touted to be. The rule of thumb I've noticed is that, if the struct is much bigger than the size of a pointer, you start seeing a pattern where it's quicker to allocate in the first place but slower to pass around.
I think this is because, on a platform like Java or .NET that uses generational garbage collection, the heap starts to behave like a stack in a lot of ways. Allocations are fast, since you just put objects at the top of the heap. And then, since they're at the top of the heap, they tend to stay in the cache where access is fast, so pointer chasing doesn't end up being such a big deal. On the other hand, if you use a struct, every time you pass or return it you end up creating a shallow copy of the data structure instead of just passing a single pointer.
(Disclaimer: preceding comment is very speculative.)
I think you mean specialized generics (i.e. no autoboxing of primitives when used in generics)? Reified generics implies carrying around all generic type information at runtime, which will not be the case and also has nothing to do with performance. Non-value generics will still be erased I thought.
They will change the constant pool to have some kind of template information that gets specialized (what they call type species) into a specific set of types.
The plan is that even if Java cannot fully take advantage of all the possibilities, due to backwards compatibility with existing libraries in binary form, the JVM will support it for other languages not tied to Java semantics and backwards compatibility.
GCs are complex and require lots of end-user tuning -- just look at the performance articles on Java.
Beyond that, however, there are many uses for ownership beyond controlling memory resources. Closing a TCP connection, releasing an OpenGL texture... there are lots of applications of having life cycles built into the code rather than the runtime.
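For instance, a minimal Rust sketch of the TCP case (the address is a placeholder; nothing needs to be listening for the point to hold): the socket is closed when the handle goes out of scope, on every path, without any collector involved.

    use std::io::Write;
    use std::net::TcpStream;

    fn send_ping() -> std::io::Result<()> {
        // Placeholder address used only for illustration.
        let mut conn = TcpStream::connect("127.0.0.1:4000")?;
        conn.write_all(b"ping\n")?;
        Ok(())
    } // `conn` is dropped here and the socket is closed deterministically,
      // whether we returned early via `?` or fell off the end.

    fn main() {
        match send_ping() {
            Ok(()) => println!("sent"),
            Err(e) => eprintln!("connection failed: {e}"),
        }
    }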
And you can do it because someone found it was necessary to do so.
It's crazy I can't tell Java and NodeJS "use the memory you need". Instead I have to specify max memory sizes (and then watch as they inevitably consume all of it).
Of course they will consume all of it. That's how GCs typically work. They won't invoke a collection until there's no space left to allocate. Just because they "use all of it" doesn't mean all of that memory is actually live. It just hasn't done a collection yet.
Tracing garbage collectors have to have some metric for when to trigger a stop-the-world collection. IIRC, many GCs track heap usage and trigger stop-the-world when it gets above a threshold. GC'd language runtimes which don't have a tracing/compacting collector (e.g. CPython with refcounting) don't need a configured heap size because the behavior doesn't vary as you use up your heap.
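As a caricature of that heuristic (numbers and names made up, in Rust only for illustration): allocation just bumps a counter until a soft limit is crossed, and only then does a collection run, which is why "uses all of it" says little about how much data is actually live.

    struct Gc {
        heap_used: usize,  // what the process appears to "use"
        heap_limit: usize, // the -Xmx-style ceiling
    }

    impl Gc {
        fn allocate(&mut self, size: usize) {
            if self.heap_used + size > self.heap_limit {
                self.collect(); // nothing is reclaimed before this point
            }
            self.heap_used += size;
        }

        fn collect(&mut self) {
            // A real collector would trace live objects here; pretend half was garbage.
            self.heap_used /= 2;
        }
    }

    fn main() {
        let mut gc = Gc { heap_used: 0, heap_limit: 1024 };
        for _ in 0..100 {
            gc.allocate(64); // periodically crosses the limit and triggers collect()
        }
        println!("apparent heap use after the loop: {}", gc.heap_used);
    }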
I wonder, could we just tell each JVM instance that it may use all of the memory on the system, and then let the OS kill the first VM that allocates more than the system has to offer? Would this get us the same semantics as those of a native application? Or does the JVM preallocate all of the memory that it is allowed to use?
The JVM allocates a portion at startup, takes what it needs whenever it needs it up to the max, but never releases memory back to the OS. So at a given moment a 6GB VM may be a 5GB process that is internally using just 3GB.
As of one of the most recent Java releases, I believe the G1 GC does this on Windows. G1 is not the default but can be selected with a single command line flag.
Most of the articles I see on C performance tuning are generalizable to any programming language, GC or no. Stuff like maintaining cache coherency, avoiding false sharing for threaded code, etc.
I don't write Java, but my impression is that the articles being talked about are much more Java-specific than the C ones are C-specific.
Sometimes GCs require a lot of tuning, but I'd bet that 90% of software written in GC'd languages works just fine, without touching the GC settings.
In my limited experience writing performance-critical Python code the improvements always came from choosing better libraries (eg for faster serializations) or improving our own code. The GC never showed up in profiling as an issue for us.
This is all true for some languages but I think it's far from universal. It seems to me that in many of the most frequently used and taught modern languages, the nearest most devs will come to being concerned with GC is an awareness of why object allocation should be minimized. For their purposes, that is a much better use of time than giving any consideration to GC tuning.
And yet, it performs great, including predictable STW latencies, all with a relatively simple and straightforward algorithm. With that in mind, the manifold ways of tuning the Java GC in all its aspects for minimal performance improvements sound more like something purposely built so that people can make a livelihood out of providing consulting services, rather than something built for the best possible performance for everyone.
"GCs are complex and require lots of end-user tuning --"
And they tend to be very memory hungry. Often, the memory overhead is the difference between being able to run a program and keeping a bunch of browser tabs open.
The GCs used in Go or Crystal tend not to use more than about 10% more memory than the peak memory of a C implementation. The perception that GC == hundreds of MBs of memory usage comes from Java, where the GC aggressively preallocates and carries the memory baggage of a whole VM too. I rarely see Crystal or Go programs use excessive amounts of memory.
I can't talk about Crystal and Go because of my lack of experience with them. But the described issue is not exclusive to Java (e.g. I have seen it in Node too).
Since memory deallocation is not deterministic, there has to be a tradeoff between lazy scheduling (which increases memory consumption) and frequent scheduling (which has a performance overhead).
You can fine-tune between those variables, but that means a highly performant system with a low memory footprint is a very challenging thing to build using tracing garbage collection (the kind used in Java and Node).
Java needs a sophisticated GC particularly because its OO model requires that it allocate an extraordinary number of small objects, especially as things are boxed and unboxed. Ruby, one of the few true "everything is an object" languages, also suffers from an explosion of tiny objects. Node.js/V8 seems to suffer from this to a lesser extent, although it also ends up being a very memory-hungry language.
Go has been able to perform well with a simple GC because it doesn't suffer from this problem.
"GCs are complex and require lots of end-user tuning -- just look at the performance articles on Java."
These articles are bullshit. Most settings are either obsolete or just force the default. The rest are useless.
I spent months doing performance tuning of application stacks that were using Java (for the app, the database, or both). Most of the settings are useless and barely change performance by ±1%.
The JVM has had good defaults for a while. The only things one MUST configure are the -Xms and -Xmx options to set the initial and maximum amount of memory allocated to the Java process (both settings to the same value).
> If you're talking about huge projects with heavy performance constraints (os kernels, AAA games and browsers come to mind)
Actually two of your three examples are no longer correct: game engines often use a core GC'd heap, because that's how Unreal Engine has worked since v3, and Chrome has switched to using garbage collection in the core Blink renderer as well. The GC project is called Oilpan.
The benefits of GC are so huge, that they're used even for very latency and resource sensitive apps like browsers and AAA games.
In the era of cloud computing, memory usage and performance are as important as they have ever been. If you can rent a smaller instance to do the same job, that is real money saved.
> Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this. The popularity of Go (and the "I want to be cool so I hate it" backlash against it) proves this, and the devops space is getting new useful cool toys at a breakneck speed, pretty much exclusively written in Go.
Could the answer be lbstanza when it gets there? lbstanza.org
Manual or deterministic memory management might be a must-have for certain usage domains, but for any domain in which one would be using Ruby, this seems unlikely, and presumably one could FFI into C when it is the case. There are hardly any languages commonly used in industry which don't have a GC (essentially just C/C++). And many of these garbage-collected languages are capable of blazingly fast code with a small memory footprint.
Regardless, for a language which is meant to operate in the same domain as Ruby and be as easy and declarative, not having a GC would be a puzzling decision.
As a side note, I'm curious what areas you are programming in where the presence of a GC is such a downside. Having written almost exclusively in garbage-collected languages over the last few years, it's something I almost never think about (and happy not to). Of course I don't deny that stricter memory control is sometimes necessary.
Crystal seems to be targeted at a domain where Ruby is not fast enough. That includes domains where GC is a problem.
A tracing GC means that you either have to deal with potentially long GC pauses or you need a lot of extra free memory at all times to give the GC time to catch up before running out of memory [1].
Go says it can achieve 10ms max pause time using 20% of your CPU cores provided you give it 100% extra memory. In other words, memory utilisation must be kept below 50%.
Cloud/VPS prices scale roughly linearly with memory usage. So using a tracing GC roughly doubles your hardware costs. Whether or not that is cheap depends entirely on what share of your costs is hardware cost and how much productivity gain you expect from using a tracing GC.
I would be very interested in learning how much CPU and memory overhead Swift's reference counting has, because in terms of productivity Swift is certainly competitive compared to languages using a tracing GC.
[1] Azul can do pauseless, but I don't know exactly what tradeoffs their approach makes. Their price is too high for me to even care.
Note though that a lot of the problems with GC in Crystal can be worked around by replacing classes with structs. The latter are passed by value and allocated on the stack. There is also access to pointers and manual allocation if that should be needed to optimize a hotspot (though that will end up with roughly the same lack of memory safety as in C).
For the JVM, the Shenandoah GC [1] can do so as well (or at least deliver very low, consistent pauses) and is available via EA builds or the OpenJDK in Fedora 24 [2].
This is with pointer-heavy Java code, not with special effort to use pointer-free data.
> The key to performing concurrent evacuation is having the Java Threads and the GC threads agree on the location of objects. This is accomplished in Shenandoah by the use of a Brooks forwarding pointer. All reads by the Java Threads indirect through this forwarding pointer. All writes to objects in targeted regions must first copy the object and then write to the object in its new location.
I'm a bit surprised that indirection is efficient enough to be worth the trouble (since you need reads and writes to branch for the indirected-object case), but I can't argue with results.
If you are on a server, do you need a 10ms max pause time? For most applications running Go on a remote machine, 25ms should be in the realm of acceptable.
People want a non-GC language because everyone already has a GC'd language that they are comfortable with.
So basically a C/C++ replacement is the only niche left to fill. It would be even better if a new language could also replace GC'd languages, so I could write fast low-level libraries or websites in a single language, without sacrificing productivity. That would be the Holy Grail I guess.
I agree, classes are silly in JavaScript. It just masks the prototype and creates ambiguity; using the prototype effectively is part of being a good JavaScript developer.
I understand the hate and everything but honestly I think it presents a fun and refreshing way of solving problems.
Also npm is pretty awesome, aside from how massive the node_modules folder gets.
One of the largest areas of concern is for real-time systems (systems which fail if they do not respond within some small time threshold). Most GC involves stopping the world to perform the GC which can pause your program's execution for some number of milliseconds. If GC pauses exceed your real-time requirements, you're out of luck.
Some languages, like Erlang, do slightly better by garbage collecting Erlang processes individually, so other Erlang processes can continue running during GC.
For hard realtime systems it actually doesn't matter anymore. These are mostly implemented with a "don't allocate at all" strategy, since every allocation is non-deterministic. Therefore things are mostly statically allocated and bounded, and maybe some object pools are around. You can do this in Go the same way as in C. The only question there is whether the GC will still run in the respective languages if no allocations happen from the user (e.g. because the runtime could do allocations in the background for its housekeeping).
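A minimal sketch of that "allocate everything up front" pattern in Rust (names and sizes are made up): the pool is built at startup, and the hot path only hands slots back and forth, never touching the allocator.

    struct Pool<T> {
        slots: Vec<Option<T>>, // allocated once, before the realtime loop starts
    }

    impl<T> Pool<T> {
        fn with_capacity(n: usize) -> Self {
            let mut slots = Vec::with_capacity(n);
            slots.resize_with(n, || None);
            Pool { slots }
        }

        // Linear scan for simplicity; a real pool would keep a free list.
        // The point is that no allocation happens here.
        fn acquire(&mut self, value: T) -> Option<usize> {
            for (i, slot) in self.slots.iter_mut().enumerate() {
                if slot.is_none() {
                    *slot = Some(value);
                    return Some(i);
                }
            }
            None // pool exhausted: a bounded, predictable failure mode
        }

        fn release(&mut self, index: usize) {
            self.slots[index] = None;
        }
    }

    fn main() {
        let mut pool: Pool<[u8; 64]> = Pool::with_capacity(1024); // startup
        let handle = pool.acquire([0u8; 64]).expect("pool exhausted");
        // ... hot path works with pool.slots[handle] without allocating ...
        pool.release(handle);
    }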
Yeah, the idea that malloc/free "don't pause" can only be based on not understanding how mallocs actually work. Advanced mallocs are often even multi-threaded and do some work on background threads.
Hard-realtime is always "allocate everything up front". It has to be. Allocation of dynamic sizes is not a problem you can make fully deterministic.
This is known, but it's always possible to overcome these situations, and it's part of the language maturing. Golang has already had its run at optimizing its GC for real-time systems. Twitch uses Golang for their IRC chat, and they've taken the Golang GC on a journey which you can read about here: https://blog.twitch.tv/gos-march-to-low-latency-gc-a6fa96f06...
Crystal will at some point also be forced to optimize its GC for these cases, although it currently uses an off-the-shelf GC, the Boehm-Demers-Weiser conservative garbage collector (http://www.hboehm.info/gc/), which they have acknowledged they need to replace sooner or later.
The parent comment is referring to hard real-time systems (where not responding within a certain timeframe would lead to catastrophe). We're talking things like pacemakers, anti-lock brakes, industrial control systems.
Regardless of how good GC is you would never use it in a hard real-time system because it is non-deterministic. IRC chat is only soft real-time.
Those kinds of systems won't get compilers for something other than C/C++ or maybe Ada for a long time. Usually you're stuck with a compiler from the chip vendor that kinda-sorta supports C.
Indeed, and you'll probably also need a special-purpose RT OS stack (or have to write your own/go without).
EDIT: I'd also add that this is such a niche[1] area of programming that expecting any mainstream language to meaningfully support it is... optimistic, and that mainstream languages shouldn't try to support it. (Soft real-time may be reasonable, but I believe that can be achieved with GC, as demonstrated by the Azul JVM.)
[1] Niche, but obviously important, and perhaps not lucrative enough for anything to displace C or perhaps Ada, given that these industries tend to be extremely conservative. (I wonder if ATS is used, though. Can't claim it's pretty, but proof seems like it would be a good thing for these systems?)
Just as a point of interest, I believe there are special forms of garbage collector that are suitable for hard real time systems.
The principle is to regularly use a bounded amount of time for collecting - in line with the latency requirements for the whole system. I think the relevant term is 'tick tock', as in tick - compute, tock - collect.
The thing about hard real-time systems is that they must be predictable, which is quite a wide term: predictable memory utilization, predictable computation cost, predictable response time. To even fit into these requirements, the GC must be "passive", i.e. on-demand, callable from code. Even with bounded collection times, the number of collections must be predictable/controlled to predict the computational cost/time of code paths. And that becomes not much different from manual memory management.
All of these can happen in non-GC'd languages through heap fragmentation (i.e. even when correctly allocating and deallocating memory, you can still end up with a fragmented heap). The only way to avoid this is to avoid all dynamic allocation (which is indeed done in a lot of systems) or to exclusively use memory pools instead of a traditional heap.
If there is no need for predicting memory utilization, then doesn't real time GC fit the bill?
Consider that you know whatever you want to execute runs in time T. A real-time GC makes sure it always executes in time 2T, for any sequence of operations.
Any program running on any non-realtime OS can stop for any number of milliseconds. There are a billion reasons for such interruptions like other processes wanting to run, cleanup phases internal to the OS, memory paging...
If your program stops for 50 ms, do you really care if it was because of a GC cycle or something else? If you really do care, then you are not allowed to target Linux, Windows or OS X all of which are decidedly not real time operating systems.
To answer your last question, hard real-time embedded systems are everywhere in robotics, aerospace, telecommunications, automotive, medical devices...
The real-time capabilities are not always done purely in SW; there are some FPGAs, but when you do rely on SW, you often cannot afford to spend even a few milliseconds in GC. In some cases, that would mean killing or maiming someone.
And you are often tied to the HW vendor toolchain for a specific DSP, MCU, etc. that only supports C or C++. This is a domain that is moving very slowly; currently my most optimistic timetable would be to have vendor support for a Rust toolchain in 10 or 15 years, but I don't foresee any GC language coming to replace the critical parts written today in C or C++.
Even soft real time systems like games or real time networking solutions suffer from non-deterministic GC pauses. Audio processing is another example.
You can get by in a GC'd system if you're careful not to allocate in the "hot path", but it's much more difficult than manual memory management (you need to know the internals of the GC algorithm) and interference from other threads may spoil your hard work.
Minecraft is a prime example of annoying (Java) GC pauses causing annoying interruptions. Another one is Kerbal Space Program's choppy audio (from C#/Mono GC). Although these games made millions or billions of dollars regardless, so you might argue it's a non-issue.
> currently my most optimistic timetable would be to have vendor support for a Rust toolchain in 10 or 15 years
Not sure how many changes you'd need to the Rust compiler to be able to use it on MCUs and DSPs, but LLVM is more and more common and it might be (almost) enough to have the LLVM backend ported to the target arch. LLVM is moving fast, so for some targets it might be viable much sooner than your estimate.
It was not clear from my post, but what I would like to have is an RTOS with an implementation of the Rust std modules, to be able to develop an application on top of it. As far as I know, no one is working on that.
If you write programs with a GC, there is always a level which you can't get your hands on to change. If you want to create a piece of a puzzle that is tiny, does one thing, and does it right, like a shared library, a command, a linkable object, or any tiny standalone binary, the GC is always a thorn.
Anything where memory or interactivity needs to be tightly controlled is problematic with a GC. Not only that, but a GC doesn't scale as well with lots of threads. Ultimately you need thread-local allocation, since you will eventually be bottlenecked by the fact that typical allocation (with malloc, VirtualAlloc, mmap, etc.) is protected by a mutex, and deallocation suffers the same fate.
Except that garbage collected languages generally use per-thread nurseries, so the fast path is a small number of instructions. Having a GC also makes lock-free programming easier.
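Roughly what that fast path looks like, sketched in Rust (sizes and names invented): a thread-local bump pointer, so the common allocation is an add and a compare, with no mutex anywhere.

    use std::cell::Cell;

    const NURSERY_SIZE: usize = 1 << 20;

    thread_local! {
        // Each thread bumps its own offset into its own nursery.
        static BUMP: Cell<usize> = Cell::new(0);
    }

    // Returns an offset into this thread's nursery, or None when it's full,
    // at which point a real runtime would take a slow path (grab a fresh
    // nursery or trigger a minor collection).
    fn alloc(size: usize) -> Option<usize> {
        BUMP.with(|bump| {
            let offset = bump.get();
            if offset + size > NURSERY_SIZE {
                None
            } else {
                bump.set(offset + size);
                Some(offset)
            }
        })
    }

    fn main() {
        let a = alloc(64);
        let b = alloc(64);
        assert_ne!(a, b); // two bumps, no lock touched
        println!("{:?} {:?}", a, b);
    }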
Whether reference counting is GC or not is arguing about semantics.
But correctly implemented reference counting is essentially pause-free. It's consistently "slow", which is better for some cases than unpredictably "fast".
IMO it's not just semantics, rather, GC is too general of a term if it includes ARC. In terms of performance analysis the two are vastly different. One has basically an unbounded worst case but a good average case, the other is the opposite.
Reference counting also has an unbounded worst case: what if you drop the last reference to a very large graph of objects? Then you free the whole thing, which can take an arbitrarily large amount of time.
The main difference there is: you can control the timing of when to pay that penalty. With a full-blown GC, if you run into performance issues because of it, you basically have to rearchitect the whole app (with something like memory pools, which you pay for by needing more RAM than strictly necessary and with vastly more complex code). With ARC I can track down a slow memory operation to a single line of code and deal with it there (e.g. by moving the complex object into a singleton). IMO a full GC is just the wrong level of abstraction for anything that's timing relevant, which includes all UI threads.
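For what it's worth, non-refcounted languages give you the same knob. A small Rust sketch (not ARC, and the "graph" is just a stand-in): if dropping a big structure on the hot path is too slow, you can choose where and when that cascading free happens, e.g. by shipping it to a background thread.

    use std::thread;

    // Stand-in for a "large graph": dropping it frees ~10k allocations.
    fn build_big_graph() -> Vec<Vec<u64>> {
        (0..10_000).map(|_| vec![0u64; 1_000]).collect()
    }

    fn main() {
        let graph = build_big_graph();

        // Dropping `graph` right here would pay the whole cascading free on
        // this thread. Moving it to another thread defers that cost off the
        // latency-sensitive path, at a point we chose explicitly.
        let cleanup = thread::spawn(move || drop(graph));

        // ... latency-sensitive work would continue here ...

        cleanup.join().unwrap();
    }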
> IMO a full GC is just the wrong level of abstraction for anything that's timing relevant, which includes all UI threads.
Hard real-time GC systems exist. In these systems, you can prove that pauses last no longer than a certain number of milliseconds. They're definitely applicable to programs with UI.
Can you prove that dropping a reference doesn't free an arbitrarily large number of objects? You can probably convince yourself in specific cases for specific programs that you don't see arbitrarily large refcount-release times, but any change you make to the code might invalidate this analysis.
I keep hearing about these, but which of the popular GC languages (read: lots of library support) have real time GC? Aren't we talking about industrial RT applications rather than GUIs?
I agree that to compete with Ruby ergonomics one probably needs a GC. I think part of what I'm getting at is that there are other ways to approach these problems, and aping Ruby isn't necessarily one I prefer. Not to knock Crystal, it seems very cool.
Re: application domains, I've recently been doing some work in CPU/memory constrained applications (not embedded, running big >500GB jobs on HPC clusters), and a GC is unfortunately a non-starter for this kind of data processing.
I have also been watching with great anticipation the work being done on "big data" processing with Rust (https://github.com/frankmcsherry/timely-dataflow) and how that might obviate the need for a GC with the various JVM RAM-hogs which dominate that field.
There are also many areas where people work (many of whom provide the tools that programmers of GC'd languages use for their jobs) which can't admit a garbage collector.
For example, I currently deploy Django code (running on an interpreter that needs to implement, not run on top of, a GC) to a machine with a Linux kernel, running nginx, backed by another machine running PostgreSQL, with caching in Redis. None of those very important tools could reasonably offer the performance needed if written in a garbage-collected language.
For another example, I'm typing this (quite lengthy) response in a low-latency application (a browser) which would also be difficult to implement in a garbage-collected language.
About GC: it would be nice if there were some kind of standard-ish implementation framework for a GC in an LLVM language.
LLVM has been enabling fantastic new programming languages, and while it has support for a GC, I have not found a GC library that would be easy to embed in a new compiler/runtime environment.
Now there are dozens of LLVM-based languages (or language prototypes) that have different, incompatible implementations of GC with varying degrees of quality. If there was a relatively simple but efficient GC available, it would be much easier to implement a new language on LLVM.
At one point there was a project called HLVM, but it was targeted at implementing JVM- and .NET-style virtual machines. This is not what I'm looking for and I think the project is dead now.
If anyone knows about a GC implementation for LLVM, I'd really like to take a look. If it's a part of a programming language project but would be relatively easy to rip out of the rest of the compiler/runtime, it's not a problem.
It is very hard to have a general purpose GC library, because the best GC algorithms require a tight cooperation between compiler, GC and language semantics.
For me, the only viable alternative to GC are substructural type systems like in Rust's case.
You're definitely right and it's not an easy task.
However, I think there's a sweet spot where you could implement a fairly nice boilerplate/framework that would be an 80% solution to the problem which would be a vast improvement over the current state.
The missing 20% would be language specifics and that would be either solved by forking the boilerplate code or writing some kind of callbacks for discovering references given a root object.
edit: Additionally, there's no simple example of using a GC with LLVM. It would be very helpful if there was, for example, a GC'd version of the Kaleidoscope language used in the LLVM tutorials. Even a trivial Lisp-style cons/car/cdr object system coupled with the simplest possible mark'n'sweep GC would be good.
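In that spirit, here's roughly what the "trivial cons cells plus the simplest possible mark'n'sweep" example could look like, sketched in Rust rather than against LLVM, purely to show the shape (indices stand in for pointers; every name is made up):

    #[derive(Clone, Copy)]
    enum Value {
        Int(i64),
        Cons(usize), // index of a cell in the heap
        Nil,
    }

    struct Cell {
        car: Value,
        cdr: Value,
        marked: bool,
    }

    struct Heap {
        cells: Vec<Option<Cell>>, // None = free slot
        roots: Vec<Value>,        // stack/globals the mutator can reach
    }

    impl Heap {
        fn mark(&mut self, v: Value) {
            let i = match v {
                Value::Cons(i) => i,
                _ => return, // ints and nil hold no references
            };
            let (car, cdr) = match self.cells[i].as_mut() {
                Some(cell) if !cell.marked => {
                    cell.marked = true;
                    (cell.car, cell.cdr)
                }
                _ => return, // already marked or already freed
            };
            self.mark(car);
            self.mark(cdr);
        }

        fn collect(&mut self) {
            // Mark phase: everything reachable from the roots.
            for root in self.roots.clone() {
                self.mark(root);
            }
            // Sweep phase: reclaim what was never marked, reset the rest.
            for slot in self.cells.iter_mut() {
                match slot {
                    Some(cell) if cell.marked => cell.marked = false,
                    _ => *slot = None,
                }
            }
        }
    }

    fn main() {
        let mut heap = Heap {
            cells: vec![
                Some(Cell { car: Value::Int(1), cdr: Value::Nil, marked: false }), // reachable
                Some(Cell { car: Value::Int(2), cdr: Value::Nil, marked: false }), // garbage
            ],
            roots: vec![Value::Cons(0)],
        };
        heap.collect();
        assert!(heap.cells[0].is_some() && heap.cells[1].is_none());
    }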
The Boehm collector (http://www.hboehm.info/gc/) appears to offer what you're describing. In fact, it looks like it's what Crystal added in 2013 to implement its original garbage collection:
"About GC: it would be nice if there would be some kind of standard-ish implementation framework for a GC in an LLVM language."
Not quite LLVM, but take a look at the Eclipse OMR project.
OMR intends to provide a set of reusable components (a GC, a port library, and, given more effort, a JIT) that can be reused in existing language runtimes or used to build a whole new language.
Perhaps that's not the right way to phrase it. I guess I'm mostly thinking of compiler plug-ins which are very unstable right now. Which means that most users probably never write procedural macros in their own code. (I certainly don't)
There's also syntex (https://crates.io/crates/syntex), which basically provides compiler plugins for stable rust. It does so via code generation though.
I have no problem with GC, but I want to see reasonably complicated benchmarks that actually show that it's "fast as C". Because I don't believe that at all.
Are you saying the GC is as fast as C? I bet it's not.
That being said, the programs I've built in Crystal "feel" very fast; here are a few random performance tests, if you're asking about overall performance:
The biggest problem with GC, it seems, is some sort of non-determinism that it introduces in the program's behavior. Otherwise, garbage collection, being 'lazy', is in fact a more efficient way of releasing unused memory compared to how it is usually done in C and, especially, C++, where memory is released 'eagerly' (e.g. as part of the destructor), thus wasting precious machine cycles on something that may not be even necessary at all.
> wasting precious machine cycles on something that may not be even necessary at all.
I'm not familiar with very many scenarios where one has a garbage collector but doesn't need to free some piece of memory when it's no longer used. Could you clarify what you mean here?
For most GC algorithms (but not ref-counting), the time complexity is O(N) where N is the number of surviving objects (or it could be the number of surviving edges/references, I forgot!). For manual/deterministic/eager memory management, the complexity is O(N) where N is the number of allocated/freed objects.
So if the number of survivors << the number of allocated objects, which it always is in many functional languages, then GC can be faster than manual memory management. Especially if you use a copying GC algorithm which makes allocation extremely cheap.
This is true for those compute job types which are mostly 'CPU bound' and which usually create most of the objects that they need at the very beginning; these objects would not be released until the job is finished anyway. I admit that in this case it may take some thought and deliberate effort on the programmer's part to avoid creating many short-lived objects.
Nim[0] takes an interesting approach. It uses "deferred reference counting", effectively allowing GC cleanup to be spread out over a period of time. This at least helps with GC pauses.
It also seems to allow tweaking for soft realtime systems, e.g. games.
I've never understood people's abhorrence of sigils. They are useful shorthand for very specific concepts. It's like hating apostrophes in English. You can do away with them entirely, but I don't think most people would consider that an improvement.
Now, the greater density of concepts that shorthand notation allows can be abused, and too much of that often shifts the cost/benefit ratio further to the cost side for all but the most expert in the language, but that's a problem of overuse, not one inherent to their use at all.
I don't personally think any of them should go away. I just note that many newcomers to the language feel they're opaque. Having written a good bit of Rust now, I do sometimes find the sigils impact readability (especially in macro_rules).
I think its target is more along the space where Go is: as little code as possible (like a scripting language), but still speedy. So in true scripting-language form, you don't have to worry about collecting your objects ever. I'm not aware of any benchmarks on the cost/hit of this; I do know it uses the BDW GC, which is hopefully pretty battle tested...
There seems to be a lot of GC hate in this thread. I wonder how many people are aware that Unreal Engine, the dominant multiplatform game engine used to create the majority of AAA games, uses a GC.
Yeah it's weird how Rust has suddenly made me look for no GC languages everywhere I look. It opened up a whole new desire to not accept no for an answer in that regard. I have this burning thought in the back of my head that there just has to be a simpler way to offer it than Rust does it too.
Memory management is difficult, extremely difficult, to get correct in the way Rust does. I don't think I've seen a leak or bad dereference in years. The only way it really manages this is by tying references into what amounts to a proof assistant. Every simpler method of which I can think either sacrifices capability (e.g. no references at all; only RAII + copy-on-write) or it becomes a GC with all its wonderful trade-offs.
GC isn't terrible, though. Azul has struck an amazing balance between latency and eagerness; even if you can't afford it, the technology does exist. If you don't have latency, memory restrictions, or embedding requirements, Rust may be overkill.
> Memory management is difficult, extremely difficult, to get correct in the way Rust does
Memory management isn't hard --- you just need to pay attention to detail and not say "YOLO, let's abort on OOM" like the Rust stdlib does. Rust is an unacceptable language for anyone who cares about robustly responding to heap exhaustion.
"Memory management isn't hard --- you just need to pay attention to detail"
You're quite right. The problem is that every bit of attention you spend on that detail is attention that you're not spending on details that are actually solving your problem.
I programmed in C for decades. I do not miss malloc() and free() in the least.
(I still do use C when the situation warrants, but the situations where it is warranted are becoming rarer and rarer with each passing year).
- deregister threads so that they are not stopped by GC
- eventually avoid the runtime altogether
There really is no realtime system that D can't do.
The whole anti-GC thing is a giant strawman that considers all GCs stop-the-world, unavoidable, and overarching. Academia decided in favor of GC decades ago, and industry has been following suit for a good reason: the mental overhead associated with finding owners for everything.
It is not. Languages that are designed to run with a GC get crippled when run without one: depending on the language you might lose access to closures (if dynamically allocated), rich data structures (standard dictionaries, lists, etc.), sometimes even aggregates (if boxed by default), and you have to resort to using arrays of primitive types.
As they say, you can write FORTRAN in any language, but you don't necessarily want to (this is unfair to modern fortran which I hear is actually a decent language).
You speak as if a GC is unconditionally better than the alternatives and it is a solved problem, but using a GC has issues as well.
On the theoretical side, not reasoning about ownership means sharing data between threads is done with copies (slower) or locking (slower and error prone); if you know about ownership you can share references to data for free, as long as it can't be mutated.
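A small Rust illustration of that point (scoped threads are just one way to express it): two threads read the same buffer through plain references, with no copy, no lock, and no GC, because the compiler knows it can't be mutated meanwhile.

    use std::thread;

    fn main() {
        let data: Vec<u64> = (0..1_000_000).collect();
        let (lo, hi) = data.split_at(data.len() / 2);

        let total: u64 = thread::scope(|s| {
            // Both closures borrow `data` immutably; nothing is copied or locked.
            let h1 = s.spawn(move || lo.iter().sum::<u64>());
            let h2 = s.spawn(move || hi.iter().sum::<u64>());
            h1.join().unwrap() + h2.join().unwrap()
        });

        assert_eq!(total, data.iter().sum::<u64>());
        println!("{total}");
    }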
Ownership is also important for any non-memory resource (file handles, mutexes, etc). GCs release those "whenever", maybe never, unless you close manually.
And even though manual memory management has some small non-deterministic overhead for heap coalescing (which one can usually work around with pools), most GCs I've worked with add measurable overhead. This equates to more cost per server, more load, more battery life drained, higher response times...
> On the theoretical side, not reasoning about ownership means sharing data between threads is done with copies (slower) or locking (slower and error prone); if you know about ownership you can share references to data for free, as long as it can't be mutated.
I don't think it follows, and it's rather the reverse: if what I share has a global owner (ie. the GC), I don't have to lock or copy by definition: once it stops being reachable it will be collected. That's why some lockfree algorithms are enabled by the GC.
With ownership you would have to have a unique owner, or reference counts.
GC does require write barriers or stop-the-world though so let's say it's a draw :)
> Ownership is also important for any non-memory resource (file handles, mutexes, etc). GCs release those "whenever", maybe never, unless you close manually.
Yeah, it's a big problem that the GC even attempts, poorly, to close them. But D has scope guards and RAII built in, so for the 50% of resources that aren't memory you still have to think about ownership, indeed. That's more complicated than the C++ situation. But it does not prevent realtime; you may well find yourself having more time to optimize :)
> if what I share has a global owner (ie. the GC), I don't have to lock or copy by definition
Then how do you avoid data races? Two shared references which can mutate your shared data require either a copy, a lock, immutability, or a single writer.
I use "parallel foreach", sometimes worker queues, implicit single writer... like in C++.
It sounds like you think only Rust-style ownership can avoid data-races. Sure, if you want the type system to do it. For me discipline is enough and I've seen it work in teams too. Not seeing such a problem really.
Nothing specific to Rust, although Rust encodes in the type system what I usually have to keep track of mentally, which is nice.
Trivially parallel algorithms do benefit from constructs like "parallel foreach" and implicit single writers, but in general, one either has to stick to those models (where the cognitive overhead is low but manageable) or if one ventures into more complex territory, one has to either deal with a higher mental complexity (ownership or locks), performance degradation (copying), or immutable data (if it fits your problem and doesn't decrease performance, win-win).
My argument is simply that GC doesn't fix everything, and the mental overhead of tracking ownership of memory (to me) isn't a huge burden, especially since I have to do it for non-memory resources and memory resources shared between threads already.
I'm not against a GC - I like languages that mix GC and non-GC side-by-side - because sometimes I do want to just forget about my memory, but only if it fits my problem domain. But I don't think GC beats non-GC hands-down for-all-cases.
Yes. I'm currently trying to think of a good way to add it to Myrddin, which currently makes it more or less manual. Doing it in a simple way is a tough problem.
The simplest solution is to add the moral equivalent of 'null' -- objects that transition to an idempotently destructable state, which solves a lot of complexity with the data flow and analysis (yay!) at the cost of some safety (boo), and nulls (louder boo).
My language (Lily) handles the problem by trying to avoid the gc where it can.
Lily is statically-typed, built-in classes can't be inherited from, and there's no C-like casting.
With those rules in mind, most objects can't become cyclical. It's impossible for a list of strings to loop back onto itself, for example. It helps that the value classes backing enums (like Option and Either) are immutable, which I so far suspect prevents a cycle.
That at least allows you to group classes into three groups:
One of the goals I have is to keep the required runtime absolutely minimal, as well -- I'm ok with the compiler inserting some user-defined code in the appropriate places to initialize or release values, but I'd like to avoid growing the required code in https://github.com/oridb/mc/tree/master/rt unless it's absolutely necessary.
And yes, that ~60 lines per platform is really all that's needed. (And actually, I should be able to merge more of it for SysV platforms.)
So, I've thought about a GC, but I'd really prefer not to have it.
The claim is "fast as C", so I was surprised that the performance comparison was with Ruby, not with C. On my machine, the Ruby Fibonacci program executes in 47.5s, while a corresponding C program executes in 0.88s - that's a factor 54 difference, while the article reports a factor 35 for Crystal. That's good, but what causes the difference? This benchmark is pretty much all function call overhead, so I doubt it's representative of real performance-sensitive code.
The Crystal website itself makes a more modest claim than "fast as C" under its language goals: "Compile to efficient native code", which it clearly does.
All of these "fast as C" claims about modern, high-level Python-like languages (be they statically typed and natively compiled) are missing the point. It is mostly the minimalistic and terse programming style that C encourages that makes C programs performant. You avoid allocations wherever possible, you write your own custom allocators and memory pools for frequently allocated objects, you avoid copying stuff as much as possible. You craft your own data structures suited for the problem at hand, rather than using the standard "one size fits all" ones. Compare that to the "new this, new that" style of programming that's prevalent today.
While Nim is my favorite language, I can understand that it has a small userbase, for these reasons:
1. No major backer like Google for Go or Mozilla for Rust
2. No killer feature like "memory safety and performance without GC" for Rust, instead a mix of all the reasonable down-to-earth features I want in a programming language
3. Some unique decisions instead of what you're used to from other languages, for example partial case sensitivity
I am a strong proponent of Nim but this is probably the worst idea I have ever encountered in language development. Honestly!
Partial case sensitivity and the special underscore case are features I can live with. Unfortunately this has actually become a stumbling block for a wider adoption of Nim.
All strange special features should be optional, not default.
What makes you think this is default? It most certainly is not and will be removed completely in the future.
Edit: here is a source: http://nim-lang.org/docs/manual.html#syntax-strong-spaces ("... if the experimental parser directive #?strongSpaces is used..."). The last time this was discussed I said that it would be removed completely, and I still believe it will be. It's simply not a priority for us right now.
Some of those features are default in Nim, some (strongspaces) are not. I say that all such weird features should be optional in general so that newcomers don't get scared off.
Also, case and underscores should work like in C by default, since Nim interoperates with C seamlessly anyway. Case insensitivity and ignoring underscores are OK if optional.
If you're consistent about how you space your infix operators, this will have no impact on your code.
If you put space around some operators and not around others, in a way that doesn't correspond to precedence, you're going to confuse anyone who reads your code, in any language.
I like it for distinguishing homonym operators, but not for the precedence stuff they seem to have there. I'd like something like this though:
let a = 10;
a / 5
output> 2
let b = pwd();
b/temp
output> Directory<"~/temp">
b / 2
error> b:Directory does not implement method "divide(:number)"
a/temp
error> a:int does not implement method "get(:string)"
I used both Nim (back when it was still Nimrod) and Rust for a while, before eventually settling on Rust. I tried to give Nim a chance, and was told that "they will grow on you" ("they" being the things you mentioned that were "unique decisions instead of what you're used to from other languages"). They never did, and though I got used to avoiding the problems I initially had with them, the language just never "felt good" to me.
wow... have never heard of that partial case sensitivity before.
I think this goes beyond syntactic sugar. Holding the hand of the developer too much?
Personally, as a Python programmer I like interfacing with C++ code like Qt via PyQt. If I see a camelCase method I know where it came from, but if I see a PEP-8 style name or method I know it's our own code, not from Qt.
Shameless plug: is it considered totally uncool in 2016 for one to be developing a memory-unsafe, manual MM, non-OO, thread-denying language that preserves most of the C semantics?
I'd be very interested in a language that is roughly as low level as C, but has some obvious warts "fixed" while still being able to run on bare metal or with a minimal runtime system. I also don't care about a standard lib as long as I can call open(), close(), read(), write(), socket(), etc.
Native threads are another requirement for me.
Things I'd like to see in a language:
- compile to native executable
- type inference
- module system without header files
- easy to call into native C code, and export functions so they can be called from C or any other language
- first class SIMD structures (this is missing from Rust!), so that you don't have to duplicate code for sin4f and sin8f (which would be line-by-line equal, except types)
- perhaps some kind of modern polymorphism (ie. not class based OOP)
- can target GPUs via LLVM or SPIR-V
- memory safety is optional, but nice to have. I'd be mostly interested in using this kind of language for GPU kernels and tight inner loops, where you wouldn't be allocating anyways
I have a bunch of design ideas and prototypes in my drawer waiting for a lot of free time and inspiration appearing.
I like my tools sharp, even if it means there's going to be blood occasionally.
My next big endeavour with Quaint will be to create a clean module and linking system (without header files or any textual inclusions). Each source file will be transformed to a corresponding unit which contains code, data and exported type definitions. The linker would then merge these units and produce a native executable that runs your program in the self-hosted VM which will be a part of that executable. Pure native compilation or LLVM integration is too much of a hassle for me at this point.
One of the virtues of the language would also be the direct correspondence between the HLL code and the emitted VM instructions, without any optimisation passes. This makes it much easier to reason about code performance and to write code which performs consistently and predictably (albeit a bit slower).
Nim fits everything you ask, except for "can target GPUs via LLVM or SPIR-V". Even that may eventually be fixed by having OpenCL C as a compilation target.
Also, I am not sure what you mean by "first class SIMD structures", but you can definitely have a single definition for sin4f and sin8f if they are line by line equal except types, by using union types.
Nim is definitely on my short list of languages to learn, however...
Targeting GPUs is a deal-breaker. I'm sure the Nim compiler would be pretty easy to retarget to GPUs via SPIR-V (the new binary IR for Vulkan/OpenCL shaders and kernels) or OpenCL/CUDA C. But I don't think that would work for Nim's runtime system or existing Nim libraries (including any standard libs it has).
Also Nim's pauseless low latency automatic memory management (I guess you can call it a "GC") is very interesting but it's not what I'm after.
> Also, I am not sure what you mean by "first class SIMD structures",
I mean this:
def multiply_and_add(a : <n x f32>, b : <n x f32>, c : <n x f32>) : <n x f32> {
return (a*b) + c;
// TODO: figure out how to use "madd" from FMA4 or NEON instruction set
}
The trivial piece of code above should be "generic" so that it can be called with any width of vector.
Now, the example above is very trivial, but more complex examples might pose challenges for a correct implementation of the type checker. In particular, vector shuffles (i.e. the equivalent of __builtin_shufflevector in the GCC/Clang vector extensions) would need to have a strange type. Shader languages typically use a syntax like `myvector.wxzy`, which might work.
This might perhaps be possible with an ungodly mess of C++ templates and explicit template specialization for each vector type (and hoping that the compiler is aggressive enough in inlining). But I'm not really a fan of template-heavy C++.
In fact, the kind of solution I've been thinking about would be semantically similar to what I'd do with C++ templates.
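For the record, here is a minimal sketch of that C++-templates-over-vector-extensions approach, assuming GCC/Clang vector extensions. The vec<> trait and function names are just illustrative, and whether (a*b)+c actually becomes a fused madd depends on flags like -mfma / -ffp-contract:

typedef float v4f __attribute__((vector_size(16)));   // 4 x f32
typedef float v8f __attribute__((vector_size(32)));   // 8 x f32

// Map a width to its concrete vector type: one explicit specialization per
// supported width, which is exactly the boilerplate being complained about.
template <int N> struct vec;
template <> struct vec<4> { using type = v4f; };
template <> struct vec<8> { using type = v8f; };

// One generic body serves every width; element-wise * and + come with the extension.
template <int N>
typename vec<N>::type multiply_and_add(typename vec<N>::type a,
                                       typename vec<N>::type b,
                                       typename vec<N>::type c) {
    return (a * b) + c;
}

int main() {
    v4f a = {1, 2, 3, 4}, b = {5, 6, 7, 8}, c = {9, 10, 11, 12};
    v4f r = multiply_and_add<4>(a, b, c);   // the same definition also instantiates as <8>
    return (int)r[0];                       // per-element access works too
}

Whether that counts as "first class" is debatable, of course; the supported widths still have to be enumerated somewhere, which is the point being made above.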
> but you can definitely have a single definition for sin4f and sin8f if they are line by line equal except types, by using union types.
I'm not familiar enough with Nim's union types to be sure, but my guess is that this would not compile to efficient low level code apart from the most trivial of circumstances. This is my (not very) educated guess based on other high level languages with some concept of union types.
Anyway, Nim is a very cool language that I will check out sometime in the near future. It just isn't what I'm looking for, given my very specific use case.
A union type in Nim can only be used in function arguments, and it does the obvious thing: when you actually call the function, it specializes to the type you are calling with. Think of templates in C++ where the type parameter can only assume one of two (or more) values. Hence it would generate exactly what you would write by hand, but the syntax is much less messy than C++ templates.
You might also be interested in Jai [0] which has many of those things but is not a 'real language' yet or possibly ever. Lots of interesting ideas though.
Thanks, I've read about it before, but haven't spent too much time looking at it.
However, this "single program, multiple data" isn't exactly what I'm looking for (it would solve the sin4f vs. sin8f issue mentioned above, though). I need explicit, low level access to SIMD, coupled with genericity over vector widths. This means doing almost assembly-style SIMD code with explicit shuffles, blending, etc as well as access to intrinsics where needed.
I also need portability (ispc is from Intel, so it probably doesn't support ARM NEON) and the ability to target GPUs.
I'm very well aware that my needs are very specific. I need to do math stuff for 3d graphics and physics applications.
All I need is for a lot of free time to appear from out of nowhere and I can write a prototype compiler for this myself :)
See example above in this thread. In C + GCC vector extensions, I just use normal arithmetic operations (+, -, *, /).
However, specific intrinsics are tied to a specific vector width. It might take some "library code" to take advantage of some instructions like dot products, etc.
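To make that concrete, a tiny sketch assuming GCC/Clang vector extensions (the function name is made up): ordinary arithmetic is width-agnostic, but shuffles already need a compiler-specific builtin, and anything fancier means per-width intrinsics.

typedef float v4f __attribute__((vector_size(16)));
typedef int   v4i __attribute__((vector_size(16)));

v4f sum_then_wxzy(v4f a, v4f b) {
    v4f sum = a + b;                // ordinary operators work at any width
#if defined(__clang__)
    return __builtin_shufflevector(sum, sum, 3, 0, 2, 1);  // the "sum.wxzy" swizzle
#else
    v4i mask = {3, 0, 2, 1};
    return __builtin_shuffle(sum, mask);                    // GCC spelling of the same thing
#endif
}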
The only thing I would add would be that compared to Ruby, Nim still takes you quite a while to put something together, so defaulting to Ruby isn't necessarily a great idea.
The first benchmark ("Nim vs Rust") I looked at says
> Rust regex! runs faster than Regex
which is a very old claim - Regex should now be much faster than regex! ever was. Any pre-1.0 Rust benchmarks are probably wrong (to be fair, most benchmarks are probably wrong anyway).
A lot of them did show the other way at first, then came to the Rust community's attention and improvements were submitted. Both languages' implementations have also changed over time.
If you don't have to write cutting-edge games or embedded software for tiny systems, why do you have to care about allocations at all? Today's systems and RAM are so fast that garbage collection doesn't really matter in most cases. Consider SBCL (compiled Common Lisp), which is almost as performant as Java and C++.
I used to develop software in C and C++ for many years, and a garbage collector was the thing I wanted the most. GC-free programming is unnecessarily tough in most cases, except when you desperately need it, e.g. for games and embedded systems.
> If you don't have to write cutting-edge games or embedded software for tiny systems, why do you have to care about allocations at all? Today's systems and RAM are so fast
What? RAM is not fast at all; latencies have barely improved in 20 years (compared to the improvements in other subsystems like the CPU, of course).
Because allocating something on the stack means adding a constant to a pointer, and it puts the object somewhere that is likely to be in the cache for the duration of the function, while allocating something on the heap usually takes a traversal of the allocator's data structures and puts the object far away from the other data you might be using.
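A toy illustration of the difference (the names are made up, and real allocators vary, but the shape of the cost is the point):

struct Vec3 { float x, y, z; };

float stack_version() {
    Vec3 v = {1.0f, 2.0f, 3.0f};           // a stack-pointer bump; data is hot in cache
    return v.x + v.y + v.z;
}

float heap_version() {
    Vec3* v = new Vec3{1.0f, 2.0f, 3.0f};  // goes through the allocator: bookkeeping,
    float s = v->x + v->y + v->z;          // maybe a lock, likely a cold cache line
    delete v;
    return s;
}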
Also, using numbers from a benchmark game is not representative of the performance of real world applications. If you look at the code, you'll find that it's written in a style that avoids heap objects and GC wherever possible. Forcing heap allocation is what makes Java slightly slower than C in many cases.
Nitpick: everything you say is probably correct, but such performant C programming is also the very opposite of a "minimalistic and terse style".
Which one is more minimalistic, 'new Foo' or a collection of various custom-tuned allocation methods? Which one is more terse, 'myList.Where(foo).Select(bar).Aggregate(baz)' or an explicit for loop?
It is minimalistic in the sense that the language provides a narrow set of primitives and a skilled programmer combines these primitives in the most sensible way to solve the problem at hand. Higher level stuff in most other languages is much more generic.
Indeed, it may not be minimalistic in terms of the code size.
> All of these "fast as C" claims about modern, high-level Python-like languages (be they statically typed and natively compiled) are missing the point. It is mostly the minimalistic and terse programming style that C encourages that makes C programs performant. You avoid allocations wherever possible, you write your own custom allocators and memory pools for frequently allocated objects, you avoid copying stuff as much as possible. You craft your own data structures suited for the problem at hand, rather than using the standard "one size fits all" ones. Compare that to the "new this, new that" style of programming that's prevalent today.
Exactly! I cannot agree more.
I have a small test program I port to different languages to test the length of the code and the speed of the program. Of course it only represents a single use case.
* C is first, of course.
* twice as slow, come Pascal, D and... Crystal!
* x3 to x5, come Nim, Go, C++ (and Unicon).
* x6 to x9, come Tcl, Perl, BASIC (and Awk).
* x15 to x30, come Little, Falcon, Ruby and Python.
* x60 to x90, come Pike, C#, Bash.
* x600 to x1000, come Perl6 and Julia.
This list looks byzantine, I know :-) The trends I can get out of it:
* the last 2 are languages with JIT compilation, and that's horrid for short programs.
* the "old" interpreted (or whatever you name it nowadays) languages (Tcl, Perl) are not so bad compared to compiled languages, and much faster than "modern" one (Ruby, Python). (Again, this is only valid for my specific use.)
* compiled languages should all end up in the same ballpark, shouldn't they? Well, they don't. The more they offer nice data structures, the more you use them. The more they encourage a functional style (I mean the tendency to create new variables all the time instead of modifying existing ones), the more you allocate, create and copy loads of data. In the end, being readable and idiomatic in those languages means being lazy and inefficient, but what's the point of using those languages if you don't use what they offer? C forces you to use proper data structures rather than reusing existing generic ones; it comes naturally. What is unnatural in C is to copy the data again and again: it is simpler to modify what's there and work on the right parts of it, rather than passing around whole chunks every time you need one single bit (see the sketch just below). In more evolved languages, the compiler won't save you with some hypothetical magic tricks; it cannot remove the heavy, continuous data copying and moving you instructed your program to do. And that is what made the difference in speed between C on one side, and D, C++ and Go on the other.
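A toy sketch of those two styles (the names are made up; it happens to be C++ but the same contrast applies to D or Go): the idiomatic high-level version allocates and copies at every step, while the C-style version touches the existing buffer in place.

#include <cctype>
#include <string>
#include <vector>

// Idiomatic "create new values" style: every element is copied into a fresh
// string and pushed into a fresh vector, so data is allocated and moved repeatedly.
std::vector<std::string> uppercased_copy(const std::vector<std::string>& in) {
    std::vector<std::string> out;
    out.reserve(in.size());
    for (const auto& s : in) {
        std::string t = s;                                          // copy
        for (auto& c : t) c = (char)std::toupper((unsigned char)c);
        out.push_back(std::move(t));                                // grow + move
    }
    return out;
}

// C-style "modify what's there" version: no allocation at all.
void uppercase_in_place(std::vector<std::string>& in) {
    for (auto& s : in)
        for (auto& c : s) c = (char)std::toupper((unsigned char)c);
}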
I'm curious where Rust fits in on your hierarchy with their emphasis on "zero cost abstractions". Of course that's more a lofty ideal than reality but it means that at least in some cases it does much better than C++. Is it a long program? Something that could be posted in a Github Gist maybe?
I don't think your list is quite right. Crystal is probably more on the tier of Go. Julia is also much faster than that, at least up there with Tcl. Of course both vary a great deal depending on what you are using them for.
> I doubt it's representative of real performance-sensitive code
Two data points (one-off timings of a few lines of code doing the same work load) just don't make for a comparison we should spend time bothering about.
Whatever you think of the benchmarks game, I don't see why we need to waste time with comparisons that don't meet that low standard:
From this quote it sounds like the more CPU-intensive the workload, the bigger the speedup you can expect compared to Ruby.
>Remember: The cake is a lie, and so are benchmarks. You won’t have 35x increase in performance all the time, but you can expect 5x or more in complex applications, more if it’s CPU intensive.
For now, if you want a fast language with the beauty and productivity of Ruby, check out Elixir [0] and its web framework, Phoenix [1]. I've been using Phoenix for a year, and it's the first framework that I've actually liked more over time. And I've been a web developer for a decade. With its recent 1.0 release, Phoenix is gaining a lot of momentum.
If you want some idea of the performance differences between Phoenix and Rails, see [2] and [3].
Elixir is not a fast language. Not even close. Yes, it handles concurrency and parallelism beautifully which in turn enable distributed applications to perform quite well. But the language itself is significantly slower than Crystal / Rust / Go / Swift. It's not in the same category at all.
That said, it's a great language worth recommending.
Depends what we mean by fast. I have seen the Erlang VM handle 100k requests per second on a distributed cluster. That's plenty fast. Moreover, because of fault tolerance, it means the ability to have better uptime, with fewer people on call. "Fast" can also be measured to include the fact that if a system handles 200k requests per second but crashes at midnight and stays down for a few hours, the average "speed" can be quite low. In a laptop demo that's not visible, but in practice it's money and customers lost.
But if fast means "let's multiply some matrices", then yeah, you can probably use Rust or C for that. It all depends on the problem domain.
Phoenix is slightly faster than Gin (Go web framework) so depending on your use case it can be fast :). Obviously one of the primary selling points is OTP/BEAM VM with all the concurrency, HA, soft realtime etc. but with simpler syntax than Erlang.
It's only faster at IO. Once you start adding CPU-bound work to the equation, Phoenix's performance will begin to drop. It's similar to Node in that regard.
True, any generalisation has flaws :) Even "faster" is relative: do you care about average or worst-case performance, do you care more about throughput or latency, and so on :)
Well it does, but it still doesn't do it efficiently. That was the only point I was making. But in general, yes, Elixir/Erlang will handle CPU bound tasks better than node whenever they can be effectively parallelized.
Well what I meant to say is that both JS and Elixir/Erlang aren't very efficient at using CPU resources. Elixir/Erlang does a better job at compensating for this by being able to run many operations in parallel and across different machines, which isn't something you can easily achieve (or should even attempt) with Node.
> I have seen the Erlang VM handle 100k requests per second on a distributed cluster
A single JVM server can handle that load; scaling and providing fault tolerance for a server that just accepts requests is trivial these days. Also, if your requests do computationally intensive work, you are going to have a very bad time with Erlang.
> It all depends on the problem domain.
Exactly, and the domain for Elixir/Erlang is way more niche and specialized than applicable domains of other languages.
I don't have anything against Elixir but part of its crowd just advertises it as the best thing for everything.
LFE (Lisp Flavored Erlang) is a great alternative to Elixir if you prefer Lisp over Ruby syntax.
Pony lang is looking to enter the BEAM/OTP arena with its own implementation of supervisors and such. It is actor-based, OOP and is supposedly very fast in the distributed niche.
Okay, so with the given example: on my machine Elixir does the Fibonacci calculation in 12.25s, but if you add in HiPE and compile that module to native code, it comes in at 3.34s, much closer to Crystal than you might expect.
Not bad really for a language that's meant to be slow at computational stuff :^)
Hm, what about HiPE and ability to natively compile Erlang modules? That's pretty solid technology by now (I think?) and it promised some good results last time I checked. Honestly asking - I did quite a few things with Erlang and a little less with Elixir, but never even tried to use them for raw speed, instead delegating number crunching to other processes via ports.
Oops, you're right that Elixir is not a computationally fast language. If you're looking for fast, raw number-crunching, look elsewhere. That said, real-world performance tends to be extremely good for real-time and networked applications (read: web apps). It's common for requests to be handled in microseconds, even in development.
Is Elixir really fast for web apps though? According to the web framework benchmark [1] the performance of Elixir is pretty bad - it's consistently slower than python and ruby frameworks.
Chris McCord said the following on reddit after the results came out:
"We don't know what caused the errors and unfortunately we didn't have a chance to collaborate with them on a true run. A few months ago they added Phoenix in a preview, but it was a very poor implementation. They were testing JSON benchmarks through the :browser pipeline, complete with crsf token generation. They had a dev DB pool size of 10, where other frameworks were given of pool size of 100. And they also had heavy IO logging, where other frameworks did no logging. We sent a PR to address these issues, and I was hoping to see true results in the latest runs, but no preview was provided this time and we weren't able to work with them on the errors. tldr; these results are not representative of the framework."
From my personal experience going from Rails/Sinatra to Phoenix it feels a lot faster but I haven't done any benchmarks so take that with a grain of salt.
Erlang/Elixir is much, much faster than either Python or Ruby especially when it comes to workloads like your typical web app.
In the Techempower benchmarks the Phoenix tests had a ton of errors and there was no preview run so whoever submitted them wasn't able to fix them. Look at the error column. I assume they'll be fixed in the next run.
Honestly that site's benchmarks have never been accurate for me in production. But that's how benchmarks work I guess - my bad for bringing up benchmarks! My experience with using Elixir/Phoenix for web applications has been extremely positive performance-wise however, and has surpassed or matched Go, Ruby, PHP, and Python consistently in all production scenarios. For the latter three, the difference has been order-of-magnitude.
It depends on the problem domain as always. If you're doing a lot of naïve single-threaded number-crunching, have fun. But Elixir/Phoenix haven't failed me for web applications, even in very intensive situations. It's the first time I've barely had to do any tuning beyond external factors such as network and database queries (which, by the way, Phoenix's Ecto handles very gracefully and explicitly).
This is my experience with every Elixir/Phoenix app I've worked on thus far. I apologize if I made it sound like some sort of universal truth.
It is hard to say how they measure and what they measure. I guess their "multiple queries" benchmark is the closest to real-world use? (Unless everyone expects all customers to line up and send their requests one after another.)
As in FAQ 1.4, "What sort of problems is Erlang not particularly suitable for? … The most common class of 'less suitable' problems is characterised by performance being a prime requirement and constant-factors having a large effect on performance. Typical examples are image processing, signal processing, sorting large volumes of data and low-level protocol termination."
I'm not sure why you're grouping those four languages, as they overlap very little. Go emphasizes low latency and network-bound services (and, randomly, Docker for some reason, I'm sure a good one); Rust is a C replacement; Swift is for writing iOS/Mac apps; and Crystal is a newborn. I'd actually call Crystal quite close to Go: a high emphasis on concurrency and services. Single-thread performance matters much less when scaling horizontally is mandatory to smooth latency spikes, and I'm betting that on IO-bound work Erlang would be competitive with Go.
I've never used elixir but I assume it has a similar performance profile to erlang as it shares the vm.
I actually had assumed it was cross platform but there was no reason to use it.
I certainly wouldn't invest my code anywhere near Apple unless that was also my market. Who knows what direction it's moving, aside from in Apple's interest. I'll stick with Rust and Go: between the two I get everything but easy Objective-C interop.
Plus, their design decisions with respect to nullability are... interesting. It's gonna feel gimped by legacy needs for a long time.
I listed a few examples of emerging languages to clarify that Elixir is not in the same class when it comes to computational performance on a single machine. The performance profile is indeed Erlang-like.
Keep also in mind that my reply was within the context of a thread on Crystal. OP sort of sold Elixir as a fast language that we can use now while we wait for Crystal to mature.
My point is not that Elixir is useless. My point is that we must not oversell Elixir as a fast language. Generally speaking, it isn't. It excels at horizontal scaling, which is great, but I wouldn't call it "fast" without proper qualifiers.
Elixir gets plugged so often in other-language threads - whether it's Julia or Ruby or, like here, Crystal - that if it wasn't FOSS I'd have decided it's being astroturfed.
I guess it's a good thing that people like it so much, but it's really starting to feel marketing-y by now.
> that if it wasn't FOSS I'd have decided it's being astroturfed.
That's a good sign!
You know why? Because it has a great community and is very friendly to newcomers. Jose, Eric and the rest of the team made that a priority and it shows. It doesn't just mean being nice on IRC; it also means putting usability first, putting more effort into how examples look, how the documentation looks, and so on.
If Google invents a language and then proceeds to push and sponsor it, by paying authors to work on it, organizing marketing, hackathons, etc., then it is hard to say whether it is popular because of Google's backing or because of its own merits.
"Friendly for newcomers"? I tried to write something the other day, and had an Erlang developer friend to help me. The Elixir docs section just says "buy one of these books to get started", which is completely unacceptable, and we (mostly my friend, as I had no idea what anything is) spent an hour trying to figure out how to run a node, with Google not providing any useful answers.
That's as hostile to newcomers as it gets. Contrast this with the Rust book, that gets you from "I have no idea how anything works" to "hey I just wrote a small program!" in a few minutes.
Elixir has a very good getting started guide[0]. I don't know if you have seen this or not. Even in the learning section, it recommends going through the getting started guide; it mentions other books under other resources. I don't know why you think it's hostile to newcomers.
I learned Elixir entirely from the Getting Started guide and then the documentation. Then, for OTP, I read an Erlang book to understand it well. Elixir's documentation is really awesome and one of the best I have seen.
Jesus... I saw that page. What I didn't see is the "next" link at the very bottom, right above the footer :( There's a whole bunch of stuff I haven't read! That page should be better designed... I think I also skipped the sentence below "running scripts", the one that says "chapter 2", because I skimmed later and it was just "here's where you can ask questions".
I have no involvement in Erlang, but I just looked at that page, and as an outsider I would agree that the navigation could certainly be improved. I think a lot of the problem for me is that there is no distinction between the numbers for "Chapters" and for "Sections", and this confused me as to where I was in the progression.
Maybe using numbers for chapters, letters for sections, and a 1.A notation for the headers? At the least, adding the chapter numbers to the header, so it says "1. Introduction"? Putting the chapter numbers in the URL would help too. So would adding some highlighting in the right column index to indicate the location of the current page.
It seems like fantastic introductory material, but only if people can find it. Usually, the first thing I do when I encounter a paginated manual like that is to search for a "single page" or "print" or "PDF" link. Is there one there that I couldn't find? If not, adding one might be a simple (partial) fix.
Their IRC channel was pretty helpful for me. I'm also sure they'd want to hear about your experience so they can improve their docs and tutorials.
Thanks, I seem to have missed the fact that there are multiple pages on that guide. Weird, because I saw that page again just now and thought "how is two paragraphs a good guide?" before noticing the contents on the side...
I have similar feelings about it, but I have chalked them up to the web-dev crowd, with large numbers of Ruby/RoR devs flocking to it. It has the same uptake pattern that Ruby/RoR had at certain times.
Case in point: LFE (Lisp Flavored Erlang) was created by one of the original designers of Erlang, Robert Virding, has great support for a small FOSS project, and has true macros, but the popularity of Ruby has rocketed Elixir way ahead in terms of repositories and users. Erlang Solutions has it on their site, but it is not as touted as Elixir. People go with what they know, and let's admit it: Lisp is a great language, but not as popular with the web-dev crowd, Clojure aside (which I don't see as so Lispy).
From the early looks of it, Pony lang, which comes out of both industry and academia, looks poised to muscle in on Erlang/BEAM/OTP, Elixir and LFE anyway. I personally don't like the syntax, but syntax is not semantics, and you get over it.
Popularity doesn't always win the day if you do something a bit off the main road, and there is potential to earn more researching what you love: look at q/kdb+ devs and jobs, and Haskell has started seeing increasing uptake in fintech. Go with what you like, or as Joseph Campbell said, 'Follow your bliss' and the rest will fall into place.
But don't listen to me. I spend many waking moments fiddling with J (jsoftware.com). Not actually the most loved or known PL out there. I think the array languages J/APL/K/Q will have their day due to where software and hardware are heading: Multicores, array processing (GPU/FPGA hybrids, custom computers).
> Case in point, LFE (Lisp Flavored Erlang) was created by one of the original designers of Erlang, Robert Virding, has great support for a small FOSS project, true macros
LFE had no documentation, no tools, no learning resources for a really long time. Compare that to Elixir that focused on those aspects since day one. Furthermore, LFE has reached 1.0 only recently, almost 1.5 years after Elixir, and that has an impact on industry adoption.
LFE also was, for a long time, literally Lisp-flavored Erlang while Elixir attempted from the beginning to bring its own abstractions such as protocols, collections, Unicode support, UTF-8 strings, and they are still pushing it forward: http://elixir-lang.org/blog/2016/07/14/announcing-genstage/
So I think you are selling both languages short. There is much more happening in Elixir besides the "popularity of Ruby" and there is a lot of potential in LFE now that they are focusing on being more than a "lispy" Erlang.
Not sure why people would discuss it in Julia threads, but for Ruby and Crystal it makes sense: more than 50% of the Elixir community came from Ruby, so it's basically Ruby devs discussing newer alternatives to Ruby.
While the language may not be as computationally performant as some of others mentioned, all the things above lower the barrier to entry for adoption and make Elixir a more attractive language than some of the counterparts. And it's amazing that a language this young has nailed it on these fronts.
Dialyzer uses success types, which don't always let you know when you have a soundness problem. The only thing you can count on is that if it does cry out, there certainly is something you need to fix. It also lacks parametric polymorphism, which means you often lose a great deal of useful type information. The theory behind Dialyzer is impressive, but I was pretty disappointed with it in practice.
Try using -Wunderspecs, -Woverspecs, and -Wspecdiffs. You'll find that Dialyzer catches things that are more the shape of what you'd expect a more typical type-checking system to catch.
That's what Type Variables are explicitly for. They present the necessary semantics for bounded parametric polymorphism. In fact that's really the only reason they exist at all. The rest of the sub-type system works without them, but parametric polymorphism wouldn't work without them.
I don't know if the Erlang/Dialyzer docs cover that specifically, but I know the original paper on Dialyzer and Dialyzer type specs does.
I don't think it aims to replace ruby (they don't even claim any kind of "compatibility" level with ruby, more like "inspired by ruby" I would think), but...I wish it would replace it :)
I don't know if Elixir is mainstream or not, but there are a lot of companies using Elixir in production (including the company I work for). You can find the list of companies at https://github.com/doomspork/elixir-companies
Looks like Play! took #2. I can attest to its efficiency. Just yesterday one of our mobile apps went nuts with repeating requests, and rps shot to 53K for a few minutes, but no one noticed, as our API max response time didn't break 15ms.