Fearless Concurrency: Clojure, Rust, Pony, Erlang and Dart (sites.google.com)
298 points by pplonski86 26 days ago | 143 comments



I'm a bit surprised that there didn't appear to be any mention of Clojure's built-in concurrency support outside of basic immutability. core.async gives a nice channel-based system, agents give an actor-ish system, and STM/atoms let you manage mutable state safely, without having to work with locks manually.

This is definitely a good high-level article, just something I was surprised by, since core.async is what drove me (and several other people I know) to start using Clojure.

EDIT: Just a note that I know core.async isn't built in, that was a mistake. It is a first-party library, however.


Yes, core.async should definitely be mentioned. It's also much more than a nice channel-based system, but you have to dive in to appreciate how good it is. For example, you have to write real systems to appreciate the fact that core.async channels can be used both from "go-threads" and real threads. That's really useful when your async work involves both quickly sending responses to clients (where lightweight go-threads shine) and doing heavy I/O for database updates (where you want a thread to pick up the work). You use the same channels for all types of work, which is really neat.


It’s also worth noting that almost nobody uses agents or STM (except for some highly specific use cases, but I’ve never seen them in years), and core.async is a library, not a part of Clojure (which is a good thing, because it promotes choice and keeps the language small).


I did use agents in the past, but stopped since core.async became available: it addresses the use cases for agents in a much more flexible way. Also, I found that handling errors in agents is difficult.

As for the general use case, atoms and core.async are great tools, and I haven't needed to use refs (aka the STM) for years.


Don't the core.async go-blocks have issues if you need to block on IO heavy stuff, as in the thread-pool will be blocked until the side-effects are done? Agents don't have that problem, at least I don't think they do.


Yeah, this is true; typically you want to keep blocking functions out of go blocks. By default, core.async's go-block pool has a fixed size of 8 threads, so a few blocking calls can starve it.

In addition to this, I personally find that core.async deals poorly with flow control (e.g. slow subscribers slowing down the entire flow for every subscriber), and it seems to have little telemetry; it's difficult to find out what's going on in the thread pool behind it.

I've personally settled on Zach Tellman's Manifold library, which provides a much saner abstraction and is fully compatible with core.async.

For some additional thought and material on these topics, I highly recommend watching this talk by Zach, "Everything Will Flow": https://youtu.be/1bNOO3xxMc0


Hmm, interesting — but manifold is clj-only, and I use core.async both on server side and client-side in ClojureScript.


Well, they don't have "issues", they are simply not intended for I/O — but you can simply do your I/O processing in a (thread) block instead of a (go) block and use <!! instead of <!. The beauty of core.async channels is that one end of a channel can be used in a go block and another end in a thread block, and things will be fine. I use this a lot in practice.

Agents also have a fixed thread pool, and you should use send-off if you intend to do I/O.


All of the core.async operations have blocking versions (e.g. >!! vs. >!), which means you can use your own threads/thread pools for any work, if needed. Basically, you create your own threads instead of using go blocks and then use the blocking operations: >!!, <!!, alts!!, alt!!, etc.


I believe all I/O is blocking unless you use something that wraps NIO. This is definitely true for core.async threads, but I'm not sure why it would be any different for agents.

This is also a bit of a departure for people coming from Go or Erlang/Elixir.


Agents run in a separate pool, and only one action per agent is processed at a time, so I suppose an individual agent isn't "parallel", but the system is still concurrent. I think the agents are preemptive, but I'm not 100% sure on that.


I found this comment enlightening: https://clojuredocs.org/clojure.core/send-off#example-593472...

It seems like there are a lot of parallels between agents and core.async.

Agents, like go blocks, run in a thread pool. Any blocking IO should be pushed off to a separate thread outside of the thread pool. In the case of agents, there is (send-off). In the case of core.async, there is (thread)


I actually use agents semi-often, partly because I come from an Erlang background and they're relatively easy to work with.

I mentioned in an edit that I realized core.async is a library. It's still first-party, so it's still somewhat idiomatic.


I used agents for things like polling (combined with a scheduling library) because the additional plumbing with core.async wasn't worth it. But it's pretty niche.


It was not built in because it COULD be built as a library, unlike in other languages. So I think the argument in this specific case is moot.


Indeed, Clojure also has unfettered access to Java's concurrency primitives.


STM in Haskell is really nice to use also.


52 major open issues, 2 critical issues, and 38 others

The major issues include memory leaks. https://dev.clojure.org/jira/secure/IssueNavigator.jspa?rese...

first-party?


I don't really know what that has to do with anything; every piece of software has bugs. I haven't really encountered any issues with core.async yet. I suspect you could find similar bugs in nearly any popular concurrency library.

By first-party, I mean it seems to be officially part of the Clojure ecosystem, the docs are hosted on the main site: https://clojure.github.io/core.async/


core.async is maintained by Cognitect (the maintainers of Clojure). It is very commonly used, and it has been upgraded over time to mesh well with the rest of the language (notably, support for transducers).


Every commonly used project has a ton of open issues. Look at React, VS Code, Docker, etc.

It doesn't mean it's not production-ready.


What's great about the actor model is that you can kinda apply it in languages without native support for it. Even if it's not a strict implementation of the model, with some discipline from the programmer it can do wonders, and it can be backed by lock-free queues. What I see as a problem is that for a thread to wait on new items, a syscall must be executed, which is costly. But one could spin for a few cycles first and only then fall back to a syscall.


There are a few gotchas with actors:

* they really do need M:N threading (M green threads for the actors, N system threads where N ~= #CPUs),

* message queues have some complexities; it's not hard to get into a situation where messaging costs overwhelm actual work. Also, you need some form of back-pressure to keep a fast producer/slow consumer from killing everything.

Pony is actually a pretty sweet design for all of these. Complicated, though.


I found it interesting that the article mentions actors at all, for the same reason.

It's pretty trivial to construct actors out of some other message passing system. The interesting design choices in doing so are mostly down to the semantics in A) message passing, and B) scheduling/triggering/mapping-onto-green-threads/etc.

And for part A, all the other choices in the first three segments of the article are still effectively the choices that you've got to contend with: copying vs. immutability (bonus: COW mode, but still) vs. ownership semantics.

Or to come at it from the other way: actors are far more featureful than is appropriate to directly compare to mere message passing semantics choices, because actors generally have some concept of error handling as a result of their relationship to scheduling, and that puts them on a whole different field.


That’s a big reason I’m a fan of Erlang: individually, its features are interesting, but collectively they form an amazing system. Greater than the sum of its parts.


Agreed, it's one thing to have good primitives like Go channels, but Erlang/Elixir provide an entire system from which to build a concurrent/async application. Things like error handling, messaging, storage, and structuring your application well for concurrency are basically built into the standard approach you take when building Erlang/Elixir apps.


What's a TL;DR for how Erlang handles errors?

This is something that is usually glossed over or treated as an afterthought, when it is kind of a big deal in real-life programming.


The language is built upon a few core concepts:

* Processes are extremely lightweight, orders of magnitude smaller than operating system processes or JVM threads.

* Exceptions should generally not be handled; instead, those lightweight processes are allowed to fail, and an external supervisor process will re-launch them if appropriate.

* Assertions about the state of the data are effectively enabled on every line of code, so the processes crash early, before corrupting other parts of the system.

* Data is immutable, which plays a role in making those assertions happen.

You could probably strip one or two of those bullet points and/or add a couple of more, but I think that captures the highlights.

I'll toot my own horn. If you find that interesting, this is my favorite talk I've given about the above: https://youtu.be/E18shi1qIHU


It's hardly an afterthought in Erlang. Handling failure is a fundamental part of OTP, which is far easier when everything is wrapped in restartable processes with immutable state.


Yep, I have used Cocoa(Touch) runloops in this way. Very straightforward, particularly if you use a bit of Higher Order Messaging syntactic support:

   [[someObject async] doStuff];
This will schedule the doStuff message to run on the runloop of the actor (=object + runloop) in question.


> What's great about the actor model is that you can kinda apply it in languages not having such native support.

Yep, although not listed in the article, JS would be such an example for me. TBH I am not entirely sure what the difference is between Web Workers and Isolates, but for me it's more or less the same: both allow safe multi-threading simply because the threads have separate heaps.


> TBH I am not entirely sure what's the difference between Web Workers and Isolates

> dart:isolate - This library was an attempt to provide a single API that provides common concurrency functionality across Dart's web and native platforms. While useful in some cases, most users found the isolate API limiting compared to the Web Workers API. The infrastructure for supporting isolates also adds substantial overhead when compiling to JavaScript. In the future, you should use Web Workers to access concurrency on the web.

> Part of the isolate API has never worked (spawnFunction) in JS platforms, and the other (spawnFile) doesn't work in Dart/AOT (Flutter). Neither are able to use transferrable/shared memory objects

> On the other hand, the JS platform has evolved quite a bit since isolates were created:

Web workers

Service workers

Animation worklets

... and other APIs like SharedArrayBuffer

... we'd like to enable our web users to seamlessly use these, with as little overhead as possible

From https://groups.google.com/a/dartlang.org/forum/#!topic/misc/...


In case the author sees this: The indentation of the code samples is all over the place, making it quite hard to read. Even more so for a white-space sensitive language like Pony. Did you mix tabs and spaces?


Yeah, it really makes the code snippets annoying to read. I thought Pony just looked bad in general; then I looked at the snippet of a familiar language (Erlang) and it seemed horrible, too.


Pony is not white space sensitive.


Readers are, though.


Oh. Good to know, thanks. I see that there are `end` tokens, but since I don't know which tokens they correspond to, it's still hard to read.


The Rust examples seemed a little bit oddly structured to me; the case of collecting a value from a single thread would be just as easily solved by returning from the closure and receiving it through join(): https://gist.github.com/lorkki/e7386df8fff186b3e473994e9d31b...

The power of channels could be illustrated better by adapting the next example to show how you can avoid shared state completely: https://gist.github.com/lorkki/e7386df8fff186b3e473994e9d31b...

It's also worth noting that a more versatile channel implementation than mpsc can be found in the crossbeam crate, in case you're thinking of using them for anything more serious.


And there's Eiffel SCOOP, which is always forgotten:

https://en.wikipedia.org/wiki/SCOOP_(software)

https://www.eiffel.org/doc-file/solutions/eth-46802-01.pdf

It was ported to Java in a student project. That means other languages without great concurrency could probably use it with macros or a preprocessor.


I was really excited to discover Pony a couple of years ago; sadly, there is negative momentum with this project. So much potential, yet the world isn't ready for it yet.


I looked into Pony some time ago, and I always found the capabilities thing quite complex, although very interesting. It was hard to imagine using it in a project with developers unfamiliar with the concept, especially since I was mostly working on large web applications where only tiny parts actually used shared state between threads.

TBF I have never written Pony code, I was mostly reading documentation.


Mind explaining what you mean by 'negative momentum'? I'd never heard of Pony before and only took a look at it just now.


The problem with fancy new programming languages is that no big company is willing to back it and then actually use it (this is the most important step). But languages without backing/usage go nowhere and then get donated to the apache or eclipse foundation.


You sure about that? Sylvan Clebsch [1] -- Pony's creator -- is now part of Microsoft Research [2], has been working on a distributed actor model, and is presenting on Pony at QCon London next week [3].

[1] https://github.com/sylvanc

[2] https://www.microsoft.com/en-us/research/people/syclebsc/

[3] https://qconlondon.com/speakers/sylvan-clebsch

Pony: Co-Designing a Type System and a Runtime [video] https://www.microsoft.com/en-us/research/video/pony-co-desig...


unless they invented it, then it is flying monkeys everywhere.


Or Pony is not ready for the world. I mean, they don't even have arithmetic operator precedences [1]. If you're doing things concurrently it must be something uber-important, like censoring cat pictures, but apparently not arithmetic.

[1] https://github.com/aksh98/Pony_Documentation#precedence


>Or Pony is not ready for the world. I mean, they don't even have arithmetic operator precedences

Many languages don't have arithmetic operator precedence (Smalltalk, Lisp, etc) and that's fine.

It's not some feature that's missing, it's a design choice.


In Lisp and Smalltalk it's obvious enough how composite expressions are evaluated, and they don't have the usual infix syntax to begin with. Pony does, and the different precedence rules are just a bad surprise. Insofar as that's a design decision, I'd call it bad design.


>In Lisp and Smalltalk it's obvious enough how composite expressions are evaluated, and they don't have the usual infix syntax to begin with.

That's Lisp. In Smalltalk they do (have the usual infix syntax).


> have the usual infix syntax

Isn't it the case that all operators in Smalltalk have the same precedence and left-to-right associativity? That's not "the usual" infix syntax.


Usual referring to the mere "infix syntax" part (as opposed to some other kind of syntax that isn't infix).

Not to having "infix syntax plus the associativity / precedence you get in C". We've already established that it doesn't have the usual precedence.

The distinction was with e.g. Lisp which has a prefix syntax (polish notation), and other more exotic styles.


If you're going to complain about that, you might as well mention that the run-time system puts the terminal into a weird, pseudo-raw state that makes stdin/stdout act kind of funny, if you're not expecting it.

Does have a nice readline integration, though.


It's too bad Perl6's interesting approaches to concurrency aren't mentioned here. Supporting concurrency and parallelism were key points in the design of Perl6, and in the multi-paradigmatic way of Perl, it provides a variety of tools.

Jonathan Worthington can write and speak about this topic far better than I can so I refer you to his talk and slides:

1. http://www.jnthn.net/papers/2018-conc-par-8-ways.pdf

2. https://www.youtube.com/watch?v=l2fSbOPeSQs


But these explicitly don't mention the two best approaches he chose to ignore and even kill.

First, the Parrot threading model, which provided lockless safe thread pools; and second, the Pony threading model, which provides the same while also supporting shared refs and forbidding blocking IO, which makes it even faster.

His talk ends with the simple approach he took being presented as the best. Which is not only wrong but also a lie, because he had a part in killing off the Parrot threading model.


Reini, get off your crazy horse man! No one in the world, including the author of the Parrot threading model, thinks it is the "best" threading model.


For those of you who want to test multiple concurrency models, I can highly recommend Seven Concurrency Models in Seven Weeks from Pragmatic.


I'll just mention that if you're using C++, the SaferCPlusPlus library[1] supports a data race safe subset of C++, vaguely analogous to Rust's.

[1] shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus#multithread...


Sorry, I do not know any of these languages, but I have used Haskell's STM, which is incredible. Are any of the languages mentioned here close to the STM idea (basically using strong types to ensure mutations happen in an isolated context)?


The idea of STM has nothing to do with strong types. You can have STM with types or without.

Clojure has had a full STM since before version 1.0. A simple example:

(def stm (ref {}))

(dosync (alter stm assoc :testkey "testvalue"))

(println @stm)

See: https://clojure.org/reference/refs


It definitely has a lot to do with types in Haskell though. You can only operate on transactional variables in STM context. The STM action when run is what gives you atomicity and isolation.

Ref page 8-9 in [1]

[1] https://www.microsoft.com/en-us/research/publication/beautif...


An important part of STM is that the retry mechanism requires the transactions to be pure, which the Haskell compiler will check for you.


In fairness to Clojure, you can label your side-effecty functions with `io!`, and it then won't let you use them inside an STM transaction.

https://clojuredocs.org/clojure.core/io!


Technically, STM requires that you're not updating your state via side effects. That's a different proposition from requiring your functions to be pure. For example, you could have a function with a print statement inside your STM transaction just fine.


If you want to go way down that rabbit hole, it's an active topic of conversation in the Haskell community right now how to break up IO into more granular pieces than what the IO type natively provides, where a function is either "pure" or it's a dirty rotten effects producer, and no middle ground in between.

Although, even a stray print inside an STM transaction could actually do you a lot of damage, since "print" is actually a fairly expensive operation. I've written all sorts of programs in my life that were technically bottlenecked not on reading the input, writing the output, or any of the processing in-between, but on the printing it was doing. And, relatedly, every community that tries to write a really blazingly fast web server in their language runs into a bit of a wall around the mere act of logging the hits. (Even just the date computation starts to bottleneck things, but the writing does too.)


That's the difference between Clojure and Haskell mindsets in a nutshell. Clojure approach is to have sane defaults and guide the programmer towards doing the right thing, but ultimately letting them do what they need to. Whether it makes sense to do something or not is context dependent in practice. Ultimately, the person writing the code understands their situation best, and the language shouldn't get in the way of them doing what they need to.

You could of course argue that by preventing the user from doing certain things you avoid some classes of errors. However, I will in turn argue that by forcing the user to write code for the benefit of the type checker often results in convoluted solutions that are hard to understand and maintain. So, you just end up trading one set of problems for another.


> You could of course argue that by preventing the user from doing certain things you avoid some classes of errors. However, I will in turn argue that by forcing the user to write code for the benefit of the type checker often results in convoluted solutions that are hard to understand and maintain. So, you just end up trading one set of problems for another.

Funny, this is exactly the opposite of my take from a type system like Haskell's. The thing is, whether it's enforced by types or not, the same invariants exist in your code. The only difference is that in one case they are checked explicitly at compile time, and the other case they are hidden and can blow up your programs.

If anything, explicit invariants make code easier to maintain and understand.


Except that they're not the same. Static typing restricts you to the set of statements that can be verified by the type checker. This is a subset of all valid statements; otherwise you could just type-check code in any language at compile time.

Static typing makes many dynamic patterns either difficult or impossible to use. For example, Ring middleware becomes pretty much impossible in a static language https://github.com/ring-clojure/ring/wiki/Middleware-Pattern...

The pattern here is that the request and response are expressed as maps. The request map is passed through a set of middleware functions, and each one can modify this map in some way.

A dynamic language makes it possible to write middleware libraries that know absolutely nothing about each other, and compose seamlessly. A static language would require you to provide a full description of every possible permutation of the request map, and every library would have to conform to it. This creates coupling because any time a library needs to create a new key that only it cares about, the global spec needs to be modified to support it.

My experience is that immutability plays a far bigger role than types in addressing the problem of maintainability. Immutability as the default makes it natural to structure applications using independent components. This indirectly helps with the problem of tracking types in large applications as well. You don't need to track types across your entire application, and you're able to do local reasoning within the scope of each component. Meanwhile, you make bigger components by composing smaller ones together, and you only need to know the types at the level of composition which is the public API for the components.


I understand that, but that has nothing to do with STM in general and everything to do with Haskell's solution in particular.


Pony does this with the added benefit that all the checks happen at compile time. I'm only somewhat familiar with STM and think some of the safety checks are at runtime, but I could be wrong.


One of the issues with STM is that you can't use IO inside the atomic blocks. This is easy enough to get around, but if your users are unaware of it, it'll present bugs. Haskell enforces the separation of IO at the type level, so this risk is gone: you get every benefit of STM with none of the risk. STM itself eliminates a set of concurrency problems by the way it works, but you do still have to do some run-time checks to eliminate deadlocks (i.e. use `modify` functions or have the atomic blocks behave like them, or you could end up with a deadlock).


The safety checks are at compile-time, since it only allows pure functions to be run. You can circumvent this of course with unsafePerformIO, but then you are completely on your own.


Rust uses types to ensure mutations happen in a controlled way so as to prevent data races (compile-time checks).


Speaking of concurrency, is there a current environment that does something similar to Linda? I always thought that sounded quite promising, but apparently that generally is a death sentence (I also liked Modula-3, the Palm Pre and Tcl).


I think that creating a Linda engine is a chicken/egg problem. If there are no applications that are using the paradigm, then there's no need for a highly performing engine.


You can also use Actors and elements of the OTP from Erlang in Clojure via certain libraries.


> Its use of dynamic typing makes me a little bit hesitant to use it, though, as I really love the help provided by static typing.

There is Dialyzer, a static analysis tool (for types and more) for Erlang. It is part of Erlang/OTP (i.e., built in).

http://erlang.org/doc/apps/dialyzer/users_guide.html


Yeah as a long-time Erlang dev I would strongly recommend Dialyzer to anyone trying to build a serious Erlang project. It doesn't give you quite the same level of rigor that a strong typing system would, but it is nice to be able to spec out types on an as-needed basis while letting it infer types in other parts of the code. It kind of ends up feeling like a nice middle ground between static and dynamic typing, in my experience.


"The problem is, that sometimes, unfortunately, these tools are just not sufficient, it's still easy to shoot your own foot and get lost in a sea of complexity."

As a non-native English speaker I found this use of commas very difficult to understand. Often I get the feeling that native speakers don't even notice it.


The first comma is simply incorrect. "unfortunately" does need commas around it, but it could be moved to the beginning for a simpler sentence.

The fourth comma is also incorrect and should be replaced by a semicolon.

"Unfortunately, the problem is that sometimes these tools are just not sufficient; it's still easy to shoot your own foot and get lost in a sea of complexity."

And looking at that now, I would also get rid of "the problem is that". "Unfortunately" and "sometimes" back to back lacks flow, so I would also replace "sometimes" (in this case with "often" in place of "just"). In the original, "just not" has a stronger grouping than "not sufficient", which is why "insufficient" wasn't used. Now with "just" gone, I'd swap back in "insufficient".

"Unfortunately, these tools are often insufficient; it's still easy to shoot your own foot and get lost in a sea of complexity."


Just to aid people who are interesting in looking this sort of thing up, the fourth comma in the original is an example of a "comma splice", where the comma is too weak to join the two independent clauses. You can solve a comma splice by either making the two independent clauses into separate sentences, joining them with a stronger piece of punctuation (like a semicolon, colon, or a dash), or by using a conjunction like "and".


British English seems to be more forgiving about comma splices than American English. One certainly sees them much more frequently in British English.


I suspect it may be more the case that many people under 40 in the UK weren't really taught much grammar at all, and even people who have involuntarily picked up fairly decent grammatical skills still struggle with comma splices, which are, admittedly, quite a subtle sort of error whose detection is predicated on a full mastery of the purposes of the various punctuation marks and an understanding of clauses.


It sounds like OP is a non-native speaker too. I'd simplify it to this:

"The problem is that sometimes these tools are just not sufficient. It's still easy to shoot yourself in the foot and get lost in a sea of complexity."

(Hemingwayapp is great for making sure writing is not too complex)


"These tools are often insufficient" ?


"These tools are insufficient" (or: are not enough).

A tool that's "often insufficient" isn't a very good tool - by virtue of often being insufficient - it becomes [ed:simply] insufficient?


Native speakers are often prone to different mistakes than non-native speakers: for example, "would of" (instead of "would have" / "would've"), or confusing "there", "they're", and "their", or "where" and "were". "A" vs. "an" is an interesting case, too.


As the others have said, it's incorrect grammar in English.

It would, however, be correct in German, so that might be it?


I am surprised it does not mention Ada. I really feel that Ada is not getting enough attention; it definitely deserves more. Even if you have heard of the language before, please do check out some of the resources available; you will not regret it!

It has been supporting multiprocessor, multi-core, and multithreaded architectures for as long as it has been around. It has language constructs that make it really easy to develop, say, embedded parallel and real-time programs. It is such a breeze. I admit I am not quite sure what they mean by "fearless", but if it means that the language lets you handle concurrent programming safely and efficiently, well, then Ada definitely has it.

Ada is successful in the domain of mission-critical software, which involves air traffic control systems, avionics, medical devices, railways, rockets, satellites, and so on.

Ada is one of the few programming languages to provide high-level operations to control and manipulate the registers and interrupts of input and output devices.

Ada has concurrency types, and its type system integrates concurrency (threading, parallelism) neatly: protected types for data-based synchronization, and task types for concurrency. They can of course be unified through the use of interface inheritance, and so on.

If you are interested in building such programs, I recommend two books:

https://www.amazon.com/Building-Parallel-Embedded-Real-Time-...

https://www.amazon.com/Concurrent-Real-Time-Programming-Alan...

...other good resources:

https://en.wikibooks.org/wiki/Ada_Style_Guide/Concurrency

https://www.adacore.com/uploads/books/pdf/AdaCore-Tech-Cyber...

The last PDF will summarize in what ways Ada is awesome for:

- contract-based programming (with static analysis tools (formal verification, etc.))

- object-oriented programming

- concurrent programming

- systems programming

- real-time programming

- developing high-integrity systems

and a lot more. It also gives you a proper introduction to the language's features.


We learned some Ada in school. I really liked it, but the free toolchain was poor (hard to get working properly, unintuitive) and the community stubbornly defended its various idiosyncrasies, like its verbose Pascal syntax and its homegrown project file format. Most importantly, it just didn't have much of an open source ecosystem, and the community was pretty hostile and defensive toward newbies. But yeah, the language was neat! :)


When did this happen? Have you checked its current state? It has been growing ever since. There are dozens of tools available today for free, and it is very easy to set up.

I agree that its open source ecosystem needs to grow, but for that we do need more Ada programmers! :P

By the way, I am really sorry if you experienced hostility from the community. May I ask where it took place? I had similar experiences with a variety of communities, even Rust. I try to not be demotivated from the language itself, after all, it is not particularly the language's fault, and there are people like that everywhere. They need to learn that what is obvious to them is not necessarily obvious to other people, and asking is a sign of wanting to learn, which I believe is a good thing. :)


Hmm, circa 2012. I've checked in on it a handful of times in the ensuing years, but I came across Go in 2014 and it ended up suiting my needs almost perfectly (rapid application development, simple, great performance, mostly safe, fantastic tooling/ecosystem, zero-runtime-dependencies, etc).

As for where the hostility took place, it was mostly Ada proponents who would pop up in /r/programming, here on HN, etc. I'm sure the circumstances select for the most toxic folks from any community, but it seemed especially potent from Ada folks (could have been bad luck, ymmv and all that).

Would love for Ada to modernize and improve tooling/ecosystem, but between Go and Rust, I'm afraid that the advantages for a modern Ada might be marginal. It seems unfortunate for Ada that it didn't modernize prior to 2012; it could have eaten both Go and Rust's lunch before they even existed.


> it seems quite a lot easier to manage concurrency in Clojure from my biased point-of-view.

Assuming you know that none of the code in your concurrent blocks is side-effecting, which may be hard to know if it's not your code; Haskell checks this for you, so to me it seems easier in Haskell, not harder, because I don't have to hunt down and read all the source code to know whether what I'm doing has concurrency bugs or not.


I'd also like to mention cooperative concurrency. I know the author kind of skipped over this from the first sentence of the article, where he defined a concurrent program as having more than one thread of execution. But with coroutines, we can have a single-threaded application that handles multiple tasks concurrently too. In fact I believe this is much easier to reason about for the programmer: the kernel won't randomly decide this thread's time slice is up in the middle of an important data operation; the programmer knows exactly when an operation may cause the current task to be "descheduled" because those points are usually explicitly tagged with the "await" keyword. In my experience, this model eliminates the vast majority (but not all) of the use cases for locks and other synchronization primitives. It is also very performant.

Take a look at Python's asyncio, which has been in the standard library for years, and you'll see how easy it is to write concurrent networking code without threads and with very very few uses of locks.


Did you just say single threaded concurrently reasoned applications are more performant than multithreaded applications?

Trying to reason about how this could be possible. I'm still amazed by the performance you can get out of async non blocking style application code so don't take this the wrong way, truly trying to understand here.

I'm also a little confused about threads being rescheduled by the OS being treated like an inconvenience, to me it's a feature that prevents one application being too hoggish.

I can see your point that it's nice to have these labelled with await, but I'm struggling to pinpoint a time I've asked myself what thread scheduling might mean for my code. Other than just assuming that another thread could be anywhere doing anything, which I think is the correct approach, no?

Locks btw, totally non issue with the right API/Lang support.


No I didn't say "single threaded concurrently reasoned applications are more performant than multithreaded applications" because I was highlighting this style of concurrency from the programmer's perspective. But yes, for certain types of applications it can be more performant.

Naturally, if your application is computationally intensive, a single thread can't compete with a multithreaded application. But for applications that use a lot of slow, blocking I/O, converting them to a single-threaded application that uses non-blocking I/O is a significant reduction in overhead. Compare a traditional web server that uses one thread per request and blocking I/O versus one that uses an event loop and non-blocking I/O; you will see why the latter is more efficient with system resources. Again, this isn't a panacea, and for some applications you do have to use threads. I'm just pointing out an omission in the article.


Thanks for the clarification I see your angle now.

I'm not trying to debate on this, pretty sure we're just reasoning about facts we both agree on here.

I was aiming for a more general stance, but if we're talking about web servers there is one thing we might disagree on: the net benefit of non-blocking in that specific scenario.

Let me see if I can explain: the individual requests may contain a lot of slow, blocking I/O, but these are handled asynchronously with a thread per request at the server level. So the request and its internal blocking are limited to that request only (ignoring thread pool limits).

While non-blocking might produce more performant request-level handling in some cases, most (or at least a substantial amount) of web request logic is very dependent on chaining those blocking I/O calls one after the other, e.g. get thing, modify, return.

In the blocking model you execute and reason sequentially; in the non-blocking model you reason with callbacks or syntactic sugar around promises, precisely because sequential programming is easier to reason about.

The point of difference being that in non-blocking you have a lot of overhead in the event loop and in the implementation, all to produce sequentially executing code that doesn't block other requests.

This is sort of why me and a few colleagues came to the conclusion PHP is still pretty damn decent for web stuff.

Non-blocking is great for concurrency where it actually results in parallelism, but if it doesn't then there isn't much gain other than added complexity.

All this needs to be taken with a grain of salt. Node has async/await for a reason, and I've seen a few different implementations of non-blocking PHP as well.

The choice really comes down to which style you find your self benefiting from on the regular.

Bleh that needs about three rounds of editing before it makes sense, something I don't have time for. Sorry for the ramble!


it's a two-edged sword. if you can trust the application code to effectively schedule itself then it's good, but you can also get bugs and perf issues where a synchronous chunk of code blocks progress on everything (see js blocking the main thread and causing the ui to freeze)

the pro is that your async code explicitly defines points where context switching is okay since you're blocking on something anyways. this could be good for perf if context switching in the middle of a synchronous operation is expensive.

the con is that your async code might not cede control often enough to allow other coroutines to make progress.

so yes, you can have something hogging the runtime but in the context of an application that you control as a whole this is something you can avoid/fix if necessary.

at the OS level this might not make sense because you have to assume that applications are adversarial and will try to hog cpu time...


What about picking jobs from an SQL table and writing back with a check? I never had any of the problems specified on a multi-core/multi-process system... or have I just been lucky, or blessed by ignorance?


That's an example of shared-nothing message-passing system, which is a perfectly valid solution. It's implementable in all the aforementioned languages.


I am not sure but I think Go channels were inspired by CSP (communicating sequential processes) which is not inherently unsafe though.


from the article:

> Notice that this article does not include Go, a language that admittedly has an elegant concurrency solution as well (Go channels) because that solution is not actually thread-safe - it's not very hard to have race conditions in Go or corrupt state because Go does not enforce a separation of shareable and not-shareable mutable state.


My vote for Go's #1 mistake isn't the popular "lacking generics", but missing the opportunity to have fearless concurrency, and making it so that goroutines can only communicate via immutable shared values and copied values. If you want to optionally and explicitly penetrate that barrier sometimes, I'd be OK with that, but this should be the default.

Generics will probably be added after the fact, 10 years after the 1.0 release. They may not be quite as slick as something that was in there from the beginning, but based on the many other languages that added them after the fact, it'll probably work out well enough. Fearless concurrency, on the other hand, can't be added to a language; if it's not in there from the beginning it'll never be added. It's a change so big it's almost automatically a new language. And, as can be logically deduced from that statement, I am aware that it would have had additional effects on the language, such as introducing immutability; it wouldn't simply be what we now know as "Go" with a minor tweak. I'm comfortable with that. Even with Go's focus on simplicity, I do think there was a slight error here, toward being a bit simpler than the problem permits.

(I do pretty well with Go's concurrency, but I was first trained brutally by Erlang and Haskell. I see those who don't come by the same route have more trouble.)


So you didn't understand what I said? CSP is not unsafe. What is your point of quoting if you don't even understand what I said?


That's incorrect though. It's impossible to have data races with Go channels.

The example the author gives doesn't use them.


Man the indentation in this article is confusing me.


technically node achieves it as well.


I have a, perhaps unjustified, concern at the loss of fear around things like concurrency. People keep substituting appeasing the compiler with actual thinking around hard problems. But until the halting problem is solved at a compiler level (at a level that can account for runtime cases), you should always have a healthy amount of fear when writing concurrent code.

Fear, and respect, the absurdly difficult challenge that is writing correct concurrent code, even when your compiler is helping you out.


I find this line of thinking to be a holdover from an earlier age. One could replace the word "concurrency" in the above comment with "memory safety" to express the popular sentiment as of the 1980s--but in the decades hence the vast, vast majority of programmers have come to be able to completely ignore concerns related to the careful management of allocating and deallocating memory, and though we can argue that the result is a proliferation of memory-hungry Electron-like apps, on balance it's been a dramatic victory for letting people focus on solving the problem at hand rather than distract them with tiresome pointer-juggling.

It's true that in the 90s it was important to cultivate a healthy fear of concurrency in the same way that parents in Cambodia must sadly teach their children to fear landmines. However, there's nothing inherent in the problem space that dictates that the concerns of one's ancestors must be the concerns of their descendants. One day we hope the landmines in Cambodia will be cleared, just as we hope the landmines in concurrent programming will be, and I'll be thankful when that day comes.


I like to say that threads & locks makes concurrency an exponential problem, but the fearless concurrency solutions make them a polynomial problem. Still a problem, but no longer insane, incomprehensible, and impossible.

Personally I suspect generalized concurrency is always going to be something that "professional" programmers deal with, and non-professionals will only get pre-canned solutions for particular problems, because the general case will always be a bit more complicated than non-specialist programmers are going to want to handle. I think concurrency is worse in terms of what you need to keep in your head than raw pointers are, and those are already too much for a non-specialist.


Good comment. I will note, though, that there were forms of safer concurrency going way back. PL designers and developers mostly just didn't use them. The main exceptions in industry were Ada Ravenscar and Eiffel SCOOP. Concurrent Pascal deserves mention as forerunner.


How do you solve for deadlocks at the compiler level? Even if all your memory access is perfectly safe, you can still deadlock on external resources if you aren't paying attention.

That's what I mean by fear and respect for concurrent programming. That's the problem that hasn't been solved.


Deadlocks are a solved problem. Technically, they can't even exist in any concurrency model that doesn't share anything. What can exist is processes waiting for messages from each other, but that's not a deadlock; it's valid behavior, and is only potentially problematic without timeouts. Asynchronous message passing with event-driven/reactive semantics further enforces the impossibility of blocking while waiting for a specific message. In practice, strict event-driven semantics are not necessary for this to never be a problem.


Deadlocks are not restricted to shared memory communication. Two Unix processes talking via a socket pair can trivially deadlock (for example, if they are both blocked waiting for the other side to speak first).

Also asynchronous systems can deadlock as well, it is just much harder to debug as the debugger won't show an obvious system thread blocked on some system call; the deadlocked threads of execution still exist but will have been subject to CPS and hidden from view (just some callback waiting forever on some wait queue).


It's not useful to use the same term for very different kinds of things. Shared resource deadlocks are common and disastrous problems. Share-nothing mutual blocking is uncommon, not necessarily a problem at all, and can be completely harmless and automatically recovered from when it is a problem. For example, spawning actors to wait without timeouts would be absolutely OK; the parent can do all the timing out and kill the children.

Two processes blocking on a socket is not a deadlock. Surely there are timeouts on both sides, because using sockets without timeouts is just ignorance, and both will just time out and move on.

Also, strictly statically declared event handlers per actor are 100% free of mutual blocking and deadlock, because they can't wait for messages in a way that blocks other event handlers.


The term deadlock has been used for message passing issues since the dawn of time. It is literally the same issue.

Using timeouts to paper over issues is just wrong. I accept that timeouts are necessary to deal with network issues (and a timeout should cause the connection to be dropped, so won't solve the deadlock issue), but certainly they are not required for in-application message passing.

Finally, if an actor won't send a message until it has received another one, I fail to see how statically declared handlers will help.


> Finally, if an actor won't send a message until it has received another one, I fail to see how statically declared handlers will help.

Think of it as reacting to messages, not waiting. In that model actors of course can react by sending messages, but can't have a special waiting state for specific messages, making it impossible to block other handlers. I'm not sure why this is hard to understand.

We do have this problem solved in every possible way. But it's so not a big deal with the actor model that there is no point sacrificing any flexibility for it.


Forget about waiting. Think about state machines. Let's say there is a rule that, if the machine is in state S1, on reception of message M it sends message M and moves to state S2. This is the only rule for state S1. Now if two actors implementing this state machine and exchanging messages find themselves in state S1 at the same time, they are stuck. This is a bug in the state machine specification, of course, and I would call it a deadlock. How would you call it? How would the actor model statically prevent you from implementing such a state transition rule?

Edit: BTW, not sure why you got downvoted.


This is why I'm talking about specific model with static handlers per actor, where you can't choose handlers dynamically depending on the state you are in. Whether you are on state S1 or S2, all handlers are still able to receive messages, what they can't do is run at the same time.


It can receive all the messages you want, but if the only message that would cause a state transition (and send out a message) is M, then it is still stuck.

I mean, I'm no expert, but I guess you could statically analyze the state machine and figure out, given a set of communicating actors, which sequences of messages would lead to a global state from which there is no progress. I assume that, because message ordering is not deterministic, the analysis is probably not easy to do.


Well, this is the limit all models have. You can abuse memory safety the same way and use indices into bounds-checked arrays as raw pointers, for example.


>holdover from an earlier age

People have stopped using Java and Go?


> Fear, and respect, the absurdly difficult challenge that is writing correct concurrent code, even when your compiler is helping you out.

There are plenty of safe and easy models for writing concurrent code. Here's a famous one that's easy to overlook:

    gzip -cd compressed.gz | grep error
On Unix, this doesn't use a temporary file. It creates two concurrent processes. The first decompresses a file and writes it to a pipe, and the second reads from a pipe and searches for a string of text. You could call this "coroutines over a stream," I suppose.

And of course, people have been writing shell pipes for decades without concurrency errors. Unix enforces process isolation, and makes sure all the data flows in one direction.

Now, there's no reason a programming language couldn't enforce similar restrictions. For example, I've spent the last few years at work writing highly concurrent Rust code, and I've never had a single case of a deadlock or memory corruption.

One slightly trickier problem in highly concurrent systems is error reporting. In the Unix example, either "gzip" or "grep" could fail. In the shell example, you could detect this by setting "set -o pipefail". In Erlang, you can use supervision trees. In Rust, sometimes you can use "crossbeam::scope" to automatically fail if any child thread fails. In other Rust code, or in Go, you might resort to reporting errors using a channel. And I've definitely seen channel-based code go subtly wrong—but not necessarily more wrong than single-threaded error-recovery code in C.

With the right abstractions, writing concurrent code doesn't require superhuman vigilance and perfection.


Yes there will still be fear sometimes, but the point of the compiler is to help people to be fearless sometimes.

The halting problem does not need to be solved, since no program in practice runs forever. In some environments, like C, we don't have a way to specify finite things whose cases would be enumerable and whose safety could be guaranteed. Being fearless some of the time greatly reduces the mental workload of programmers and of the people who read other people's code.


A lot of the fear around concurrency comes from hard won experience. Shared memory concurrency is _really_ hard to get right, and the results are often streams of execution that are very perplexing. Message passing concurrency isn't without pitfalls, but the pitfalls are fewer, and the debugging usually more clear -- your stuck processes/threads usually indicate what they're waiting for, and when you find two threads waiting for a message from the other, you know you messed up the ordering. Of course, the message passing infrastructure still has some tricky concurrency problems itself, but that's a smaller, more tractable problem.


> But until the halting problem is solved at a compiler level (at a level that can account for runtime cases)

Idris can already check for totality (i.e. guarantee its programs will halt) without solving the (unsolvable) halting problem. So we're almost there!

And even in Haskell, you can write certain classes of concurrent code without thinking thanks to 1) its great RTS and concurrency libraries and 2) the fact that Haskell programs compose so well: If I have thread safe programs A and B, Haskell gives me a variety of ways to compose their results in such a way that thread safety is always preserved.


I'm not sure I understand the argument, why is the halting problem even important here? Surely a compiler can split loops into ones that halt so they only run a little bit at a time. But you don't necessarily need a compiler for that, OS can help too and, for example, invoke a signal handler from which you can stop currently running actor. Actors can be safely stopped, destroyed, preempted, etc. without breaking other actors.


It is impossible to ever handle concurrency completely at the compiler level. We can handle them at the runtime level (via things like transactional semantics and timewarps), though the overhead can be great.


Whether or not this is "fearless" depends on your point of view.

As a long-time pthreads programmer, these systems feel like skittish concurrency. It's for folks who are too afraid to use the underlying concurrency primitives, like shared memory, synchronization of some kind provided by OS, threads.


> As a long-time pthreads programmer, these systems feel like skittish concurrency.

I interpreted "fearless" to mean the opposite of "defensive programming." You have to be pathologically defensive when coding with basic synchronization primitives. These languages provide an alternative that frees you from that mindset. I only have experience with Erlang/OTP but it has been mind expanding when it comes to writing complex, robust, concurrent code. In fact one of the core Erlang style principles is "do not program defensively." Try that with pthreads!


Rust's primitives are all thin wrappers on top of libpthread. The difference is that the type system ensures you don't forget them, and makes it harder to misuse them (such as accessing data without locking its mutex, or mutating shared memory in a non-atomic way).

And even if you think you never make mistakes, and it's always the other programmers who are the problem, Rust keeps other peoples' code in check too.


Isn't it better to let the computer handle things humans are bad at unless you have a real good reason to muck around in the lower level?


No, because that would make sense. You MUST shoot yourself in the foot at absolutely every opportunity that presents itself or you're too afraid.


Real programmers use Assembly/C/Fortran 77 for everything.


Couldn't the same argument be made for manual memory management? Couldn't I say that GC'd languages are for "people too afraid to handle pointer juggling"?

I've done a fair amount of pthread work, and I coded myself into enough hard-to-debug issues in C that when I discovered I could fairly-easily structure my stuff around ZeroMQ, and then later discovering a book on different process calculi, I decided that there's no way I can go back to doing the low-level stuff. I'm just not smart enough to handle any kind of elaborate threaded programming.


Nobody who promotes GCs calls it “fearless memory management”. So no. The same thing cannot be said for GC.


Exactly. Because the recorded history shows those constructs are nearly impossible to use correctly.


Yes. Having spent too much time debugging code by people who thought they were clever enough not to need language safety, I want language-level protection.

(Did the Python crowd ever fix the race condition in CPickle?[1] They were in denial about this about eight years ago when I reported it. Doing multiple CPickle operations in separate threads can crash CPython. If you search for "CPickle thread crash" you find many reports of hard to reproduce problems in that area.)

[1] https://bugs.python.org/issue23655


I don't see a denial in that bug report - they're just saying that they don't have an actionable repro, and don't know how to get one.


False. Lots of software uses pthreads and its direct wrappers.

Correctly? What does anyone ever do correctly in software? Does anyone write GUIs correctly? Does anyone do databases correctly? Proving even simple software correct is an enormous endeavor that most pragmatic programmers simply never have the luxury to do. So proclaiming that threads are bad because nobody does them “correctly” is setting an arbitrarily high standard. Or maybe you just don’t know what that word means.


Are you saying that buggy software is good because it’s widespread, and then trying to ad hominem me? Because that’s adorable both ways.


While those primitives you mention make it possible to get a correct result, they require consistent and correct use from the programmers. While some of us may do this perfectly most of the time, bugs are inevitably created. This is reality.

Fearless concurrency means guaranteeing many of those problems cannot occur. That frees up brain cycles, eases maintenance, reduces bugs, and perhaps even opens up concurrency to a wider audience or set of use cases.


Every Turing complete primitive makes it possible to get an incorrect result. Every Turing complete primitive inevitably generates bugs. That's the reality. Singling out threads because they also obey that basic law of computer science is a little selective, don't you think?

I understand what the thing they call “fearless concurrency” is. And I will repeat again: it’s skittish concurrency. It’s for fearful people.


On the other hand, a powerful concurrency abstraction can make it much easier to reason about a large system. But I agree that the “fearless” part is mostly unjustified.



