Concurrency is not Parallelism (it's better) - A talk by Rob Pike (googlecode.com)
174 points by luriel 1809 days ago | 54 comments



Great to see this coming from non-functional-language folks (after years of banging on about it in Haskell land: http://ghcmutterings.wordpress.com/2009/10/06/parallelism-co... )

Of course, concurrency is not the end of things. Deterministic parallelism is a beautiful model for making things faster, easily. So you need that in your language somehow too.

Edit: there's a cite for Bob Harper on this, http://existentialtype.wordpress.com/2011/03/17/parallelism-... , which was a bit later than Simon's post.


The process isolation model that CSP provides is generally useful within languages with mutable state.

In particular, the Kilim library for Java uses "microthreads" (similar to goroutines) plus a linear ownership transfer system such that only one microthread owns a given object/object graph at a time, preventing concurrent state mutation while still providing mutable state. Sending an object from one Kilim task to another transfers ownership, and thus messages become the mechanism by which state is isolated at the process level.

Mutable state can be safe too. You just need the proper abstractions.
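
In Go the same idea is a convention rather than a static guarantee ("share memory by communicating"): send a pointer over a channel and agree that the sender never touches it again. A minimal sketch of that convention:

    package main

    import "fmt"

    type job struct{ items []int }

    func main() {
        handoff := make(chan *job)
        go func() {
            j := &job{items: []int{1, 2, 3}}
            // Ownership travels with the message; by convention the
            // sender must not touch j after sending it.
            handoff <- j
        }()
        j := <-handoff
        j.items = append(j.items, 4) // safe: this goroutine is now the sole owner
        fmt.Println(j.items)
    }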


This sounds like runtime safety, not compile-time safety. Otherwise, how can you avoid aliasing that modifies the object for which you lost ownership?

Immutable state gives you compile-time-safety. I think when people say "shared mutable state is unsafe", at least some of them are referring to compile-time safety. If it compiles, these kinds of concurrency-related bugs are impossible.


At least in Rust, unique pointers should be guaranteed by the compiler, so when you pass one to another thread you know there's no aliasing.


AFAIR Kilim guarantees statically checked safety, exactly because the "messages" that can be sent across actor boundaries are guaranteed to be aliasable XOR mutable.



Robert Harper made the distinction between concurrency and parallelism in a paragraph (freely available text, to boot!):

"Parallelism should not be confused with concurrency. Parallelism is about efficiency, not semantics; the meaning of a program is independent of whether it is executed in parallel or not. Concurrency is about composition, not efficiency; the meaning of a concurrent program is very weakly specified so that one may compose it with other programs without altering its meaning. This distinction, and the formulation of it given here, was pioneered by Blelloch (1990). The concept of a cost semantics and the idea of a provably efficient implementation are derived from Blelloch and Greiner (1995, 1996a)."

39.5 Notes, Practical Foundations of Programming Languages, Version 1.30 Revised 03.15.2012

http://www.cs.cmu.edu/~rwh/plbook/book.pdf


Having been schooled by other HN readers on parallelism vs. concurrency in the past, here's my way of distinguishing them:

Concurrency is a programming-language issue; parallelism is a resource-allocation issue.
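
A rough Go sketch of that split (the goroutine count and GOMAXPROCS value are arbitrary example numbers):

    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    func main() {
        // Concurrency: the program is structured as eight independent
        // goroutines. Parallelism: how many OS threads execute them at
        // once is a separate resource-allocation knob.
        runtime.GOMAXPROCS(4)

        var wg sync.WaitGroup
        for i := 0; i < 8; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                fmt.Println("worker", id)
            }(i)
        }
        wg.Wait()
    }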

EDIT: Well, this appears to be stated pretty clearly in the presentation, but I only got the presentation working after trying it for the 3rd time (in the 3rd browser).


I'm loving the gopher drawings, although I found it hard to maintain interest after the gophers were gone, even though the material still made sense.

We need more cute illustrations of CS topics.


_why?


There are several types of learning, and some people are better at (or prefer) one over the others. Some people do well with text, others with pictorial representations, others with demonstrations and talks, and some learn by doing/failing/doing again.

I personally am good with pictorials and the do/fail iteration styles of learning. Text is passable, but I am usually translating it to pictures in my head. There have been times where I have gotten more out of one crappy but representative picture than I have out of an hour lecture. Good pictures that engage the viewer are even better. In this case I was easily able to imagine the gophers walking along doing their tasks, and just "seeing" the flow because the picture was pretty representative. I could have gotten the same message from boxes and arrows, but not as quickly I think.

So why should we have more good diagrams? Because they would help more people of a certain type learn to program more easily.



Oh. Well, either way, the illustrations in the Ruby book are just comics as a cute way to make side points, but the pictures themselves don't actually convey information in the same way as the gopher diagrams in the talk. Please feel free to reinterpret my comment retroactively to be about how that is the case :)


Didn't do readers any favors by pointing at the title slide, where the controls need to be discovered by accident. A better link is the unfestooned

http://concur.rspace.googlecode.com/hg/talk/concur.html


Well, that's the weirdest thing. I've seen this kind of thing before, so I guessed that the left/right arrow keys would work. But the page asks my browser to store information. If I ignore the question and just let the query continue to display at the top of the window, I can navigate fine. If I say "Not now," I'm stuck on whatever page I was on when I answered, and have to reload to be able to move again. Ubuntu 11.10, FF recent.


The gophers remind me of Dwarf Fortress. Moving stuff around in piles and doing it in an efficient manner so your fortress has no bottlenecks.

So for a good lesson in concurrency and parallelism, play Dwarf Fortress.


try spacechem


On the "Goroutines are not threads" slide, it says, "When a goroutine blocks, that THREAD blocks but no other goroutine blocks."

I hope that's a typo. Blocking a thread whenever a goroutine blocks would exhaust the worker thread pool pretty fast. Can a Go expert clarify?


I think the point he's trying to make is that if you make a blocking system call, it blocks the thread that goroutine is scheduled within, but other goroutines will not be blocked by this and will continue to execute.

This is not the case with many other (single-threaded) systems built exclusively around a god-loop that schedules coroutines. One coroutine making a blocking system call gums up the whole works.
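
A rough illustration (the pipe read stands in for a blocking system call; whether a given call actually pins an OS thread depends on the runtime version):

    package main

    import (
        "fmt"
        "os"
        "time"
    )

    func main() {
        r, w, err := os.Pipe()
        if err != nil {
            panic(err)
        }
        defer w.Close()

        // This goroutine parks in a read that never returns, since
        // nothing is ever written to the pipe.
        go func() {
            buf := make([]byte, 1)
            r.Read(buf)
        }()

        // Other goroutines carry on regardless.
        for i := 0; i < 3; i++ {
            go func(id int) {
                fmt.Println("goroutine", id, "still making progress")
            }(i)
        }
        time.Sleep(100 * time.Millisecond)
    }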


Goroutines are threads though (in the sense of preemptively scheduled concurrent execution sharing memory), they just aren't OS threads.


But they aren't "preemptively scheduled concurrent execution sharing memory".

They are "cooperatively scheduled concurrent execution sharing memory".


The current implementations are cooperatively scheduled, but nothing in the specification prevents a preemptively scheduled implementation. After all, gccgo had a preemptively scheduled implementation until a few months ago.

That being said, I prefer my goroutines to be cooperatively scheduled.


When Go was first presented, Rob said the term "goroutine" was introduced precisely to avoid this kind of confusion, but old habits are hard to kill, and it seems even Rob himself still falls into this trap from time to time :)


No, he didn't. It's not a typo. :-)


Similar/analogous (I believe these all have significant differences) to F#'s MailboxProcessor, Akka actors, Erlang/BEAM processes, and userspace/green threads in e.g. GHC.


But the title of the slide said, "Goroutines are not threads."

I understand what you meant, but it's kind of confusing to use the same term for different things within the same slide while claiming they are different.


Yes, that is a typo. Goroutines are multiplexed to OS threads, but this is done transparently, you don't have to worry about it.

You can have tens of thousands of Goroutines without any problems.
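
For instance, this sketch launches 100,000 goroutines and typically completes in well under a second on ordinary hardware:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        const n = 100000
        var wg sync.WaitGroup
        results := make(chan int, n) // buffered so senders never block
        for i := 0; i < n; i++ {
            wg.Add(1)
            go func(x int) {
                defer wg.Done()
                results <- x * x
            }(i)
        }
        wg.Wait()
        fmt.Println(len(results), "goroutines completed")
    }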


No, that is not a typo. A goroutine that blocks by making the thread it's currently running in block in the kernel is blocking that thread too; it must, since the kernel needs some context for the ongoing operation. What he's saying is that the pool of other threads used by the scheduler can continue to run other goroutines, just not the one that's blocked. No typo.


But _how_ does this actually happen?

I have seen this same idea ("goroutines are very light and multiplexed onto OS threads, don't worry about it") stated often, but I do not understand it, and I have not found any place where the implementation is actually discussed. Do you have any pointers?


The post you're replying to is wrong, it's not a typo; see my reply to it.

For background on the CSP languages, have a browse of Russ Cox's notes at http://swtch.com/~rsc/thread/ (the reference [4] to Squeak there is broken; it's now at http://research.microsoft.com/pubs/67508/squeak.pdf and well worth a read).

For the general case, not Go-specific, there's a scheduler that tracks which processes (in the CSP sense, e.g. goroutines) are waiting on which channels. Whenever a channel becomes ready, e.g. something is available to read from it, one of the processes waiting to read from it is non-deterministically (that's important) chosen, and control (the CPU's program counter, still in user space) jumps to where it returns from the `read'. It then has the CPU all to itself until it hands control back to the scheduler by needing another particular channel state. I'm omitting sleeping, etc., for simplicity. The switching of control is all in user space, so quite lightweight.
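
In Go that non-determinism is visible directly in select, which must choose pseudo-randomly among ready cases; a tiny sketch:

    package main

    import "fmt"

    func main() {
        a := make(chan string, 1)
        b := make(chan string, 1)
        a <- "reader woken via a"
        b <- "reader woken via b"

        // Both channels are ready, so either case may be chosen first;
        // run this repeatedly and the output order varies.
        for i := 0; i < 2; i++ {
            select {
            case m := <-a:
                fmt.Println(m)
            case m := <-b:
                fmt.Println(m)
            }
        }
    }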


Ted Stevens had it wrong: The internet is a series of gophers!

More relevantly - definitely going to come up with more stories/silly pictures for presentations in the future. It channels the memory technique of making ridiculous associations that actually stick.


I've started to dive into Clojure over the last week, and I have been looking at how to implement an async WebSocket server.

It seems the Clojure way is to use something like Aleph (https://github.com/ztellman/aleph, http://blip.tv/clojure/zach-tellman-aleph-a-framework-for-as...), a library that implements channels like Go's; however, Go's channels are first-class constructs built into the language.
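
First-class here means a channel is an ordinary value: it can be returned from a function, stored in a struct, even sent over another channel. A small sketch of the familiar generator shape:

    package main

    import "fmt"

    // generator returns a receive-only channel as a plain value.
    func generator(msg string) <-chan string {
        c := make(chan string)
        go func() {
            for i := 0; ; i++ {
                c <- fmt.Sprintf("%s %d", msg, i)
            }
        }()
        return c
    }

    func main() {
        c := generator("hello")
        for i := 0; i < 3; i++ {
            fmt.Println(<-c)
        }
    }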


To be fair, with Lisps there's no real difference between a feature from a library and a feature provided by the language.


But what are we to do in a world built, at its lowest level, on imperative languages (C) without intuitive support for either concurrency or parallelism?

Does Google foresee a future entirely built in Go, the same way NeXT expected the world to evolve towards Objective-C?


If you need to rely on C libraries, it seems to me the best use of Go is as a coordination language. Use it to distribute tasks that are then executed by isolated, shared-nothing C code.
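
A minimal cgo sketch of that shape (c_work is a hypothetical stand-in for a real C library routine):

    package main

    /*
    // A stand-in for isolated, shared-nothing C code.
    int c_work(int n) { return n * n; }
    */
    import "C"

    import "fmt"

    func main() {
        jobs := make(chan int)
        results := make(chan int)

        // Go does the coordination; the work itself runs in C.
        for w := 0; w < 4; w++ {
            go func() {
                for n := range jobs {
                    results <- int(C.c_work(C.int(n)))
                }
            }()
        }
        go func() {
            for i := 0; i < 10; i++ {
                jobs <- i
            }
            close(jobs)
        }()
        for i := 0; i < 10; i++ {
            fmt.Println(<-results)
        }
    }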


Why does the whole world have to be built that way? If the next time you started a new web app project you built it this way, you would see the advantages of this approach in your web app.

If there are equivalent ways of designing for concurrency in Java or C# (or those ways are added to the language in the future), then you may be able to restructure existing Java and C# code bases to take advantage of this approach.

If you have 20 million lines of Cobol transaction processing code and there's no practical way of migrating your code base to use this methodology, then your 20 million lines of Cobol will keep running the same way they always have.

Unless I'm missing a point you're trying to make (certainly possible), there's no reason to think that this is an all-or-nothing approach.


C# 4.0 actually has all the pieces you'd need to write Go-style concurrent code. It just won't be pretty.

The upcoming 4.5 release fixes some of the ugly.


> Does Google foresee a future entirely built in Go, the same way NeXT expected the world to evolve towards Objective-C?

Eventually, sure, why not?

Still, you don't need your whole OS to be pure Go to take advantage of Go's features. Most Go systems are purely Go, as the stdlib is quite complete, but even then there is no issue linking with (most) existing C code if you need to (C code that does its own threading is obviously more problematic).


I don't know what Google sees, but I have a hard time imagining anyone honestly thinks the future is going to be built entirely in one language. That's the fantasy Sun had with Java. Now Java is this awkward thing that lives on top of the OS which lives on top of a hypervisor in a typical cloud setup.

It looks like we are headed toward a future with many more languages, not fewer.


Keep on truckin'. The situation is not necessarily hopeless, it just takes time to resolve. For example, see: http://halvm.org/wiki/

It may be a long time but there will someday be a world not built on C or C++. Probably within my lifetime. (Probably.)


Probably not within mine, depending on which side of the family I favor.


He could've described SIMD with gophers:

Putting only one manual per cart is SISD vs. putting several manuals per cart, which is SIMD.

Also, his example (moving manuals) is really an example of parallelism, not concurrency. He has one large problem, which is decomposed into two smaller problems. Not much different from multiplying two large vectors.

A real example of concurrency would be librarians handling books returned to the library. They don't know how many books will be returned, or when, which is very similar to a web server handling HTTP requests.
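
And that shape is exactly what net/http gives you: each connection is served in its own goroutine, so an unpredictable stream of requests is handled concurrently whether or not anything runs in parallel. A minimal sketch:

    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "handled in its own goroutine")
        })
        http.ListenAndServe(":8080", nil)
    }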


I don't like his definitions of these terms. IMHO these definitions are better:

Concurrency

property of systems in which several computational processes are executing at the same time, and potentially interacting with each other

Parallelism

computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (i.e. "in parallel")

http://www.slideshare.net/nivertech/migrationtomulticore


I think how you think of both concurrency and parallelism depends on what you use to achieve them. When you use Akka, an actor-model library for Scala and Java (http://akka.io/), doing both feels identical.


What software produced this slide presentation?


If you didn't know about it, you might want to look at asciidoc: http://www.methods.co.nz/asciidoc/ (search for slidy)


Software that predates the iPad, it seems.


It looks like a custom one, but it reminds me of "impress.js"


You can't expect people to take your new "concurrent" language seriously if their concurrent programs won't run proportionately faster on multicore systems. Channels & green threads are nice, but you can get true parallelism with Java. Don't make excuses for your language. Just solve the problem!


Those are different things: languages like Go and Erlang use an m:n mapping to schedule their m lightweight threads/processes onto n kernel threads (where n is usually the number of physical cores on the machine) to benefit from hardware concurrency.

But the presentation is about concurrency and parallelism not being the same thing, and more concurrency not necessarily leading to more parallelism.

> Don't make excuses for your language.

There's no excuse made.

> Just solve the problem!

There's no problem to solve. If you're asked to count from 1 to 100, you can't parallelize it: it's a purely sequential task, each step has a hard dependency on the previous step. So there is no possible parallelism.

But it is possible to do something else at the same time, or to have one process handle the counting and send the current count to another process that prints it. That's concurrency: you've got two things running at the same time.
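
A minimal sketch of that split:

    package main

    import "fmt"

    func main() {
        counts := make(chan int)

        // The counting stays strictly sequential...
        go func() {
            for i := 1; i <= 100; i++ {
                counts <- i
            }
            close(counts)
        }()

        // ...while printing happens as a separate, concurrent activity.
        for n := range counts {
            fmt.Println(n)
        }
    }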


Single-threaded concurrency was never the problem to begin with, and it's used everywhere, in specific forms. Even C has it... for instance, in your example, printf formats to a buffer and writes at some later time, which is concurrent in that the next numbers can be created and formatted before the first printf operation is complete.

But the reason almost nobody uses any of the general forms of this concurrency (coroutines, fibers) is that the general case is as useless as it is easy to do. For instance, there are plenty of C coroutine libraries (including Russ' libtask) and they work fine. The reason nobody uses them is that the situations where this is called for are precious few.

The other day on reddit, Ian of gccgo even butted into a conversation about 1:1 threads vs. m:n threads, but could not muster an answer as to under what conditions m:n threads (a la goroutines) would be called for. Simultaneous execution (threads, for example) was always the hard part, and any talk about concurrency that isn't about simultaneous execution is just tilting at windmills.


Not sure what you're trying to say, m:n means simultaneous execution of concurrent tasks for any n > 1.


The point is that it is threading/simultaneous execution that is the big deal. A language designed around some 'concurrency not parallelism' slogan has missed the boat... concurrency was never the problem that needed to be solved.

For instance, in golang the only support the language has for simultaneous execution is a threadsafe queue; that's all. And the runtime libraries only have variations on mutexes, so you even have to manually create locks and manually remember to unlock them. This is extremely weak sauce for a 'concurrent' language.
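
For what it's worth, the manual-locking idiom the parent describes looks like this; a deferred Unlock is the only guard against forgetting it:

    package main

    import (
        "fmt"
        "sync"
    )

    type counter struct {
        mu sync.Mutex
        n  int
    }

    func (c *counter) inc() {
        c.mu.Lock()
        defer c.mu.Unlock() // manual pairing; nothing checks this statically
        c.n++
    }

    func main() {
        var c counter
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                c.inc()
            }()
        }
        wg.Wait()
        fmt.Println(c.n) // always 100
    }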


I don't think you understood the slides. Go does run faster on multicore systems. It uses a configurable number of native threads to run the goroutines in.




