
Parallelism and concurrency need different tools - mhd
http://www.yosefk.com/blog/parallelism-and-concurrency-need-different-tools.html
======
pcwalton
Awesome article. I definitely agree that parallelism and concurrency require
different approaches, and one is not superior to the other.

In Rust we've known for a long time that success in Servo will require many
different parallel approaches, not just message passing between actors—for
example, parallel layout requires performing multiple tree traversals on
heterogeneous, garbage-collected doubly-linked trees, which seems all but
impossible to do efficiently in a message passing environment.

We have plans to add safe, race-free support for parallel loops to Rust, in
the same vein as the Parallel JS (River Trail) work. There is a prototype of
the ideas in the par module:

[https://github.com/mozilla/rust/blob/incoming/src/libstd/par...](https://github.com/mozilla/rust/blob/incoming/src/libstd/par.rs)

Rust's type system is designed to enable safe parallel computation. The idea
is to use uniqueness, lifetimes, and bounds on closures to achieve this:
uniqueness is used to ensure that there's no way to sneak in and access the
data structure you're operating on in a racy way, lifetimes are used to ensure
that data in a parallel loop cannot leak outside the loop, and bounds on
closures are used to ensure that closures can't capture mutable data and race
on it.

Finally, one nit: "Rust won't let you share memory unless it's immutable."
isn't quite accurate. There is a reader-writer lock type in the standard
library, which lets you share mutable data, but uses lifetimes to make sure
you take a lock on it. (You have to take a lock on the whole structure at a
time, to avoid the bank account problem you described.) We haven't used it
much though, because we try to avoid such heavyweight locks in performance-
sensitive code.

~~~
_yosefk
I'm sorry if I misrepresented the way things were in Rust; I also gleaned from
some places that you might add things like parallel loops in the future, and
was pretty sure that you were heading towards the same place as, say, Haskell
where you have tools for both concurrency and parallelism.

I'll update the article to link to your comment.

~~~
pcwalton
No problem! Rust's documentation is pretty scattershot at this point in its
life.

------
LukeHoersten
I think the author is missing the point regarding Haskell's concurrency vs.
parallelism.

"Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise."

This isn't talking about parallelism vs. concurrency, the concepts, this is
talking about Parallelism vs. Concurrency, the modules (capitalized). The
emphasis here is on using PURE when you can (meaning no side effects like disk
IO).

In Haskell, when you're working with pure computations, Haskell can make
guarantees about what's going on and effectively manage the concurrency for
you. From this concurrency, you can get parallelism almost for free (meaning
no extra restructuring on the dev's part). So the module is called
Control.Parallel.

If you want to be doing parallel IO with Haskell, concurrency can't be managed
automatically for the dev. This is Control.Concurrent.

"The people behind Haskell realize that one tool for both isn't good enough."

Both Control.Concurrent and Control.Parallel are for parallelism! The
difference here is whether you're parallelising pure or side-effectful
computations and by extension can get free concurrency. Haskell doesn't have
different tools for parallelism and concurrency. It has different tools for
parallelising pure vs. effects.
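
A minimal sketch of the two modules side by side (using parMap/rdeepseq from
Control.Parallel.Strategies, which ships alongside Control.Parallel, and
forkIO/MVar from Control.Concurrent; the fetch actions stand in for arbitrary
IO):

    import Control.Parallel.Strategies (parMap, rdeepseq)
    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

    -- Pure parallelism: no side effects, so the runtime is free to evaluate
    -- the list elements on as many cores as it likes; the result is
    -- deterministic.
    squares :: [Int] -> [Int]
    squares = parMap rdeepseq (\x -> x * x)

    -- Concurrency: the computations perform IO, so threads are forked
    -- explicitly and communicate through MVars.
    fetchBoth :: IO String -> IO String -> IO (String, String)
    fetchBoth fetchA fetchB = do
      va <- newEmptyMVar
      vb <- newEmptyMVar
      _ <- forkIO (fetchA >>= putMVar va)
      _ <- forkIO (fetchB >>= putMVar vb)
      (,) <$> takeMVar va <*> takeMVar vb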

If you have parallelism, you have concurrency. You can have concurrent
processes without parallelising them though. Simplifying: parallelism means
"at the same time" and concurrency means "can safely happen at the same time".

~~~
_yosefk
OK, so here's another quote
(<http://www.haskell.org/haskellwiki/Concurrency>):

"This page contains notes and information about how to write concurrent
programs in Haskell. If you're more interested in performance than non-
determinism, learn about parallelism first."

Here it's not capitalized so it must be the concepts and not the modules...

"If you have parallelism, you have concurrency" - not according to Rob Pike,
who called Sawzall parallel but not concurrent.

The main point I was making, though, was that computational code can be easily
checked even if it's not pure - and the fact that you can express it as pure
is a result of the same thing that lets you easily check an impure spelling as
well (basically it's determinism, and avoidance of constructs where
determinism is hard to prove). Event handling systems you can't easily check,
for much the same reasons that you can't express them as pure code.

~~~
LukeHoersten
Parallelism and concurrency are certainly different. Don't think I'm saying
otherwise. Our disagreement is that I'm saying parallelism is a subset/type of
concurrency and you are saying parallel execution can somehow not be
concurrent execution.

I still stand by my assertion that you misunderstood the quote from the
Haskell wiki:

"Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise."

as something that supports your claim that parallelism is not concurrency.

From Wikipedia:

"In computer science, concurrency is a property of systems in which several
computations are executing simultaneously, and potentially interacting with
each other. The computations may be executing on multiple cores in the same
chip, preemptively time-shared threads on the same processor, or executed on
physically separated processors."

I find it hard to believe that parallel execution of multiple threads isn't
"computations executing simultaneously" and therefore isn't concurrent.

Do you have any more detail about why you (or Rob Pike) think that parallel
execution is not also always concurrent execution?

~~~
_yosefk
Pike's definition was "handling many things at once" vs "doing many things at
once"; Sawzall or, to take a simple example he also used, a dot product where
you compute sub-products in parallel, is basically doing one logical thing at
a time - you have your inputs and you're computing the output. You're doing it
with more resources, but it's one thing. SIMD instructions or superscalar
execution are like that, and everybody calls them a form of parallelism and
nobody ever calls them concurrency, though the Wikipedia quote could be used
to defend doing so.

I think I don't want to argue about whether parallelism is a special case or
not, really; I'm fine with it being a special case, as long as I get the point
across that this special case deserves special treatment and is underserved by
tools aiming at the general case...

~~~
LukeHoersten
Seems reasonable. Worth the discussion at least. Thanks for responding to my
comments.

------
Glyptodon
I don't think this does a good job of clarifying concurrency and parallelism -
rather it muddies the waters, or perhaps I should say it will have the result
of confusing many people.

While people are generally clear on what parallelism is, or at least find
themselves in very much the same ballpark, there seem to be two separate and
completely different common usages of 'concurrency' in vogue as part of
current technical vernacular.

Articles like this draw hackles and controversy because two groups of people
who've defined the same word differently refuse to realize that they're having
two completely different conversations.

While there may be a 'correct' usage, it's functionally meaningless if large
segments of people refuse to recognize the 'correct' definition.

...meh. Though the article actually had some interesting things to say, it
seems sure to have them overlooked just for the sake of begetting another
round of terminology war.

~~~
_yosefk
So which terms do you suggest? What I was talking about was event handling
systems needing different things than computational systems, and
misapplications of tools designed for the former in the latter. The way I use
the terms "concurrency" and "parallelism" is largely consistent with the way
language designers involved in Haskell, Rust and ParaSail use these terms.

What other widespread uses of these terms conflict with my usage, and what
should I have said to avoid "another round of terminology war"?

~~~
scott_s
Your usage is slightly different than mine. My usage comes from what I learned
way back in my operating systems class in undergrad. I've been doing work in
parallel computing of various forms for a bit now, and this is how I think
about it (cribbing from what I wrote here a few months back):

Concurrency is a statement about a _duration_ of time. Parallelism is a
statement about a _moment_ in time.

During a one second window, did both of the tasks execute? If yes, then they
are concurrent. If no, then they are not concurrent in that window.

Was there any moment in time when both tasks were executing simultaneously? If
yes, then they executed in parallel. If no, then they did not execute in
parallel.

From this, it's easy to see that if something is parallel, it must also be
concurrent - in order for two tasks to execute during the same moment in time,
they must also both have executed during a window of time.

Concurrency without parallelism happens on machines with a single core.
Processes and threads must time-share the only core. Frequently, some of these
processes will have multiple threads, so the threads need to synchronize with
each other, even though they don't execute at the same time.

Concurrency and parallelism imply non-deterministic interleavings of threads
and processes, but whether or not the result is deterministic is up to how we
program.

------
danbruc
In my opinion the article is somewhere between missing the point and plain
wrong.

Concurrency is a property of a set of tasks describing dependencies among
these tasks. Think of it as a graph with a vertex for each task and a
directed edge between two tasks if one task depends on the other task's
result. Two tasks are concurrent if there is no dependency between them.
Concurrent tasks can be executed in parallel, but do not have to be. Two
non-concurrent tasks cannot be executed in parallel.

Parallelism occurs during runtime when independent tasks are actually executed
at the same time (no matter if you do this on two cores or if you are just
interleaving the execution of both tasks on one core). Executing two tasks in
parallel requires that these two tasks are concurrent.

Example

Two (read-only) web requests are concurrent because there are no dependencies
between them. Each web request consists of some subtasks like parsing the
request, querying the database, performing some business logic and building
and sending the response. These subtasks are not concurrent because each one
depends on the result of the one before.

There are now (at least) two options for processing these two web requests -
one after the other, or both in parallel. In the first case you ignore the
concurrency between the two web requests; in the second case you exploit the
concurrency between them by parallelizing their execution.
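
As a sketch (in Haskell; the request/response types and the subtask functions
are hypothetical placeholders, and concurrently comes from the async package):

    import Control.Concurrent.Async (concurrently)

    -- Subtasks of a single request: each depends on the previous one's
    -- result, so within one request there is nothing concurrent.
    handleRequest :: String -> IO String
    handleRequest raw = do
      let query = parseRequest raw      -- parse the request
      rows <- queryDatabase query       -- query the database
      pure (buildResponse rows)         -- business logic / build the response

    -- The two requests have no dependency on each other, so they are
    -- concurrent; we may (but need not) exploit that by executing them in
    -- parallel.
    handleBoth :: String -> String -> IO (String, String)
    handleBoth a b = concurrently (handleRequest a) (handleRequest b)

    -- Placeholders, just to keep the sketch self-contained:
    parseRequest :: String -> String
    parseRequest = id

    queryDatabase :: String -> IO [String]
    queryDatabase q = pure [q]

    buildResponse :: [String] -> String
    buildResponse = unlines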

So IMHO there is no choice between concurrency and parallelism. Concurrency is
a property of the system you are building (and there is not much you can do in
order to add or remove concurrency). You then have a choice to introduce
parallelism to exploit concurrency, or you just ignore the concurrency, but
that's it.

~~~
_yosefk
OK, so how come Rob Pike calls his own Sawzall language parallel but not
concurrent?

I presented your way of looking at it in the article more than once and I said
it was valid, just not the only one; then several comment threads emerge
reiterating that view, more or less wrongly, and arguing that it's got to be
the one and only way of looking at it. Sigh.

~~~
danbruc
IMHO - and I may of course be wrong - there are (widely) accepted definitions
for concurrency and parallelism in the computer science community, and in my
comment I tried to present them as well as I could (but the comment is far
from being flawless). Assuming I am correct with the previous sentence, there
is no way of interpreting concurrency and parallelism this way or another
because the terms are fixed by their definition. You can of course look at
their relationship from different angles and highlight or ignore different
aspects in different situations, but this does not change their basic
relationship or even definition.

I can think of two reasons why one would call a language parallel but not
concurrent. First, this could obviously happen because somebody is not aware
of the meaning of the concepts or uses the words in a non-standard way, but I
think this is probably not the case - I would be surprised if one designs a
concurrent/parallel language and does not know the distinction.

My best guess is that the intended meaning of the statement is that the
language is no good for general concurrent tasks but well suited for SIMD-like
concurrent tasks, that is, running the same operations on multiple different
inputs in parallel. (I have to admit that I know nothing about Sawzall and did
not invest any time to figure this out, just guessing.)

------
fmstephe
I think the comment that synchronization around mutable data doesn't prevent
all data races (as in the nicked iPhone and the bank account debit and credit
example) is really important.

A very good discussion on this can be found between Rich Hickey and Cliff
Click on the Clojure mailing list

[http://www.azulsystems.com/blog/cliff/2008-05-27-clojure-
stm...](http://www.azulsystems.com/blog/cliff/2008-05-27-clojure-stms-vs-
locks)

This exposes an important, and to me non-obvious, property of concurrency.
That it's not the locking that's really hard, it's how to be sure that every
piece of related data is included in the lock (or STM).

~~~
_yosefk
I think that according to standard terminology, syncing accesses to mutable
data _does_ prevent all data races - but not all _race conditions_ :
<http://blog.regehr.org/archives/490> (Race Conditions vs Data Races)

The nice thing about eliminating queue-like interfaces - and locks, STM,
message passing, lock-free containers etc. are actually a form of a queue-like
interface - is that it eliminates all race conditions which aren't data races.
Then you can have automated debugging tools find said data races.

Unfortunately, this doesn't work with event handling systems, because they're
often _required_ to have queue semantics to resolve inevitable conflicts
between requests.

~~~
fmstephe
Can you give an example of a system where queue-like interfaces have been
removed but data races still exist? I feel that there are some important, and
interesting, details I am not grasping.

~~~
_yosefk
Sure; a parallel for loop with each task processing a[0], a[1], ... a[N] has
no data races but if task 0 modifies a[0] and reads a[1], task 1 modifies a[1]
and reads a[2], etc. then this loop is full of data races without there being
anything queue-like (that is, you never say "I'll do this thing here unless
someone beats me to it in which case I will let them finish first.")
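
A sketch of that shape (in Haskell, with mutable vectors and the async
package; forConcurrently_ plays the role of the parallel for loop):

    import Control.Concurrent.Async (forConcurrently_)
    import qualified Data.Vector.Mutable as MV

    -- Race-free: task i touches only a[i].
    squareEach :: MV.IOVector Int -> IO ()
    squareEach a =
      forConcurrently_ [0 .. MV.length a - 1] $ \i ->
        MV.modify a (\x -> x * x) i

    -- Full of data races: task i writes a[i] but also reads a[i+1], so every
    -- adjacent pair of tasks races - yet nothing here is queue-like.
    addRightNeighbour :: MV.IOVector Int -> IO ()
    addRightNeighbour a =
      forConcurrently_ [0 .. MV.length a - 2] $ \i -> do
        right <- MV.read a (i + 1)
        MV.modify a (+ right) i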

~~~
fmstephe
Thanks

------
eikenberry
The author is making the same classic mistake of conflating concurrency and
parallelism. The main difference here is that the author first creates a
concurrency straw-man that he then compares to parallelism, but then when
talking about parallelism he makes the same old mistake of conflating the two.

Concurrency is about modeling a problem as an orchestration of multiple things
going on at once. It is an abstraction mechanism in much the same way as
objects or functions, whereas parallelism is a machine-level occurrence, like
caches or registers. Not something to be ignored, for sure, but not something
typically targeted when talking about high-level abstractions.

------
oneofthose
I might be confusing things here, but does concurrency, at least the
definition given here, not directly map to task parallelism?

Task parallelism and data parallelism are two important concepts, and the
tools we need to implement them are very different. The two words used by the
author to convey the difference between those two concepts are not chosen
optimally, in my opinion. There seems to be a hierarchy between concurrency
and parallelism, whereas task parallelism and data parallelism are directly
comparable.

~~~
_yosefk
Task/data are orthogonal to concurrent/parallel.

"Task parallel": compute a() in parallel with b().

"Data parallel": compute a(0), a(1), ..., a(N) in parallel.

"Task concurrent" (not idiomatic, but hear me out): handle mouse events
concurrently with keyboard events.

"Data concurrent" (again not idiomatic): handle events from 1000 sockets
concurrently.
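
Roughly, in Haskell terms (a sketch; par/pseq and parMap come from the
parallel package, forkIO from Control.Concurrent, and the handlers and
expensive functions are hypothetical placeholders):

    import Control.Parallel (par, pseq)
    import Control.Parallel.Strategies (parMap, rseq)
    import Control.Concurrent (forkIO)
    import Control.Monad (forM_, void)

    -- "Task parallel": compute a() in parallel with b().
    taskParallel :: Int -> Int -> Int
    taskParallel x y = let a = expensiveA x
                           b = expensiveB y
                       in a `par` (b `pseq` a + b)

    -- "Data parallel": compute a(0), a(1), ..., a(N) in parallel.
    dataParallel :: [Int] -> [Int]
    dataParallel = parMap rseq expensiveA

    -- "Task concurrent": handle mouse events concurrently with keyboard
    -- events.
    taskConcurrent :: IO () -> IO () -> IO ()
    taskConcurrent mouseLoop keyboardLoop =
      void (forkIO mouseLoop) >> void (forkIO keyboardLoop)

    -- "Data concurrent": handle events from many sockets concurrently.
    dataConcurrent :: [IO ()] -> IO ()
    dataConcurrent socketLoops = forM_ socketLoops (void . forkIO)

    -- Placeholder computations:
    expensiveA, expensiveB :: Int -> Int
    expensiveA n = n * n
    expensiveB n = n + n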

~~~
oneofthose
I don't see how

"compute a() in parallel with b()" and

"handle mouse events concurrently (could be a()) with keyboard events (could
be b())"

differ, or how

"compute a(0), a(1), ..., a(N) in parallel" differs from

"handle events from 1000 sockets concurrently", which would be something like
a(event0), a(event1), ...?

Don't we require the same tools for "task parallel" and "task concurrent" as
well as "data parallel" and "data concurrent"? To me they don't seem
orthogonal but I believe they are very much the same thing.

------
egonschiele
How timely! I just wrote a blog post that explains STM in haskell with
pictures: [http://adit.io/posts/2013-05-15-Locks,-Actors,-And-STM-In-
Pi...](http://adit.io/posts/2013-05-15-Locks,-Actors,-And-STM-In-
Pictures.html)

------
agentultra
I tend to agree. I've raised some hackles with colleagues over, "splitting
hairs," on the difference between concurrency and parallelism. They're two
different things and this articles does a pretty good job of pointing out some
of them.

I've always thought of parallel computation as lines in time that never
intersect or share resources but are computed in the same order at the same
time:

    
    
       ---|----
       ---|----
       ---|----
    

The results of each line are part of one computation. The problem is split
over space and their computation happens in the same order as a single task.

Conversely, concurrency is a little more lenient on causality; the lines are
scheduled co-operatively and the computation will complete at some point when
all the tasks are complete.

Causality in programming languages is certainly an interesting area to improve
upon. Some interesting perspectives are <http://bloom-lang.org> and my own
attempt at implementing it: <https://github.com/agentultra/knockblock>, which
attempts to address the causality of eventually-consistent distributed
replicas.

~~~
st2p
It's as simple as this: something that's concurrent (simultaneous) may not be
parallel (physically), but if it's parallel it's also concurrent. When used as
a verb, parallel is interchangeable with concurrent.

------
kibwen
_yosefk, I'm trying to digest your argument and I'm wondering if I have this
correct:

1. Concurrency is a property of any problem that involves resource
contention.

2. Parallelism is merely a performance optimization for certain classes of
problems.

3. Tools that address the concurrency problem are capable of performing the
parallelism optimization by recasting a problem as a resource-contention
problem.

4. However, casting a problem as a resource-contention problem introduces
_necessary_ nondeterminism and _likely_ race conditions, as these are inherent
properties of the problem domain for which these tools were designed.

5. Instead, use a tool that is explicitly designed to address parallelism in
a deterministic, race-free manner.

Is this the essence of your essay? Personally, I'd love to see an example of a
small real-world problem solved in e.g. Erlang vs. the same solution in
checkedthreads.

~~~
varelse
I use deterministic results as my #1 debugging tool.

I do this by building queue-based concurrent systems with a resulting
nondeterministic order of execution, but I then ensure that the results
themselves are deterministic.

Consider it to be a single queue with multiple vending machines to process
requests. No one knows which vending machine will dispense their soda, but
they will always get the soda they desire.

No need for special tools there. The overhead of ensuring a deterministic
result I'd estimate at <2% (the use of 32-bit float to 64-bit int type
conversions (fixed point) and 64-bit versus 32-bit atomic operations on GPUs
for accumulation). The biggest problems I run into are end users who'd happily
trade that predictable output for that <2% perf gain and competitors who are
happy to oblige them.

And the problem there is that I don't know how you diagnose race conditions or
the quality of the underlying algorithm without a deterministic output from a
deterministic input.

Recently, someone told me they address this problem by running single-threaded
to check for race conditions. Sigh... But I guess that lets them evaluate the
model.

~~~
_yosefk
Well, I wouldn't call that "concurrency" just because there's a queue in it -
I'd call it "parallel but not concurrent"; or we can call it differently - I
really don't claim to use the terms in "the right way" - but that's what I was
referring to as "parallel, not concurrent", anyway.

"Insure that the results are deterministic" - how? Race detectors
_deterministically_ insure that results are deterministic :) which I always
found nice... "No special tools" - you mean you just run the thing several
times and check the results don't change, or "just" write the code correctly?
How do you know it's yielding deterministic results except that it's supposed
to do so by design without any tools?

~~~
varelse
I run the code multiple times on different GPUs of the same generation (not
necessarily the same model) in different physical machines and get the same
bit-accurate output. You're right, technically, there _could_ be a race
condition somewhere somehow that just doesn't fire very often. It isn't
_proven_ to be free of race conditions, but in practice I suspect it is so
(right up until it isn't, a situation often detected by end users using a code
path in a unique way, at which point it's a bug, and I fix it).

But in practice I'm reasonably sure it works: checking for that deterministic
output on known working examples works like gangbusters at finding broken
GPUs, even ones that otherwise pass existing memory tests.

~~~
varelse
PS in my experience, the existing automated GPU tools for finding race
conditions Heisenberg the process sufficiently that the race conditions stop
occurring.

So in practice, I end up running the code as usual, but writing all the
intermediate data to a buffer, dumping that to a file with a key for ordering
it, sorting it, and then diffing an enormous text file to find out what's
inconsistent between two runs.

~~~
_yosefk
If race detectors so behave, then they aren't very good race detectors - most
aren't good, in fact, because they can't be very good, because they support
concurrency (that's why Helgrind misses bugs and Cilkscreen doesn't), which
was a part of my point.

It's a pity if GPUs, which are really a platform for computation acceleration
and not for concurrent event handling, lack deterministic race detectors; a
good detector not only wouldn't mask a bug that was occurring, but it would
detect a bug which perhaps you never observed until running the tool.

------
kevingadd
A thought: How does one apply this to socket-based IO?

As far as I can tell, everything involving sockets goes through tons of queues
like the ones he describes; concurrency, not parallelism.

If I kick off reads on ten different sockets, when the read completes the
kernel has to put the thread waiting for the read onto a 'ready to run' queue
before any work can actually happen. Scale up to 1000 pending reads and now
that queue could get pretty full pretty quick, and lots of data is moving in
and out. As I understand things, select and epoll improve on this by making
the queue more efficient: now instead of using a thread per socket, and having
the OS queue up threads on the ready-to-run queue, the OS helps you manage a
queue of 'ready sockets', and you respond to those manually. But there's still
a queue!

Are there IO APIs out there in modern OSes that let you reduce queueing in
such a way? Like, hypothetically if performing a read on a socket required me
to pass in a callback that would process the data from the read, the OS could
respond to incoming data on that socket by _immediately_ running the callback,
whether it's on an OS-managed user-mode thread pool or via some other
mechanism. This would at least reduce the amount of queueing going on - at
most you have a queue of tasks for a thread pool that are responding to
complete IO (similar to epoll, I guess?) but I bet there are ways to improve
on that too.

(If memory serves, IO Completion Ports on NT have functionality that looks
like this... but I bet it still involves lots of queueing and scheduling)

~~~
_yosefk
I think that this angle adds another dimension - dealing with latency and not
letting it hurt throughput. The problem could be "embarrassingly parallel" -
10000 processes with 10000 sockets not ever interfering with each other - or
it could be concurrent, with a lot of communication between the processes and
with event ordering greatly affecting results.

Latency would be a problem regardless - the fact that you read from a socket
and get blocked for a while, and whether you can find something useful to do
in the meantime or not, and how easy it will be to get back to what you were
doing once the data arrives.

In this sense, sockets are not fundamentally different from, say, DRAM latency
- it's a much smaller latency, but you still want to do something in the
meanwhile.

~~~
sbov
It depends upon how big of a picture you want to look at. E.g. if you look
down to the hardware level (your single ethernet cable connecting your
computer to a router) it is again no longer parallel (from what I recall -
it's been a while since I've studied this).

In the parallel gifts example, they had to somehow decide who would buy people
what, which tends to not be very parallel. If the store has only one entrance
then that too turns into a non-parallel part of the problem. If the gifts were
mailed there's only 1 mailman who delivers to an area on a given day. And so
on.

You're probably always going to find resource contention somewhere, it just
isn't always something you have to deal with directly when solving your
problem.

------
micro_cam
Good article but perhaps focused too much on the single machine case and/or
shared memory architecture. I think that getting the most our of parallelism
at scale on commodity hardware is facilitated by or even necessitates good
concurrency tools.

I.e., the results of a computation do no one any good if they are sitting in
memory spread across compute nodes in the cloud, and combining them back down
to the point where they can be presented to the user as quickly as possible
usually involves a fair amount of concurrent code.

~~~
_yosefk
The vast - _VAST_ \- majority of computing power in the world today is
standalone devices (PCs, phones, tablets and innumerable embedded crap with
stunning horsepower - each car has ~90 microprocessors for a single example).
Most of these standalone devices are multicore and it'd be nice to use the
extra compute capability somehow, and it'd be a pity to waste performance by
using tools designed for the multibox case which isn't going to happen outside
a small share of use cases, even if those use cases are loud.

That said, I didn't focus on the single machine case or shared memory
architecture, I think - it was more about correctness. A parallel system is
very easy to get right compared to a truly concurrent system; I mentioned
memory access monitoring but it can be other communication instead of through
memory - the point is that you know a parallelism bug when you see one, but
not a concurrency bug, because concurrent systems have non-deterministic
results and sometimes it's correctly non-deterministic and sometimes
incorrectly so.

In particular, though I don't work on these things, I think tools for
detecting parallelism bugs that scale past a single chip/box would be very
valuable for large compute clusters - and there again using scalable message-
passing-based systems without such tools would underserve computational
workloads.

~~~
micro_cam
Perhaps I am the one who is over-focused on distributed use. However, there
are a lot of servers in the world ;)

Also, how do you classify workloads like physical simulation that are one of
the classic use cases for parallel processing but not embarrassingly parallel,
in that they need lots of communication between cores?

Better tools for analyzing distributed computation would be cool. I think that
we would have to think carefully about the kind of bugs to look for. Since
data access/distribution is slow more thought tends to go into it (and there
are systems in place for doing it well) but there are of course loads of other
issues that show up.

~~~
_yosefk
I think if the simulation is deterministic it's parallel, if it's OK with not
being deterministic then it's concurrent; I'd fight to keep it parallel and
not concurrent, but I don't know how hard it is, maybe in some cases you can't
realistically win the battle.

I very much think it's valuable to treat distributed, parallel systems
specially and not just use concurrency-oriented tools - it's better to exploit
determinism for automated debugging, it's just not my area at the moment - I
work more on shared-memory systems. I was involved in some work with
parallelizing on multiple boxes at process granularity, but it wasn't
elaborate enough to go deeply into any of these issues.

~~~
micro_cam
Things definitely get messy in scientific computing. It is common to have
algorithms that are stochastic for other reasons.

I'm working on a parallel random forest implementation at the moment... very
little communication between threads is needed, but the algorithm itself is
stochastic (unless you seed your random number generators in a clever way, I
guess). (And I've used Go's concurrency primitives and race tool for things
like writing the forest to disk as trees are grown in parallel and compiling
summary statistics.)

I do think that tools like the one you are writing are very useful and needed.
I just have a bad habit of wanting everything to meet my needs.

------
human_error
Correct me if I'm wrong, but Cilk and Cilk Plus are different technologies, so
I'd write Cilk Plus instead of Cilk to prevent confusion. And where's Scala
with Akka? Nice post, though.

------
Paradigma11
I do think this article from Peter Van Roy has pretty clear and useful
definitions:
[https://docs.google.com/viewer?url=http%3A%2F%2Fwww.info.ucl...](https://docs.google.com/viewer?url=http%3A%2F%2Fwww.info.ucl.ac.be%2F~pvr%2FVanRoyChapter.pdf)

See pages 25 and 34.

------
zwieback
When I started doing multi-threaded programming a number of years ago there
really wasn't any confusion about this, parallelism, concurrency, preemption
and contention were just some of the basic terms you had to learn.

Why is there so much more confusion now that we have "better" tools?

------
ccdan
There's plenty of concurrency and parallelism out there, but there's not a
single piece of important software that is written in a functional language.
Many people claim many things, but when it comes to empirical evidence, it
turns out that imperative, stateful programming is still the best choice.
Erlang has some limited and very niche uses in telecom, but it was replaced
even at Ericsson to a good degree with C++.

~~~
papsosouid
>There's plenty of concurrency and parallelism out there, but there's not a
single piece of important software that is written in a functional language.

Important to who? I use tools written in functional languages every single
day.

>Many people claim many things, but when it comes to empirical evidence, it
turns out that imperative, stateful programming is still the best choice.

You don't see the obvious hypocrisy there?

~~~
ccdan
>>Important to who? I use tools written in functional languages every single
day.<< To the market. There are all kinds of servers, from web servers to
database servers to other kinds of servers. Almost none are made using
functional programming. Then, there is a lot of software out there which is
made almost exclusively in C/C++, Java & .Net. Haskell & co. are only used in
academia and by hobbyists (math people who can't learn proper programming)

~~~
papsosouid
Well, at least you were funny that time.

------
chaostheory
> Parallelism and concurrency need different tools

Not if you're using Akka.

~~~
_yosefk
Care to elaborate? Specifically, what mechanisms are used for handling
conflicts (two withdrawals from the same bank account), and what mechanisms
are available for preventing/detecting conflicts (two tasks modify the same
array index)?

~~~
chaostheory
Akka is an actor-model concurrency library for Scala and Java: instead of
threads and locks (they are abstracted away), you pass immutable messages
between actor objects, which are managed by other actors with queues (there
are locks, though, if you really need them). It just happens to use the same
interface (more or less) for distributing work between different nodes. It's a
tool for both. I don't know of anything else like it.

<http://doc.akka.io/docs/akka/2.0.1/scala/transactors.html>

~~~
_yosefk
OK, so I'm seeing the usual concurrency stuff; what I'm not seeing is conflict
detection/prevention for parallel stuff, where you don't let two parallel
tasks modify the same array element or some such.

~~~
chaostheory
> what I'm not seeing is conflict detection/prevention for parallel stuff,
> where you don't let two parallel tasks modify the same array element or some
> such

I could be wrong, but based on your own article, that doesn't sound like a
parallel task. Aren't parallel tasks already assigned work that doesn't
conflict, a la MapReduce?

> "concurrency is dealing with inevitable timing-related conflicts,
> parallelism is avoiding unnecessary conflicts"

did I miss something?

~~~
_yosefk
Yeah, the question is what if you have a bug - if you thought you were
assigning work so that tasks don't conflict but in fact they do. In Map/Reduce
you technically can't have this sort of bug, nor can you have it in pure code;
restricting the programmer so that such bugs can not be committed is fair
enough. If it is in fact possible to have such bugs, as it is with shared
mutable memory, then I expect to see tools for finding such bugs.

~~~
chaostheory
>If it is in fact possible to have such bugs, as it is with shared mutable
memory, then I expect to see tools for finding such bugs.

Well, with the actor model it is less likely, since you're supposed to be
passing immutable messages around and you're not coordinating threads around
locked resources.

------
papsosouid
The claim that Haskell thinks you need separate tools is silly. The fact that
the language provides abstractions for parallel computation and for
concurrency does not mean the language or its designers consider them
unrelated problems requiring different tools. Haskell is a particularly bad
example to support this sort of misleading claim, as the language supports
both out of the box without using different tools, and the same page you
quoted even says that (<http://www.haskell.org/haskellwiki/Parallelism>):

>Concurrent threads (forkIO) will run in parallel

The simple example being, say, a webserver. You use forkIO to spawn a green
thread per connection. This gets you concurrency. Now if you want parallelism,
you simply tell the Haskell runtime to multiplex those green threads over more
than one OS thread. You don't have to change your code at all. The whole
argument and the terrible gift analogy seem to rely on redefining parallelism
to mean "making sequential code parallel". I think most people are in fact
thinking of making concurrent code parallel when they talk about wanting
parallelism, like the webserver example.
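
A minimal sketch of that accept loop (using the network package's Socket type;
the listener setup and the per-connection handler are left out):

    import Control.Concurrent (forkIO)
    import Control.Monad (forever, void)
    import Network.Socket (Socket, accept)

    -- One green thread per connection: this alone is concurrency. Compile
    -- with -threaded and run with +RTS -N, and the runtime multiplexes these
    -- green threads over multiple OS threads - parallelism with no change to
    -- the code.
    acceptLoop :: Socket -> (Socket -> IO ()) -> IO ()
    acceptLoop listener handler = forever $ do
      (conn, _peer) <- accept listener
      void (forkIO (handler conn))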

~~~
_yosefk
From here: <http://www.haskell.org/haskellwiki/Concurrency>

"This page contains notes and information about how to write concurrent
programs in Haskell. _If you're more interested in performance than non-
determinism, learn about parallelism first._ "

From here:
[http://www.haskell.org/ghc/docs/7.0.3/html/users_guide/lang-...](http://www.haskell.org/ghc/docs/7.0.3/html/users_guide/lang-
parallel.html)

"GHC implements some major extensions to Haskell to support concurrent and
parallel programming. Let us first establish terminology:

* Parallelism means running a Haskell program on multiple processors, with the goal of improving performance. Ideally, this should be done invisibly, and with no semantic changes.

* Concurrency means implementing a program by using multiple I/O-performing threads. While a concurrent Haskell program can run on a parallel machine, the primary goal of using concurrency is not to gain performance, but rather because that is the simplest and most direct way to write the program. Since the threads perform I/O, the semantics of the program is necessarily non-deterministic.

GHC supports both concurrency and parallelism."

Finally, the part that I quoted in the article:

"Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise."

How this isn't "providing different tools for the two and recommending to use
them in different situations", I don't know.

~~~
papsosouid
It is almost like you didn't read my post. They provide forkIO as a basic tool
to do both. That is explicitly not different tools. They _also_ provide
additional abstractions to make different tasks easier. Tasks like
parallelising sequential code, which you want to pretend is the only kind of
parallelism that exists. Haskell provides different abstractions for lots of
things, that doesn't mean those things _require_ different tools. It means
convenience matters.

>If you're more interested in performance than non-determinism, learn about
parallelism first."

First, not instead of.

>GHC implements some major extensions to Haskell to support concurrent and
parallel programming

Yep, that really sounds like they require different tools, the way they
mention them together as a single topic.

>GHC supports both concurrency and parallelism

And does so with the same tool: forkIO. Completely contrary to your claim that
Haskell is evidence that it requires separate tools. This is once again made
very clear on a page you link to:

>Ordinary single-threaded Haskell programs will not benefit from enabling SMP
parallelism alone: you must expose parallelism to the compiler. One way to do
so is forking threads using Concurrent Haskell (Section 7.18.1, “Concurrent
Haskell”), but the simplest mechanism for extracting parallelism from pure
code is to use the par combinator

They are literally saying "use a higher level abstraction instead of rolling
your own directly with concurrency constructs". This is the same advice that
is always given when a higher level abstraction is available. It directly
contradicts the claim that parallelism requires a different tool. It
explicitly states that you can do it with the same tool.

~~~
_yosefk
"It is almost like you didn't read my post."

Seconded.

"They provide forkIO as a basic tool to do both."

Computers provide bits to do both and then some. You can call the ensuing
distinctions "matters of convenience" and not "requiring different things", of
course.

~~~
papsosouid
>You can call the ensuing distinctions "matters of convenience" and not
"requiring different things", of course.

Or you can read the code and see that it is quite literally an abstraction
layer built on the same tool, not two different tools. That is like saying
folds and recursion require different tools, and providing "Haskell has both
recursion and folds" as proof. Of course it does, but folds are simply an
abstraction on top of recursion.

