
Regent: A Language for Implicit Dataflow Parallelism - federicoponzi
http://regent-lang.org/
======
mratsim
Hey, I've been following up on Legion and Regent quite a bit, excellent work
there.

Do you have a set of benchmarks that others can reimplement to compare the
approaches?

I've added dataflow parallelism to my own multithreading runtime[1], but I
didn't have dataflow-focused benchmarks yet. I could add Cholesky
decomposition, but it's quite involved.

I expect the people from TaskFlow[2] and Habanero[3] (via Data-Driven Futures)
would be quite interested as well in a common set of dataflow parallelism
benchmarks.

By the way if you didn't read the DaCe paper[4] you absolutely should, seems
like the age of Dataflow parallelism and properly optimizing for data is
coming.

[1]: [https://github.com/mratsim/weave#dataflow-parallelism](https://github.com/mratsim/weave#dataflow-parallelism)

[2]: [https://github.com/taskflow/taskflow](https://github.com/taskflow/taskflow)

[3]: [https://github.com/habanero-rice/hclib](https://github.com/habanero-rice/hclib)

[4]: [https://github.com/spcl/dace](https://github.com/spcl/dace),
[https://arxiv.org/abs/1902.10345](https://arxiv.org/abs/1902.10345)

~~~
lightsighter
Please correct me if I'm wrong, but I think all of those systems only work
inside a single process. Legion/Regent support distributed multi-node,
multi-process execution, both on supercomputers and in the cloud.

------
brudgers
Some past comments:
[https://news.ycombinator.com/item?id=10764268](https://news.ycombinator.com/item?id=10764268)

------
smabie
I've always wondered why there isn't a general-purpose purely functional
language that computes a graph of dependencies and implicitly parallelizes all
operations. For some things, like a map over a list, I understand that the
overhead of distributing the work is greater than just using one thread, but
for dependencies known at compile time (like those between variables), the
cost of deciding what to distribute should be zero.
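The scheme described here can be sketched in a few lines: extract a dependency graph from the (statically known) variable dependencies, then run every group of independent computations in parallel. This is an illustrative sketch, not any real system's implementation; all names (`build_schedule`, the task/dep dictionaries) are hypothetical.

```python
# Hypothetical sketch: implicit parallelism from a known dependency graph.
# Tasks whose dependencies are all satisfied form a "wave" and can run
# concurrently; waves execute in sequence.
from concurrent.futures import ThreadPoolExecutor

def build_schedule(tasks, deps):
    """Group tasks into waves: a task joins a wave once all its deps are done."""
    done, waves = set(), []
    while len(done) < len(tasks):
        wave = [t for t in tasks if t not in done and deps[t] <= done]
        waves.append(wave)
        done |= set(wave)
    return waves

# a = f(); b = g(); c = h(a, b)  -- a and b are independent, c needs both
funcs = {"a": lambda r: 1, "b": lambda r: 2, "c": lambda r: r["a"] + r["b"]}
deps  = {"a": set(), "b": set(), "c": {"a", "b"}}

results = {}
with ThreadPoolExecutor() as pool:
    for wave in build_schedule(list(funcs), deps):
        # Every task in a wave is independent, so run the wave in parallel.
        for name, val in zip(wave, pool.map(lambda n: funcs[n](results), wave)):
            results[name] = val

print(results["c"])  # -> 3
```

The point of the sketch is that the schedule falls out of the dependency sets alone; no annotations are needed if the language can see the dependencies.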

~~~
lightsighter
Prior to working on Legion, I worked on a programming language called Sequoia
that matches much of what you are describing [1]. In many ways Sequoia was the
spiritual ancestor of Legion/Regent.

[1]: [http://theory.stanford.edu/~aiken/publications/papers/sc06.pdf](http://theory.stanford.edu/~aiken/publications/papers/sc06.pdf)

------
chenzhekl
This idea seems pretty similar to TensorFlow. Could anybody familiar with
this elaborate on the differences?

~~~
lightsighter
The main difference between Legion and TensorFlow is how and when the dataflow
graph is constructed. In TensorFlow the graph is constructed lazily (no
execution is performed until you've asked for it), it's optimized, and then
distributed to processors (GPUs/TPUs) for execution. In Legion, the graph is
built, distributed, and executed on the fly. What this means is that Legion
can react to things like dynamic control flow (e.g. branches inside of loops)
and analyze dependences at runtime to find task parallelism, in a similar way
to how your out-of-order CPU extracts instruction level parallelism from a
program. Doing things in the TensorFlow model works better when you can see
your whole program up front and can "statically" optimize and schedule it
because it has lower overheads, but it also has limits to the kinds of
programs it can handle; the Legion approach works better when you have dynamic
data-dependent behavior in your program and you need to react to it on the
fly, but it does have some overhead to the dynamic analysis.
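The on-the-fly dependence analysis described above can be sketched with a toy runtime: each task declares what it reads and writes, and at launch time the runtime derives dependences from the latest writer of each region, which lets it follow data-dependent branches that a statically built graph would have to encode up front. This is a simplified illustration; the class and method names are invented, not Legion's or TensorFlow's actual APIs.

```python
# Hypothetical sketch of dynamic (Legion-style) dependence analysis:
# tasks are analyzed and executed as they are launched, so runtime
# control flow can decide which tasks exist at all.
class DynamicRuntime:
    def __init__(self):
        self.last_writer = {}   # region name -> value produced by its last writer

    def launch(self, fn, reads=(), writes=()):
        # Dependence analysis at launch time: a task depends on the
        # latest writer of every region it reads (a true dependence).
        args = [self.last_writer.get(r) for r in reads]
        out = fn(*args)         # executed eagerly, "on the fly"
        for w in writes:
            self.last_writer[w] = out
        return out

rt = DynamicRuntime()
rt.launch(lambda: 10, writes=["x"])
# Dynamic control flow: whether the next task is launched at all depends
# on a value computed at runtime -- something a lazily built, statically
# optimized graph (the TensorFlow model) must know about up front.
if rt.launch(lambda x: x, reads=["x"]) > 5:
    rt.launch(lambda x: x * 2, reads=["x"], writes=["y"])

print(rt.last_writer["y"])  # -> 20
```

A real runtime would of course overlap independent tasks rather than execute eagerly; the sketch only shows where the dependence information comes from in the dynamic model.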

------
BubRoss
I don't understand why something like this would need a separate language.
Switching languages means starting over in many ways with regard to tools,
libraries, and semantics. A graph of tasks can be made with a cdecl library.

~~~
chrisseaton
Have you read Boehm's paper 'Threads Cannot be Implemented as a Library'?

It's the same reason.

Your dataflow semantics need to be part of the language semantics, otherwise
they're bound to be loosely defined and even more loosely enforced.

~~~
BubRoss
That's an assertion, but there's nothing to back it up.

First, threads have been implemented as libraries many times. Second, if
checks need to happen, they can happen at debug runtime if they can't happen
at compile time. I don't know what specifically has to be integrated into a
language here that makes throwing away the enormous amount already built in
other languages worthwhile.

~~~
chrisseaton
These points are all clearly addressed in the paper I referenced.

> That's an assertion, but not anything to back it up.

Yes it's an opinion on style, not a falsifiable claim.

> First, threads have been implemented as libraries many times.

The title isn't intended to be taken quite so literally. The author explains
why they think these don't work correctly.

> Second, if checks need to happen theg can happen at debug run time if they
> can't happen at compile time.

But languages don't have mechanisms to implement these kinds of checks.

> I don't know what specifically has to be integrated into a language here
> that makes throwing away the enormous amount already built in other
> languages.

You're mistaken. It's not 'integrated into', it's 'taken out'.

Libraries let you add things, but to design a model for parallelism you
generally want to _take things away_. You want to take away the ability to do
things outside the model's rules.

