
Elemental: C++ library for distributed-memory linear algebra and optimization - espeed
https://github.com/elemental/Elemental
======
math_and_stuff
I am the lead author of Elemental and am surprised to find this here. I'll try
to answer any questions.

~~~
semi-extrinsic
I understand (correct me if I'm wrong) that Elemental started somehow as a by-
product of your PhD work. How did you find continuing to work on it after the
thesis, and growing a small team? Anything in particular that you think was
important for this to succeed (I mean the keeping-on part, not the technical
part)?

~~~
math_and_stuff
The brutally honest answer would be that I prioritize its development above
everything else. I have no idea if this is sustainable in the long term, but I
don't plan on slowing the development down as long as I have an income.

When I was finishing my PhD I turned down several industry opportunities for
fear of having to stop development. I am currently a (tenure track) assistant
professor but increasingly worry about academia's lack of respect for library
development.

~~~
kxyvr
At this point, I don't think it'll help you, but about a decade ago, COIN-OR
was put together to partially provide an organization that helps give
academics a place to peer review and receive credit for the codes they
generate ([http://www.coin-or.org/](http://www.coin-or.org/)). The other
reason was to provide a place to host open source optimization codes before
things like GitHub became popular. Now, did it ever really live up to that
promise? Not as much as I'd like. Honestly, after IBM purchased ILOG, it
looked like their support dropped off pretty significantly and IBM was always
the largest backer. That and a number of key people retired. In any case,
there was just an election, but I know they're still looking for people to
help drive the organization forward. As an assistant prof, it's a risk, but
perhaps one worth a look, at least as an example of an organization with
moderate backing that tried to give academics better credit for their codes.

~~~
math_and_stuff
This is a great point. I'm an outsider in the optimization and OR communities
(having started out in HPC linear algebra) but have a lot of respect for what
I've seen from COIN-OR.

------
vmarsy
Tldr from the website: _Elemental is an open-source library for distributed-
memory dense and sparse-direct linear algebra and optimization which builds on
top of BLAS, LAPACK, and MPI using modern C++ and additionally exposes
interfaces to C and Python (with a Julia interface beginning development)._

It'll be interesting to see how the Julia interface behaves and how big
the performance penalty is going to be.

~~~
semi-extrinsic
PETSc has an Elemental interface for doing dense linear algebra. Can't do much
better for an endorsement than that.

~~~
math_and_stuff
I'm definitely happy that the PETSc developers have put up with Elemental
being such a moving target (it has expanded from dense factorizations to
sparse-direct, Interior Point Methods, and now some number theory), as there
have been substantial growing pains. Barry Smith has also been nice enough to
poke me any time Elemental isn't valgrind clean.

------
kxyvr
When you say that you can do arbitrary precision linear algebra, do you
control the scalar type or is the code templated on it? Basically, can we run
the factorizations over arbitrary scalar types, which allows things like
automatic differentiation over the factorizations?

How does the performance of the linear algebra compare to other templated
linear algebra codes like Eigen?

~~~
math_and_stuff
As of a few days ago, Elemental supports a BigFloat class that wraps MPFR,
along with MPI wrappers and distributed matrix classes templated over it.

It would be a modest project to support other datatypes, as the main obstacles
are MPI wrappers and perhaps some explicit instantiations. I would be happy to
respond in excruciating detail on dev@libelemental.org.

As for performance: Elemental is focused on distributed-memory functionality
but still achieves performance comparable to optimized libraries when run on a
single process (indeed, Elemental often maps down to OpenBLAS, MKL, etc. when
run with only one process).

------
amelius
Does this overlap functionally with TensorFlow ([1])? How are the two
different?

[1] [https://www.tensorflow.org/](https://www.tensorflow.org/)

~~~
math_and_stuff
I would say that the goals align much more than the current approaches, which
are somewhat orthogonal. My focus in developing Elemental has been on
expanding from the scope of libraries like ScaLAPACK (which focused on
distributed dense linear algebra) into more general algorithms (e.g., Interior
Point Methods and lattice reduction) and modern programming practices (using
templates to support arbitrary-precision versions of the above).

Arguably the biggest weakness of Elemental is its lack of focus on dynamic
scheduling at an intranodal level, though this can to a large degree be pushed
into a local BLAS layer. One of the big items on my wish list would be proper
integration of intranodal dynamic scheduling with the current high-performance
internodal static scheduling.

The current TensorFlow releases seem to be primarily focused on intranodal
dynamic scheduling; there has been a substantial amount of work in this area
from the PLASMA [1] and (less recently) SuperMatrix [2] projects.

[1] [http://icl.cs.utk.edu/plasma/](http://icl.cs.utk.edu/plasma/) [2]
[http://www.cs.utexas.edu/users/flame/pubs/SuperMatrixTR.pdf](http://www.cs.utexas.edu/users/flame/pubs/SuperMatrixTR.pdf)

~~~
Gtifn
How about dask's distributed array? Do you know how the architecture and goals
compare to Elemental?

~~~
math_and_stuff
Distributed [1] is very new and seems to have core architectural goals similar
to TensorFlow's. But perhaps I'm being too politically correct: both make use
of very coarse-grained parallelism relative to a typical distributed-memory
linear algebra library (e.g., the current communication mechanisms of both are
likely to be too coarse-grained to efficiently support distributed dense
matrix inversion or eigensolvers; not that this is likely to be a design goal
of either).

[1]
[http://matthewrocklin.com/blog/work/2015/06/23/Distributed/](http://matthewrocklin.com/blog/work/2015/06/23/Distributed/)

~~~
Gtifn
I think distributed's scheduler has seen much iteration since then and is now
based on tornado. I wonder if that changes your assessment.

Also what do you think of Julia's native distributed capabilities and this
library here:
[https://github.com/shashi/ComputeFramework.jl](https://github.com/shashi/ComputeFramework.jl)

