Awesome work, it’s really cool to see the progress being made in ModelingToolkit.jl.
As someone who spends more time on symbolic mathematics than numerical mathematics, I’ve often wished that Julia’s symbolic libraries had the same sort of energy behind them that the numerical libraries do. But it’s great to see that a lot of the work put into the numerical libraries is relevant to the symbolic ones.
Me too. I think that because of the metaprogramming capabilities we’re going to see a lot more development in that direction. Recently I started using Turing.jl, which uses a sort of declarative syntax that is easy to write and understand.
As a research engineer in data science/modeling, posts like this just look too good to be true. There's no free lunch. What is the catch with all this stuff? Who is going to maintain all this when Chris Rackauckas gets plucked by MathWorks or Wolfram?
If you're looking for a catch, I guess the issue is that the modeling domain is relatively simple, restricted to "mathematical models", i.e. no condition-based control flow like while loops. As in systems like JAX, for loops are supported by simply unrolling the expression into something without control flow. This keeps the resulting IR relatively simple, so all of the math is easy to define, but of course it means the domain for this is "math" and not "programs". That said, it's still large enough for things like Modelica models, which is a large enough domain for a lot of the algorithms I want to explore, like automatic index reduction of DAEs.
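To make the unrolling point concrete, here's a minimal sketch, assuming ModelingToolkit's `@variables` macro and symbolic array indexing behave as in recent releases: an ordinary Julia loop over symbolic values leaves behind one flat expression with no control flow in it.

```julia
using ModelingToolkit

@variables x[1:4]

# An ordinary for loop over symbolic values: each iteration extends
# the expression, and nothing of the loop survives in the result.
function sumsq(xs)
    acc = 0
    for xi in xs
        acc += xi^2
    end
    return acc
end

sumsq(collect(x))  # x[1]^2 + x[2]^2 + x[3]^2 + x[4]^2, no loop in the IR
```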
I have the obligatory "does it work with GPUs?" question. I have lamented writing out Jacobians for GPUs by hand for ages, and thought SymPy should be able to do this.
As someone whose entire industry runs off of a Jacobian matrix, I wonder how easy it is to run on the GPU. I assumed this was difficult, as splitting up the matrix would impact the solution.
DifferentialEquations.jl supports GPUs. The question is what the right way is to utilize DiffEq's GPU support here. That's actually a much deeper question than you might at first think. The reason is that sparse factorizations are not an inherently parallel operation, so CUSOLVER is actually quite bad at doing them. It even seems that CUSOLVER does some CPU operations in there... ouch.
But what can we do with ModelingToolkit to start pushing more into this domain? Well, here are a few things:
1) We plan to have new compilation targets, kind of like the SArray one that was shown, which target parallelism. For example, for large Jacobians we want a target so you can say "generate the code that fills the Jacobian optimally with 16 threads", and what it will do is split up the Jacobian into 16 parts with approximately equal numbers of operations and thread between those. Since it's all scalar operations, this should be as efficient as something you'd write by hand! Then a similar compilation target can be made for the GPU, where it would need to know the number of GPU threads you'd be using; it would split apart that Jacobian filling, write a gigantic `if threadid() == ...` setup, and use CUDAnative.jl to compile this to a .ptx kernel so that all of those threads are parallelized. Both of these trivially extend to sparse matrices by doing what I showed in the writing. So filling sparse matrices is something we can and plan to have nice compilation targets for (there's a sketch of the threaded version after this list).
2) We can also have compilation targets for `J*v` operations, so that it can accelerate Newton-Krylov solvers, which is probably the better way of using sparse matrices on the GPU since it would avoid sparse factorizations.
3) We can also implement a generic sparse factorization in Julia, then build analytical solutions for sparse factorizations, run Rewrite.jl simplification passes on those, and generate the parallelized kernels for them. This would give us sparse factorizations on the GPU with parallelized code that doesn't need CUSOLVER, which would be quite ideal. The issue is that it would require writing generic sparse factorization tools, since Julia currently relies on SuiteSparse for this with no generic fallback. That's not an easy task, but I think Yingbo will be up for it in a few years. We have a plan to move away from standard sparse linear algebra routines in Julia's differential equation suites, because there is a way to speed up operations specifically in this domain by using a new factorization, and if we go that direction our first task will be to learn how the current sparse LU is implemented.
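To make (1) more concrete, here's a hand-written sketch of the kind of code a threaded Jacobian-filling target might emit. Everything in it is hypothetical: the matrix entries, the parameters, and the two-thread split are made up for illustration, and the real target would generate this from the symbolic model rather than have you write it.

```julia
using Base.Threads

# Hypothetical output of a "fill the Jacobian with 2 threads" target:
# the scalar assignments are split into chunks with roughly equal
# operation counts, and each chunk runs on its own thread.
function fill_jacobian!(J, u, p)
    @threads for chunk in 1:2
        if chunk == 1
            J[1, 1] = -p[1]
            J[1, 2] = p[2] * u[2]
        else
            J[2, 1] = p[2] * u[2]
            J[2, 2] = -p[3]
        end
    end
    return J
end
```

A GPU variant of the same idea would branch on the thread index inside a CUDAnative.jl kernel, and the sparse case would write into the matrix's nonzero slots instead of dense entries.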
Wow! A lot going on there, and my sincere thanks for your part in what must be a massive amount of free work that I get to benefit from.
I'm really excited for what y'all have coming in the future and will stay tuned. There is some literature on Newton-Krylov solvers for my domain (power systems) so I'll have to brush up on that math.
Help me understand one thing. I suppose if we could one day compile Julia to a binary and sell software (just curious... I don't sell software today), would you have to secure a license for SuiteSparse? Is that another benefit of rewriting?
Thanks for clearing that up. I think someone else on here pointed that out, as well as the fact that MATLAB and some other tools provide that for you as part of your license/subscription fee. I started digging around and found out that the SuiteSparse author seems to be one of the top authorities in this area, so it might not be super realistic to get something as fast or faster on your own. Maybe a big enough open-source project could do it though. Any thoughts on this?
When you model the transmission grid, you can solve for the flows on the equipment (lines and transformers), where the inputs are certain equipment parameters as well as generation and load values. This is called powerflow or loadflow, and the traditional Newton-Raphson method represents the equations in Jacobian form, where the sensitivities represent how real and reactive power is impacted by the bus voltages and bus phase angles. This solution takes voltage into account (pretty important) and forms the basis of many computationally intensive operations such as contingency analysis (also known as N-1).

You basically determine before the analysis which equipment you want to monitor, and then you have a list of contingencies that you want to guard against. The grid itself is a pretty massive N×N matrix that is thankfully both sparse and symmetric. When you simulate a contingency, you take it out of service in the model and then re-solve the AC powerflow. For each of those solves, you check that the bus voltages are OK and that the equipment is within the rated limits. Violations are reported to the user. There are other calcs that are also important and that use the Jacobian.

I think one reason this area hasn't received much GPU attention is that the problem is easy enough to parallelize, and most vendors do so now, so a server with a decent number of cores can whittle away at even large grids with thousands of contingencies in a reasonable amount of time. Less important studies that need to be quick and robust use the DC powerflow, which is linearized and doesn't show voltage impact.
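For readers who haven't seen it, the Newton-Raphson iteration described above looks roughly like this. It's a minimal sketch, not production powerflow code: `mismatch` and `jacobian` are hypothetical callbacks that would compute the real/reactive power residuals and their sensitivities to bus angles and voltage magnitudes.

```julia
using LinearAlgebra

# Minimal Newton-Raphson powerflow skeleton. `mismatch(θ, V)` returns
# the residual vector [ΔP; ΔQ]; `jacobian(θ, V)` returns its (sparse)
# sensitivity to the bus phase angles θ and voltage magnitudes V.
function newton_powerflow!(θ, V, mismatch, jacobian; tol = 1e-8, maxiter = 20)
    for _ in 1:maxiter
        F = mismatch(θ, V)
        norm(F, Inf) < tol && return θ, V
        Δ = jacobian(θ, V) \ F      # one sparse linear solve per iteration
        n = length(θ)
        θ .-= Δ[1:n]                # update the angles...
        V .-= Δ[n+1:end]            # ...and the voltage magnitudes
    end
    error("powerflow did not converge")
end
```

A contingency study then amounts to removing a piece of equipment from the model and re-running this solve, which is why the per-solve Jacobian cost dominates.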
There are many entities that monitor this in real time, but transmission planners work with text-file-based models and will run millions of contingencies on computer clusters in batch mode and dump the results into Excel or a SQL database. I'm not sure if a high-end GPU would help there or if it's just a throw-more-CPU-cores kind of problem.
How much would it be worth? It is tough to say. I'd have to know how well GPUs perform here. I'm guessing it is not as good, but I'm not knowledgeable here. If you had a product that bundled a GPU, GPU-based power systems analysis software, and a client that would run on a desktop with an API, AND it performed significantly faster than the competition, you might have some customers. That is a ton of upfront work though.
I work on the research side on some Julia code for power grids, we're mostly focused on dynamics and transients, not load flow. GPUs are definitely on our radar, as are automatically extracted Jacobians. None of this is industry grade though, at least for the time being.
Sounds like some pretty cool research, and good luck! I work with folks who manage large dynamics models for transient stability tasks, but have only been partially involved myself as I'm more focused on the market side, though I dabble in other areas. The math definitely looks more challenging for dynamics simulations (I usually only browse through the dynamics sections in textbooks, but plan on learning more when I finally start my master's program).
I'll send you an email sometime in the near future for some discussion. If you're using Julia for grid work, you're probably either one of the firms listed on their success stories or one of the national labs. Either way, it sounds interesting from one engineer to another.
Please forgive me if I'm wrong, but this sounds like a sparse Jacobian problem? There are solvers that seem to support this (especially in Julia) using GPUs. I believe Boost's odeint also supports this.
DifferentialEquations.jl can take as many simultaneous equations as you have memory for. You can even use an MPIArray when you run out. When we solve PDEs, I'm sometimes testing with a state space of tens of millions of ODEs, which comes from a method-of-lines discretization. Right now we are doing a lot of tests in the thousands-of-ODEs range, since we have a reader for chemical reaction networks which we are using to optimize on test models. This is all with directly writing the Julia functions.
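As a concrete example of where those huge systems come from, here's a minimal method-of-lines sketch (the grid size, boundary handling, and solver choice are all illustrative): discretizing the 1D heat equation u_t = u_xx on N interior grid points turns one PDE into N coupled ODEs.

```julia
using DifferentialEquations

N = 10_000                    # one ODE per grid point; only memory limits N
dx = 1.0 / (N + 1)

# Second-order central differences with zero Dirichlet boundaries.
function heat!(du, u, p, t)
    du[1] = (u[2] - 2u[1]) / dx^2
    @inbounds for i in 2:N-1
        du[i] = (u[i+1] - 2u[i] + u[i-1]) / dx^2
    end
    du[N] = (u[N-1] - 2u[N]) / dx^2
end

prob = ODEProblem(heat!, rand(N), (0.0, 1.0))
sol = solve(prob, ROCK2())    # a stabilized explicit method suited to MOL
```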
If your question is how far ModelingToolkit can be pushed, that's unknown. We made sure it's quite an efficient representation, but we haven't stress-tested it to see the memory consumption at a million operations yet. Since it's able to generate thousands of operations in symbolic Jacobian inversions, it easily goes to thousands and tens of thousands of equations. How much larger? Who knows. What we can say is that its efficiency is more similar to SymEngine's symbolic types, so if SymEngine handles the millions-of-equations range (on a computer with enough RAM; symbolic representations may use up quite a bit here!), then we should be able to as well.
Wow... again, this is all really impressive, as is you taking the time to reply.
I'm looking forward to starting to play with these new libraries. Is there a more focused place to talk about Julia applied to numerical work (linear algebra and nonlinear equation solvers)?