This is the kind of thing we imagined experts being able to do when we designed Julia. Having implemented some such solvers in the past, I can safely say this stuff is easily the best out there.
But in all seriousness, the comparison was far more thorough than I expected going in. I learned quite a bit, and it's making me consider giving Julia another shot. I've mostly been tinkering with Rust lately, very happily, but this post makes me miss something with the strong scientific bent of Julia. So job well done, and thanks for taking the time to write this.
% Stiff Van der Pol oscillator (mu = 1e6) with MATLAB's ode15s; show the final state
[t, x] = ode15s( ...
    @(t, x) [x(2); ((1 - x(1)^2) * x(2) - x(1)) * 10^6], ...
    [0, 160], [2; 0], odeset('RelTol', 1e-5));
x(end, :)
Intel's dodesol_rkm9mkn() gave me [1.877, -0.743], a more accurate result, in less than the blink of an eye.
That's more than 200x faster, by mere virtue of using a better algorithm. Apparently the algorithm is "Merson's method", which I'd never heard of until now.
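For reference, Merson's method is a classical embedded Runge-Kutta pair from 1957: five stages give a 4th-order result plus an error estimate almost for free. Below is a minimal sketch for a scalar ODE using the textbook Merson tableau; this is an illustration of the method, not Intel's actual implementation.

```python
import math

def merson_step(f, t, y, h):
    """One step of Merson's embedded RK4 method for scalar y' = f(t, y).

    Returns (y_next, error_estimate). Coefficients are the classical
    Merson (1957) tableau.
    """
    k1 = f(t, y)
    k2 = f(t + h/3, y + h/3 * k1)
    k3 = f(t + h/3, y + h/6 * (k1 + k2))
    k4 = f(t + h/2, y + h * (k1/8 + 3*k3/8))
    k5 = f(t + h,   y + h * (k1/2 - 3*k3/2 + 2*k4))
    y_next = y + h/6 * (k1 + 4*k4 + k5)       # 4th-order solution
    err = h/30 * (2*k1 - 9*k3 + 8*k4 - k5)    # embedded error estimate
    return y_next, err

# Integrate y' = y from t = 0 to 1 with 100 fixed steps; exact answer is e.
y, h = 1.0, 0.01
for i in range(100):
    y, _ = merson_step(lambda t, y: y, i * h, y, h)
```

A historical quirk: on linear constant-coefficient problems the error estimate is exact (and the method is effectively 5th order there), which is part of why it was popular in early solver libraries.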
van = @ode_def_noinvhes VanDerPol2 begin
  dy = μ*((1-x^2)*y - x)
  dx = 1*y
end μ=>1e6
prob = ODEProblem(VanDerPol2(μ=1e6),[0;2.],(0.0,160.0))
@time sol1 = solve(prob,CVODE_BDF(),reltol=1e-5,save_everystep=false)
@time sol2 = solve(prob,Rodas5(),reltol=1e-5,save_everystep=false)
sol3 = solve(prob,CVODE_BDF(),abstol=1e-10,reltol=1e-10)
Thanks for pointing me to it though. It's definitely an interesting nugget of ODE history that I didn't know existed.
Out of curiosity, why is this an issue? Wouldn't you have just downloaded the library and run it regardless of whether someone is actively developing it or not? I don't really understand this argument (and I see many people making it, not just you).
What would make it exciting to play with is if it added something: free parallelism, automatic use of the Xeon Phi, complex number support, event handling (there are many cases where this is necessary; for one, PDE solvers need it for adding projection steps). But since development has halted, we're pretty much guaranteed that none of this will ever come.
> Intel grants to you a non-exclusive, non-assignable copyright license to make only the minimum number of copies of the Materials reasonably necessary for your internal testing and development of your products.
> You may not distribute any portion of Materials, whether in source or binary form, to any third party, except as specified in this Agreement.
So since stiff solvers have a lot more moving parts and thus tend to have even larger differences, 200x is easily explained by just language differences. One way to test this is to instead test the Intel solver against Sundials. I wouldn't be surprised if Sundials actually was faster in this case.
But there's another factor here. I explained in the blog post that multistep methods like BDF tend to do well on asymptotically large problems since they minimize the number of user-function evaluations, but they don't tend to do well on smaller systems of ODEs. This is exactly that case: you're comparing an implicit RK (Intel) to MATLAB's BDF method, and seeing that the RK is more accurate and fast. But this effect of method choice on small (user-scale) problems is exactly why I emphasized it as a downside when solver suites only include multistep BDF methods as their stiff solvers.
That isn't to say there isn't anything interesting here. Intel's nonstiff method is a 4th order method that can add extra 1st order stages to boost the stability region, similar to a Runge-Kutta Chebyshev method. I honestly have no idea what this stiff solver is, so I bookmarked it to read when I get to a campus where the paper is free:
But in general, RKC methods weren't widely adopted because using a full nonstiff or stiff solver usually ends up being better, and picking wildly different methods like this, which are not widely known, usually isn't a good sign in a very "crowded" field like ODE solvers, where things are generally pretty figured out and it's all about the details.
I'm not sure which statement you're reading, but everything you said so far is in agreement with my statement:
>> Intel has an ODE solver that seems unbelievably good at solving certain kinds of differential equations. I'm surprised/disappointed it wasn't even mentioned. I would've loved to see an evaluation of it against MATLAB (or other solvers) more than any other pair in the list.
Note that I have specifically stated that this method seems better on (at least) certain kinds of equations, and that, therefore, I would have loved to see a comparison of it with everything else.
In particular, note that I never claimed the method Intel uses is better, at least (especially?) not in general. If I knew that to be the case, I wouldn't have even had a reason to hope for a comparison! The entire point of my comment was that I would be interested in an evaluation of its performance, and I'm glad to see that you feel similarly. :)
However, the (5,2) method, in order to have an embedded method for adaptivity, has its coefficients (for the Rosenbrock tableau) almost completely determined. What this means is that the paper is not able to "optimize" the method for low error. Normally, a method is written in terms of tableaus, and the numbers for the tableau are chosen such that the principal error terms, the coefficients of the leading error terms, are minimized. Sufficiently good RK and Rosenbrock methods can get these down to around 10^-8, meaning that for large enough dt they almost act like they are an order higher. Of course, the large number of assumptions needed to get the given method means that this couldn't be done.
But does the algorithm have a place? This is the issue... it doesn't necessarily have a good fit anywhere. Standard Rosenbrock methods require a new Jacobian each timestep, but that's fine if factorization isn't too expensive, which is true for smaller ODE systems. Also, if the problem is "very nonlinear", i.e. not semi-stiff but solidly stiff, Jacobian re-use doesn't help anyway. So standard coefficient-optimized Rosenbrock methods do really well when the equations are very stiff and the number of ODEs is small enough. Modern (coefficient-optimized) (E)SDIRK methods naturally allow for Jacobian re-use but are otherwise similar to the Rosenbrock methods, making them do well in the small-number-of-ODEs but semi-stiff case. The issue with this method is that it tries to sit in the middle, but it's not really optimized for either case.
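To make the "new Jacobian each timestep" point concrete, here is the simplest Rosenbrock-type scheme, the linearly implicit Euler method, sketched for a scalar ODE. This is an illustration only: real Rosenbrock methods use several such stages and, for systems, an LU factorization of I - h*gamma*J each step.

```python
def rosenbrock_euler_step(f, dfdy, y, h):
    """Linearly implicit (Rosenbrock-type) Euler step for scalar y' = f(y).

    Solves (1 - h*J) * k = f(y) with J = f'(y), then y+ = y + h*k.
    The Jacobian enters the linear solve directly, so each step needs a
    fresh J (and, for systems, a fresh factorization) -- but no Newton
    iteration is ever required.
    """
    J = dfdy(y)
    k = f(y) / (1.0 - h * J)
    return y + h * k

# Stiff linear test problem y' = -50*(y - 1), y(0) = 0; exact solution
# y(t) = 1 - exp(-50*t).  With h = 0.1 an explicit method would blow up
# (|1 + h*lambda| = 4), but the linearly implicit step stays stable.
f    = lambda y: -50.0 * (y - 1.0)
dfdy = lambda y: -50.0
y = 0.0
for _ in range(20):              # 20 steps of h = 0.1 -> t = 2
    y = rosenbrock_euler_step(f, dfdy, y, 0.1)
```

On this linear problem the scheme coincides with implicit Euler; the contraction toward the steady state y = 1 is visible after just a few steps.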
But the issue really goes back to the 2 in the (5,2). Early on in the ODE literature, people believed that reducing the number of stages was always a good idea. You can do 5th order RK methods in 7 stages (thus 7 f evaluations, where u'=f with f being the user-given ODE function), so why would you ever do 8? However, in the early 90's, when we finally got our hands on PCs, people found out that the assumption of keeping stages minimized doesn't empirically do well. Instead, what you can do is have more stages, but then use that freedom to design the method such that the local truncation error is orders of magnitude lower. The result is that you can take much larger steps to solve the equation with the same error, making everything quicker. To compensate for the fact that you are now taking large steps, you have to have "dense output", i.e. a continuous extension of the method which can use the internal stage data to turn the solution into a continuous function sol(t) that doesn't require any extra evaluations of f. While these "higher stage with dense output" methods are more cumbersome to implement, the engineering is very much worth it, and if you take a look at pretty much any production-quality method these days you'll see they've ended up with these Rosenbrock, (E)SDIRK, or FIRK methods... except for multistep methods.
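The simplest version of such a continuous extension is cubic Hermite interpolation over one step, which uses only data the solver already has. This is a minimal sketch of the idea; production methods build higher-order extensions from the internal stage data rather than just the endpoints.

```python
import math

def hermite_dense_output(t0, t1, y0, y1, f0, f1):
    """Continuous extension of one ODE step via cubic Hermite interpolation.

    Uses only the step endpoints and their derivatives (already computed
    by the solver), so evaluating sol(t) inside [t0, t1] costs no extra
    f evaluations.
    """
    h = t1 - t0
    def sol(t):
        s = (t - t0) / h            # normalized position within the step
        return ((1 - s)**2 * (1 + 2*s) * y0
                + s**2 * (3 - 2*s) * y1
                + s * (1 - s)**2 * h * f0
                - s**2 * (1 - s) * h * f1)
    return sol

# One big step on y' = y across [0, 1]: the endpoints are reproduced
# exactly, and the midpoint comes from the interpolant alone.
sol = hermite_dense_output(0.0, 1.0, 1.0, math.e, 1.0, math.e)
mid = sol(0.5)                      # approximates exp(0.5)
```

Even from a single unit-length step, the interpolated midpoint lands within about 0.3% of exp(0.5), with zero additional function evaluations.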
Multistep methods, though, use history to achieve high order with just a single function evaluation in the implicit function per step. This is thus what's best when f is expensive, and it's another reason why stage reduction isn't necessarily helpful: you end up in an area where other methods are already pretty optimal. The classic is the BDF method. It can do Jacobian re-use as well. While BDF methods aren't L-stable above order 2, they can be adaptive order and do really well in mildly stiff situations.
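A fixed-step BDF2 sketch on the linear test equation shows the structure: each step needs only one implicit f evaluation, with the two-step history supplying second order. Assumptions here: f is linear so the implicit solve reduces to a division, and the startup step uses implicit Euler; production BDF codes are adaptive in both order and step size and use Newton with a reusable Jacobian.

```python
def bdf2_linear(lam, y0, h, n):
    """Fixed-step BDF2 on the linear test problem y' = lam*y.

    BDF2: y[k+1] = (4/3)*y[k] - (1/3)*y[k-1] + (2/3)*h*f(y[k+1]),
    i.e. one implicit f evaluation per step, with history supplying
    the extra order.  For f = lam*y the implicit solve is a division.
    """
    y_prev = y0
    y_curr = y0 / (1.0 - h * lam)   # implicit-Euler bootstrap for y[1]
    for _ in range(n - 1):
        y_next = ((4.0/3.0) * y_curr - (1.0/3.0) * y_prev) \
                 / (1.0 - (2.0/3.0) * h * lam)
        y_prev, y_curr = y_curr, y_next
    return y_curr

# Stiff decay y' = -100*y, y(0) = 1, 50 steps of h = 0.1 -> t = 5.
# h*lam = -10 is far outside any explicit method's stability region,
# but BDF2 (A-stable) decays cleanly toward the exact exp(-500) ~ 0.
y = bdf2_linear(-100.0, 1.0, 0.1, 50)
```

On a nonstiff setting the second order shows up directly: halving h cuts the error by roughly 4x.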
So that said, there really isn't a good niche for these types of methods. I have other low-stage Rosenbrock tableaus implemented, and it comes out pretty much as I say here: more error for the same amount of time as the others. Rosenbrock-W (ROW) methods are not important anymore because with autodifferentiation we don't need numerical differentiation and thus get exact Jacobians easily, which negates the purpose of the ROW method (they were invented because numerical errors from numerical differentiation caused order loss in Rosenbrock methods; the tradeoff is a lot more assumptions and a lot less coefficient optimization).
So the method Intel chose, extrapolating from other low-stage Rosenbrock methods and its other properties, would do best in cases where the equation is not too nonlinear (you want to make use of the Jacobian freezing; otherwise there's no point, since the standard Rosenbrock methods would simply get less error for the same steps), but also very stiff (otherwise you're in the regime where BDF dominates), yet not so stiff that you'd want the IMEX properties of the modern (E)SDIRK methods, and with f sufficiently costly that the higher-stage (E)SDIRK methods don't do better.
This puts it into a very, very small niche, which is probably why this method was lost to the literature and not really adopted. But then again, since it does sit right in the middle of a lot of things, if you were to pick just one method to implement, this is probably a pretty good choice. So while Intel probably did a very good implementation of it, I wouldn't see a reason to go looking for it elsewhere. If this method did well on your problem, then either a Rosenbrock, (E)SDIRK, or BDF method will probably do better, depending on which regime the problem falls into.
The "9" refers to the maximum number of "stages" which is not the same thing as the order of the solver.
The (m,k) refer to the corresponding parameters of the method (I think).
The "n" at the end is for "numerical" Jacobian (as opposed to "a" for user-provided/"analytical").
There should be a PDF in the download that explains these better than I possibly can (though it's still somewhat terse).
Several groups are working on systems inspired by Modelica, including:
> Circuit simulation programs take a text netlist describing the circuit elements (transistors, resistors, capacitors, etc.) and their connections, and translate this description into equations to be solved. The general equations produced are nonlinear differential algebraic equations
All the way down to transistors and other basic elements
> These models are very computer intensive, involving detailed spatial and temporal solutions of coupled partial differential equations on three-dimensional grids inside the device
But then there are "full-stop PDE solvers" which choose a specific PDE and solve that. For example, MATLAB and Mathematica can numerically solve diffusion-advection systems directly. And there are many user packages for different PDEs.
So to cover this fully, I would need to go through the available tools for numerical linear algebra and spatial discretization, but on the other hand cover some full-stop PDE solvers as well. This will be a different blog post. Without knowing your background or what you want to do, FEniCS is a good Python finite element toolbox that can be used to build the linear equations or the system of ODEs, which you then solve with other tools like those mentioned here. As for full-stop PDE solvers in Python, I'm not sure and need to look into it myself...
In this area of numerical computing, the important part is still the user, and the program they use is mostly a matter of what's convenient.
h = min(0.8*h*pow(tol/erinf, 0.25), 4*h);  /* safety factor 0.8, growth capped at 4x */
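For context, that line is the standard proportional step-size controller: a safety factor of 0.8, an exponent of 1/(order+1) = 0.25 coming from the local error scaling as h^(order+1) (so a 3rd-order error estimate here), and growth capped at 4x per step. A sketch with illustrative parameter names (not taken from the quoted source):

```python
def new_step_size(h, err, tol, order=3, safety=0.8, max_growth=4.0):
    """Standard step-size controller, as in the quoted line:
    h = min(0.8*h*pow(tol/erinf, 0.25), 4*h).

    The exponent 1/(order+1) matches the local error scaling h**(order+1);
    'safety' keeps the next step slightly conservative, and 'max_growth'
    stops the step from exploding after one unusually easy step.
    """
    return min(safety * h * (tol / err) ** (1.0 / (order + 1)),
               max_growth * h)

h1 = new_step_size(0.1, err=1e-12, tol=1e-6)  # very easy step: capped at 4*h
h2 = new_step_size(0.1, err=1e-4,  tol=1e-6)  # error too big: shrink the step
```

The same formula serves double duty: after a rejected step it shrinks h for the retry, and after an accepted one it proposes the next step size.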
Some time ago I was trying to make a 'hpq-FEM' using rational functions as a basis, like a Padé approximation. Do you know if anyone has tried something similar?