
Neural Jump SDEs (Jump Diffusions) and Neural PDEs - ChrisRackauckas
http://www.stochasticlifestyle.com/neural-jump-sdes-jump-diffusions-and-neural-pdes/
======
eigenspace
As someone with only a vague understanding of what neural _DEs are, could
someone comment or point me to an article about what they might be useful for?
I think I get why they're interesting, but I was curious if there are any
obvious applications for them that do things more conventional methods don't.

~~~
ChrisRackauckas
Neural ODEs approximate ResNets. Neural SDEs approximate Gaussian nets. Neural
Jump SDEs then generalize this by allowing layers to have discontinuous
randomness, like a Poisson variable. This might be useful for guessing
classifications more directly than doing regression within the net.
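To make the ResNet connection concrete: a fixed-step explicit Euler solve of dx/dt = f(x) is structurally a residual network, x <- x + h*f(x). A minimal Python sketch of that correspondence (illustrative only, not the Flux.jl code; the function names are my own):

```python
def euler_resnet(x, layers, h=0.1):
    """Neural-ODE view: an explicit Euler solve x <- x + h*f(x) is a ResNet
    whose residual block is the vector field f.  Here `layers` is a list of
    per-step residual functions, the discrete analogue of f."""
    for f in layers:
        x = [xi + h * fi for xi, fi in zip(x, f(x))]  # residual update
    return x
```

With a shared f across steps this is exactly the "infinitely deep, weight-tied ResNet" reading of a Neural ODE.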

On the other end, jump diffusions are a common financial model. Regime shifts
are sudden, causing the economy to move rapidly from one state to another, and
adding jumps to an SDE model captures this behavior.
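For intuition, a jump diffusion can be simulated by adding Poisson-arriving discontinuities to a standard Euler-Maruyama step. A toy Python sketch (the geometric drift/diffusion form and multiplicative jumps are my own illustrative choices, not from the post):

```python
import math
import random

def simulate_jump_diffusion(x0, mu, sigma, jump_rate, jump_size, T, n_steps, seed=0):
    """Euler-Maruyama simulation of dX = mu*X dt + sigma*X dW plus jumps.

    Jumps arrive as a Poisson process with intensity `jump_rate`; each jump
    multiplies the state by (1 + jump_size) -- a crude regime-shift model."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment
        x += mu * x * dt + sigma * x * dW     # continuous diffusion part
        if rng.random() < jump_rate * dt:     # jump fires with prob ~ rate*dt
            x *= 1.0 + jump_size              # discontinuous regime shift
        path.append(x)
    return path
```

With `jump_rate = 0` this reduces to an ordinary SDE path; turning the rate up produces the sudden state changes described above.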

PDEs are a way to describe functions which evolve over time. In many
applications they have a nonlinear term for how local interactions occur. This
generalizes that nonlinear term to be a learnable neural network.
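A way to picture this: after a method-of-lines semidiscretization, the PDE right-hand side splits into a fixed spatial operator plus a local nonlinear term, and it is that local term one can swap for a trainable network. A minimal Python sketch (function names and the heat-equation example are my own, for illustration):

```python
def pde_rhs(u, dx, nonlinearity):
    """Method-of-lines semidiscretization of u_t = u_xx + N(u).

    The second-difference diffusion stencil is fixed physics; `nonlinearity`
    is the local term N(u) that a neural PDE replaces with a learnable
    neural network.  Zero-Dirichlet boundaries are held fixed."""
    n = len(u)
    du = [0.0] * n
    for i in range(1, n - 1):
        lap = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx**2  # discrete u_xx
        du[i] = lap + nonlinearity(u[i])                   # learnable part
    return du
```

Pass `nonlinearity=lambda u: 0.0` and you recover the plain heat equation; pass a fitted model and you get the "learn the reaction term" setup.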

The applications of these are left open, since we don't really know yet. This
is just showing that the software can now handle these types of models, which
are more general forms of differential equations than what has been "neural'd"
before, and so the applications may come from people who work with these
models in the non-neural sense but want to add more functional estimation /
identification.

~~~
orbifold
One thing I find kind of strange about this development is that the code for
doing these kinds of optimizations has been around for a long time and has
been in serious use in industry ("neural" PDEs (i.e., applying adjoint
sensitivity analysis) are routinely used in gas/oil exploration), and the same
is true for jump SDEs and volatility calibration / pricing of financial
assets. I guess everything old is new again, when a new set of people
discovers (really) old tools.

Still, I would like to thank you for making these examples available, as well
as for your work on DiffEq.jl.

~~~
ChrisRackauckas
There's a difference from what was done before, though. While we are using the
adjoint sensitivity analysis of old, we have things set up so that the
internal vjp (vector-Jacobian product) uses reverse-mode AD that is compatible
with the neural network framework, doing that calculation in O(1) instead of
O(n) f-calls. We have a PR set up that will make it do source-to-source AD on
that part as well. So this scaling is distinctly different in terms of
scalability from the traditional numerical codes like those found in
Sundials. We are not the first to do this, CasADi has been doing it for a
while, but we can now demonstrate that we can throw arbitrary Flux.jl neural
networks in there and reverse-mode will specialize this, and specialize the
whole thing.
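The O(1)-vs-O(n) point can be made concrete without any AD framework: the adjoint method only ever needs v^T J, never J itself. A small Python sketch contrasting a hand-written reverse-mode pullback with naive column-by-column Jacobian assembly (the toy function f and all names are mine, for illustration only):

```python
import math

def f(x):
    # toy vector field, stand-in for an ODE right-hand side
    return [x[0] * x[1], math.sin(x[0])]

def f_vjp(x, v):
    """Hand-written reverse-mode pullback: returns v^T J_f(x) in ONE pass.
    This is the only quantity the adjoint equation needs at each step."""
    return [v[0] * x[1] + v[1] * math.cos(x[0]), v[0] * x[0]]

def f_vjp_via_jacobian(x, v, eps=1e-7):
    """Naive alternative: assemble the full Jacobian column by column with
    n extra forward evaluations of f, then contract with v -- O(n) f-calls."""
    n = len(x)
    fx = f(x)
    cols = []
    for i in range(n):
        xp = list(x)
        xp[i] += eps
        cols.append([(a - b) / eps for a, b in zip(f(xp), fx)])  # df/dx_i
    return [sum(v[r] * col[r] for r in range(len(fx))) for col in cols]
```

Both return the same vector, but the first costs one backward pass while the second costs n forward solves, which is exactly the scaling gap described above.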

And yes, parameter estimation in jump SDEs has been around, but using full
solver reverse-mode AD on high order GPU-accelerated adaptive solvers has not
been around, and we're talking about many orders of magnitude speedups here
when you put these all together.

But yes, in terms of the math, the jump is very small here. But last week at a
conference I got some questions about whether an existing tool can actually
allow you to just define a PDE or jump diffusion with neural networks in there
and take an adjoint that is both efficient in the differential equation and
the neural network. I was saying yes, but given the previous DiffEqFlux.jl it
would require someone to understand the capabilities of the AD/adjoint system,
know all of the differential equations that can be solved through the package,
and realize they can write code to do it. I wanted to make it explicit by
sharing code: yes, we do neural jump diffusions and neural PDEs. It's not
something hypothesized for the future; it's something that our package is
already doing today, with all of the goodies you expect from the ODE support.

------
rq1
Thank you for this.

I remember my first reaction to the Neural ODEs paper was to tell my professor
that they forgot to mention the parabolic PDEs case (through SDEs).

