Neural ODE's: Understanding how they model data (jontysinai.github.io)
170 points by lhomdee on Jan 23, 2019 | 16 comments



The diagrams here really helped to explain neural ODEs in an intuitive fashion. Does anyone know the best library that implements them so I can play around in an iPyNB? :-)


https://news.ycombinator.com/item?id=18979541 shows (with code) how to build and train neural ODEs.


Thanks. Looks fantastic. Also appreciate the feedback from parent.


Seems like it will be hard to get it to work in TensorFlow, because it needs to compute the gradient in an unusual way, which afaik doesn't play nice with the existing architecture.

My guess is it will need some deep wizardry of the same kind as OpenAI's gradient checkpointing.

Neural ODEs are a nice trick to reduce memory usage to O(1) instead of O(number of timesteps). But the implementation cost and complexity probably mean we are better off using gradient checkpointing on a forward dynamic and paying the memory cost.
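
For reference, the checkpointing alternative looks roughly like this in PyTorch (a sketch of my own, not code from the paper or the thread): torch.utils.checkpoint recomputes a segment's internal activations during the backward pass, so only the states at segment boundaries stay in memory.

    import torch
    from torch.utils.checkpoint import checkpoint

    # One shared layer applied repeatedly, i.e. a simple forward dynamic.
    layer = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.Tanh())

    def forward_dynamic(y, steps=100):
        for _ in range(steps):
            # Internal activations of `layer` are not stored; they are
            # recomputed during backprop. Only each step's input `y` is
            # kept, trading extra compute for lower memory.
            y = checkpoint(layer, y, use_reentrant=False)
        return y

    y = torch.randn(32, 64, requires_grad=True)
    forward_dynamic(y).sum().backward()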

It also probably won't play well with noise.

Are there any implementations of it in TensorFlow yet?


There is a PyTorch implementation by the authors (https://github.com/rtqichen/torchdiffeq) and there is a Julia implementation too: https://news.ycombinator.com/item?id=18676986#18691054. As for TensorFlow, we will have to wait and see if it can be done. The adjoint method is just matrix multiplication, but the tricky part will be integrating it with numerical ODE solvers.
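
For anyone who wants to play with it, here is a minimal usage sketch against torchdiffeq's public API (the dynamics network below is my own toy example, not from the repo):

    import torch
    from torchdiffeq import odeint  # odeint_adjoint gives O(1)-memory backprop

    class Dynamics(torch.nn.Module):
        # One set of weights, evaluated at every (t, y) the solver requests.
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2))

        def forward(self, t, y):
            return self.net(y)

    f = Dynamics()
    y0 = torch.randn(16, 2)           # batch of initial states
    t = torch.linspace(0.0, 1.0, 10)  # times at which to report the solution
    ys = odeint(f, y0, t)             # shape: (10, 16, 2)
    ys[-1].sum().backward()           # gradients flow back to f's parameters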


I have one in tf, so I can confirm it is possible.


Cool. This means that hopefully there will be some open-source TF implementations in the future. I guess there is something I don't see. I'm intrigued: does your code run inside a single sess.run() so that it can be composed nicely? If so, did you use a "special trick"?


This is the first time I have seriously read the paper and the code. Does this mean that for an ODE net, we are sharing weights for f (even though we evaluate f at different y and t points)?

If that is the case, it seems you can implement a fully static version of an ODE net with tf.While.
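
Something like this sketch, for instance (purely illustrative, written against tf.while_loop, with an autonomous dynamic and fixed-step Euler integration to keep it short):

    import tensorflow as tf

    # One small network for f; the same weights are reused at every step.
    f = tf.keras.layers.Dense(2, activation="tanh")
    f.build((None, 2))  # create the variables once, outside the loop

    def ode_net(y0, t0=0.0, t1=1.0, steps=20):
        # Fixed-step Euler integration of dy/dt = f(y) as a static graph loop.
        dt = (t1 - t0) / steps

        def cond(i, y):
            return i < steps

        def body(i, y):
            # Same weights, evaluated at a new point y at each step.
            return i + 1, y + dt * f(y)

        _, y1 = tf.while_loop(cond, body, [tf.constant(0), y0])
        return y1

    print(ode_net(tf.random.normal([8, 2])))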


Thanks for this. Science needs more writers who are able to distill complex subjects into clear and readable form.


Second this! Thank you for the wonderful explanation.


Thanks, I appreciate the kind words.


> y(x) = f(x,y), y(x0) = y0

If this is supposed to be the ODE definition, shouldn't it be y'(x) = f(x,y)? Otherwise I don't quite understand the definition of 'f'.


Thanks for pointing this out. It is indeed a typo.


I noticed the same thing. I presume it is a typo. Later on f is defined as you would expect.
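
For concreteness, a quick numerical check of the corrected definition (my own illustration): in the IVP y'(x) = f(x, y), y(x0) = y0, the function f gives the derivative of y. With f(x, y) = -y and y(0) = 1, the exact solution is y(x) = exp(-x):

    import numpy as np
    from scipy.integrate import solve_ivp

    # Integrate y'(x) = f(x, y) = -y from x = 0 to 5, starting at y(0) = 1.
    sol = solve_ivp(lambda x, y: -y, t_span=(0.0, 5.0), y0=[1.0],
                    t_eval=np.linspace(0.0, 5.0, 6))
    print(np.max(np.abs(sol.y[0] - np.exp(-sol.t))))  # small numerical error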


A quick intro to the concept is given by an author of the paper here: https://www.reddit.com/r/MachineLearning/comments/a65v5r/neu...


Can anyone comment on the relationship between this and the differential equations used by connectionist neuroscientists back in the 1980s?



