Hacker News

I mean, we didn't know autodifferentiation was a thing, so we (my advisor, not me) analytically solved our loss function for its partial derivatives. After I wrote up my thesis, I spent a lot of time learning Mathematica and advanced calculus.

I haven't invested the time to take the loss function from our paper and implement it in a modern framework, but IIUC, I wouldn't need to provide the derivatives manually. That would be a satisfying outcome, indicating I had wasted a lot of effort learning math that simply wasn't necessary, because somebody had automated it better than I could do manually, and in a way I can understand more easily.
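For what it's worth, "not providing the derivatives manually" is exactly what modern frameworks do under the hood. A minimal sketch of the idea, using forward-mode autodiff with dual numbers and a toy squared-error loss (the `Dual` class and the loss are illustrative, not the paper's actual loss function):

```python
# Forward-mode autodiff via dual numbers: each value carries its
# derivative alongside it, so the chain rule is applied mechanically.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val = val  # the value itself
        self.dot = dot  # derivative of this value w.r.t. the chosen input

    def _coerce(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = self._coerce(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __sub__(self, other):
        o = self._coerce(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __mul__(self, other):
        o = self._coerce(other)
        # product rule, applied automatically at every multiplication
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def loss(w, x, y):
    # toy squared-error loss; note: no hand-derived gradient anywhere
    r = w * x - y
    return r * r

# d(loss)/dw at w=3, x=2, y=1 is analytically 2*x*(w*x - y) = 2*2*5 = 20
w = Dual(3.0, 1.0)            # seed dw/dw = 1
print(loss(w, 2.0, 1.0).dot)  # -> 20.0
```

Libraries like JAX or PyTorch do the same bookkeeping (mostly in reverse mode, which is cheaper for many-parameter losses), which is why you only ever write the forward computation.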




I can't express the extent to which autodifferentiation was like a revelation to me. I don't work in ML, but in grad school around 2010 I was implementing density functional theory computations in a code written in Fortran 77. My particular optimization needs required computing up to second derivatives. I had Mathematica to actually calculate the derivatives, but even just the step of mechanically translating the computed derivatives into Fortran 77 code would be a week of tedious work. Worse was rewriting these derivative expressions for numerical stability. The worst was realizing you had made a mistake in an expression high in the tree and having to rewrite everything below it.

The whole process took months for a single model, and that's with a chain-rule depth that could probably be counted on one hand. I can't imagine deep learning making the kind of progress it has without autodifferentiation - the only saving grace is that neural networks tend to be composed of a large number of copies of identical functions, and you only need to go to first derivatives.
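The second-derivative bookkeeping described above can also be automated. A minimal sketch of second-order forward-mode autodiff, where each number carries (value, f', f'') and the product rule is propagated mechanically (the `Taylor2` class and the cubic test function are illustrative assumptions, not the actual DFT expressions):

```python
# Second-order forward-mode autodiff: propagate (value, first derivative,
# second derivative) through each operation, so nothing is translated by hand.
class Taylor2:
    def __init__(self, v, d1=0.0, d2=0.0):
        self.v, self.d1, self.d2 = v, d1, d2

    def _coerce(self, other):
        return other if isinstance(other, Taylor2) else Taylor2(other)

    def __add__(self, other):
        o = self._coerce(other)
        return Taylor2(self.v + o.v, self.d1 + o.d1, self.d2 + o.d2)
    __radd__ = __add__

    def __mul__(self, other):
        o = self._coerce(other)
        # (fg)'  = f'g + fg'
        # (fg)'' = f''g + 2f'g' + fg''
        return Taylor2(self.v * o.v,
                       self.d1 * o.v + self.v * o.d1,
                       self.d2 * o.v + 2 * self.d1 * o.d1 + self.v * o.d2)
    __rmul__ = __mul__

def f(x):
    return x * x * x  # analytically: f'(x) = 3x^2, f''(x) = 6x

x = Taylor2(2.0, d1=1.0)  # seed dx/dx = 1, d2x/dx2 = 0
print(f(x).d1)            # -> 12.0
print(f(x).d2)            # -> 12.0
```

Frameworks like JAX expose this directly (e.g. composing `jax.grad` twice, or `jax.hessian`), including for the multivariate case, which is what makes the months of manual Fortran translation unnecessary today.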




