How much does an extra functional-language wrapper layer help with that? PyTorch is not much help in terms of error messages: you have to print out the matrix sizes produced by intermediate operations to find out what is really happening.
However, when using the Python API I often hit errors because of:
- Unused variables when refactoring my code, which usually just means I forgot to use some parameter.
- Comparing things of different types (Python does not report any error and just returns that they are different).
- Making changes to some helper functions without adapting all the projects where I'm using them.
Using a good Python linter probably helps with these, but this is a place where languages like OCaml naturally shine; see the sketch below.
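For concreteness, a minimal OCaml sketch of the first two points. The function is made up; the unused-variable warning and the type error are what the stock compiler reports.

```ocaml
(* Forgetting to use a parameter: with warnings enabled (e.g. dune's
   default dev profile), the unused [factor] triggers an
   unused-variable warning (Warning 27). *)
let scale factor xs = List.map (fun x -> x *. 2.0) xs

(* Comparing values of different types is rejected at compile time,
   rather than silently returning false as in Python:

     1 = "1"

   Error: This expression has type string but an expression was
   expected of type int
*)
```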
At the most basic level, the sizes of tensors are often not known until runtime, so some sort of dependent typing is necessary. Idris is currently the most practical dependently typed language, and it's missing a number of features that would be needed for machine learning work. For example, it only supports 64-bit floating point operations, whereas 32-bit ops are standard and the industry is moving toward 16 bits and fewer.
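As a point of comparison, here is a sketch of how far you can get without full dependent types, using an OCaml GADT indexed by type-level Peano naturals. Statically known lengths are checked by the compiler, but a size that only arrives at runtime still has to be hidden behind an existential, which is exactly the gap dependent typing would close.

```ocaml
(* Type-level Peano naturals; never inhabited, used only as indices. *)
type z
type 'n s

(* A vector whose length is tracked in its type. *)
type ('a, 'n) vec =
  | Nil : ('a, z) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n s) vec

(* [zip] demands both arguments have the same length; the mismatched
   cases are statically impossible, so this match is exhaustive. *)
let rec zip : type n. ('a, n) vec -> ('b, n) vec -> ('a * 'b, n) vec =
  fun xs ys ->
    match xs, ys with
    | Nil, Nil -> Nil
    | Cons (x, xs'), Cons (y, ys') -> Cons ((x, y), zip xs' ys')

(* A vector whose size is only known at runtime must hide its index. *)
type 'a any_vec = Any : ('a, 'n) vec -> 'a any_vec
```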
There are ways in Haskell to get most of the benefits of dependent typing, but they're pretty ugly. My subhask library did, I think, a reasonable job for vector/matrix math, but once you get into higher-order tensors everything becomes either far too verbose or impossible to specify. For example, permuting the axes of a tensor takes roughly a full page of code, and it's not even properly type safe. At the bottom of the link, there's a list of type system features that I think would need to be added to Haskell before it has a chance of a good linear algebra system... but in all honesty, I'm not even convinced yet that a usable linear algebra system can be implemented in a language with proper dependent types.
I remember reading some lengthy hype about Fortress and then never hearing about it again...
Although this is somewhat of an extreme example, most of my TensorFlow models take at least several minutes to build a graph. Anything that reduces the possibility of runtime errors would therefore speed up my development cycle dramatically. I'll also note that the OCaml compiler is lightning-fast compared to, say, gcc.
What are you doing that requires a 10, 30, or 45 minute graph build?
If you look at the issues, you'll see the developer contemplating GADTs to deal with tensor shape matching (though even phantom types would do). This is an issue that causes headaches for just about anyone who's dealt with neural nets.
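To make the phantom-type idea concrete, here is a rough OCaml sketch; the module and names are hypothetical, not from the library in question. The dimension parameters never appear in the representation, so they exist purely to let the compiler reject a mismatched matmul.

```ocaml
(* Phantom parameters 'rows and 'cols are erased at runtime; they only
   let the type checker track shape compatibility. *)
module Mat : sig
  type ('rows, 'cols) t
  val create : int -> int -> ('rows, 'cols) t
  val matmul : ('m, 'k) t -> ('k, 'n) t -> ('m, 'n) t
end = struct
  type ('rows, 'cols) t = float array array
  let create rows cols = Array.make_matrix rows cols 0.
  let matmul a _b = a  (* elided; only the types matter here *)
end

(* Empty tags naming concrete dimensions. *)
type d784
type d128
type d10

let w1 : (d784, d128) Mat.t = Mat.create 784 128
let w2 : (d128, d10) Mat.t = Mat.create 128 10

let _ok = Mat.matmul w1 w2         (* : (d784, d10) Mat.t *)
(* let _bad = Mat.matmul w2 w1 *)  (* rejected: d10 <> d784 *)
```

Note the sketch's weakness: nothing ties the type tags to the integers passed to `create`, which is exactly the gap that pushes people toward GADTs or dependent types.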
In fact, I'd wager the average Haskell type error has generated more than a few frustrated Stack Overflow questions.
I'll say, however, that the benefit of an ML like OCaml is less the types themselves than the tools functional languages provide for domain modeling in terms of composition and tailored algebras. This fits both applied and experimental machine learning especially well.
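As a toy illustration of that kind of tailored algebra (all names here are illustrative, not from any particular library): a layer is just a function, and composition alone gives you the model-building combinators.

```ocaml
(* A layer is a function; models are built by composing layers. *)
type 'a layer = 'a -> 'a

let ( >>> ) (f : 'a layer) (g : 'a layer) : 'a layer = fun x -> g (f x)

(* Stand-in layers on float lists; a real library would use tensors. *)
let relu : float list layer = List.map (max 0.)
let scale c : float list layer = List.map (fun x -> c *. x)

let model = scale 0.5 >>> relu >>> scale 2.0
(* model [-1.; 2.; 3.] evaluates to [0.; 2.; 3.] *)
```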
That being said, there are some rough edges compared to the PyTorch API, e.g. no parallel data loader, not much tooling, and only a couple of tutorials...