The tradeoff is not between concision and extensibility, but between high-level and low-level computation.
Even if the language natively implements a "neural_network_train" function, as long as it also offers low-level primitives to implement all the necessary parts of that function, the language is no less extensible than the OP's suggested alternative. For example, nearly every R user runs linear regressions with "lm", but R has all the pieces needed to implement the linear regression calculation yourself (either by inverting matrices or by running an iterative gradient descent algorithm).
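A minimal sketch of that point in Julia (the post's language; the same applies in R): ordinary least squares built from low-level matrix primitives alone, with no "lm"-style convenience function.

    using Random

    Random.seed!(1)
    n = 100
    X = hcat(ones(n), randn(n, 2))       # design matrix with an intercept column
    beta_true = [1.0, 2.0, -0.5]
    y = X * beta_true + 0.1 * randn(n)

    # Closed form via the normal equations: beta = (X'X)^(-1) X'y
    beta_hat = (X' * X) \ (X' * y)
    println(beta_hat)                    # approximately beta_true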
The OP conflates library-level abstraction with language-level abstraction. I agree with him that there is a tradeoff between concision and extensibility with respect to language-level abstraction. Library-level abstraction is pragmatically important (e.g., you would not use OCaml to run websites) but theoretically uninteresting (OCaml can certainly express all the computation a web server needs).
I think people should be given high-level primitives like "layers", be allowed to make their own where necessary, and get sensible defaults along the lines of:
a layer with x (dropout, momentum, ...) trained by an optimization algorithm: L-BFGS, Hessian-free, ...
This lets people experiment with different configurations without having to dive deep just to solve basic problems; a rough sketch of what that could look like follows.
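A hypothetical sketch of such an interface in Julia (the name "Layer" and the keyword defaults are illustrative, not from any existing library):

    struct Layer
        weights::Matrix{Float64}
        dropout::Float64
        momentum::Float64
    end

    # Sensible defaults, overridable where necessary.
    Layer(nin::Int, nout::Int; dropout = 0.5, momentum = 0.9) =
        Layer(0.01 * randn(nout, nin), dropout, momentum)

    default_layer = Layer(784, 128)                 # take the defaults
    custom_layer  = Layer(784, 128; dropout = 0.2)  # override one knob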
Regarding Julia: it's a great language, and it's what I wish production code could look like (while still being fast!).
Like Rust, it's in a pretty alpha state right now, but I'm watching the language closely.
A good neural network library like Torch, by contrast, lets you work at a much lower level of abstraction: you can put together individual layers, and it gives you the internal machinery for running forward and backward passes and chaining them together.
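The chaining idea, sketched in Julia rather than Torch's actual Lua API: each layer knows its own forward and backward pass, and a network is just a sequence of layers.

    abstract type AbstractLayer end

    struct Sigmoid <: AbstractLayer end

    forward(::Sigmoid, x) = 1 ./ (1 .+ exp.(-x))

    # Given the input and the gradient w.r.t. the output,
    # return the gradient w.r.t. the input.
    function backward(::Sigmoid, x, grad_out)
        y = forward(Sigmoid(), x)
        return grad_out .* y .* (1 .- y)
    end

    # Chain the forward passes of a list of layers.
    predict(layers, x) = foldl((h, layer) -> forward(layer, h), layers; init = x)

    predict([Sigmoid(), Sigmoid()], randn(4))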
On the contents of this blog post: I really like how the Julia type system is used here. Not only do the types help structure the code and send a signal to the user, but of course there is type-checking to catch errors.
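As a generic illustration of both benefits (not the blog post's actual types): the struct documents what a training example is, and dispatch catches misuse at call time.

    struct TrainingExample
        features::Vector{Float64}
        label::Int
    end

    loss(ex::TrainingExample, prediction::Float64) = (ex.label - prediction)^2

    ex = TrainingExample([1.0, 2.0], 1)
    loss(ex, 0.8)    # fine
    # loss(0.8, ex)  # MethodError: no method matching loss(::Float64, ::TrainingExample)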
I haven't read this yet, though; maybe it explains.
Deep learning is a specific area within machine learning.
So "deep" networks have been around for many decades, and they haven't, because you couldn't train them. Now we have computers that are 10,000* faster (at least) and training algorithms that are much faster too these architectures are interesting.