Fwiw, the Skymind team built Deeplearning4j; it's the second-largest contributor to Keras after Google, and the sole maintainer of Hyperopt.
Our code serves as a bridge between the Python data science ecosystem and tools like Spark, Kafka, Hadoop, etc.
While I almost always use TensorFlow for work, I appreciate Skymind's open source Deeplearning4j library for use with Common Lisp (via Armed Bear CL), Java, and Scala. Sometimes living on the JVM is the best choice.
You can also import Keras models to train them on a Spark cluster with DL4J:
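A rough sketch of what that flow looks like, assuming DL4J's Keras model-import and Spark modules are on the classpath (the model path, batch sizes, and the training RDD are placeholders, not from the original post):

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer;
import org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster;
import org.nd4j.linalg.dataset.DataSet;

public class KerasOnSpark {
    public static void train(JavaSparkContext sc, JavaRDD<DataSet> trainingData) throws Exception {
        // Load a model saved on the Python side with Keras' model.save("model.h5")
        MultiLayerNetwork net =
                KerasModelImport.importKerasSequentialModelAndWeights("model.h5");

        // Parameter averaging is DL4J's simplest distributed training strategy
        ParameterAveragingTrainingMaster tm =
                new ParameterAveragingTrainingMaster.Builder(32) // examples per averaging block
                        .batchSizePerWorker(32)
                        .build();

        // Wrap the imported network for training on the Spark cluster
        SparkDl4jMultiLayer sparkNet = new SparkDl4jMultiLayer(sc, net, tm);
        sparkNet.fit(trainingData);
    }
}
```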
There's a bit more to the JVM than just training models. Java-based application servers, for example, are still widely deployed.
There's also a whole data-engineering niche we target here (the Spark pipelines) where Python isn't a good fit for the team.
Then there's the fact that it's still easier to run Keras on Spark with us than with any other framework, thanks to our model import.
And that doesn't account for what we're doing with inference. Many vendors in this space are just running Kubernetes, bundling other tools they don't control. Because of our low-level control, we actually engage with various large companies on custom chip development, running other DL frameworks.
Depending on what you're looking to do, we're still the standard for pre-compiled binaries packaged as JAR files:
You won't find any other framework shipping pre-built AVX binaries and IBM POWER support at the same time.
Happy to talk about more depending on what your focus is.
Also, being forced to use a cluster is such catastrophic overkill for so, so many tasks. I have teammates wanting to use Spark/Databricks just to process a handful of files in S3 totalling a few GB at most. Realistically we could do the same work in a single container with Python/Julia/Scala/language of choice, in the same or less time and with an order of magnitude better maintainability.
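To make the single-container point concrete, here's a minimal sketch in plain Java (file names and contents are made up, standing in for the handful of objects you'd pull from S3): one JVM with a parallel stream chews through a few small files with no cluster involved.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class SingleNodeCount {
    // Count total lines across a handful of files using one JVM and a parallel stream.
    static long totalLines(List<Path> files) {
        return files.parallelStream()
                .mapToLong(p -> {
                    try (Stream<String> lines = Files.lines(p)) {
                        return lines.count();
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                })
                .sum();
    }

    public static void main(String[] args) throws Exception {
        // Two small temp files standing in for objects fetched from S3
        Path a = Files.createTempFile("part-a", ".csv");
        Path b = Files.createTempFile("part-b", ".csv");
        Files.write(a, List.of("x,1", "y,2"));
        Files.write(b, List.of("z,3"));
        System.out.println(totalLines(List.of(a, b))); // prints 3
    }
}
```

The same shape scales to a few GB of input on a single box; the bottleneck is almost always I/O, not the lack of a cluster.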
Spark is in 2019 what Hadoop was in ~2014. In 5-6 years, Spark will be the cure-all basket that a bunch of people put all their eggs into without realizing its deep-seated limitations. This is especially true for machine learning.
See my other comment below with a link to a previous discussion.
People dramatically underestimate how far you can get with even a single machine and some slightly better engineering. The best example of this I've ever seen is the super impressive work done by Frank McSherry: