I might be wrong on this, but I don't believe this is a replacement for Spark. Rather, it is similar to the Spark SQL execution engine.
I don't believe there is any focus on providing a distributed execution environment. Rather, platforms like Spark and Flink could integrate DataFusion as an implementation and expose the API for Apache Arrow operations.
DataFusion, and Ballista by extension, also provide a DataFrame API that lets you construct queries programmatically. It also has preliminary support for UDFs.
We also have community members implementing Spark native executors using DataFusion; the initial PoC showed significant speed improvements.