To me this seems to target a pretty small audience: very big data and specific problem domains, where you need killer devops chops, expensive and specialized infrastructure, and a desire to build out on bleeding-edge architecture. I'd suspect most with these characteristics will stick with what they've got; "medium Big Data" companies should probably go with hosted services, and the rest of us can stick with a single-node DuckDB.
Bingo. Very few organizations have petabytes of data that they are trying to process efficiently for machine learning. Such organizations already have personnel and technology in place offering some kind of solution. Maybe this is an improvement, but it is quite unlikely to offer new capabilities to such teams.