It's encouraging to see a project like Tribuo come along, but it also feels like too little, too late. I'm already well underway on jumping ship and migrating to Python, and have yet to encounter any particular reason why I should look back.
I don't understand why this has become the norm. Would your company not use Smile to, presumably, make money?
What would you say differentiates you from Smile, which includes a simple dataframe, visualisation, support for CBLAS, etc.?
Is speed on par?
To your direct question, I've not benchmarked Smile against Tribuo. We are very interested in the upcoming Java Vector API - https://openjdk.java.net/jeps/338 - targeted at Java 16, which will let us accelerate computations which C2 or Graal don't autovectorise.
- What does the future road map look like?
- Are you planning on adding more algorithms?
- Any plans to bring in dataset and dataframe handling capabilities like those in NumPy/pandas?
- What other interop features with other languages/platforms are planned?
- Any plans for AutoML features?
- There are various efforts on the JVM to build multidimensional arrays, and we're talking to many of them to try and figure out a strategy for the whole platform. Ditto for dataframes, though Apache Arrow looks like a good baseline.
- We're not looking at other languages outside of the JVM at the moment, but we're continuing to contribute to TensorFlow Java and ONNX Runtime to improve their Java support. We could look at PyTorch inference support based on their Java API, but that overlaps pretty well with the things that ONNX Runtime supports. Do you have any suggestions?
- Not beyond hyperparameter tuning.
Many models are deployed as RESTful endpoints, so a quick and easy way to deploy models as services with containers or serverless providers would be very useful. Admittedly, you might not want that in the core project, but it could be a good sidecar project. Given your focus on model provenance, extending that to model deployment and life cycle management tools such as MLflow could also be very useful.
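To make the endpoint idea concrete, here is a rough sketch of the kind of thing I mean, using only the JDK's built-in `com.sun.net.httpserver` package. Everything in it is an assumption for illustration: the `ModelEndpoint` class, the `/predict` path, the `x=` query format and the fixed linear scorer are all hypothetical, and the scorer is only a stand-in for wherever a real trained model would be loaded. It is not Tribuo's API.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class ModelEndpoint {
    // Hypothetical stand-in for a trained model: a fixed two-feature linear scorer.
    static final double[] WEIGHTS = {0.5, -0.25};

    // Scores a feature vector; extra features beyond the weights are ignored.
    static double predict(double[] features) {
        double score = 0.0;
        for (int i = 0; i < WEIGHTS.length && i < features.length; i++) {
            score += WEIGHTS[i] * features[i];
        }
        return score;
    }

    // Parses a query string of the assumed form "x=1.0,2.0" into a feature vector.
    static double[] parseFeatures(String query) {
        if (query == null || !query.startsWith("x=")) {
            throw new IllegalArgumentException("expected a query of the form x=1.0,2.0");
        }
        String[] parts = query.substring(2).split(",");
        double[] features = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // NumberFormatException is an IllegalArgumentException, so bad numbers map to 400 below.
            features[i] = Double.parseDouble(parts[i].trim());
        }
        return features;
    }

    // Starts an HTTP server exposing GET /predict?x=<comma-separated features>.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/predict", exchange -> {
            String body;
            int status;
            try {
                double score = predict(parseFeatures(exchange.getRequestURI().getQuery()));
                body = "{\"score\": " + score + "}";
                status = 200;
            } catch (IllegalArgumentException e) {
                body = "{\"error\": \"bad request\"}";
                status = 400;
            }
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(status, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080);
    }
}
```

With that running, a request to `http://localhost:8080/predict?x=2.0,2.0` should come back with a small JSON body containing the score. A real sidecar would presumably swap the stand-in scorer for a deserialised model and return its provenance alongside the prediction, but that part is beyond this sketch.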