Show HN: SapientML – Generative AutoML for Tabular Data

newfocogi · on Sept 29, 2023

I’m not sure how the authors can claim to be comparing their approach against “state of the art” AutoML and not include AutoGluon, FLAML or H2O. This independent benchmarking paper[0] is what the AutoML field points to as establishing SOTA, and the libraries compared against in the SapientML paper are middle of the pack at best.

[0] https://arxiv.org/abs/2207.12560

bloopernova · on Sept 29, 2023

Is calling it "sapient" too presumptuous? I feel like that word should be reserved for something more AGI-like.

Am I completely off base with that opinion? I've been trying to temper my desire to jump in with any comment however irrelevant. Sorry, weird comment and question.

Philpax · on Sept 29, 2023

I had the same reaction; it seems like a brand you wouldn’t want to use in this space.

timhigins · on Sept 29, 2023

Any reason for using the term "generative", which may confuse readers and imply generative AI/LLMs? It's a traditional tabular AutoML system, though it does learn pipelines from a corpus of Kaggle solutions, and generates pipelines with a "three-stage program synthesis approach" [1].

[1] https://arxiv.org/pdf/2202.10451.pdf

dacryn · on Sept 29, 2023

Is there a reason why those frameworks were suggested?

There are many commercial offerings that greatly outperform open-source AutoML approaches.

At my job we use DataRobot and it's super impressive. There is Azure AutoML, Vertex BigQuery AutoML, ... Other data-focused software components have AutoML solutions as well I believe, like Alteryx, Dataiku, SAS, ...

If you want state of the art in AutoML, I am afraid this is one of the areas where the commercial space is well ahead of the open source space.

newfocogi · on Sept 29, 2023

Are there benchmarks or third-party evaluations that you can share that support your claim? I haven’t used all these offerings so my experience is anecdotal, but I haven’t seen commercial offerings outperform AutoGluon, at least for tabular data.

lettergram · on Sept 29, 2023

It's been interesting to watch the generative space catch up to what we were doing back in 2017 -- https://medium.com/capital-one-tech/why-you-dont-necessarily...

I cannot imagine how much stuff is happening internally at companies like Google.

hereforcomments · on Sept 29, 2023

We'll have a global ML Hackathon in November, can't wait to try it!