Show HN: SapientML – Generative AutoML for Tabular Data (github.com/sapientml)
82 points by ya9do on Sept 29, 2023 | 8 comments



I’m not sure how the authors can claim to be comparing their approach against “state of the art” AutoML and not include AutoGluon, FLAML or H2O. This independent benchmarking paper[0] is what the AutoML field points to as establishing SOTA, and the libraries compared against in the SapientML paper are middle of the pack at best.

[0] https://arxiv.org/abs/2207.12560


Is calling it "sapient" too presumptuous? I feel like that word should be reserved for something more AGI-like.

Am I completely off base with that opinion? I've been trying to temper my desire to jump in with any comment however irrelevant. Sorry, weird comment and question.


I had the same reaction; it seems like a brand you wouldn’t want to use in this space.


Any reason for using the term "generative", which may confuse readers by implying generative AI/LLMs? It's a traditional tabular AutoML system, though it does learn pipelines from a corpus of Kaggle solutions and generates pipelines with a "three-stage program synthesis approach" [1].

[1] https://arxiv.org/pdf/2202.10451.pdf


Is there a reason why those frameworks were suggested?

There are many commercial offerings that greatly outperform open-source AutoML approaches.

At my job we use DataRobot and it's super impressive. There is Azure AutoML, Vertex BigQuery AutoML, ... Other data-focused software products have AutoML solutions as well, I believe, like Alteryx, Dataiku, SAS, ...

If you want state of the art in AutoML, I am afraid this is one of the areas where the commercial space is well ahead of the open source space.


Are there benchmarks or third-party evaluations you can share that support your claim? I haven’t used all these offerings, so my experience is anecdotal, but I haven’t seen commercial offerings outperform AutoGluon, at least for tabular data.


It's been interesting to watch the generative space catch up to what we were doing back in 2017 -- https://medium.com/capital-one-tech/why-you-dont-necessarily...

I cannot imagine how much stuff is happening internally at companies like Google


We'll have a global ML Hackathon in November, can't wait to try it!



