
And data management / preparation is where major mistakes are made - get a join wrong and you may easily be missing data or double counting something just obscure enough to go unnoticed like "ancillary sales".
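To make the join failure mode concrete, here is a minimal sketch in plain Python. The tables, column names, and the helpers `inner_join` and `check_join` are all made up for illustration. One duplicated lookup row double counts a sale, one missing lookup row drops another, and the two errors partly offset each other in the total, which is exactly why this sort of thing goes unnoticed.

```python
from collections import Counter

# Hypothetical data: order lines and a product lookup table.
sales = [
    {"order_id": 1, "sku": "A", "amount": 100.0},
    {"order_id": 2, "sku": "B", "amount": 40.0},
    {"order_id": 3, "sku": "C", "amount": 25.0},
]
products = [
    {"sku": "A", "category": "core"},
    {"sku": "B", "category": "ancillary"},
    {"sku": "B", "category": "ancillary"},  # duplicate key: double counts sku B
    # sku "C" is absent: an inner join silently drops that sale
]

def inner_join(left, right, key):
    """Naive inner join: one output row per matching (left, right) pair."""
    return [{**lrow, **rrow} for lrow in left for rrow in right
            if lrow[key] == rrow[key]]

def check_join(left, right, key):
    """Return (keys duplicated on the right, left keys with no match)."""
    counts = Counter(rrow[key] for rrow in right)
    duplicated = sorted(k for k, n in counts.items() if n > 1)
    unmatched = sorted({lrow[key] for lrow in left} - set(counts))
    return duplicated, unmatched

joined = inner_join(sales, products, "sku")
total_before = sum(row["amount"] for row in sales)   # true total
total_after = sum(row["amount"] for row in joined)   # total after the bad join
duplicated, unmatched = check_join(sales, products, "sku")

print(total_before, total_after, duplicated, unmatched)
```

Comparing the total before and after the join, plus checking key uniqueness on the lookup side, catches both problems before they reach a report.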

There is something potentially harmful, or perhaps that needs addressing, about end-user tools growing in expressive power. A good friend who does statistical genetics work once told me "but I don't want every user running their own regressions and drawing nonsensical conclusions from badly prepared data!"

Yep. Wake me when an AI can tell me "hey, that monthly trending conversion report you asked me to pull...yeah, you're missing two days' worth of data from when tracking broke, so your numbers will look lower when rolled up monthly, and the gap will be hard to notice."

BTW, that is also the reason I not only set alerts, but review data at a daily level when pulling any rolled-up reports of significance.

Why can't AI tell you you're missing data?

For most use cases, you don't even need AI. Once you reach a certain scale, there are certainly ML-based solutions available: https://medium.com/netflix-techblog/rad-outlier-detection-on...
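As a rough illustration of the no-AI case, a completeness check over the daily data before rolling it up might look like the sketch below. The function `missing_days`, the dates, and the counts are all hypothetical, and this is not the approach from the linked Netflix post, just a plain gap check.

```python
from datetime import date, timedelta

def missing_days(daily, start, end):
    """Return every date in [start, end] that has no entry in `daily`."""
    span = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(span))
    return [d for d in all_days if d not in daily]

# Hypothetical daily conversion counts for June 2024;
# tracking broke on the 14th and 15th, so those days have no rows.
daily = {date(2024, 6, d): 50 for d in range(1, 31) if d not in (14, 15)}

gaps = missing_days(daily, date(2024, 6, 1), date(2024, 6, 30))
monthly_total = sum(daily.values())  # looks plausible, but is two days short

if gaps:
    print("Warning: no data for", [d.isoformat() for d in gaps])
print("Monthly total:", monthly_total)
```

This only catches days that are missing entirely. A day where tracking was partially broken still has a row, and spotting that needs some kind of outlier check against the neighboring days.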

That ship sailed somewhere around Excel 95...
