Well, that's because ML isn't really software engineering. Unfortunately, it is also software engineering, as I often have to remind my colleagues coming from the algebra / econometrics / statistics side, who are happy to shove in all kinds of horrible code.

What I've found in reality is that machine learning is 99% data cleaning scripts and 1% the part you're talking about. I've also seen the heavy-duty statistics people writing data cleaning Python scripts, which probably leads to a lot of frustration :)

I think what may be understated here is that while it’s true that ML is mostly data cleaning, data cleaning is not easy. There are a million little decisions to make, and it’s rarely clear which ones are most effective. Experimenting with various techniques is great, but the iteration time and cost are usually too high to try more than a small handful of approaches.
