
Do you mind if I ask where you work? I find most people practising "machine learning" have barely gotten past k-means, naive Bayes, and SVMs, with very little actual understanding of how or why they work and where they fall apart.



I run my own data consulting firm based in Asia.

Originally, we were going to build high-dimensional statistical learning systems crunching large datasets on GPUs in Haskell.

90% of clients cannot feed me the data, so I end up building them a data warehouse first; in some cases we even redo their data model. 100% of clients have datasets too small to justify getting out of R and single CPUs, so our Haskell ML libraries are, so far, on the wishlist only (I think Tweag has had more success there, but I still have feeble hope that one day...). Hence the RM focus as well as ML. The ML side is sort of picking up this month.

That being said, I'm not actually that well versed in ML. Just the guy running the company. I've built (and read) just enough to know how it works, how it applies to clients and who to hire to get the job done properly...


I think what may be tripping people up here is your use of the term "machine learning". Personally, I think the way you're using it is how it should actually be used: as applied statistics. However, a lot of people think of machine learning as a collection of algorithms they can use to make predictions about some given dataset: use sklearn, call model.fit(xtrain, ytrain), then model.score(xtest, ytest). I personally blame Coursera and the online courses for this trend.
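
For anyone who hasn't seen the pattern being described, here is a rough sketch of the fit-on-train, evaluate-on-test workflow. To keep it dependency-free it does not use sklearn itself: `ToyNearestCentroid` and `make_blobs` are hypothetical stand-ins written for this example, and they only mimic the `fit` / `predict` / `score` convention that sklearn estimators expose. The data is synthetic.

```python
import random


class ToyNearestCentroid:
    """Toy classifier with an sklearn-style interface (hypothetical, not sklearn's class)."""

    def fit(self, X, y):
        # Average the feature vectors of each class to get one centroid per label.
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids_ = {
            label: [v / counts[label] for v in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, X):
        # Assign each row to the label whose centroid is closest (squared Euclidean distance).
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda label: dist2(row, self.centroids_[label]))
            for row in X
        ]

    def score(self, X, y):
        # Fraction of rows whose predicted label matches the true label.
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)


def make_blobs(n, seed=0):
    # Synthetic two-class data: Gaussian clouds centred at (0, 0) and (3, 3).
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 3.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_blobs(200)
xtrain, ytrain = X[:150], y[:150]
xtest, ytest = X[150:], y[150:]

model = ToyNearestCentroid().fit(xtrain, ytrain)
accuracy = model.score(xtest, ytest)
print(f"held-out accuracy: {accuracy:.2f}")
```

The two calls at the bottom are the whole "workflow"; everything above them is the part that tends to get skipped, which is roughly the complaint.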


Do you ever have the impression that the model you are going to apply influences how you clean the data? What happens then when you redo the model?



