
Machine Learning with Talend: Decision Trees - markgainor1
http://fr.talend.com/blog/2016/09/29/machine-learning-made-easy-with-talend-%E2%80%93-decision-trees
======
lqdc13
It's pretty easy to visualize decision trees, and there are already usable
open source tools for that.

It would be way cooler if they could visualize a random forest, which can have
thousands of these trees. That is what's actually used more frequently in
practice.

~~~
huac
Yeah, decision trees really shouldn't be used in practice. The variance of any
single decision tree is too high to give you reasonable results.

See also [http://scott.fortmann-roe.com/docs/BiasVariance.html](http://scott.fortmann-roe.com/docs/BiasVariance.html)
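
The variance point can be checked with a small simulation. The sketch below is
not taken from the linked article; it is a minimal pure-Python illustration
that assumes a one-dimensional regression problem (noisy samples of a sine
curve, an arbitrary choice), fully grown trees, and plain bagging rather than a
complete random forest (with a single feature there is no feature subsampling
to do). All function names and parameter values are made up for the example.

```python
import math
import random
import statistics


def sse(ys):
    """Sum of squared errors of ys around their mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def fit_tree(points):
    """Grow a full 1-D regression tree on (x, y) pairs.

    A leaf is a float (mean target); an internal node is a tuple
    (threshold, left_subtree, right_subtree).
    """
    if len({x for x, _ in points}) == 1:
        ys = [y for _, y in points]
        return sum(ys) / len(ys)
    pts = sorted(points)
    best = None
    for i in range(1, len(pts)):
        if pts[i - 1][0] == pts[i][0]:
            continue  # cannot split between identical x values
        cost = sse([y for _, y in pts[:i]]) + sse([y for _, y in pts[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    i = best[1]
    threshold = (pts[i - 1][0] + pts[i][0]) / 2
    return (threshold, fit_tree(pts[:i]), fit_tree(pts[i:]))


def predict(tree, x):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def sample_data(rng, n):
    """Draw n noisy samples of sin(2*pi*x); target and noise level are assumptions."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, math.sin(2 * math.pi * x) + rng.gauss(0, 0.5)))
    return data


def prediction_variance(trials=100, n=25, n_trees=15, x0=0.5, seed=0):
    """Variance of the prediction at x0 across fresh training sets,
    for a single tree and for an average of bootstrap-trained trees."""
    rng = random.Random(seed)
    single, bagged = [], []
    for _ in range(trials):
        data = sample_data(rng, n)
        single.append(predict(fit_tree(data), x0))
        votes = []
        for _ in range(n_trees):
            boot = [rng.choice(data) for _ in data]  # bootstrap resample
            votes.append(predict(fit_tree(boot), x0))
        bagged.append(sum(votes) / n_trees)
    return statistics.pvariance(single), statistics.pvariance(bagged)


if __name__ == "__main__":
    var_single, var_bagged = prediction_variance()
    print(f"single tree variance: {var_single:.3f}")
    print(f"bagged trees variance: {var_bagged:.3f}")
```

On this setup the averaged prediction should vary noticeably less from one
training set to the next than the single tree's prediction does, which is the
effect being described.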

~~~
binalpatel
They have their place - if you're more interested in inference than in
prediction they can be useful tools.

~~~
huac
Do you have a source for this? Pretty curious to see how that would play out.

~~~
binalpatel
No formal sources, unfortunately. It's similar to using a linear regression
model trained on historical data for inference (with some
cross-validation/penalization to make sure we're not just finding signals in
the noise).

We know for sure that there's a ton of bias (in the bias-variance sense), and
that the predictions won't be nearly as good as a black-box model's, but we may
glean some useful insights out of it.

Same idea with decision trees. In the past I've used them to find which factors
influence customer support scores on online tickets, and we found some
surprising and useful insights that were used for process improvements.
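
As a rough illustration of that kind of use, the sketch below ranks candidate
factors by how much a single tree split on each one would reduce the spread of
the score, which is how a regression tree picks its first split. The ticket
records, the feature names (`reopened`, `weekend`, `slow_first_reply`), and the
scores are all invented for the example; they are not the data described
above.

```python
def sse(values):
    """Sum of squared errors of values around their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_gain(rows, feature, target):
    """Reduction in squared error from splitting rows on a boolean feature."""
    yes = [r[target] for r in rows if r[feature]]
    no = [r[target] for r in rows if not r[feature]]
    if not yes or not no:
        return 0.0  # the feature does not separate the rows at all
    return sse([r[target] for r in rows]) - sse(yes) - sse(no)


def rank_features(rows, features, target):
    """Order features by split gain, largest first."""
    return sorted(features, key=lambda f: split_gain(rows, f, target), reverse=True)


# Hypothetical tickets: made-up values, for illustration only.
TICKETS = [
    {"reopened": False, "weekend": False, "slow_first_reply": False, "score": 5},
    {"reopened": False, "weekend": True, "slow_first_reply": False, "score": 5},
    {"reopened": False, "weekend": False, "slow_first_reply": True, "score": 4},
    {"reopened": False, "weekend": True, "slow_first_reply": True, "score": 4},
    {"reopened": True, "weekend": False, "slow_first_reply": False, "score": 2},
    {"reopened": True, "weekend": True, "slow_first_reply": False, "score": 2},
    {"reopened": True, "weekend": False, "slow_first_reply": True, "score": 1},
    {"reopened": True, "weekend": True, "slow_first_reply": True, "score": 1},
]
FEATURES = ["reopened", "weekend", "slow_first_reply"]

if __name__ == "__main__":
    for name in rank_features(TICKETS, FEATURES, "score"):
        print(name, split_gain(TICKETS, name, "score"))
```

The ranking is only a starting point for asking questions about the process. A
large gain shows that a factor is associated with the score in this data, not
that it causes it.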

------
JPKab
Talend lost me when they started making you pay for their software.

Not because I'm opposed to paying for software, but because Talend isn't that
good. If I want to use paid software, I want it to be super refined and
usable. Talend feels very open source.

