
Introduction to Boosted Trees [pdf] - kercker
http://homes.cs.washington.edu/~tqchen/data/pdf/BoostedTree.pdf
======
walrus
Same content in prose rather than slides:
[https://xgboost.readthedocs.org/en/latest/model.html](https://xgboost.readthedocs.org/en/latest/model.html)

------
nl
One interesting thing about Boosted Trees is that the author's software
(XGBoost[1]) reliably outperforms other implementations in terms of accuracy
of results[2]. I'm not entirely sure why this is; I know there is an open
ticket against the Spark GBT implementation to investigate it.

[1] [https://github.com/tqchen/xgboost](https://github.com/tqchen/xgboost)

[2] It's also very fast in terms of absolute speed.

------
shoo
It's worth checking out Friedman's "Greedy Function Approximation: A Gradient
Boosting Machine" paper from 1999 (it is mentioned in the references here).
It has a good description of "boosting" from the general perspective of
function optimisation.

Here's a copy: [pdf]
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31....](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.869&rep=rep1&type=pdf)
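
As a rough illustration of that function-optimisation view (my own sketch, not
code from the paper or from XGBoost): with squared-error loss, the negative
gradient of the loss with respect to the current prediction is just the
residual, so each round fits a small tree to the residuals and adds a shrunken
copy of it to the ensemble. The names `fit_stump`, `boost`, `predict` and
`mse`, the 1-D inputs, the depth-1 trees and the learning rate are all
assumptions made for the example.

```python
# Sketch of gradient boosting as gradient descent in function space.
# Assumptions (for illustration only): squared-error loss, 1-D inputs,
# depth-1 regression trees ("stumps"), and a fixed learning rate.


def fit_stump(xs, targets):
    """Least-squares stump: returns (threshold, left_value, right_value)."""
    best = None
    # Skip the largest x: splitting there would leave the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # every x is identical: fall back to a constant
        mean = sum(targets) / len(targets)
        return (xs[0], mean, mean)
    return best[1:]


def stump_value(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Return (base, learning_rate, stumps) fitted by additive training."""
    base = sum(ys) / len(ys)  # the best constant under squared error
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For L = (y - F)^2 / 2 the negative gradient w.r.t. F is y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a small step in function space along the fitted direction.
        preds = [p + learning_rate * stump_value(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [x * x for x in xs]
    for rounds in (0, 1, 5, 20):
        print(rounds, mse(boost(xs, ys, n_rounds=rounds), xs, ys))
```

On a toy data set like this the training error should shrink as rounds are
added. Real implementations add regularisation, deeper trees and other losses
on top of this basic loop.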

------
sysreader2016
I haven't read much about XGBoost boosted trees. Does each tree have additive
independence? Is the tree ensemble of two trees better than one tree?

It seems like additive training that removes all constants in addition to
regularization of model complexity would shape the tree ensemble into a
baseline model that defines minimum assumptions. So, what's its success rate
in predicting favorable outcomes vs. tree learning focused on heuristic
specialization (impurity)?

------
bnjmn
In the first example, being male is one of two features that predict playing
video games, and (surprise!) only the boy and the old man are classified as
gamers. Talk about casual sexism! Can you imagine taking this class as a woman
(who maybe, just maybe, happens to enjoy video games) and having to
forgive/ignore the instructor's cluelessness in order to get through the
material? So incredibly tone-deaf and lazy, ugh.

~~~
walrus
If it actually bothers you, then do something about it.

The author of these slides provided two versions of this material, one in
prose (linked from my other comment in this thread) and these slides. Send a
pull request to the XGBoost GitHub repo to update the prose version[1]. Once
you're done with that, send an email to the author with the new images,
_non-abrasively_ explain why you're providing them, and _politely_ ask the author
to replace the slides on his website. Be sure to point out which slides need
to be updated (8, 9, 24, 25, 28, 31) and which images go with which slides.

It's possible your email will get ignored (PhD students are busy and don't
necessarily have time to maintain two-year-old teaching material), but I'm
almost certain the pull request will be merged.

[1]
[https://github.com/dmlc/xgboost/blob/master/doc/model.md](https://github.com/dmlc/xgboost/blob/master/doc/model.md)

PS: If you don't do this, I'll call you tone-deaf and lazy. ;)

~~~
bnjmn
It actually does bother me, and I wish I had the time to fix it, but fixing it
involves much more than the comically large amount of work you're asking of
me: it means convincing this PhD student that he made a genuine mistake that
betrays his sexism, and getting him to acknowledge that mistake by fixing his
own slides.

I wish the instructor had been more aware of his own sexism when he created this
material, so that neither he nor I nor anyone would have to fix this problem
now. He chose this careless example, and so it's his responsibility to fix it,
no matter how busy he may be as a PhD student. We're all busy, but these
slides have his name on them, and demonstrating that he knows better now is
something only he can do.

~~~
Joof
Well, I could do it, change it to a more neutral example, then take a
screenshot of your post and merge it to the repo. This would take maybe 2
hours (but I am familiar with git).

Fixing it would go a long way toward proving that you are sincere about the
problem, and it isn't a hell of a lot of work. Going that far to fix it is more
likely to convince him that it actually matters than anything else you could do.

