
Lessons Learned from Benchmarking Fast Machine Learning Algorithms - hoaphumanoid
https://blogs.technet.microsoft.com/machinelearning/2017/07/25/lessons-learned-benchmarking-fast-machine-learning-algorithms/
======
matt4077
The GPU versions are performing surprisingly badly. To even match CPU
performance, you need a training set in the tens of millions, and even far
beyond that, a doubling of speed seems to be the best you can hope for.

Compare that to, for example, TensorFlow, where it isn't uncommon to see a 10x
speedup even for moderately sized training sets.

(I say "surprising" in the sense that I'm surprised; I don't know the
algorithms used for decision trees, and it may well be that they are less
amendable to GPU-parallelization than the NN- and matrix algorithms I've
worked with)

~~~
alexbeloi
In decision trees, more than half of the training time is spent sorting
(sorting the examples at each node to find the optimal split for the feature
at that node), while in neural nets it's almost all matrix multiplies. That's
where the speedup difference in the CPU vs. GPU comparison comes from.
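
Roughly, the per-node work looks like this (a minimal NumPy sketch of exact
split finding for a single feature under squared-error loss, not LightGBM's
actual code; the function name and gain formula are just for illustration).
The argsort is the step that dominates, and it's repeated for every feature at
every node:

```python
import numpy as np

def best_split_for_feature(x, y):
    # Exact split finding for one feature at one node: sort, then scan
    # every candidate threshold. The sort is the expensive part.
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    total_sum, n = y_sorted.sum(), len(y_sorted)

    best_gain, best_threshold = 0.0, None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += y_sorted[i - 1]
        if x_sorted[i] == x_sorted[i - 1]:
            continue  # can't split between identical feature values
        right_sum = total_sum - left_sum
        # reduction in sum of squared errors for a regression tree
        gain = left_sum**2 / i + right_sum**2 / (n - i) - total_sum**2 / n
        if gain > best_gain:
            best_gain = gain
            best_threshold = (x_sorted[i] + x_sorted[i - 1]) / 2
    return best_gain, best_threshold
```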

~~~
yorwba
There are fast parallel sorting algorithms that should be able to take
advantage of GPUs. Maybe they didn't implement them?

~~~
hoaphumanoid
The GPU implementation of LightGBM is based on this paper:
[https://arxiv.org/abs/1706.08359](https://arxiv.org/abs/1706.08359). They use
several smart techniques to make the computation faster. One is building
per-feature histograms that are computed in parallel on the GPU.
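
The histogram idea itself looks roughly like this (a CPU-side sketch of the
concept only, assuming features have already been bucketed into 256 integer
bins; the paper's actual contribution is building these histograms efficiently
in GPU local memory, which this doesn't show). Binning replaces the per-node
sort with a scan over a fixed, small number of bins:

```python
import numpy as np

def best_split_from_histogram(binned_x, grad, n_bins=256):
    # One pass to accumulate gradient sums and counts per bin;
    # this is the part that parallelizes well across examples.
    grad_hist = np.zeros(n_bins)
    count_hist = np.zeros(n_bins)
    np.add.at(grad_hist, binned_x, grad)
    np.add.at(count_hist, binned_x, 1)

    total_grad, total_count = grad_hist.sum(), count_hist.sum()
    best_gain, best_bin = 0.0, None
    left_grad = left_count = 0.0
    for b in range(n_bins - 1):  # candidate split after each bin
        left_grad += grad_hist[b]
        left_count += count_hist[b]
        right_grad = total_grad - left_grad
        right_count = total_count - left_count
        if left_count == 0 or right_count == 0:
            continue
        gain = (left_grad**2 / left_count + right_grad**2 / right_count
                - total_grad**2 / total_count)
        if gain > best_gain:
            best_gain, best_bin = gain, b
    return best_gain, best_bin
```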

------
cropsieboss
It would be interesting if they compared training speed with CatBoost [0].

I remember seeing a paper where they managed to avoid getting stuck in a local
optimum in terms of the number of learners, so the more trees you add, the
better the result.

The logloss results seem to confirm there's a superior tree algorithm at work
in CatBoost.

[0]: [https://catboost.yandex/](https://catboost.yandex/)

~~~
shoo
From the CatBoost page there's also a link to this:

Fighting biases with dynamic boosting - Dorogush, Gulin, Gusev, Kazeev,
Prokhorenkova, Vorobev

[https://arxiv.org/pdf/1706.09516.pdf](https://arxiv.org/pdf/1706.09516.pdf)

> While gradient boosting algorithms are the workhorse of modern industrial
> machine learning and data science, all current implementations are
> susceptible to a non-trivial but damaging form of label leakage. It results
> in a systematic bias in pointwise gradient estimates that lead to reduced
> accuracy
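
The leakage they mean shows up in the plain boosting loop itself. Here's a
minimal sketch of vanilla gradient boosting with squared-error loss (not
CatBoost's fix, which the paper calls dynamic/ordered boosting): the residual
for example i is computed from an ensemble that was already fit on example i,
which is the systematic bias the abstract describes.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbm(X, y, n_trees=100, lr=0.1):
    F = np.zeros(len(y))        # current ensemble prediction
    trees = []
    for _ in range(n_trees):
        residuals = y - F       # pointwise gradients, estimated on the
                                # same data F was trained on (the leakage)
        tree = DecisionTreeRegressor(max_depth=3)
        tree.fit(X, residuals)
        F += lr * tree.predict(X)
        trees.append(tree)
    return trees
```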

~~~
cropsieboss
I was thinking of this one:
[https://arxiv.org/pdf/1706.01109.pdf](https://arxiv.org/pdf/1706.01109.pdf)

I see a github link in there
[https://github.com/arogozhnikov/infiniteboost](https://github.com/arogozhnikov/infiniteboost),
but it does not seem to be in CatBoost (as someone here pointed out, the
better logloss has to do with CatBoost's handling of categorical features).
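
For anyone who hasn't tried it, that categorical handling is exposed directly
in the API: you pass the categorical columns as-is via cat_features instead of
one-hot encoding them. A tiny hypothetical usage sketch (toy data, arbitrary
parameters, not anything from the article):

```python
import pandas as pd
from catboost import CatBoostClassifier

# Toy data with one categorical and one numeric column.
X = pd.DataFrame({
    "city":   ["nyc", "sf", "nyc", "sea", "sf", "sea"],
    "clicks": [3, 10, 1, 7, 2, 9],
})
y = [0, 1, 0, 1, 0, 1]

model = CatBoostClassifier(iterations=50, verbose=False)
model.fit(X, y, cat_features=["city"])   # categorical column passed as-is
print(model.predict_proba(X)[:, 1])
```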

------
nl
It's interesting that LightGBM was initially promoted as being more accurate
than XGB, but that claim always seemed marginal at best and was hard to
reproduce.

Other investigations show the same thing about training speed though, e.g.
[https://medium.com/implodinggradients/benchmarking-lightgbm-...](https://medium.com/implodinggradients/benchmarking-lightgbm-how-fast-is-lightgbm-vs-xgboost-15d224568031)

~~~
cropsieboss
I'd recommend rewording the last sentence, as it seems to imply that training
speed has the same "hard to reproduce" numbers, when in fact I believe you
meant to say that LightGBM is indeed faster.

~~~
nl
Yes, fair point. Edited, thanks.

Actually, that's very, very strange. The edit I made doesn't seem to be what
is above. I said something like " same (edit: that LightGBM is faster to train
than XGB) thing". I mean it's harmless, but very odd.

Did someone else edit my post? @dang ?

~~~
zitterbewegung
This is off topic, but I've been told that if you have questions about the
site you should email them. I think the contact email is at the bottom.

------
rdudekul
Seems to be a project under Microsoft's Distributed Machine Learning Toolkit
([http://www.dmtk.io/](http://www.dmtk.io/)).

~~~
hoaphumanoid
Yes, it was initially forked from an internal boosted trees library from
Microsoft. The guys who created it are Microsoft Research Asia

