
Logistic Regression by Discretizing Continuous Variables via Gradient Boosting - vincent_123
https://cdn.rawgit.com/rnorm/dblr_introduction/2aced373/index.html
======
bitL
A question about the opposite problem - is there a way to do this (and deep
learning in general) on discrete domains? So far everything I've seen assumes
continuous functions so that back-propagation can be performed; I haven't seen
anyone using discrete calculus with rules analogous to the continuous one (see
Graham/Knuth/Patashnik). That could open up many more interesting applications...

~~~
mcintyre1994
Can you use the typical approaches to classification? You can define a
continuous error function and perform back-propagation with it. If you look at
something like Kaggle, deep learning approaches tend to dominate
classification challenges just as much as they do regression ones.

~~~
bitL
That's the usual approach, and it has its limits. I am specifically curious
about discrete domains. Look at it like mixed-integer programming - yes, you
can estimate a solution using the linear-programming relaxation, but that
estimate is usually useless. A method designed specifically for mixed-integer
programming usually yields far better solutions.

------
phunge
In cases where you need to interpret the resulting model, I've been advised
not to bin (for example:
[http://biostat.mc.vanderbilt.edu/wiki/Main/CatContinuous](http://biostat.mc.vanderbilt.edu/wiki/Main/CatContinuous)).
Other alternatives are splines or generalized additive models.

~~~
vincent_123
Thank you for the comment and the link. I agree with most of the points listed
there. A GAM is a great tool when there is a non-linear, non-monotonic
relation between the response and the independent variables. GAMs have good
interpretability, but they are still somewhat difficult to explain in some
business environments. In credit scoring, for example, logistic regression
with binning is still widely applied.

~~~
closed
In my experience, most of the time people use binning, it's straightforward to
demonstrate that their binning+model is equivalent to a restricted form of a
more general model (e.g. a generalized additive or structural equation model).
Sometimes binning is useful because it makes those models much easier to
estimate.

However, people's rationale for binning is often that it makes the model
better / more interpretable, without actually testing the more restricted
binned model against the more general one. There's certainly something to be
said for knowing your audience when choosing a model, though :).

------
hyperbovine
Logistic regression is in fact what you get by discretizing a continuous
variable with logistically distributed errors. This is the "threshold model".
If you assume normal errors instead of logistic, you get the probit model.
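A quick simulation of that threshold model (synthetic data; the true coefficient of 2.0 is an arbitrary choice): generate a latent continuous outcome with logistic errors, observe only whether it crosses zero, and fit a logistic regression - the estimated coefficient should land near the latent one.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 50_000
x = rng.normal(size=(n, 1))

# Latent continuous outcome with logistically distributed errors.
y_star = 2.0 * x[:, 0] + rng.logistic(size=n)

# We only observe whether the latent variable crosses the threshold 0.
y = (y_star > 0).astype(int)

# Logistic regression (weak regularization) on the discretized outcome
# should recover a coefficient close to the latent 2.0.
fit = LogisticRegression(C=1e6, max_iter=1000).fit(x, y)
```

Swapping `rng.logistic` for `rng.normal` in the latent error would instead match a probit fit, per the comment.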

