
Forecasting in Python with Prophet - pplonski86
https://mode.com/example-gallery/forecasting_prophet_python_cookbook/
======
Tarq0n
A large forecasting competition called M4 [1] recently published their
results. If you're interested in forecasting I suggest checking out their
summary paper. [2]

Highlights include:

* Pure ML methods are still not competitive with statistical models;

* Ensembles perform better than any single model, an important difference from the last competition;

* Very simple benchmarks can perform very well on this type of competition.

The top 3 included multiple statistical models feeding into an RNN (by an Uber
engineer), another ensemble using XGboost for the final layer and a
combination of just statistical methods with a clever weighting scheme.

If you're interested in making production-level predictions, it's probably a
good idea to ensemble prophet with other methods.

[1]
[https://en.wikipedia.org/wiki/Makridakis_Competitions](https://en.wikipedia.org/wiki/Makridakis_Competitions)

[2] [https://www.scribd.com/document/382185710/IJF-Published-M4-Paper](https://www.scribd.com/document/382185710/IJF-Published-M4-Paper)

~~~
wenc
Those highlights match my field experience.

I've found that ensembles aren't necessarily more accurate for any individual
forecast per se, but in terms of aggregate error over the long run, they end
up being less wrong. (bias-variance tradeoff)
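For illustration, here's a minimal sketch of that kind of forecast ensembling with numpy. The model names, forecast values, and weights are all invented; in practice the member forecasts would come from Prophet, an ARIMA-family model, and so on:

```python
import numpy as np

def ensemble_forecast(forecasts, weights=None):
    """Combine point forecasts from several models by (weighted) averaging.

    forecasts: list of equal-length 1-D arrays, one per model.
    weights:   optional per-model weights; defaults to a simple mean.
    """
    stacked = np.vstack(forecasts)
    if weights is None:
        weights = np.full(stacked.shape[0], 1.0 / stacked.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()   # normalize so weights sum to 1
    return weights @ stacked            # weighted average at each horizon step

# Hypothetical point forecasts for a 4-step horizon:
prophet_fc = np.array([100.0, 105.0, 110.0, 115.0])
arima_fc   = np.array([ 98.0, 104.0, 112.0, 118.0])
ets_fc     = np.array([102.0, 106.0, 109.0, 114.0])

combined = ensemble_forecast([prophet_fc, arima_fc, ets_fc])
```

Equal weights are the simplest choice; the M4 winners used cleverer weighting schemes, but even a plain average often beats each member model on aggregate error.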

------
freeradical13
For those looking for a straightforward introduction to forecasting, see the
free e-book, _Forecasting: Principles and Practice_ by Hyndman and
Athanasopoulos: [https://otexts.com/fpp2/](https://otexts.com/fpp2/)

~~~
muraiki
That's an excellent resource that I've used heavily (along with their awesome
forecast package for R). I also found QuantStart's series on Time Series
Analysis (scroll down a bit:
[https://www.quantstart.com/articles](https://www.quantstart.com/articles))
quite helpful, and it gets a little deeper into the math so I had a better
understanding of why certain things are done.

------
andrewljohnson
Is Prophet useful for forecasting subscription revenue?

Say you have multiple product lines like iOS-1month, iOS-1year,
iOS-1year-premium, Android-1month, web-5years, etc. And you have historical numbers for
new users and retention for each product line.

I looked at using Prophet for this, but instead I made a giant spreadsheet and
modeled my assumptions around seasonality, retention, and new user growth. I
wasn't totally sure, but it seemed like Prophet wasn't intended for my use
case, where my assumptions can be stated very explicitly, and I want to
twiddle the knobs to look at various scenarios.
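For what it's worth, that spreadsheet-style model is easy to sketch in code as well. Everything below (the inflow, retention, and price numbers) is hypothetical; the point is that each assumption is an explicit knob you can twiddle per product line:

```python
# A spreadsheet-style scenario model: each product line has explicit
# assumptions for new users, monthly retention, and price.

def project_revenue(new_users_per_month, retention, price, months):
    """Project monthly subscription revenue for one product line.

    new_users_per_month: new subscribers added each month
    retention: fraction of subscribers kept month over month
    price: revenue per active subscriber per month
    months: projection horizon
    """
    active = 0.0
    revenue = []
    for _ in range(months):
        active = active * retention + new_users_per_month
        revenue.append(active * price)
    return revenue

# Twiddle the knobs: compare a base case to a higher-retention scenario.
base = project_revenue(new_users_per_month=1000, retention=0.80,
                       price=5.0, months=12)
optimistic = project_revenue(new_users_per_month=1000, retention=0.90,
                             price=5.0, months=12)
```

Seasonality could be layered on by scaling `new_users_per_month` with a per-month factor; that's where something like Prophet's fitted seasonality could feed in, even if the rest of the model stays hand-built.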

~~~
pplonski86
It should be useful for forecasting subscription revenue. I can help you with
this for free! I'm building an autoML solution, and this is an interesting use
case for me. Please email me if you'd like help with it.

------
maliker
We've enjoyed using Prophet for forecasting in our work with electric
utilities. It's pretty fire-and-forget. We wish it had an option for fast
retraining of the model when we get new hourly data, though. That's pushing us
towards other options in scipy.

~~~
wenc
I looked at Prophet a few months ago because we needed a fire-and-forget
library similar to 'auto.arima' (R "forecast" package) for Python, but no good
candidates existed.

However I found Prophet to be computationally a little heavier than auto.arima
because it uses "stan" (Bayesian) underneath, which in turn uses an MCMC type
approach and has quite a few dependencies. We needed fast model retraining as
well, and at the time, it didn't seem like that was something it excelled at.
(might have changed, I'm not sure)

I ended up putting together a simple ensemble forecast model class with
"statsmodels" which automatically selected/averaged the best models over a
collection of model types via heuristics and cross-validation. It works ok,
but I'm still waiting for someone to port the R auto.arima over to Python. (I
tried rpy, which in theory should have worked, but I struggled with the
impedance mismatch)
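As a toy illustration of that select-and-average idea (not the actual statsmodels-based class described above), here is the same logic using only simple benchmark models, ranked by error on a holdout window:

```python
import numpy as np

def naive(train, h):       # repeat the last observed value
    return np.full(h, train[-1])

def mean_model(train, h):  # forecast the historical mean
    return np.full(h, train.mean())

def drift(train, h):       # extrapolate the average historical slope
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return train[-1] + slope * np.arange(1, h + 1)

def select_and_average(series, h, holdout=4, keep=2):
    """Rank candidate models by MAE on a holdout window, then
    average the forecasts of the best `keep` models, refit on the
    full series."""
    train, valid = series[:-holdout], series[-holdout:]
    candidates = [naive, mean_model, drift]
    scored = sorted(candidates,
                    key=lambda m: np.mean(np.abs(m(train, holdout) - valid)))
    best = scored[:keep]
    return np.mean([m(series, h) for m in best], axis=0)

series = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0])
fc = select_and_average(series, h=3)
```

The real version would swap in statsmodels estimators (ARIMA, exponential smoothing, etc.) for the benchmark functions, but the selection/averaging scaffolding is the same.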

~~~
venuur
I found this port of auto.arima for Python. I haven’t used it in production,
but it was easy to test on some demo data.
[https://pypi.org/project/pyramid-arima/](https://pypi.org/project/pyramid-arima/)

~~~
wenc
Thanks, I'll take a look.

------
minimaxir
It's been a while since my undergrad stats classes, but doesn't using a
Box-Cox transform on an objective metric, fitting the dataset to the
transformed metric, and then inverting it violate model assumptions? (or
maybe that's just in the case of a linear regression and not Prophet's
approach)

~~~
tfehring
It just establishes new model assumptions. In the linear regression case,
fitting boxcox(y) = βX + ε and predicting with boxcox^(-1)(yhat) produces
different predictions than fitting y = βX + ε (minimizing Σ(boxcox(y) -
boxcox(yhat))^2 does not minimize Σ(y - yhat)^2 in general). Either model
would be an approximation, of course, and it's possible for either to produce
more accurate predictions than the other for a given set of observations.

I'm not familiar enough with Prophet to know whether the same logic applies
here, though I'd hazard a guess that it does.
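A quick numerical illustration of that point, using log(y) (Box-Cox with lambda = 0) and a straight-line fit on synthetic data:

```python
import numpy as np

# Regressing on a transformed target and inverting is a *different* model
# than regressing on the raw target: the two objectives are not the same.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 50)
y = np.exp(0.3 * x) * rng.lognormal(mean=0.0, sigma=0.2, size=x.size)

# Model A: least squares straight line on the raw scale.
raw_coef = np.polyfit(x, y, 1)
yhat_raw = np.polyval(raw_coef, x)

# Model B: least squares straight line on log(y), then invert the transform.
log_coef = np.polyfit(x, np.log(y), 1)
yhat_log = np.exp(np.polyval(log_coef, x))

# The prediction curves differ: minimizing squared error on the log scale
# is not the same objective as minimizing it on the raw scale.
max_gap = np.max(np.abs(yhat_raw - yhat_log))
```

Here model B's predictions are also guaranteed positive after inversion, which is often the more sensible assumption for a strictly positive metric.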

~~~
yorwba
Right. In particular, using least squares is justified by the assumption that
errors are normally distributed, in which case least squares yields the
maximum-likelihood estimate. Because boxcox is intended to transform a random
variable into one that is normally distributed, the model assumptions are
actually more likely to be satisfied if you do regression on the transformed
values. (Which was probably the reason boxcox was invented in the first
place.)

