
Mean Variance Optimization - karamazov
https://www.datanitro.com/blog/2012/08/07/Mean-Variance-Optimization/
======
joe_the_user
Interesting stuff...

A further point to consider, if we're talking about real-world applications, is
that it is not actually established that markets have finite variance -
seriously.

In the 1960s, Benoit Mandelbrot began his research into chaos and fractals by
looking at markets and finding that non-Gaussian, Lévy-stable distributions
modeled changes in the market best [1]. These Lévy-stable distributions don't
generally have a finite variance and sometimes don't have a finite mean [2].

And it is fairly easy to see how a market tends not to be Gaussian: change
based on a Gaussian distribution tends to be a random walk à la Brownian
motion, where the final position of a variable is the sum of many small
changes. Movement based on a non-Gaussian, infinite-variance distribution, on
the other hand, has the property that the final position tends to be the
result of a small number of large changes rather than a lot of small ones. And
this is what the stock market often looks like: a few wild moves often impact
things as much as all the incremental changes combined. The apparent mean,
variance, and distribution of stocks on a day-to-day basis may not pan out in
extreme situations, and those situations can eat away the rest of your
profits. If the stocks that seemed independent in normal conditions all go
down in a crash, your estimated-correlation-based diversification hasn't
protected you very well.
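
To make the contrast concrete, here's a rough simulation sketch (Python, using
SciPy's levy_stable; alpha = 1.5 is an arbitrary illustrative choice, not a
fitted value) comparing how much of the total movement comes from the handful
of largest moves:

    import numpy as np
    from scipy.stats import levy_stable

    np.random.seed(0)
    n = 10000
    gauss = np.random.normal(size=n)          # Gaussian "daily moves"
    stable = levy_stable.rvs(1.5, 0, size=n)  # heavy-tailed stable moves

    # Share of total absolute movement contributed by the 10 largest moves:
    for name, x in [("gaussian", gauss), ("levy-stable", stable)]:
        top10 = np.sort(np.abs(x))[-10:].sum()
        print(name, top10 / np.abs(x).sum())

For the Gaussian draw the top ten moves are a fraction of a percent of the
total; for the stable draw they typically account for a double-digit share.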

The Black Swan is a sadly too-simplified popular summary of these points [3],
but it does point to the general idea. The higher-level take-away is that
infinite-variance distributions exist and, indeed, you cannot a priori assume
that a given distribution you are working with isn't one.

[1] [http://books.google.com/books?id=6KGSYANlwHAC&lpg=PP1...](http://books.google.com/books?id=6KGSYANlwHAC&lpg=PP1&ots=yULs5p13Uo&dq=Benoit%20Mandelbrot%20fractal%20and%20scaling%20in%20finance&pg=PP1#v=onepage&q=Benoit%20Mandelbrot%20fractal%20and%20scaling%20in%20finance&f=false)
[2] <http://en.wikipedia.org/wiki/L%C3%A9vy_distribution>
[3] <http://en.wikipedia.org/wiki/The_Black_Swan_(Taleb_book)>

~~~
photon137
Actually, financial markets have employed non-Gaussian Lévy processes in
modelling derivatives for a long time (a Lévy process is a bit different from
the Lévy distribution, I agree - but nothing stops the processes from having
non-finite moments).

For example, a very widely used process for modelling information-driven
timeseries (like stock returns) is the jump-diffusion model, where the
diffusion component is a Brownian motion while the jump arrivals are driven by
a Poisson process.
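
For anyone curious, a minimal Euler-style sketch of such a path
(Merton-flavoured: Poisson jump arrivals, normally distributed jump sizes in
log-space; all parameters are illustrative, not calibrated):

    import numpy as np

    np.random.seed(0)
    steps, dt = 252, 1.0 / 252
    mu, sigma = 0.05, 0.2                         # drift and diffusion vol
    lam, jump_mu, jump_sigma = 3.0, -0.05, 0.1    # jump intensity and size params

    log_s = np.zeros(steps + 1)
    for t in range(steps):
        diffusion = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * np.random.normal()
        n_jumps = np.random.poisson(lam * dt)     # Poisson-driven jump arrivals
        jumps = np.random.normal(jump_mu, jump_sigma, n_jumps).sum()
        log_s[t + 1] = log_s[t] + diffusion + jumps
    price = 100 * np.exp(log_s)                   # one simulated price path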

The underlying volatility is often modelled using a different process - e.g.
the SABR model, the Heston model etc.

There are similar cases for interest-rate processes (Hull-White/BDT etc) which
have to satisfy conditions of mean-reversion and no-arbitrage across the yield
curve.
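
The mean-reversion mechanic, in a minimal sketch (Vasicek-style dynamics, i.e.
the constant-parameter special case of Hull-White; in Hull-White proper the
long-run level becomes time-dependent and is calibrated to the yield curve to
enforce no-arbitrage; numbers here are illustrative):

    import numpy as np

    np.random.seed(0)
    a, theta, sigma = 0.1, 0.03, 0.01    # reversion speed, long-run level, vol
    r, dt = 0.01, 1.0 / 252
    path = [r]
    for _ in range(252 * 10):
        # the rate is pulled back toward theta at speed a (mean reversion)
        r += a * (theta - r) * dt + sigma * np.sqrt(dt) * np.random.normal()
        path.append(r)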

See, it's not as if we don't realize that the underlying processes are
mathematically inadequate to fully explain all market movements. But for a
model to be useful, it has to satisfy two conditions:

1\. Be able to produce a non-arbitrageable "mid" price for making a market
(e.g. if someone asks a trader to quote the bid-ask for an option).

2\. Be able to reproduce the current market prices of an asset/its
derivatives. This requires model calibration.

Models are chosen based on how easily and how fast they can satisfy 1 and 2.

Do they produce risk numbers that are believable? Probably yes, if the model
has been calibrated and has been tested against out-of-sample inputs.

Do they guard adequately against event-risk (the thing you would try to
capture using your "infinite-variance" distributions)? Probably not - but then
again, nothing does. How would you go about calibrating the Lévy distribution
so that the sampling process can explain currently tradeable market prices?
Would it be a "do once, leave forever" calibration? Or would it change from
day to day (i.e. "local" calibration)?

~~~
joe_the_user
Well, more interesting stuff. My Googling yielded woefully incomplete
references for your keywords - so what does "a long time" mean here? Some
pointers would be useful; I'm interested, help me out.

I know there was a book in the early 2000s giving a long reworking of
Black-Scholes for Lévy-stable distributions.

Of course, while you've described the sophisticated approach, from
Black-Scholes to the Gaussian copula to the _OP_ , the unsophisticated
approach still has a lot of traction.

If we're ranging across all our interests in the market, I'd offer what I
might hope would be the Minsky comment: _"the fault, dear Brutus, lies not
with our models but in ourselves"_. Gaussian models reappear because they
have "predictiveness". The problematic sides of the more sophisticated models
are tolerated for the same reason. Greed always tempts us to irrationally
jump from what Keynes called _uncertainty_ to mere probability.

Btw, what do you think of Doug Noland of prudentbear.com?

------
crntaylor
I'm an analyst for a quantitative hedge fund. Please, _please_ , everyone:
promise me never to base your investment decisions on this discredited form
of mean-variance optimization.

This method of stock selection was invented by Harry Markowitz in 1952. In the
intervening sixty (!) years we have accumulated overwhelming evidence that
plain-vanilla mean-variance optimization doesn't work. Among its many flaws:

1\. It makes unrealistic assumptions about the distribution of returns (i.e.
that they are multivariate normal, when it is well known that returns exhibit
heavy tails, time-varying volatility, fluctuating correlations etc etc).

2\. It relies on you having good estimates of the expected annual return of
individual stocks. How do you propose to get these? Don't say you'll use
historical measurements, unless you really believe that last year's return is
a good predictor of this year's return (it's not, except perhaps in some
sectors, and even then it's difficult to measure and you'd be subject to crash
risk).

3\. The optimization procedure is error-maximizing. That is, even if returns
_were_ multivariate normal _and_ you had a reliable way to measure the
expected return on stocks, you'd still have errors in your covariance matrix,
and these errors are amplified by the optimization procedure (see the sketch
after the next paragraph). You can see this in the article, where the
"optimum" portfolio recommends putting 75% of your portfolio in MSFT and
shorting AMZN and AAPL. Does anyone really believe that's sensible? Does
anyone believe that such a portfolio is diversified?

The problem is that your model of stock returns is subject to massive
overfitting. Let's say you have data for the last 10 years (i.e. about 2500
days). If there are N stocks in your portfolio, you need N(N+1)/2 pieces of
information to specify the covariance matrix, which puts an upper limit of 70
stocks on your universe (since 70 * 71 / 2 ~ 2500). A good rule of thumb is
that you should have 10 observations per free parameter, which cuts that
number down to 22 stocks (since 22 * 23 / 2 ~ 250, a tenth of your
observations). I think that most portfolios consisting of 22 single-name
stocks aren't sufficiently diversified (and you'll still be subject to the
first two problems above).
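
Here's the sketch promised above - a toy NumPy illustration (a made-up
constant-correlation "true" market, purely for demonstration) of how sampling
error alone whipsaws the "optimal" weights:

    import numpy as np

    np.random.seed(0)
    n, t = 20, 250                                  # 20 stocks, ~1 year of days
    rho, vol = 0.3, 0.02
    true_cov = vol**2 * ((1 - rho) * np.eye(n) + rho)
    L = np.linalg.cholesky(true_cov)
    mu = np.full(n, 0.0005)                         # identical expected returns

    def mv_weights(cov, mu):
        w = np.linalg.solve(cov, mu)                # unconstrained MV ~ cov^-1 mu
        return w / w.sum()

    # Two independent one-year samples from the *same* true distribution:
    for _ in range(2):
        returns = (L @ np.random.normal(size=(n, t))).T
        print(mv_weights(np.cov(returns, rowvar=False), mu).round(2))

The true optimum is equal weights (5% each); the two estimated portfolios
disagree substantially with each other, which is the error-maximization
problem in miniature.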

In 2012, _no one_ should be using plain mean-variance optimization to select
stocks. At the very least, shrink the covariance matrix toward some sensible
prior (e.g. constant correlations, sector correlations, or a factor model),
then backtest your strategy over the past 10-20 years and look at the annual
volatility, the size and length of drawdowns, skewness, and the information
ratio.
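
A minimal sketch of that kind of shrinkage, toward a constant-correlation
target (the fixed delta here is illustrative - a Ledoit-Wolf-style estimator
would pick the intensity from the data):

    import numpy as np

    def shrink_constant_corr(returns, delta=0.3):
        S = np.cov(returns, rowvar=False)            # sample covariance (T x N input)
        std = np.sqrt(np.diag(S))
        corr = S / np.outer(std, std)
        n = S.shape[0]
        rho_bar = (corr.sum() - n) / (n * (n - 1))   # average off-diagonal correlation
        F = rho_bar * np.outer(std, std)             # constant-correlation target...
        np.fill_diagonal(F, std**2)                  # ...keeping the sample variances
        return (1 - delta) * S + delta * F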

~~~
nxtrader
Very interesting post.

Other than using a shrinkage estimator for the covariance matrix, what
techniques would you suggest for doing portfolio optimization?

------
tbenst
For your sake, do not base any investment decisions on this model. Historical
correlation != future correlation. You are MUCH better off using the
Fama-French three-factor model as a starting point.

An example of historical correlation severely understating risk was the 2008
financial crisis: default rates of mortgages in, say, Florida that
historically had little correlation with default rates of mortgages in Nevada
suddenly became very correlated. Measuring risk in this fashion is not robust
enough for investment decisions.

[http://en.wikipedia.org/wiki/Fama%E2%80%93French_three-factor_model](http://en.wikipedia.org/wiki/Fama%E2%80%93French_three-factor_model)
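
For the curious, a bare-bones sketch of estimating the three factor loadings
by OLS (the factor series would come from somewhere like Ken French's data
library; loading that data is left out here):

    import numpy as np

    def ff3_betas(excess, factors):
        """excess: T-vector of stock returns minus the risk-free rate;
        factors: T x 3 matrix of market, SMB, and HML factor returns."""
        X = np.column_stack([np.ones(len(excess)), factors])   # add an intercept
        coef, *_ = np.linalg.lstsq(X, excess, rcond=None)
        alpha, beta_mkt, beta_smb, beta_hml = coef
        return alpha, beta_mkt, beta_smb, beta_hml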

------
photon137
Firstly, nice effort.

Secondly, some features you could add:

1\. Constrained optimization - including budget constraints, sector-selection
constraints, etc. A tough one would be cardinality constraints, e.g. "I am
limited to 4 stocks" (see the sketch after this list).

2\. Return attribution - whether the returns your portfolio earned were due to
stock selection or asset allocation or both (Brinson Model:
[http://www.mscibarra.com/research/articles/2002/PerfBrinson....](http://www.mscibarra.com/research/articles/2002/PerfBrinson.pdf)).

3\. Performance and compression - how would this deal with huge covariance
matrices, say 10000 x 10000? Matrix operations on these wouldn't be trivial.
In-memory serialization/deserialization issues also come to mind. (edit: then
again, Excel can't do 10k x 10k :) )

4\. I'm not conversant with SciPy - does this use BFGS/similar for
optimization?

5\. Compute as a service? Host a grid? Let calculation requests come to you
via Excel? (Nobody would want a 10000-asset timeseries to be processed on
their CPU for two hours.)
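
On point 1, a minimal sketch of constrained mean-variance weights using
SciPy's SLSQP solver (budget constraint, a return target, and illustrative
long-only position caps; cardinality constraints are much harder, since they
make the problem non-convex):

    import numpy as np
    from scipy.optimize import minimize

    def constrained_weights(cov, mu, target):
        n = len(mu)
        cons = [
            {"type": "eq", "fun": lambda w: w.sum() - 1.0},      # fully invested
            {"type": "ineq", "fun": lambda w: w @ mu - target},  # return >= target
        ]
        bounds = [(0.0, 0.25)] * n           # long-only, 25% cap per name
        res = minimize(lambda w: w @ cov @ w,                    # portfolio variance
                       np.full(n, 1.0 / n), method="SLSQP",
                       bounds=bounds, constraints=cons)
        return res.x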

~~~
wtvanhest
I second everything photon has said, and would add that anyone using this
model should acknowledge the risk they are taking by using past stock
correlations to predict future optimal portfolios. While this approach is
central to finance, recently (over the past 18 months) we have seen bizarrely
correlated asset-price changes due to macro issues (especially in Europe).

These macro issues may change the way stocks correlate in future markets. I
would backtest any portfolio with data from at least 18 months ago to make
sure it still looks pretty good.

~~~
photon137
_> we have seen bizarrely correlated asset price changes due to macro issues_

Actually, this is quite expected given event-risk in the market. For example,
even in a non-crisis situation, correlation increases in intra-day trading
when market-moving news is due - e.g. when payroll numbers or ISM numbers are
released.

Coming back to the present, headline-risk hasn't been this high since May
2010 - we keep having one EU summit after another and one central bank
announcement after another, each having a huge impact on how asset prices
behave.

So the correlation is not entirely bizarre - and people have been making
correlation trades as well. For example, JP Morgan's huge loss betting on
credit index tranches was a bet on implied vs. realized correlation (it
didn't play out as expected, of course).

I'd be interested to know how people are playing this out in the equity space
- via variance swaps?

------
wiredfool
It looks like the optimal weights for AMZN and AAPL are negative. That isn't
possible unless you're shorting them, and shorting is quite a different risk
profile from going long.

~~~
wtvanhest
That is just another constraint he needs to add to the optimization.

~~~
wiredfool
It doesn't look like there are any limits on the weighting, even on the high
end.

Furthermore, he's using the closing price, not the adjusted closing price, so
splits and dividends aren't included (and those are most of the return in
MSFT for the last 8 years or so).

~~~
karamazov
It's easy to switch to the adjusted closing price - there's a 'which_price'
dictionary that has adjusted price as an option. The weights do go a bit
crazy if you can get close to 0 variance (but that's hard to do in real
life).

------
billswift
Decent article, but I wanted to add a little extra warning to this:

>One way to do this is to look at past returns and come up with the historical
correlation.

Be very wary of historical correlations, at any level. I am old enough (I was
in my early teens at the time) to remember the screaming of economists during
the 1970s stagflation - it was _known_ that you could not have high inflation
and high unemployment at the same time. Until we did.

------
robbiep
I may be totally off the mark here (it has been 5 years since I last studied
portfolio management), but if you have 2 stocks that are totally negatively
correlated, with the same expected return, won't you have a return of 0%?
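
For what it's worth, a quick numerical check of that scenario (equal
volatilities and a 50/50 split, with made-up numbers): it's the variance that
cancels, not the return - the portfolio still earns the common expected
return.

    import numpy as np

    mu = np.array([0.05, 0.05])       # same expected return for both stocks
    sigma = np.array([0.2, 0.2])
    rho = -1.0                        # perfectly negatively correlated
    cov = np.outer(sigma, sigma) * np.array([[1.0, rho], [rho, 1.0]])
    w = np.array([0.5, 0.5])

    print(w @ mu)                     # expected return: 0.05, not 0%
    print(w @ cov @ w)                # variance: 0.0 - the risk cancels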

------
calgaryeng
There are already a number of libraries in R that do this (including reverse
portfolio optimization and Black-Litterman optimization)!

------
hogu
Just a small nitpick: you should be able to calculate portfolio variance
using simple matrix operations instead of writing a double for loop.

Something like (where a is the weight vector, std_dev the per-stock standard
deviations, and cor the correlation matrix, all NumPy arrays):

    import numpy as np

    temp = a * std_dev
    var = np.dot(np.dot(temp.T, cor), temp)

