Kelly Criterion – how to calculate optimal bet sizes (fhur.github.io)
272 points by fernandohur on June 9, 2021 | 129 comments



It's well worth implementing and testing out the Kelly Criterion. It's super simple to code up in a Jupyter Notebook so that you get to enter an amount to bet each time. When I tried it, I found my own psychology changing as the bets continued, even when I knew the coin's bias. It's a really great demonstration of the difference between a) intellectually knowing the optimal strategy, and b) what actually happens.

This biased-coin betting paradigm was actually tried in the real world with finance professionals, with a cap on the maximum payout. The results are described here: https://arxiv.org/pdf/1701.01427.pdf It's pretty interesting. (Note though that the "average returns" reported hide a lot of variation.)
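
If you want to try it yourself, here is a minimal sketch of such a notebook cell (the bias p = 0.6 and the starting bankroll are arbitrary choices, and input() stands in for whatever betting UI you prefer):

  import random

  p = 0.6            # the coin's known bias (arbitrary for this demo)
  bankroll = 100.0

  for round_no in range(1, 21):
      bet = float(input(f"Round {round_no}, bankroll ${bankroll:.2f}, your bet: "))
      bet = min(max(bet, 0.0), bankroll)   # clamp to what you actually have
      if random.random() < p:              # a win pays even odds
          bankroll += bet
      else:
          bankroll -= bet
      if bankroll <= 0:
          print("Bust!")
          break

  print(f"Final bankroll: ${bankroll:.2f}")

For comparison, the Kelly bet here would be 2p - 1 = 20% of the current bankroll each round; it's instructive to see how hard that is to stick to.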


It's also worth (I think!) exploring how it generalises past sequences of discrete 0--1 outcomes. There are horse race situations (what Kelly originally modeled), portfolio selection, half-Kelly type strategies in various spaces, and plenty more.


Could you share a link to the notebook? I think it would benefit everyone to learn about this by applying the criterion.


I wonder how professional gamblers approach bet sizing. It seems to me that for most applications the Kelly Criterion is not the right choice. The utility of money is asymmetric; gaining $25,000 is worth less than not losing $25,000. Relatedly, most actual gamblers want to ensure good returns while not going broke, so minimizing risk of ruin is often more important than maximizing return rate. Further complicating the matter is that in real life you don't know your actual probability of success, but you may have an estimate. And finally, though this is less commonly significant, your rate of return in a given game might depend on the amount you bet.

From what I've seen in the poker community, no one has really approached this type of bet sizing from a rigorous perspective beyond the relatively simple Kelly Criterion.


The Kelly criterion takes into account the fact that "gaining $25,000 is worth less than not losing $25,000". A game where you have a 50/50 chance of gaining $25k or losing $25k has a negative expectation in the log domain, so per the Kelly criterion one would not bet on this game.

You are right that "your rate of return in a given game might depend on the amount you bet"; this is actually very common. Consider a stock market: buying 1000 shares and selling them a year later will generate less than 1000x the return of buying 1 share and selling it a year later (assuming the stock goes up), because you pay more per share to buy 1000 shares and make less per share when you sell 1000 (because the share price moves as you buy/sell).

Related, I really enjoyed this treatment of the Kelly Criterion by Thorp and highly recommend it http://www.eecs.harvard.edu/cs286r/courses/fall12/papers/Tho...


The Kelly criterion maximizes wealth in the long run. It doesn’t take asymmetric utility into account; it just happens to coincide with log utility. If there is a fixed number of bets the Kelly criterion will be suboptimal, but as the number of bets grows the optimal strategy will asymptotically approach the Kelly criterion.


It does not. It maximizes log-wealth. Maximizing expected wealth gives a strategy which results in $0 a lot of the time but much much more than Kelly occasionally, achieving a higher average wealth. Kelly gives that up, getting far less expectation of actual wealth, but far more expectation of log-wealth (which is -inf at $0 so avoids the $0 results). If you don't believe this, pick any one scenario and actually do the math, take the limits, etc.
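
To make that concrete with one scenario (even-odds bets, p = 0.6, so the full Kelly fraction is 0.2), here is a quick simulation sketch, not a proof:

  import random, statistics

  def final_wealth(f, p=0.6, rounds=10):
      w = 1.0
      for _ in range(rounds):
          w *= (1 + f) if random.random() < p else (1 - f)
      return w

  for f in (1.0, 0.2):   # bet everything vs. the Kelly fraction
      runs = [final_wealth(f) for _ in range(100_000)]
      print(f, statistics.mean(runs), statistics.median(runs))

Betting everything shows the higher sample mean (it is the expected-wealth maximizer, roughly 1.2^10 ≈ 6.2 here) but its median is exactly 0; the Kelly fraction gives up most of that mean (about 1.04^10 ≈ 1.5) in exchange for a median that actually grows.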


> It does not. It maximizes log-wealth. Maximizing expected wealth gives a strategy which results in $0 a lot of the time but much much more than Kelly occasionally, achieving a higher average wealth.

The log utility function is u(w) = log(w); that’s what it is called in economics. Surely maximizing the log of wealth means you are maximizing the log utility function.

> Kelly gives that up, getting far less expectation of actual wealth, but far more expectation of log-wealth (which is -inf at $0 so avoids the $0 results).

That’s a crucial misunderstanding of Kelly’s result. He doesn’t give anything up. He showed that in a nonterminating game the strategy of betting as if you have log utility will give you superior results to any other strategy in terms of long-term wealth growth. What if the game is terminating? Well, in that case you need to know the utility function of the gambler to determine a superior strategy. But in a non-terminating game it is a bit irrelevant because almost all strategies will lead to infinite utility.


(If memory serves right), for double-or-nothing style bets, the Kelly criterion is equivalent to maximizing the median wealth, which coincidentally also maximizes the expected log wealth.


That's an interesting connection! I've heard risk managers talk about how people put too much faith in the mean outcome and don't focus enough on the median outcome. I didn't know there's a correspondence to the Kelly criterion.



Professional gamblers don't play pure random games (like craps or roulette) for a living. They play games where they feel that knowledge/skill have some influence (poker, horse racing).

To take poker, bet sizing is an important factor in play, but it has as much to do with the impression the action makes on other players as the actual underlying odds.


With regards to poker, "bet sizing" in the original article is equivalent to "bankroll management" in poker parlance. Bet sizing within individual hands is indeed a matter of game strategy, but the stakes you should play (i.e. blinds or tournament buyin amounts) is closely related to the original article.


Professional sports bettors absolutely think in probabilistic terms -- "I think there's a 25% chance we'll win, the market thinks there's a 20% chance we'll win" etc. Kelly bet sizing absolutely makes sense, though market depth and Bayesian uncertainty also act to reduce bet sizes.


Most of the groups I’m aware of use some variation on Kelly (e.g. half Kelly), but it’s worth pointing out that in some sports it’s actually quite hard to deploy your full bankroll at any one time anyway.


Exactly. I'm a pro gambler and data scientist, and the optimal staking strategy is mostly just stake sizes inversely proportional to the odds, plus a little increase/decrease relative to perceived value (which is what the Kelly criterion does).

Check my golf betting algorithm out if you're interested: https://wwww.golfforecast.co.uk


Link doesn't work for me, https://www.golfforecast.co.uk seems better


> no one has really approached this type of bet sizing from a rigorous perspective beyond the relatively simple Kelly Criterion.

The Kelly criterion sizes the bet relative to your bankroll at the point of betting, so it does take that into account.

I haven't understood your points further than that - minimizing risk of ruin (taken literally) is typically equivalent to not ever gambling at all. No gambler would agree with that, by definition of them being a gambler.


This comment sounds so confused that I have to break it down into parts. Maybe I have just misunderstood you.

> I wonder how professional gamblers approach bet sizing.

Some variation of the Kelly criterion, whether they admit it or not. Any time you make a decision with the goal of maximising your growth of wealth relative to a level of risk, you're using the Kelly criterion.

> The utility of money is asymmetric; gaining $25,000 is worth less than not losing $25,000.

Right. Not only is it asymmetric -- it is concave. Just like Bernoulli realised when he invented the proto-Kelly criterion. The Kelly criterion can be interpreted as incorporating an asymmetric, concave utility of wealth. (It's not assuming such a utility -- it's just that log utility happens to maximise long-run growth and be asymmetric and concave.)

But critically, it depends on the size of your bankroll. When you are talking about small stakes compared to your wealth, maximising expected value (i.e. running the risk of losing your entire wager) is the growth optimal choice, because in a local enough region, any continuous utility is linear.

> Relatedly, most actual gamblers want to ensure good returns while not going broke, so minimizing risk of ruin is often more important than maximizing return rate.

This is exactly what the Kelly criterion does. It sacrifices expected value in every gamble for larger long-run growth (which requires also that you don't go bust, or lose too much too quickly.)

> Further complicating the matter is that in real life you don't know your actual probability of success, but you may have an estimate.

But decomplicating the matter is that the Kelly criterion is very forgiving of estimation error, as long as you make your errors intelligently. The growth rate as a function of bet size looks roughly quadratic around the optimum. This means there is almost a plateau around the optimal bet size, where small amounts of misestimation do not make your growth much different from the optimal growth. (And if you're worried about large amounts of misestimation, the linear combinations of keeping money in your wallet and the full Kelly bet form a sort of "efficient frontier" (to borrow MPT terminology) of risk--growth tradeoffs. Meaning you can always bet less than the full Kelly bet, and get an optimal growth for the risk level that corresponds to. Hence the popularity of half-Kelly and similar variants.)

In the end, estimating the joint distribution of outcomes for the Kelly criterion is a far more forgiving requirement than e.g. estimating the parameters needed for modern portfolio theory type analysis.

> And finally, though this is less commonly significant, your rate of return in a given game might depend on the amount you bet.

A fact also easily plugged into the Kelly criterion. It might make the search for optimality harder than setting a derivative to zero, but we have computers for numerical optimisation!
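
As a toy sketch of that, with a made-up impact function (the payout worsens as the staked fraction grows; p and the impact slope are invented for illustration):

  import numpy as np
  from scipy.optimize import minimize_scalar

  p = 0.55                      # assumed win probability

  def net_odds(f):
      return 1.0 - 0.5 * f      # toy price impact: worse payout as the stake grows

  def growth(f):
      return p * np.log(1 + f * net_odds(f)) + (1 - p) * np.log(1 - f)

  res = minimize_scalar(lambda f: -growth(f), bounds=(0.0, 0.99), method="bounded")
  print(res.x)                  # noticeably smaller than the impact-free Kelly bet of 0.10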


Perhaps I'm misunderstanding, but the post says that the optimal bet size is f = 1-2p (where p is the probability of winning). But this seems backwards: as p goes up, you should be betting more.

Shouldn't the optimal bet size, f, be 2p-1?


Yes, looks like a typo. The plots show the correct direction though (f* is an increasing function of p)
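
For reference, the one-line derivation for an even-odds bet with win probability p (and q = 1 - p):

  G(f) = p \log(1+f) + q \log(1-f)
  G'(f) = \frac{p}{1+f} - \frac{q}{1-f} = 0 \implies f^* = p - q = 2p - 1

For a bet paying b:1 the same calculation gives f* = (bp - q)/b, which reduces to 2p - 1 at b = 1.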


Fortune's Formula by William Poundstone is a fun read on the origin of the Kelly criterion, as well as the role of gangsters and bookies in the funding of the early communications infrastructure in the US.


Thanks! I'm looking for "history of probability" mass market books and this looks spot on ;)


Used an abbreviated version of the Kelly criterion along with Markowitz portfolio optimization and applied it to sports betting. All I can say is that past results do not indicate future returns.


The part that all these nice theories miss is that you actually do not know the distribution p(win) (in the case of Kelly) or the expected return and covariance (in the case of Markowitz).


Well 'knowing' the 'true probability' is a philosophical can of worms anyway.

The good news is that you don't need to know it exactly; you just need to make a better guess than the bookies (w.r.t. the Kullback-Leibler divergence or cross-entropy, whichever takes your fancy).


I know I would never be a writer of seminal papers because I would never publish a formula that includes unknowable parameters.

Same goes for Black-Scholes, which includes _future_ volatility.


The fact that Black-Scholes has only one unknowable parameter makes it quite usable, more so than more complicated option pricing models. You can work backwards from the market price to solve for the implied volatility, treating it as a generalized 'price' for the option after factoring out things that are easily adjusted for. You can also abuse the implied volatility (adjusting it up or down) to account for factors outside of the idealized model.


You can’t pinpoint probabilities of many things that make the world run everyday, you can’t even pinpoint probabilities of things that happen in your personal life. It’s useful to at least know some mechanics behind these arcane things rather than completely disregarding them because you don’t fully know their distributions.


That's something of a moot argument, though, since BSM computes prices in terms of the future volatility. Since we have the actual price, the volatility is actually what we solve for with BSM.

Even if we had neither price nor volatility, we can still talk about the surface of possible (price, volatility) pairs which are compatible with the model.


But how do we get the prices? If I tell you the at-the-money front-month call is $1000, will you tell me the vol is 100% p.a.?


Yeah, basically. Also need the strike, spot, an estimate of the risk free rate (probably not today).

The implied vol is a useful way to make sense of the actual market prices of options. We also might have some predictions about the market's implied vol changing going forward and we can reverse those errors back into expected price changes (and maybe trade on them).
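
A sketch of that reverse step, using plain bisection on the standard Black-Scholes call formula (the inputs at the bottom are made-up numbers):

  from math import log, sqrt, exp, erf

  def norm_cdf(x):
      return 0.5 * (1 + erf(x / sqrt(2)))

  def bs_call(S, K, T, r, sigma):
      d1 = (log(S / K) + (r + sigma**2 / 2) * T) / (sigma * sqrt(T))
      d2 = d1 - sigma * sqrt(T)
      return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

  def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0):
      for _ in range(100):          # the call price is monotone in sigma
          mid = (lo + hi) / 2
          if bs_call(S, K, T, r, mid) < price:
              lo = mid
          else:
              hi = mid
      return mid

  print(implied_vol(price=10.0, S=100, K=100, T=0.5, r=0.01))

Production code would use Newton's method with vega, but monotonicity means bisection always gets there.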


It's a real concern, but more realistically you can just compute many potential outcomes and at least get a sense of the structure of the surface.


"How you will go bust on a favorable bet" by N. N. Taleb - https://youtu.be/91IOwS0gf3g


Am I missing something? It doesn’t seem counterintuitive to me that repeatedly making an all-or-nothing bet with a non-zero chance of losing will eventually cause my expected value of capital to go to zero.

I like the presentation style though, and the allusion to Shannon’s theorem although I didn’t quite grasp the connection.


As others pointed out, it's not all or nothing. The specific example he was using was that you have that p = 0.7 game and you invest 80% of your wealth on each round.

I think it would be intuitive for anyone that you could lose the first few rounds despite there being very favourable probabilities. If you really do go all-in, you could lose it all on the first round. But he's using an example of going "just" 80% in.

I think the part that's unintuitive for many people is that the probability of going bust increases the more rounds you play. Most people (myself included, perhaps) would naively expect that a favourable game would always have positive return as the number of plays increases. That's a mental failure to account for the fact that the game ends if we lose everything.

The counterintuitive result is because if you put 80% in each time, you could lose 99.97% after 5-straight losses or 99.99999% after 10-straight losses. The more you play, the more likely you are to eventually hit N consecutive losses and, therefore, go bust.

I have to concede that I don't know who would start from a place where they think betting 80% of the farm on each play is a sane strategy. Then again, we've all heard of founders who re-mortgage the house to fund their startup so...
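
A quick simulation of that exact setup (p = 0.7, staking 80% each round) shows the gap between the mean and the typical outcome:

  import random, statistics

  def final_wealth(f=0.8, p=0.7, rounds=50):
      w = 1.0
      for _ in range(rounds):
          w *= (1 + f) if random.random() < p else (1 - f)
      return w

  runs = [final_wealth() for _ in range(10_000)]
  print(statistics.mean(runs))     # large and noisy: a few lucky streaks dominate
  print(statistics.median(runs))   # tiny: the typical player has lost ~97%

The per-round expected log growth is 0.7*ln(1.8) + 0.3*ln(0.2) ≈ -0.07, so the typical outcome shrinks by about 7% per round even though the expected value grows by 32% per round (0.7*1.8 + 0.3*0.2 = 1.32).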


I'm not sure I really agree with the entire concept of "going bust" in these terms. To nitpick, you don't go bust if you lose 99.97% of your money. The real point is that from wherever you start, 5 losses will take away 99.97% of your money, and it's asymmetrical: after those 5 losses you don't need just 5 wins to make back your original sum, because your wealth is now much smaller, so you need far more consecutive wins to regain it. It's counterintuitive because most people don't really think about the implications of betting a fraction of your wealth, where your wealth at each timestep is a compounded function of the previous timesteps. They think of wealth as a constant when it's not.


You’re totally right and my rationale was mistaken.

My two examples were based on the idea that there’s some limit on how finely we can divide the original bet. For example, if a person starts with $10 and loses 99.97%, they have nothing left because of rounding. Likewise for losing 99.999% of $1000.

In hindsight, though, the risk of consecutive losses wasn’t the point of the math. As you point out, the asymmetry is the problem. Loss occurs over time even with intermittent losses.

Here’s an example of how to halve $100 in capital, despite a 70% win rate (betting 80% of the balance each time).

   $100.00
   $ 20.00 (L)
   $ 36.00 (W)
   $ 64.80 (W)
   $ 12.96 (L)
   $ 23.33 (W)
   $ 41.99 (W)
   $ 75.58 (W)
   $ 15.12 (L)
   $ 27.22 (W)
   $ 49.00 (W)
   . . .
The impact of the asymmetry seems obvious in hindsight. I might have to agree with the original post that it’s a bit counterintuitive.


This is the best summary of this problem I’ve read here. Thanks!


I forget what he said in the video, but generally this thing can be a little surprising because even a pretty decent looking bet can blow you up with high probability if you size it wrong.

The other side of this is Shannon's Demon: a sort of crappy game can be made into a profitable one by sizing it properly (and rebalancing).


Right, I don't think that's the part that's counterintuitive. It's the claim he makes, that "if someone offers you a bet with a 70% chance to win a dollar and a 30% chance of a loss, it's better not to take it in some cases", that is counterintuitive to people.


Seems pretty misleading, since that’s only true if a dollar is a significant portion of your wealth…


Probably depends on your aversion to losing it all. If it's a dollar go for it. If it's 100 dollars, again, take the bet. If it's your life savings of 50000 dollars and going to 0 would be worse for you than 70/30 doubling your money, you don't want to go for it.


Sure, much like the Kelly criterion would tell you. If your life savings are $50,000, and someone offers you an even odds bet but with a 70 % chance of winning, you get optimal growth by wagering $20,000. However, that assumes you'll get a large number of similar offers so that you can make your losses back later if you get unlucky now.

That said, growth is pretty close to optimal even for smaller wagers. This plot shows growth rate as a function of wager size: https://www.wolframalpha.com/input/?i=plot+log%2850+-+x%29*0...
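
To put numbers on "pretty close to optimal": the per-bet growth rate here is G(f) = 0.7 ln(1+f) + 0.3 ln(1-f), so

  G(0.4) ≈ 0.7(0.3365) - 0.3(0.5108) ≈ 0.082   (full Kelly)
  G(0.2) ≈ 0.7(0.1823) - 0.3(0.2231) ≈ 0.061   (half Kelly, ~74% of the optimum)

Halving the wager only costs about a quarter of the growth rate, which is why the curve in that plot is so flat near the top.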


A great investing blog I follow is https://breakingthemarket.com/. He talks about the Kelly Criterion and extending it to making correlated bets (investments).


This is so timely. I made this with a buddy of mine to help us figure out optimal allocation for stocks in a portfolio using Kelly. https://engine.oracled.com/


Another word of caution. The Kelly Criterion depends on each event being independent. Let's say I'm told to allocate 50% to QQQ and 50% to SPY. Those may independently be correct, but since the NASDAQ and S&P are highly correlated, this wouldn't be the correct allocation. You've essentially allocated 100% of your portfolio to one probability, rather than 50% to two independent probabilities.

This is an obvious example. But really all stocks (or at least sectors) are correlated just like this. So other examples wouldn't be so obvious.


> The Kelly Criterion depends on each event being independent.

That's not quite true. The Kelly criterion (generalised to portfolio selection) requires the joint distribution of outcomes, which captures all correlations.

Taking somewhat recent historic outcomes as representative of the joint distribution of outcomes (this effectively becomes the Cover universal portfolio), I'm guessing the Kelly criterion would suggest something like 50 % cash and 50 % equity, if those are the only two options.
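
As a sketch of what that computation looks like in practice (a toy return history standing in for the joint distribution; all numbers invented):

  import numpy as np
  from scipy.optimize import minimize

  # toy joint history of per-period returns for (cash, equity)
  returns = np.array([[0.00,  0.12],
                      [0.00, -0.08],
                      [0.00,  0.25],
                      [0.00, -0.30],
                      [0.00,  0.10]])

  def neg_growth(w):
      return -np.mean(np.log(1 + returns @ w))   # average log growth per period

  cons = ({"type": "eq", "fun": lambda w: w.sum() - 1},)
  res = minimize(neg_growth, x0=np.array([0.5, 0.5]),
                 bounds=[(0, 1), (0, 1)], constraints=cons)
  print(res.x)   # growth-optimal weights given this (tiny) joint sample

Each row of the history is one joint outcome, so correlations between assets are captured automatically; with this particular toy history the optimum does land near 50/50.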


True. Correlations are not modeled in here. And the fact is, once shit hits the fan all correlations converge to 1. This is not a tool for helping you with that. All it does is tell you: don't put more than x% in this stock. It answers a very specific question --- I like XYZ, how much should I buy?


There are many assets and securities that are inversely correlated with the stock market like gold or inverse ETFs.


How does the Kelly criterion apply to sizing portfolio positions? From what I understand, trying to use it in the stock market is a fool's errand.


Options-implied probabilities give a way to understand what the market is pricing volatility at. Given the option price, it is not hard to get the probability distribution. There is no judgement here, just plain math. Here is some explanation - https://engine.oracled.com/FatKelly


Be careful using this formula too naively. Predicting tomorrow's expected return is quite difficult, though predicting tomorrow's expected volatility is doable.


I have no idea how this website works, could you explain?


The probabilities are calculated using the options market. Options (and the risk-neutral distribution) give you the market-implied probability of the stock going up/down and by how much.


It’s telling me I should short SPY and go long GameStop and AMC… Sounds suicidal to me.


This is beta hedging: it works on the assumption that when SPY performs positively, your risky basket will outperform SPY, but if there is some systemic risk-off event, your SPY short will at least dampen if not fully cover any losses made in your risky basket.

Good day: you lose 0.5% on your SPY short but gain 2% on AMC.

Bad day: you maybe gain 0.5-1% on SPY and lose 2% on AMC.


But this is only true if expected returns and beta are constants, right?


Ha, yes. We had the same reaction. But it is what the market is pricing probabilities at. And if you look at the returns, the market has been right so far. We are also skeptical. But numbers don't lie. I think it is a good way to size how much to buy if you really want to buy GME: don't put all your money in, but limit it to 10% as Kelly suggests.


There's rumors/fears about SPY dipping from the next CPI report, could be causing options/sentiment to lean downwards in the short term.


Some estimates end up being negative; is that intentional? E.g. AAPL.


Yes, that happens when the vol skew is a certain shape. I think it's better to use this link; it gives a better understanding of what is happening:

https://wiki.oracled.com/


You need to be very careful applying the Kelly criterion to the stock market, as you cannot precisely calculate the p of your investments. If you assume too high a p then you will overbet, and it's only a matter of time until you go bust (see N. N. Taleb's MOOCs on Kelly). Thus the Kelly criterion should be your UPPER bound for real-life investments with uncertain p; stay well below the betting amount the Kelly criterion would suggest, so that you stay in the game longer.

Related to this, the best investors in the world are quite old guys. Why? Because they lived long enough to accumulate enough wealth to be of public interest.


> Notice that when p is 0.4 G is 0. What this basically means is that you should never bet if the odds are against you.

Unless you have a higher or other goal. For example, because you want to impress a woman, because your name is James Bond, or because (more likely) you earn as a result of the related drama of an outlier. This is why sometimes we see people losing and gaining from it over the longer term; they gain from it via advertising, for example. Influencers, advertisers, marketeers, terrorists -- they all abuse this mechanism.


I read about this on HN and then lost it for months when I wanted to apply it to a crappy game on a random discord server.

The game lets you bet on chicken fights and tells you the probability your chicken will win (starts at like a 62% chance to win), so the Kelly criterion is perfect. It's a bit incredible how reliable it is.


There's a loose vs lose typo in the second paragraph. It's like school teachers whipped they're, their, there into our heads and where one typo door closes another opens.


Its definately a loosing battle.



There's a mistake in this:

> I won’t go too deep into the math, but it can be shown that G achieves its maximum when f=1-2p

Should be f=2p-1


But how much is it actually used in trading?


It's useful as an upper limit--if you're leveraging past what Kelly suggests you're almost certainly overleveraged.

In theory, Kelly is optimal--if you knew the exact probability density function of your returns, it would give you the right leverage to take.

In practice, you're always playing with risks, some you're factoring into your models, some you're choosing not to because they're intractable, some you're not even aware of until they occur. The most basic premise--today's returns will be a function of hypotheses that I've derived from looking at past observations--is an approximation at best.

This mismatch between model and reality can lead to expensive lessons learned when using the full Kelly model, so often traders will "half-kelly" or something like that, to incorporate the basic idea of risk scaling proposed by the Kelly model but with more safety margin.


> This mismatch between model and reality can lead to expensive lessons learned when using the full Kelly model, so often traders will "half-kelly" or something like that, to incorporate the basic idea of risk scaling proposed by the Kelly model but with more safety margin.

And to be clear, half-Kelly, quarter-Kelly, eighty-percent-Kelly and any other linear combination between wallet and full Kelly are actually still the only strategies that are growth optimal -- given the particular safety margin each corresponds to.


> In theory, Kelly is optimal...

It's optimal for one utility function (log of future $), but that doesn't make it necessarily optimal for everything, right?


That’s a great callout as well: it’s optimal for maximizing your expected growth in the long term, but it does carry significant volatility. That’s fine for an emotionless immortal robot investor, but we’re human.

If we’re close to an investment goal, like saving a house downpayment or retiring, the calculus is quite different. Even outside that, loss aversion is a real thing and we’re likely happier trading some upside for being able to sleep at night.


It is optimal for the growth of your wealth in the long run. The issue of utility only comes into play if you have to bet a finite amount of times and you don’t plan to or cannot live forever.


The Kelly criterion is just the square of the Sharpe ratio. Sharpe is used pretty extensively from a strategy selection and risk management perspective.


This is not correct. The Kelly criterion is not the square of the Sharpe ratio.


https://ifta.org/public/files/journal/d_ifta_journal_11.pdf

Page 27: Vince [vi] and independently Thorp [vii] provide a solution that satisfies the Kelly Criterion for the continuous finance case, often quoted in the financial community to the effect that “f should equal the expected excess return of the strategy divided by the expected variance of the excess return:”

f = (m-r) / s^2

so it's Sharpe with variance instead of standard deviation in the denominator, correct?


Yes, that's correct.

An intuitive way to think about this is that Sharpe depends on the specific horizon you're using. E.g. annualized Sharpe will be sqrt(252) larger than daily Sharpe. It would not make sense to change the Kelly criterion based on a substitution of variables. In contrast variance, like returns, scales linearly with time horizon. Therefore the variance ratio is invariant to the time horizon.
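
A worked example with made-up numbers: m = 8% annual return, r = 2% risk-free, and s = 15% annual volatility gives

  f = (0.08 - 0.02) / 0.15^2 = 0.06 / 0.0225 ≈ 2.7

i.e. full Kelly says 2.7x leverage. In daily terms both m - r and s^2 shrink by the same factor of ~252, so f is unchanged, which is exactly the invariance described above.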


Please share this calculation, dying to know what kind of inter-universal Teichmüller space theoretic math you're using to come up with this.


https://news.ycombinator.com/item?id=27453365

Edward Thorp and Ralph Vince both conclude that the Kelly Criterion in the continuous case is excess returns divided by variance, which is pretty close to the Sharpe Ratio, correct?

Asking to understand better, not to be combative. Your comment made it seem like that formula is way off.


I used it in live trading last year; I couldn't really make it work.

I precalc my stoplosses + stopgains then use a simulation to get the win/loss probabilities on training data.

What I observed is that the Kelly formula really prefers the tiny stoplosses, so when you sort your predictions by Kelly score it will pick the ones with tiny stoplosses.

What happened to me in the live test is the tiny stoplosses triggered, when a stopgain would have triggered later.

I know someone is going to say "thats a problem with your stoplosses+stopgains OOS performance" and they are right, but OOS stoploss+stopgain calculation isn't trivial for me to calculate :\


Where did you get the distribution of the real data from?


Yes, model predictions on volume+price data from 2019+2020


It's not practical when the outcome is distributed in a very weird and unknown way and each bet isn't really i.i.d.

Moreover you can find the same result with a simple grid search over bet sizes, obviating the need to estimate any parameters of the outcome distribution.

It would be more useful in professional gambling where outcome distributions are more knowable.


Knowingly? Not often. You don't find long-run successful traders who don't knowingly/unknowingly use it or some variation of it.


Almost no one uses it directly – in practice, Kelly is too aggressive – but variations of it are quite common.


Would love to hear about the variations!


Bet less than the formula recommends. It depends on personal preferences.


I am sure it is used, but the challenge is calculating the necessary parameters.


Does anyone know of any formulas that would accommodate p changing on every bet?


Well, Kelly is infinite horizon, so any derivation is going to depend on the exact payouts. If it is not infinite horizon, you can do dynamic programming to figure it out, but you will have to be careful with myopic reasoning (betting it all in the last stage). In practice, just plug your changing payoff into the formula. The rationale being that every bet is growth optimal in the long run even if you only bet it once.


If the distribution of p is ergodic, then Kelly criterion (re-sized at every p) still maximizes expected growth rate.
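
In code that is just re-solving for the fraction at every step. A sketch with even-odds bets and a drifting p (the drift model is invented for illustration):

  import random

  w, p = 1.0, 0.55
  for _ in range(1000):
      f = max(0.0, 2 * p - 1)        # re-derive the Kelly fraction for the current p
      w *= (1 + f) if random.random() < p else (1 - f)
      p = min(0.99, max(0.01, p + random.gauss(0, 0.01)))   # p wanders each bet
  print(w)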


How about optimal bet sizes for parallel (simultaneous) bets? That is more important because in portfolio theory you must diversify...


Read the original paper. The gambler's formulation is garbage, which obscures all insight.


Which one is the original to you? Kelly's new interpretation of the information rate? Or Bernoulli's new theory on the measurement of risk? Or one of the other seminal papers describing different aspects of the Kelly criterion?


The Kelly criterion is to financial math as the Fibonacci sequence is to mathematics. Yes it's neat, no it's not special, please stop bringing it up all the time.


When I was a young guy who happened to be partner in a hedge fund, I used to ask candidates about this.

I kinda regret it now. I'm not sure what I was trying to find out from people. I guess in some way it's a cultural question masquerading as a technical question, because it reveals whether you've heard of it and whether you have heard of the standard stuff that's said about it:

- Don't do full Kelly because if you're on the wrong side (too big bets) that's definitely bad.

- Depends on the probabilities being static.

- You can find a continuous form of it. You can also find the implied leverage from the Sharpe ratio.

I wonder if my memory is even correct on these. But the point remains that it's not terribly useful as a thing to evaluate people with. I guess the question "How do you size your positions?" needs to start somewhere.


> I guess the question "How do you size your positions?" needs to start somewhere.

Maybe the question itself is a good place to start. If someone asked me about Kelly, my mind would immediately drift from "statistical models and dynamic programming" to something else.

The real engineering problem is threefold: (1) how do I model my return generating process, (2) what is my utility function, and (3) do the first two steps yield a tractable Bellman equation?

If the problem is framed right, most people will have something to say about 1 and 2. The third part is tricky. A real world solution will involve finite element methods. But in an interview, people may be hesitant to bring up approaches that don't yield closed form solutions.

That's what's special about Kelly. It's the assumptions for steps 1 and 2 that produce a closed form solution in step 3.


This so much. Like anything that is 'optimal' it is optimal with respect to some criterion. For the Kelly Criterion it is to maximize the logarithm of the weighted sum of the expected value across all outcomes.

This is probably not what you actually want in almost any situation.

The one time it actually makes sense to bring it in is if you are forced to make a certain number of wagers in some game AND you have good-enough knowledge of the odds of the game AND your payout is proportional to how much money you have left at the end of the game, AND the wager size does not affect the outcome of the game in any way. This scenario never happens.

Even when you meet many of the necessary prerequisites to use Kelly, it still doesn't make sense at all. For example, Blackjack tournaments. You are given a set amount of starting money, you play for a set number of hands (or time), and you know the odds perfectly. However, the payout structure isn't proportional to your final amount of money, so using Kelly has more or less no chance of winning any money in the tournament. They usually pay for the top N results, which means you have to go with a very high variance strategy AND win consistently to place.

Poker: Nope, not even close, bet size is a direct input into the dynamics of the game. Predictable betting of any kind is a maximally bad strategy.

Stock Market: utility of money isn't logarithmic, so it is not worth maximizing, even if you knew probabilities and were forced to make wagers, which you don't and aren't. If you could even approximate probabilities you could use that power to basically print money on derivatives, so even if Kelly applied, the prerequisites are too strong and would make far far better strategies available.

The only valid use case for bringing in the Kelly criterion is for gamblers to feel better about burning money at the tables by improperly applying it.


The first sentence of your post, while technically true, misses the point. This misunderstanding undermines many of your other points.

The kelly criterion happens to be optimal with respect to log wealth but that's not the main reason why it's interesting. Many explanations, including the original post, make this mistake. Maybe because 'maximizing expected utility' is a more common idea.

The first sentence of the wikipedia article:

"In probability theory and intertemporal portfolio choice, the Kelly criterion (or Kelly strategy or Kelly bet), also known as the scientific gambling method, is a formula for bet sizing that leads almost surely (under the assumption of known expected returns) to higher wealth compared to any other strategy in the long run".

In other words, pick a strategy. I'll pick the Kelly strategy. There will be some point in time after which I will have more money than you, and you will never overtake me. No logarithms involved. This is something you can easily check by simulation, but it requires some heavier math to formulate precisely and prove.

See also posts by spekcular and rssoconnor elsewhere in this thread.


If it's a mistake, it's a charitable one. Picking log utility is at least somewhat principled, trying to lead almost surely to higher wealth is way more arbitrary.


What exactly do you find arbitrary about this? When gambling or investing, more wealth is better than less wealth.

'Almost surely' is a technical term which means 'with probability 1'.


At no point in time does Kelly give you more wealth with probability 1, and there is no reason to care about "more wealth with high probability", that's not even a transitive comparison function.


> At no point in time does Kelly give you more wealth with probability 1,

Either you're claiming that the theorem I attempted to describe is false, or you're misunderstanding the theorem I am trying to describe.

I never wrote anything about 'high probability' either so I don't know why you introduced that notion.


> For the Kelly Criterion it is to maximize the logarithm of the weighted sum of the expected value across all outcomes.

It's actually to maximise the expected logarithm of the monetary amount, and it's a pretty good heuristic (in most circumstances) given that most opportunities are exponential. (It usually takes a given amount of effort to double your money.)


Can you or anyone expand on "utility of money isn't logarithmic"? It seems a whole bunch of economic theory is based on the wish ("because it has nice properties") that losing 10% of your money generally hurts the same amount no matter how much money you have. Even accounting for a non-elastic living wage on the bottom. I know there's a lot of political opinion here, but has there ever been an effort to empirically determine what the utility of money actually is?


You obviously wouldn't use Kelly criterion during a poker game because the assumptions don't fit. But on a larger scale it can be used for 'bankroll management' - what proportion of your wealth should you use on a tournament entry fee. Of course you don't have the exact parameter p but you can use an estimate to make sure you are not making a grossly over/under-sized bet.


No, you can't use it for bankroll management, because you can't estimate the probabilities necessary. A rule like "never bet more than 15% of your bankroll on one thing" would work just as well and it doesn't require you to do a bunch of math to get to the same answer.

Here, let's do some examples to see how dumb it is in practice:

Let's say I want to enter a tournament where I guess I have a 5% chance of winning, and I have $1000. Here is how much the KC says I should be willing to pay to enter, based on the payout odds:

  10:1  -> Don't enter (duh)
  15:1  -> Don't enter (duh again)
  19:1  (breakeven) -> $0.00  (duh)
  20:1 -> $2.50 
  30:1 -> $18.33
  100:1 -> $40.50
  1000:1 -> $49.05
  10000:1 -> $49.95
  10000000000000000000000000000000000:1  -> $50.00
Oh, so this fantastic system tells me to never bet more than 5% of my money if I have a 5% chance of winning. So insightful!

OK, let's say we have a 95% chance of winning the tournament:

  1:1 -> $900
  2:1 -> $925
  50:1  -> $949
  1000000000000000:1 -> $950
So if I'm a sure thing, I should bet a bunch of money. Again, there's no way someone would do this without maximizing log expected value and doing a bunch of math.

Maybe it gets more interesting if it's around 50/50:

  1:1 -> $0 (ok, makes sense)
  2:1 -> $250
  3:1 -> $333
  4:1 -> $375
  100000000000:1  -> $500
Again, Kelly gives us terrible advice. If you have a trillion to one payout on a coin flip, you want to bet less, not more! Why would you risk half your money, and have a 1/4 chance of losing all your money, when you can bet 1 cent at a time and just wait to win one time so you can buy half of the stock market with your winnings?

So I still contend that whatever it is that Kelly maximizes, it's a dumb thing to maximize outside of contrived situations where you are forced to bet and know exact odds, and where the expected value is positive (if you have negative expected value you should never play, and Kelly tells you that).

Finally, it is very sensitive to your probability estimates. Going back to my first example, where you have a 5% chance of winning a tournament, let's fix the payout at 30:1 and look at what Kelly tells us if the probability of winning isn't exactly what we thought it was:

  5% (same as first example): $18.33
  4%: $8.00
  3%: Don't enter
  6%: $28.67
  7%: $39.00
So I have to guess my probability of winning the tournament to within 1% of the actual probability, or Kelly is going to tell me drastically wrong amounts. Nobody can set odds on something like a tournament precisely enough for this to be useful. Just like with stocks, if you have the ability to estimate probabilities so well that Kelly stops telling you to do the wrong thing, you can make far more money just directly using your magical probability estimating powers and betting on derivatives. If I could estimate my odds of winning the tournament to within 1%, I could just go to the sports book, bet on who is going to win the tournament, and make far more money than I would in the tournament itself. It's like a system to sell a cake for 15% more profit, and it starts with "first, use your laser vision to preheat the cake pan".


> Again, Kelly gives us terrible advice. If you have a trillion to one payout on a coin flip, you want to bet less, not more! Why would you risk half your money, and have a 1/4 chance of losing all your money, when you can bet 1 cent at a time and just wait to win one time so you can buy half of the stock market with your winnings?

So you're just going to throw away the criterion because you think the results are unintuitive? That's the argument you're making here.

To take your reasoning seriously, the reason why you might not want to bet 1 cent at a time is because the Kelly bet is guaranteed to eventually overtake your 1-cent-bet-strategy. Furthermore, it is completely incorrect to say that the Kelly bet has a 1/4 chance of losing all your money in the given situation. If you lose your first bet, the Kelly criterion tells you not to bet the whole house on the next bet.

Nothing you have written so far suggests that you actually understand the sense in which the Kelly criterion is optimal, which I attempted to explain in my other reply to you. You keep writing as though it only maximizes the expectation of log-utility. In fact it's not clear that you even understand what the Kelly criterion is telling you to do.


Kelly is not optimal if you can't estimate the probabilities with great precision, which you can't outside of contrived examples or casinos. In casinos you have negative expected value on every bet, and Kelly tells you to not play at all. Contrived examples don't matter.

You aren't addressing what I said, you are cherry picking things you find easy to rebut. You are right that I messed up and that you can't lose all your money with two 50% bets. However you ignore the stronger argument, which I opened with and repeatedly pointed out, which is that you can't estimate probabilities well enough to use it, and that if you can estimate probabilities well enough to benefit from Kelly, that ability to estimate probabilities itself almost always unlocks strategies that strictly dominate any benefit Kelly gives you. It's useless in practice.


I don't really care about winning any global arguments. I see bad logic and I try to point it out. If you didn't like that some of your points were easily rebutted then you shouldn't have written them. Leaving them unchallenged makes your position seem artificially strong.

I couldn't resist bringing up the poker bankroll example because I think your in-game-poker example was poorly chosen. To me, it looked like you came up with a situation where the criterion obviously had no hope of being applicable and then used it to argue that the criterion is useless. E.g. I could find a whole list of things for which calculus is not applicable, but that would not be a good argument for 'calculus is useless'. The example I gave is at least closer to the assumptions of the Kelly Criterion.

I think the main thing I wanted to do was to correct the misconception that Kelly is only maximizing expected log utility, because it is a shame if someone (including other readers) thinks that the Kelly Criterion is just a fancy name we gave for the argmax of E f(S) where f happens to be the logarithm.

After all this, you (and other readers) might still conclude that the criterion is useless. But the set of justifications, and maybe the certainty, in that position, should change.


You don't have a 1/4 chance of losing all your money if you repeatedly bet half of it on the trillion-to-one coin flip.


Hi there, you seem to know a good deal about the intricacies of where the Kelly Criterion lies within the range of possible options to calculate outcomes.

Could you point me to any good resource you know of to learn more about the range itself and other options?


Kelly must be the name of the wife in the WSB "wife's boyfriend" memes.

More seriously, great analysis. Couldn't have said it better myself.


I believe it's come into fashion again with the rise of legal sportsbooks in the US (DraftKings, etc). Particularly with the fascination over long-shot "parlays", or chain bets, that can achieve very high payouts with very little outlay. Better odds than buying a lottery ticket by far, or so it is perceived.

I think you have to be a few jokers short of a full deck to get into sports gaming and expect any outcome other than ruin. But the parlays are interesting. And the simplicity with which one could devise a winning strategy is enticing.

Consider a typical NBA season: 120 game nights, about 8 games per night. Let's say you constructed a parlay strategy in which you pick the under to hit on every game played that night, and the payout is large, say $1000 for a $3 bet. That's $360 at risk for the season. And a high probability that eventually it'll cash ;)


If the odds of an under happening are 50% then that is indeed a great bet! The odds of winning would be (0.5)^8 or 1 in 256. So the EV of the bet would be

E(x) = (1/256 * 1000) - (255/256 * 3)

E(x) = .91796875

So you are expected to make nearly 92 cents on every 3 dollars bet. Over a 30% return! Any bookie offering these odds would quickly go broke.


In practice they'd simply flag you as a sharp and ban your account before you made them broke


The Fibonacci sequence is not special because it is just one element of the family of linear recurrences. Is there a larger family to which the Kelly criterion belongs? (Not being snide — I’m genuinely asking)


Yes, any utility function will give you a Kelly-like criterion. Kelly is log utility.


Kelly is special though, it's not just log utility. In fact, viewing it as maximizing log utility is ahistorical; the original derivation was in terms of information theory. (The paper's title is "A new interpretation of information rate" [0].) Further, and most importantly, Kelly betting has certain favorable asymptotic properties that betting strategies motivated by other utility functions don't have. See Breiman's "Optimal gambling systems for favorable games" [1].

This point is well known in the literature, for instance see [2]:

> Perhaps one reason is that maximizing E log S suggests that the investor has a logarithmic utility for money. However, the criticism of the choice of utility functions ignores the fact that maximizing E log S is a consequence of the goals represented by properties P1 and P2, and has nothing to do with utility theory.

[0] https://www.princeton.edu/~wbialek/rome/refs/kelly_56.pdf

[1] http://www-stat.wharton.upenn.edu/~steele/Resources/FTSResou...

[2] https://pubsonline.informs.org/doi/abs/10.1287/moor.5.2.161


Exactly. The Kelly strategy is the strategy that, with probability 1, eventually and permanently beats any other strategy. This isn't true for any other strategy and this criterion has nothing to do with utility.


Paul Samuelson wrote an article in 1979 on this. It contains only one-syllable words, except for the last word, which is "syllable". Title: "Why We Should Not Make Mean Log of Wealth Big though Years to Act Are Long."

His 1971 paper, "The 'Fallacy' of Maximizing the Geometric Mean in Long Sequences of Investing or Gambling" is more readable.


That’s wrong. The assumptions behind the Kelly criterion lead to the log utility only in a particular case.

What if each week you can place a $1 bet that in 50% of cases will triple your bet and in 50% of cases will give you $0? According to the log utility, whether you should bet depends on your current wealth. According to the assumptions behind the Kelly criterion, you should take the bet every week.


Technically the Kelly criterion suggests you should take this bet if your wealth is greater than $1.23, so it does depend on your current wealth.

However, this might still look dumb, and it's because my calculation assumes you could pick a smaller stake for the same bet, which isn't necessarily the case. If you can't, the calculation gets a little bit more complicated, but sure, the Kelly criterion can deal with that too.


According to the assumptions behind the Kelly criterion (maximizing long-term growth rate) you should always take the bet. The Kelly criterion can’t be applied here because we have a different setup (you can’t choose your bet size); that’s my entire point!


the entire field of martingale pricing


I can hear your sigh in answering this all the way from here :D


Bloom filters are the equivalent in computer science. Just beyond the basics, but ubiquitous enough to be annoying when it’s on the front page every other week.



