The bet-on-a-biased-coin paradigm was actually tried in the real world with finance professionals, with a cap on the maximum payout. The results are described here: https://arxiv.org/pdf/1701.01427.pdf
It's pretty interesting. (Note though that the "average returns" reported hide a lot of variation.)
From what I've seen in the poker community, no one has really approached this type of bet sizing from a rigorous perspective beyond the relatively simple Kelly Criterion.
You are right that "your rate of return in a given game might depend on the amount you bet"; this is actually very common. Consider the stock market: buying 1000 shares and selling them a year later will generate less than 1000x the return of buying 1 share and selling it a year later (assuming the stock goes up), because you pay more per share to buy 1000 shares and make less per share when you sell 1000 (the share price moves as you buy/sell).
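To make that concrete, here's a toy sketch (my own illustrative model, not anything from the linked paper) where a linear price impact makes the per-share return shrink as the order size grows:

    # Toy linear-impact model (illustrative assumption): each share bought
    # pushes the price up slightly, each share sold pushes it down.
    def avg_fill_price(p0, shares, impact):
        # Average price when filling `shares` one at a time as the price
        # drifts by `impact` per share.
        return p0 + impact * (shares - 1) / 2

    p0, p1 = 100.0, 120.0  # price now and a year from now
    for n in (1, 1000):
        buy = avg_fill_price(p0, n, +0.001)   # price rises as we buy
        sell = avg_fill_price(p1, n, -0.001)  # price falls as we sell
        print(f"{n} shares: return per share = {sell - buy:.2f}")
    # 1 share: $20.00 per share; 1000 shares: ~$19.00 per share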
Related, I really enjoyed this treatment of the Kelly Criterion by Thorp and highly recommend it http://www.eecs.harvard.edu/cs286r/courses/fall12/papers/Tho...
The log utility function is u(w) = log(w); that's what it is called in economics. Surely maximizing the log of wealth means you are maximizing the log utility function.
> Kelly gives that up, getting far less expectation of actual wealth, but far more expectation of log-wealth (which is -inf at $0 so avoids the $0 results).
That’s a crucial misunderstanding of Kelly’s result. He doesn’t give anything up. He showed that in a nonterminating game the strategy of betting as if you have log utility will give you superior results to any other strategy in terms of long-term wealth growth. What if the game is terminating? Well, in that case you need to know the utility function of the gambler to determine a superior strategy. But in a non-terminating game it is a bit irrelevant, because almost all strategies will lead to infinite utility.
Check my golf betting algorithm out if you're interested: https://www.golfforecast.co.uk
To take poker as an example: bet sizing is an important factor in play, but it has as much to do with the impression the action makes on other players as with the actual underlying odds.
The Kelly criterion takes your bankroll at the point of betting into account, so it does cover that.
I haven't understood your points further than that - minimizing risk of ruin (taken literally) is typically equivalent to not ever gambling at all. No gambler would agree with that, by definition of them being a gambler.
> I wonder how professional gamblers approach bet sizing.
Some variation of the Kelly criterion, whether they admit it or not. Any time you make a decision with the goal of maximising your growth of wealth relative to a level of risk, you're using the Kelly criterion.
> The utility of money is asymmetric; gaining $25000 is worse than not losing $25000.
Right. Not only is it asymmetric -- it is concave. Just like Bernoulli realised when he invented the proto-Kelly criterion. The Kelly criterion can be interpreted as incorporating an asymmetric, concave utility of wealth. (It's not assuming such a utility -- it's just that log utility happens to maximise long-run growth and be asymmetric and concave.)
But critically, it depends on the size of your bankroll. When you are talking about small stakes compared to your wealth, maximising expected value (i.e. running the risk of losing your entire wager) is the growth optimal choice, because in a local enough region, any continuous utility is linear.
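A quick numerical illustration of that local-linearity point (all numbers are made up): with a $1,000,000 bankroll and a wager capped at $100, expected log-wealth is maximised by betting the whole cap, exactly as plain EV maximisation would suggest.

    import math

    W, cap = 1_000_000.0, 100.0
    p, b = 0.6, 1.0  # 60% chance to win even money: positive EV

    def log_growth(stake):
        return p * math.log(W + b * stake) + (1 - p) * math.log(W - stake)

    best_stake = max(range(0, int(cap) + 1), key=log_growth)
    print(best_stake)  # -> 100: bet the full (locally negligible) cap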
> Relatedly, most actual gamblers want to ensure good returns while not going broke, so minimizing risk of ruin is often more important than maximizing return rate.
This is exactly what the Kelly criterion does. It sacrifices expected value in every gamble for larger long-run growth (which requires also that you don't go bust, or lose too much too quickly.)
> Further complicating the matter is that in real life you don't know your actual probability of success, but you may have an estimation.
But decomplicating the matter is that the Kelly criterion is very forgiving of estimation error, as long as you make your errors intelligently. The growth rate as a function of bet size looks roughly like a downward-opening quadratic, which means there is almost a plateau around the optimal bet size, where small amounts of misestimation do not actually make your growth much different from the optimal growth. (And if you're worried about large amounts of misestimation, the linear combinations of keeping money in your wallet and the full Kelly bet form a sort of "efficient frontier" (to borrow MPT terminology) of risk-growth tradeoffs. Meaning you can always bet less than the full Kelly bet, and get an optimal growth for the risk level that corresponds to. Hence the popularity of half-Kelly and similar variants.)
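You can see the plateau in a few lines (an illustrative 60/40 even-money coin, Kelly optimum f* = 0.2):

    import math

    p, q = 0.6, 0.4
    def growth(f):  # expected log-growth per bet at fraction f
        return p * math.log(1 + f) + q * math.log(1 - f)

    for f in (0.10, 0.15, 0.20, 0.25, 0.30):
        print(f"f = {f:.2f}: growth = {growth(f):+.5f}")
    # f = 0.15 and f = 0.25 each give up under 7% of the optimal growth rate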
In the end, estimating the joint distribution of outcomes for the Kelly criterion is a far more forgiving requirement than e.g. estimating the parameters needed for modern portfolio theory type analysis.
> And finally, though this is less commonly significant, your rate of return in a given game might depend on the amount you bet.
A fact also easily plugged into the Kelly criterion. It might make the search for optimality harder than setting a derivative to zero, but we have computers for numerical optimisation!
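For instance, here's a hedged sketch where the odds degrade with stake size (a made-up impact model) and the optimum is found by brute force:

    import math

    p = 0.6
    def odds(f):               # toy assumption: bigger bets get worse odds
        return 1.0 - 0.5 * f

    def growth(f):
        return p * math.log(1 + odds(f) * f) + (1 - p) * math.log(1 - f)

    best_f = max((i / 1000 for i in range(999)), key=growth)
    print(best_f)  # ~0.13, below the impact-free Kelly fraction of 0.2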
Shouldn't the optimal bet size, f, be 2p-1?
The good news is that you don't need to know it exactly, you just need to make a better guess than the bookies (w.r.t. the Kullback Leibler divergence or cross-entropy, whichever takes your fancy).
Same goes for Black-Scholes which includes _future_ volatility.
Even if we had neither price nor volatility, we can still talk about the surface of possible (price, volatility) pairs which are compatible with the model.
The implied vol is a useful way to make sense of the actual market prices of options. We also might have some predictions about the market's implied vol changing going forward and we can reverse those errors back into expected price changes (and maybe trade on them).
I like the presentation style though, and the allusion to Shannon’s theorem although I didn’t quite grasp the connection.
I think it would be intuitive for anyone that you could lose the first few rounds despite there being very favourable probabilities. If you really do go all-in, you could lose it all on the first round. But he's using an example of going "just" 80% in.
I think the part that's unintuitive for many people is that the probability of going bust increases the more rounds you play. Most people (myself included, perhaps) would naively expect that a favourable game would always have positive return as the number of plays increases. That's a mental failure to account for the fact that the game ends if we lose everything.
The result is counterintuitive because, putting 80% in each time, you could lose 99.97% after 5 straight losses or 99.99999% after 10 straight losses. The more you play, the more likely you are to eventually hit N consecutive losses and, therefore, go bust.
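A quick Monte Carlo (my own sketch) of that 80% strategy at a 70% win rate with even-money payoffs: the per-round log-growth is 0.7*ln(1.8) + 0.3*ln(0.2) ≈ -0.07, so the typical path shrinks even though each bet has positive EV.

    import random

    random.seed(0)
    ruined = 0
    for _ in range(10_000):
        w = 100.0
        for _ in range(200):
            stake = 0.8 * w
            w += stake if random.random() < 0.7 else -stake
        ruined += w < 0.01  # effectively bust: under a cent left
    print(f"{ruined / 100:.0f}% of paths end below one cent after 200 rounds")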
I have to concede that I don't know who would start from a place where they think betting 80% of the farm on each play is a sane strategy. Then again, we've all heard of founders who re-mortgage the house to fund their startup so...
My two examples were based on the idea that there’s some limit on how finely we can divide the original bet. For example, if a person starts with $10 and loses 99.97%, they have nothing left because of rounding. Likewise for losing 99.999% of $1000.
In hindsight, though, the risk of consecutive losses wasn’t the point of the math. As you point out, the asymmetry is the problem. Loss occurs over time even with intermittent losses.
Here’s an example of how to halve $100 in capital, despite a 70% win rate.
$ 20.00 (L)
$ 36.00 (W)
$ 64.80 (W)
$ 12.96 (L)
$ 23.33 (W)
$ 41.99 (W)
$ 75.58 (W)
$ 15.12 (L)
$ 27.22 (W)
$ 49.00 (W)
. . .
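That sequence is just "bet 80% every round, even money" with a 7-3 win/loss record, which you can reproduce directly:

    w = 100.0
    for outcome in "LWWLWWWLWW":
        stake = 0.8 * w
        w += stake if outcome == "W" else -stake
        print(f"$ {w:8.2f} ({outcome})")
    # ends near $49: 100 * 1.8**7 * 0.2**3 ≈ 48.98, halved despite 70% wins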
The other side of this is Shannon's Demon: a sort of crappy game can be made into a profitable one by sizing it properly (and rebalancing).
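A minimal sketch of Shannon's Demon (illustrative numbers): an asset that doubles or halves each period with equal probability goes nowhere on its own, but a 50/50 cash/asset mix rebalanced every period grows steadily.

    import random

    random.seed(1)
    hold, rebal = 100.0, 100.0
    for _ in range(1_000):
        r = 2.0 if random.random() < 0.5 else 0.5
        hold *= r                 # buy and hold: zero log-drift
        rebal *= 0.5 + 0.5 * r    # rebalance to 50% cash / 50% asset
    print(f"buy-and-hold: {hold:.3g}, rebalanced: {rebal:.3g}")
    # rebalanced log-growth per period: 0.5*ln(1.5) + 0.5*ln(0.75) ≈ +0.059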
That said, growth is pretty close to optimal even for smaller wagers. This plot shows growth rate as a function of wager size: https://www.wolframalpha.com/input/?i=plot+log%2850+-+x%29*0...
This is an obvious example. But really all stocks (or at least sectors) are correlated just like this. So other examples wouldn't be so obvious.
That's not quite true. The Kelly criterion (generalised to portfolio selection) requires the joint distribution of outcomes, which captures all correlations.
Taking somewhat recent historic outcomes as representative of the joint distribution of outcomes (this effectively becomes the Cover universal portfolio), I'm guessing the Kelly criterion would suggest something like 50% cash and 50% equity, if those are the only two options.
Good day: you lose 0.5% on SPY but gain 2% on AMC.
Bad day: you maybe gain 0.5-1% on SPY and lose 2% on AMC.
Related to this, the best investors in the world are quite old guys. Why? Because they lived long enough to accumulate enough wealth to be of public interest.
Unless you have a higher or different goal. For example, because you want to impress a woman, because your name is James Bond, or because (more likely) you profit from the drama around an outlier result. This is why we sometimes see people lose and still gain from it in the longer term; they gain from it via advertising, for example. Influencers, advertisers, marketers, terrorists -- they all abuse this mechanism.
The game lets you bet on chicken fights and tells you the probability your chicken will win (starting at about a 62% chance to win), so the Kelly criterion is perfect for it. It's a bit incredible how reliable it is.
Should be f=2p-1
In theory, Kelly is optimal--if you knew the exact probability density function of your returns, it would give you the right leverage to take.
In practice, you're always playing with risks, some you're factoring into your models, some you're choosing not to because they're intractable, some you're not even aware of until they occur. The most basic premise--today's returns will be a function of hypotheses that I've derived from looking at past observations--is an approximation at best.
This mismatch between model and reality can lead to expensive lessons learned when using the full Kelly model, so often traders will "half-kelly" or something like that, to incorporate the basic idea of risk scaling proposed by the Kelly model but with more safety margin.
And to be clear, half-Kelly, quarter-Kelly, eighty-percent-Kelly and any other linear combination between wallet and full Kelly are actually still the only strategies that are growth optimal -- given the particular safety margin each corresponds to.
It's optimal for one utility function (log of future $), but that doesn't make it necessarily optimal for everything, right?
If we’re close to an investment goal, like saving a house downpayment or retiring, the calculus is quite different. Even outside that, loss aversion is a real thing and we’re likely happier trading some upside for being able to sleep at night.
Page 27: Vince [vi] and independently Thorp [vii] provide a solution that satisfies the Kelly Criterion for the continuous finance case, often quoted in the financial community to the effect that “f should equal the expected excess return of the strategy divided by the expected variance of the excess return:”
f = (m-r) / s^2
so it's Sharpe with variance instead of standard deviation in the denominator, correct?
An intuitive way to think about this is that Sharpe depends on the specific horizon you're using. E.g. annualized Sharpe will be sqrt(252) larger than daily Sharpe. It would not make sense to change the Kelly criterion based on a substitution of variables. In contrast variance, like returns, scales linearly with time horizon. Therefore the variance ratio is invariant to the time horizon.
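A tiny sketch with made-up daily figures: scaling the horizon by 252 multiplies Sharpe by sqrt(252) but leaves (m - r) / s^2 unchanged.

    mu_d, var_d, rf_d = 0.0004, 0.0001, 0.00008  # hypothetical daily numbers

    for T, label in ((1, "daily"), (252, "annualised")):
        m, s2, r = mu_d * T, var_d * T, rf_d * T
        print(f"{label}: Sharpe = {(m - r) / s2 ** 0.5:.3f}, "
              f"Kelly leverage = {(m - r) / s2:.2f}")
    # daily:      Sharpe = 0.032, Kelly leverage = 3.20
    # annualised: Sharpe = 0.508, Kelly leverage = 3.20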
Edward Thorp and Ralph Vince both conclude that the Kelly Criterion in the continuous case is excess returns divided by variance, which is pretty close to the Sharpe ratio, correct?
Asking to understand better, not to be combative. Your comment made it seem like that formula is way off.
I precalc my stoplosses + stopgains then use a simulation to get the win/loss probabilities on training data.
What I observed is that the Kelly formula really prefers the tiny stoplosses, so when you sort your predictions by Kelly score it will pick the ones with tiny stoplosses.
What happened to me in the live test is the tiny stoplosses triggered, when a stopgain would have triggered later.
I know someone is going to say "that's a problem with your stoplosses+stopgains OOS performance" and they are right, but OOS stoploss+stopgain calculation isn't trivial for me :\
Moreover you can find the same result with a simple grid search over bet sizes, obviating the need to estimate any parameters of the outcome distribution.
It would be more useful in professional gambling where outcome distributions are more knowable.
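Something like this (a sketch over hypothetical past outcomes, where +1 means the stake doubled and -1 means it was lost):

    import math

    outcomes = [+1, +1, -1, +1, -1, +1, +1, -1, +1, +1]

    def log_wealth(f):  # log-wealth after replaying history at fraction f
        return sum(math.log(1 + f * r) for r in outcomes)

    best = max((i / 100 for i in range(100)), key=log_wealth)
    print(best)  # 0.4, matching p - q = 0.7 - 0.3 for this even-money sample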
I kinda regret it now. I'm not sure what I was trying to find out from people. I guess in some way it's a cultural question masquerading as a technical question, because it reveals whether you've heard of it and whether you have heard of the standard stuff that's said about it:
- Don't do full Kelly because if you're on the wrong side (too big bets) that's definitely bad.
- Depends on the probabilities being static.
- You can find a continuous form of it. You can also find the implied leverage from the Sharpe ratio.
I wonder if my memory is even correct on these. But the point remains that it's not terribly useful as a thing to evaluate people with. I guess the question "How do you size your positions?" needs to start somewhere.
Maybe the question itself is a good place to start. If someone asked me about Kelly, my mind would immediately drift from "statistical models and dynamic programming" to something else.
The real engineering problem is threefold: (1) how do I model my return generating process, (2) what is my utility function, and (3) do the first two steps yield a tractable Bellman equation?
If the problem is framed right, most people will have something to say about 1 and 2. The third part is tricky. A real world solution will involve finite element methods. But in an interview, people may be hesitant to bring up approaches that don't yield closed form solutions.
That's what's special about Kelly: its assumptions for steps 1 and 2 produce a closed-form solution in step 3.
This is probably not what you actually want in almost any situation.
The one time it actually makes sense to bring it in is if you are forced to make a certain number of wagers in some game AND you have good-enough knowledge of the odds of the game AND your payout is proportional to how much money you have left at the end of the game AND the wager size does not affect the outcome of the game in any way. This scenario never happens.
Even when you meet many of the necessary prerequisites to use Kelly, it still doesn't make sense at all. For example, Blackjack tournaments. You are given a set amount of starting money, you play for a set number of hands (or time), and you know the odds perfectly. However, the payout structure isn't proportional to your final amount of money, so using Kelly has more or less no chance of winning any money in the tournament. They usually pay for the top N results, which means you have to go with a very high variance strategy AND win consistently to place.
Poker: Nope, not even close, bet size is a direct input into the dynamics of the game. Predictable betting of any kind is a maximally bad strategy.
Stock Market: utility of money isn't logarithmic, so it is not worth maximizing, even if you knew probabilities and were forced to make wagers, which you don't and aren't. If you could even approximate probabilities you could use that power to basically print money on derivatives, so even if Kelly applied, the prerequisites are too strong and would make far far better strategies available.
The only valid use case for bringing in the Kelly Criterion is for gamblers to feel better about burning money at the tables by improperly applying it.
The Kelly criterion happens to be optimal with respect to log wealth, but that's not the main reason it's interesting. Many explanations, including the original post, make this mistake. Maybe because 'maximizing expected utility' is a more common idea.
The first sentence of the wikipedia article:
"In probability theory and intertemporal portfolio choice, the Kelly criterion (or Kelly strategy or Kelly bet), also known as the scientific gambling method, is a formula for bet sizing that leads almost surely (under the assumption of known expected returns) to higher wealth compared to any other strategy in the long run".
In other words, pick a strategy. I'll pick the kelly strategy. There will be some point in time, after which I will have more money than you, and you will never overtake me. No logarithms involved. This is something you can easily check by simulation, but requires some heavier math to formulate precisely and prove.
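Here's the kind of simulation I mean (an illustrative setup: a 60/40 even-money coin, Kelly's f = 0.2 against a rival fixed fraction of 0.4, both facing the same coin flips):

    import random

    random.seed(42)
    ahead = 0
    for _ in range(2_000):
        kelly, rival = 1.0, 1.0
        for _ in range(1_000):
            win = random.random() < 0.6
            kelly *= 1.2 if win else 0.8  # bets 20% each round
            rival *= 1.4 if win else 0.6  # bets 40% each round
        ahead += kelly > rival
    print(f"Kelly ahead after 1000 bets in {ahead / 20:.1f}% of runs")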
See also posts by spekcular and rssoconnor elsewhere in this thread.
'Almost surely' is a technical term which means 'with probability 1'.
Either you're claiming that the theorem I attempted to describe is false, or you're misunderstanding the theorem I am trying to describe.
I never wrote anything about 'high probability' either so I don't know why you introduced that notion.
It's actually to maximise the expected logarithm of the monetary amount, and it's a pretty good heuristic (in most circumstances) given that most opportunities are exponential. (It usually takes a given amount of effort to double your money.)
Here, let's do some examples to see how dumb it is in practice:
Let's say I want to enter a tournament where I guess I have a 5% chance of winning, and I have $1000. Here is how much the KC says I should be willing to pay to enter, based on the payout odds:
10:1 -> Don't enter (duh)
15:1 -> Don't enter (duh again)
19:1 (breakeven) -> $0.00 (duh)
20:1 -> $2.50
30:1 -> $18.33
100:1 -> $40.50
1000:1 -> $49.05
10000:1 -> $49.95
10000000000000000000000000000000000:1 -> $50.00
OK, let's say we have a 95% chance of winning the tournament:
1:1 -> $900
2:1 -> $925
50:1 -> $949
1000000000000000:1 -> $950
Maybe it gets more interesting if it's around 50/50:
1:1 -> $0 (ok, makes sense)
2:1 -> $250
3:1 -> $333
4:1 -> $375
100000000000:1 -> $500
So I still contend that whatever it is that Kelly maximizes, it's a dumb thing to maximize outside of contrived situations where you are forced to bet and know exact odds, and where the expected value is positive (if you have negative expected value you should never play, and Kelly tells you that).
Finally, it is very sensitive to your probability estimates. Going back to my first example, where you have a 5% chance of winning a tournament, let's fix the payout at 30:1 and look at what Kelly tells us if the probability of winning isn't exactly what we thought it was:
5% (same as first example): $18.33
3%: Don't enter
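(All the numbers above follow from the standard formula: at payout odds b:1 with win probability p, the Kelly fraction is f = p - (1 - p)/b, floored at zero. A quick sketch to reproduce them:

    def kelly_entry(p, b, bankroll=1000.0):
        f = p - (1 - p) / b
        return max(0.0, bankroll * f)

    for p, b in [(0.05, 20), (0.05, 30), (0.05, 100), (0.5, 2), (0.95, 1)]:
        print(f"p = {p}, odds {b}:1 -> ${kelly_entry(p, b):.2f}")
    # prints $2.50, $18.33, $40.50, $250.00, $900.00, matching the tables.)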
So you're just going to throw away the criterion because you think the results are unintuitive? That's the argument you're making here.
To take your reasoning seriously, the reason why you might not want to bet 1 cent at a time is because the Kelly bet is guaranteed to eventually overtake your 1-cent-bet-strategy. Furthermore, it is completely incorrect to say that the Kelly bet has a 1/4 chance of losing all your money in the given situation. If you lose your first bet, the Kelly criterion tells you not to bet the whole house on the next bet.
Nothing you have written so far suggests that you actually understand the sense in which the Kelly criterion is optimal, which I attempted to explain in my other reply to you.
You keep writing as though it only maximizes the expectation of log-utility.
In fact it's not clear that you even understand what the Kelly criterion is telling you to do.
You aren't addressing what I said, you are cherry picking things you find easy to rebut. You are right that I messed up and that you can't lose all your money with two 50% bets. However you ignore the stronger argument, which I opened with and repeatedly pointed out, which is that you can't estimate probabilities well enough to use it, and that if you can estimate probabilities well enough to benefit from Kelly, that ability to estimate probabilities itself almost always unlocks strategies that strictly dominate any benefit Kelly gives you. It's useless in practice.
I couldn't resist bringing up the poker bankroll example because I think your in-game-poker example was poorly chosen. To me, it looked like you came up with a situation where the criterion obviously had no hope of being applicable and then used it to argue that the criterion is useless.
E.g. I could find a whole list of things for which calculus is not applicable, but that would not be a good argument for 'calculus is useless'.
The example I gave is at least closer to the assumptions of the Kelly Criterion.
I think the main thing I wanted to do was to correct the misconception that Kelly is only maximizing expected log utility, because it is a shame if someone (including other readers) thinks that the Kelly Criterion is just a fancy name we gave for the argmax of E f(S) where f happens to be the logarithm.
After all this, you (and other readers) might still conclude that the criterion is useless. But the set of justifications, and maybe the certainty, in that position, should change.
Could you point me to any good resource you know of to learn more about the range itself and other options?
More seriously, great analysis. Couldn't have said it better myself.
I think you have to be a few jokers short of a full deck to get into sports gaming and expect any outcome other than ruin. But the parlays are interesting. And the simplicity with which one could devise a winning strategy is enticing.
Consider a typical NBA season: 120 game nights, about 8 games per night. Let's say you constructed a parlay strategy in which you pick the under to hit on every game played that night. Suppose the payout is large, say $1000 for a $3 bet. That's $360 of risk for the season. And a high probability that eventually it'll cash ;)
E(x) = (1/256 * 1000) - (255/256 * 3)
E(x) = .91796875
So you are expected to make nearly 92 cents on every 3 dollars bet. Over a 30% return! Any bookie offering these odds would quickly go broke.
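Checking that arithmetic (assuming each game's under is an independent coin flip):

    p_hit = 0.5 ** 8                     # all 8 unders hit: 1/256 per night
    ev_night = p_hit * 1000 - (1 - p_hit) * 3
    p_season = 1 - (1 - p_hit) ** 120    # at least one cash in 120 nights
    print(f"EV per $3 ticket: ${ev_night:.4f}")    # $0.9180
    print(f"P(cash this season): {p_season:.1%}")  # ~37.5%

So a single season cashes only about a third of the time; "eventually" may take a few seasons, even though the EV per ticket is strongly positive.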
This point is well known in the literature; for instance, see:
> Perhaps one reason is that maximizing E log S suggests that the investor has a logarithmic utility for money. However, the criticism of the choice of utility functions ignores the fact that maximizing E log S is a consequence of the goals represented by properties P1 and P2, and has nothing to do with utility theory.
His 1971 paper, "The 'Fallacy' of Maximizing the Geometric Mean in Long Sequences of Investing or Gambling" is more readable.
What if each week you can make a $1 bet that in 50% of cases triples your money and in 50% of cases returns $0? According to log utility, whether you should bet depends on your current wealth. According to the assumptions behind the Kelly criterion, you should take the bet every week.
However, this might still look dumb, and it's because my calculation assumes you could pick a smaller stake for the same bet, which isn't necessarily the case. If you can't, the calculation gets a little bit more complicated, but sure, the Kelly criterion can deal with that too.