
Real world machine learning at Jane Street - astdb
https://blog.janestreet.com/real-world-machine-learning-part-1/
======
huac
This is a good post!

> An example: modeling the difference in price between an ETF and its basket
> as a Gaussian would be a mistake, since there are well-defined arbitrage
> bounds on this difference.

The infinite tails of the normal distribution always strike where you least
expect :)
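
To put a rough number on the quoted point, here is a minimal sketch of how much probability a Gaussian places outside a pair of hard bounds. The function name, the bounds, and the parameters are illustrative assumptions, not anything from the post.

```python
import math

def gaussian_mass_outside(lower, upper, mean, std):
    """Probability that a N(mean, std^2) variable falls outside [lower, upper]."""
    scale = std * math.sqrt(2.0)
    below = 0.5 * math.erfc((mean - lower) / scale)  # P(X < lower)
    above = 0.5 * math.erfc((upper - mean) / scale)  # P(X > upper)
    return below + above

# Hypothetical numbers: bounds at +/- 3 standard deviations around zero.
leak = gaussian_mass_outside(-3.0, 3.0, mean=0.0, std=1.0)
print(f"mass outside the bounds: {leak:.4%}")
```

However wide the bounds are, the result is never exactly zero, which is the mismatch the quote describes.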

~~~
totalZero
Even if a Gaussian isn't the right model for this difference, I don't think
that example always holds. Sometimes the ETF trades at a different time than
its basket (the same can be true for ADRs), or there is a substantial
immediate dislocation in an individual basket component due to a news headline
that doesn't get reflected in the ETF price until a few moments later. Also,
you may have situations where disadvantageous carry rates cause the
basket-vs-ETF carry trade to either trade wide, or to be ambiguous even when an
arbitrage is possible (i.e., you would make money if everything stays put but
you don't actually feel confident about the stability of the carry rates). For
options, even an exemplary arbitrage like put-call parity has situations where
dividends and/or forward rates justify substantial disparity.
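
A rough sketch of the put-call parity point, assuming European options, a continuous dividend yield, and a flat interest rate. The function name and all the numbers are made up for illustration.

```python
import math

def parity_gap(call, put, spot, strike, rate, div_yield, years):
    """Return (C - P) minus the parity value S*exp(-q*T) - K*exp(-r*T)."""
    parity_value = (spot * math.exp(-div_yield * years)
                    - strike * math.exp(-rate * years))
    return (call - put) - parity_value

# Hypothetical market: 5% dividend yield, 2% rate, one year to expiry.
spot, strike, rate, div_yield, years = 100.0, 100.0, 0.02, 0.05, 1.0
put = 10.0
# Price the call so that parity holds exactly once dividends are included.
call = put + spot * math.exp(-div_yield * years) - strike * math.exp(-rate * years)

with_dividends = parity_gap(call, put, spot, strike, rate, div_yield, years)
ignoring_dividends = parity_gap(call, put, spot, strike, rate, 0.0, years)
print(f"gap with dividends:     {with_dividends:.4f}")
print(f"gap ignoring dividends: {ignoring_dividends:.4f}")
```

With the dividend yield included the gap is zero up to rounding; leave it out and the same prices look several points away from parity even though nothing is mispriced.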

I see how it would make sense for an algorithmic trading strategy to make
simplifying assumptions, but within every such assumption is a neglected
opportunity, however small or rare, to either make or save money.

