This method will work but will require a large grid and consequently be quite slow. An order of magnitude or two faster than this is possible if you are clever.
Given the exercise boundary, the American option price can be written exactly as a one-dimensional integral. That is the key insight behind this superior method.
(I'm a fixed income quant, so I didn't look for it until now.) For a more advanced model than Black-Scholes, e.g. local vol, I don't expect it can be extended, and one would then need to use some PDE-based method.
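For reference, the one-dimensional integral mentioned above is usually written as the European price plus an early-exercise premium integrated over time to expiry. Below is a minimal sketch for a put under Black-Scholes with continuous dividend yield q. The function names and the midpoint quadrature are my own choices, and the `boundary` argument is assumed to be supplied by the caller; finding the true boundary is the hard part and is not attempted here.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(s, b, u, r, q, sigma):
    """Black-Scholes d1 for spot s, level b and horizon u."""
    return (math.log(s / b) + (r - q + 0.5 * sigma**2) * u) / (sigma * math.sqrt(u))

def european_put(s, k, tau, r, q, sigma):
    """Black-Scholes European put with time to expiry tau."""
    x1 = d1(s, k, tau, r, q, sigma)
    x2 = x1 - sigma * math.sqrt(tau)
    return k * math.exp(-r * tau) * norm_cdf(-x2) - s * math.exp(-q * tau) * norm_cdf(-x1)

def early_exercise_premium(s, k, tau, r, q, sigma, boundary, n=2000):
    """Midpoint-rule quadrature of the premium integral.

    boundary(t) must return the exercise boundary when t years remain
    to expiry. It is an input here, not something this sketch computes.
    """
    h = tau / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h            # time elapsed from today
        b = boundary(tau - u)        # boundary with tau - u left to expiry
        x1 = d1(s, b, u, r, q, sigma)
        x2 = x1 - sigma * math.sqrt(u)
        total += (r * k * math.exp(-r * u) * norm_cdf(-x2)
                  - q * s * math.exp(-q * u) * norm_cdf(-x1))
    return total * h

def american_put(s, k, tau, r, q, sigma, boundary):
    """European price plus early-exercise premium for a given boundary."""
    return european_put(s, k, tau, r, q, sigma) + early_exercise_premium(
        s, k, tau, r, q, sigma, boundary)

# Placeholder boundary for illustration only (NOT a solved boundary).
def toy_boundary(t, k=100.0):
    return k * (0.8 + 0.2 * math.exp(-t))
```

With r = q = 0 the integrand vanishes and the function returns the European price, which is a quick sanity check. Any number produced with `toy_boundary` is meaningless as a price; it only shows the plumbing.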
Your intuition is quite correct. These methods (Leif et al.) do not extend well to the different boundary or intermediate conditions that are quite necessary in real-life scenarios. AFAIK, there are a few teams on the Street that do fairly advanced numerical analysis, but most resort to Monte Carlo or some statistically-informed perturbation theory.
(I wish I could talk more, but yeah, legal obligations)
Monte Carlo might be OK for OTC derivatives; however, for automatic market making of exchange-traded options, which are mostly American, it would be too slow.
I go through academic literature on a regular basis, hoping that some kind of really major improvement might magically appear. Usually the ideas are great, but they don’t survive real-life equities markets (from dividends to non-convex payoffs, local vol, etc.).
Crank-Nicolson is probably the least objectionable part of the method, but I prefer ADE.
There are two numerically painful parts of the problem: the advection term and the oscillation-inducing terminal condition (because it has a discontinuous derivative). I like to deal with advection by transforming the equation to an advection-free equation. I'm under NDA on the best solution to the oscillatory terminal condition, so I can't give that one away, unfortunately.
Indeed, a transformation (of some kind) is fairly standard, including the derivation for the standard analytic solution for European options.
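To make the "standard" transformation concrete: the textbook change of variables x = ln(S/K), tau = sigma^2 (T - t) / 2, V = K exp(alpha x + beta tau) u(x, tau), with k = 2r/sigma^2, alpha = -(k - 1)/2 and beta = -(k + 1)^2/4, turns the Black-Scholes PDE into the heat equation u_tau = u_xx, which has no advection term. This is not necessarily the transformation the parent uses; it is just the one from the European derivation. A small sketch that checks it numerically by finite differences on the closed-form call price (parameter values are arbitrary assumptions):

```python
import math

K, R, SIGMA = 100.0, 0.05, 0.2          # assumed strike, rate, volatility
K_RATIO = 2.0 * R / SIGMA**2            # k = 2r / sigma^2
ALPHA = -(K_RATIO - 1.0) / 2.0
BETA = -(K_RATIO + 1.0) ** 2 / 4.0

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, t_rem):
    """Black-Scholes European call with t_rem years to expiry."""
    d1 = (math.log(s / K) + (R + 0.5 * SIGMA**2) * t_rem) / (SIGMA * math.sqrt(t_rem))
    d2 = d1 - SIGMA * math.sqrt(t_rem)
    return s * norm_cdf(d1) - K * math.exp(-R * t_rem) * norm_cdf(d2)

def u(x, tau):
    """Transformed price: V = K * exp(ALPHA*x + BETA*tau) * u."""
    s = K * math.exp(x)
    t_rem = 2.0 * tau / SIGMA**2
    return bs_call(s, t_rem) / (K * math.exp(ALPHA * x + BETA * tau))

def heat_residual(x, tau, hx=1e-4, ht=1e-5):
    """Central-difference estimate of u_tau - u_xx (should be near zero)."""
    u_tau = (u(x, tau + ht) - u(x, tau - ht)) / (2.0 * ht)
    u_xx = (u(x + hx, tau) - 2.0 * u(x, tau) + u(x - hx, tau)) / hx**2
    return u_tau - u_xx

residual = heat_residual(0.1, 0.02)     # tau = 0.02 is one year at sigma = 0.2
```

The residual is only zero up to finite-difference and rounding error, so treat it as a consistency check rather than a proof.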
AFAIK, a discontinuous first derivative per se may act as a seed for an oscillation, because its high-frequency content is not captured by any finite-resolution algorithm (n.b. Gibbs phenomenon). But it is Crank-Nicolson that characteristically creates these oscillatory problems -- in other words, there are algorithms that can gracefully handle the discontinuity without creating oscillation.
Yeah, the discretization interacts with the oscillation for sure. Fully implicit is better than CN with regard to oscillation, for instance, but I don't think it would be a net win. Running a few implicit steps before switching to CN might help, though I've never tried it.
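The "few implicit steps, then CN" idea is, as far as I know, usually called Rannacher start-up (the usual form uses implicit half-steps; the sketch below uses full steps for brevity). Here is a minimal theta-scheme sketch for a European put on a uniform S grid, just to show where the switch happens. Grid sizes, parameters and function names are all assumptions, and the American early-exercise constraint is deliberately left out.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def european_put_fd(K=100.0, r=0.05, sigma=0.2, T=1.0,
                    s_max=400.0, m=400, n_steps=50, implicit_start=2):
    """Theta scheme in time to expiry: theta=1 for the first
    implicit_start steps (fully implicit), theta=0.5 (CN) afterwards."""
    ds, dt = s_max / m, T / n_steps
    v = [max(K - i * ds, 0.0) for i in range(m + 1)]       # payoff at expiry
    idx = range(1, m)                                      # interior nodes
    a = [0.5 * sigma**2 * i * i - 0.5 * r * i for i in idx]
    b = [-sigma**2 * i * i - r for i in idx]
    c = [0.5 * sigma**2 * i * i + 0.5 * r * i for i in idx]
    for step in range(n_steps):
        theta = 1.0 if step < implicit_start else 0.5
        v0_new = K * math.exp(-r * (step + 1) * dt)        # boundary at S = 0
        rhs = [v[i] + (1.0 - theta) * dt *
               (a[j] * v[i - 1] + b[j] * v[i] + c[j] * v[i + 1])
               for j, i in enumerate(idx)]
        rhs[0] += theta * dt * a[0] * v0_new               # known boundary value
        lower = [-theta * dt * x for x in a]
        diag = [1.0 - theta * dt * x for x in b]
        upper = [-theta * dt * x for x in c]
        v = [v0_new] + solve_tridiagonal(lower, diag, upper, rhs) + [0.0]
    return v, ds

def price_at(spot, **kwargs):
    """Read the grid value at the node nearest to spot."""
    values, ds = european_put_fd(**kwargs)
    return values[int(round(spot / ds))]
```

Calling `price_at(100.0)` versus `price_at(100.0, implicit_start=0)` compares the damped start against pure CN. The price itself tends to look fine either way; the oscillations typically show up in second differences (gamma) near the strike when the time step is large relative to the grid spacing.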
Real historical data might not cover the current market state.
For example, the option price depends, among other parameters, on the interest rate. For the last decade, interest rates were around 0% in Europe and slightly higher in the US. If you train on that data only, there is no chance to "learn" option prices in the high-interest-rate environment that we have seen over the last few years. Hence, you need synthetic data to learn that region of the market space.
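A rough sketch of what generating such synthetic data could look like: sample the inputs, including rates well above anything in a zero-rate history, and label each sample with a pricer. Everything here is an assumption for illustration. The sampling ranges are made up, and the closed-form European Black-Scholes put is only a stand-in for whatever (American) pricer would actually produce the labels.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_put(s, k, t, r, sigma):
    """European Black-Scholes put; stand-in for the real labelling pricer."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s * norm_cdf(-d1)

def synthetic_samples(n, rate_range=(0.0, 0.10), seed=0):
    """Draw n (features, label) rows with rates sampled uniformly from
    rate_range, so high-rate regimes are covered by construction."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s = rng.uniform(50.0, 150.0)       # spot (hypothetical range)
        k = 100.0                          # strike fixed for simplicity
        t = rng.uniform(0.05, 2.0)         # years to expiry
        r = rng.uniform(*rate_range)       # interest rate
        sigma = rng.uniform(0.1, 0.6)      # volatility
        rows.append({"s": s, "k": k, "t": t, "r": r, "sigma": sigma,
                     "price": bs_put(s, k, t, r, sigma)})
    return rows

samples = synthetic_samples(1000)
```

The point is only that the rate dimension is sampled independently of history; how to mix such rows with real market data is a separate question.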