The part I missed is that you're doing this at the character level, not at the word level. If you were doing this at the word level, a Markov chain could easily tell an IPA from a porter. But at the character level it suddenly becomes a lot more impressive. Thank you! I'll read the paper now.