Well, if that's the case, then the situation is worse than I imagined. I thought that training the model would be far more expensive computationally than using it to compute an 'answer'.
Training is much more computationally intensive than computing a _single_ answer, but not compared to running the model over time at scale for millions of users.
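A quick back-of-the-envelope sketch makes the point. Using the common approximations that training costs about 6·N·D FLOPs and generating one token costs about 2·N FLOPs (N = parameter count, D = training tokens), the break-even point where cumulative inference compute matches the training run works out to 3·D served tokens, independent of model size. The specific N and D below are hypothetical, chosen purely for illustration:

```python
# Rough comparison of one-time training compute vs. recurring inference compute.
# Standard approximations: training ~ 6*N*D FLOPs, inference ~ 2*N FLOPs/token.
# N and D are hypothetical example values.

N = 70e9    # model parameters (hypothetical)
D = 1.4e12  # tokens seen during training (hypothetical)

training_flops = 6 * N * D   # one-time cost of the training run
flops_per_token = 2 * N      # recurring cost for each generated token

# Served tokens at which cumulative inference compute equals training compute.
# Note that N cancels out: break-even = 3 * D regardless of model size.
break_even_tokens = training_flops / flops_per_token

print(f"training:      {training_flops:.2e} FLOPs")
print(f"per token:     {flops_per_token:.2e} FLOPs")
print(f"break-even at: {break_even_tokens:.2e} generated tokens")
```

So once a deployment has served roughly three times as many tokens as the model was trained on, inference has consumed more total compute than training did, which a popular service with millions of users can reach fairly quickly.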
Thanks.