With specialized hardware you can get close. But you're still talking about mid-single-digit microseconds on inference alone. A competitor using linear models can get down to hundreds of nanoseconds. If you're in FPGA world, that kind of latency advantage is worth far more than a 30% accuracy improvement from using a complex ML model.
At its core, macro-level algorithmic trading is answering a question with only two possible answers at any point in time: will the next tick be "up" or "down"?
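That up/down framing is literally a binary classifier, which is why the linear models mentioned above can be so fast: hot-path inference collapses to one dot product and a threshold. A toy sketch of that framing (the features, data, and training loop here are illustrative assumptions, not anyone's actual strategy):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: predict the sign of the next tick from the last 3 tick moves.
# (Purely illustrative -- real features would be book/flow derived.)
ticks = np.sign(rng.standard_normal(1000))
ticks[ticks == 0] = 1.0
X = np.stack([ticks[i:i + 3] for i in range(len(ticks) - 3)])
y = (ticks[3:] > 0).astype(float)

# Logistic regression fit by plain gradient descent: a linear model whose
# live inference is a single dot product plus a comparison.
w = np.zeros(3)
b = 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(next tick up)
    g = p - y                                 # gradient of the log loss
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

def predict_up(last3: np.ndarray) -> bool:
    """Hot path: one dot product, one add, one compare."""
    return bool(last3 @ w + b > 0.0)
```

On random ticks like these there is of course no edge to find; the point is only the shape of the hot path.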
Even in software, I've been able to hit O(15 µs) using optimized FANN libraries. But the nets are small rather than deep, and pretty ruthlessly pruned and compressed. Another trick that helps is pre-differentiating across all the variables you don't expect to change on a latency-critical event. E.g. if you're running a liquidity-take strategy, you can pre-differentiate assuming the opposite touch size and deep book stay constant, because you're only going to act following an aggressor trade at the touch.
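One way to read the pre-differentiation trick is as a first-order Taylor expansion computed off the hot path: linearize the net around the current book snapshot with respect to the one input that will change, so the event handler does a single multiply-add instead of a full forward pass. A minimal sketch under that assumption (the tiny net, feature layout, and sizes here are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny pruned net: 8 inputs -> 4 hidden (tanh) -> 1 output.
# Input 0 plays the role of the latency-critical feature (aggressor trade);
# inputs 1..7 (opposite touch size, deep book) are assumed constant.
W1 = rng.standard_normal((4, 8)) * 0.5
b1 = rng.standard_normal(4)
w2 = rng.standard_normal(4)
b2 = 0.1

def forward(x: np.ndarray) -> float:
    """Reference full forward pass."""
    h = np.tanh(W1 @ x + b1)
    return float(w2 @ h + b2)

# Off the hot path: differentiate the net at the current snapshot x0
# with respect to the hot input only.
x0 = rng.standard_normal(8)
y0 = forward(x0)
h0 = np.tanh(W1 @ x0 + b1)
# dy/dx[0] = sum_j w2[j] * (1 - h0[j]^2) * W1[j, 0]
grad0 = float(w2 @ ((1.0 - h0 ** 2) * W1[:, 0]))

def infer_hot(x_hot: float) -> float:
    """Hot path: one multiply-add instead of a full forward pass."""
    return y0 + grad0 * (x_hot - x0[0])

# The linearization tracks the full net for small moves in the hot input.
x1 = x0.copy()
x1[0] += 0.05
approx, exact = infer_hot(x1[0]), forward(x1)
```

The trade-off is the usual one for a first-order expansion: it's exact at the expansion point and drifts quadratically as the hot input moves, so the cached gradient has to be refreshed whenever the "constant" book state actually changes.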
Putting aside whether it's technically possible, do you know if any groups are actually having good success with this approach (NNs on microstructure) in live trading?