Fantastic project and great writeup! The screen tradeoff with needing triple buffering but getting integer scaling was interesting to hear about - any feeling as to whether it adds human-noticeable latency vs. original hardware?
In the absolute worst case (drawing an object at the very top of the screen, with the LCD output for the next frame starting right before the current one finished), buffering adds a 2-frame delay (about 33 milliseconds). That is probably noticeable for some people, but this worst case is uncommon.
In the average case I would expect a delay of ~0.5 to 1 frame, so roughly 8 to 16 milliseconds. Probably not really noticeable.
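For anyone who wants to check the numbers, here is a minimal sketch of the frame-to-millisecond conversion. It assumes a refresh rate of about 60 Hz, which is what the 33 ms figure implies; the exact rate of the panel is not stated here, and `frames_to_ms` is just an illustrative helper name.

```python
# Assumption: ~60 Hz refresh rate, inferred from "2 frames = 33 ms".
REFRESH_HZ = 60.0


def frames_to_ms(frames: float, refresh_hz: float = REFRESH_HZ) -> float:
    """Convert a delay measured in frames to milliseconds."""
    return frames * 1000.0 / refresh_hz


if __name__ == "__main__":
    cases = [
        ("worst case", 2.0),
        ("average, low end", 0.5),
        ("average, high end", 1.0),
    ]
    for label, frames in cases:
        print(f"{label}: {frames} frames = {frames_to_ms(frames):.1f} ms")
```

Passing a different `refresh_hz` shows how the figures shift if the display runs slightly faster or slower than 60 Hz.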