Each hole in the shadow mask acts as a pinhole camera, projecting an inverted image (in electrons) of the three guns onto the screen. All three beams get bent by nearly the same amount, but yes, there is some distortion, which is traditionally corrected by a set of convergence coils and a corresponding circuit with knobs for static and dynamic convergence [0]. A pain to adjust, BTW.
[0] https://antiqueradio.org/art/RCACTC-11ConvergBoardNewRC.jpg
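
To make the pinhole analogy concrete, here is a rough sketch of the geometry using similar triangles. It assumes straight-line beams (no deflection), a 1D row of guns, and made-up dimensions; the names and numbers are hypothetical and not taken from any real tube.

```python
# Hypothetical dimensions in millimetres, for illustration only.
GUN_TO_MASK = 300.0     # distance from the gun plane to the shadow mask
MASK_TO_SCREEN = 12.0   # distance from the shadow mask to the phosphor screen

# Lateral gun positions in the gun plane (assumed 1D layout).
GUNS = {"red": -5.0, "green": 0.0, "blue": 5.0}


def landing_position(gun_x, hole_x,
                     gun_to_mask=GUN_TO_MASK,
                     mask_to_screen=MASK_TO_SCREEN):
    """Where a straight beam from gun_x through hole_x hits the screen.

    Similar triangles: the lateral offset picked up between gun and mask
    keeps growing at the same slope between mask and screen.
    """
    slope = (hole_x - gun_x) / gun_to_mask
    return hole_x + slope * mask_to_screen


def triad(hole_x):
    """Landing spots of all three beams behind one mask hole."""
    return {color: landing_position(x, hole_x) for color, x in GUNS.items()}


if __name__ == "__main__":
    for hole in (0.0, 50.0):
        print(hole, triad(hole))
```

With these numbers the gun on the left lands to the right of the hole and vice versa, which is the inversion. The spots sit much closer together than the guns because the mask-to-screen gap is small compared to the gun-to-mask distance. This toy model does not produce the distortion that convergence adjustment corrects; that comes from effects it leaves out.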