This reminded me of a cool paper I came across recently, Spatial Transformer Networks [1], a good example of how knowledge of geometry helps frame the problem more effectively, allowing the network to learn how to e.g rotate objects into a canonical orientation before identifying them.
[1] https://arxiv.org/abs/1506.02025