Ask HN: Does it matter which deep learning framework I use?
84 points by canterburry on Mar 28, 2017 | 21 comments
On one hand I fully understand there are differences in language and specific features available depending on framework out there.

On the other hand, if a framework "correctly" implements the underlying statistical theory/principles of deep learning, shouldn't I get the same results regardless of which framework I use?

If not, how would I know which framework produces "more correct" interpretations of the underlying data?

It all depends on what your goals are. If you just want to train a neural network on a dataset you have and you aren't all that interested in going into the details of how the NN works or is trained, Keras is fine. It has a nice high-level interface and the backend is either in Theano or Tensorflow (your choice).

If your problem is more complicated and you want to use some unique architecture, you'll have to use one of the more low-level frameworks. I would recommend TensorFlow just on the basis of its popularity (you're more likely to find people who have run into the same problems as you). But Theano, Torch, and MXNet are probably pretty much equivalent in terms of speed and ease of use. I hear Caffe has a steeper learning curve.

If you're really doing something fancy, then you'll have to look into more detail. Torch and MXNet have the advantage that you can adaptively change your computation graph based on the data, but you'd probably have to be pretty far into deep learning research before something like that is useful. Tensorflow Fold does something similar, but I'm not sure how well integrated it is with the rest of Tensorflow (I've never used it).

You might also take a look at this:


It's a little out of date now, but it'll get you started.

Some of these frameworks are more general than others (e.g., Tensorflow is more general than Keras), so you can specify architectures in some that you can't in others. But as long as you can specify the architecture in a particular framework, you'll be able to get a working model. Your choice of framework just comes down to whatever one is easiest to work with for the problem at hand.

For my particular case (OCR on a limited set of strings) I found using Caffe much easier than TensorFlow (even when using TFLearn).

You don't necessarily have to be doing much "fancy" to run into problems with Tensorflow/Keras/Theano. Almost anything with NLP is pretty hard to implement correctly with Keras.

As with most programming questions, the answer is a mix of yes and no, and it depends on the level of abstraction provided by the framework.

I started off using Caffe/Torch and currently use mostly Keras for my deep learning experiments. With a lower-level framework, I could actually tinker with the different moving components to understand why they are used as they are, while with a higher-level abstraction, I can concentrate on the problem at hand, knowing that the basic abstractions (or building blocks) are well developed already and have more or less been battle-tested by people far smarter than me.

And of course, when it comes to pure speed numbers and architecture for scaling/deployment, these frameworks do vary among themselves: https://github.com/zer0n/deepframeworks/blob/master/README.m...

> On the other hand, if a framework "correctly" implements the underlying statistical theory/principles of deep learning, shouldn't I get the same results regardless of which framework I use?

That is about right, provided that 1) you use the same initial values and hyperparameters, and 2) you can implement the same network in all frameworks. Issue 2) is complicated. Some networks that are easy to implement in one framework can be hard or even impossible in another. Here "hard" can mean two opposite things: lack of flexibility (which prevents you from constructing a certain topology) or excessive flexibility in the framework (which requires many steps and much care to construct a topology). Which framework to use depends on your goal and skill level. For starters, Keras is usually easier.
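A toy sketch of point 1), with made-up sizes and plain numpy standing in for any two frameworks: if the weights, inputs, and layer math are identical, two independent implementations agree to numerical precision.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: "same initial values"

# A toy dense layer y = relu(W @ x + b), with made-up dimensions.
W = rng.standard_normal((4, 3))
b = rng.standard_normal(4)
x = rng.standard_normal(3)

# "Framework A": explicit loops over the same math
y_loop = np.empty(4)
for i in range(4):
    s = b[i]
    for j in range(3):
        s += W[i, j] * x[j]
    y_loop[i] = max(s, 0.0)

# "Framework B": vectorized matrix product
y_vec = np.maximum(W @ x + b, 0.0)

# Same math, same inputs -> numerically the same output
assert np.allclose(y_loop, y_vec)
```

In practice, differences creep in through initialization schemes, random seeds, and floating-point reduction order, which is why bit-identical results across real frameworks are rare even when the theory matches.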

You might want to look for the video of the Feb 22 lecture comparing Caffe, Theano, Torch, and TF: http://cs231n.stanford.edu/syllabus.html. It was taken down from YouTube because it had no closed captions, but I'm sure it's archived in multiple places.

What? YouTube requires you to add captions manually, or the video gets taken down? That's brutal! I thought YT had automatic captioning. Wow, lame.

So they took the easy way out and removed the videos.

Thanks for this. While I work in Java, I know Python seems to rule the deep learning space.

Hi: VERY biased author of http://deeplearning4j.org here.

One path we recommend for Java developers who are new to deep learning is taking the fast.ai class: http://course.fast.ai

From there, map what you learn to our Keras model import: https://deeplearning4j.org/model-import-keras

That will more or less get you up and running.

We also have my O'Reilly book out in early release: http://shop.oreilly.com/product/0636920035343.do

And this will probably answer your next question: https://culurciello.github.io/tech/2016/06/04/nets.html

I was using Keras pretty heavily, but I have switched over to fully using TensorFlow. Once you build a decent library of boilerplate, TensorFlow becomes very usable. Packages like prettytensor may even surpass Keras in terms of usability. Also, I found the Keras documentation to be quite lacking, and I ended up reading the source code much more often than I'd like.

I ended up bumping into the edges of the Keras API too much, and coming up with hacky solutions to do things that are actually quite simple if you just do them in TensorFlow yourself.

Theano and Torch are also great options, but I think I will be sticking with TensorFlow, simply because I trust that Google will be putting solid effort behind it for years to come.

It does not matter, and there's not a lot to get wrong in deep learning.

The math involved is pretty simple, in terms of the calculations that have to be performed.

Where frameworks differ is in things like speed and ease of use. Use the one that is the easiest for you. Tensorflow is certainly going to be the most popular for the foreseeable future.

I am learning TensorFlow now and have no knowledge of the other frameworks.

What surprises me most is that TF, at least, is an almost declarative framework.

I needed to add some random noise to a point in a multidimensional space to generate n other points close to the first one.

In plain Python I would loop n times, each time adding some noise to the initial point and pushing the result into a list or some other structure (or use a list comprehension).

In TF I "stack" the original point n times to obtain n copies of that same point, then generate n noise vectors, and finally add the two.

The second solution is more elegant in my opinion, but it requires an important mental shift.
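A rough numpy sketch of the two styles (the point, n, and the noise scale are made up; in TF the second version would use ops like tf.tile and tf.random_normal instead):

```python
import numpy as np

rng = np.random.default_rng(42)
point = np.array([1.0, 2.0, 3.0])  # the original point (made-up values)
n = 5

# Loop style: add noise to the point n times, collect the results
noisy_loop = np.array([point + rng.normal(scale=0.1, size=point.shape)
                       for _ in range(n)])

# "Declarative" style, as described above: stack the point n times,
# generate an (n, d) block of noise, and add the two in one shot
stacked = np.tile(point, (n, 1))            # shape (n, 3)
noise = rng.normal(scale=0.1, size=(n, 3))  # shape (n, 3)
noisy_vec = stacked + noise                 # shape (n, 3)
```

Both produce an (n, 3) array of points near the original; the second expresses the whole computation as array operations, which is the shape TF wants.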

If the other frameworks are at all similar to TF, your biggest hurdle will be this kind of mental shift, so just pick one.

In many frameworks the low level mathematics are delegated to the installed implementation of BLAS[1] anyway, so I'd expect most of the really popular frameworks to get the same answers from that perspective. Other than that, my feeling is that if you stick to the well known / popular frameworks, you should be fine. If any one of them had a glaring deficiency, I'm pretty sure it would have been noted and widely disseminated by now.

[1]: https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprogra...

There are a few things you need to consider.

First is language: you need to choose a familiar one.

Second is feature set: frameworks don't implement the same set of operators, but if you only want to use the common ones, most frameworks will have them.

Third is their ability to train in parallel. For example, does a framework support multiple machines, or just a single machine with multiple GPUs? Performance is also a factor. Do they support SIMD/GPU? Do they generate intermediate code and compile it to C++/CUDA, or do they just call into GPU libraries? Do you want to support mobile devices?

The fourth difference is the level of abstraction. If a framework is very low level, users need to understand many fundamentals of deep learning; on the other hand, if you want to extend the framework to add new operators, a low-level framework is easier to hack.

A high-level framework lets you write less code, but it hides details and makes it harder to hack.

The last thing to consider is the difference between dynamic and static frameworks. DyNet, Chainer, and TensorFlow with something called "Fold" are dynamic frameworks. I was told they are more flexible, but I don't understand the details.

Depends on your goal. Ultimately, the three axes are flexibility, speed, and speed of development, and all frameworks make tradeoffs between them. Researchers use slower (in both senses) frameworks to implement weird new ideas that require the flexibility, while engineers typically use faster (in both senses) frameworks that give them a performant and reliable model for production deployment.

A year ago it looked like Tensorflow might dominate, but most papers I read still publish their code in Caffe, so we've done a lot more with Caffe than Tensorflow.

Our own work calls cuDNN/cuBLAS directly because we're C++ programmers and it's just more convenient for our use case.

It's like any framework. You probably want to choose based on popularity (which equates to Stack Overflow answers explaining common pitfalls) and a programming language you already know.

Try using keras.io; that way you get an abstraction on top of TensorFlow, Theano, etc.
