I agree with the posts above. Diving straight into 'practice' in statistics (and other fields, cryptography being the most notorious) leaves you open to a great many pitfalls. Best case, you will be inefficient with your approaches.
Your examples may work, and a couple of test sets may give you high confidence, and then you try it in the wild and everything falls apart.
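To make that concrete, here's a minimal sketch (all numbers hypothetical) of why a couple of small test sets can give misleading confidence: a classifier whose true accuracy is 0.7 is scored on many random 20-example test sets, and the measured accuracy swings wildly, so you can easily draw a test set that reports 0.85+ by pure luck.

```python
import random

random.seed(0)
TRUE_ACCURACY = 0.7   # what the model actually achieves on the real distribution
TEST_SET_SIZE = 20    # a small held-out set

def measured_accuracy():
    # Each prediction is independently correct with probability TRUE_ACCURACY.
    correct = sum(random.random() < TRUE_ACCURACY for _ in range(TEST_SET_SIZE))
    return correct / TEST_SET_SIZE

# Score the same model on 1000 different small test sets.
scores = [measured_accuracy() for _ in range(1000)]
print(f"true accuracy: {TRUE_ACCURACY}")
print(f"min/max measured accuracy: {min(scores):.2f} / {max(scores):.2f}")
print(f"fraction of test sets reporting >= 0.85: "
      f"{sum(s >= 0.85 for s in scores) / len(scores):.3f}")
```

The spread between min and max is the whole point: a single small test set tells you very little about how the model behaves in the wild.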
At the same time, a lot of machine learning is data cleaning, bootstrapping, picking the right algorithm, keeping your iteration cycle as short as possible, etc., none of which you gain until you actually mess around and get your hands dirty. Plus there are little implementation tidbits specific to each project.
That might work for some simpler machine learning algorithms, but in deep learning I think you'll have an even harder time figuring out intuitively what's going on than you would by learning the math.
I would also be careful about claiming that a theoretical approach will always give you a better understanding.