Our protein design work comes at this from a different angle. But we've had some interesting thoughts about using technologies like yours (well, ones we had only dreamed up) to quickly close much of the gap between structural prediction and empirical functional design.
Great to start seeing 21st-century data standards like this for bio data! Yay for everyone!
Training times obviously vary with the network architecture, software, and hardware. I can safely say you can process 7200+ protein sequences with an average length of 120 amino acids in 2 h on 2 x NVIDIA Titan Xp.
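For a rough sense of scale, here is a back-of-the-envelope sketch of what those figures imply as throughput. It assumes exactly 7200 sequences (the post says "7200+", so treat the results as lower bounds) and, for the per-GPU number, that the work splits evenly across the two cards, which is an assumption rather than something stated above.

```python
# Rough throughput implied by the quoted figures.
# Assumption: exactly 7200 sequences ("7200+" in the post), so these are lower bounds.
n_sequences = 7200
avg_length = 120      # amino acids per sequence, on average
hours = 2
n_gpus = 2

total_residues = n_sequences * avg_length   # total amino acids processed
seconds = hours * 3600                      # wall-clock time in seconds

seqs_per_second = n_sequences / seconds
residues_per_second = total_residues / seconds

# Assumption: the load is split evenly across both GPUs.
residues_per_second_per_gpu = residues_per_second / n_gpus

print(f"total residues:         {total_residues}")
print(f"sequences per second:   {seqs_per_second:.2f}")
print(f"residues per second:    {residues_per_second:.1f}")
print(f"residues per s per GPU: {residues_per_second_per_gpu:.1f}")
```

That comes to about one sequence, or roughly 120 residues, per second of wall-clock time. It says nothing about how many epochs or passes over the data those 2 h cover.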
On that note, is there any reason why the propensity classes alpha/beta/coil are still so widely used? Especially coil/turn/"other". It seems to me that these are ancient artifacts of structural biology that could definitely restrict human understanding of protein dynamics. Perhaps there is nuance in the chemical shift data that may help the design of better structural classes using ML.
That and the biophysical properties conferred by partial disorder make these proteins a motherfucker to work with outside of some archetypal domains. I like to explain it like this. Imagine you have a piece of string three feet long. Along the length of that string you have ~1 inch segments consisting of velcro (both kinds), zippers, magnets, balloons, strawberry jello, and marshmallows, all randomly distributed along the length. Now try to fold all of that up so the jello, velcro, and balloons are on the inside. That's a simple model of a protein. Now make it start opening and closing. Now put 5 of them next to each other.
Coming back to your comments about the canonical secondary structures: I couldn't agree more. The problem is quite simple. How are we going to convince the >90% of the structural biochemistry community to simply accept the fact that proteins are bloody dynamic, and that X-ray / eye-candy structures may have quite little to do with the "real" picture at room temperature?
We have a network in "stock" (obviously for another paper) that aims at propensity prediction, still in the trivial alpha/coil/beta phase space.
I am wondering how this compares to the TALOS-N server from Ad Bax (NIH), with 9000+ proteins in its DB? It, too, uses machine learning to 'fit' a predictor of secondary structure and of backbone and side-chain torsion (dihedral) angles from chemical shifts.