Playing around with neural nets with only a cursory understanding of them confirmed a lot of my suspicions: the libraries have gotten good enough that you can get something like 70-80% of the way there without understanding the underlying mechanics and math.
If you have a basic understanding of concepts like "activation functions," you may have gotten hung up on the precise reasoning for, or benefits of, choosing one function over another. After playing around with some of the examples they've provided, I tried switching the activation function between sigmoid, tanh, and relu and got virtually identical results (with relu being the best). The same pattern of only marginal differences continued when I added more dense layers to the network: mostly similar results, just slower to train and less generalized. I also tried changing filter sizes on the convolutional layers for the couple of image problems I tried, and that was very forgiving as well. It really felt like there is a fairly standardized solution to most types of problems, and the iteration that goes on to improve results is more arbitrary tweaking than anything based on underlying theory.
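To give a sense of how small the change is when you swap activation functions, here is a minimal sketch in plain Python. It is not the library examples mentioned above; `ACTIVATIONS`, `dense`, and `make_layer` are names I made up for illustration, and it only runs a forward pass through one dense layer rather than training anything.

```python
import math
import random

# The three activations mentioned above, keyed by name so the layer
# can be switched with a single string argument.
ACTIVATIONS = {
    # Fine for small inputs like the ones used here; a production version
    # would guard against overflow in exp() for very negative x.
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
}


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    act = ACTIVATIONS[activation]
    return [
        act(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def make_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights in [-1, 1], zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)  # fixed seed so every activation sees the same weights
hidden_w, hidden_b = make_layer(2, 4, rng)
x = [0.5, -1.0]

# Same weights, same input; only the activation name changes.
for name in ACTIVATIONS:
    hidden = dense(x, hidden_w, hidden_b, name)
    print(name, [round(h, 3) for h in hidden])
```

In the high-level libraries the equivalent change is typically just as small, usually a single argument on the layer, which is part of why this kind of trial-and-error tweaking is so cheap to do.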