Unless you use a purpose-made, low-CPU-usage cipher like Salsa20, hardware crypto is just so much faster than software. It is like comparing software H.264 implementations in 2019 and trying to make big news out of it.
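To illustrate what "low CPU usage" means here: the Salsa20 core is built only from 32-bit addition, rotation and XOR, with no table lookups. Below is a minimal Python sketch of the quarter-round as described in the Salsa20 specification. The names `rotl32` and `quarterround` are my own choice for illustration, not from any library, and this is only one building block, not a usable cipher.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rotl32(x, n):
    """Rotate the 32-bit word x left by n bits (0 < n < 32)."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarterround(y0, y1, y2, y3):
    """Salsa20 quarter-round: four add-rotate-xor steps on four words."""
    z1 = y1 ^ rotl32((y0 + y3) & MASK32, 7)
    z2 = y2 ^ rotl32((z1 + y0) & MASK32, 9)
    z3 = y3 ^ rotl32((z2 + z1) & MASK32, 13)
    z0 = y0 ^ rotl32((z3 + z2) & MASK32, 18)
    return z0, z1, z2, z3


# Test vector from the Salsa20 specification:
# quarterround(1, 0, 0, 0) == (0x08008145, 0x00000080, 0x00010200, 0x20500000)
```

Every step maps onto a handful of plain ALU instructions, which is why this design does reasonably well on CPUs without dedicated crypto hardware.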
Salsa20 and its derivatives look to me like a very promising cipher suite, but for as long as there is no hardware support for them, they will not find a place even in resource-constrained niches, since even $5 smartphone SoCs nowadays can run hardware AES faster than any software crypto.
Another point worth noting is that Salsa is still a relatively new and untried cipher, without much real-world track record other than its use in Chrome. Poly1305 is an even less tried construct than GMAC, and a hardware implementation of it is harder to imagine (it is fair to say that most MACs are hard nuts to crack for hardware implementation because they use tons of LUTs).
Second, Salsa is the work of just one man plus a few collaborators. Knowing the crypto community, this in itself may be a reason for pushback from it.
P.S. A question for people in crypto: why not use some derivative of Salsa20 as the MAC instead of Poly1305 in the Salsa-Poly combo?
That one man is Daniel J. Bernstein, to be precise, who is also the author of djbdns, qmail, NaCl, Curve25519, ... some of which he authored solo. So I'd take the quoted skepticism with a huge grain of salt.