For continuous use, you need at least 95% accuracy. See http://anewdomain.net/2014/01/28/lamont-wood-windows-speech-...
State-of-the-art systems are far better than this. Microsoft recently published a paper reporting a 5.9% word error rate for conversational speech. Accuracy on speech directed at computers/assistants is already in the high 90s, though I don't have a figure off the top of my head.
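For anyone comparing "percent accuracy" with "word error rate": WER is usually computed as the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the recognizer output, divided by the number of reference words. Here is a minimal sketch of that calculation. The function name and the sample sentences are made up for illustration, and this is not the scoring code from the Microsoft paper. If you treat accuracy as 1 - WER, which is a simplification, a 5.9% WER works out to roughly 94.1% of words correct.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; splits on whitespace only and
    does no text normalization (case, punctuation), which real scoring
    tools typically do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing = i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one wrong word out of five reference words.
print(word_error_rate("turn on the kitchen lights",
                      "turn on the kitchen light"))
```

Note that WER can exceed 100% when the output contains many inserted words, so "accuracy = 1 - WER" is only a rough way to line the two numbers up.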
Occasionally, Android gets the words right (as demonstrated by the onscreen text) and then flubs passing the correct intent because of "loss of connection", which is just about the most frustrating ML fail.
No doubt Android's voice recognition is spectacular. I can prompt it in three different, relatively orthogonal-sounding languages (English, French, Japanese), and it can figure out which language I'm using and usually get the transcription correct. Notably, I can't activate the Italian/Japanese pair and get useful results, which makes sense if you know both languages.
Google Voice, however, is horrible at transcribing voicemail.