I'm always amazed that everyone seems okay with this, and will even say so when pressed on it. But those same people will get a look in their eyes as if they're going to shit their pants, if someone else picks up their unlocked iPhone and starts scrolling.
I'm amazed that people are afraid of a home assistant in their living room, yet carry a cloud-connected microphone and camera around with them all the time (with the added feature of sending GPS tracking data back to the manufacturer, the carrier, and any app you granted location permission to, like a weather app).
If you trust that your phone isn't listening when it's not supposed to, why don't you trust the home assistant (even if it's made by the same manufacturer)?
It's because they have sexts on their phone. A recorder is not as big of a deal, especially if I'm fully aware that the entity listening to the recording is listening to billions of people and couldn't care less about my one voice.
With the microphone switch turned off, they don't send audio to the cloud, but they're also no longer smart devices. The only words they can detect using local processing are their limited set of wake words. And if an offline smart home system is your goal, using Alexa or Google Home is literally the opposite of that.
Even if you create tasks that can be triggered by a button press or something, they still rely on the cloud.
Voice control locally is really hard though. Or at least it used to be. I think OpenAI Whisper is available for local/offline usage if you build your own wrapper around it? I tried adding (local, open-source) voice to my home system about 8 years ago, and again 3 or 4 years ago, and it was very rough. It might finally be feasible with the current state of local AI.
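A rough sketch of what such a wrapper could look like, assuming the `openai-whisper` Python package (plus ffmpeg) is installed. The wake word "computer", the 0.7 similarity threshold, and the helper names `transcribe_locally` and `is_wake_word` are made up for illustration; the fuzzy matching is plain `difflib` from the standard library, not anything Whisper provides.

```python
from difflib import SequenceMatcher


def transcribe_locally(path: str, model_name: str = "base") -> str:
    """Transcribe an audio file on the local machine with Whisper.

    Assumes `pip install openai-whisper` and ffmpeg on the PATH.
    The model weights are downloaded on first use, then cached.
    """
    import whisper  # imported lazily so the matcher below works without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"]


def is_wake_word(heard: str, wake_word: str = "computer",
                 threshold: float = 0.7) -> bool:
    """Return True if any word in the transcript is close to the wake word.

    "Close" means a difflib similarity ratio at or above `threshold`
    (a hypothetical value; it would need tuning on real recordings).
    """
    for word in heard.lower().split():
        word = word.strip(".,!?")
        if SequenceMatcher(None, word, wake_word).ratio() >= threshold:
            return True
    return False
```

Usage would be something like `is_wake_word(transcribe_locally("clip.wav"))`, where `clip.wav` is a short recording you captured yourself. Fuzzy matching on the transcript is a trade-off: a lower threshold tolerates more mis-hearings but also triggers on more unrelated words.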
As an example, my home robot was named "Marvin" (from H2G2), and the local detection was so poor that this is my list of "matching" wake words.