Assuming this isn't parody, the OS doesn't have to do it automatically. Having an application grab a microphone stream and say to the OS "take this and cancel any audio out streams" might be pretty useful.
I agree with that, but the point I'm trying to make is that audio i/o handling is pretty complicated and application specific. The idea that "any app that wants microphone input wants this" is dubious. I'd say only a small number of the applications that care about mic input want background noise reduced, and it makes sense for this to be configured per input stream.
Really what would be nice is if every audio i/o backend supported multiplexed i/o streams and you could configure cancellation against a chosen set of output streams rather than all output (because multi-output-device audio gets tricky).
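To make the per-stream idea concrete, here is a rough sketch of what such a configuration could look like. Everything in it (`OutputStream`, `InputStreamConfig`, `reference_streams`) is made up for illustration and is not the API of any real audio backend.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OutputStream:
    """Hypothetical handle for one playback stream."""
    name: str
    device: str


@dataclass
class InputStreamConfig:
    """Hypothetical per-input-stream settings; cancellation is opt-in."""
    name: str
    echo_cancel: bool = False
    # Names of the output streams to cancel against, not "all output".
    reference_outputs: list = field(default_factory=list)


def reference_streams(cfg, active_outputs):
    """Return the active outputs that should feed the canceller for this input."""
    if not cfg.echo_cancel:
        return []
    wanted = set(cfg.reference_outputs)
    return [out for out in active_outputs if out.name in wanted]
```

With this shape, a voice chat app would list only its own call audio as a reference, and a DAW would leave `echo_cancel` off and get the raw input.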
The technique introduces latency and distortion because it's subtracting an estimate of sound that's traveling/reflecting in the listening environment, which is imperfect and involves the speed of sound.
That latency is within the tolerance that users are comfortable with for voice chat, and much less than video processing/transfer is introducing for video calls anyway, so it's a very obvious win there. Especially since those users are most interested in just picking out clear words using whatever random mic/speaker configuration happens to be most convenient.
But musicians, for instance, are much more interested in minimizing the delay between their voice or instrument being captured and returned through a monitor, and they generally choose a hardware arrangement that avoids the problem in the first place. And that's not really a niche use case.
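For anyone wondering what "subtracting an estimate" means in practice, below is a minimal time-domain NLMS (normalized least mean squares) sketch. It is an assumption-laden toy, not what any particular OS or conferencing stack ships: the tap count, step size, and simulated "room" are arbitrary, and real cancellers typically work on blocks in the frequency domain and also handle double-talk. The filter has to span the whole echo path, and at 48 kHz one metre of air is already about 140 samples.

```python
import random


def echo_of(ref, path):
    """Convolve ref with a short impulse response (the simulated room)."""
    return [sum(h * ref[n - k] for k, h in enumerate(path) if n - k >= 0)
            for n in range(len(ref))]


def nlms_echo_cancel(mic, ref, taps=32, mu=0.5, eps=1e-8):
    """Return mic minus an adaptively estimated echo of ref."""
    w = [0.0] * taps   # current estimate of the echo path
    x = [0.0] * taps   # recent reference samples, newest first
    out = []
    for d, r in zip(mic, ref):
        x = [r] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))   # estimated echo
        e = d - y                                  # what is left after subtraction
        norm = sum(xi * xi for xi in x) + eps
        g = mu * e / norm
        w = [wi + g * xi for wi, xi in zip(w, x)]  # nudge estimate toward the truth
        out.append(e)
    return out


def demo(n=4000, seed=0):
    """Return (echo power, residual power) over the last 500 samples."""
    rng = random.Random(seed)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    path = [0.0] * 5 + [0.6] + [0.0] * 6 + [0.3]   # two made-up reflections
    mic = echo_of(ref, path)                       # no near-end talker here
    out = nlms_echo_cancel(mic, ref)
    power = lambda s: sum(v * v for v in s[-500:])
    return power(mic), power(out)
```

In this idealized setup the residual drops to a small fraction of the echo once the filter converges. The artefacts people complain about show up when the estimate is wrong: the room changes, someone talks over the playback, or the reference does not match what actually came out of the speaker.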
Live video or audio chat is basically the only time you do want this. Granted, that’s a big chunk of microphone usage in practice, but any time you are doing higher fidelity audio recording and you have set up the inputs accordingly you absolutely do not want the artefacts introduced by this cancellation. DAWs, audio calibration, and even live audio when you’ve ensured the output cannot impact the recording all would want it switched off.
Default on vs default off is really just an implementation detail of the API though, as you say.
> Live video or audio chat is basically the only time you do want this.
If I'm recording a voice memo, or talking to an AI assistant, I would want this. Basically everything I can imagine doing with a PC microphone outside of (!) professional audio recording work.
That last case is important and we agree there needs to be a way to turn it off. I think defaults are really important though.
My colleague works in a very quiet house, and has no need for noise cancelling. Sometimes he has it turned on by accident, and the quality is much worse: his voice cuts out, and the volume of his voice goes up and down.
As you say, as long as either option is available, the only question is what the default should be.
I gave an example: when I'm wearing headphones, I don't want this enabled. If I'm recording anything, I probably don't want it on either. If I'm using a virtual output, I don't want AEC to treat that as a loudspeaker.
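Those three cases could be written down as a default policy along these lines. This is only a sketch of one possible rule set: `Route`, `default_echo_cancel`, and the idea that the backend knows the route and whether the app is recording are all assumptions, not something an existing API provides.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical classification of where the output ends up."""
    SPEAKERS = "speakers"
    HEADPHONES = "headphones"
    VIRTUAL = "virtual"


def default_echo_cancel(route, recording, app_override=None):
    """Pick a default for AEC; an explicit app setting always wins."""
    if app_override is not None:
        return app_override
    if recording:
        return False              # recording: keep the input untouched
    # Headphones and virtual outputs do not leak into the mic as a loudspeaker would.
    return route is Route.SPEAKERS
```

Whatever the rules end up being, the override is the important part: the default only has to be right most of the time if every app can still ask for the other behaviour.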