But, if it can really fill in missing or distorted chunks then for me that's a killer feature. I'm going to give it a try on my morning check-in call, which is coming up.
I'd love to see a product like this be able to adapt to the voices of individual speakers, and fill in gaps or distortion in their natural voices. I presume that would be more difficult, though, because you'd need a model of each speaker at the output device, so both parties would need to be running Krisp, and you'd somehow have to share the model between them, which might not be feasible if you already have network issues causing dropouts and distortion. Unless it was a side-channel thing, where for regular calls with the same people the voice model is updated after each call, ready for the next one.
Though I'm not sure I love the idea of a model of my voice being constructed and transmitted around. But I think it would be really cool :)
Since I moved to NYC, I think I haven't had one phone call without some obnoxiously loud siren/horn/baby/dog/jack hammer/piano/bros in the background. People must think I listen to the NYC soundtrack on repeat at full volume and make phone calls.
I envy you. :)
This call typically has decent audio quality, and as it's a stand-up / status call people tend to speak one at a time. No-one today had much in the way of background noise, so I'm not sure there was a lot of opportunity for the app to show off.
That said, I switched the Krisp speaker mode on and off repeatedly while each person was talking.
1. When just one person was talking Krisp didn't seem to ever make the sound worse, which I guess is a start.
2. When there was a conversation going back and forth, or when one person started talking right after someone else finished, the voices got mangled or muted and I couldn't understand them. I had to turn Krisp off.
3. When there was some background echo (like in a large room) or some minor distortion I thought it maybe sounded a bit clearer with Krisp running, but it didn't really make much difference. I could understand the speaker either way.
With the audio problems I mentioned at (2) and no real gain from using Krisp I doubt I would use it regularly, though if I run into a call with bad background noise I might try it again.
I also tried the Krisp microphone, and at one point I had to repeat myself, which doesn't usually happen. But I have no way of knowing whether that was due to me speaking unclearly, or audio issues at the listener's end, or something else. So I don't really have an opinion about the microphone, but as I am in a quiet place anyway I probably wouldn't use it.
It would be nice if there was a single-channel evaluation mode for the speaker. If I could hear the normal audio in my right ear and the Krisp-processed sound in my left, then I would have a better chance of evaluating the performance. I guess if you have a lot of continuous background noise that mode would be redundant, and the improvement should be obvious when switching back and forth.
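For what it's worth, the split-ear comparison could be sketched roughly like this. It is only an illustration of the idea, not anything Krisp ships: the function names `ab_split_stereo` and `to_pcm16` are made up, and it assumes you somehow already have the raw and processed audio as two equal-length lists of float samples in the range -1.0 to 1.0.

```python
import struct


def ab_split_stereo(raw, processed):
    """Build stereo frames: left ear = processed audio, right ear = raw audio.

    Both inputs are assumed to be mono float samples of equal length.
    Returns a list of (left, right) tuples.
    """
    if len(raw) != len(processed):
        raise ValueError("both signals must have the same number of samples")
    return [(p, r) for r, p in zip(raw, processed)]


def to_pcm16(frames):
    """Pack (left, right) float frames as interleaved 16-bit little-endian PCM."""
    out = bytearray()
    for left, right in frames:
        for sample in (left, right):
            clipped = max(-1.0, min(1.0, sample))  # avoid integer overflow
            out += struct.pack("<h", int(clipped * 32767))
    return bytes(out)


# Hypothetical two-sample signals, just to show the layout.
frames = ab_split_stereo(raw=[0.5, -0.5], processed=[0.1, 0.2])
pcm = to_pcm16(frames)
```

The resulting bytes could be written to a stereo WAV file or an audio device. In a real app the hard part would be keeping the two streams time-aligned, since the processed path presumably adds some latency.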
Re #2, one thing we noticed is that conferencing apps themselves will distort the voice when multiple voices are overlapping, especially when there is also noise. There is not much Krisp can do here since the stream it receives is already distorted.
Unfortunately for krisp speaker we don't control the audio stream. Imagine how many times the stream gets signal processed before krisp speaker gets it (noise cancellation in the headset, noise cancellation in RTC, codec, etc).
Re Krisp microphone, the DNN model used here is more effective since what Krisp receives is "less processed/distorted stream".
Please stay tuned, our release cycle is around 2 weeks. More quality and UX features are under way.
And nowhere to "leave my email".
This may have nothing to do with Krisp as an app. All phones/communications devices have the ability for either half or full duplex audio. Half duplex means only one caller can be heard at a time. If someone else were to speak, the audio would cut out as you experienced. Full duplex allows for conversations to happen all at once.
So this all could just be a symptom of the feature of the device or service you're using. I would try your test on different devices and/or services.
I remember as an intern spending hours in front of spectrograms manually deleting noise so that the researchers could get clean targets. Let me tell you, I started being able to identify a lot of phonemes just by visually examining waveforms.
Eventually, the company did pursue some noise cancellation, but only as one part of their offering. I don't think they ever could get the holy grail of separating non-stationary noise.
This was a sensitive use case for the team since I (CEO/krisp) had a baby 3 days ago and really needed Krisp to do my calls :)
Most businesses, especially outside of Silicon Valley, don't use Mac; Windows will likely be a larger market.
Windows support will come in Dec. We are working hard on it.
All audio is processed on device, but is the goal to use the public training/learning to tailor a more robust model which they can sell commercially/integrate into apps/phones/etc?
Wouldn’t be surprised if a software update makes it possible to use all of those mics for some very good ambient noise cancellation during FaceTime calls.
"It's only free during beta. Although we haven't decided on the exact monetization strategy yet."
For a while, it seemed like the autocorrection in iOS was deliberately trying to break up me and my then-girlfriend. It even seemed to have a penchant for doing an unfortunate autocorrect just a sliver of a second before my finger hit send on a txt. I finally turned the feature off.
I am /not/ saying Krisp.ai is a trojan or is nefarious, but if I was the NSA or FSB or... something like this would be very interesting to me. Both for infiltration (deliberately malicious) or for exploitation (compromised / exploited at run time).
Finding quiet space to make phone calls is a hassle.
Similar to that project to colorize old photos with a GAN, but in audio form. 
We are also building web and iOS apps to cancel the noise in the files.
I still plan to play with the app for voice chat, though.
You have to apply for it and it's apparently priced per minute. I applied and I'm waiting for pricing info.
To remove static background noise you would just need to get a silent sample and reverse out the polarity of the static noise from the audio signal. It can be done in real time or in post-production.
To cancel out random background sounds you would need an omnidirectional mic that picks up everything. Then you'd need a close-range shotgun-type mic, from which you subtract everything the noise mic picks up.
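Here is a toy sketch of both ideas, under fairly generous assumptions. The function names are invented for illustration. The polarity trick as written only works when the noise sample is an exact, time-aligned copy of the noise in the recording (a perfectly periodic hum, say). The two-mic version assumes the ambient mic hears only the noise and that the noise reaches the close mic as a scaled copy; a real omnidirectional mic would also pick up the voice, so some of the voice would be subtracted too.

```python
import math


def cancel_by_inversion(signal, noise_sample):
    """Add the polarity-inverted noise sample to the signal, sample by sample."""
    return [s + (-n) for s, n in zip(signal, noise_sample)]


def subtract_reference(close_mic, ambient_mic):
    """Subtract the ambient mic from the close mic after least-squares gain matching."""
    energy = sum(a * a for a in ambient_mic)
    if energy == 0.0:
        return list(close_mic)  # silent reference: nothing to subtract
    gain = sum(c * a for c, a in zip(close_mic, ambient_mic)) / energy
    return [c - gain * a for c, a in zip(close_mic, ambient_mic)]


# Synthetic one-second test signals (assumed values, not real recordings).
RATE = 8000
voice = [math.sin(2 * math.pi * 220 * n / RATE) for n in range(RATE)]
hum = [0.5 * math.sin(2 * math.pi * 60 * n / RATE) for n in range(RATE)]

# Case 1: stationary hum, with a perfectly aligned "silent sample" of it.
recorded = [v + h for v, h in zip(voice, hum)]
cleaned_1 = cancel_by_inversion(recorded, hum)

# Case 2: close mic hears the voice plus a quieter copy of the ambient noise.
close = [v + 0.3 * h for v, h in zip(voice, hum)]
cleaned_2 = subtract_reference(close, hum)

error_1 = max(abs(c - v) for c, v in zip(cleaned_1, voice))
error_2 = max(abs(c - v) for c, v in zip(cleaned_2, voice))
```

With these idealized inputs both residual errors come out close to zero. Shift the noise sample by even a few samples, or use noise that changes over time, and simple subtraction stops working, which is presumably why non-stationary noise is considered the hard case.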
<meta name="description" content="Take calls from wherever you want without being embarassed
for a background noise. Get krisp for Mac and use with any conferecing app!">
In contrast, Krisp eliminates the noise going from your environment to the call participants and vice versa.
I'll give it a spin and see if it helps.
Edit: I know it's a typo, but I figured it's a door to a somewhat interesting line of linguistic thought.
You could have "seemful", which might be somewhat synonymous with "inauthentic", and "seemless", which might be in line with "genuine"?