
NoiseTorch: Real-time microphone noise suppression on Linux written in Go - ClawsOnPaws
https://github.com/lawl/NoiseTorch
======
tazjin
I've recently built the inverse of this using NSFV
([https://github.com/werman/noise-suppression-for-
voice](https://github.com/werman/noise-suppression-for-voice)), i.e.
suppressing noise in _incoming_ audio.

A lot of people, despite being forced to work from home, simply don't seem
to care about the way their audio sounds. Many don't even try to tackle the
problem after it's been pointed out to them that they're being a nuisance in
online meetings.

I gave up on trying to help people fix their setups, or convincing them that
it matters, and switched to doing this on the receiver end. It's been a
massive quality-of-life improvement.

If you're interested in the setup, you basically just need a small script that
loads the pulseaudio plugin and wires up the sources/sinks correctly.

My setup script is here: [https://cs.tvl.fyi/depot@canon/-/blob/tools/nsfv-
setup/defau...](https://cs.tvl.fyi/depot@canon/-/blob/tools/nsfv-
setup/default.nix)

And some more context:
[https://cl.tvl.fyi/c/depot/+/578](https://cl.tvl.fyi/c/depot/+/578)
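
For anyone who wants the gist without reading the script: the wiring boils down to a few pactl calls. A minimal sketch (the plugin path, plugin label, sink names, hardware sink name, and control value are all assumptions; check your NSFV build and the linked script for the real values):

```shell
# Find your real output sink first:
pactl list short sinks

# Create a virtual sink that runs the RNNoise LADSPA plugin from NSFV and
# forwards the filtered audio to the hardware sink:
pactl load-module module-ladspa-sink \
    sink_name=denoised_out \
    sink_master=alsa_output.pci-0000_00_1f.3.analog-stereo \
    plugin=/usr/lib/ladspa/librnnoise_ladspa.so \
    label=noise_suppressor_stereo \
    control=50

# Point the conferencing app (or the default) at the filtered sink:
pactl set-default-sink denoised_out
```

Anything played to the filtered sink gets denoised before it reaches the speakers; `pactl unload-module <id>` undoes the wiring.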

~~~
basilgohar
I think this is an out-of-sight, out-of-mind kind of issue. They simply don't
understand how their noise, which they do not perceive, can be so detrimental
to others. Moreover, a lot of people simply can't grasp how much of a
difference good hardware, or even just a different setup (moving away from
noise sources like fans, open windows, running appliances, etc.), can make to
the quality of their sound. Lastly, a lot of people either cannot or think
they cannot do anything about it, so they dismiss others' concerns because
"everyone else has problems too", equating their noise with everyone else's.

~~~
ponker
Why don’t the services like Zoom and Teams fix this on the server side?

~~~
ralphm
If services are indeed using, or moving to, end-to-end encrypted media,
there's nothing the server can do here.

~~~
bigiain
Doesn’t mean they can’t do anything about the noise, just that they can’t do
it on the server.

The RNNoise link at the bottom of that post runs the noise suppression in
real time in JavaScript. Zoom et al. could do this client-side while still
doing proper E2E encryption. (Although Zoom already uses more CPU than I think
it needs to...)

------
drblah
As far as I can see this uses RNNoise. If you haven't checked it out yet you
should, because it is simply amazing. It is a super effective noise gate /
noise removal tool that does not require any configuration whatsoever.

My study mates and I have been using it over the last four months while
working from home. It removes the noise of keyboards, seagulls and vacuum
cleaners.

It is essentially the same as Nvidia RTX voice except it is much lighter on
the system and does not require an Nvidia GPU. In our testing RNNoise performs
similarly.

This project looks super cool. It seems to make RNNoise much more accessible.
Normally you would have to manually set up the pulseaudio plumbing for this to
work.
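
For reference, that manual plumbing looks roughly like this. A sketch adapted from the recipe in werman's repo (the microphone name, plugin path, label, and control value are assumptions; substitute your own from `pactl list short sources`):

```shell
# 1. A null sink whose monitor will become the "clean" virtual microphone:
pactl load-module module-null-sink sink_name=mic_denoised_out

# 2. A LADSPA sink that runs RNNoise and writes into the null sink:
pactl load-module module-ladspa-sink \
    sink_name=mic_raw_in sink_master=mic_denoised_out \
    plugin=/usr/lib/ladspa/librnnoise_ladspa.so \
    label=noise_suppressor_mono control=50

# 3. Loop the physical microphone into the LADSPA sink:
pactl load-module module-loopback \
    source=alsa_input.pci-0000_00_1f.3.analog-stereo \
    sink=mic_raw_in channels=1

# 4. Applications then record from the monitor of the null sink:
pactl set-default-source mic_denoised_out.monitor
```

The monitor of the null sink shows up as a regular input device in any application, which is essentially what NoiseTorch automates.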

~~~
swyx
do you need Linux to run your version? would love to get this running on my
Mac.

~~~
drblah
I have mainly used the built-in RNNoise support in Mumble. But you can use
[https://github.com/werman/noise-suppression-for-voice/](https://github.com/werman/noise-suppression-for-voice/)
and build the VST plugin (this is also what NoiseTorch uses, I think). Then
use any application that can load VST plugins to pipe your mic through. I've
had reasonably good luck with it on Windows with Equalizer APO.

~~~
tyfon
Someone also recently made a plugin [1] for OBS using this.

[1] [https://gitlab.com/gravydanger/obs-
rnnoise/](https://gitlab.com/gravydanger/obs-rnnoise/)

------
Abishek_Muthian
Nicely done!

I went through some of the core libraries used in the project: there's a pure
Go pulseaudio implementation[1] which deserves a few more stars, and the GUI
framework nucular[2] seems to support even Metal rendering on macOS. I like
how native GUI frameworks for Go are becoming a viable alternative to Qt.

Off-topic, since this thread might attract audio programmers:

I was looking at implementing ambient noise cancellation and audio
amplification for TWS earphones (BT 5.0) that lack those features, on
Android[3]. Would the latency defeat the purpose, since it isn't implemented
on-device, and do Android's bluetooth/audio APIs provide the necessary access
to implement such features in an app?

[1][https://github.com/lawl/pulseaudio](https://github.com/lawl/pulseaudio)

[2][https://github.com/aarzilli/nucular](https://github.com/aarzilli/nucular)

[3][https://needgap.com/problems/22-enabling-hearing-aid-
feature...](https://needgap.com/problems/22-enabling-hearing-aid-features-on-
tws-earphones-audio-hearingaid)

------
lawl
Hey everyone, author here! Awesome to see this on HN.

I'm happy to answer any questions, but this is a slightly inopportune moment
to hit HN for me as I need to leave soon :) Some responses might be delayed by
a day or so!

~~~
kaielvin
Thanks for the work.

Is incorporating NoiseTorch into PulseEffects something that could be
considered? The benefit would be having all filters managed under one app.

~~~
lawl
I've seen PulseEffects mentioned a few times; I must admit that I don't know
exactly what it is and will need to research it first.

------
nickjj
I'm not saying this tool is bad, but I would be really careful about using
tools like this in an environment where audio quality really matters (YouTube
videos, podcasts, etc.).

Noise reduction tools work by removing specific frequencies from the source,
some of which overlap with your natural voice.

This is why you start to sound robotic and get weird cutouts if you use tools
to remove too much noise or background sound. It's one of those things
where, if you're not used to hearing your entire vocal range, you might not be
aware of how much is getting cut out by noise reduction tools.

It's too bad they don't have a before / after with a few voice samples in the
readme.

~~~
manojlds
This demonstration with Nvidia RTX Voice sounds pretty good

[https://youtu.be/Q-mETIjcIV0](https://youtu.be/Q-mETIjcIV0)

~~~
nickjj
It definitely sounds better than I thought it would, and I've watched tons of
this guy's videos in the past.

It really distorts his voice / range in some cases, such as when he taps his
desk with that orange hammer. The difference there is night and day; it chops
out his natural voice's range. It seems to degrade his voice more the more
intense the background noise is, such as the leaf blower (lol), but that's
reasonable to expect. At the same time, even the mechanical keyboard has a
very noticeable negative effect on his range.

It's one of those things where I wish so much that it worked perfectly, but I
couldn't realistically consider using it for any recording work due to things
like the above. There are just too many common noises (typing, etc.) that
drastically distort your voice.

9:23 in that video is hilarious though. Have to love Jerry!

~~~
amcoastal
I wonder if it's the algorithm degrading his voice or if the input sound is
already degraded. Is it possible a leaf blower or a hammer would cause enough
"noise" that our ears couldn't hear his voice clearly either? Then when you
subtract out the portion of the sound attributed to the leaf blower, you're
hearing the parts of his voice that weren't being jumbled by it?

~~~
jacobush
Like the blown-out whites of a photograph. You can adjust the levels, but if
the input peaked, there's just no information left in the data.

------
dsteinman
This might be useful alongside DeepSpeech
([https://github.com/mozilla/DeepSpeech](https://github.com/mozilla/DeepSpeech)),
which doesn't work very well in noisy environments.

------
ACAVJW4H
It might be a stupid question, but aside from the obvious benefit of saving
bandwidth by omitting useless noise in transport, doesn't it make sense to
employ these technologies server-side? One could maybe make Jitsi or
BigBlueButton use similar techniques. It would make them much more ubiquitous,
offer better platform support (it would work on mobile or low-CPU/GPU
clients), and also save on system provisioning, since the neural net could
perhaps be utilized more efficiently by running over different audio sources
concurrently.

~~~
bufferoverflow
As a system owner, it makes financial sense to do it on the client. Imagine
you're managing Zoom. You will need tens of thousands of GPUs running 24/7
just for noise suppression.

------
wenc
Very nice. Krisp.ai is a commercial option, and NVIDIA RTX is free but
requires a CUDA card, so this is a great alternative.

Noise suppression is becoming more and more common. My Jabra headset has it
built in.

~~~
kbouck
When testing Krisp.ai, I recorded myself speaking inches away from a noisy
water boiler. In the playback, I could not even hear the water boiler, but my
voice came through clearly. I signed up for the service immediately after
that.

~~~
orware
I signed up for it too last weekend after coming across it while doing some
research (I had been making a bunch of video recordings a few days prior, and
once the videos were added into Camtasia and the audio played back, I noticed
a lot of background hum from the HVAC return outside the room I'm in).

I was impressed with the Krisp.ai tech as well; it probably works similarly to
this tool and the Nvidia solution, which I can't try out since I don't have an
RTX card (the main difference might be the overall training set that Krisp has
already run their algorithm through?).

I haven't had any Zoom meetings since purchasing Krisp, but I had been using
the built-in mic from my LG Tone headset for those meetings.

Since making those video recordings I've been using my Blue Yeti mic (and a
pair of headphones connected to the mic for listening) as my primary, and I've
continued running a bunch of small tests to see if I can be happy with Krisp
enabled all the time.

Currently, though, I don't feel comfortable leaving it on all the time for
recordings, particularly with something like the Blue Yeti, which is able to
capture pretty rich audio. In my testing, Krisp did a great job of eliminating
the background HVAC hum, but replaced that issue with two others: some minor
(but distracting) hiss/noise between words when playing back the recorded
audio, and a current limit of a 16 kHz sample rate (support quoted the 16000
figure when I asked about audio quality degradation). The support person did
say that the team is working on increasing the sample rates they can work
with, so I guess there might be some improvements in the near future.

After seeing the latency figures on the NoiseTorch page it makes me wonder if
the Krisp latency is similar or not (so far I haven't noticed any latency
issues with Krisp).

As far as remaining thoughts... I kind of wish there were more configuration
options available for Krisp, but the simplicity is also a benefit (for others
who might not be as technical and just want a simple solution that does appear
to work overall). I haven't gotten it to work for playback (it has a toggle
for it, but nothing seems to happen when I try to turn it on). Also, I'm still
not sure what the differences/improvements are with Krisp Rooms enabled (I am
recording in a room, but after reading their description/blog announcement it
seems like it's more for conference rooms, where multiple people are speaking
and extra echo cancellation might be useful? ref:
[https://krisp.ai/blog/krisp-rooms-launch/](https://krisp.ai/blog/krisp-rooms-launch/))

Since I'm already out a year's subscription with them, I'll continue trying to
figure out how to use it effectively, but I'm not as excited about it at the
moment as I was last weekend (impressive overall though... hopefully it
continues to improve :-).

~~~
fred123
A 16 kHz sample rate (= 8 kHz max frequency) should be enough for speech only.
The human voice is mostly <0.5 kHz. You may hear some difference for hisses or
room sounds etc., but I'm sure you'd be unable to hear any difference from a
higher sample rate in a voice chat setting.

------
rstuart4133
From the developer of RNNoise, which is the technique being used here:

"As strange as it may sound, you should not be expecting an increase in
intelligibility. Humans are so good at understanding speech in noise that an
enhancement algorithm — especially one that isn't allowed to look ahead of the
speech it's denoising — can only destroy information. So why are we doing this
in the first place? For quality. The enhanced speech is much less annoying to
listen to and likely causes less listener fatigue"

[https://jmvalin.ca/demo/rnnoise/](https://jmvalin.ca/demo/rnnoise/)

------
sandworm101
Does noise suppression work in reverse? Can I use it to isolate the noise from
the human voices? There are lots of situations where someone might want to
isolate and analyse background noises or conversations.

~~~
fred123
Yes. Noise suppression is very similar to speech separation (separating
multiple speaker voices that talk at the same time). For example you can use
ConvTasNet for both speech separation and denoising; in the denoising case you
set target track 1 = speech, track 2 = noise, hence you get a noise-only
track.

I guess you can also simply subtract the clean speech from the original
mixture to get the noise-only track.
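
The subtraction idea can be sketched with plain command-line tools. A rough example using sox (file names are placeholders; it assumes the denoised track is sample-aligned with the mixture and both share the same format):

```shell
# Flip the polarity of the denoised speech, then mix it with the original:
# mixture + (-speech) = noise
sox clean_speech.wav inverted_speech.wav vol -1.0
sox -m mixture.wav inverted_speech.wav noise_only.wav
```

In practice, even a one-sample offset or any gain change introduced by the denoiser will leave speech residue in the result.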

------
hu3
I'm curious about the impact of Go's Garbage Collection in a real-time project
like this.

From reading past comments in other Go related threads I was led to believe
this was impossible to achieve with Go.

I'm talking about threads like this:
[https://news.ycombinator.com/item?id=21036037](https://news.ycombinator.com/item?id=21036037)

------
kaielvin
Alternatively there is the pulseaudio module: module-echo-cancel
([https://askubuntu.com/questions/18958/realtime-noise-
removal...](https://askubuntu.com/questions/18958/realtime-noise-removal-with-
pulseaudio)), which I have been using so far.

I haven't tried NoiseTorch yet. How do the two compare?
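
For reference, loading that module is a one-liner. A sketch along the lines of the linked askubuntu answer (the source/sink names are arbitrary, and `aec_method=webrtc` assumes pulseaudio was built with the webrtc canceller):

```shell
# Create an echo-cancelled / denoised source+sink pair:
pactl load-module module-echo-cancel \
    source_name=denoised_source sink_name=denoised_sink \
    aec_method=webrtc
# Record from the filtered source:
pactl set-default-source denoised_source
```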

~~~
lawl
NoiseTorch uses RNNoise, which uses a mix of deep learning and DSP to remove
noise. I haven't used module-echo-cancel yet, but it's probably "just"
classical DSP; RNNoise may deliver better results.

~~~
kaielvin
Indeed, after some testing, the filtering is much better.

------
formerly_proven
Most noise suppression I've seen so far can shave off a few dB (already worth
gold), but when you try to suppress more noise it always starts to impact the
signal very negatively. It will be interesting to see whether these ML
approaches can do better. I suspect they might depend even more on the type of
your voice than conventional noise suppression does.

~~~
fred123
Note that most state-of-the-art machine-learning-based denoising models
perform MUCH better than RNNoise quality-wise, but they are mostly not tuned
for real-time use.

If you’re interested, have a look at some of the Interspeech 2020 Deep Noise
Suppression submissions.

~~~
fred123
Some examples here: [https://paperswithcode.com/task/speech-
enhancement](https://paperswithcode.com/task/speech-enhancement)

Some of them have audio samples.

------
gingerlime
Anything similar for macOS? I tried krisp.ai, which is nice but seems too
heavy on my 2015 MacBook Air together with Zoom.

------
manojlds
Any of these remove dog barking noise?

~~~
speedgoose
I would guess so. RTX Voice removes my cat's sounds.

~~~
manojlds
Yeah, but with my rudimentary skills I struggled with dog barks, as they are
closer to our speech.

------
bhouston
This should be included in Linux by default, it's that good. :)

Or at least available via apt-get.

------
jcastro
I've been using this for the past few days and it's been fantastic; every
distro should just do this out of the box.

------
42droids
Thank you for making this, I really can't wait to try it. In fact, I am now
shocked this didn't exist before... :)

------
captn3m0
noisetorch-bin and noisetorch-git packages already on AUR:
[https://aur.archlinux.org/packages/?O=0&SeB=nd&K=noisetorch&...](https://aur.archlinux.org/packages/?O=0&SeB=nd&K=noisetorch&outdated=&SB=n&SO=a&PP=50&do_Search=Go)
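
For example, with an AUR helper (yay here is just an assumption; any helper, or a manual makepkg, works):

```shell
yay -S noisetorch-bin   # prebuilt binary
# or, to build from the latest source:
yay -S noisetorch-git
```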

------
ped4enko
How well did Go work out for this task?

------
freedomben
Is this using GTK? What bindings?

~~~
hu3
Not GTK but
[https://github.com/aarzilli/nucular](https://github.com/aarzilli/nucular)
which is a Go port of
[https://github.com/vurtun/nuklear](https://github.com/vurtun/nuklear)

------
kochthesecond
This is pretty cool!

------
thomasfedb
I read NoseTorch, was intrigued.

------
sahoo
Only if the sound card was detected in Linux. Sigh.

~~~
shock
What do you mean? NoiseTorch deals with PulseAudio, it doesn't deal with
hardware directly, so, yes, Linux needs to have a driver for your soundcard.

~~~
sahoo
I mean, is there a non-Linux port?

