
Deep Learning ‘ahem’ detector - sndean
https://github.com/worldofpiggy/deeplearning-ahem-detector
======
dnautics
Badly needed: something that removes coughing from classical music

~~~
gaur
Relevant:
[http://www.davegrossman.net/gould/](http://www.davegrossman.net/gould/)

~~~
savanaly
Terrific, hilarious page. I'm in the camp that likes Glenn's humming though.

~~~
dnautics
I like his humming, too, but at the risk of making a heretical statement, it
would certainly be nice to have both options.

------
petercooper
Hook this up to a shock collar and train yourself. Even better if it could
detect "erm"s :-)

~~~
M_Grey
How about just... a mild vibration on the wrist or something? Then again, at
least, "AhAAAAHHHH" is more interesting than "Ahem".

~~~
annnnd
Mild vibration wouldn't be enough, it has to be at least annoying, if not
painful... I would buy it in a second though, as this approach is probably the
easiest way to self-train the "uhm"s & similar out.

Any entrepreneurs around here? ;)

~~~
M_Grey
No, but I can buy some dog anti-barking collars, and I'm willing to electrify
some people... especially today.

~~~
annnnd
Well then, if I start barking and can't stop, I'll contact you for sure!

~~~
M_Grey
Good boy!

------
greggman
Random trivia: "ahem" is a cultural / language based thing. Other languages
use different "words" for "ahem". The same for grunting sounds. For example
the sound people make when they need to push something very hard. It's
cultural not natural. The sounds are different by culture.

~~~
jmiserez
And "huh" seems to be universal:

[http://www.smithsonianmag.com/science-nature/everybody-almos...](http://www.smithsonianmag.com/science-nature/everybody-almost-every-language-says-huh-huh-180949822/)

------
Taylor_OD
Can I get a deep learning model that strips the advertising out of podcast
episodes?

~~~
greggman
Getting off topic but ... by advertising do you mean obvious commercials or
submarine ones? I ask because my podcast player has buttons to advance 30s or
2m and to rewind 15s. When a commercial-like ad starts it takes me 2-3 seconds
to skip it. In other words I don't need deep learning to skip ads for me.

Which actually makes me question if podcasts can make money as it's so dang
easy to skip all the ads.

~~~
qq66
A lot of podcast consumption is while driving, where operating the phone is
dangerous and illegal. I would especially like an option to skip advertising
when I'm listening to two episodes from the same podcast and the closing ads
from one episode are the same as the opening ads in the next episode.

------
revelation
So this is literally learned from images of the fake-color frequency spectrum?

That's not state of the art in DNN speech recognition, is it?

~~~
skoocda
Pretty much SOA; most end-to-end systems use these spectrograms on short time
slices. The alternative is mel-frequency cepstral coefficients, which are used
more in GMM-HMM speech recognition than for DNN.
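
For anyone wondering what "spectrograms on short time slices" means in practice, here is a minimal sketch, not the project's actual preprocessing. It uses only the Python standard library; the function name `spectrogram`, the 64-sample frame, the 32-sample hop and the 8 kHz test tone are all assumptions picked for illustration. Real pipelines would normally use an FFT from NumPy or SciPy instead of the naive DFT below.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return log-magnitude spectra (in dB) of short, overlapping time slices.

    Hypothetical helper for illustration; one row per time slice,
    one column per frequency bin (frame_len // 2 + 1 bins).
    """
    # A Hann window reduces spectral leakage at the edges of each slice.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for bin k (an FFT computes the same values faster).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(20 * math.log10(abs(acc) + 1e-10))
        rows.append(row)
    return rows


# Assumed test input: 0.1 s of a 1 kHz sine sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(800)]
spec = spectrogram(tone)

# Average each frequency bin over time and find the strongest one.
avg = [sum(col) / len(spec) for col in zip(*spec)]
peak_bin = max(range(len(avg)), key=avg.__getitem__)
print(len(spec), "slices x", len(spec[0]), "bins; peak near",
      peak_bin * sr / 64, "Hz")
```

Rendering `spec` as an image (time on one axis, frequency on the other, colour for dB) gives the fake-colour picture mentioned above, which is what an image classifier would be trained on.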

~~~
annnnd
For other illiterates like me:

* SOA = state of the art

* GMM-HMM = Gaussian mixture model - hidden Markov model

* DNN = deep neural networks (more than 1 hidden layer)

* mel-frequency cepstral = MFC ;-)

~~~
kwhitefoot
Could sum1 write a GM script to replace all abbr.s with their ffs? The over
use of abbr.s is one of the most XABs on Hacker News.

Could someone write a Greasemonkey script to replace all abbreviations with
their full forms? The overuse of abbreviations is one of the most extremely
annoying behaviours on Hacker News.

(Thank you [https://www.allacronyms.com/aa-search?q=annoying&cx=01082138...](https://www.allacronyms.com/aa-search?q=annoying&cx=010821384832661523411:tvncv4ludxy&cof=FORID:11))
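
The replacement step such a script would perform can be sketched roughly as follows. This is Python rather than an actual Greasemonkey userscript (which would be JavaScript), the names `GLOSSARY` and `expand_abbreviations` are made up for illustration, and the glossary only holds the expansions already given in this thread.

```python
import re

# Hypothetical glossary, filled only with expansions listed in this thread.
GLOSSARY = {
    "SOA": "state of the art",
    "GMM-HMM": "Gaussian mixture model - hidden Markov model",
    "DNN": "deep neural network",
    "MFC": "mel-frequency cepstral",
}


def expand_abbreviations(text, glossary=GLOSSARY):
    """Replace each known abbreviation with 'full form (ABBR)'."""
    # Try longer keys first so a compound like "GMM-HMM" is matched whole.
    keys = sorted(glossary, key=len, reverse=True)
    # Lookarounds keep us from touching abbreviations embedded in longer
    # words or hyphenated compounds that are not in the glossary.
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(map(re.escape, keys)) + r")(?![\w-])"
    )
    return pattern.sub(
        lambda m: f"{glossary[m.group(1)]} ({m.group(1)})", text
    )


print(expand_abbreviations("Pretty much SOA for DNN."))
```

A fixed glossary like this cannot resolve abbreviations with several meanings, which is probably the hard part of the request.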

------
neals
I'm serious here: Can we make a "euuhmm" remover for Elon Musk? I love his
vision, but can't stand his talks because he can't get a word out without
"euhmm"-ing it.

~~~
frag
lol... how about forking it on GitHub and training a model on Elon? I hate his
euhmm thing so badly!! :D

------
eth0up
<peeve>I hope this evolves to eventually become capable of removing "like"s,
and maybe "uhm"s too, but certainly "like"s. I would willfully reside in the
throat of a titanic bloviating German with influenza before listening to that
haggard word uttered in every instance where punctuation or thoughtfulness
ought have precedence. "Ahem" I can tolerate, even "uhm" and "you know"; but
not all sentences require an analogy.</peeve>

~~~
eth0up
For those apparently disturbed [1] by some attribute of the above, please see:

[https://en.wikipedia.org/wiki/Guttural](https://en.wikipedia.org/wiki/Guttural)
\- note that the German language is, by some[2], considered guttural. Happens
to be among my favorites too. "Ahem", being a guttural sound, I jovially
compared to a hypothetically enlarged and garrulous German with influenza (for
purposes of exaggeration), which one might fairly imagine sounding slightly
more pharyngeal than usual - which I must add that I would truly not find
offensive. As for any perceived assault on the ubiquitous abuse of the word
"like", if there is anything I can add to exacerbate it, I'd be delighted to
oblige.

1\. Euphemism for forbidden reference regarding votes

2\. [http://www.huffingtonpost.com/2013/08/03/german-harsh-langua...](http://www.huffingtonpost.com/2013/08/03/german-harsh-language-other-languages-video_n_3683379.html)
\- A little witzelsucht for your aphonogelia.

------
hackpert
A friend and I tried to make a similar detector for removing squeaking sounds
of whiteboard markers from videos. Although solutions like this do sound like
a waste of time to some, I think they can go a long way in removing those tiny
annoyances.

~~~
Kenji
That sounds fantastic! I've recently been looking at mechanical keyboards
again (Cherry-MX blue switches, I have browns at the moment) and those are
quite noisy. An algorithm to detect and filter keyboard noise could benefit
millions of people with VoIP, live streams, etc.

~~~
frag
interesting

------
Hydraulix989
Next thing you need is ummm and uhhhh detection :)

~~~
sundvor
Maybe I'm showing my age here, but I'd actually prefer a "liiiiiike" detector.

------
amelius
I would be interested in an "ahem" remover, with the constraint that it should
also work if the "ahem" is superimposed over other sound, e.g. music.

------
pmyjavec
This really seems like a waste of time, or am I missing something?

~~~
yalooze
First thought: helpful to know what words to ignore when trying to parse
audio.

------
skoocda
This is neat. Looks like you need more data though!

~~~
frag
Exactly! With just 5 epochs and some hours on a low budget GPU I got 81%
accuracy. Not bad at all, considering that no knowledge of MFC & Co. is
required.

------
jaflo
Is there a demo available?

~~~
frag
The entire project is on GitHub. In the data/ folder there are some samples to
"see" it in action. Otherwise you have to train it on your voice and apply it
to whatever sound you like.

------
frag
ahem... hi everybody this is ahem... Piggy ;)

------
farright
Is it just a coincidence that this comes right at the end of Obama's tenure?

