What are some good learning resources on audio processing, detection and anomaly detection using machine learning or deep learning? I am interested in machine predictive maintenance using audio anomaly detection
There's a huge amount to discuss in the audio domain, but using a ResNet on spectrograms to build a binary classifier is a good place to start.
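To make the spectrogram part concrete, here is a minimal sketch of just the preprocessing step, assuming NumPy only. The function name `log_spectrogram`, the frame length, the hop size and the synthetic test tone are all my own illustrative choices, not anything prescribed by a particular library. The resulting 2-D array is what you would treat as an "image" and feed to the classifier; the ResNet itself is not shown.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frames, frame_len // 2 + 1) array of log-magnitude spectra.

    Hypothetical helper for illustration: Hann-windowed frames, real FFT,
    then log(1 + magnitude) to compress the dynamic range.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)

# Synthetic example (assumed values): one second of a 1 kHz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
# Frequency bin with the most energy, averaged over all frames.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
```

In practice you would probably use a mel-scaled spectrogram from an audio library instead, but the shape of the data going into the network is the same idea: time on one axis, frequency on the other.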
I am taking a course called "Speech and Audio Understanding" from Prof. Michael I. Mandel. You can check the course website [1]; he has a good collection of resources. His GitHub stars are also a good collection of related projects [2]. In class we are using a book called "Human and Machine Hearing: Extracting Meaning from Sound" by Richard F. Lyon; the author shares it for free [3].
For example, one of the resources you will see on the course website is the set of presentations from Interspeech 2018; you can check all the tutorials from there [4].
I don't know if this is off topic, but would it be possible to remove the sound of mechanical keyboards from a VoIP stream in real time with ML? Sell the technology to Discord and profit.
In some of my early YouTube videos (for classes I taught), I would live code. One complaint from one of my students was that, while they loved the videos, the keystrokes were distracting.
aubio and librosa are two excellent MIR (music information retrieval) tools I can recommend from personal use. Both can be used on real-time audio via pyaudio or similar.
I am also curious about this topic!
I have picked up a Jetson Nano and fully intend to put this device to use by projecting comic-book panel-style speech bubbles (plus, who knows... random panels?) on the wall, leveraging PyTorch + DeepSpeech.
I'm no expert. Haven't done it. Don't really want to send every convo into the cloud or my tinfoil hat will start burning.
You do not need a jetson to get started investigating. Maybe just nvidia for that particular library.
If you find something, maybe you can let me know somehow.
Recently I started looking into this as a backup method of anomaly detection while performing automated testing of our robotics. I concluded that it's actually pretty easy. Depending on how simple your requirements are, you can even achieve this cheaply and effectively on a very tiny microprocessor with an attached surface-mount MEMS microphone. Additional features like anomalous audio recording, timestamping and alert transmission are not that hard either. No need for a fully fledged general-purpose operating system or complex algorithms.
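As a rough illustration of how little is needed, here is a sketch of one possible simple scheme, not necessarily the one described above: track the RMS level of each audio frame, learn a running mean and standard deviation, and flag frames that deviate by more than a few sigmas. The class name `RmsAnomalyDetector`, the 4-sigma threshold, the warm-up length and the synthetic "hum" signal are all assumptions for the example. It is written in plain Python for readability; the same arithmetic ports directly to C on a microcontroller.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

class RmsAnomalyDetector:
    """Hypothetical detector: flags frames whose RMS level is far from the baseline."""

    def __init__(self, threshold_sigmas=4.0, min_baseline_frames=10):
        self.threshold_sigmas = threshold_sigmas
        self.min_baseline_frames = min_baseline_frames
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def update(self, frame):
        """Return True if the frame looks anomalous, else fold it into the baseline."""
        level = rms(frame)
        if self.count >= self.min_baseline_frames:
            std = math.sqrt(self.m2 / (self.count - 1))
            if abs(level - self.mean) > self.threshold_sigmas * max(std, 1e-9):
                return True  # anomalous frames do not update the baseline
        # Welford's online update of mean and variance: O(1) memory.
        self.count += 1
        delta = level - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (level - self.mean)
        return False

def hum(amplitude, n=160, freq=50.0, rate=8000):
    """Synthetic 50 Hz machine hum (illustrative stand-in for microphone data)."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

detector = RmsAnomalyDetector()
# 30 frames of normal operation with small amplitude variation...
normal_flags = [detector.update(hum(0.10 + 0.005 * ((i % 3) - 1))) for i in range(30)]
# ...followed by one much louder frame.
loud_flag = detector.update(hum(0.80))
```

Here the 30 normal frames pass and the loud frame is flagged. A real deployment would likely look at a few frequency bands rather than overall level, but the bookkeeping stays this small.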
See this book and the sources it links to: https://musicinformationretrieval.com/ Also google for pitch and onset detection. If you want more specific help, you have to ask a more specific question.
It sounds as if he's looking for tools that can be used to monitor the sounds coming from machinery to detect or predict impending failures. I found your link interesting since I'm interested in musical applications of machine learning, but I don't think it's what he's looking for.
MIR is where the research is at. Not nearly as much work has been done in the general audio IR domain, but most methods are easily transferable. E.g., tempo estimation would perhaps serve his anomaly detection needs.
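To sketch what transferring the tempo idea might look like: many tempo estimators start by finding the dominant repetition period of an energy envelope via autocorrelation, and a machine with a cyclic sound could be monitored the same way, with a shift in the period treated as a warning sign. The code below is only that autocorrelation building block, assuming NumPy; `dominant_period`, `click_train` and the periods used are made-up names and values for illustration, and it is not a full tempo estimator.

```python
import numpy as np

def dominant_period(envelope, min_lag, max_lag):
    """Lag (in envelope samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    x = np.asarray(envelope, dtype=np.float64)
    x = x - x.mean()  # remove the DC offset so it does not dominate
    # Full autocorrelation; keep only non-negative lags 0..len(x)-1.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    return min_lag + int(np.argmax(ac[min_lag:max_lag + 1]))

def click_train(period, length=600):
    """Synthetic envelope: one impulse every `period` samples (illustrative only)."""
    env = np.zeros(length)
    env[::period] = 1.0
    return env

# A "healthy" machine clicking every 25 envelope samples,
# and the same machine after its cycle has slowed to every 30.
healthy_period = dominant_period(click_train(25), min_lag=10, max_lag=40)
drifted_period = dominant_period(click_train(30), min_lag=10, max_lag=40)
period_changed = healthy_period != drifted_period
```

The lag range matters: it has to bracket the expected cycle time, otherwise the search can lock onto a multiple of the true period.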
Contact the founder / maker of Auphonic.com. He's a super nice and clever guy who does this kind of stuff for a living. He'll definitely point you in the right direction.
This depends on whether you're interested in creative applications or analytical (MIR) ones. The two fields share a lot of techniques, but the way they are used is wildly different.
https://courses.engr.illinois.edu/cs598ps/fa2018/material.ht...
The course is led by Paris Smaragdis, one of the top researchers in the field of audio processing.