Sampling at 48 kHz seems like overkill. If you record at a lower sample rate (e.g. 22 kHz) you will still be able to detect the bubbles, but the files will be roughly half the size. As an added bonus, the lower rate acts as a sort of high-cut filter, since nothing above about 11 kHz survives, which may help with stray high-frequency noise and, to some extent, thermal noise. Additionally, you can apply a band-pass filter around the frequency of the popping, if you aren't already cleaning the signal up before processing. Given a clean recording, you can trim the silence and, as long as you preserve the timing metadata, still have the full sample data to re-run algorithms on later.
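To make the band-pass suggestion concrete, here is a minimal sketch using a single biquad section in plain Python (coefficients in the widely used audio-EQ cookbook form). The 2 kHz centre frequency, the Q of 2 and the 22050 Hz sample rate in the usage note are placeholder assumptions, not values measured from the recording; the real pop frequency would have to be read off a spectrum first.

```python
import math


def bandpass_coeffs(center_hz, q, sample_rate):
    """Biquad band-pass coefficients, normalised so the gain at center_hz is 1."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def biquad_filter(samples, b, a):
    """Run the samples through the filter (direct form I difference equation)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, sample_rate, n):
    """Synthetic sine wave, used here only to check the filter."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

Usage would look something like `b, a = bandpass_coeffs(2000.0, 2.0, 22050)` followed by `filtered = biquad_filter(samples, b, a)`. A higher Q narrows the pass band; a single section like this has gentle skirts, so a steeper filter would need several sections in series.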
You could match the spectral signature to a known sample to distinguish between bubbles popping and other types of noises, if there were any, but the recording conditions seem good enough not to warrant it.
You could also possibly use an array of two, three or more mics and do some fancy triangulation to get a 3D map of the pop locations and better distinguish between individual pops, but I doubt how useful that would be besides a few cool visualizations.
Yeah, I agree with respect to the sample rate. I wasn't really sure what sample rate would be best when I started recording the audio, so I just picked the highest the soundcard could handle.
I'll definitely look into applying a band-pass filter, that's a good idea!
I had wondered about the spectral signature idea. I was thinking I could slide a window across the data and apply some kind of correlation function, comparing the FFT magnitude of a known 'bubble' sample against the FFT magnitude of each window. But I wasn't sure which correlation function to use. Would the Pearson correlation coefficient be sensible?
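As a rough illustration of the sliding-window idea, the sketch below computes the Pearson coefficient between a template vector and every window of a longer sequence. It is agnostic about what the vectors contain: they could be raw samples, as in the toy usage, or magnitude spectra computed per window. The function names and the `hop` parameter are made up for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(p * q for p, q in zip(dx, dy))
    den = math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))
    # A flat window has zero variance; treat it as "no match" instead of dividing by zero.
    return num / den if den > 0.0 else 0.0


def sliding_correlation(signal, template, hop=1):
    """Return (start_index, score) for each window of len(template) in signal."""
    m = len(template)
    return [
        (start, pearson(signal[start:start + m], template))
        for start in range(0, len(signal) - m + 1, hop)
    ]
```

Because Pearson subtracts the mean and normalises by the spread, the score ignores overall gain and DC offset, so a quiet pop and a loud pop with the same shape both score close to 1. For example, `max(sliding_correlation(signal, template), key=lambda t: t[1])` gives the best-matching window.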
I like your idea of using multiple microphones to get a 3D visualisation.
You are better off squaring the signal. Applying an FFT to get the magnitude will smear your signal in time. I suggest you look into acoustic emission detection.
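One way to read the squaring suggestion is as a short-time energy detector: square each sample, smooth with a short moving average, and mark the points where the envelope rises above a threshold. The sketch below assumes that interpretation; the window length, threshold and refractory period are illustrative parameters that would need tuning against real recordings.

```python
def energy_envelope(samples, window):
    """Moving average of the squared signal over the last `window` samples."""
    squared = [s * s for s in samples]
    envelope = []
    running = 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= window:
            running -= squared[i - window]  # drop the sample leaving the window
        envelope.append(running / window)
    return envelope


def detect_onsets(envelope, threshold, refractory):
    """Indices where the envelope first rises to or above the threshold.

    `refractory` is the minimum spacing in samples between reported onsets,
    so one ringing pop is not counted several times.
    """
    onsets = []
    last = -refractory
    above = False
    for i, level in enumerate(envelope):
        if level >= threshold and not above and i - last >= refractory:
            onsets.append(i)
            last = i
        above = level >= threshold
    return onsets
```

Unlike a per-chunk FFT, the envelope keeps sample-level timing, so the onset indices can be converted straight to timestamps by dividing by the sample rate.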
Thanks! I'd not heard of acoustic emission detection, I'll look into that now.
Do you think my sample rate is high enough for acoustic emission detection? I notice that in your other comment you mention they often use sample rates many times higher.
Also, is there a particular paper you'd recommend on acoustic emission detection?
Since bubbles popping are a random and impulsive source, the spectrum of the detected noise will be the resonance of the glass plus a white noise floor.
48 kHz is not overkill. Acoustic emission sensors used to detect signals like this typically use many times that sampling rate.
Awesome! I attempted the same approach, but after discovering a cheap gas concentration sensor (MQ-135, tunable to CO2) I abandoned the audio approach for a minimum of data processing. I get very accurate bubble detection and highly recommend it :-)
Ooh, that sounds very interesting. I'd be interested in how that works: do you put the CO2 sensor near the airlock and look for 'peaks' of CO2 when bubbles come through?
Or is the sensor in the fermenter itself and you're kind of measuring the pressure?
I put it near the airlock and detect peaks. For now, the most valuable information I extract is when the fermentation starts and stops -- with notifications on my phone :-)
I took the audio recordings first, then processed the .wav files later via FFT. That's a good point, though having the raw audio does let me play around with different techniques at a later date.
I will have to look into a sliding window, if that's what you mean though, because at the moment I'm simply looking at each chunk of N seconds of audio.
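For comparison with fixed N-second chunks, a sliding window can be sketched as a generator that advances by a hop smaller than the window, so consecutive chunks overlap and a pop that straddles a chunk boundary still lands wholly inside at least one window. The function and parameter names here are illustrative.

```python
def sliding_windows(samples, window, hop):
    """Yield (start_index, chunk) pairs of length `window`, advancing by `hop`.

    hop == window reproduces non-overlapping chunks; hop < window overlaps them.
    Trailing samples that do not fill a whole window are skipped.
    """
    for start in range(0, len(samples) - window + 1, hop):
        yield start, samples[start:start + window]
```

A hop of half the window (50% overlap) is a common starting point; each chunk can then be passed to whatever per-chunk analysis is already in place.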
Once I've tweaked the approach, it would definitely be better to switch to the type of system you mention.