

Ask HN: Is this idea for audio compression technically feasible? - pcf

I have a great idea for an audio compression algorithm, but does it exist already? And if not, is it technically feasible?

My idea is that you take repeating identical sounds/frequencies (kick/snare/hi-hat/bass etc.) and encode them as the "same" as the first occurrence in the song. That way you would save a lot of data/space, especially with electronic music, where many of the sounds actually are 100% identical throughout the song.

But would this work in practice? Or would it be too complex to find out what is truly identical and what is just similar, but not similar enough to be marked as "same"?

I would love to hear what HN has to say about this. Thanks!
======
lsiunsuex
As someone who listens to EDM (electronic dance music) almost exclusively
(though I'm no DJ or producer), I'd say the vast majority of those sounds come
from some sort of MIDI-type software or machine, be it a drum machine, sound
effects machines or software, or recordings of sounds that occur around us. EDM
also has a build-up: the beat generally starts simple and gets more complex
as the track progresses. Sometimes that's called the drop, right? When a track
builds up, they "drop the beat" and the beat gets more complex, as found in
Dubstep.

So I think compression like that might work in the software used to write /
develop a track: if the software can determine that two notes or beats are
identical, it could encode the file with less data. But I think of music
as similar to water. It's fluid, with each note / beat / voice affecting
the notes / beats around it. Ever listen to a track with the voice removed? Or
with the drums removed? It sounds very different than when the track is
untouched. Point being, I don't think this would work for a final MP3 or
lossless compression.

(or I may be completely wrong)

------
nathan_f77
You're kind of thinking of the "source code" for electronic music.

To simplify things: You start with a sequence of notes stored as MIDI, which
are played through synthesizers and sampled instruments, which generate the
sounds. Then you layer effects on top of that, such as reverb, filters, and
compression.

Distributing this "source code" probably wouldn't make music files any
smaller, since you would also need to distribute all of the sample packs and
effects plugins. A single reverb or compression plugin will take up tens of
megabytes. Acoustic instruments are usually sampled at very high quality. A
grand piano library can be gigabytes of samples, which includes all kinds of
variations for velocity, acoustic response, etc.

In addition to that, anything you record with a microphone can't be compressed
in this way, such as vocals, or any acoustic instruments. So if you have
vocals throughout the whole song, then the entire length of this audio track
will have to be compressed at the original bitrate anyway. So if you have a
3-minute song that uses lots of samples, but also has 3 minutes of vocals, then
the vocals are your bottleneck. Whatever you do to split up your track into
samples, your minimum filesize will be 3 minutes of compressed audio.

So as soon as you have anything that can't be reduced into repeating samples +
filters, you won't be able to do better than MP3 or FLAC (for lossless
compression).

If your song contains very little recorded audio, and is purely MIDI +
synthesizers, then this technique could save a lot of space.
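
To put rough numbers on the vocal bottleneck described above, here is a back-of-the-envelope sketch in Python. The 128 kbit/s bitrate is an assumption picked for illustration, not a figure from this thread, and `compressed_size_bytes` is a hypothetical helper.

```python
def compressed_size_bytes(duration_s, bitrate_kbps):
    """Size of a constant-bitrate stream: seconds * bits per second / 8."""
    return duration_s * bitrate_kbps * 1000 // 8

# 3 minutes of vocals at an assumed 128 kbit/s.
vocals = compressed_size_bytes(180, 128)
print(vocals)  # 2880000 bytes, i.e. about 2.9 MB
```

However cleverly the sampled parts are deduplicated, the file cannot shrink below this vocal stream at the chosen bitrate.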

------
rullgrus
The analysing part sounds like something "The Infinite Jukebox" [0] must be
doing to find suitable paths to jump between.

[0] [http://labs.echonest.com/Uploader/index.html](http://labs.echonest.com/Uploader/index.html)

------
dalke
I know little of the topic, but the end result sounds equivalent to methods
that convert a music recording to a MIDI file. There is a list of such
programs at
[http://wiki.audacityteam.org/wiki/Midi#Converting_from_audio...](http://wiki.audacityteam.org/wiki/Midi#Converting_from_audio_formats_to_MIDI).
That section does comment that this is a challenging research problem.

------
pcf
I should add that I'm a music producer myself and know very well how music is
created.

What I'm wondering is whether it's feasible to have an algorithm that searches
for repeated instances of any sound (don't limit yourself to thinking MOD or
samples) across a file and then marks these as "same" in order to save space.

Any programmers who could tell me that?
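
For the bit-exact case, the search being asked about could look roughly like the Python sketch below. It is only an illustration under stated assumptions: samples are plain integers, `block` is a made-up minimum match length, and `encode_repeats` / `decode_repeats` are hypothetical names, not part of any existing codec. Sounds that are merely similar are not handled at all.

```python
def encode_repeats(samples, block=4):
    """Greedy encoder: replace exact repeats of `block` samples with back-references."""
    seen = {}   # window contents -> index of its first occurrence
    out = []    # tokens: ("lit", value) or ("copy", source_index, length)
    i, n = 0, len(samples)
    while i < n:
        window = tuple(samples[i:i + block])
        j = seen.get(window)
        # Only copy from a source that lies entirely in the already-encoded past.
        if len(window) == block and j is not None and j + block <= i:
            out.append(("copy", j, block))
            step = block
        else:
            out.append(("lit", samples[i]))
            step = 1
        # Remember every full-length window that starts in the span just consumed.
        for k in range(i, i + step):
            w = tuple(samples[k:k + block])
            if len(w) == block:
                seen.setdefault(w, k)
        i += step
    return out


def decode_repeats(tokens):
    """Rebuild the sample list from literal and copy tokens."""
    samples = []
    for tok in tokens:
        if tok[0] == "lit":
            samples.append(tok[1])
        else:
            _, src, length = tok
            samples.extend(samples[src:src + length])
    return samples


kick = [5, -3, 9, 1]                      # toy stand-in for a drum hit
song = kick + [0, 0] + kick + [7] + kick  # the same hit three times
tokens = encode_repeats(song)
print(tokens)
```

The second and third hits become `("copy", 0, 4)` tokens and decoding restores the input exactly. Change a single sample in one hit and that hit falls back to literals, which is the "identical versus just similar" problem in miniature.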

------
cJ0th
Some people actually create such files by writing music with a tracker.
[http://en.wikipedia.org/wiki/Music_tracker](http://en.wikipedia.org/wiki/Music_tracker)

What you describe may be called a conversion to a tracker format.

------
rfergie
Is this not the type of thing that every compression algorithm does?
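
For general-purpose LZ77-style compressors, at least, exact repeats are what gets exploited. A minimal sketch with Python's standard `zlib` module, using seeded random bytes as a stand-in for audio (an assumption for illustration; this is not real PCM data):

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic stand-in for one short loop: 1000 random bytes.
bar = bytes(rng.getrandbits(8) for _ in range(1000))

looped = bar * 4                                         # the same loop four times
unique = bytes(rng.getrandbits(8) for _ in range(4000))  # same length, no repeats

looped_size = len(zlib.compress(looped, 9))
unique_size = len(zlib.compress(unique, 9))
print(looped_size, unique_size)
```

The looped input compresses to roughly the size of one loop, while the non-repeating input barely shrinks. Note that DEFLATE only looks back 32 KiB, which at CD quality (44.1 kHz, 16-bit stereo, 176,400 bytes per second) is under a fifth of a second, so repeats a whole beat apart are out of reach for `zlib` specifically.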

~~~
mschuster91
Nope. MP3, for example, cuts off all frequencies outside the audible range and
thus achieves large savings.

~~~
GFK_of_xmaspast
A 44.1 kHz .wav file has no frequencies outside the audible range.

~~~
joushou
How well do you hear 1 Hz? A 44.1 kHz file doesn't have _a lot_ of frequencies
outside of the audible range, but that's not really that relevant, as that's
only a minor trim compared to the crazy things lossy compression algorithms do
here. Their main focus is having the result sound _almost_ like the original.
The audible sound-image is modified extensively by them, with focus on
simplifications that are hard to notice for a human.

