
Lossless audio compression with libpng - phoboslab
https://github.com/alankila/Junk/tree/master/wav2png
======
FlyingAvatar
FLAC is a streaming format. Each block of audio is compressed independently of
the rest of the file.

Comparing it to any archival compression algorithm (PNG uses DEFLATE with some
pre-filtering) is not an apples to apples comparison.

The utility of a streaming format is that (a) your listeners can start the
stream at any offset without having the preceding blocks, (b) the CPU overhead
is generally stable, and (c) decoding requires a minimal amount of memory and
IO.
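
A rough sketch of point (a), using zlib on fixed-size blocks as a stand-in (this is not FLAC's actual framing; the block size and helper names are made up for illustration). Because every block is compressed on its own, any one of them can be decoded without the blocks before it:

```python
import zlib

BLOCK = 4096  # bytes per block; arbitrary for this sketch

def compress_blocks(data: bytes) -> list:
    """Compress every block on its own, so no block depends on another."""
    return [zlib.compress(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

def read_block(blocks: list, index: int) -> bytes:
    """Decode a single block without reading any of the preceding ones."""
    return zlib.decompress(blocks[index])

data = bytes((i * 7) % 251 for i in range(20000))  # stand-in for PCM data
blocks = compress_blocks(data)
# Jump straight to block 3, as a player seeking into the stream would.
assert read_block(blocks, 3) == data[3 * BLOCK:4 * BLOCK]
```

The price is that no block can reuse history from earlier blocks, which is part of why a whole-file compressor can come out ahead on a short, self-similar clip.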

------
robryk
I've done:

    
    
      sox piano2.flac piano2.wav
      gzip -c < piano2.wav > piano2.wav.gz
      brotli-encode < piano2.wav > piano2.wav.brotli
      ls -l piano2.wav piano2.wav.gz piano2.wav.brotli piano2.flac
    

and gotten:

    
    
      -rw-r--r-- 1 robryk users 40056 Jun 14 19:15 piano2.wav.gz
      -rw-r--r-- 1 robryk users 87656 Jun 14 19:13 piano2.wav
      -rw-r--r-- 1 robryk users 34221 Jun 14 19:14 piano2.wav.brotli
      -rw-r--r-- 1 robryk users 52372 Jun 14 19:13 piano2.flac
    

Either I'm doing something wrong or even plain gzip fares better than flac
(edit: in this particular case, obviously). brotli-encode compresses with
brotli[1] in font mode.

[1] [https://code.google.com/p/font-compression-reference/source/...](https://code.google.com/p/font-compression-reference/source/browse/#git%2Fbrotli)

~~~
bnegreve
Here's the answer (from AceJohnny2's comment):
[https://news.ycombinator.com/item?id=7893243](https://news.ycombinator.com/item?id=7893243)

------
foxhill
a single test case hardly means "beating" FLAC. indeed, if you just wanted to
encode a sine wave at one frequency for 30 seconds, your encoder would encode
this as a couple of bytes - for any general compression codec there will be a
problem class for which it will not be optimal.

that said, this approach may obviously be applied more generally, but it
remains to be seen if it will perform so well. self-similarity seems so
obvious as a compression technique that i would be.. surprised if this sort of
compression wasn't attempted before.

~~~
pyre
I remember an article a while back about how there are certain inputs that
will be lossy in FLAC. They were mostly artificial inputs though (e.g. a
square wave).

~~~
dTal
Can you find the link? FLAC is not supposed to be lossy under any
circumstances. In the worst case, if none of several algorithms are successful
at compression, the block is stored verbatim. A single-sample deviation on any
input would be considered a bug.
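
A minimal sketch of that fallback logic, with zlib and bz2 standing in for the codec's real prediction modes (the one-byte tags and function names are invented for this example, not FLAC's format). The point is only that the worst case is a verbatim copy plus a tiny header, never a lossy result:

```python
import bz2
import random
import zlib

def encode_block(block: bytes) -> bytes:
    """Try each codec, keep the smallest result, fall back to a verbatim copy."""
    candidates = [(b"Z", zlib.compress(block, 9)), (b"B", bz2.compress(block))]
    tag, payload = min(candidates, key=lambda c: len(c[1]))
    if len(payload) >= len(block):
        return b"V" + block  # nothing helped: store the block unchanged
    return tag + payload

def decode_block(encoded: bytes) -> bytes:
    """Undo encode_block by dispatching on the one-byte tag."""
    tag, payload = encoded[:1], encoded[1:]
    if tag == b"V":
        return payload
    if tag == b"Z":
        return zlib.decompress(payload)
    if tag == b"B":
        return bz2.decompress(payload)
    raise ValueError("unknown block tag")

rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(1024))  # incompressible input
silence = bytes(1024)                                    # highly compressible input
for block in (noise, silence):
    assert decode_block(encode_block(block)) == block
```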

------
AceJohnny2
People forget one advantage of FLAC over other compression methods: it's
directly playable. You don't need to decompress the whole thing before you can
start playing it.

That's an essential feature for a media compression format.

~~~
raverbashing
It might be possible to create a libpng player; however, the way the data is
decompressed may be a limitation that prevents that.

Of course there are FLAC players, but without one of those you would need to
decompress it to WAV before playing.

~~~
benatkin
That was implied by what AceJohnny2 said.

------
rb2k_
Interesting approach, but:

> The sample audio I chose is one of the best case signals for this kind of
> compression, a single gradually decaying piano note.

Anyone know if this approach has any use in real-world scenarios?

~~~
gcr
Tracker music works by having a small table of "samples" that are pitch-
shifted and played on top of each other to make music.

Each "sample" is a small unit of sound: a piano note, a cymbal crash, a snare
hit, a single guitar pluck, etc. In most tracker music files, most of the
space is taken up by the raw sample waveforms; only a little space is used by
the actual arrangement of the music notes themselves.

~~~
Argorak
For a good example of what the result sounds like, I recommend the UT
soundtrack:

[https://www.youtube.com/watch?v=7MSFW8pZ-_4](https://www.youtube.com/watch?v=7MSFW8pZ-_4)

It has a pretty distinct feel to it, IMHO.

~~~
Shish2k
My favourite sample-based soundtrack; turns out it's actually the same
composer working with the same software as yours :D

[https://www.youtube.com/watch?v=2yDVM77lGlM](https://www.youtube.com/watch?v=2yDVM77lGlM)

Also noteworthy is that you can open these games' soundtracks in a tracker
like [http://openmpt.org/](http://openmpt.org/) to see the arrangements and
play samples individually. I tried making music that way a while back and was
sad that all my tunes sounded as shitty as the samples I was working with,
then opened up the Deus Ex OST and found that his samples were worse, and he
was working with fewer channels and filters; it's just pure skill in
arrangement that makes the music so great :)

~~~
voltagex_
I'm a huge tracker music fan, but I didn't know about the Deus Ex OST. My
introduction to tracker music came from Epic games like One Must Fall 2097,
Epic Pinball and Jazz Jackrabbit. It's really interesting to open a lot of
these tracker music files up and play with the samples, have a look at the
comments left and marvel at the way the music works.

Some other good sites to look at:

[http://modarchive.org/](http://modarchive.org/) - huge tracker music archive
site

[https://www.scene.org/](https://www.scene.org/) - demoscene information and
archive

[https://pouet.net/](https://pouet.net/) - demoscene information and archive

~~~
chengsun
Don't forget
[https://www.scenemusic.net/demovibes/](https://www.scenemusic.net/demovibes/) - demoscene streaming music radio

------
AshleysBrain
Are there any lossless compression algorithms that use autocorrelation to
reduce the bit depth on repeated cycles? For example a human voice sustaining
a note may have similar cycles, but with enough low-level noise-like
variations that traditional deflate-type compression does not help much.
However if an algorithm could spot that the second cycle only has small
differences to the first cycle, perhaps it could encode the second cycle as a
diff to the first cycle. Since the second cycle differences are likely to be
small, the bit depth could be reduced (e.g. to 2-3 bits). Then the third cycle
could also be encoded as a diff of the second, so even a gradually changing
volume can be encoded with a very low bit rate throughout while ending up
significantly different from the first cycle.

Has that been tried before? Would be willing to spend a few evenings hacking
on this myself.
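
A minimal sketch of the idea, under a generous assumption: a synthetic, slowly decaying tone whose cycle is exactly 100 samples long (real recordings are rarely that tidy). All names and numbers here are made up for illustration. Each sample after the first cycle is stored as the difference to the sample one cycle earlier, and the bit depth needed for those differences is compared with the raw signal:

```python
import math

def bits_needed(values):
    """Bits for a signed integer wide enough to hold the largest magnitude."""
    peak = max(abs(v) for v in values)
    return peak.bit_length() + 1  # +1 for the sign bit

def cycle_diff(samples, period):
    """Keep the first cycle as-is; store later samples as diffs one cycle back."""
    head = list(samples[:period])
    tail = [samples[i] - samples[i - period] for i in range(period, len(samples))]
    return head + tail

def cycle_undiff(residual, period):
    """Invert cycle_diff exactly, so the scheme stays lossless."""
    out = list(residual[:period])
    for i in range(period, len(residual)):
        out.append(residual[i] + out[i - period])
    return out

period = 100
samples = [round(20000 * math.exp(-i / 20000) * math.sin(2 * math.pi * i / period))
           for i in range(2000)]
residual = cycle_diff(samples, period)
assert cycle_undiff(residual, period) == samples  # lossless round trip
print(bits_needed(samples), bits_needed(residual[period:]))
```

On this idealised input the differences need noticeably fewer bits than the raw samples; how well that survives noise and a drifting cycle length is exactly the open question.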

~~~
thrownaway2424
FLAC works by finding a well-fitting polynomial function and Rice-encoding the
residuals, which isn't very different from what you described.
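
A simplified sketch of that pipeline, not FLAC's actual bitstream: a second-order fixed predictor (extrapolating a straight line through the last two samples) followed by Rice coding of the residuals. As far as I know the real encoder chooses among several predictors and tunes the Rice parameter per partition; here the order and k=2 are simply hard-coded, and the function names are invented for the example.

```python
import math

def residuals_order2(samples):
    """Residuals of the predictor 2*s[n-1] - s[n-2]; first two samples kept raw."""
    res = list(samples[:2])
    for n in range(2, len(samples)):
        res.append(samples[n] - (2 * samples[n - 1] - samples[n - 2]))
    return res

def restore_order2(res):
    """Rebuild the samples from the residuals produced above."""
    out = list(res[:2])
    for n in range(2, len(res)):
        out.append(res[n] + 2 * out[n - 1] - out[n - 2])
    return out

def rice_encode(values, k):
    """Rice-code signed integers into a string of '0'/'1' characters (k >= 1)."""
    out = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1      # zigzag: signed -> unsigned
        q, r = u >> k, u & ((1 << k) - 1)
        out.append("1" * q + "0" + format(r, "0{}b".format(k)))  # unary q, k-bit r
    return "".join(out)

def rice_decode(bits, k, count):
    """Decode `count` values written by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                                  # skip the terminating '0'
        r = int(bits[pos:pos + k], 2)
        pos += k
        u = (q << k) | r
        values.append(u // 2 if u % 2 == 0 else -((u + 1) // 2))
    return values

samples = [round(1000 * math.sin(2 * math.pi * i / 200)) for i in range(400)]
res = residuals_order2(samples)
bits = rice_encode(res[2:], k=2)
decoded = res[:2] + rice_decode(bits, k=2, count=len(res) - 2)
assert restore_order2(decoded) == samples  # lossless round trip
```

For a smooth signal like this one the residuals are tiny, so each costs only a few bits instead of a full 16-bit sample.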

------
eliteraspberrie
To improve the calculation of the autocorrelation, first 'demean' the input
sequence by subtracting its mean from it, like so:
[https://gist.github.com/anonymous/a293bdbfa9133a607dd1](https://gist.github.com/anonymous/a293bdbfa9133a607dd1)

The reason being that autocorrelation (and convolution in general) is
meaningful when the input signal is linear time invariant. Read more about
that here:
[https://en.wikipedia.org/wiki/LTI_system_theory#Impulse_resp...](https://en.wikipedia.org/wiki/LTI_system_theory#Impulse_response_and_convolution_2)
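
To illustrate (this is not the code from the gist, just a small sketch with made-up names and a synthetic signal): with a plain, unnormalised autocorrelation sum, a large constant offset swamps the periodic part, so the "best" lag is simply the shortest one searched. After subtracting the mean, the true period of 50 samples shows up.

```python
import math

def autocorr(x, lag):
    """Unnormalised autocorrelation of x at the given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))

def demean(x):
    """Subtract the mean so a constant offset no longer dominates."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def best_lag(x, lo, hi):
    """Lag in [lo, hi) with the largest autocorrelation."""
    return max(range(lo, hi), key=lambda lag: autocorr(x, lag))

# A small 50-sample-period tone riding on a large constant offset.
x = [1000 + 10 * math.sin(2 * math.pi * i / 50) for i in range(500)]

raw_lag = best_lag(x, 10, 120)             # dominated by the offset
clean_lag = best_lag(demean(x), 10, 120)   # reflects the actual period
print(raw_lag, clean_lag)
```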

------
dtech
Now while my intuition would certainly say that autocorrelation is a good
candidate property to be used in (lossless) audio compression, I remain
sceptical.

It is certainly not a recently discovered or difficult concept to grasp, so if
it was that useful I would think that it was already included in things like
FLAC...

~~~
dspig
Having tried similar things myself: while autocorrelation looks useful for
repeating waveforms, real waveforms at any point usually have more in common
with the preceding few samples than with the same point in the previous cycle
(which is what LPC exploits when it predicts the next sample from the previous
ones). One reason for this is that the cycle length will usually not be a whole
number of samples, so you would have to interpolate between samples to find the
same point in a previous cycle, and even then it's not an improvement over LPC.
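
A toy comparison of the two predictors, assuming a pure tone with a period of 100.5 samples. The short-term predictor here is a fixed two-tap line extrapolation rather than a fitted LPC filter, and a pure tone flatters it, so treat the numbers as illustrative only:

```python
import math

period = 100.5  # deliberately not a whole number of samples
x = [round(10000 * math.sin(2 * math.pi * i / period)) for i in range(2000)]
lag = 100       # nearest whole-sample lag (101 is equally far off)

def energy(residual):
    """Sum of squared prediction errors."""
    return sum(e * e for e in residual)

# Predict each sample from the same point one (rounded) cycle earlier.
cycle_residual = [x[i] - x[i - lag] for i in range(lag, len(x))]
# Predict each sample from the two samples immediately before it.
short_residual = [x[i] - (2 * x[i - 1] - x[i - 2]) for i in range(lag, len(x))]

print(energy(cycle_residual), energy(short_residual))
```

The half-sample mismatch alone leaves the previous-cycle prediction with a much larger error than the short-term one on this input.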

~~~
diydsp
Yes indeed. In fact one of the key reasons music appeals to the mind is that
it's just entropic enough to supply the brain with continuous new artifacts to
challenge its compression/organization scheme.

Btw, I like your handle. What kinds of things have you worked on? You should
post some links in your profile!

------
qq66
Fun, but a good media compression format has other properties like: 1) low
cost of decoding in terms of RAM/CPU etc. 2) ability to decompress an
arbitrary amount of the audio from any starting point in the file, 3) ability
to start decoding before the entire file is present.

------
aobuke
What is the difference between audio-based and fit-based compression methods?
The two can be transformed into each other easily...

------
zbowling
What kind of Weissman score does this get?

~~~
chrissyb
ctrl-f, "weissman", this. Gold!

------
billylindeman
But what is its Weissman score?

~~~
zyang
2.89

------
marksands07
Wow, a Weissman Score of 5.2!

------
aobuke
Jj

