
Show HN: Trigrad, a novel image compression with interesting results - ruarai
http://ruarai.github.io/Trigrad/
======
kobigurk
This is the difference between someone who actually does something and
academic work that claims to achieve something. This is half-done, but it
WORKS and you can use it and understand it right now.

I had the opportunity to try to implement a "novel" algorithm for image
downscaling. I contacted the authors - one replied that he couldn't reveal the
source code, and the other didn't reply. So I went ahead and invested about two
weeks implementing and optimizing it to the point where it worked - but the
results were far from what we wanted. If they had just supplied a demo program
where I could check whether it worked for our case, it would have been much
better.

~~~
bastijn
I mean the author no harm, nor do I want to talk badly about his work. His
work is very cool and I like the amount of information that he gives. Yet,
this is far from comparable to academic work. I am inclined to say that you
mixed up the sides in your statement, IMHO.

Academic work would have explained the benefit of the algorithm. It would have
presented it with a side-by-side comparison against common algorithms and
explained its pros and cons relative to them. It would have covered aspects
like quality, size, and performance, to name a few. It would have told me
whether I could use this new algorithm in my field, and why (or why not). None
of this is present in the work shared here. You say this is half done; I would
say it is not even 20% done.

To end on friendlier terms, I completely agree that more academic work should
have been made available. Yet, I know the pressure in that world and can
understand keeping it for yourself for a while. More often than not you will
have to drag out a couple of other papers in the same field.

~~~
tjradcliffe
This is the traditional model of academic publishing that was driven by the
limited communication and collaboration ability of the time. A study like the
one you describe would be done by multiple co-authors (I don't think I've
written an image processing paper that doesn't have at least three people as
authors, all of whom actually contributed to the work one way or another.)

Furthermore, the traditional paper would have to make a lot of guesses about
the kinds of images people were likely interested in and the range of
characteristics that mattered. What kind of noise spectrum do your images
have, how does increasing contrast affect things, what about the spatial
frequency distribution in the images themselves, and so on... Different fields
have radically different "typical" images, and the attempts at covering a
reasonable range of them in traditional papers were necessarily quite limited.

Instead, I see this model of publication as exploiting the possibilities of
the 'Net to allow more effective communication and collaboration. And it _is_
publication: it is making public, which is what makes the difference between
science and alchemy... if there had been a "Proceedings of the Alchemical
Society" we'd have had chemistry a thousand years ago.

What this model of publication does not (yet) have is a reputation mechanism,
but it isn't clear it needs one, because you can see the results (and the
code) for yourself. As such, I think the author has not only done something
interesting in the image compression space, but is also pointing the way to
the future of scientific publication.

Measuring this model as if it could be described as a certain amount of
progress along a line toward the old endpoint is mistaken. This is a paradigm
shift, and the models are incommensurable.

~~~
pezzep
> What this model of publication does not (yet) have is a reputation
> mechanism, but it isn't clear it needs one, because you can see the results
> (and the code) for yourself. As such, I think the author has not only done
> something interesting in the image compression space, but is also pointing
> the way to the future of scientific publication.

The original post is certainly interesting, but that doesn't mean it extends
our knowledge of image processing. For example, see this 20-year-old paper
that proposes the same idea:

[https://www.cs.cmu.edu/~./garland/scape/scape.pdf](https://www.cs.cmu.edu/~./garland/scape/scape.pdf)

This is something peer review would pick up on... That said, I don't mean to
discourage the author. It's a great idea and nicely presented!

~~~
mc808
Maybe OP or someone else will take the initiative to add that and other
reference material to the project's wiki.

------
a_e_k
Garland and Heckbert had a nice algorithm for this sort of thing in their 1995
paper, "Fast Polygonal Approximation of Terrains and Height Fields." The paper
is mainly devoted to height fields, obviously, but at the end they demonstrate
that their algorithm is also effective at triangulating color images for
Gouraud shading.

I'd be curious to know how this stacks up in terms of speed and quality.

EDIT: Oh yes, and there's also "Image Compression Using Data-Dependent
Triangulations" and "Survey of Techniques for Data-dependent Triangulations
Approximating Color Images", both by Lehner et al., 2007. I don't mean to
discourage you here, just pointing out the bar to be beaten. It's a cool idea.

~~~
jacobolus
To the OP: There are also several other tools for scattered data
approximation/interpolation developed over the last few decades, both
mesh-based and mesh-free. Linear interpolation using barycentric coordinates
on a triangulation is fast (and might be the most practical method for this
particular use case), but gives nowhere near as good a result as you can get
via other methods.

See e.g.
[http://scribblethink.org/Courses/ScatteredInterpolation/scat...](http://scribblethink.org/Courses/ScatteredInterpolation/scatteredinterpcoursenotes.pdf)

~~~
ruarai
Not sure if that applies for my purposes, since I'm not actually using linear
interpolation with barycentric coordinates (I don't think that's possible).
The barycentric coordinates supply the gradient within themselves.

I may have to read further, though. That's a lot of math.

~~~
jacobolus
What you’re calling a gradient is also known as linear interpolation.

------
TheLoneWolfling
I have an improvement to this:

Run the edge detection _twice_. That way you get better gradients along sharp
edges.

Single run: [http://i.imgur.com/kusJDRo.png](http://i.imgur.com/kusJDRo.png)

Double run: [http://i.imgur.com/rHYSzpq.png](http://i.imgur.com/rHYSzpq.png)

To me, at least, the double looks better. Especially in the stems.

Just add the following to FrequencyTable.cs, after line 18 (var edges = ...)

    
    
        detector = new SobelEdgeDetector();
        edges = detector.Apply(edges);

~~~
scottfr
The double run does look a lot better.

------
pjtr
Since the order of the samples doesn't matter, could you sort them somehow so
the Gzipped stream of samples can be compressed better? (E.g. sort the color
index by component average, by red, by maximum component, ... and sort the
point index by color.)

Have you tried struct-of-array (AAA...BBB...) instead of array-of-struct
(ABABAB...) layouts?
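
For illustration, here's a minimal sketch of the struct-of-arrays idea in C# -
the Sample struct and its fields are hypothetical, not Trigrad's actual types:

    using System.Collections.Generic;
    using System.IO;

    // Hypothetical sample record; Trigrad's real representation may differ.
    struct Sample { public ushort X, Y; public byte R, G, B; }

    static class SoaWriter
    {
        // Write each field as its own contiguous run (XX..YY..RR..GG..BB..)
        // so the gzip stage sees long stretches of similar bytes.
        public static byte[] Serialize(IList<Sample> samples)
        {
            using (var ms = new MemoryStream())
            using (var w = new BinaryWriter(ms))
            {
                foreach (var s in samples) w.Write(s.X);
                foreach (var s in samples) w.Write(s.Y);
                foreach (var s in samples) w.Write(s.R);
                foreach (var s in samples) w.Write(s.G);
                foreach (var s in samples) w.Write(s.B);
                w.Flush();
                return ms.ToArray();
            }
        }
    }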

~~~
ruarai
I tried your struct-of-array idea, and it produced an okay improvement of
~1%.

Sorting them seems tough, as the index of each value must match across
channels, so any sorting would have to occur beforehand. Except the data is
already sorted by x-y values, and my attempts otherwise have failed to produce
results.

~~~
pjtr
I'm missing where they are sorted by x-y values.

I did a quick and dirty experiment:
[http://pastebin.com/NjZNRjw1](http://pastebin.com/NjZNRjw1)

Seems about 30% smaller than before on
[http://i.imgur.com/5zwCEF5.png](http://i.imgur.com/5zwCEF5.png)

~~~
ruarai
Wow, that works pretty well. I was mistaken in thinking that either the
Dictionary class or the process of sampling would sort them.

Mind if I merge that? Or you could submit a pull request. Either would be
great!

Also, do you know of any resources for learning about how to optimise for gzip
compression? Google is just telling me about compression for websites.

~~~
pjtr
Sure, feel free to merge.

I don't know any resources specifically about gzip compression. Demosceners
have very practical and fun compression know-how, so maybe look into:
[http://www.farbrausch.com/~fg/seminars/workcompression.html](http://www.farbrausch.com/~fg/seminars/workcompression.html)

------
arketyp
This makes me feel obligated to share my ongoing master's thesis project.
Among other things, my approach is the reverse of this, namely decimating a
full-detail mesh using edge/ridge detection.

[https://femtondev.wordpress.com/2014/12/18/not-delaunay/](https://femtondev.wordpress.com/2014/12/18/not-delaunay/)

[https://femtondev.wordpress.com/2014/12/12/principal-component-analysis/](https://femtondev.wordpress.com/2014/12/12/principal-component-analysis/)

------
userbinator
Nice pictures, but they should really have indicated the filesize for the
3000-sample version and given more details about this part:

 _the samples can be saved and zipped up_

Depending on the algorithm, the results could vary wildly - there could be
some characteristic of the samples that makes them encodable in a smaller or
more easily compressible way.

It also reminds me of this:

[http://codegolf.stackexchange.com/questions/50299/draw-an-image-as-a-voronoi-map](http://codegolf.stackexchange.com/questions/50299/draw-an-image-as-a-voronoi-map)

------
JoshTriplett
Very impressive!

How might the final rendering look if it used some of the standard triangle
shading techniques? Treat the sample points as coordinates in a mesh, assign
colors to those coordinates based on what you sampled, then interpolate colors
for the points between those coordinates using something like Gouraud or Phong
shading (without the lighting). That might produce a satisfying result with
fewer samples.

I wonder if this could be used as an image resizing mechanism? Take a large
number of samples, then render the resulting image using those samples and a
smaller or larger size. Or, generalizing further: turn the image into samples
and associated colors, apply a transform to the sample coordinates, then
render.

This also reminds me quite a bit of the algorithm used in
[http://research.microsoft.com/en-us/um/people/kopf/pixelart/paper/pixel.pdf](http://research.microsoft.com/en-us/um/people/kopf/pixelart/paper/pixel.pdf)
(for which, sadly, code is not available). I wonder if some of the techniques
from there could improve the quality of the results with fewer samples?

~~~
1wd
That's exactly what it does, no? (Standard triangle shading technique,
interpolating colors across the mesh, Gouraud shading without the lighting.)
Phong shading (interpolating normal vectors) wouldn't make sense, as the mesh
has no normals.

~~~
JoshTriplett
It isn't obvious from the article that the color interpolation used here
matches Gouraud.

------
CyberDildonics
This is neat and the illustrations are great. A few things that will probably
give large gains while being low-hanging fruit:

1. There are triangle interpolation schemes out there now that are smoother
than barycentric coordinates, which should give much better results.

2. Look up DDT - data-dependent triangulation. It switches edges to connect
points to neighbors that have similar values. It will get rid of some of the
spikiness and leave smoother gradients.

3. Running the edge detection twice, as mentioned in the comments, works
because you want the change of the gradient, and you need both sides
represented. So the double edge detection will give you manifolds, which is
good.

4. Instead of having arbitrary vertex positions, you can just specify the
offset to the next point. Then instead of an x and y value you can use one
(possibly uint8_t) value to encode where the next point will go (see the
sketch after this list).

5. You can also chop some accuracy off the colors. In RGB, you can lose
accuracy in blue and some in red. In other color spaces you can keep accuracy
in luminance and lose it heavily in hue and chroma/saturation, etc.
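
As a rough sketch of point 4 (illustrative only, not how Trigrad currently
stores its points): sort the points in scanline order and store just the delta
between successive flattened pixel indices, which is usually a small number.

    using System.Collections.Generic;
    using System.Linq;

    static class PointDeltaEncoder
    {
        // Flatten each (x, y) to y * width + x, sort ascending, then store
        // successive differences; for dense samplings most deltas are tiny.
        public static List<int> Encode(IEnumerable<(int X, int Y)> points, int width)
        {
            var sorted = points.Select(p => p.Y * width + p.X)
                               .OrderBy(i => i)
                               .ToList();
            var deltas = new List<int>();
            int previous = 0;
            foreach (int index in sorted)
            {
                deltas.Add(index - previous);
                previous = index;
            }
            return deltas;
        }
    }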

~~~
TheLoneWolfling
W.r.t. running the edge detection twice: is there a name for this operation,
i.e. finding points that are near an edge but not on it?

~~~
a_e_k
The Laplacian.

~~~
TheLoneWolfling
Weird. I hadn't made the connection with Physics there. I suppose it is a
general mathematical operator.

Thanks!

~~~
a_e_k
Yes, it's not exactly equivalent here but it's pretty close. Basically, Sobel
kernels are analogous to the 1st derivative and the Laplacian is analogous to
the 2nd. You'll also often see the Laplacian combined with a Gaussian (the
"LoG" operator) for pre-smoothing since it tends to be particularly sensitive
to any noise.
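
For reference, here's the textbook 3x3 Laplacian as a finite-difference
kernel (a bare sketch, not AForge's implementation, and without the Gaussian
pre-smoothing):

    static class LaplacianSketch
    {
        // Discrete Laplacian at (x, y) using the kernel
        //      0  1  0
        //      1 -4  1
        //      0  1  0
        // i.e. a finite-difference estimate of the second derivative.
        // Caller is responsible for staying away from the image border.
        public static double At(double[,] image, int x, int y)
        {
            return image[y - 1, x] + image[y + 1, x]
                 + image[y, x - 1] + image[y, x + 1]
                 - 4.0 * image[y, x];
        }
    }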

~~~
TheLoneWolfling
Being a second derivative, I suppose it would be.

So does that mean that there are other kernels that approximate derivatives
"better"? Like with finite differences?

------
atinoda-kestrel
Hmmm... I wonder if there's potential to use this concept for video. Not this
implementation, naturally, but the concept.

I'm a little too sleep deprived to think through this entirely, but just off
the top of my head, it seems like over the course of a few frames an edge in
motion would wind up as a series of triangles where one of the points remains
static while the other two shift away from it -- in other words, the Trigrad
approach would yield motion blur as an artifact. And then when the next key
frame comes along, the remaining point gets reselected, probably further along
the motion path... So, much like a normal differential approach, you wouldn't
need to store all of the point locations each frame, just the ones that change
and which triangle they belong to.

It might be hard to make it stream-friendly though, since obviously the
compression efficiency depends heavily on the storage structure (see pjtr's
comments)...

~~~
a_e_k
There's been some work on encoding video with triangles. See "Video
Compression Using Data-Dependent Triangulations"
([http://www-home.htwg-konstanz.de/~umlauf/Papers/cgv08.pdf](http://www-home.htwg-konstanz.de/~umlauf/Papers/cgv08.pdf))
for example.

------
slavik81
I really like the look at low sample rates, but the stems on the flowers look
rather jaggy even with 100,000 samples.

It would be nice to have the original image for a more detailed comparison. As
beautiful as the picture is, the depth-of-field effect makes comparison
between the left and right sides of the image a little tricky.

------
JosephRedfern
How are the sample points selected? I get that they're weighted according to
edge intensity, but what kind of distribution are you using in cases where
there is no edge?

EDIT: I've read the code - it seems to be using random sampling. Still not
entirely sure how a point can be placed at a place with absolutely no Sobel
response - maybe it can't, which would make sense. My question arose after
looking at: [https://i.imgur.com/9YHOtQ0.png](https://i.imgur.com/9YHOtQ0.png)
and then [https://i.imgur.com/XRF7mz4.png](https://i.imgur.com/XRF7mz4.png).
It looks like samples have been placed in regions with no response, but
perhaps my eyes just can't see the edges.

~~~
ruarai
This is an area that needs work, but basically there's the table of 'edge
intensity' that gets multiplied by a constant baseChance variable for every
pixel.

baseChance = 8 * samples / (width * height)

samples is the desired number of samples.
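
Roughly, in C# (variable names are illustrative, not necessarily those in the
Trigrad source), the sampling looks something like this:

    var rng = new Random();
    var points = new List<Point>();
    double baseChance = 8.0 * samples / (width * height);

    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            // keep a pixel with probability proportional to its edge intensity
            if (rng.NextDouble() < edgeIntensity[y, x] * baseChance)
                points.Add(new Point(x, y));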

Again, this needs work. I'm pretty sure there's a way I can more accurately
match the number of output samples to the desired number of samples.

Edit: And no, a sample can't be placed where there's no edge (a value of
zero). However, there's always some level of noise. This is true of the
example you've shown - there's very light noise visible.

~~~
JosephRedfern
I see - thanks for the explanation!

Have you tried using Canny edge detection rather than Sobel? It might help a
bit with the noise. I've just attempted it, but can't seem to get Canny
working under Mono (I don't have a Windows machine at the moment).

------
mainguy
Not sure exactly, but wouldn't these images theoretically scale up better
than JPEG? I.e., making a 600x800 image out of a moderately compressed 300x400
one seems like it would potentially scale better than JPEG does (for some
types of images).

~~~
evan_
I think you'd just start to notice the fuzzy artifacts more in a larger image
and need to have more and denser samples to make up for it.

------
joshu
Some more explanation about barycentric coordinates would be appreciated.

~~~
1wd
Barycentric coordinates are "weights" (u, v, w) that determine a point P on a
triangle (A, B, C) by weighting the triangle corner points. You can calculate
the cartesian coordinates of P = u * A + v * B + w * C.

Since the weights satisfy u+v+w = 1, you actually don't need all three: u =
1-v-w, so P = (1-v-w) * A + v * B + w * C = A + v * (B - A) + w * (C - A).
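
For a concrete example (the generic textbook formula, not necessarily
Trigrad's code), the weights for a point P inside triangle (A, B, C) can be
computed like this:

    static class Barycentric
    {
        // Returns (u, v, w) such that P = u*A + v*B + w*C and u + v + w = 1.
        public static (double U, double V, double W) Weights(
            (double X, double Y) p, (double X, double Y) a,
            (double X, double Y) b, (double X, double Y) c)
        {
            double denom = (b.Y - c.Y) * (a.X - c.X) + (c.X - b.X) * (a.Y - c.Y);
            double u = ((b.Y - c.Y) * (p.X - c.X) + (c.X - b.X) * (p.Y - c.Y)) / denom;
            double v = ((c.Y - a.Y) * (p.X - c.X) + (a.X - c.X) * (p.Y - c.Y)) / denom;
            return (u, v, 1.0 - u - v);
        }
    }

The interpolated color at P is then u * colorA + v * colorB + w * colorC,
applied per channel.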

~~~
joshu
Gotcha. That makes sense.

------
tgb
Reading how this works, I'm a bit disappointed in how the low-sample images
perform, particularly for the stems. It feels like one should be able to tweak
this to get much better results in this situation. I'm disappointed in the
amount of yellow over the stem.

Here's my first thought on what could improve that: samples are taken at
edges, which is exactly where colors vary quickly. So perhaps samples should
be taken in pairs, one on either side of the edge.

Fun project!

~~~
TheLoneWolfling
The "easy" way to do this is to run the edge detection _twice_. In other
words, run edge detection, then run edge detection on the result of the edge
detection.

------
return0
Congrats! but beware of Gavin Belson.

------
coolvision
What about thresholding the gradients (with an adaptive threshold), storing
the resulting edges alongside (compressed with RLE, for example), and then
blending them over the triangle gradients? It could remove some artifacts.
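
For the RLE part, a minimal encoder for a thresholded (binary) edge mask might
look like this - an illustrative sketch, not tied to Trigrad's code:

    using System.Collections.Generic;

    static class RunLength
    {
        // Encode a binary mask as alternating run lengths; by convention the
        // first run counts 'false' pixels (it may be zero).
        public static List<int> Encode(IReadOnlyList<bool> mask)
        {
            var runs = new List<int>();
            bool current = false;
            int length = 0;
            foreach (bool bit in mask)
            {
                if (bit == current) { length++; continue; }
                runs.Add(length);
                current = bit;
                length = 1;
            }
            runs.Add(length);
            return runs;
        }
    }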

------
rincewind
Can you try it with the Lenna benchmark image and post the results, please?

~~~
ruarai
Here's Lenna reconstructed after 30,000 samples.
[http://i.imgur.com/hlPonsO.png](http://i.imgur.com/hlPonsO.png)

The compressed data comes out to 303KB, which isn't that great. It's a pretty
noisy image.

------
bdamos
How does the speed compare to other compression algorithms?

~~~
ruarai
Currently compressing the example image (the one of the flowers) with 100,000
samples takes 1.5 seconds + 1.5 seconds for AForge's edge detection.

I'm sure this could be sped up by a huge amount if I had a not-terrible CPU
or if I did some major refactoring to use the GPU.

~~~
ekianjo
How does it compare in quality/size vs. PNG? How about using examples that
JPG is traditionally bad at, such as pictures with dark gradients leading to
blocky artifacts? Would this process be more efficient there?

~~~
darkmighty
PNG is lossless, so I don't get the quality comparison.

I don't think this approach can compete with JPEG and newer transform-based
variants for natural photos (even at edge cases), but it seems like it would
be nice for lossy compression of logos/general internet pics.

~~~
ekianjo
> PNG is lossless, so I don't get the quality comparison.

The point is: is this technique in between JPEG and PNG in terms of
quality/size, or is it worse than JPEG altogether?

------
imaginenore
Try using something better than Zip. LZMA2/PPMd are very good.

This is what I use for backups:

    
    
        7z a -m0=PPMd:mem=256m -mx9 archive.7z file_to_compress

------
boterock
Haven't you tried applying the edge detection filter twice? I think this way
the samples are taken not on the edge but before and after the edges; maybe it
will end up with a blurrier image but with fewer artifacts.

~~~
ruarai
This doesn't seem to have any benefit unfortunately. It seems like the current
approach already produces the before-after edge effect.

------
akhilcacharya
This is very cool!

------
octatoan
"Not really well enough to try to push onto people. Current results show that
Trigrad can only beat something like JPEG with optimal conditions."

?

