

How to run a state-of-the-art image classifier on an iPhone - liuliu
http://libccv.org/post/how-to-run-a-state-of-the-art-image-classifier-on-a-iphone/

======
bglazer
"The beautiful part of using PNG file to store parameters is that all
optimization techniques available to PNG is now available to us. The generated
PNG file can be further reduced with pngcrush -brute. In fact, 10% data file
size reduced with pngcrush."

I thought this was a cool hack for reducing the memory size of the model. Is
this an established technique for storing parameters? Are there known better
ways to compress the model?

~~~
Houshalter
Compression works by making assumptions about what the data should look like.
You can't compress random data, for example. In the case of images, it usually
takes advantage of the fact that nearby pixels are very similar. Obviously
that isn't true for NN parameters.
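
To illustrate that point, here is a toy Python sketch (mine, not from the
article): zlib barely shrinks uniform random bytes, but crushes a smooth ramp
where neighboring values are similar, the way neighboring pixels are.

    import os
    import zlib

    n = 1 << 20                                   # 1 MiB of each kind of data
    random_data = os.urandom(n)                   # no structure to exploit
    smooth_data = bytes((i // 4096) % 256 for i in range(n))  # slowly varying "pixels"

    # Ratios near 1.0 mean "no compression achieved".
    print(len(zlib.compress(random_data, 9)) / n)   # ~1.0
    print(len(zlib.compress(smooth_data, 9)) / n)   # far below 1.0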

~~~
liuliu
Yes, it is pretty noisy. My assumption is that PNG is pretty good at encoding
2D patterns too, and there are some patterns:
[https://github.com/liuliu/klaus/blob/master/ccv/ccvResources...](https://github.com/liuliu/klaus/blob/master/ccv/ccvResources/layer-11.png)
This is a big hack though. I am pretty sure there are better dictionary-based
compression methods for the last step, but pngcrush already gives me a nice
small gain :)
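
Roughly, the idea looks like this (a simplified sketch for illustration, not
the actual klaus pipeline, using NumPy + Pillow; the function name is just
made up): quantize the float weights to 8 bits, pack them into a square
grayscale image, and let PNG's lossless filters + DEFLATE do the last step.

    import numpy as np
    from PIL import Image

    def weights_to_png(weights, path):
        """Quantize a 1-D float array to 8 bits and store it as a grayscale PNG.
        Returns (scale, offset) needed to approximately invert the quantization."""
        lo, hi = float(weights.min()), float(weights.max())
        scale = (hi - lo) / 255.0 or 1.0
        codes = np.round((weights - lo) / scale).astype(np.uint8)
        side = int(np.ceil(np.sqrt(codes.size)))          # pack into a square image
        padded = np.zeros(side * side, dtype=np.uint8)
        padded[:codes.size] = codes
        Image.fromarray(padded.reshape(side, side), mode="L").save(path, optimize=True)
        return scale, lo

    scale, offset = weights_to_png(np.random.randn(4096 * 1024).astype(np.float32),
                                   "layer.png")
    # The resulting file can then be squeezed further with `pngcrush -brute`.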

~~~
Houshalter
It's literally indistinguishable from random static:
[http://silverspaceship.com/static/shot_2.png](http://silverspaceship.com/static/shot_2.png)
My guess is it may be making some small gains by learning "pixels are 20% more
likely to be black than white" or something like that.

A dictionary wouldn't work either, unless you think the NN is more likely than
random to produce sequences of bits that are exactly the same.

There may be regularities in deep NNs. For example, many learned features are
basically identical except rotated or translated. This is because NNs aren't
inherently invariant to rotations or translations, so they have to learn a
separate feature for every possible variation. Another possible regularity is
that the weights tend to cluster on specific sections of the input image and
ignore the rest.

But this entirely depends on how the weights are projected into an image. The
image provided looks like random static. The standard method is to create a
separate image for every feature, where each pixel is the weight connecting
that input pixel to the feature. Like this:
[http://fastml.com/images/deep-learning-made-easy/layer2visua...](http://fastml.com/images/deep-learning-made-easy/layer2visualization.png)

Standard image compression software should be able to work with that. But this
won't work for higher layers, because an image only has 3 or 4 color channels,
while higher-layer features take many more input channels.
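
A rough sketch of that visualization (my own, with assumed shapes: a first
layer over 28x28 inputs, one tile per hidden unit):

    import numpy as np
    from PIL import Image

    def features_to_tiles(W, img_side=28, cols=16):
        """W: (n_features, img_side * img_side) weight matrix; one tile per feature."""
        rows = int(np.ceil(W.shape[0] / cols))
        canvas = np.zeros((rows * img_side, cols * img_side), dtype=np.uint8)
        for i, w in enumerate(W):
            tile = w.reshape(img_side, img_side)
            tile = (tile - tile.min()) / (tile.max() - tile.min() + 1e-8) * 255
            r, c = divmod(i, cols)
            canvas[r * img_side:(r + 1) * img_side,
                   c * img_side:(c + 1) * img_side] = tile.astype(np.uint8)
        return Image.fromarray(canvas, mode="L")

    features_to_tiles(np.random.randn(256, 784)).save("layer1_features.png")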

In fact, since the first layer in that example is just Gabor filters, you
could manually write a program to generate Gabor filter units for that layer
and get massive compression. I don't know if that's true for all NNs though.
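
A hedged sketch of that last idea (the filter sizes and parameter values here
are made up): regenerate a first-layer filter bank from a handful of Gabor
parameters instead of storing every kernel, assuming the trained kernels are
close enough to ideal Gabors.

    import numpy as np

    def gabor(size=11, theta=0.0, wavelength=4.0, sigma=2.5):
        """Return a size x size Gabor kernel with the given orientation and scale."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)
        envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
        return envelope * np.cos(2 * np.pi * xr / wavelength)

    # An entire "layer" described by 8 orientations x 3 wavelengths of metadata.
    bank = np.stack([gabor(theta=t, wavelength=w)
                     for t in np.linspace(0, np.pi, 8, endpoint=False)
                     for w in (3.0, 5.0, 8.0)])
    print(bank.shape)  # (24, 11, 11)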

~~~
liuliu
The convolutional layers' parameters are not interesting for compression; they
are very small compared to the massive fully connected layers.

In retrospect, the fully connected layers' regularity probably just reflects
that some neurons are more dead than others.

You probably always want a dictionary-based compression method (LZ-like) as
the last step, to see how much more you can squeeze out. Before that,
quantization, residual quantization, and some clever transformations can
probably carry you much further in compressing the fully connected layer
parameters. I haven't explored any of these fancier methods yet.
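
To make the quantization idea concrete, a toy two-stage sketch (illustration
only, not my actual pipeline): stage one stores a coarse 8-bit code per
weight, stage two stores another 8-bit code for whatever stage one missed.

    import numpy as np

    def quantize(x, bits=8):
        """Uniform quantization: returns the integer codes and the dequantized values."""
        lo, hi = float(x.min()), float(x.max())
        step = (hi - lo) / (2 ** bits - 1)
        codes = np.round((x - lo) / step).astype(np.uint8)
        return codes, codes * step + lo

    w = np.random.randn(1_000_000).astype(np.float32)
    codes1, w_hat = quantize(w)                    # stage 1: coarse codes
    codes2, r_hat = quantize(w - w_hat)            # stage 2: codes for the residual
    print(np.abs(w - w_hat).mean())                # error after one stage
    print(np.abs(w - (w_hat + r_hat)).mean())      # much smaller after the residual stage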

I just finished running xz / gzip at their best settings on my quantized fully
connected layer output to see whether pngcrush's gain was an illusion. For the
last two fully connected layers, pngcrush, xz, and gzip all produce files of
about the same size, which suggests a very limited compression ratio. On the
first fully connected layer, pngcrush (8.2M) is marginally better than xz
(8.6M). It is a reasonable choice for the last step if you don't want a
library dependency on xz ;)

~~~
Houshalter
Also, you can still compare neurons to see if any of them are very similar.
Another thing that might work would be to prune away many of the connections:
set all of the connections that are very small to zero, or encourage the
network to set weights to zero with something like weight decay. As a bonus,
this might even improve generalization.
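
A quick sketch of the pruning idea (the shapes and the 90% threshold are
arbitrary choices of mine): zero out the smallest weights and keep the layer
in a sparse format, which stores far fewer bytes.

    import numpy as np
    from scipy import sparse

    W = np.random.randn(4096, 4096).astype(np.float32)
    threshold = np.percentile(np.abs(W), 90)       # keep only the largest 10% of weights
    W_pruned = np.where(np.abs(W) >= threshold, W, 0.0).astype(np.float32)
    W_sparse = sparse.csr_matrix(W_pruned)

    dense_bytes = W.nbytes
    sparse_bytes = (W_sparse.data.nbytes + W_sparse.indices.nbytes
                    + W_sparse.indptr.nbytes)
    print(dense_bytes, sparse_bytes)               # roughly a 5x reduction here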

------
oostevo
Nice!

When I was playing with convolutional neural networks a while ago, I found the
following tutorial to be a good introduction to the topic:
[http://deeplearning.net/tutorial/lenet.html](http://deeplearning.net/tutorial/lenet.html)

------
Someone1234
This seems like it would also have applications when utilised on a large
cluster of inexpensive machines (for "cloud" based services and similar).

------
WhitneyLand
It's an intriguing result, good job. Are there many non-edge cases where you'd
prefer this over a cloud service?

~~~
sp332
If I don't feel like uploading my data to someone else's server? If I don't
want to pay for a cloud service when the phone I already bought can do the
job?

------
steeve
On a side note, libccv is absolutely amazing, especially the SWT text
extractor.

------
mrfusion
How do you load this onto a phone?

~~~
sp332
Load this into Xcode:
[https://github.com/liuliu/klaus](https://github.com/liuliu/klaus)
Build the app and put it on your phone.

