Question: do you plan to support converting GPU nets to CPU, perhaps by keeping the weights and the architecture definition separate from PyCUDA-dependent structures during serialization?
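
To make the idea concrete, here is a rough sketch of the kind of separation I have in mind. It is not Hebel's API or file format: the function names `export_net` and `forward_cpu`, the JSON layout, and the tanh activation are all made up for illustration. The point is only that a plain architecture description plus plain weight lists can be reloaded and run without any GPU library.

```python
import json
import math

# Hypothetical format: architecture as a plain dict, weights as nested lists.
# Nothing here is Hebel's actual API or serialization format.

def export_net(layer_sizes, weights, biases):
    """Serialize architecture and parameters with no GPU-specific objects."""
    return json.dumps({
        "architecture": {"layer_sizes": layer_sizes, "activation": "tanh"},
        "weights": weights,  # one matrix per layer, shape (n_in, n_out)
        "biases": biases,    # one vector per layer, length n_out
    })

def forward_cpu(serialized, x):
    """Rebuild the net from the serialized form and run it on the CPU."""
    net = json.loads(serialized)
    for W, b in zip(net["weights"], net["biases"]):
        # Dense layer: x_out[j] = tanh(sum_i x[i] * W[i][j] + b[j])
        x = [math.tanh(sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j])
             for j in range(len(b))]
    return x

# On the GPU side, the weights would first have to be copied to host memory
# (for example with PyCUDA's GPUArray.get()) and converted to plain lists.
blob = export_net([2, 2, 1],
                  weights=[[[1.0, 0.0], [0.0, 1.0]], [[0.5], [-0.5]]],
                  biases=[[0.0, 0.0], [0.0]])
output = forward_cpu(blob, [0.3, 0.3])
```

A real version would presumably use NumPy arrays and a binary format rather than JSON, but the same split applies: only the export step needs to touch the GPU.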

I have found that using a trained net for preprocessing can be accomplished using very limited resources (read: Core 2 Duo laptop). This is one of the very nice features of DeCAF, which could allow for some interesting applications on embedded devices.

Great work by the way - I look forward to testing it out soon!

That would be possible, but since Hebel is mainly meant to be used in research, I don't think it's a big priority right now. The most important reason to do this would be to allow development on laptops and workstations without NVIDIA cards, and then to run the finished model on CUDA hardware later.

As far as embedded devices go (I assume you're talking about ARM CPUs, etc.), they are probably too underpowered to run neural nets anyway, or the models would have to be written in highly specialized C.

