Not if you want to stay afloat. 100% of your time will go to serving your customers, saving for a downturn, doing admin, customer acquisition, staying in touch with old customers who are not currently in the market but who may ping you for the occasional question and so on. The other 100% of your time you can spend on whatever you want.
Question for the HN audience: what exactly would Deep Dream do if it were trained on 'everything'?
When it came out, people asked why its output was full of puppyslugs, and the answer was "Because it was trained primarily on a corpus of dog pictures."
Well, suppose that it was trained on a corpus of pictures of 'everything'. What would its output look like then? Would they look more or less like the input image?
Translating this to: If it was trained on a uniform corpus of pictures sampled across the space of "likely things people take pictures of" and then used on photos from that same distribution, what would you get?
The answer is: The same kind of things, but with a more balanced distribution across puppies and "other things."
Possibly the easiest way to think about how a deepdreamed image might look is to look at the microstructure of an image in one isolated patch, and then look at what that patch seems most like. Is a close-up, 50% crop of a button on a shirt like... a button? an eye? a donut? And then magnify the high-level concept evoked by that microstructure.
The key to getting the psychedelia of deepdreamed images is to stop at the right level of the DNN. If you go all the way to the top of an Inception-style classifier, you're left with a single output (or a small set of them): "surfer," "dog," "cat." Propagating back down from that just gives you the network's canonical input for that class -- see, for example, what Deepvis outputs: http://yosinski.com/deepvis
If you stop at the first layer, you basically get a lot of fancy edge detectors and sharp/unsharp masks. (The first layer is a lot of convolutions, so think about the effects you get by applying a convolutional filter in Photoshop.) It can look nice, but it's not exciting. See the examples in the Wikipedia article on kernels: https://en.wikipedia.org/wiki/Kernel_(image_processing)
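To make the "fancy edge detectors" point concrete, here's a toy sketch (the image and kernels are made up for illustration) showing the kind of hand-designed convolution kernels the Wikipedia article lists. Trained first-layer CNN filters end up looking a lot like these:

```python
import numpy as np
from scipy.signal import convolve2d

# A toy grayscale "image": a bright square on a dark background.
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0

# Classic hand-designed kernels, like the Photoshop-style filters mentioned above.
edge = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]])   # Laplacian-style edge detector

edges = convolve2d(img, edge, mode="same")

# The edge kernel responds only at the square's border; flat regions
# (background and the square's interior) cancel to zero.
print(edges[0, 0], edges[2, 2], edges[4, 4])  # -> 0.0 5.0 0.0
```

Applying a bank of kernels like this to every patch of the image is essentially all a convolutional first layer does; the interesting stuff comes from stacking many such layers.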
But if you stop in the middle, you get things that are mid-way between "pure concept" and "concrete feature", and that's where the cool happens. It's still localized within the image, but it's able to propagate up to entirely different types of images within that local region. So eyes in hands that are still kinda shaped like hands.
DD trained on a uniform set of images would have much more diversity in what it "imagines" in images. Christmas trees in forests; UFOs in pancakes; and lots and lots of human faces showing up everywhere (because real photos are heavy in human faces). Faces in faces with eyes in more faces in more eyes in more faces. ahh!
Great explanation, but wouldn't it be hard (essentially impossible) to train on "likely things people take pictures of"? The alternative would be "pure static".
Though if you tried, my guess would be that it would associate various color-change gradients, frequencies of hard edges, etc. with real photos.
If so, if you fed it a real photo and said "interpret", it would just give you the same photo back. If you fed it static, then you'd probably get some kind of abstract art back.
The question becomes interesting if you start with "things a baby would see" (probably mostly happy faces and eyes (wonderfully geometric) up close), which would train for certain aspects, and then moved on from there.
I suppose you could have people just walking around with GoPros on, capturing whatever falls in their field of 'vision'. That way, the corpus isn't the subjects that people find interesting, but something more like everyday things and background.
I'm not a CNN guy, but your question can't really make sense given my understanding of neural networks. In short, the output step of a NN has to include a classifier. That is, there's a matrix of probabilities that the NN generates. The higher a given probability, the more likely the input is to match the category associated with that value. For example, a NN trained to distinguish "light" from "dark" may output a matrix with value [0.3, 0.7].
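The "matrix of probabilities" above is just the output of a softmax over the last layer's raw scores. A minimal sketch (the two-class "light vs. dark" net and the logit values are hypothetical, chosen to reproduce the [0.3, 0.7] example):

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())  # subtract max for numerical stability
    return z / z.sum()

# Hypothetical raw scores from the final layer of a "light vs. dark" classifier.
logits = np.array([0.0, 0.847])  # chosen so the output is roughly [0.3, 0.7]
probs = softmax(logits)
print(probs.round(2))  # -> [0.3 0.7]
```

The output always sums to 1, which is why "train it on everything" runs into the fixed-size-output problem the next paragraph describes.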
To train a CNN on "everything" means you have to have an arbitrarily large output matrix. Can you list every possible category of everything? Probably not. Even if you could take a swing at it, it's hard to get enough data (and time!) to train the net on each category. Small datasets result in overfitting to the training data, and poor overall performance. How big a dataset would you need to properly train a sufficiently-sized CNN with an arbitrarily-sized classifier?
I think I understand what you're getting at - that network architectures have a fixed size output - but you're incorrect in saying that the final layer must be a classifier. In general, you can optimize any differentiable function with gradient descent, and the output does not have to be a probability distribution.
The original poster's question does make sense; he's asking what would happen if you trained the network on something like the ILSVRC dataset.
You'd get the input image back. A NN requires negatives: if you train it to accept "everything" as a positive, then you just get the original back. You need to define your negatives more precisely to get anything interesting.
My guess is that the network wouldn't recognize as much stuff effectively and might not look as identifiable. Like an ADHD brain with too much stuff going on.
You'd still get puppyslug-style things, but they would be combinations of everything the network knows about. You might get car-eyes in one place and tree-rockets in another.
It tries to converge onto something it recognizes.
Also a great look into how awesome Google management is for letting their software engineers explore spontaneous interests.