Absolutely possible, and it might even be a good idea, but my expectation is that the results won't be robust: the fakes will be uncovered by a slightly differently trained classifier, maybe even the same classifier architecture retrained from a different random initialization.
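Rough sketch of the kind of check I mean, in PyTorch on a toy 2D "real vs. fake" setup (all names and numbers here are made up for illustration): optimize a fake against classifier A, then see whether classifier B, identical except for its random seed, still flags it. In this low-dimensional toy the optimized point may just drift into the real cluster and fool both; the non-transfer effect I'm describing tends to show up with higher-dimensional inputs.

```python
import torch
import torch.nn as nn

def make_classifier(seed):
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

def train(model, x, y, steps=500):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x).squeeze(-1), y).backward()
        opt.step()
    return model

# Toy "real vs. fake" data: reals cluster around +1, fakes around -1.
torch.manual_seed(0)
real = torch.randn(200, 2) + 1.0
fake = torch.randn(200, 2) - 1.0
x = torch.cat([real, fake])
y = torch.cat([torch.ones(200), torch.zeros(200)])  # 1 = real, 0 = fake

clf_a = train(make_classifier(seed=1), x, y)
clf_b = train(make_classifier(seed=2), x, y)  # only the init differs

# Optimize a single fake sample until classifier A calls it "real".
z = (torch.randn(1, 2) - 1.0).requires_grad_(True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(
        clf_a(z).squeeze(-1), torch.ones(1))  # push A's logit toward "real"
    loss.backward()
    opt.step()

# Does the fake that fools A also fool the differently-initialized B?
with torch.no_grad():
    p_a = torch.sigmoid(clf_a(z)).item()
    p_b = torch.sigmoid(clf_b(z)).item()
print(f"P(real) under A: {p_a:.2f}, under B: {p_b:.2f}")
```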
Sounds like the fake-generating side is overfitting to that particular classifier. Would existing solutions to overfitting possibly fix this (though they'd make such a network even more expensive to train)?
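If "existing solutions" includes the model-space analogue of regularization, one option is to craft the fake against an ensemble of differently-seeded classifiers plus a held-out one, instead of a single model. A minimal, self-contained sketch (PyTorch, same toy setup as above, everything illustrative); as you suspect, the cost grows roughly linearly with the ensemble size:

```python
import torch
import torch.nn as nn

def make_and_train(seed, x, y, steps=500):
    torch.manual_seed(seed)
    clf = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(clf.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.binary_cross_entropy_with_logits(
            clf(x).squeeze(-1), y).backward()
        opt.step()
    return clf

# Toy "real vs. fake" data, as before.
torch.manual_seed(0)
real = torch.randn(200, 2) + 1.0
fake = torch.randn(200, 2) - 1.0
x = torch.cat([real, fake])
y = torch.cat([torch.ones(200), torch.zeros(200)])

# Train K classifiers that differ only in their random init,
# plus one held-out classifier never used during crafting.
ensemble = [make_and_train(seed, x, y) for seed in range(5)]
heldout = make_and_train(99, x, y)

# Optimize the fake against the *average* loss over the ensemble.
z = (torch.randn(1, 2) - 1.0).requires_grad_(True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = torch.stack([
        nn.functional.binary_cross_entropy_with_logits(
            clf(z).squeeze(-1), torch.ones(1))
        for clf in ensemble
    ]).mean()
    loss.backward()
    opt.step()

with torch.no_grad():
    print("P(real) under held-out classifier:",
          torch.sigmoid(heldout(z)).item())
# The hope: what fools five independently-initialized classifiers is more
# likely to fool a sixth. It helps, but it's not a guarantee.
```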