A bilateral filter weights surrounding pixels by how close in value they are. This ends up creating a smoothing effect from soft clustering. Take a look at the 'rolling guidance filter' for an effect when used iteratively.
This makes me think of deep learning "style transfer" effects. They're nice but so are artistic filters that long predate these things and are tremendously simpler.
CNN are also tremendously simple. They just iterate many filters, but each one of them is much simpler than the bilateral filter.