
Swish: A Self-Gated Activation Function
https://arxiv.org/abs/1710.05941v1
======
goberoi
Why is this interesting? In short: a great new activation function that may
challenge the dominance of ReLU.

Longer story:

Today, ReLU is the most popular activation function for deep networks (along
with its variants like leaky ReLU or parametric ReLU).

This paper from the Google Brain team is ~2 weeks old, and shows that Swish, a
new activation function, "improves top-1 classification accuracy on ImageNet
by 0.9% for Mobile NASNet-A and 0.6% for Inception-ResNet-v2" by simply
replacing ReLU with Swish.

Swish is defined as x * sigmoid(x), so it's not much harder to compute than ReLU.
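
For reference, here is a minimal NumPy sketch of Swish next to ReLU. This is my own illustration of the formula above, not code from the paper:

    import numpy as np

    def swish(x):
        # Swish as defined in the paper: f(x) = x * sigmoid(x)
        # (equivalently x / (1 + exp(-x)))
        return x / (1.0 + np.exp(-x))

    def relu(x):
        return np.maximum(0.0, x)

    # Quick comparison on a few sample values
    x = np.linspace(-5.0, 5.0, 11)
    print("x:    ", np.round(x, 2))
    print("swish:", np.round(swish(x), 3))
    print("relu: ", np.round(relu(x), 3))

Unlike ReLU, Swish is smooth and non-monotonic (it dips slightly below zero for small negative inputs), which the paper points to as part of why it can work better in deep networks.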

