In current neural networks the activation function is usually not a sigmoid but something like ReLU (y = 0 if x < 0, else x), and in any case computing the activations is not a meaningful part of the total compute - for non-tiny networks, almost all the effort is spent on the large matrix multiplications of the large layers before the activation function, and as the network size grows they become even less relevant, since the number of activation calculations grows linearly with layer size while the core matmul computation grows superlinearly (n^2.8 perhaps?).
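Back-of-the-envelope version of that argument (the layer widths below are arbitrary examples): a dense layer with an n x n weight matrix costs roughly 2n^2 FLOPs for the matmul and roughly n for the activation, so the activation's share shrinks as about 1/(2n).

```python
def activation_share(n):
    """Fraction of a dense layer's work spent on the activation,
    assuming an n x n weight matrix (~2*n^2 FLOPs for the matmul)
    and ~1 FLOP per output for the activation function."""
    return n / (2 * n * n + n)

for n in (256, 4096, 65536):
    print(n, activation_share(n))  # shrinks roughly as 1/(2n)
```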
Yes, some non-linearity is important - not for Turing completeness, but because without it the consecutive layers collapse into a single linear transformation of the same size, and you're just doing useless computation.
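A quick NumPy check of that collapse (the sizes are arbitrary): two linear layers applied in sequence give exactly the same result as one linear layer with the pre-merged weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 8))  # "layer 1" weights
W2 = rng.normal(size=(8, 8))  # "layer 2" weights
x = rng.normal(size=8)

# two purely linear layers...
two_layers = W2 @ (W1 @ x)
# ...equal one linear layer with the merged matrix W2 @ W1
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True
```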
However, the "decision point" of the ReLU (and its variants like leaky ReLU or ELU) provides sufficient non-linearity - in essence, just as a sigmoid effectively acts as a yes/no chooser with some soft stuff in the middle for training purposes, so does the ReLU "elbow point".
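Concretely, the three functions mentioned (the slope 0.01 and scale 1.0 below are common defaults, not anything from the thread):

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    return x if x >= 0 else slope * x

def elu(x, scale=1.0):
    return x if x >= 0 else scale * (math.exp(x) - 1.0)

# All three pass positive inputs through unchanged and switch off
# (or nearly so) below zero - the elbow at 0 is the decision point.
print(relu(-2.0), leaky_relu(-2.0), elu(-2.0))
print(relu(3.0), leaky_relu(3.0), elu(3.0))
```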
Sigmoid has the problem of 'vanishing gradients' in deep networks: its derivative lies between 0 and 0.25, so in standard backpropagation a 'far away' layer gets tiny, useless gradients if there are a hundred sigmoid layers in between.
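The arithmetic behind that (100 layers is just an illustrative depth): the sigmoid's derivative is s(x) * (1 - s(x)), which peaks at 0.25, so even in the best case each layer multiplies the upstream gradient by at most 0.25.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25, when x = 0

# Best case: every unit sits at x = 0, contributing a factor of 0.25.
print(sigmoid_grad(0.0))  # 0.25
print(0.25 ** 100)        # the gradient surviving 100 sigmoid layers
```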
Neural networks aren't Turing complete (they're circuits, not state machines), and ReLU is not just nonlinear but in fact not even differentiable at zero.
See section B of https://eprints.soton.ac.uk/267873/1/tcas1_cordic_review.pdf. Generalized CORDIC can compute hyperbolic trig functions, which gives you exponentials. That said, it's still not useful for ML because it's pretty hard to beat polynomials and lookup tables.
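A sketch of the hyperbolic-mode idea (pure Python, with math.atanh standing in for the lookup table a hardware implementation would use; the iteration schedule and convergence range are from the standard CORDIC formulation, not taken from the linked paper):

```python
import math

def cordic_exp(z, iters=30):
    """Approximate exp(z) for |z| < ~1.11 via hyperbolic CORDIC.

    Rotation mode: drive the residual angle z to zero with
    shift-and-add steps; (x, y) converge to K*cosh(z), K*sinh(z),
    and cosh(z) + sinh(z) = exp(z).
    """
    # Shift sequence 1, 2, 3, 4, 4, 5, ..., 13, 13, ...: indices
    # 4, 13 (and 40, if reached) must repeat for convergence.
    seq, i = [], 1
    while len(seq) < iters:
        seq.append(i)
        if i not in (4, 13, 40) or seq.count(i) == 2:
            i += 1

    x, y, gain = 1.0, 0.0, 1.0
    for i in seq:
        t = 2.0 ** -i
        gain *= math.sqrt(1.0 - t * t)  # accumulated hyperbolic "shrink"
        d = 1.0 if z >= 0 else -1.0     # rotate toward z = 0
        x, y = x + d * t * y, y + d * t * x
        z -= d * math.atanh(t)          # in hardware: a table lookup
    return (x + y) / gain               # cosh + sinh = exp, gain undone

print(cordic_exp(1.0), math.e)
```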
> The signal each neuron outputs is calculated from this number, according to its activation function
The activation function is usually a sigmoid, which is usually defined in terms of the exponential function (or, equivalently, hyperbolic functions like tanh)
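For reference, the standard sigmoid is the logistic function 1/(1 + e^-x), built from the exponential; it is a shifted and rescaled tanh, which is a hyperbolic function rather than a trigonometric one:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Identity: sigmoid(x) == (1 + tanh(x/2)) / 2 for all x.
for x in (-3.0, 0.0, 0.7, 2.5):
    print(x, sigmoid(x), (1 + math.tanh(x / 2)) / 2)
```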
Which neural networks don’t use trigonometric functions or equivalent?