The future is going to be great; the implications of these findings seem obvious: auto-tune for my voice is coming, so that people think I'm thrilled that they called.
To monetize it, they can sell it to contact centres so that sales reps sound super engaged and interesting, and customer support reps seem genuinely concerned.
This sounds so dystopian!
It might even be baked into some phones by default, kind of like image softening to remove skin blemishes is the default post-processing for most pictures in most phones today.
I did this test with my toddler and he chose “correctly.”
This process could also account for the classic “cross-modal” finding, first reported in 1929 and replicated around the world many times since, that people tend to pair a blobby shape with a word-sound like “bouba” and a spiky shape with a word like “kiki”.
If you plot these patterns (or lack thereof) chronologically, whether it's changes in pitch, rhythm, or timbre of a sound/music, or characteristics of drawn figures such as how a line's direction changes over distance, then dramatic and frequent changes make it harder to extrapolate what would happen if we kept the graph going.

I postulate that the brain associates regular, "round" patterns with happy or content emotions because it too tries to extrapolate the future state of these stimuli, and uncertainty about that future gets associated with "negative" emotions that are rooted in the amygdala and tied to more survival-centric states such as anger and fear, i.e. fight or flight.
This might be total bunk, but that's the 2 cents from a non-psychologist :)
I definitely suspect that predictive processing in the brain is part of the story, and we discuss that a bit in the paper. Interestingly, local entropy is highly correlated with our "spikiness" measure, the spectral centroid. However, where your comment focuses on macro-level regularity, like Bach versus Rush, our study focused on micro-level texture, like the hum of a box fan versus squealing brake pads. That said, macro-level regularity definitely has an impact on emotion perception as well! This is something my other research has touched on, e.g.: https://www.pnas.org/content/110/1/70
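For anyone curious, the spectral centroid is easy to compute yourself: it's just the amplitude-weighted mean frequency of a signal's spectrum, so "spiky", bright sounds score high and smooth hums score low. Here's a minimal sketch in Python with NumPy (the signal and sample rate are illustrative assumptions, not the paper's exact analysis pipeline):

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Amplitude-weighted mean frequency of the magnitude spectrum (Hz)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0:
        return 0.0  # silence: define the centroid as 0 Hz
    return float((freqs * spectrum).sum() / total)

sr = 44100
t = np.arange(sr) / sr                      # one second of audio
tone = np.sin(2 * np.pi * 440 * t)          # smooth, "round" sound
noise = np.random.default_rng(0).standard_normal(sr)  # hissy, "spiky" sound

print(spectral_centroid(tone, sr))   # close to 440 Hz
print(spectral_centroid(noise, sr))  # much higher: energy spread across all frequencies
```

The noise centroid lands far above the pure tone's, which is the intuition behind using this as a "spikiness" proxy.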
We also found that "spikiness" predicted emotional arousal for both positive and negative emotions. E.g., "angry" and "excited" are both spiky across the senses, while "sad" and "peaceful" are both smooth. So the results can't be driven solely by uncertainty becoming associated with negative emotion.
Thanks for your thoughts! And here's a link to a preprint of the paper: https://psyarxiv.com/wucs4
From an audio standpoint, triggers often seem to be highly correlated with audio transients ( https://en.wikipedia.org/wiki/Transient_(acoustics) ) - quick little sounds with a lot of detail. From an information-science standpoint, transient sounds convey more information in a short time period than does, say, a uniform sine wave, or music, or even talking.
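To make that information-density point concrete (this is my own toy sketch, not anything from the paper): a simple frame-to-frame spectral-flux measure barely moves for a steady sine wave, because every frame looks like the last, but it spikes hard on a transient like a click:

```python
import numpy as np

def spectral_flux(signal, frame_len=512):
    """Per-frame sum of positive spectral change versus the previous frame."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    mags = np.abs(np.fft.rfft(frames, axis=1))
    diff = np.diff(mags, axis=0)
    return np.maximum(diff, 0).sum(axis=1)  # only count energy that appeared

sr = 8000
t = np.arange(sr) / sr
sine = np.sin(2 * np.pi * 440 * t)  # highly predictable from frame to frame
click = np.zeros(sr)
click[sr // 2] = 1.0                # a single impulse: the extreme transient

print(spectral_flux(sine).max())   # stays comparatively small
print(spectral_flux(click).max())  # large spike in the frame containing the click
```

The click's flux dwarfs the sine's, matching the intuition that a transient packs far more "news" into one frame than a uniform tone does.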
There's a lot happening there, and it's not very predictable; but at the same time, the entire setting of the content is generally predicated upon relaxation. It's possible that the triggering sounds are stimulating that kind of negative / fight-or-flight / sudden-action-required sort of interrupt you're describing. The difference is that, because of the context, the listener can immediately return to feeling safe and comfortable once the transient has passed.
I have personally noticed that the most triggering kind of content for me is content where I can _almost_ predict what's going to happen, but where it still surprises me slightly. Repeated trigger words work really well for this: by the 3rd repetition my brain has an excellent profile of how the ASMR practitioner is going to pronounce a word, but there are always slight differences, unexpected timing, etc. I'm not yet sure whether the trigger happens more on a prediction hit or miss; I'm leaning towards a hit at this point.
This could even potentially be just from societal conditioning.
We don't rule out learning processes (e.g., social conditioning), but we do examine evidence for some level of innateness. For example, cross-modal Bouba–Kiki-like effects are present in pre-linguistic infants, and arousal signals can be easily understood across species. It may end up that what is innate is a predisposition to track and learn cross-modal correspondences that are widespread in the environment.
Link to paper pre-print: https://psyarxiv.com/wucs4
He occasionally posts about his research on twitter - https://twitter.com/beausievers
Here's a free, non-paywalled link to a pre-print version of the paper: https://psyarxiv.com/wucs4
And here's a paywalled link to the published version: https://royalsocietypublishing.org/doi/abs/10.1098/rspb.2019...