There is definitely overlap in the FOV of adjacent cameras (GoPros shooting 16:9 generally have about a 120-degree field of view, though the specifics depend on which GoPro model is being used), so I assume they timesync the individual videos and use that overlap to determine the relative depths of things.
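Just to illustrate the general idea (the real pipeline is whatever their software does; the file names and rig geometry below are made up), the depth part boils down to plain stereo matching on the strips that neighbouring cameras share:

```python
import cv2
import numpy as np

# Hypothetical frames from two adjacent cameras in the ring, already
# time-synced so both show the same instant. File names are invented.
left = cv2.imread("cam3_frame0421.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("cam4_frame0421.png", cv2.IMREAD_GRAYSCALE)

# Keep only the region the two fields of view share -- roughly the right
# half of one camera and the left half of its neighbour for a made-up
# 6-camera ring with ~120 degrees of horizontal FOV per camera.
w = left.shape[1]
strip = w // 2
left_strip = left[:, w - strip:]
right_strip = right[:, :strip]

# In reality you'd undistort the fisheye images and rectify the pair so
# epipolar lines are horizontal before matching; skipped here for brevity.
matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left_strip, right_strip).astype(np.float32) / 16.0

# Larger disparity = closer object. Turning this into metric depth needs the
# calibrated baseline and focal length: depth = focal * baseline / disparity.
print("disparity range:", disparity.min(), disparity.max())
```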
In order to do stereoscopy properly, you need the axes of the camera barrels to intersect; having a set of cameras all pointed radially outward doesn't achieve this.
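Back-of-the-envelope, assuming a made-up 6-camera ring and the ~120-degree FOV mentioned above:

```python
num_cameras = 6      # assumed ring size; real rigs vary
fov_deg = 120.0      # rough 16:9 GoPro horizontal FOV from above

# Adjacent optical axes point 360/N degrees apart and diverge outward,
# so they never converge in front of the rig the way toed-in stereo
# cameras would.
axis_separation_deg = 360.0 / num_cameras

# What the two neighbours do give you is a shared wedge of the scene:
overlap_deg = fov_deg - axis_separation_deg

print(f"axes diverge by {axis_separation_deg:.0f} degrees")      # 60
print(f"adjacent-camera overlap is ~{overlap_deg:.0f} degrees")  # ~60
```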
The end user never sees the raw video in this case, so that's not a requirement -- they're generating lots of synthetic stereoscopic views for the user by post-processing the video and accounting for depth.
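A toy version of that kind of synthesis (not claiming this is their method -- it's just the textbook depth-image-based-rendering idea): take one camera's frame plus a per-pixel depth map and shift pixels horizontally by their disparity to fake a second eye.

```python
import numpy as np

def synthesize_eye(image, depth, focal_px, baseline_m):
    """Toy depth-image-based rendering: shift each pixel horizontally by its
    disparity (focal * baseline / depth) to approximate a virtual eye offset
    by `baseline_m`. A real system also fills the disocclusion holes."""
    h, w = depth.shape
    out = np.zeros_like(image)
    disparity = (focal_px * baseline_m / depth).astype(int)  # in pixels
    for y in range(h):
        for x in range(w):
            nx = x - disparity[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out

# Made-up numbers: a tiny grayscale frame, uniform depth for simplicity.
image = np.arange(32, dtype=np.uint8).reshape(4, 8)
depth = np.full((4, 8), 2.0)   # metres
right_eye = synthesize_eye(image, depth, focal_px=100, baseline_m=0.064)
print(right_eye)
```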
I'll be interested to see what data they're sending to the video player to provide both stereoscopy and 360-degree panoramic video at the same time. The methods I can think of offhand require real-time pixel-shuffling to assemble the proper view during playback, but maybe they figured out something else.
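The brute-force version of that pixel-shuffling is a per-frame lookup from an equirectangular panorama (one per eye) into the viewer's current viewport -- a sketch with all the names and numbers invented, and a real player would do the equivalent in a GPU shader:

```python
import numpy as np

def render_viewport(pano, yaw, pitch, fov_deg, out_w, out_h):
    """Sample a rectilinear viewport out of an equirectangular panorama.
    `pano` is H x W x 3; yaw/pitch are the viewer's head orientation in
    radians. Call once per eye, every frame."""
    ph, pw = pano.shape[:2]
    f = (out_w / 2) / np.tan(np.radians(fov_deg) / 2)   # focal length in pixels

    # Pixel grid of the output viewport, centred on the optical axis.
    xs, ys = np.meshgrid(np.arange(out_w) - out_w / 2,
                         np.arange(out_h) - out_h / 2)

    # Direction vector for each output pixel, rotated by pitch then yaw.
    dirs = np.stack([xs, ys, np.full_like(xs, f, dtype=float)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    dirs = dirs @ rot_x.T @ rot_y.T

    # Direction -> longitude/latitude -> pixel coordinates in the panorama.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])        # -pi..pi
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))       # -pi/2..pi/2
    u = ((lon / (2 * np.pi) + 0.5) * pw).astype(int) % pw
    v = ((lat / np.pi + 0.5) * ph).astype(int).clip(0, ph - 1)
    return pano[v, u]

# Made-up 360 frame: render whatever direction the viewer is currently facing.
pano = np.random.randint(0, 255, (1024, 2048, 3), dtype=np.uint8)
view = render_viewport(pano, yaw=0.5, pitch=0.1, fov_deg=90, out_w=640, out_h=360)
print(view.shape)   # (360, 640, 3)
```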