
High-dimensional spheres are "spikey" - ColinWright
http://www.penzba.co.uk/cgi-bin/PvsNP.py?SpikeySpheres#HN2
======
xyzzyz
Another way of looking at this phenomenon is to inscribe a high-dimensional
sphere in a hypercube. As the dimension grows, the sphere/cube volume ratio
gets arbitrarily small, although the sphere touches the cube on every face. It
seems that most of the mass of the cube is concentrated near the vertices. This
is because a face of the cube is a whole lot closer to the center than a vertex
is -- for instance, if we take a million-dimensional cube with a side of length
2, then the distance from the center of the cube to a face is 1, but the
distance from the center to a vertex is 1000. It's not the spheres that are
"spikey", the cubes are.
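
A minimal sketch of the two quantities in this comment, assuming the cube is [-1, 1]^n with the unit-radius ball inscribed in it. The function names are made up for illustration, and the ratio is computed in log space so that large n does not overflow:

```python
import math

def ball_to_cube_ratio(n: int) -> float:
    """Volume of the unit-radius n-ball divided by the volume of the cube [-1, 1]^n."""
    # log V_ball = (n/2) * log(pi) - log Gamma(n/2 + 1); log V_cube = n * log(2)
    log_ratio = 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1) - n * math.log(2)
    return math.exp(log_ratio)

def center_to_vertex(n: int) -> float:
    """Distance from the center of [-1, 1]^n to any vertex (+-1, ..., +-1)."""
    return math.sqrt(n)

for n in (2, 3, 10, 20):
    # The center-to-face distance is always 1; only the vertex distance grows.
    print(n, ball_to_cube_ratio(n), center_to_vertex(n))
```

The ratio starts at pi/4 in two dimensions and shrinks rapidly, while the vertex distance grows like sqrt(n).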

~~~
tel
One should also take a close look at the formulas for volumes of _unit-radius_
n-balls. For even numbers of dimensions it's

    pi^(n/2)/fact(n/2)

which behaves very interestingly[1]. It's equal to pi in two dimensions. Among
even dimensions it peaks at the 6-dimensional ball (about 1.6 pi; over all
dimensions the peak is at n = 5), then is driven to effectively zero
by the time you get to 20 dimensions or so (value of ~8/1000 pi).
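
For reference, here is the general form (valid for odd dimensions too, via the Gamma function) with a few values worked out by hand. The notation V_n for the volume of the unit-radius n-ball is introduced here for illustration only:

```latex
\begin{align*}
V_n &= \frac{\pi^{n/2}}{\Gamma\!\left(\tfrac{n}{2}+1\right)},
\qquad V_{2k} = \frac{\pi^{k}}{k!} \\
V_2 &= \pi \\
V_4 &= \tfrac{1}{2}\pi^{2} \approx 1.57\,\pi \\
V_5 &= \tfrac{8}{15}\pi^{2} \approx 1.68\,\pi \\
V_6 &= \tfrac{1}{6}\pi^{3} \approx 1.64\,\pi \\
V_{20} &= \frac{\pi^{10}}{10!} \approx 0.0082\,\pi
\end{align*}
```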

Again, however, the radius isn't changing. I don't think of unit balls as
being spiky--they're still rotationally invariant in all d rotational degrees
of freedom, so calling them spiky doesn't sit well with me.

Instead, imagine you lived on a line, only walking up and back along it
forever. Then, one day, someone introduced you to the plane and then a volume.
These are vastly larger than the space you were afforded by the line:
exponential blowup.

It's also driven home by how the Lebesgue measure in d dimensions assigns zero
measure to all (d-1)-dimensional (or lower-dimensional) objects. Adding a
dimension just makes space immensely larger.

And so pegging the size of our unit ball to its 1-dimensional parameter, the
radius, causes its volume to vanish.

[1] Wolfram Alpha chart showing the volume of the n-sphere in units of pi
[http://www.wolframalpha.com/input/?i=pi%5E%28n%2F2-1%29%2FGa...](http://www.wolframalpha.com/input/?i=pi%5E%28n%2F2-1%29%2FGamma%5Bn%2F2%2B1%5D+from+0+to+30)

~~~
altrego99
> And so pegging the size of our unit ball to its 1-dimensional parameter, the
> radius, causes its volume to vanish.

There's some inherent flaw here - as 'pegging' the size of a unit cube to its
1-dimensional parameter, the side length, does not make the volume vanish.

~~~
aardvark179
But the cube isn't pegged like that at all. The cube is the set of points
where every element of the coordinate vector has a magnitude less than a
specified value, whereas the sphere is the set of points where the magnitude
of the vector as a whole is less than a specified value. Hence the cube gets
'larger' as you make space itself larger by adding dimensions, but the sphere
doesn't.
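
The two definitions can be written as membership tests. This is a small sketch with illustrative function names, using the test point (0.9, ..., 0.9) as an arbitrary choice:

```python
import math

def in_cube(x, r=1.0):
    """Cube: every coordinate individually has magnitude at most r."""
    return max(abs(c) for c in x) <= r

def in_ball(x, r=1.0):
    """Ball: the Euclidean length of the whole vector is at most r."""
    return math.sqrt(sum(c * c for c in x)) <= r

for n in (1, 2, 3, 100):
    p = [0.9] * n  # stays inside the cube in every dimension
    print(n, in_cube(p), in_ball(p), round(math.sqrt(n) * 0.9, 3))
```

The point never leaves the cube, but its length 0.9 * sqrt(n) exceeds 1 as soon as n is 2 or more, so it falls outside the ball.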

~~~
Gravityloss
Definitely the best comment. By looking at it this way, the mystery almost
vanishes!

------
yummyfajitas
One practical application of this stuff is in understanding high dimensional
discriminants.

Consider two populations, A and B, with N real-valued traits. Suppose each
trait in group A is normally distributed with mean 0 and stdev=10, while group
B is distributed with mean 1 and stdev=10. (This is true for trait 1, trait 2,
etc.)

Each individual trait in these groups overlaps quite drastically. Imagining
that all the mass of the normal distribution is contained in a ball of radius
20, then for any single trait A lives on the ball of radius 20 about 0 (namely
[-20,20]) while B lives on [-19, 21]. Barely different, right?

On the other hand, in the N-dimensional space, the point (0,0,...,0) has a
distance sqrt(N) from the point (1,1,...,1). Two balls of radius 20 are
disjoint once their centers are more than 40 apart, i.e. once N > 1600. So in
1601-dimensional space, the ball of radius 20 around (0,0,...,0) and the ball
of radius 20 around (1,1,...,1) don't overlap at all and the discriminant
f(traits)=sign(trait[1] + trait[2] + ... + trait[N] - N/2) works fantastically.

This is one reason why "big data" can work well - chaining together many weak
predictors gets you a strong predictor.
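
A rough simulation of the two populations described above, as a sketch rather than a definitive analysis. It assumes independent traits and thresholds the summed traits at the midpoint N/2 (an assumption of this sketch: a threshold of 0 would split group A in half). The function name and the sample sizes are illustrative:

```python
import random

def accuracy(n_traits, trials=200, seed=0):
    """Fraction of simulated individuals that a summed-trait threshold classifies correctly."""
    rng = random.Random(seed)
    threshold = n_traits / 2  # midpoint between the expected sums 0 (A) and n_traits (B)
    correct = 0
    for _ in range(trials):
        # Group A: every trait ~ Normal(mean 0, stdev 10)
        a_sum = sum(rng.gauss(0.0, 10.0) for _ in range(n_traits))
        # Group B: every trait ~ Normal(mean 1, stdev 10)
        b_sum = sum(rng.gauss(1.0, 10.0) for _ in range(n_traits))
        correct += (a_sum < threshold) + (b_sum > threshold)
    return correct / (2 * trials)

for n in (1, 100, 1600):
    print(n, accuracy(n))
```

With a single trait the classifier is barely better than a coin flip. The gap between the group means grows like N while the spread of the sum grows like sqrt(N), so accuracy climbs steadily as traits are added.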

See Lewontin's Fallacy for a biological example of this.

<https://en.wikipedia.org/wiki/Lewontins_Fallacy>

~~~
adolgert
I like that explanation, coming from the data side. I wonder if the article's
author is working with data or a model for the data. There is a general rule
that, especially for high-dimensional models, only a few parameters are
important because eigenvalues of the sensitivity matrix fall off quickly (with
logarithmic density). That means the data is well separated, as you described,
and only a few parameters control whether the model fits, while the rest are
relatively unimportant. It's a different kind of pointy.

------
ot
Another fun fact about spheres: as the dimension grows to infinity, the area of
the cross-section, as a function of position along a diameter, converges
(suitably rescaled) to a Gaussian (!)
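
One way to see this, sketched under the assumption of a unit-radius ball with the position along the diameter rescaled as x = t/sqrt(n) (other normalizations are possible). Here A_n(x) denotes the (n-1)-dimensional volume of the slice at distance x from the center and V_{n-1} the volume of the unit (n-1)-ball; both symbols are introduced for illustration:

```latex
\begin{align*}
A_n(x) &= V_{n-1}\,\bigl(1 - x^{2}\bigr)^{(n-1)/2}, \qquad |x| \le 1 \\
\frac{A_n\!\left(t/\sqrt{n}\right)}{A_n(0)}
  &= \left(1 - \frac{t^{2}}{n}\right)^{(n-1)/2}
  \;\longrightarrow\; e^{-t^{2}/2} \quad (n \to \infty)
\end{align*}
```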

If you are interested in these things, this survey is amazing:
[http://www.math.ucdavis.edu/~deloera/MISC/BIBLIOTECA/trunk/B...](http://www.math.ucdavis.edu/~deloera/MISC/BIBLIOTECA/trunk/Ball/ball.pdf)

~~~
mturmon
Thanks for this nice link. It's a robust phenomenon that kicks in pretty well
even in modestly large dimensions. One name for this is concentration of
measure <http://en.wikipedia.org/wiki/Concentration_of_measure>

See figure 6 in the paper linked in the above comment for a nice illustration
of how this works.

------
evolvingstuff
Another fun fact about high-dimensional spaces: Randomly pick k points in an
n-dimensional space. Now, find their average location. As the number of
dimensions increases, it becomes progressively more likely that every point
will be closer to the average than it is to any other point.
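
This can be checked with a quick experiment. The sketch below assumes the points are drawn uniformly from the unit cube [0, 1]^n (the comment does not specify a distribution), and the function name is made up for illustration:

```python
import math
import random

def fraction_closer_to_mean(n, k=10, seed=0):
    """Fraction of k random points in [0, 1]^n that are closer to the
    average of all k points than to their nearest neighbour."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(n)] for _ in range(k)]
    mean = [sum(p[i] for p in pts) / k for i in range(n)]
    hits = 0
    for i, p in enumerate(pts):
        to_mean = math.dist(p, mean)
        nearest_other = min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
        hits += to_mean < nearest_other
    return hits / k

for n in (1, 2, 10, 100, 1000):
    print(n, fraction_closer_to_mean(n))
```

In high dimensions all pairwise distances concentrate around the same value, while the distance to the average is smaller by a factor of roughly sqrt((k-1)/(2k)), so every point ends up nearer the average than any neighbour.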

------
gfodor
A fun walk through the curse of dimensionality and how your intuition can
break down in higher dimensional spaces, and other life lessons, can be found
in Richard Hamming's book The Art of Doing Science and Engineering:

[http://www.amazon.com/The-Doing-Science-Engineering-ebook/dp...](http://www.amazon.com/The-Doing-Science-Engineering-ebook/dp/B000P2XFPA/ref=sr_1_2?ie=UTF8&qid=1337463636&sr=8-2)

------
mreid
I wrote a very similar blog post a few years ago when I first encountered the
same fact:

<http://mark.reid.name/iem/warning-high-dimensions.html>

Thanks to the comments, I learned this is a common warning, used in at least
three textbooks.

------
evincarofautumn
The intuition seems flawed. It’s not that _n_-spheres are spiky, but rather
that _n-cubes_ are. If you put _n_-spheres in all the corners of an
_n_-cube, for _n_ > 9 the corner spheres are far enough away from the centre of
the cube that the central sphere ends up with a diameter greater than the
cube’s edge length.
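
The arithmetic behind the _n_ > 9 threshold, as a short sketch. It assumes the usual construction (not spelled out in this comment): a cube of edge length 4 centred at the origin, unit-radius spheres centred at (±1, ..., ±1), and a central sphere just touching all of them. The function name is illustrative:

```python
import math

def central_radius(n):
    """Radius of the sphere at the origin that just touches unit spheres
    centred at (+-1, ..., +-1) in n dimensions."""
    # Each corner-sphere centre is sqrt(n) from the origin; subtract its radius of 1.
    return math.sqrt(n) - 1.0

CUBE_EDGE = 4.0  # the cube [-2, 2]^n

for n in (2, 3, 4, 9, 10, 100):
    r = central_radius(n)
    status = "pokes out of the cube" if 2 * r > CUBE_EDGE else "stays within the cube"
    print(n, round(r, 3), status)
```

At n = 4 the central sphere is the same size as the corner spheres, at n = 9 its diameter equals the edge length, and beyond that it extends past the faces of the cube.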

Even if I’m just misunderstanding it, I don’t see how it’s surprising. The
author writes “with a true high-dimensional sphere, every point on the surface
is ‘an extremity’”—isn’t the same true by definition of a sphere of any
dimension? For an _n_-sphere, you have an infinite number of “spikes” whose
tips constitute an (_n_ ‒ 1)-dimensional surface.

~~~
ColinWright
Thanks for that response - I can see that you've missed the point, so I need
to go back and consider how to re-work it to make the point clearer.

I'm trying to explain that every point on the surface of a high-dimensional
sphere has characteristics that we, with our 3-dimensional intuition, would
more usually associate with a spike.

When I get a bit of time I'll go back and re-read it with your comment in
mind.

~~~
sirclueless
I think the cube getting spikier is the salient issue at hand, not the
sphere getting spikier. The inner sphere has an ever increasing radius based
on the distance from the origin of a unit n-sphere placed at (1, 1, 1, ...).
The inner sphere isn't spiky, that point just keeps getting farther away
because n-cubes get spiky.

~~~
ColinWright
You're focussing here on something other than the point I'm trying to get to.
Yes, the corners of the cube get further from the surface of the sphere, but
that's not the intended point. As I said in the comment to which you're
replying, the idea is that points on the surface of the high-dimensional
sphere have characteristics we normally associate with a spike. As so often
happens with analogies, people (and you're not the only one, so the problem is
with the writing, not the audience) are concentrating or fixating on aspects
other than the one the author had in mind.

Thanks for your comment.

~~~
oscilloscope
Maybe you could elaborate on what you mean by "spikey". I associate spikes
with some kind of discontinuity, having little to do with volume.

~~~
ColinWright
Two things in particular.

Firstly, if you stand on the surface and chop off a cap, it has almost no
volume. Secondly, if you step in a random direction, you'll be outside, not
inside, the sphere. These are almost equivalent if you consider a ball around
your current location - almost none of it intersects the sphere.

In our 3D world these are characteristics of a spike, not of a sphere, so
thinking of high-dimensional spheres as spikey helps stop you from making
natural mistakes driven by otherwise perfectly good visualization abilities.
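
The second property can be estimated with a small Monte Carlo sketch. It assumes a unit-radius sphere and a step of length 0.5 in a uniformly random direction (both arbitrary choices made here), starting from the surface point (1, 0, ..., 0), which by symmetry stands in for any surface point:

```python
import math
import random

def prob_step_stays_inside(n, step=0.5, trials=2000, seed=0):
    """Estimate the chance that a random step from the surface of the
    unit n-ball lands inside (or on) the ball."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(trials):
        # A uniformly random direction is a normalised Gaussian vector.
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in g))
        u0 = g[0] / norm  # component of the step direction along the start point
        # |p + step*u|^2 = 1 + 2*step*u0 + step^2 for p = (1, 0, ..., 0)
        if 1.0 + 2.0 * step * u0 + step * step <= 1.0:
            inside += 1
    return inside / trials

for n in (2, 3, 10, 50, 200):
    print(n, prob_step_stays_inside(n))
```

In three dimensions a sizeable share of steps stays inside. In high dimensions a random direction is almost perpendicular to the radius, so nearly every step of this length ends up outside.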

~~~
sesqu
That second thing is perhaps your best example so far.

The cube example suffers from construction. It's simple to see that the
bounding spheres are fixed to the cube's corners, which escape with dimension.
While very interesting, this doesn't do much to constrain the ball's shape.

Volume is also a bad measure to use here, since it's already agreed that the
volume of the entire sphere rapidly decreases. This is mostly due to the
immense expansion of the reference cube. The ratio of the cap to the sphere is
better, but you left that as an exercise to the reader, ignoring things like
the height at which the relative volume starts materializing (a reasonable
person might expect cos π/4).

But picking a random direction from the surface is illuminating. It's hard to
get rid of that effect without switching to spherical coordinates.

------
mjw
I'm not sure this works for me; I know what he means, but it's hard to view a
rotationally-invariant object with positive curvature as being spikey. The
analogy captures the 'what happens when you cut off a slice' property at the
expense of being a good analogy for other properties.

Best not to rely too heavily on analogies IMO. In this case the point of the
exercise is to demonstrate that one popular analogy (that of hyperspheres to
the spheres and circles we can visualise) is flawed in some important
respects. Replacing it with another analogy might not be the best idea, unless
you're careful to remember the caveats attached to it.

------
tomerv
Another explanation (a bit clearer in my opinion):
<http://bit-player.org/2011/the-n-ball-game>

------
celoyd
As a gateway to further reading (covering many of the interesting facts people
are pointing out here), I enjoyed
<http://en.wikipedia.org/wiki/Curse_of_dimensionality>

------
ColinWright
See also the discussion of the "Curse of dimensionality":
<http://news.ycombinator.com/item?id=4045143>

------
kentpalmer
I use this fact about hyperspheres getting smaller in my dissertation on
Emergent Design at <http://about.me/emergentdesign>.

I have created something called Schemas Theory which is the next level of
abstraction up from Systems Theory but contains all the schemas like Facet,
Monad, Pattern, Form, System, OpenScape (meta-system), Domain, World, Kosmos,
Pluriverse. Then to kick things off I created a hypothesis that Schemas were
related to dimensions by a rule that there were two schemas per dimension and
two dimensions per schema. And so there are ten schemas ranging from -1
dimension to 9 dimensions. It just so happens that String theory starts at the
tenth dimension, but is unschematized, in other words we have no natural
organizing template of understanding to relate to it. Schemas are projected
organizations by which we understand the things in a given dimension. They are
the way that we project Spacetime and find things in it intelligible given
Kant's idea of a priori synthesis.

Then the question comes why are there only ten schemas and why do they stop at
the ninth dimension, and I use the fact that bounded spheres as in the example
given overflow the surrounding spheres at that point which is something that
goes beyond our intuitions of how space itself works. I think it works as an
explanation as to why we don't have natural models of intuition beyond the
pluriverse (i.e. the multiverse). The point in my dissertation is that we use
schemas as the basis of all our design activities.

So I think this fact of the overflowing of the hyperspheres of their
surrounding spheres is quite important for our understanding of how we project
spacetime templates of understanding as a framework for understanding
dimensional phenomena.

The other point that I make in my dissertation that is related is that
hyperspheres get bigger in terms of surface and volume and then they get much
smaller and the dimensions where they are the biggest are at 5 through 7
dimensions. I make the point that when we say that we can hold 7+/-2 things in
short term memory those are independent things, and that means that
conceptually we can do design up to the ninth dimension but that the optimum is
in the fifth through seventh dimensions where the space of possibilities is
largest. So we actually hold in our minds higher dimensional objects and we
design in spaces of higher dimensions but not too high, but the optimal height
is 7+/-2 dimensions which is where we have the most room to maneuver the
possible schemas. However in terms of manipulation it is the fourth dimension
that is best because in that dimension movement has perfect laminar flow
without singularities. And it is interesting that this is the dimension where
the middle sphere is the same size as the surrounding spheres.

Anyway, I just thought I would mention this because it is a theoretical use of
this fact about higher dimensions that we do not see referenced very often
which I think has lots of implications for how we think and how we design
especially in Software Engineering.

