Hacker News new | comments | ask | show | jobs | submit login
High-dimensional spheres are "spikey" (penzba.co.uk)
184 points by ColinWright on May 19, 2012 | hide | past | web | favorite | 59 comments

Another way of looking at this phenomenon is to inscribe a high dimensional sphere in a hypercube. As the dimension grows, the sphere/cube volume ratio gets arbitrarily small, although sphere touches the cube at every side. It seems that most of the mass of the cube was focused near vertices. This is because the side of a cube is a whole lot closer than the vertex from the center of the cube -- for instance, if we take a million-dimensional cube with a side of length 2, then the distance from the center of the cube to the side is 1, but the distance from the center to the vertex is 1000. It's not the spheres that are "spikey", the cubes are.

One should also take a close look at the formulas for volumes of unit-radius n-balls. For even numbers of dimensions it's

which behaves very interestingly[1]. It's equal to pi in two dimensions, peaks at the 4-dimensional ball (about 1.6 pi), then is driven to effectively zero by the time you get to 20 dimensions or so (value of ~8/1000 pi).

Again, however, the radius isn't changing. I don't think of unit balls as being spiky--they're still rotationally invariant in all d rotational degrees of freedom, so calling them spiky doesn't sit well with me.

Instead, imagine you lived on a line, only walking up and back along it forever. Then, one day, someone introduced you to the plane and then a volume. These are vastly larger than the space you were afforded by the line: exponential blowup.

It's also driven home by how the Lesbegue measure in d-dimensions assigns zero measure to all (d-1)-dimensional (or lesser) objects. Adding a dimension just makes space immensely larger.

And so pegging the size of our unit ball to its 1-dimensional parameter, the radius, causes its volume to vanish.

[1] Wolfram Alpha chart showing the volume of the n-sphere in units of pi http://www.wolframalpha.com/input/?i=pi%5E%28n%2F2-1%29%2FGa...

As a practical application of this, consider 1-d optimization using some kind of stochastic, "jumping" process. So long as the global minimum is in a kind of larger well then the information concerning it's location is "global" enough for the jumping process to work its way down there.

If the well the global minimum lives in is sufficiently small—if it is a sufficiently local event—then your stochastic search is bound to fail. At the limit, you have a discontinuity, a global minimum which literally has no global presence and is essentially impossible to find (you can show expected convergence times going to infinity as the minimum looks more and more discontinuous).

Fortunately, few processes are genuinely discontinuous and they usually have some kind of non-local "well of attraction". But in higher and higher dimensions, these wells can be made to appear nearly infinitely narrow.

All together this enforces the principle of the curse of dimensionality. High dimensionality hurts you in statistics and search unless every dimension is just chock full of signal.

Aha, right.. That helps me understand the link to optimisation. So the point of spheres getting "spiky" is that if you are searching for the inside of the sphere then you need to go into the spike which gets harder and harder to find as the dimension increases.

I think of it as the difference between playing air hockey (with no opponent) and darts. One of them is just a lot easier than the other.

You guys should swap nicks.

> which behaves very interestingly

Check out Bill Thurston's comment on why this is deeply misleading geometrically:


As several commenters there point out, if you must choose a ratio, a more geometrically natural choice would be the ratio of the unit sphere to its circumscribed cube (which decreases monotonically) or to its inscribed cube (which increases monotonically).

As a demonstration of the non-intuitiveness of measures in high dimensions, I think this is still a decent example, no? I agree that n=5 [sic on my part above] isn't special, but seeing the Gamma function should be cause enough for worrying your intuition. It might be misleading, but I'd hope that it wouldn't beg the question of what's special about n=5, but instead what's wrong with your views about high dimensional space.

I agree with the comments on MathOverflow though.

> And so pegging the size of our unit ball to its 1-dimensional parameter, the radius, causes its volume to vanish.

There's some inherent flaw here - as 'pegging' the size of unit cube to its 1-dimensional parameter, the length, does not make the volume to vanish.

But the cube isn't pegged like that at all. The cube is the set of points where any element of the coordinate vector has a magnitude less than a specified value, whereas the sphere is the set of points where the magnitude of the vector is less than a specified value. Hence the cube gets 'larger' as you make space itself larger by adding dimensions, but the sphere doesn't.

Definitely the best comment. By looking at it this way, the mystery almost vanishes!

They're totally different objects. Unit balls encode distance by being the set of points within some distance from the center. They are thus, loosely, parameterized by a 1-dimensional measure.

Unit cubes are constrained such that all of their measurements are unit length---something which ensures that their volume is constantly 1. They do this by getting spiky, as xyzzyx mentioned. The distance to cross them shrinks as the vertex distance grows.

So they start to look totally different in high dimensions, evidenced by the ball's vanishing measure.

> It's also driven home by how the Lesbegue measure in d-dimensions assigns zero measure to all (d-1)-dimensional (or lesser) objects. Adding a dimension just makes space immensely larger.

> And so pegging the size of our unit ball to its 1-dimensional parameter, the radius, causes its volume to vanish.

Funny. I was just going to enter a comment how that was the most useful intuition in the whole thread.

As one last point about how the problem is caused by just how big high dimensional spaces are, remember that the n-dimensional unit sphere contains every d-dimensional unit sphere for d <= n.

The spheres are a strictly increasing sequence of sets, but the measures we call "volume" are growing even faster.

Amazing explanation.

It paints the visual picture where corners of a cube are more 'pointy' in higher dimensions. Another way to see this is basically that the corners basically divide the whole angle in many more parts in higher dimensions. For 2D, four corners cover the full 360 degree angle. For 3D, 8 corners. For a million dimensions, you can connect 2^(1mill) corners of different cubes to each other without overlapping each other.

So 'pointy' cubes reasons out the 'paradoxes' without resorting to 'spikey' spheres.

  > So 'pointy' cubes reasons out the 'paradoxes'
  > without resorting to 'spikey' spheres.
See elsewhere why this comment misses the really important bits that relate to machine learning and high-dimensional visualizations.

In particular, this comment: http://news.ycombinator.com/item?id=3998259

Thanks for the point - but your argument that 'from a surface almost every step leads you out' doesn't seem to be true in particular.

Imagine a sphere in n-dimensions, S = {Sum(x_i^2)<1}. Top of the sphere is p=(1,0,0,...,0). Now for any random direction r, if you take a very small step from p towards r, you have exactly 50% chance to be inside the sphere.

To be mathematically precise, for any r (unit vector) chosen at random, probability that there is e>0 such that p + e*r is within S is 1/2.

So half the time the steps take you out, half the time it takes you in - considering the step is small enough compared to the radius.

This made it click for me. Thanks for explaining.

Just to expand a bit on the vertex thing, in 2 dimensions length is sqrt(x^2 + y^2). For a square of side length 2 the distance from the center to each vertex is sqrt(1^2 + 1^2) or sqrt(2 * 1). For 3 dimensions it's sqrt(x^2 + y^2 + z^2) or sqrt(3 * 1). For one million dimensions the distance to each vertex is sqrt(1,000,000) or 1,000 while the distance to each side is still 1.

The spheres are spikey, too, in the sense he means. Fix a small radius and consider balls of that radius centered on the surface of the sphere as you increase dimension. The volume of such neighborhoods that intersects the original sphere drops drastically. Just consider the cone that has its apex at your point on the sphere and that extends to the one lower dimensional sphere at the intersection of the sphere and your local ball. Even if your ball is so small that the cone has an apex angle of 89 degrees, that means that there is some fraction (less than one) of the cone base radius that you're losing. And that reduces your volume in each dimension. So there is exponential loss of cone volume as the number of dimensions increases.

While much of what you say is true, it seems to me to miss the main point of the article. It feels like you've read the first part about the sphere inside the other spheres, but not read the rest that talks about the other reasons a sphere is "spikey".

You say:

  > It's not the spheres that are "spikey", the cubes are.
It's true that cubes are spikey, but it's in an obvious and easy to visualize sense. Take an ordinary 3D cube, stretch out the corners and you've got the idea.

The reason high-dimensional spheres are "spikey" is more subtle, harder to visualize, and easy to get wrong when working in machine learning. I've now explained it in several replies in this thread, clearly I need to go back and revise the original to try to forestall this mis-reading.






This is alluded to in a sidebar that undermines the title of the article. The author seems to be too invested in the original idea to fully correct it. Obviously cubes are spike when measured from any origin point.

Can you be more specific? I'm certainly going to re-write the article, and I'd be interested to know which sidebar you are referring to. It's been incredibly valuable seeing the tangents people have taken, and how some have then caused them to miss entirely the intended point(s).


I think your point about the cubes gets the idea across much better, after all any lower dimensional slice taken through a sphere will look spherical rather than spiky, but you can take slices through cubes that perfectly illustrate their spiky nature.

Note that all these characteristics depend on the Euclidean distance metric. Perhaps the conclusions should be that when working with high-dimension data, you should consider using a different distance metric.

> It's not the spheres that are "spikey", the cubes are.

This was exactly my reaction on thinking over what the article is saying.

You're not the only person saying something similar, and it's really valuable feedback because it's showing me that people are being distracted from the point I'm trying to make.

The point is that when you're on the surface of a high-dimensional sphere, what you're standing on has characteristics that we, with our 3D intuition, would normally associate with spikes. It seems that you are in a reasonable number of people who are missing that point, so obviously I'm not making it very well.

So thanks for the feedback.

> The point is that when you're on the surface of a high-dimensional sphere, what you're standing on has characteristics that we, with our 3D intuition, would normally associate with spikes.

Can you explain why the "spikiness" should be associated with the hyperspheres instead of the hypercubes?

Sure, although it's not an "instead". Let me explain.

No, it would take too long. Let me summarize.

If you look at high-dimensional cubes it's pretty obvious that the corners get further and further away from the center as the dimension goes up. People (quite rightly) feel that this can be visualized as taking an ordinary 3D cube and tugging on the corners, stretching them out and thus making them "spikey".

With spheres, though, there are no corners, so people think they remain "smooth". Which they do, in a sense, but that leads to the wrong intuition. Lopping off a high-dimensional spherical cap gives you virtually no volume at all, and in our 3D world the shape that is as symmetrical as you can make it, but which when you chop it off, has almost no volume, is a spike.

Add to that these comments: http://news.ycombinator.com/item?id=3997551 and http://news.ycombinator.com/item?id=3997681

These allude to the fact that when you've on the surface of a high-dimensional sphere, almost every step takes you out. Getting into the interior is like trying to get into the interior of a spike from its tip.

Does that help?

> almost every step takes you out

This doesn't seem to be true.

Imagine a sphere in n-dimensions, S = {Sum(x_i^2)<1}. Top of the sphere is p=(1,0,0,...,0). Now for any random direction r, if you take a very small step from p towards r, you have exactly 50% chance to be inside the sphere.

To be mathematically precise, for any r (unit vector) chosen at random, probability that there is e>0 such that p + e*r is within S is 1/2.

So half the time the steps take you out, half the time it takes you in - considering the step is small enough compared to the radius.

But for a given e, that probability goes to zero as n goes to infinity, or as n goes to infinity, the e required to be within a given delta of 50% gets smaller. In other words, choose your step size in an optimisation. As n gets larger, the chances of it working get smaller. In even moderately large dimensions the step sixe required is so small, you don't make any progress.

> when you've on the surface of a high-dimensional sphere, almost every step takes you out. Getting into the interior is like trying to get into the interior of a spike from its tip.

This does, yes. Starting with how this aspect changes from a circle (1-sphere) to a 2-sphere might help to get across the "spikiness" you are talking about. It seems to be basically saying the same thing as the "lopping off the cap" description, but the latter was very hard for me to visualize. Saying that a smaller and smaller fraction of the total "hypersolid angle" intersects the hypersphere's interior as you increase the dimension makes it much easier (for me) to see what's going on.

Cool - thanks.

A nearly flat cap from the 3D sphere with big radius will also have small volume. I think the word "spikey" is bad for getting your idea across.

But in the 3D sphere the volume of the cap grows rapidly with the height. That's not true in higher dimensions, which is why in higher dimensions it behaves more like a spike than a sphere. It's the apparent contradiction of that image that helps prevent the usual mistake in visualization.

That's why I use the term, for that shock value.

One practical application of this stuff is in understanding high dimensional discriminants.

Consider two populations, A and B, with N real-valued traits. Suppose each trait in group A is normally distributed with mean 0 and stdev=10, while group B is distributed with mean 1 and stdev=10. (This is true for trait 1, trait 2, etc.)

Each individual trait in these groups overlaps quite drastically. Imagining that all the mass of the normal distribution is contained in a ball of radius 20, then for any single trait A lives on the ball of radius 20 about 0 (namely [-20,20]) while B lives on [-19, 21]. Barely different, right?

On the other hand, in the N-dimensional space, the point (0,0,...,0) has a distance sqrt(N) from the point (1,1,...,1). So in 401-dimensional space, the ball of radius 20 around (0,0,...,0) and the ball of radius 20 around (1,1,...,1) don't overlap at all and the discriminant f(traits)=sign(trait[1] + trait[2] + ... + trait[N]) works fantastically.

This is one reason why "big data" can work well - chaining together many weak predictors gets you a strong predictor.

See Lewontin's Fallacy for a biological example of this.


I like that explanation, coming from the data side. I wonder if the article's author is working with data or a model for the data. There is a general rule that, especially for high-dimensional models, only a few parameters are important because eigenvalues of the sensitivity matrix fall off quickly (with logarithmic density). That means the data is well-separate, as you described, and only a few parameters control whether the model fits, while the rest are relatively unimportant. It's a different kind of pointy.

Other fun fact about spheres: as the dimension grows to infinity, the function of the area of the section along a diameter converges to a Gaussian (!)

If you are interested in these things, this survey is amazing: http://www.math.ucdavis.edu/~deloera/MISC/BIBLIOTECA/trunk/B...

Thanks for this nice link. It's a robust phenomenon that kicks in pretty well even in modestly large dimensions. One name for this is concentration of measure http://en.wikipedia.org/wiki/Concentration_of_measure

See figure 6 in the paper linked in the above comment for a nice illustration of how this works.

Fascinating stuff, thanks for the link

Another fun fact about high-dimensional spaces: Randomly pick k points in an n-dimensional space. Now, find their average location. As the number of dimensions increases, it becomes progressively more likely that every point will be closer to the average than it is to any other point.

A fun walk through the curse of dimensionality and how your intuition can break down in higher dimensional spaces, and other life lessons, can be found in Richard Hamming's book The Art of Doing Science and Engineering:


The intuition seems flawed. It’s not that n-spheres are spiky, but rather that n-cubes are. If you put n-spheres in all the corners of an n-cube, for n > 9 the corner spheres are far enough away from the centre of the cube that the central sphere ends up with a diameter greater than the cube’s edge length.

Even if I’m just misunderstanding it, I don’t see how it’s surprising. The author writes “with a true high-dimensional sphere, every point on the surface is ‘an extremity’”—isn’t the same true by definition of a sphere of any dimension? For an n-sphere, you have an infinite number of “spikes” whose tips constitute an (n ‒ 1)-dimensional surface.

Thanks for that response - I can see that you've missed the point, so I need to go back and consider how to re-work it to make the point clearer.

I'm trying to explain that every point on the surface of a high-dimensional sphere has characteristics that we, with our 3-dimensional intuition, would more usually associate with a spike.

When I get a bit of time I'll go back and re-read it with your comment in mind.

I think the square getting spikier is the salient issue at hand, not the sphere getting spikier. The inner sphere has an ever increasing radius based on the distance from the origin of a unit n-sphere placed at (1, 1, 1, ...). The inner sphere isn't spiky, that point just keeps getting farther away because n-squares get spiky.

You're focussing here on something other than the point I'm trying to get to . Yes, the corners of the cube get further from the surface of the sphere, but that's not the intended point. As I said in the comment to which you're replying, the idea is that points on the surface of the high-dimensional sphere have characteristics we normally associate with a spike. As so often happens with analogies, people (and you're not the only one, so the problem is with the writing, not the audience) are concentrating or fixating on aspects other than the one the author had in mind.

Thanks for your comment.

Maybe you could elaborate on what you mean by "spikey". I associate spikes with some kind of discontinuity, having little to do with volume.

Two things in particular.

Firstly, if you stand on the surface and chop off a cap, it has almost no volume. Secondly, if you step in a random direction, you'll be outside, not inside, the sphere. These are almost equivalent if consider a ball around your current location - almost none of it intersects the sphere.

In our 3D world this are characteristics of a spike, not of a sphere, so thinking of high-dimensional spheres as spikey helps stop you from making natural mistakes driven by otherwise perfectly good visualization abilities.

That second thing is perhaps your best example so far.

The cube example suffers from construction. It's simple to see that the bounding spheres are fixed to the cube's corners, which escape with dimension. While very interesting, this doesn't do much to constrain the ball's shape.

Volume is also bad measure to use here, since it's already agreed that the volume of the entire sphere rapidly decreases. This is mostly due to the immense expansion of the reference cube. The ratio of the cap to the sphere is better, but you left that as an exercise to the reader, ignoring things like what height relative volume starts materializing in (a reasonable person might expect cos π/4).

But picking a random direction from the surface is illuminating. It's hard to get rid of that effect without switching to spherical coordinates.

Is there some quantitative measure of curvature that works for the (n-1) dimensional surface of an n-dimensional solid for any n? If such exists, it would allow you to quantify this.

After some thought, I think I see what you were trying to get at. Let’s say you’ve got a 3-sphere, and you lop off a thin slice. That slice has a fair amount of volume, whereas with an n-sphere for increasingly high n, the volume is less and less—and thus intuitively more and more spike-like. Right?

Yes, that's one part of it. The other part is the fact that from the surface, whichever direction you go is not into the sphere.

I wrote a very similar blog post a few years ago when I first encountered the same fact:


Thanks to the comments, I learned this is a common warning, used in at least three textbooks.

I'm not sure this works for me; I know what he means, but it's hard to view a rotationally-invariant object with positive curvature as being spikey. The analogy captures the 'what happens when you cut off a slice' property at the expense of being a good analogy for other properties.

Best not to rely too heavily on analogies IMO. In this case the point of the exercise is to demonstrate that one popular analogy (that of hyperspheres to the spheres and circles we can visualise) is flawed in some important respects. Replacing it with another analogy might not be the best idea, unless you're careful to remember the caveats attached to it.

Another explanation (a bit clearer in my opinion): http://bit-player.org/2011/the-n-ball-game

As a gateway to further reading (covering many of the interesting facts people are pointing out here), I enjoyed http://en.wikipedia.org/wiki/Curse_of_dimensionality

See also the discussion of the "Curse of dimensionality": http://news.ycombinator.com/item?id=4045143

I use this fact about hyperspheres getting smaller in my dissertation on Emergent Design at http://about.me/emergentdesign.

I have created something called Schemas Theory which is the next level of abstraction up from Systems Theory but contains all the schemas like Facet, Monad, Pattern, Form, System, OpenScape (meta-system), Domain, World, Kosmos, Pluriverse. Then to kick things off I created a hypothesis that Schemas were related to dimensions by a rule that there were two scheams per dimension and two dimensions per schema. And so there are ten schemas ranging from -1 dimension to 9 dimensions. It just so happens that String theory starts at the tenth dimension, but is unschematized, in other words we have no natural organizing template of understanding to relate to it. Schemas are projected organizations by which we understand the things in a given dimension. They are the way that we project Spacetime and find things in it intelligible given Kant's idea of a priori synthesis.

Then the question comes why are there only ten schemas and why do they stop at the ninth dimension, and I use the fact that bounded spheres as in the example given overflow the surrounding spheres at that point which is something that goes beyond our intuitions of how space itself works. I think it works as an explanation as to why we don't have natural models of intuition beyond the pluriverse (i.e. the multiverse). The point in my dissertation is that we use schemas as the basis of all our design activites.

So I think this fact of the overflowing of the hyperspheres of their surrounding spheres is quite important for our understanding of how we project spacetime templates of understanding as a framework for understanding dimensional phenomena.

The other point that I make in my dissertation that is related is that hyperspheres get bigger in terms of surface and volume and then they get much smaller and the dimensions where they are the biggest are at 5 throug 7 dimensions. I make the point that when we say that we can hold 7+/-2 things in short term memory those are independent things, and that means that conceptually we can do design up to the ninth dimension but that the optima is in the fifth through seventh dimensions where the space of possibilities is largest. So we actually hold in our minds higher dimensional objects and we design in spaces of higher dimensions but not too high, but the optimal height is 7+/2 dimensions which is where we have the most room to maneuver the possible schemas. However in terms of manipulation it is the fourth dimension that is best because in that dimension movement has perfect laminar flow without singularities. And it is interesting that this is the dimension where the middle sphere is the same size as the surrounding spheres.

Anyway, I just thought I would mention this because it is a theoretical use of this fact about higher dimensions that we do not see referenced very often which I think has lots of implications for how we think and how we design especially in Software Engineering.

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact