Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Another way of looking at this phenomenon is to inscribe a high dimensional sphere in a hypercube. As the dimension grows, the sphere/cube volume ratio gets arbitrarily small, although sphere touches the cube at every side. It seems that most of the mass of the cube was focused near vertices. This is because the side of a cube is a whole lot closer than the vertex from the center of the cube -- for instance, if we take a million-dimensional cube with a side of length 2, then the distance from the center of the cube to the side is 1, but the distance from the center to the vertex is 1000. It's not the spheres that are "spikey", the cubes are.


One should also take a close look at the formulas for volumes of unit-radius n-balls. For even numbers of dimensions it's

    pi^(n/2)/fact(n/2)
which behaves very interestingly[1]. It's equal to pi in two dimensions, peaks at the 4-dimensional ball (about 1.6 pi), then is driven to effectively zero by the time you get to 20 dimensions or so (value of ~8/1000 pi).

Again, however, the radius isn't changing. I don't think of unit balls as being spiky--they're still rotationally invariant in all d rotational degrees of freedom, so calling them spiky doesn't sit well with me.

Instead, imagine you lived on a line, only walking up and back along it forever. Then, one day, someone introduced you to the plane and then a volume. These are vastly larger than the space you were afforded by the line: exponential blowup.

It's also driven home by how the Lesbegue measure in d-dimensions assigns zero measure to all (d-1)-dimensional (or lesser) objects. Adding a dimension just makes space immensely larger.

And so pegging the size of our unit ball to its 1-dimensional parameter, the radius, causes its volume to vanish.

[1] Wolfram Alpha chart showing the volume of the n-sphere in units of pi http://www.wolframalpha.com/input/?i=pi%5E%28n%2F2-1%29%2FGa...


As a practical application of this, consider 1-d optimization using some kind of stochastic, "jumping" process. So long as the global minimum is in a kind of larger well then the information concerning it's location is "global" enough for the jumping process to work its way down there.

If the well the global minimum lives in is sufficiently small—if it is a sufficiently local event—then your stochastic search is bound to fail. At the limit, you have a discontinuity, a global minimum which literally has no global presence and is essentially impossible to find (you can show expected convergence times going to infinity as the minimum looks more and more discontinuous).

Fortunately, few processes are genuinely discontinuous and they usually have some kind of non-local "well of attraction". But in higher and higher dimensions, these wells can be made to appear nearly infinitely narrow.

All together this enforces the principle of the curse of dimensionality. High dimensionality hurts you in statistics and search unless every dimension is just chock full of signal.


Aha, right.. That helps me understand the link to optimisation. So the point of spheres getting "spiky" is that if you are searching for the inside of the sphere then you need to go into the spike which gets harder and harder to find as the dimension increases.


I think of it as the difference between playing air hockey (with no opponent) and darts. One of them is just a lot easier than the other.


You guys should swap nicks.


> which behaves very interestingly

Check out Bill Thurston's comment on why this is deeply misleading geometrically:

http://mathoverflow.net/questions/53119/volumes-of-n-balls-w...

As several commenters there point out, if you must choose a ratio, a more geometrically natural choice would be the ratio of the unit sphere to its circumscribed cube (which decreases monotonically) or to its inscribed cube (which increases monotonically).


As a demonstration of the non-intuitiveness of measures in high dimensions, I think this is still a decent example, no? I agree that n=5 [sic on my part above] isn't special, but seeing the Gamma function should be cause enough for worrying your intuition. It might be misleading, but I'd hope that it wouldn't beg the question of what's special about n=5, but instead what's wrong with your views about high dimensional space.

I agree with the comments on MathOverflow though.



> And so pegging the size of our unit ball to its 1-dimensional parameter, the radius, causes its volume to vanish.

There's some inherent flaw here - as 'pegging' the size of unit cube to its 1-dimensional parameter, the length, does not make the volume to vanish.


But the cube isn't pegged like that at all. The cube is the set of points where any element of the coordinate vector has a magnitude less than a specified value, whereas the sphere is the set of points where the magnitude of the vector is less than a specified value. Hence the cube gets 'larger' as you make space itself larger by adding dimensions, but the sphere doesn't.


Definitely the best comment. By looking at it this way, the mystery almost vanishes!


They're totally different objects. Unit balls encode distance by being the set of points within some distance from the center. They are thus, loosely, parameterized by a 1-dimensional measure.

Unit cubes are constrained such that all of their measurements are unit length---something which ensures that their volume is constantly 1. They do this by getting spiky, as xyzzyx mentioned. The distance to cross them shrinks as the vertex distance grows.

So they start to look totally different in high dimensions, evidenced by the ball's vanishing measure.


> It's also driven home by how the Lesbegue measure in d-dimensions assigns zero measure to all (d-1)-dimensional (or lesser) objects. Adding a dimension just makes space immensely larger.

> And so pegging the size of our unit ball to its 1-dimensional parameter, the radius, causes its volume to vanish.

Funny. I was just going to enter a comment how that was the most useful intuition in the whole thread.


As one last point about how the problem is caused by just how big high dimensional spaces are, remember that the n-dimensional unit sphere contains every d-dimensional unit sphere for d <= n.

The spheres are a strictly increasing sequence of sets, but the measures we call "volume" are growing even faster.


Amazing explanation.

It paints the visual picture where corners of a cube are more 'pointy' in higher dimensions. Another way to see this is basically that the corners basically divide the whole angle in many more parts in higher dimensions. For 2D, four corners cover the full 360 degree angle. For 3D, 8 corners. For a million dimensions, you can connect 2^(1mill) corners of different cubes to each other without overlapping each other.

So 'pointy' cubes reasons out the 'paradoxes' without resorting to 'spikey' spheres.


  > So 'pointy' cubes reasons out the 'paradoxes'
  > without resorting to 'spikey' spheres.
See elsewhere why this comment misses the really important bits that relate to machine learning and high-dimensional visualizations.

In particular, this comment: http://news.ycombinator.com/item?id=3998259


Thanks for the point - but your argument that 'from a surface almost every step leads you out' doesn't seem to be true in particular.

Imagine a sphere in n-dimensions, S = {Sum(x_i^2)<1}. Top of the sphere is p=(1,0,0,...,0). Now for any random direction r, if you take a very small step from p towards r, you have exactly 50% chance to be inside the sphere.

To be mathematically precise, for any r (unit vector) chosen at random, probability that there is e>0 such that p + e*r is within S is 1/2.

So half the time the steps take you out, half the time it takes you in - considering the step is small enough compared to the radius.



This made it click for me. Thanks for explaining.

Just to expand a bit on the vertex thing, in 2 dimensions length is sqrt(x^2 + y^2). For a square of side length 2 the distance from the center to each vertex is sqrt(1^2 + 1^2) or sqrt(2 * 1). For 3 dimensions it's sqrt(x^2 + y^2 + z^2) or sqrt(3 * 1). For one million dimensions the distance to each vertex is sqrt(1,000,000) or 1,000 while the distance to each side is still 1.


The spheres are spikey, too, in the sense he means. Fix a small radius and consider balls of that radius centered on the surface of the sphere as you increase dimension. The volume of such neighborhoods that intersects the original sphere drops drastically. Just consider the cone that has its apex at your point on the sphere and that extends to the one lower dimensional sphere at the intersection of the sphere and your local ball. Even if your ball is so small that the cone has an apex angle of 89 degrees, that means that there is some fraction (less than one) of the cone base radius that you're losing. And that reduces your volume in each dimension. So there is exponential loss of cone volume as the number of dimensions increases.


While much of what you say is true, it seems to me to miss the main point of the article. It feels like you've read the first part about the sphere inside the other spheres, but not read the rest that talks about the other reasons a sphere is "spikey".

You say:

  > It's not the spheres that are "spikey", the cubes are.
It's true that cubes are spikey, but it's in an obvious and easy to visualize sense. Take an ordinary 3D cube, stretch out the corners and you've got the idea.

The reason high-dimensional spheres are "spikey" is more subtle, harder to visualize, and easy to get wrong when working in machine learning. I've now explained it in several replies in this thread, clearly I need to go back and revise the original to try to forestall this mis-reading.

http://news.ycombinator.com/item?id=3997551

http://news.ycombinator.com/item?id=3997681

http://news.ycombinator.com/item?id=3998259

http://news.ycombinator.com/item?id=3997215

http://news.ycombinator.com/item?id=3998259


This is alluded to in a sidebar that undermines the title of the article. The author seems to be too invested in the original idea to fully correct it. Obviously cubes are spike when measured from any origin point.


Can you be more specific? I'm certainly going to re-write the article, and I'd be interested to know which sidebar you are referring to. It's been incredibly valuable seeing the tangents people have taken, and how some have then caused them to miss entirely the intended point(s).

Thanks.


I think your point about the cubes gets the idea across much better, after all any lower dimensional slice taken through a sphere will look spherical rather than spiky, but you can take slices through cubes that perfectly illustrate their spiky nature.


Note that all these characteristics depend on the Euclidean distance metric. Perhaps the conclusions should be that when working with high-dimension data, you should consider using a different distance metric.


> It's not the spheres that are "spikey", the cubes are.

This was exactly my reaction on thinking over what the article is saying.


You're not the only person saying something similar, and it's really valuable feedback because it's showing me that people are being distracted from the point I'm trying to make.

The point is that when you're on the surface of a high-dimensional sphere, what you're standing on has characteristics that we, with our 3D intuition, would normally associate with spikes. It seems that you are in a reasonable number of people who are missing that point, so obviously I'm not making it very well.

So thanks for the feedback.


> The point is that when you're on the surface of a high-dimensional sphere, what you're standing on has characteristics that we, with our 3D intuition, would normally associate with spikes.

Can you explain why the "spikiness" should be associated with the hyperspheres instead of the hypercubes?


Sure, although it's not an "instead". Let me explain.

No, it would take too long. Let me summarize.

If you look at high-dimensional cubes it's pretty obvious that the corners get further and further away from the center as the dimension goes up. People (quite rightly) feel that this can be visualized as taking an ordinary 3D cube and tugging on the corners, stretching them out and thus making them "spikey".

With spheres, though, there are no corners, so people think they remain "smooth". Which they do, in a sense, but that leads to the wrong intuition. Lopping off a high-dimensional spherical cap gives you virtually no volume at all, and in our 3D world the shape that is as symmetrical as you can make it, but which when you chop it off, has almost no volume, is a spike.

Add to that these comments: http://news.ycombinator.com/item?id=3997551 and http://news.ycombinator.com/item?id=3997681

These allude to the fact that when you've on the surface of a high-dimensional sphere, almost every step takes you out. Getting into the interior is like trying to get into the interior of a spike from its tip.

Does that help?


> almost every step takes you out

This doesn't seem to be true.

Imagine a sphere in n-dimensions, S = {Sum(x_i^2)<1}. Top of the sphere is p=(1,0,0,...,0). Now for any random direction r, if you take a very small step from p towards r, you have exactly 50% chance to be inside the sphere.

To be mathematically precise, for any r (unit vector) chosen at random, probability that there is e>0 such that p + e*r is within S is 1/2.

So half the time the steps take you out, half the time it takes you in - considering the step is small enough compared to the radius.


But for a given e, that probability goes to zero as n goes to infinity, or as n goes to infinity, the e required to be within a given delta of 50% gets smaller. In other words, choose your step size in an optimisation. As n gets larger, the chances of it working get smaller. In even moderately large dimensions the step sixe required is so small, you don't make any progress.


> when you've on the surface of a high-dimensional sphere, almost every step takes you out. Getting into the interior is like trying to get into the interior of a spike from its tip.

This does, yes. Starting with how this aspect changes from a circle (1-sphere) to a 2-sphere might help to get across the "spikiness" you are talking about. It seems to be basically saying the same thing as the "lopping off the cap" description, but the latter was very hard for me to visualize. Saying that a smaller and smaller fraction of the total "hypersolid angle" intersects the hypersphere's interior as you increase the dimension makes it much easier (for me) to see what's going on.


Cool - thanks.


A nearly flat cap from the 3D sphere with big radius will also have small volume. I think the word "spikey" is bad for getting your idea across.


But in the 3D sphere the volume of the cap grows rapidly with the height. That's not true in higher dimensions, which is why in higher dimensions it behaves more like a spike than a sphere. It's the apparent contradiction of that image that helps prevent the usual mistake in visualization.

That's why I use the term, for that shock value.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: