# comp.graphics.algorithms

## Subject: Re: nearest neighbour of a point

In comp.graphics.algorithms running.polygon@gmail.com wrote:

> I have 2 sets of points X & Y. The distribution of points in both the
> sets is such that points at a certain distance are concentrated and on
> either side of this distance, they start becoming sparce, something
> like a normal distribution.

This description is a bit hard to parse. You seem to be saying that
within set X, distances to the nearest neighbor cluster around a
certain value, rather separated from zero. And the same for set Y.
Or is this clustering true for *all* distances between points in one
set? I.e. are all points of set X expected to fall roughly on a
spherical shell of radius R around any point in set X one picks as a
reference?

The former case would indicate that your pointset is some kind of
regular grid embedded in that high-D space, and it may be a grid of
considerably fewer actual dimensions than the embedding space. You
could find out by counting the number of almost-equally-far neighbors
per point. E.g. a regular "grid" of points along a curve in 3D space
would yield 2 close neighbors per point, a gridded surface would have
4, and a volumetric grid would have 6.

E.g. if you find an average of 2*d close neighbors per vertex, your
pointset might be a grid on a d-dimensional manifold, rather than a
truly N-dimensional data set. If d is a good deal smaller than N,
that could ease the search algorithms a lot.

> The points are in higher dimentions > 3. They may be up to 50
> dimentions.

In that case, and assuming the argument above doesn't apply, odds are
that there's no algorithm that could improve upon the naive "compute
all distances and remember the shortes" approach. 2^50 is guaranteed
to be much larger than the number of points you have, so no kind of
dimensionality is 50. The fact to remember is that in
high-dimensional spaces, everything is pretty much next to everything
else, so the only way to find the closest thing is to check
everything.

You can do the maths yourself: compare the volume of a 50-dimensional
hyper-spherical shell whose radius and thickness are the center and
width of your distance distribution, to that of a 50-dimensional
hypersphere of said radius. You'll find that the shell's volume is
not really much different from that of the full sphere. I.e. the
restriction to a thin spherical shell isn't going to help you much in
searching or organizing points.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.