# comp.graphics.algorithms

## Subject: Re: nearest neighbour of a point

Hans-Bernhard Broeker wrote:
> In comp.graphics.algorithms running.... wrote:
>
> > All the points lie within a hypercube of side 1 and all points are
> > having non-negative co-ordinates in each of the dimensions. So, if we
> > consider 3D, then they would all lie in a cube of side 1 in the 1st
> > octant(if that's what it's called).
>
> So, to reduce the words to a concise formula: your input sets are
> contained in [0:1]^N, for N somewhere around 20, possibly up to 50.
>
> > and the number of points having dist^ between 10 & 11 would probably be
> > the most.
>
> You'ld better verify that probability. The actual expected distance
> is 0.5 in each coordinate, which ends up as an expected squared
> distance of 20*(0.25). I.e. the average squared distance would be 5,
> not 10.

Yes, you are right. For a distribution of 32 dimensions, I get the
elements in the complete distribution. 32*0.25 = 8.0 though.

>

It is a Normal Distribution, not a Uniform Distribution.

> in an
> N-dimensional hypercube is the worst case for nearest-neighbor
> searches. If your points don't cluster in any known way inside that
> hypercube, the naive O(|X|*|Y|) algorithm will be unbeatable.
>
> To repeat:
>
> > > The fact to remember is that in
> > > high-dimensional spaces, everything is pretty much next to everything
> > > else, so the only way to find the closest thing is to check
> > > everything.
>
> > However, instead of considering the thickness of the hyperspherical
> > shell to be the width of the distribution, we let the thickness be say
> > the standard deviation of the distribution,
>
> That is what I meant by "width".
>
> > Let y be a point in set Y, and let dy be the dist^2 of y from {0,0,0}.
> > Say that I have a list of all points in set X such that they are sorted
> > in ascending order on their dist^2 from {0,0,0}.
>
> > Now, I can say that the point in set X which is closest to point y is
> > one which lies in the band such that the band contains all points at a
> > distance of [dy-DELTA] to [dy+DELTA] such that DELTA is chosen in such
> > a way that it encompases at least the point which is closest to y.
>
> And the problem with that is that DELTA will be too large.

Yes, exactly. If I reduce DELTA, then many points get left out as you
mentioned, and if I increase DELTA & it becomes >= 2*SIGMA, then many
points are searched, and the program becomes very slow.

>
> In a nutshell: sorting doesn't preserve distance in spaces of
> dimension greater than one.
>
> --
> Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
> Even if all the snow were burnt, ashes would remain.