**comp.graphics.algorithms**

## Subject: **Re: nearest neighbour of a point**

Michael Hennebry wrote:

> running.polygon@gmail.com wrote:

>> I have 2 sets of points X & Y. The distribution of points in both the

>> sets is such that points at a certain distance are concentrated and on

>> either side of this distance, they start becoming sparce, something

>> like a normal distribution. The points are in higher dimentions > 3.

>> They may be up to 50 dimentions. Now, there are many points in each of

>> the sets, and I would like to know if there is any efficient method to

>> find for each point in set Y, a point in set X such that the distance

>> between the point in set Y & the selected point in set X is <= the

>> distance between the point in set Y and any other point in set X. I can

>> use euclidean distance as the metric.

>

> This one is still unclear on how X and Y were obtained.

>

> I gather that X and Y are subsets of unit hypercube [0,1]^n,

> where n is an integer that might be as large as 50.

> Are the elements of X and Y drawn uniformly at random from the

> hypercube?

> If so, I expect you are stuck with an O(|X| |Y| n) algorithm.

>

Probably the simplest way of achieving subquadratic time-complexity is

to use a spatial hash. Each point in your N-space is truncated to the

nearest gird position of an infinite N-grid of a given cell size S. Grid

positions are denoted by an N-vector of integers. Hashes are computed

based on this N-vector of integers. In a spatial hash, points in

neighboring cells of a given cell can be retrieved in constant time. By

choosing a sensible cell size the cells can be kept relatively sparse

(for a sort-of homogeneous distribution of points).

Store the smallest of the two sets (let's assume it is X) in the spatial

hash, and use the larger set (Y) to query the hash structure. For each y

in Y, retrieve the cell and compute the nearest distance d to a point x

in the cell. If the cell is empty, then take d to be infinite. Traverse

all neighboring cells in a nearest-first manner (Dijkstra's algorithm)

until all unsearched cells are further away then the current nearest

distance d.

It can be seen that if your cell size is too small then a lot of cells

need to be traversed, wheareas if the cell size is too large then the

number of distance computations per cell will be huge. So, it pays to

balance the cell size. If you have a way of estimating the average

distance between to to points, then that would be a proper measure for

the cell size.

Hope this info helpful.

Gino

Reply

View All Messages in

**comp.graphics.algorithms**

path:

nearest neighbour of a point =>Re: nearest neighbour of a point =>

Replies:

Re: nearest neighbour of a point

Copyright © 2006 WatermarkFactory.com. All Rights Reserved.