# Fast k-NN in R^4 ?

5 messages
Open this post in threaded view
|

## Fast k-NN in R^4 ?

 We are looking for a fast algorithm (implemented in C or even available for R) to compute the k nearest neighbors of a given point from n points in R^4 (i.e. 4-dim. Euclidean space). Even more concretely, the n points in 4- space are in a (n x 4) matrix Y, and he has m other points in 4-space as (m x 4) matrix X The result should be an  (m x k) matrix Ik  where Ik[i,] contains k indices in 1:n with the meaning that Y[Ik[i,]] contains the k nearest neighbours of X[i,] in Y[,] A simply implementation {also the one used inside of the class package knn() function} will be of complexity O(m * n), but the fast algorithms known really should provide O(m * log(n)) complexity. I see that packages 'gstat' and 'spatstat' mention solutions to such problems, but it seems this is just for 2-d point patterns, not 4-d or "general dim" ? Pointers, hints, ... are very welcome.         Martin Maechler, Seminar fuer Statistik, ETH (Federal Inst. Technology) Zurich, SWITZERLAND   <><
Open this post in threaded view
|

## Fast k-NN in R^4 ?

 Administrator On Tue, 6 Jul 2004, Martin Maechler wrote: > We are looking for a fast algorithm (implemented in C or even > available for R) to compute the k nearest neighbors of a given > point from n points in R^4 (i.e. 4-dim. Euclidean space). > > Even more concretely, the n points in 4- space are in a (n x 4) matrix Y, > and he has m other points in 4-space as (m x 4) matrix X > > The result should be an  (m x k) matrix Ik  where > Ik[i,] contains k indices in 1:n with the meaning that > Y[Ik[i,]] contains the k nearest neighbours of X[i,] in Y[,] > > A simply implementation {also the one used inside of the > class package knn() function} will be of complexity O(m * n), > but the fast algorithms known really should provide O(m * log(n)) > complexity. Some time ago I ported ANN (http://www.cs.umd.edu/~mount/ANN/) to R as a package, but never heard back from the authors when I wrote to ask for permission to release. They claim to be about O(kd log n) time for k-nearest neighbours in d dimensions (p. 31, http://www.cs.umd.edu/~mount/Papers/dist.pdf) so this might be a starting point. Please let me know if you'd like a copy of the draft package. Roger > > I see that packages 'gstat' and 'spatstat' mention solutions to such > problems, but it seems this is just for 2-d point patterns, not > 4-d or "general dim" ? > > Pointers, hints, ... are very welcome. > > Martin Maechler, Seminar fuer Statistik, > ETH (Federal Inst. Technology) Zurich, SWITZERLAND   <>< > > _______________________________________________ > R-sig-Geo mailing list > R-sig-Geo at stat.math.ethz.ch > https://www.stat.math.ethz.ch/mailman/listinfo/r-sig-geo> -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Breiviksveien 40, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93 e-mail: Roger.Bivand at nhh.no Roger Bivand Department of Economics Norwegian School of Economics Helleveien 30 N-5045 Bergen, Norway
Open this post in threaded view
|

## Fast k-NN in R^4 ?

 In reply to this post by Martin Maechler This is a really good resource: http://www.cs.sunysb.edu/~algorith/major_section/1.6.shtmlT. On Tue, 2004-07-06 at 11:20, Martin Maechler wrote: > We are looking for a fast algorithm (implemented in C or even > available for R) to compute the k nearest neighbors of a given > point from n points in R^4 (i.e. 4-dim. Euclidean space). > > Even more concretely, the n points in 4- space are in a (n x 4) matrix Y, > and he has m other points in 4-space as (m x 4) matrix X > > The result should be an  (m x k) matrix Ik  where > Ik[i,] contains k indices in 1:n with the meaning that > Y[Ik[i,]] contains the k nearest neighbours of X[i,] in Y[,] > > A simply implementation {also the one used inside of the > class package knn() function} will be of complexity O(m * n), > but the fast algorithms known really should provide O(m * log(n)) > complexity. > > I see that packages 'gstat' and 'spatstat' mention solutions to such > problems, but it seems this is just for 2-d point patterns, not > 4-d or "general dim" ? > > Pointers, hints, ... are very welcome. > > Martin Maechler, Seminar fuer Statistik, > ETH (Federal Inst. Technology) Zurich, SWITZERLAND   <>< > > _______________________________________________ > R-sig-Geo mailing list > R-sig-Geo at stat.math.ethz.ch > https://www.stat.math.ethz.ch/mailman/listinfo/r-sig-geo-- Timothy H. Keitt Section of Integrative Biology University of Texas at Austin http://www.keittlab.org/