How to efficiently generate data of neighboring points within specified radii (distances) for each point in a given points data set.

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

How to efficiently generate data of neighboring points within specified radii (distances) for each point in a given points data set.

Lom Navanyo
Hello,
I have data set of about 3400 location points with which I am trying to
generate data of each point and their neighbors within defined radii (eg,
0.25, 1, and 3 miles).

Below is a reprex using the built-in  nz_height  data:

library(sf)
library(dplyr)
library(spData)
library(ggplot2)
library(stringr)
library(rgdal)
library(lwgeom)
library(sp)


#Transform and project to required UTM

projdata<-st_transform(nz_height, 32759)  #32759 is for UTM Zone 59S


# plot(projdata$geometry)

# sequence of radii

bufferR <- c(402.336, 1609.34, 3218.69, 4828.03, 6437.38)

#Create data of neighboring wells per buffer

dataout <- do.call("rbind", lapply(1:length(bufferR), function(y) {
    bfr <- projdata %>% st_buffer(bufferR[y]) ## create Buffer
    ## minus the next smaller buffer
    if(y>1) {
      inters <- suppressWarnings(st_difference(bfr, projdata %>%
st_buffer(bufferR[y-1])))
      bfr <- inters[which(inters$t50_fid == inters$t50_fid.1),]
    }

    # get ids that intersect with buffer
    inters <- bfr %>% st_intersects(projdata)


    do.call("rbind", lapply(which(sapply(inters, length)>0),
         function(z) data.frame(t50_fid = projdata[z,]$t50_fid, radius =
bufferR[y],
                t50_fid_2 = projdata[unlist(inters[z]),]$t50_fid,
                elevation_mtchd = projdata[unlist(inters[z]),]$elevation)))
}))

This gives data frame as:

> head(dataout)
  t50_fid  radius t50_fid_2 elevation_mtchd
1 2353944 402.336   2353944            2723
2 2354404 402.336   2354404            2820
3 2354405 402.336   2354405            2830
4 2369113 402.336   2369113            3033
5 2362630 402.336   2362630            2749
6 2362814 402.336   2362814            2822

The end goal is that for each (original) point with  t50_fid  as IDs, I
want its neighboring points within the specified radius listed under
 t50_fid_2 in a long format. The caveat is that for the very first (ie. the
smallest) radius 402.336,  t50_fid_2 should return neighboring points
within that distance. But for subsequent radii,  t50_fid_2 should return
neighboring points within them but not within the smaller radius. Thus for
example, for radius 1609.34m, I should get as neighboring points, points
within 1609.34m but not within the smaller buffer/radius 402.336m.

The problem is that if I use my full data set of over 3000 rows (points), I
get the following error:

Error in CPL_geos_op2(op, st_geometry(x), st_geometry(y)) : Evaluation
error: std::bad_alloc.

I understand this is a memory issue as the code I am using creates buffers
around each point and this approach is memory intensive.

A suggestion was made that I could achieve my objective  using
st_is_within_distance  instead of  st_buffer  , st_difference  and
st_intersect without creating buffers.

How can I achieve my objective (that is, the table in dataout) efficiently
either with the suggested use of  st_is_within_distance,  or with my code
without running out of memory (RAM) or any other approach?

Thank you for considering my question.
-----------------------
Lom Navanyo Newton

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo