loop memory problem


loop memory problem

marta_TM
Hi all,

I have a script with a loop. It works properly with a few points.
However, when I tried to run my whole database of 90,000 points, the loop
stopped every 3,000 points, and it takes 8 hours to run 3,000 points.

My computer has 32 GB of RAM and an Intel Core i7-6700HQ CPU @ 2.60 GHz.

Do you think it could be a memory problem? Is there any way to make it faster?

Thanks in advance,

Marta

#### files
https://drive.google.com/open?id=0BwqSBe1Yq-FBUWVBOUdvaThEU1k
############## the script
library(sp)
library(gdistance)
library(rgeos)
### define data folder
path_data <- "C:/Users/Q11/"

# debug test: tormenta1
tor <- read.table(paste0(path_data, "tormenta1.csv"), header = TRUE, sep = ",",
                  na.strings = "NA", dec = ".", strip.white = TRUE)
# transition layer
costa6Azo <- raster(paste0(path_data, "costa6Azo_projected.tif"))  # wgs84
transitioncosta6Azo <- transition(costa6Azo, min, directions = 16)  # why min?
trCostS16 <- gdistance::geoCorrection(transitioncosta6Azo, type = "c")

effortSP_tormentapos_1_1 <- as.data.frame(cbind(tor$Lat1, tor$Long1, tor$transect))

str(effortSP_tormentapos_1_1)
sp::coordinates(effortSP_tormentapos_1_1) <- ~V2+V1
sp::proj4string(effortSP_tormentapos_1_1) <- CRS("+proj=longlat +datum=WGS84 +no_defs")

effortSP_tormenta <- sp::spTransform(effortSP_tormentapos_1_1,
  CRS("+proj=utm +zone=26 +ellps=intl +towgs84=-104,167,-38,0,0,0,0 +units=m +no_defs"))

# effortSP_tormenta: to keep the same names
plot(effortSP_tormentapos_1_1, axes = TRUE, add = TRUE)
# calculating the first segment of the whole sailing path (10 total points)
tormenta <- gdistance::shortestPath(trCostS16, effortSP_tormenta@coords[1,],
                                    effortSP_tormenta@coords[2,], output = "SpatialLines")
gLength(tormenta)
lines(tormenta, col = 5)

### here we start with the for-loop:
for (i in seq(2, length(effortSP_tormenta) - 1)) {
  print(tormenta)
  nextSegment <- gdistance::shortestPath(trCostS16, effortSP_tormenta@coords[i,],
                                         effortSP_tormenta@coords[i+1,], output = "SpatialLines")
  # simple addition combines the single SpatialLines segments
  tormenta <- nextSegment + tormenta
  # we plot each new segment
  lines(nextSegment)
  gLength(nextSegment)
}
####################################


_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Model - SAR Poisson GLM

Ticiana Grecco Zanon
Hello everybody,


I am estimating a doubly constrained model with a Poisson (quasi-Poisson) specification, but I would like to include a spatial lag variable to account for spatial dependence. My data are not zero-inflated, but they are overdispersed.


Thanks in advance,

Ticiana




Re: loop memory problem

Marcel Gangwisch
In reply to this post by marta_TM
Hi Marta,

For your loop:
1) Maybe you can do your calculations first and plot the result
afterwards. It might be faster this way.

2) Maybe you can use a parallel foreach loop in order to use the
full performance of your CPU.

3) Loops in R are really slow in general. You could also think about
some fancy stuff like an R compiler or the Rcpp package.

4) In order to analyze your memory consumption, you should keep an eye
on your system resources (depending on your OS: Task Manager (Windows) or
htop (Linux)).
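
As a rough illustration of point 1, a minimal sketch could store each segment in a preallocated list and do the length calculation and the plotting once, after the loop, instead of printing, plotting and re-creating the combined object on every iteration. Here `segment_between()` and `coords` are hypothetical stand-ins for `gdistance::shortestPath()` and `effortSP_tormenta@coords`, so the snippet runs in base R without the data from this thread.

```r
# Hypothetical stand-in for gdistance::shortestPath(): returns the straight
# segment between two points as a 2 x 2 matrix (start row, end row).
segment_between <- function(from, to) rbind(from, to)

set.seed(1)
coords <- cbind(x = runif(10), y = runif(10))   # toy track of 10 points

n <- nrow(coords)
segments <- vector("list", n - 1)               # preallocate one slot per segment
for (i in seq_len(n - 1)) {
  # only the calculation runs inside the loop: no print(), no lines()
  segments[[i]] <- segment_between(coords[i, ], coords[i + 1, ])
}

# lengths are computed once, after the loop
seg_length <- vapply(segments,
                     function(s) sqrt(sum((s[2, ] - s[1, ])^2)),
                     numeric(1))
total_length <- sum(seg_length)

# plotting also happens once, after all calculations are done
plot(coords, type = "n", asp = 1)
for (s in segments) lines(s)
```

With the real data the list elements would be SpatialLines objects. Whether this shortens the run time noticeably depends on how much of it is spent inside shortestPath() itself, which this sketch does not measure.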

Best regards,
Marcel


On 06.03.2017 12:51, marta azores wrote:

> Hi all,
>
> I have a script with a loop. It works properly with a few points.
> However, when I tried to run my whole database of 90,000 points, the loop
> stopped every 3,000 points, and it takes 8 hours to run 3,000 points.
>
> My computer has 32 GB of RAM and an Intel Core i7-6700HQ CPU @ 2.60 GHz.
>
> Do you think it could be a memory problem? Is there any way to make it faster?
>
> Thanks in advance,
>
> Marta

Re: loop memory problem

marta_TM
# Answer from Marcel in the r-sig-geo forum:
# 1) maybe you can first make your calculations and afterwards plot the result.
#    It might be faster this way.
# 2) maybe you can use the parallel foreach loop in order to use the whole
#    performance of your cpu
# 3) loops in R are really slow in general.. you could also think about some
#    fancy stuff like some R compiler or Rcpp package
# 4) in order to analyze your memory consumption, you should have a look on
#    your system resources (depending on your OS: task manager (win) or htop
#    (linux))

#############
# 1) MARCEL: maybe you can first make your calculations and afterwards plot
#    the result. It might be faster this way.
# MARTA: I did that, but the difference between plotting inside the loop and
# afterwards is less than a second.
#############

# 2) MARCEL: maybe you can use the parallel foreach loop in order to use the
#    whole performance of your cpu
# MARTA: I developed 5 different loops. The old one (2.A) works properly but
# is too slow. I followed your suggestion to use parallel loops to increase
# the speed. I wrote a loop (2.B) with %do%, which works, but the time is the
# same as the old loop. The other loops, with %dopar%, didn't work at all.
# Only the simple one (2.D) runs, but it is not what I want. The other two
# (2.C and 2.E) didn't run; they give errors:
#   2.C: task 1 failed - "non-numeric argument to binary operator"
#   2.E: task 1 failed - "no method for coercing this S4 class to a vector"
#
# Any idea how to repair this loop?

#########
# library
########
library(parallel); library(foreach); library(doParallel)
library(gdistance); library(raster)
library(rgdal); library(rgeos)
library(sp)
#######
# data
###########
path_data <- "E:/Q11/"
# cost raster sea and island
costa6Azo <- raster("E:/Q11/costa6Azo_projected.tif")
transitioncosta6Azo <- transition(costa6Azo, min, directions = 16)  # why min?
trCostS16 <- gdistance::geoCorrection(transitioncosta6Azo, type = "c")
# points
boat <- read.table(paste0(path_data, "boat2905.csv"), header = TRUE,
                   sep = ";", na.strings = "NA", dec = ".", strip.white = TRUE)
pos <- as.data.frame(cbind(boat$Lat1, boat$Long1, boat$Ref))
str(pos)
sp::coordinates(pos) <- ~V2+V1
sp::proj4string(pos) <- CRS("+proj=longlat +datum=WGS84 +no_defs")

pos <- sp::spTransform(pos,
  CRS("+proj=utm +zone=26 +ellps=intl +towgs84=-104,167,-38,0,0,0,0 +units=m +no_defs"))
# The aim of the loop: calculating the track of the whole sailing path
line <- gdistance::shortestPath(trCostS16, pos@coords[1,], pos@coords[2,],
                                output = "SpatialLines")
# gLength: (1-2 = 4887.737 m); (2-3 = 12590.01); (12-11 = 11360.39 m); (12-13 = 9453.001 m)
lines(line, col = 5)
# 2.A) old loop ##########################################
# it works, but it's too slow

## here we start with the for-loop
for (i in seq(2, length(pos) - 1)) {
  # calculation of the rest of the segments
  nextSegment <- gdistance::shortestPath(trCostS16, pos@coords[i,],
                                         pos@coords[i+1,], output = "SpatialLines")
  # simple addition combines the single SpatialLines segments
  line <- nextSegment + line
  # we plot each new segment
  lines(nextSegment)
}
# note that we have now ten combined line features in this SpatialLines object
line
gLength(line)  # 110747.2

# 2.B) new parallel loop #################################
# it works, but it's too slow (%do%)
line <- gdistance::shortestPath(trCostS16, pos@coords[1,], pos@coords[2,],
                                output = "SpatialLines")
# gLength: (1-2 = 4887.737 m); (2-3 = 12590.01); (12-11 = 11360.39 m); (12-13 = 9453.001 m)

x <- foreach(i = 2:13) %do% {
  nextSegment <- gdistance::shortestPath(trCostS16, pos@coords[i,],
                                         pos@coords[i-1,], output = "SpatialLines")
  line <- nextSegment + line
}
x  # it works!!!! 1 + 12 SpatialLines!!
# 2.C) parallel loop (%dopar%)
# without success
registerDoParallel()
getDoParWorkers()
line <- gdistance::shortestPath(trCostS16, pos@coords[1,], pos@coords[2,],
                                output = "SpatialLines")
# gLength: (1-2 = 4887.737 m); (2-3 = 12590.01); (12-11 = 11360.39 m); (12-13 = 9453.001 m)
# function
funMTM <- function() {
  nextSegment <- gdistance::shortestPath(trCostS16, pos@coords[i,], pos@coords[i-1,],
                                         output = "SpatialLines")
  line <- nextSegment + line
}
getDoParWorkers()
ptime <- system.time({
  result <- foreach(i = 2:13) %dopar% funMTM()
})
ptime
# Error in funMTM() :
# task 1 failed - "non-numeric argument to binary operator"

# 2.D) simple %dopar% loop
# it works, but I need a SpatialLines output, not a list.
# If I run %dopar% with only the shortestPath function, without adding the
# lines to the SpatialLines object, I get a list with 13 features (individual
# SpatialLines). However, I need one SpatialLines object with 13 lines inside.
registerDoParallel()
registerDoSEQ()
registerDoParallel(cores = 10)
getDoParWorkers()
system.time(foreach(i = 2:13) %dopar% gdistance::shortestPath(trCostS16,
  pos@coords[i,], pos@coords[i-1,], output = "SpatialLines"))
#  user  system elapsed
#  0.74    0.69   54.81
# 2.E) %dopar% loop with the function defined in the loop
# without success:
system.time(foreach(i = 2:13) %dopar% {
  nextSegment = gdistance::shortestPath(trCostS16, pos@coords[i,],
                                        pos@coords[i-1,], output = "SpatialLines")
  line = nextSegment + line
})
# Error in { :
# task 1 failed - "no method for coercing this S4 class to a vector"
stopImplicitCluster()  # no cluster object `cl` was created above
##############
# 3) MARCEL: loops in R are really slow in general.. you could also think
#    about some fancy stuff like some R compiler or Rcpp package
# MARTA:
##### Functions #####
# byte code compilation
library(compiler)
# cmpfun() compiles a function, so the shortestPath call is wrapped in one
myfunc <- function(i) gdistance::shortestPath(trCostS16, pos@coords[i,],
                                              pos@coords[i+1,], output = "SpatialLines")
myFuncCmp <- cmpfun(myfunc)
system.time({
  # one SpatialLines object per consecutive pair of points, returned as a list
  output <- lapply(seq_len(length(pos) - 1), myFuncCmp)
})
#############

# 4) MARCEL: in order to analyze your memory consumption, you should have a
#    look on your system resources (depending on your OS: task manager (win) or
#    htop (linux))
# MARTA: I ran the loop with the Task Manager window open, and it never went
# over 30% of the CPU's memory.
#### files
https://drive.google.com/open?id=0BwqSBe1Yq-FBUWVBOUdvaThEU1k

I don't have my solution yet, but I'm closer now. Your suggestions were really
helpful.
Marta


Re: loop memory problem

marta_TM
Hi again,

I read in the foreach guide that including the packages is a good idea.
Now the parallel loop is running, but at the end I get only one spatial line
(the first one). I don't understand why it is not adding the other spatial lines.

Any idea?

### 3) %dopar%
library(doSNOW)
registerDoSEQ()
registerDoParallel(cores = 10)
getDoParWorkers()
line2 <- gdistance::shortestPath(trCostS16, pos@coords[13,], pos@coords[12,],
                                 output = "SpatialLines")

x <- foreach(i = 2:13,
             .packages = c("gdistance", "doParallel", "foreach", "base", "sp")) %dopar% {
  nextSegment2 = gdistance::shortestPath(trCostS16, pos@coords[i,], pos@coords[i+1,],
                                         output = "SpatialLines")
  line2 = nextSegment2 + line2
  print(i)
}

stopImplicitCluster()  # no cluster object `cl` was created above
registerDoSEQ()

Thanks in advance,
Marta
####files
https://drive.google.com/open?id=0BwqSBe1Yq-FBUWVBOUdvaThEU1k
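
One possible reading of the result above: with %dopar% each worker evaluates the loop body on its own copy of `line2`, so the assignment inside the body never reaches the main session, and foreach only collects the value of the last expression (here `print(i)`). A minimal sketch of the usual pattern, where each worker returns its segment and the results are combined afterwards, is shown below using only the base `parallel` package. `segment_between()` and `coords` are hypothetical stand-ins for `gdistance::shortestPath()` and `pos@coords`, not the code from this thread.

```r
library(parallel)

# Hypothetical stand-in for gdistance::shortestPath(): the straight segment
# between two points, returned as a 2 x 2 matrix.
segment_between <- function(from, to) rbind(from, to)

set.seed(42)
coords <- cbind(x = runif(13), y = runif(13))   # toy track of 13 points

cl <- makeCluster(2)                            # explicit cluster object

# Each worker RETURNS its segment; nothing is assigned to outside variables.
# The data and the helper are passed as arguments, so they reach the workers.
segments <- parLapply(
  cl,
  seq_len(nrow(coords) - 1),
  function(i, coords, f) f(coords[i, ], coords[i + 1, ]),
  coords = coords,
  f = segment_between
)
stopCluster(cl)                                 # cl exists, so this call is valid

# Combine once, in the main session.
track <- do.call(rbind, segments)
```

With foreach, the same idea would be to make the segment the last expression of the loop body, so that it becomes the returned value, and to combine the returned list afterwards (or through the `.combine` argument).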
