Ordinary co-kriging with gstat - computing time/syntax error?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Ordinary co-kriging with gstat - computing time/syntax error?

Mercedes Román
Dear R-sig-Geo members,

I want to perform co-kriging on the residuals on two regression models with
gstat. I am following some tutorials (Rossiter, 2012; Malone et al., 2017),
but I may have a mistake in the script. The computing time seems too long
(I know it takes a lot of calculation time, but in four days it never
finished), even when I set nmax=20. Originally, I wanted to perform a test
on ~ 11000 points, but I am trying now with a subset of 200 points. My
training dataset has ~ 27000 observations. I would like to produce a
reproducible example, but perhaps just the sintaxis of the script may be
enough to have a clue of my problem. My computer has an Intel®Xeon® CPU
E5-2609 v2 processor, and 32Go RAM memory.

Let’s say my two response variables are X1 and X2, and the dataframes with
the coordinates and covariates are “training” and “test”.

### Calculate the predictions on the training dataset (predictors is a
dataframe with the covariates)
training$pred.X1 <- predict(modelX1, newdata = predictors)
training$pred.X2 <- predict(modelX2, newdata = predictors)
### Calculate residuals
training$res.X1<- training$X1 - training$ pred.X1
training$res.X2 <- training$X2 - training$pred.X2
### Transform to spatial
training.sp <- training
coordinates(training.sp) <- ~ x + y
proj4string(training.sp) <- CRS("+init=epsg:2154")

### Define gstat object and compute experimental semivariograms for X1 and
X2 (for visual examination)
X1.g <- gstat(formula = res.X1~1, data = training.sp)
X1.svar <- variogram(X1.g, width = 5000, cutoff=250000)
X1.ivgm <- vgm(nugget=0.56, range=14229, psill=0.65, kappa=10, model="Mat")
## initial variogram model
X1.vgm <- fit.variogram(X1.svar, model=X1.ivgm, fit.method=7)
plot(X1.svar, X1.vgm)

X2.g <- gstat(formula = res.X2~1, data = training.sp)
X2.svar <- variogram(X2.g, width = 5000, cutoff=250000)
X2.ivgm <- vgm(nugget=0.19, range=13954, psill=0.21,kappa=10,   model="Mat")
X2.vgm <- fit.variogram(X2.svar, model=X2.ivgm, fit.method=7)
plot(X2.svar, X2.vgm)

### Define gstat object to compute the direct variograms and covariogram
g <- gstat( NULL, id="res.X1",  formula = res.X1~1, data = training.sp)
g <- gstat( g, id="res.X2",  formula = res.X2~1, data = training.sp)
v.cross <- variogram(g, width=5000, cutoff = 250000)
#### Fill parameters
g <- gstat(g, id = "res.X2", model = X2.vgm,  fill.all=T)
### fit LMCR
g <- fit.lmc(v.cross, g)
### Predict at the test locations (test.sp only indicates the coordinates)
test.sp <- test
coordinates(ensemble.sp) <- ~ x + y
proj4string(ensemble.sp) <- CRS("+init=epsg:2154")
k.c_residuals <- predict(g, test.sp, nmax=20)

I am wondering if there is something missing in the predict command… I do
not need to include the training data in it?
I have been reading previous posts from R-sig-Geo, but I still did not find
the solution. I would appreciate any advice you could give me.

Thanks,

Mercedes Roman

INRA

Unité de Service InfoSol

2163, avenue de la Pomme de Pin

CS 40001 Ardon

45075 ORLEANS cedex 2

France

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Ordinary co-kriging with gstat - computing time/syntax error?

edzer
You need to set nmax for each variable, as a parameter of the gstat()
function call. I don't see nmax mentioned anywhere in the documentation
of gstat::predict.

On 10/04/17 18:16, Mercedes Román wrote:

> Dear R-sig-Geo members,
>
> I want to perform co-kriging on the residuals on two regression models with
> gstat. I am following some tutorials (Rossiter, 2012; Malone et al., 2017),
> but I may have a mistake in the script. The computing time seems too long
> (I know it takes a lot of calculation time, but in four days it never
> finished), even when I set nmax=20. Originally, I wanted to perform a test
> on ~ 11000 points, but I am trying now with a subset of 200 points. My
> training dataset has ~ 27000 observations. I would like to produce a
> reproducible example, but perhaps just the sintaxis of the script may be
> enough to have a clue of my problem. My computer has an Intel®Xeon® CPU
> E5-2609 v2 processor, and 32Go RAM memory.
>
> Let’s say my two response variables are X1 and X2, and the dataframes with
> the coordinates and covariates are “training” and “test”.
>
> ### Calculate the predictions on the training dataset (predictors is a
> dataframe with the covariates)
> training$pred.X1 <- predict(modelX1, newdata = predictors)
> training$pred.X2 <- predict(modelX2, newdata = predictors)
> ### Calculate residuals
> training$res.X1<- training$X1 - training$ pred.X1
> training$res.X2 <- training$X2 - training$pred.X2
> ### Transform to spatial
> training.sp <- training
> coordinates(training.sp) <- ~ x + y
> proj4string(training.sp) <- CRS("+init=epsg:2154")
>
> ### Define gstat object and compute experimental semivariograms for X1 and
> X2 (for visual examination)
> X1.g <- gstat(formula = res.X1~1, data = training.sp)
> X1.svar <- variogram(X1.g, width = 5000, cutoff=250000)
> X1.ivgm <- vgm(nugget=0.56, range=14229, psill=0.65, kappa=10, model="Mat")
> ## initial variogram model
> X1.vgm <- fit.variogram(X1.svar, model=X1.ivgm, fit.method=7)
> plot(X1.svar, X1.vgm)
>
> X2.g <- gstat(formula = res.X2~1, data = training.sp)
> X2.svar <- variogram(X2.g, width = 5000, cutoff=250000)
> X2.ivgm <- vgm(nugget=0.19, range=13954, psill=0.21,kappa=10,   model="Mat")
> X2.vgm <- fit.variogram(X2.svar, model=X2.ivgm, fit.method=7)
> plot(X2.svar, X2.vgm)
>
> ### Define gstat object to compute the direct variograms and covariogram
> g <- gstat( NULL, id="res.X1",  formula = res.X1~1, data = training.sp)
> g <- gstat( g, id="res.X2",  formula = res.X2~1, data = training.sp)
> v.cross <- variogram(g, width=5000, cutoff = 250000)
> #### Fill parameters
> g <- gstat(g, id = "res.X2", model = X2.vgm,  fill.all=T)
> ### fit LMCR
> g <- fit.lmc(v.cross, g)
> ### Predict at the test locations (test.sp only indicates the coordinates)
> test.sp <- test
> coordinates(ensemble.sp) <- ~ x + y
> proj4string(ensemble.sp) <- CRS("+init=epsg:2154")
> k.c_residuals <- predict(g, test.sp, nmax=20)
>
> I am wondering if there is something missing in the predict command… I do
> not need to include the training data in it?
> I have been reading previous posts from R-sig-Geo, but I still did not find
> the solution. I would appreciate any advice you could give me.
>
> Thanks,
>
> Mercedes Roman
>
> INRA
>
> Unité de Service InfoSol
>
> 2163, avenue de la Pomme de Pin
>
> CS 40001 Ardon
>
> 45075 ORLEANS cedex 2
>
> France
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Edzer Pebesma
Institute for Geoinformatics  (ifgi),  University of Münster
Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
Journal of Statistical Software:   http://www.jstatsoft.org/
Computers & Geosciences:   http://elsevier.com/locate/cageo/


_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

signature.asc (484 bytes) Download Attachment
Loading...