Spatial autocorrelation help

Dechen Lham
Hello all,

I would like some help with the problem below:

I am running a logistic regression, and the residuals of my best model show spatial autocorrelation (SAC) when checked as below; so does the raw response. My response is binary, 0 and 1 (the type of prey, to be predicted by several predictors). The prey types come from a total of 200 locations where the faecal samples were collected. To account for this SAC, I used the autocov_dist function from the spdep package. But when I use it as a new predictor in my model and then check the model residuals for spatial autocorrelation, it is still there. Could you see whether I am doing something wrong, please?

#account for SAC in the model using weights
# auto_covariate is a distance weighted covariate
data$response <- as.numeric(data$response)
auto_weight <- autocov_dist(data$prey.type, xy=coords, nbs=1, type="inverse", zero.policy = TRUE,style="W", longlat = TRUE)

m5_auto <- glm(response ~  predictor1 + predictor2 + predictor3 + predictor4 + predictor1:predictor4, weight=auto_weight, family=quasibinomial("logit"), data=data)

# check spatial autocorrelation - first convert data to spatial points dataframe
dat <- SpatialPointsDataFrame(cbind(data$long, data$lat), data)
lstw  <- nb2listw(knn2nb(knearneigh(dat, k = 2)))

# check SAC in model residuals
moran.test(residuals.glm(m5_auto), lstw) # and gives the below:

Moran I test under randomisation

data:  residuals.glm(m5)  
weights: lstw  

Moran I statistic standard deviate = 1.9194, p-value = 0.02747
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.160824328      -0.004608295       0.007428642
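
For reading the output above: the standard deviate is the observed Moran's I minus its expectation, divided by the square root of its variance, and the p-value is the upper tail of a standard normal. A minimal base-R sketch using only the three printed estimates (the variable names are illustrative, not from the original post):

```r
# Estimates copied from the moran.test() output above
I_obs <- 0.160824328   # Moran I statistic
E_I   <- -0.004608295  # expectation under randomisation
V_I   <- 0.007428642   # variance under randomisation

# Standard deviate and one-sided p-value (alternative: greater)
z <- (I_obs - E_I) / sqrt(V_I)
p <- pnorm(z, lower.tail = FALSE)

# The expectation is -1 / (n - 1), so n can be backed out from it
n_implied <- 1 - 1 / E_I

print(round(c(z = z, p = p, n = n_implied), 4))
```

This should reproduce the reported 1.9194 and 0.02747. Since the expectation is -1/(n - 1), the printed value corresponds to about 218 observations entering the test, assuming every observation has at least one neighbour.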

Someone said it is stupid to account for spatial autocorrelation in a logistic regression when Moran's I shows significant SAC. So I am now wondering how this can be solved, or whether SAC in a logistic regression can simply be ignored?

I am new to spatial statistics and have no idea how to proceed. I only know that my data has spatial autocorrelation (which I hope I have checked correctly using Moran's I as above) and that I now need to account for it in my analysis. Advice from people who have accounted for SAC in their logistic models would be greatly appreciated. Is a logistic mixed model an option to consider, especially if the covariates are spatial in nature? I read somewhere that if you cannot account for SAC in a GLM, you can move to mixed models, especially if your covariates are spatial, since these are expected to absorb the SAC.

Help and advice would be greatly appreciated.

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Re: Spatial autocorrelation help

Patrick Schratz
Hi Dechen,

It is very important to account for SAC in any model. This can be done in various ways. In logistic regression it is common to include spatial autocorrelation structures that describe the underlying SAC. To do so, you can use mixed models, e.g. MASS::glmmPQL().
Also have a look at Wood (2017), Generalized Additive Models: An Introduction with R.
I did account for it in my master's thesis. Even though the code is not attached, it may help you: https://zenodo.org/record/814262
Cheers, Patrick
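
One way to follow the Wood (2017) pointer is a logistic GAM with a two-dimensional smooth of the coordinates, so that broad spatial structure is absorbed by the smooth term. The sketch below is only an illustration on simulated data: the data frame d, the spatial trend, and the single predictor are assumptions, not the data from this thread, and the isotropic smooth treats one unit of long and lat as equal (projected coordinates would usually be preferable).

```r
library(mgcv)

set.seed(123)
n <- 200

# Simulated stand-in for the real data (hypothetical columns)
d <- data.frame(long = runif(n), lat = runif(n), predictor1 = rnorm(n))

# Hypothetical smooth spatial trend on the logit scale
trend <- sin(3 * d$long) + cos(3 * d$lat)
d$response <- rbinom(n, 1, plogis(0.8 * d$predictor1 + trend))

# Logistic GAM: parametric effect of predictor1 plus a 2-D spatial smooth
fit_gam <- gam(response ~ predictor1 + s(long, lat),
               family = binomial, data = d, method = "REML")

summary(fit_gam)
```

The residuals of such a fit can then be re-checked for remaining autocorrelation in the same way as for the plain GLM.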

Re: Spatial autocorrelation help

Dechen Lham
Hi Patrick

Thank you for your quick response. I went through your thesis and it contains very useful information. One thing I was wondering: rather than using a GAMM, could you also use quadratic terms for the predictors that may have a non-linear relation with the response variable?

Besides, I still need to figure out how to check the SAC correctly in my data, since there is a global Moran's I and a local Moran's I, right? I also need to figure out how to plot them correctly to see the patterns. I made correlograms of the raw data and of the residuals of the best model, but both looked very similar. Also, after accounting for SAC, Moran's I was still significant, so the SAC was not accounted for. It would be great if you could see whether I am doing something wrong while accounting for the SAC in the code from my first post, please.
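
One detail in the code from the first post is worth checking: glm(..., weight=auto_weight) partially matches the weights argument of glm(), so the autocovariate is used as prior observation weights rather than as a predictor. A minimal sketch of the predictor version follows, on simulated data. The data frame d, the 50 km radius, and the column name ac are assumptions for illustration; with longlat = TRUE the nbs radius is measured in kilometres, so nbs=1 may leave many points without neighbours.

```r
library(spdep)

set.seed(42)
n <- 100

# Simulated stand-in for the real data (hypothetical columns and extent)
d <- data.frame(long = runif(n, 89, 91), lat = runif(n, 26, 28),
                predictor1 = rnorm(n))
d$response <- rbinom(n, 1, plogis(0.8 * d$predictor1))
coords <- as.matrix(d[, c("long", "lat")])

# Distance-weighted autocovariate of the response within a 50 km radius
# (style = "B" is the autocov_dist() default)
d$ac <- autocov_dist(d$response, coords, nbs = 50, type = "inverse",
                     style = "B", zero.policy = TRUE, longlat = TRUE)

# The autocovariate enters the formula as an ordinary predictor
fit_ac <- glm(response ~ predictor1 + ac, family = binomial, data = d)
summary(fit_ac)
```

Whether an autocovariate model is appropriate at all is a separate question; this only shows where the term goes in the call.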


regards



Re: Spatial autocorrelation help

Orcun Morali
Hi Dechen,

As for measuring spatial autocorrelation, one thing I noticed about your
output is that you are using the randomization assumption in
spdep::moran.test. The randomization assumption is not appropriate for
Moran's I of regression residuals, and spdep::lm.morantest is the
function that correctly calculates the moments of the measure for
regression residuals anyway. Before using lm.morantest though, if I were
you, I would check whether its inference applies to logistic regression
residuals as well, since the theory was initially based on classical
linear regression.
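
A minimal sketch of the lm.morantest call on simulated data is given here. The data frame d and the k = 5 neighbour definition are assumptions for illustration. The call accepts a glm fit because glm objects inherit from class lm, but that only shows the function runs; whether the resulting p-value is valid for logistic residuals is exactly the open question raised above.

```r
library(spdep)

set.seed(1)
n <- 100

# Simulated stand-in for the real data (hypothetical columns)
d <- data.frame(long = runif(n), lat = runif(n), predictor1 = rnorm(n))
d$response <- rbinom(n, 1, plogis(0.8 * d$predictor1))
coords <- as.matrix(d[, c("long", "lat")])

# Row-standardised weights from the 5 nearest neighbours
lw <- nb2listw(knn2nb(knearneigh(coords, k = 5)), style = "W")

fit <- glm(response ~ predictor1, family = binomial, data = d)

# Moran's I for regression residuals, with moments that account for
# the fitted model rather than assuming randomisation
mt <- lm.morantest(fit, lw, alternative = "greater")
print(mt)
```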

As for fitting a spatial logistic model, if you need one, the McSpatial
package in R might help you.

Best Regards,

Orcun


Re: Spatial autocorrelation help

Dechen Lham
Hi Orcun

I am not quite sure whether I am doing this correctly, but I do understand that I first need to check whether spatial autocorrelation occurs in my data. So I followed the steps below, and after that checked it again in the residuals of the best model.

library(sp)     # coordinates()
library(spdep)  # knearneigh(), knn2nb(), nb2listw(), nbdists(), moran.test()

# Another approach to detect SAC: build the neighbour list first, get the distance
# from each point to its neighbours, invert the distances, then test with Moran's I
coord <- cbind(data$long, data$lat)
coords <- coordinates(coord)

# matrix of nearest-neighbour indices: knearneigh() finds the k = 5 nearest neighbours
nn5 <- knearneigh(coords, k=5)
mi5.nlist <- knn2nb(nn5, row.names = NULL, sym=FALSE)

# spatial weights list with equal weights per neighbour (row-standardised by default)
mi5.sw <- nb2listw(mi5.nlist)

# Moran's I using inverse distance as weights
# first compute the distance from each point to each of its neighbours
mi5.dist <- nbdists(mi5.nlist, coords)

# now invert the distances to get the weights (closer = higher);
# zero distances (coincident points) get a large finite weight instead of Inf
mi5.dist1 <- lapply(mi5.dist, function(x){ifelse(is.finite(1/x), (1/x), (1/0.001))})
mi5.dist2 <- lapply(mi5.dist, function(x){ifelse(is.finite(1/x^2), (1/x^2), (1/0.001^2))})

# check the distribution of the inverse-distance weights
summary(unlist(mi5.dist1))

# now create spatial weights lists weighted by inverse distance
mi5.d1sw <- nb2listw(mi5.nlist, glist=mi5.dist1)
mi5.d2sw <- nb2listw(mi5.nlist, glist=mi5.dist2)

# Moran's tests on the raw response
moran.test(as.numeric(data$response), mi5.d1sw)
moran.test(as.numeric(data$response), mi5.d2sw)

The first Moran's test (inverse-distance weights) gives:
Moran I statistic standard deviate = 2.0328, p-value = 0.02104
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
      0.105850408      -0.004608295       0.002952729

The second Moran's test (inverse squared distance weights) gives:

Moran I statistic standard deviate = 2.3848, p-value = 0.008545
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
      0.154097396      -0.004608295       0.004428848

Both indicate the presence of spatial autocorrelation in the raw data.
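
To make explicit what the steps above compute, here is a hand-rolled version of Moran's I with row-standardised inverse-distance weights over the k nearest neighbours, in base R only. It is a sketch on simulated points: xy, z and k are assumptions, plain Euclidean distances are used, and no significance test is included.

```r
set.seed(7)
n <- 50
k <- 5

# Simulated coordinates and binary response (hypothetical data)
xy <- cbind(runif(n), runif(n))
z  <- rbinom(n, 1, 0.5)

# Pairwise Euclidean distances between all points
D <- as.matrix(dist(xy))

# Inverse-distance weights to the k nearest neighbours of each point
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nn <- order(D[i, ])[2:(k + 1)]  # position 1 is the point itself
  W[i, nn] <- 1 / D[i, nn]
}

# Row-standardise so that each row sums to 1 (style "W" in spdep terms)
W <- W / rowSums(W)

# Moran's I: (n / S0) * sum_ij w_ij (z_i - zbar)(z_j - zbar) / sum_i (z_i - zbar)^2
zc <- z - mean(z)
I  <- (n / sum(W)) * sum(W * outer(zc, zc)) / sum(zc^2)

# Expectation under the null of no spatial autocorrelation
EI <- -1 / (n - 1)

print(c(I = I, expectation = EI))
```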

Should I account for this in all models, or is it fine if I fit a logistic mixed model? Help is much appreciated. I find it difficult to understand what the problem is and how to solve it.


