Differences between moran.test and lm.morantest

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Differences between moran.test and lm.morantest

Javier García

Hello everybody:

 

Currently I am working on a paper in which we need to analyze the presence of possible spatial correlation in the data. With this aim I am running some tests in R. I am a little bit confused about the differences between moran.test and lm.morantest R functions. The problem I face to is that when I run moran.test on my  regression residuals the result is totally different from the one I obtain when I use lm.morantest with the lm regression object (please, see below the different outputs I get and after it a reproducible example). In particular, whereas the observed Moran I is the same, the expectation and variance differ dramatically, getting opposite conclusions. I would appreciate very much if someone could clarify for me which is the cause behind this. By the way, I also run LM tests (LMerr, RLMerr, LMlag and RLMlag) not rejecting the null hypothesis in any of them (all p-values are higher than 0.7), which is in clear contradiction with the lm.morantest… how is this possible?

 

 

MY PARTICULAR CASE 

 

reg.OLS <- lm(y~z1+z2+z3+z4+z5+z6+z7+z8+z9+z10, data=datos)

 

moran.test(resid(reg.OLS),alternative="two.sided", W_n)

 

        Moran I test under randomisation

 

data:  resid(reg.OLS) 

weights: W_n 

 

Moran I statistic standard deviate = 0.4434, p-value = 0.6575

alternative hypothesis: two.sided

sample estimates:

Moran I statistic       Expectation          Variance

     1.596378e-05     -3.595829e-04      7.173448e-07

 

 

 

moran.lm <-lm.morantest(reg.OLS, W_n, alternative="two.sided")

print(moran.lm)

 

        Global Moran I for regression residuals

 

data: 

model: lm(formula = y ~ z1 + z2 + z3 + z4 + z5 + z6 + z7 + z8 + z9 + z10

, data = datos)

weights: W_n

 

Moran I statistic standard deviate = 11.649, p-value < 2.2e-16

alternative hypothesis: two.sided

sample estimates:

Observed Moran I      Expectation         Variance

    1.596378e-05    -1.913005e-03     2.741816e-08

 

 

A REPRODUCIBLE EXAMPLE

 

 

library(spdep)

data(oldcol)

oldcrime.lm <- lm(CRIME ~ HOVAL + INC + OPEN + PLUMB + DISCBD + PERIM, data = COL.OLD)

 

moran.test(resid(oldcrime.lm), nb2listw(COL.nb, style="W"))

 

        Moran I test under randomisation

 

data:  resid(oldcrime.lm) 

weights: nb2listw(COL.nb, style = "W") 

 

Moran I statistic standard deviate = 1.2733, p-value = 0.1015

alternative hypothesis: greater

sample estimates:

Moran I statistic       Expectation          Variance

      0.096711162      -0.020833333       0.008521765

 

 

lm.morantest(oldcrime.lm, nb2listw(COL.nb, style="W"))

 

        Global Moran I for regression residuals

 

data: 

model: lm(formula = CRIME ~ HOVAL + INC + OPEN + PLUMB + DISCBD +

PERIM, data = COL.OLD)

weights: nb2listw(COL.nb, style = "W")

 

Moran I statistic standard deviate = 1.6668, p-value = 0.04777

alternative hypothesis: greater

sample estimates:

Observed Moran I      Expectation         Variance

     0.096711162     -0.052848581      0.008050938

 

Thanks a lot in advance and sorry for the inconvenience.

 

Javi

 

 

JAVIER GARCÍA

 

Departamento de Economía Aplicada III (Econometría y Estadística)

Facultad de Economía y Empresa (Sección Sarriko)
Avda. Lehendakari Aguirre 83

48015 BILBAO
T.: +34 601 7126 F.: +34 601 3754
www.ehu.es

http://www.unibertsitate-hedakuntza.ehu.es/p268-content/es/contenidos/informacion/manual_id_corp/es_manual/images/firma_email_upv_euskampus_bilingue.gif

 

 


_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Differences between moran.test and lm.morantest

Roger Bivand
Administrator
On Sat, 29 Jul 2017, Javier García wrote:

> Hello everybody:
>
>
>
> Currently I am working on a paper in which we need to analyze the presence
> of possible spatial correlation in the data. With this aim I am running some
> tests in R. I am a little bit confused about the differences between
> moran.test and lm.morantest R functions. The problem I face to is that when
> I run moran.test on my  regression residuals the result is totally different
> from the one I obtain when I use lm.morantest with the lm regression object
> (please, see below the different outputs I get and after it a reproducible
> example). In particular, whereas the observed Moran I is the same, the
> expectation and variance differ dramatically, getting opposite conclusions.
> I would appreciate very much if someone could clarify for me which is the
> cause behind this. By the way, I also run LM tests (LMerr, RLMerr, LMlag and
> RLMlag) not rejecting the null hypothesis in any of them (all p-values are
> higher than 0.7), which is in clear contradiction with the lm.morantest? how
> is this possible?
>
moran.test() is for "primary" variables only - read the reference in
?moran.test. The mean model applied to the this variable is the intercept,
that is the mean only:

> moran.test(COL.OLD$CRIME, nb2listw(COL.nb, style="W"),
+ randomisation=FALSE, alternative="two.sided")

  Moran I test under normality

data:  COL.OLD$CRIME
weights: nb2listw(COL.nb, style = "W")

Moran I statistic standard deviate = 5.6754, p-value = 1.384e-08
alternative hypothesis: two.sided
sample estimates:
Moran I statistic       Expectation          Variance
       0.510951264      -0.020833333       0.008779831

under the Normal assumption.

lm.morantest() can reproduce the same results for the same model:

> lm.morantest(lm(CRIME ~ 1, data=COL.OLD), nb2listw(COL.nb, style="W"),
+ alternative="two.sided")

  Global Moran I for regression residuals

data:
model: lm(formula = CRIME ~ 1, data = COL.OLD)
weights: nb2listw(COL.nb, style = "W")

Moran I statistic standard deviate = 5.6754, p-value = 1.384e-08
alternative hypothesis: two.sided
sample estimates:
Observed Moran I      Expectation         Variance
      0.510951264     -0.020833333      0.008779831

Note that the Normal VI for moran.test() is:

         VI <- (wc$nn * wc$S1 - wc$n * wc$S2 + 3 * S02)/(S02 *
             (wc$nn - 1))

and for lm.morantest():

     XtXinv <- chol2inv(model$qr$qr[p1, p1, drop = FALSE])
     X <- model.matrix(terms(model), model.frame(model))
...
     Z <- lag.listw(listw.U, X, zero.policy = zero.policy)
     C1 <- t(X) %*% Z
     trA <- (sum(diag(XtXinv %*% C1)))
     EI <- -((N * trA)/((N - p) * S0))
     C2 <- t(Z) %*% Z
     C3 <- XtXinv %*% C1
     trA2 <- sum(diag(C3 %*% C3))
     trB <- sum(diag(4 * (XtXinv %*% C2)))
     VI <- (((N * N)/((S0 * S0) * (N - p) * (N - p + 2))) * (S1 +
         2 * trA2 - trB - ((2 * (trA^2))/(N - p))))

where you see easily that the issue of dependent residuals (only n-p are
independent even in the aspatial case, so you see n-p instead of n) is
handled, as are products of X, WX, and (X'X)^{-1}.

Reading the references is essential and should not be neglected. Reading
the code may help, but doesn't explain the reasoning behind the code and
the output. If you need to test model residuals, use the appropriate test.
Using moran.test() on extracted residuals if covariates are included in
the model is never justified.

Different outcomes from Moran's I and LM suggest severe mis-specification
in your model, see for example:

https://groups.google.com/forum/#!msg/openspace-list/k4F4jI9cU1I/s5bj8r4nwn4J

for a simplified flowchart for choosing models.

Hope this clarifies,

Roger

>
>
>
>
> MY PARTICULAR CASE
>
>
>
> reg.OLS <- lm(y~z1+z2+z3+z4+z5+z6+z7+z8+z9+z10, data=datos)
>
>
>
> moran.test(resid(reg.OLS),alternative="two.sided", W_n)
>
>
>
>        Moran I test under randomisation
>
>
>
> data:  resid(reg.OLS)
>
> weights: W_n
>
>
>
> Moran I statistic standard deviate = 0.4434, p-value = 0.6575
>
> alternative hypothesis: two.sided
>
> sample estimates:
>
> Moran I statistic       Expectation          Variance
>
>     1.596378e-05     -3.595829e-04      7.173448e-07
>
>
>
>
>
>
>
> moran.lm <-lm.morantest(reg.OLS, W_n, alternative="two.sided")
>
> print(moran.lm)
>
>
>
>        Global Moran I for regression residuals
>
>
>
> data:
>
> model: lm(formula = y ~ z1 + z2 + z3 + z4 + z5 + z6 + z7 + z8 + z9 + z10
>
> , data = datos)
>
> weights: W_n
>
>
>
> Moran I statistic standard deviate = 11.649, p-value < 2.2e-16
>
> alternative hypothesis: two.sided
>
> sample estimates:
>
> Observed Moran I      Expectation         Variance
>
>    1.596378e-05    -1.913005e-03     2.741816e-08
>
>
>
>
>
> A REPRODUCIBLE EXAMPLE
>
>
>
>
>
> library(spdep)
>
> data(oldcol)
>
> oldcrime.lm <- lm(CRIME ~ HOVAL + INC + OPEN + PLUMB + DISCBD + PERIM, data
> = COL.OLD)
>
>
>
> moran.test(resid(oldcrime.lm), nb2listw(COL.nb, style="W"))
>
>
>
>        Moran I test under randomisation
>
>
>
> data:  resid(oldcrime.lm)
>
> weights: nb2listw(COL.nb, style = "W")
>
>
>
> Moran I statistic standard deviate = 1.2733, p-value = 0.1015
>
> alternative hypothesis: greater
>
> sample estimates:
>
> Moran I statistic       Expectation          Variance
>
>      0.096711162      -0.020833333       0.008521765
>
>
>
>
>
> lm.morantest(oldcrime.lm, nb2listw(COL.nb, style="W"))
>
>
>
>        Global Moran I for regression residuals
>
>
>
> data:
>
> model: lm(formula = CRIME ~ HOVAL + INC + OPEN + PLUMB + DISCBD +
>
> PERIM, data = COL.OLD)
>
> weights: nb2listw(COL.nb, style = "W")
>
>
>
> Moran I statistic standard deviate = 1.6668, p-value = 0.04777
>
> alternative hypothesis: greater
>
> sample estimates:
>
> Observed Moran I      Expectation         Variance
>
>     0.096711162     -0.052848581      0.008050938
>
>
>
> Thanks a lot in advance and sorry for the inconvenience.
>
>
>
> Javi
>
>
>
>
>
>
>
>
> JAVIER GARCÍA
>
>
>
> Departamento de Economía Aplicada III (Econometría y Estadística)
>
> Facultad de Economía y Empresa (Sección Sarriko)
> Avda. Lehendakari Aguirre 83
>
> 48015 BILBAO
> T.: +34 601 7126 F.: +34 601 3754
> <http://www.ehu.es/> www.ehu.es
>
> http://www.unibertsitate-hedakuntza.ehu.es/p268-content/es/contenidos/inform
> acion/manual_id_corp/es_manual/images/firma_email_upv_euskampus_bilingue.gif
>
>
>
>
>
>
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Differences between moran.test and lm.morantest

Javier García
Thanks a lot for such a detailed response, Roger.

Best
Javi

-----Mensaje original-----
De: Roger Bivand [mailto:[hidden email]]
Enviado el: domingo, 30 de julio de 2017 18:42
Para: Javier García
CC: [hidden email]
Asunto: Re: [R-sig-Geo] Differences between moran.test and lm.morantest

On Sat, 29 Jul 2017, Javier García wrote:

> Hello everybody:
>
>
>
> Currently I am working on a paper in which we need to analyze the
> presence of possible spatial correlation in the data. With this aim I
> am running some tests in R. I am a little bit confused about the
> differences between moran.test and lm.morantest R functions. The
> problem I face to is that when I run moran.test on my  regression
> residuals the result is totally different from the one I obtain when I
> use lm.morantest with the lm regression object (please, see below the
> different outputs I get and after it a reproducible example). In
> particular, whereas the observed Moran I is the same, the expectation and
variance differ dramatically, getting opposite conclusions.
> I would appreciate very much if someone could clarify for me which is
> the cause behind this. By the way, I also run LM tests (LMerr, RLMerr,
> LMlag and
> RLMlag) not rejecting the null hypothesis in any of them (all p-values
> are higher than 0.7), which is in clear contradiction with the
> lm.morantest? how is this possible?
>

moran.test() is for "primary" variables only - read the reference in
?moran.test. The mean model applied to the this variable is the intercept,
that is the mean only:

> moran.test(COL.OLD$CRIME, nb2listw(COL.nb, style="W"),
+ randomisation=FALSE, alternative="two.sided")

  Moran I test under normality

data:  COL.OLD$CRIME
weights: nb2listw(COL.nb, style = "W")

Moran I statistic standard deviate = 5.6754, p-value = 1.384e-08
alternative hypothesis: two.sided
sample estimates:
Moran I statistic       Expectation          Variance
       0.510951264      -0.020833333       0.008779831

under the Normal assumption.

lm.morantest() can reproduce the same results for the same model:

> lm.morantest(lm(CRIME ~ 1, data=COL.OLD), nb2listw(COL.nb, style="W"),
+ alternative="two.sided")

  Global Moran I for regression residuals

data:
model: lm(formula = CRIME ~ 1, data = COL.OLD)
weights: nb2listw(COL.nb, style = "W")

Moran I statistic standard deviate = 5.6754, p-value = 1.384e-08
alternative hypothesis: two.sided
sample estimates:
Observed Moran I      Expectation         Variance
      0.510951264     -0.020833333      0.008779831

Note that the Normal VI for moran.test() is:

         VI <- (wc$nn * wc$S1 - wc$n * wc$S2 + 3 * S02)/(S02 *
             (wc$nn - 1))

and for lm.morantest():

     XtXinv <- chol2inv(model$qr$qr[p1, p1, drop = FALSE])
     X <- model.matrix(terms(model), model.frame(model))
...
     Z <- lag.listw(listw.U, X, zero.policy = zero.policy)
     C1 <- t(X) %*% Z
     trA <- (sum(diag(XtXinv %*% C1)))
     EI <- -((N * trA)/((N - p) * S0))
     C2 <- t(Z) %*% Z
     C3 <- XtXinv %*% C1
     trA2 <- sum(diag(C3 %*% C3))
     trB <- sum(diag(4 * (XtXinv %*% C2)))
     VI <- (((N * N)/((S0 * S0) * (N - p) * (N - p + 2))) * (S1 +
         2 * trA2 - trB - ((2 * (trA^2))/(N - p))))

where you see easily that the issue of dependent residuals (only n-p are
independent even in the aspatial case, so you see n-p instead of n) is
handled, as are products of X, WX, and (X'X)^{-1}.

Reading the references is essential and should not be neglected. Reading
the code may help, but doesn't explain the reasoning behind the code and
the output. If you need to test model residuals, use the appropriate test.
Using moran.test() on extracted residuals if covariates are included in
the model is never justified.

Different outcomes from Moran's I and LM suggest severe mis-specification
in your model, see for example:

https://groups.google.com/forum/#!msg/openspace-list/k4F4jI9cU1I/s5bj8r4nwn4
J

for a simplified flowchart for choosing models.

Hope this clarifies,

Roger

>
>
>
>
> MY PARTICULAR CASE
>
>
>
> reg.OLS <- lm(y~z1+z2+z3+z4+z5+z6+z7+z8+z9+z10, data=datos)
>
>
>
> moran.test(resid(reg.OLS),alternative="two.sided", W_n)
>
>
>
>        Moran I test under randomisation
>
>
>
> data:  resid(reg.OLS)
>
> weights: W_n
>
>
>
> Moran I statistic standard deviate = 0.4434, p-value = 0.6575
>
> alternative hypothesis: two.sided
>
> sample estimates:
>
> Moran I statistic       Expectation          Variance
>
>     1.596378e-05     -3.595829e-04      7.173448e-07
>
>
>
>
>
>
>
> moran.lm <-lm.morantest(reg.OLS, W_n, alternative="two.sided")
>
> print(moran.lm)
>
>
>
>        Global Moran I for regression residuals
>
>
>
> data:
>
> model: lm(formula = y ~ z1 + z2 + z3 + z4 + z5 + z6 + z7 + z8 + z9 + z10
>
> , data = datos)
>
> weights: W_n
>
>
>
> Moran I statistic standard deviate = 11.649, p-value < 2.2e-16
>
> alternative hypothesis: two.sided
>
> sample estimates:
>
> Observed Moran I      Expectation         Variance
>
>    1.596378e-05    -1.913005e-03     2.741816e-08
>
>
>
>
>
> A REPRODUCIBLE EXAMPLE
>
>
>
>
>
> library(spdep)
>
> data(oldcol)
>
> oldcrime.lm <- lm(CRIME ~ HOVAL + INC + OPEN + PLUMB + DISCBD + PERIM,
data

> = COL.OLD)
>
>
>
> moran.test(resid(oldcrime.lm), nb2listw(COL.nb, style="W"))
>
>
>
>        Moran I test under randomisation
>
>
>
> data:  resid(oldcrime.lm)
>
> weights: nb2listw(COL.nb, style = "W")
>
>
>
> Moran I statistic standard deviate = 1.2733, p-value = 0.1015
>
> alternative hypothesis: greater
>
> sample estimates:
>
> Moran I statistic       Expectation          Variance
>
>      0.096711162      -0.020833333       0.008521765
>
>
>
>
>
> lm.morantest(oldcrime.lm, nb2listw(COL.nb, style="W"))
>
>
>
>        Global Moran I for regression residuals
>
>
>
> data:
>
> model: lm(formula = CRIME ~ HOVAL + INC + OPEN + PLUMB + DISCBD +
>
> PERIM, data = COL.OLD)
>
> weights: nb2listw(COL.nb, style = "W")
>
>
>
> Moran I statistic standard deviate = 1.6668, p-value = 0.04777
>
> alternative hypothesis: greater
>
> sample estimates:
>
> Observed Moran I      Expectation         Variance
>
>     0.096711162     -0.052848581      0.008050938
>
>
>
> Thanks a lot in advance and sorry for the inconvenience.
>
>
>
> Javi
>
>
>
>
>
>
>
>
> JAVIER GARCÍA
>
>
>
> Departamento de Economía Aplicada III (Econometría y Estadística)
>
> Facultad de Economía y Empresa (Sección Sarriko)
> Avda. Lehendakari Aguirre 83
>
> 48015 BILBAO
> T.: +34 601 7126 F.: +34 601 3754
> <http://www.ehu.es/> www.ehu.es
>
>
http://www.unibertsitate-hedakuntza.ehu.es/p268-content/es/contenidos/inform
>
acion/manual_id_corp/es_manual/images/firma_email_upv_euskampus_bilingue.gif
>
>
>
>
>
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Loading...