Error in predict.sarlm: non-unique row.names given

Error in predict.sarlm: non-unique row.names given

amoroso
Another question on predict.sarlm!

Here is the line of code that is producing the error:
pred <- spatialreg::predict.sarlm(model, df, test.listw, zero.policy = TRUE)

Here is the error:

Error in mat2listw(W, row.names = region.id.mixed, style = style) :
  non-unique row.names given
In addition: Warning messages:
1: In spatialreg::predict.sarlm(model, df, test.listw,  :
  some region.id are both in data and newdata
2: In subset(attr(listw.mixed, "region.id"), attr(listw.mixed, "region.id") %in%  :
  longer object length is not a multiple of shorter object length

Any idea how I can solve the non-unique row.names error?

Thank you!

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Re: Error in predict.sarlm: non-unique row.names given

Roger Bivand
Administrator
Do provide a complete reproducible example. I really appeal to all posting
questions to give potential helpers something to work on. Asking for
reproducible examples is the absolutely dominant response to postings that
lack them, if they get any response at all.

Start with this and work backwards until you can reproduce your
misunderstanding:

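library(sf)          # st_read()
library(spatialreg)  # lagsarlm() and its predict method
col <- st_read(system.file("shapes/columbus.shp", package="spData"))
# split the data into training and test sets
train <- col[col$EW == 1,]
test <- col[col$EW == 0,]
# neighbours for the whole data set and for each subset
col.nb <- spdep::poly2nb(col)
train.nb <- spdep::poly2nb(train)
test.nb <- spdep::poly2nb(test)
# compare these with row.names(col), row.names(train), row.names(test)
attr(col.nb, "region.id")
attr(train.nb, "region.id")
attr(test.nb, "region.id")
train.mod <- lagsarlm(CRIME ~ INC + HOVAL, data=train,
   listw=spdep::nb2listw(train.nb))
# test set newdata with test set weights
try(preds <- predict(train.mod, newdata=test,
   listw=spdep::nb2listw(test.nb)))
preds[2]
# whole data set as newdata with whole data weights
try(preds1 <- predict(train.mod, newdata=col,
   listw=spdep::nb2listw(col.nb)))
# warning
preds1[4]
# test set newdata with whole data weights
try(preds2 <- predict(train.mod, newdata=test,
   listw=spdep::nb2listw(col.nb)))
preds2[2]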

Using the complete set of weights permits the spatial process to flow
between neighbouring members of train/test sets.

Your problem is probably that your two data objects do not use row.names
as expected:

attr(test.nb, "region.id") <- as.character(1:length(test.nb))
attr(train.nb, "region.id") <- as.character(1:length(train.nb))
train.mod1 <- lagsarlm(CRIME ~ INC + HOVAL, data=train,
   listw=spdep::nb2listw(train.nb))
try(preds3 <- predict(train.mod, newdata=test,
   listw=spdep::nb2listw(test.nb)))
# Error in predict.sarlm(train.mod, newdata = test, listw =
# spdep::nb2listw(test.nb)) :
#   mismatch between newdata and spatial weights. newdata should have
# region.id as row.names

as expected. When the predict method assigns the newdata neighbours, it
has to identify the correct rows in newdata from the "region.id"
attribute of the provided weights. If those ids do not match the
row.names of newdata, it fails as described.
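
To make the lookup concrete, here is a minimal base R sketch of the kind
of matching involved. It is an illustration only, not the code used
inside predict.sarlm; the toy data frame, the id vectors and the helper
ids_match() are all hypothetical.

```r
# Toy newdata whose row.names are the original region ids (hypothetical values)
newdata <- data.frame(INC = c(10, 12, 9), row.names = c("3", "7", "12"))

# Ids that agree with the row.names of newdata
good_ids <- c("3", "7", "12")
# Ids reset to 1..n, as in the failing example
bad_ids <- as.character(seq_len(nrow(newdata)))

# Position of each id among the row.names; NA marks an id with no row
match(good_ids, row.names(newdata))
match(bad_ids, row.names(newdata))

# Hypothetical helper: TRUE only if the ids are unique and
# cover exactly the row.names of the data frame
ids_match <- function(ids, df) {
  !anyDuplicated(ids) && setequal(ids, row.names(df))
}

ids_match(good_ids, newdata)
ids_match(bad_ids, newdata)
```

Running the same kind of check on attr(nb, "region.id") against
row.names() of the object passed as newdata should show quickly whether
the two are in step.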

Use the whole data weights when predicting for the test set newdata=, or
if the two graphs do not neighbour each other, that is train.nb is
separate from test.nb (think two islands), make sure that the region.ids
and row.names do not overlap between test and train sets.
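
For the two islands case, a quick way to check for overlapping ids, and
one possible way to remove the overlap, is sketched here in base R. The
data frames and the "train_"/"test_" prefixes are hypothetical. Any
neighbour or weights objects built before relabelling would still carry
the old "region.id" values and would need to be rebuilt or updated to
match.

```r
# Toy train/test sets whose row.names overlap (hypothetical values)
train <- data.frame(CRIME = c(30, 45, 20), row.names = c("1", "2", "3"))
test  <- data.frame(CRIME = c(25, 50),     row.names = c("2", "3"))

# Ids present in both sets; a non-empty result means the ids overlap
shared <- intersect(row.names(train), row.names(test))
shared

# One possible fix for disjoint graphs: prefix the ids by set
row.names(train) <- paste0("train_", row.names(train))
row.names(test)  <- paste0("test_",  row.names(test))

# After relabelling, no id is shared between the two sets
overlap_after <- intersect(row.names(train), row.names(test))
length(overlap_after)
```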

Please use the example to explore the problem in your workflow, (re-)read
Goulard et al. (2017) and the help page, and report back. Remember that
you can only predict for a test set of reasonable size (because, as you
see from the underlying article, you probably need an inverted n x n
matrix in the spatial lag model case).
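
As a rough illustration of why size matters: in the reduced form of the
spatial lag model, y = (I - rho W)^{-1} (X beta + e), so the inverse is a
dense n x n matrix. The base R sketch below uses a toy chain of five
regions and a made-up rho. It only shows the size of the object, not how
predict.sarlm computes its predictions, and dense_gb() is a hypothetical
helper.

```r
n <- 5
# Toy binary neighbour matrix: regions on a line, each touching the next
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
# Row-standardise so that each row sums to one
W <- W / rowSums(W)

rho <- 0.4  # made-up spatial coefficient, for illustration only

# Dense inverse of (I - rho W); it has n * n entries
A_inv <- solve(diag(n) - rho * W)
dim(A_inv)

# Approximate memory in GiB for a dense n x n matrix of doubles
dense_gb <- function(n) n^2 * 8 / 1024^3
dense_gb(50000)
```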

Hope this clarifies

Roger




On Mon, 8 Jul 2019, Jiawen Ng wrote:

> [...]

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
