Error while using predict.sarlm

AMITHA PURANIK
I am getting an error when using predict.sarlm to make predictions from a
spatial lag model fitted with lagsarlm. I used the following code:

predicted = predict(fit.lag, listw=weightmatrix, newdata=missed_data,
pred.type="TS", zero.policy = T)

For the argument newdata, I have passed the same data missed_data which I
used to fit the spatial lag model.

When I run the above code, I get the following error message: “Error in
predict.sarlm(fit.lag, listw = weightmatrix, newdata = missed_data,  :
mismatch between newdata and spatial weights. newdata should have region.id
as row.names”

I obtained the weight matrix from the function below:

weightMat <- function(shp){
  dnb <- knearneigh(coordinates(shp), k=4)
  dnb <- knn2nb(dnb) #create nb
  lw <- nb2listw(dnb, style="W",zero.policy=TRUE) #create lw
  return(lw)
}

To cross-check and make sure there are no discrepancies, I ran the
following lines:

length(weightmatrix$weights)
nrow(missed_data)
nrow(coordinates(shape))

All three return 182, which is the sample size of the data.

Can anyone offer me some guidance in solving this problem? Thanks for your
help.


Thanks & regards,

*Amitha Puranik*
Assistant Professor,
Department of Statistics, PSPH
Phone: 0820-2922407
Address: Department of Statistics,
Health Sciences Library, Level 6,
Manipal Academy of Higher Education, Manipal, Karnataka, India
An Institute of Eminence (Status Accorded by MHRD)

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Re: Error while using predict.sarlm

Roger Bivand
Administrator
On Fri, 24 May 2019, Amitha Puranik wrote:

> I am facing an error while using predict.sarlm to make predictions for spatial
> lag model generated using lagsarlm. I used the following code:
>
> predicted = predict(fit.lag, listw=weightmatrix, newdata=missed_data,
> pred.type="TS", zero.policy = T)
>
> For the argument newdata, I have passed the same data missed_data which I
> used to fit the spatial lag model.
>
> When I run the above code, I get the following error message: “Error in
> predict.sarlm(fit.lag, listw = weightmatrix, newdata = missed_data,  :
> mismatch between newdata and spatial weights. newdata should have region.id
> as row.names”

The predict method has to identify the weights that apply to the newdata, so
it compares the region.id attribute of the neighbour object with the
row.names of the newdata object. If they do not match, it stops with an
error. If shp in your weightMat function was read in the typical way, the
default region.id may be the FID of the input file (0, ..., (n-1)), but the
default row.names of newdata may be 1, ..., n.

For example:

> library(sf)
Linking to GEOS 3.7.2, GDAL 3.0.0, PROJ 6.1.0
> boston_506 <- st_read(system.file(
+                                   "shapes/boston_tracts.shp",
+                                   package="spData")[1])
Reading layer `boston_tracts' from data source `/home/rsb/lib/r_libs/spData/shapes/boston_tracts.shp' using driver `ESRI Shapefile'
Simple feature collection with 506 features and 36 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: -71.52311 ymin: 42.00305 xmax: -70.63823 ymax: 42.67307
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
> nb_q <- spdep::poly2nb(boston_506)
> lw_q <- spdep::nb2listw(nb_q, style="W")
> boston_489 <- boston_506[!is.na(boston_506$median),]
> nb_q_489 <- spdep::poly2nb(boston_489)
> lw_q_489 <- spdep::nb2listw(nb_q_489, style="W", zero.policy=TRUE)
> form <- formula(log(median) ~ CRIM + ZN + INDUS + CHAS +
+                 I((NOX*10)^2) + I(RM^2) + AGE + log(DIS) +
+                 log(RAD) + TAX + PTRATIO + I(BB/100) +
+                 log(I(LSTAT/100)))
> suppressPackageStartupMessages(library(spatialreg))
>
> eigs_489 <- eigenw(lw_q_489)
>
> SLM_489 <- lagsarlm(form, data=boston_489,
+           listw=lw_q_489, zero.policy=TRUE,
+           control=list(pre_eig=eigs_489))
>
> nd <- boston_506[is.na(boston_506$median),]
> t0 <- exp(predict(SLM_489, newdata=nd, listw=lw_q,
+                   pred.type="TS", zero.policy=TRUE))
> str(attr(lw_q, "region.id"))
  chr [1:506] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" ...
> str(row.names(nd))
  chr [1:17] "13" "14" "15" "17" "43" "50" "312" "313" "314" "317" "337" "346" "355" ...
> all(row.names(nd) %in% attr(lw_q, "region.id"))
[1] TRUE
# introduce a wrong row.name
> row.names(nd)[1] <- "0"
> all(row.names(nd) %in% attr(lw_q, "region.id"))
[1] FALSE
> t0 <- exp(predict(SLM_489, newdata=nd, listw=lw_q,
+                   pred.type="TS", zero.policy=TRUE))
Error in predict.sarlm(SLM_489, newdata = nd, listw = lw_q,
   pred.type = "TS",  :
   mismatch between newdata and spatial weights. newdata should have
   region.id as row.names

In this case, the row.names of the input object to spdep::poly2nb() and
the region.id matched, because the newdata were subsetted from the same
object. We don't know the values for your data, but you should be able to
check them. It is important that they align the data with the weights
correctly; otherwise observations would be paired with the wrong
neighbours.
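
One way to run that check on your own objects is sketched below. It uses
only base R. check_region_ids is a hypothetical helper name, not a function
from spdep or spatialreg, and the toy objects merely stand in for a real
listw object and data frame.

```r
# Hypothetical helper (not part of spdep/spatialreg): compare the row.names
# of newdata with the region.id attribute carried by a listw object.
check_region_ids <- function(newdata, listw) {
  ids <- as.character(attr(listw, "region.id"))
  rn  <- row.names(newdata)
  list(
    all_found  = all(rn %in% ids),    # same check as in the transcript above
    not_in_ids = setdiff(rn, ids),    # row.names with no matching region.id
    same_order = identical(rn, ids)   # TRUE only if ids and order both agree
  )
}

# Toy stand-ins: a bare list with a region.id attribute mimics a listw here.
toy_lw <- structure(list(), region.id = as.character(1:5))
toy_df <- data.frame(x = rnorm(5), row.names = c("3", "1", "5", "2", "9"))

res <- check_region_ids(toy_df, toy_lw)
print(res)
```

The not_in_ids element lists the offending row.names, which is usually
quicker to act on than a bare FALSE.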

Hope this helps,

Roger

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

Re: Error while using predict.sarlm

AMITHA PURANIK
Dear Prof. Roger Bivand,

Thanks a lot for clarifying my query.
I ran the following checks and found that the region.id of the listw object
and the row.names of the data do not match:
> str(attr(weightmatrix, "region.id"))
 chr [1:182] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" ...
> str(row.names(missed_data))
 chr [1:182] "142" "108" "149" "76" "8" "71" "45" "75" "173" "119" "22" "32" "156" "221" ...
> all(row.names(missed_data) %in% attr(weightmatrix, "region.id"))
[1] FALSE

How can I change the row.names of missed_data so that they align with the
listw object?


       Thanks & regards,

*Amitha Puranik*

Re: Error while using predict.sarlm

AMITHA PURANIK
Dear Prof. Roger Bivand,

I was able to resolve the error by using the row.names argument of the
knn2nb function. I extracted the ID variable from my data and passed it as
row.names. When I checked again, I got the following:
> str(attr(weightmatrix, "region.id"))
 int [1:182] 16144 16007 16151 15910 15785 15829 16157 15909 16223 16097 ...
> str(row.names(missed_data))
 chr [1:182] "16144" "16007" "16151" "15910" "15785" "15829" "16157" "15909" "16223" ...
> all(row.names(missed_data) %in% attr(weightmatrix, "region.id"))
[1] TRUE

I can now use the predict.sarlm function successfully. Thanks a ton
for your help!!
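
For anyone hitting the same error, the fix described above might look
roughly like the sketch below. It makes several assumptions: the data carry
an identifier column (called ID here, a hypothetical name) in the same row
order as the coordinates; the points are simulated rather than taken from
the original shapefile; and weightMat is changed to accept a coordinate
matrix plus an illustrative ids argument instead of shp.

```r
library(spdep)

# Variant of weightMat that takes a coordinate matrix and passes region ids
# to knn2nb() through its row.names argument. `ids` is an illustrative extra
# argument, not part of the function as originally posted.
weightMat <- function(coords, ids, k = 4) {
  knn <- knearneigh(coords, k = k)
  nb  <- knn2nb(knn, row.names = as.character(ids))  # becomes region.id
  nb2listw(nb, style = "W", zero.policy = TRUE)
}

# Simulated stand-in for the real data: 182 points, hypothetical ID column.
set.seed(1)
n <- 182
missed_data <- data.frame(ID = sample(15000:17000, n),
                          x = runif(n), y = runif(n))
row.names(missed_data) <- as.character(missed_data$ID)

weightmatrix <- weightMat(as.matrix(missed_data[, c("x", "y")]),
                          missed_data$ID)

# Check that the ids match and appear in the same order as the data rows.
ids_found   <- all(row.names(missed_data) %in% attr(weightmatrix, "region.id"))
ids_ordered <- identical(row.names(missed_data),
                         attr(weightmatrix, "region.id"))
```

Converting the ids with as.character() keeps region.id the same type as
row.names, so identical() can also confirm the ordering, not just
membership.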


       Thanks & regards,

*Amitha Puranik*