Re: Getting spgm Models to work at Substate Levels - SAMHSA NSDUH Data - with Added nb Links

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Getting spgm Models to work at Substate Levels - SAMHSA NSDUH Data - with Added nb Links

Stuart Reece
Thankyou so much again Roger.

 

Both the connectivity maps and regressions work fine with the state data so I hope you will not mind if I use the substate data from the link mentioned earlier at :

https://www.samhsa.gov/data/sites/default/files/cbhsq-reports/NSDUHsubstateShapeFile2016/ShapeFile2016.zip 

 

and then just subset for one state.  New York would be good to choose as it has Richmond Island off the southern tip of Long Island which is not connected to any other area.  So it naturally forms a major point of interest.

 

I have done my level best to follow your instructions very carefully.

 

My code follows - obviously including the path to the file on my machine.

The connectivity map - closely following your own model script of course -  is perfect.  The inserted link works just exactly correctly as you will see.

The spml model also works OK albeit with some mysterious error which I do not well understand, but I do not think it matters much either.

 

However the spgm model does not run at all.

Of course this is also the most important model.

This throws the usual error "Error in x[, ii] : subscript out of bounds".

I have included the traceback notes, but I do not understand them at all.

 

Thankyou again for all of your kind and gracious advice.

 

Yours sincerely,

Stuart.

 

 

################################################################################################################################################################################################

################## Using NYC as a Reproducible Example

# Substate Shapefile - Single Year



USC <- st_read(dsn=" [path to file]  /NSDUH/SubstateRegionData141516.shp")

dim(USC)

names(USC)



USCReduced <- USC[,c(1:6,10,14,18,22,26,30,34,38,42,46,50,66)]

dim(USCReduced)

names(USCReduced)



NYsf <- subset(USCReduced, ST_NAME=="New York")

NYsf$Ix <- 0:14

NYsf$YearDt <- as.POSIXct(as.Date("01/01/2015", format="%d/%m/%Y"), tz="GMT")

class(NYsf$YearDt)

dim(NYsf)

names(NYsf)

NYsf <- NYsf[,c(19,20,1:5,9,6,7,10,14,15,12,8,13,16,17,18)]

dim(NYsf)

names(NYsf)

glimpse(NYsf)

print(NYsf, n=15)



NYsfnb <- poly2nb(NYsf)

str(NYsfnb, max.level=0)

str(NYsfnb, max.level=1)

summary(NYsfnb)



attr(NYsfnb, "region.id")[1:8]



NYsfnbb <- NYsfnb



attr(NYsfnbb, "region.id")

NYsfnbb[[8]]                           # == Region 322 on the Original map

NYsfnbb[[8]] <- as.integer(5)

NYsfnbb[[8]]                           # == Region 322 on the Original map

NYsfnbb[[5]]                           # == Region 319 on the Original map

NYsfnbb[[5]] <- as.integer(c(6,7,8))

NYsfnbb[[5]]

card(NYsfnbb)

summary(NYsfnbb)



NYsflw  <- nb2listw(NYsfnb, zero.policy=TRUE)

NYsflww <- nb2listw(NYsfnbb)

str(NYsfnbb, max.level = 0)

str(NYsflww, max.level = 0)



can.be.simmed(NYsflww)



############### US Substate Test Mapping::

coordsSubstateNY <- st_coordinates(st_centroid(st_geometry(NYsf), of_largest_polygon = TRUE))



## SubState_q1

dxxx <- diffnb(NYsfnb, NYsfnbb)

plot(st_geometry(NYsf), border = "grey50")

plot(NYsfnb, coordsSubstateNY, add = TRUE, lwd=0.5)

plot(dxxx, coordsSubstateNY, add = TRUE, col = "red", lwd=2)

title(main=paste("Differences (red) in New York Sub-State GAL Queen Weights (Black) First Order - USA", cex.main=1))



### Checking w Regressions

summary(spml(asinh(smiyr) ~ cigmon * asinh(mrjmon),

             data=NYsf, listw=NYsflww,

             lag=TRUE, effect="individual",

             model="random", spatial.error="b"))



summary(spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon),

             data=NYsf, listw=NYsflww,

             lag=TRUE, moments="fullweights", method="g2sls",

             model="random", spatial.error=TRUE))



#    Error in x[, ii] : subscript out of bounds

#

# > traceback()

# 13: na.omit.data.frame(list(`yend[, i]` = c(0, 0, 0, 0, 0, 0, 0,

#                                             0, 0, 0, 0, 0, 0, 0, 0), H = numeric(0)))

# 12: na.omit(list(`yend[, i]` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

#                                  0, 0, 0, 0), H = numeric(0)))

# 11: model.frame.default(formula = yend[, i] ~ H - 1, drop.unused.levels = TRUE)

# 10: stats::model.frame(formula = yend[, i] ~ H - 1, drop.unused.levels = TRUE)

# 9: eval(mf, parent.frame())

# 8: eval(mf, parent.frame())

# 7: lm(yend[, i] ~ H - 1)

# 6: fitted.values(lm(yend[, i] ~ H - 1))

# 5: spgm.tsls(ywithin, wywithin, Xwithin, Hwithin)

# 4: ivplm.w2sls(Y = y, X = x, lag = TRUE, listw = Ws, listw2 = Ws2,

#                twow = twow, lag.instruments = lag.instruments, T = T, N = N,

#                NT = NT)

# 3: spsarargm(formula = formula, data = data, index = index, listw = listw,

#              listw2 = listw2, moments = moments, lag = lag, endog = endog,

#              instruments = instruments, verbose = verbose, effects = effects,

#              control = control, lag.instruments = lag.instruments, optim.method = optim.method,

#              pars = pars, twow = twow)

# 2: spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data = NYsf, listw = NYsflww,

#         lag = TRUE, moments = "fullweights", model = "random", spatial.error = TRUE)

# 1: summary(spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data = NYsf,

#                 listw = NYsflww, lag = TRUE, moments = "fullweights", model = "random",

#                 spatial.error = TRUE))





-----Original Message-----
From: Roger Bivand <[hidden email]>
Sent: Thursday, 8 August, 2019 9:35 PM
To: Stuart Reece <[hidden email]>
Cc: 'Dr Stuart Reece' <[hidden email]>; 'Barry Rowlingson' <[hidden email]>; 'R-sig-geo Mailing List' <[hidden email]>
Subject: RE: [R-sig-Geo] Adding a Few Neighbour Relationships to a nb List



On Thu, 8 Aug 2019, Stuart Reece wrote:



> Thankyou for this advice Roger.

> Happy to provide my work for your review but I am not sure how.

> This list server has a limit of 50KB and these files are about 100MB....

> Love to assist but I would need advice on how best to proceed....???



Choose a built-in data set, such as Produc, change two states to ALASKA and HAWAII, drop the ones you chose from the map and any other states not on your map, and use that data set, putting the script needed at the head of your example. Then create the weights, initially with no links to ALASKA and HAWAII, and run the model. Next, edit the neighbour object in script form, and try again. Then anyone with your map object (you gave a link), your script, and the plm and splm packages can reproduce your problem.



Do also run traceback() after errors, and post what you see.



Roger



> Many thanks again,

> Stuart.

>


        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: Getting spgm Models to work at Substate Levels - SAMHSA NSDUH Data - with Added nb Links

Roger Bivand
Administrator
On Thu, 8 Aug 2019, Stuart Reece wrote:

> Thankyou so much again Roger.
>
> Both the connectivity maps and regressions work fine with the state data
> so I hope you will not mind if I use the substate data from the link
> mentioned earlier at :
>
> https://www.samhsa.gov/data/sites/default/files/cbhsq-reports/NSDUHsubstateShapeFile2016/ShapeFile2016.zip
>
> and then just subset for one state.  New York would be good to choose as
> it has Richmond Island off the southern tip of Long Island which is not
> connected to any other area.  So it naturally forms a major point of
> interest.
>
> I have done my level best to follow your instructions very carefully.

Thanks for your patience - now there is something to work on. It is easier
to extract the code if you post plain text only, so my abbreviation is:

library(sf)
USC <- st_read(dsn="SubstateRegionData141516.shp")
USCReduced <- USC[,c(1:6,10,14,18,22,26,30,34,38,42,46,50,66)]
NYsf <- subset(USCReduced, ST_NAME=="New York")
NYsf$Ix <- 0:14
NYsf$YearDt <- as.POSIXct(as.Date("01/01/2015", format="%d/%m/%Y"),
tz="GMT")
NYsf <- NYsf[,c(19,20,1:5,9,6,7,10,14,15,12,8,13,16,17,18)]
library(spdep)
NYsfnb <- poly2nb(NYsf)
NYsfnb
NYsfnb[[8]] <- as.integer(5)
NYsfnb[[5]] <- as.integer(unique(sort(c(NYsfnb[[5]], 8))))
NYsflw  <- nb2listw(NYsfnb)
library(splm)
ml_obj <- spml(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, effect="individual", model="random",
  spatial.error="b"))
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, moments="fullweights", method="g2sls",
  model="random", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, moments="weights", method="g2sls",
  model="random", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, lag=TRUE, moments="initial", method="g2sls",
  model="random", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds
gm_obj <- spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw, moments="fullweights", spatial.error=TRUE)
# Error in x[, ii] : subscript out of bounds

Could you please try to run the model through plm - this subset seems only
to have one time period? Perhaps plm is more forgiving than splm? Maybe
that is confusing the internals?

summary(lm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf))
library(spatialreg)
summary(lagsarlm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data=NYsf,
  listw=NYsflw))

both complete without trouble?

Roger

>
> My code follows - obviously including the path to the file on my machine.
>
> The connectivity map - closely following your own model script of course -  is perfect.  The inserted link works just exactly correctly as you will see.
>
> The spml model also works OK albeit with some mysterious error which I do not well understand, but I do not think it matters much either.
>
>
>
> However the spgm model does not run at all.
>
> Of course this is also the most important model.
>
> This throws the usual error "Error in x[, ii] : subscript out of bounds".
>
> I have included the traceback notes, but I do not understand them at all.
>
>
>
> Thankyou again for all of your kind and gracious advice.
>
>
>
> Yours sincerely,
>
> Stuart.
>
>
>
>
>
> ################################################################################################################################################################################################
>
> ################## Using NYC as a Reproducible Example
>
> # Substate Shapefile - Single Year
>
>
>
> USC <- st_read(dsn=" [path to file]  /NSDUH/SubstateRegionData141516.shp")
>
> dim(USC)
>
> names(USC)
>
>
>
> USCReduced <- USC[,c(1:6,10,14,18,22,26,30,34,38,42,46,50,66)]
>
> dim(USCReduced)
>
> names(USCReduced)
>
>
>
> NYsf <- subset(USCReduced, ST_NAME=="New York")
>
> NYsf$Ix <- 0:14
>
> NYsf$YearDt <- as.POSIXct(as.Date("01/01/2015", format="%d/%m/%Y"), tz="GMT")
>
> class(NYsf$YearDt)
>
> dim(NYsf)
>
> names(NYsf)
>
> NYsf <- NYsf[,c(19,20,1:5,9,6,7,10,14,15,12,8,13,16,17,18)]
>
> dim(NYsf)
>
> names(NYsf)
>
> glimpse(NYsf)
>
> print(NYsf, n=15)
>
>
>
> NYsfnb <- poly2nb(NYsf)
>
> str(NYsfnb, max.level=0)
>
> str(NYsfnb, max.level=1)
>
> summary(NYsfnb)
>
>
>
> attr(NYsfnb, "region.id")[1:8]
>
>
>
> NYsfnbb <- NYsfnb
>
>
>
> attr(NYsfnbb, "region.id")
>
> NYsfnbb[[8]]                           # == Region 322 on the Original map
>
> NYsfnbb[[8]] <- as.integer(5)
>
> NYsfnbb[[8]]                           # == Region 322 on the Original map
>
> NYsfnbb[[5]]                           # == Region 319 on the Original map
>
> NYsfnbb[[5]] <- as.integer(c(6,7,8))
>
> NYsfnbb[[5]]
>
> card(NYsfnbb)
>
> summary(NYsfnbb)
>
>
>
> NYsflw  <- nb2listw(NYsfnb, zero.policy=TRUE)
>
> NYsflww <- nb2listw(NYsfnbb)
>
> str(NYsfnbb, max.level = 0)
>
> str(NYsflww, max.level = 0)
>
>
>
> can.be.simmed(NYsflww)
>
>
>
> ############### US Substate Test Mapping::
>
> coordsSubstateNY <- st_coordinates(st_centroid(st_geometry(NYsf), of_largest_polygon = TRUE))
>
>
>
> ## SubState_q1
>
> dxxx <- diffnb(NYsfnb, NYsfnbb)
>
> plot(st_geometry(NYsf), border = "grey50")
>
> plot(NYsfnb, coordsSubstateNY, add = TRUE, lwd=0.5)
>
> plot(dxxx, coordsSubstateNY, add = TRUE, col = "red", lwd=2)
>
> title(main=paste("Differences (red) in New York Sub-State GAL Queen Weights (Black) First Order - USA", cex.main=1))
>
>
>
> ### Checking w Regressions
>
> summary(spml(asinh(smiyr) ~ cigmon * asinh(mrjmon),
>
>             data=NYsf, listw=NYsflww,
>
>             lag=TRUE, effect="individual",
>
>             model="random", spatial.error="b"))
>
>
>
> summary(spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon),
>
>             data=NYsf, listw=NYsflww,
>
>             lag=TRUE, moments="fullweights", method="g2sls",
>
>             model="random", spatial.error=TRUE))
>
>
>
> #    Error in x[, ii] : subscript out of bounds
>
> #
>
> # > traceback()
>
> # 13: na.omit.data.frame(list(`yend[, i]` = c(0, 0, 0, 0, 0, 0, 0,
>
> #                                             0, 0, 0, 0, 0, 0, 0, 0), H = numeric(0)))
>
> # 12: na.omit(list(`yend[, i]` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>
> #                                  0, 0, 0, 0), H = numeric(0)))
>
> # 11: model.frame.default(formula = yend[, i] ~ H - 1, drop.unused.levels = TRUE)
>
> # 10: stats::model.frame(formula = yend[, i] ~ H - 1, drop.unused.levels = TRUE)
>
> # 9: eval(mf, parent.frame())
>
> # 8: eval(mf, parent.frame())
>
> # 7: lm(yend[, i] ~ H - 1)
>
> # 6: fitted.values(lm(yend[, i] ~ H - 1))
>
> # 5: spgm.tsls(ywithin, wywithin, Xwithin, Hwithin)
>
> # 4: ivplm.w2sls(Y = y, X = x, lag = TRUE, listw = Ws, listw2 = Ws2,
>
> #                twow = twow, lag.instruments = lag.instruments, T = T, N = N,
>
> #                NT = NT)
>
> # 3: spsarargm(formula = formula, data = data, index = index, listw = listw,
>
> #              listw2 = listw2, moments = moments, lag = lag, endog = endog,
>
> #              instruments = instruments, verbose = verbose, effects = effects,
>
> #              control = control, lag.instruments = lag.instruments, optim.method = optim.method,
>
> #              pars = pars, twow = twow)
>
> # 2: spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data = NYsf, listw = NYsflww,
>
> #         lag = TRUE, moments = "fullweights", model = "random", spatial.error = TRUE)
>
> # 1: summary(spgm(asinh(smiyr) ~ cigmon * asinh(mrjmon), data = NYsf,
>
> #                 listw = NYsflww, lag = TRUE, moments = "fullweights", model = "random",
>
> #                 spatial.error = TRUE))
>
>
>
>
>
> -----Original Message-----
> From: Roger Bivand <[hidden email]>
> Sent: Thursday, 8 August, 2019 9:35 PM
> To: Stuart Reece <[hidden email]>
> Cc: 'Dr Stuart Reece' <[hidden email]>; 'Barry Rowlingson' <[hidden email]>; 'R-sig-geo Mailing List' <[hidden email]>
> Subject: RE: [R-sig-Geo] Adding a Few Neighbour Relationships to a nb List
>
>
>
> On Thu, 8 Aug 2019, Stuart Reece wrote:
>
>
>
>> Thankyou for this advice Roger.
>
>> Happy to provide my work for your review but I am not sure how.
>
>> This list server has a limit of 50KB and these files are about 100MB....
>
>> Love to assist but I would need advice on how best to proceed....???
>
>
>
> Choose a built-in data set, such as Produc, change two states to ALASKA and HAWAII, drop the ones you chose from the map and any other states not on your map, and use that data set, putting the script needed at the head of your example. Then create the weights, initially with no links to ALASKA and HAWAII, and run the model. Next, edit the neighbour object in script form, and try again. Then anyone with your map object (you gave a link), your script, and the plm and splm packages can reproduce your problem.
>
>
>
> Do also run traceback() after errors, and post what you see.
>
>
>
> Roger
>
>
>
>> Many thanks again,
>
>> Stuart.
>
>>
>
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway