I am trying to perform moran's I test on residuals from by probit model
using knearest neighbour weights, however i run into an error which i cant seem to find the solution anywhere online the error is > moran.test(residuals.glm(svyprobitest),knear2weight) Error in moran.test(residuals.glm(svyprobitest), knear2weight) : objects of different length The codes I use to come up to this are below > library(foreign) > dhsanalysis2= read.dta("DHSSpatialreg11.dta") ##removing NAs in PSU > dhsanalysis3=subset(dhsanalysis2, !is.na(psu)) ##generating a survey design > mydesighdhs= svydesign(ids =~psu, data =dhsanalysis3, weight = ~wgt, strata = ~v023,nest=TRUE) ##Defining coordinate colums > coordinates(dhsanalysis3)= c("longnum","latnum") #Defining Projection > proj4string(dhsanalysis3) < CRS("+init=epsg:4326") ##binding longiude and latitude > lon2< dhsanalysis3$longnum > lat2< dhsanalysis3$latnum > coords< cbind(lon2,lat2) ##Creating spatial weights based on the nearest neighbour > knear2= knearneigh(coords,k=2,longlat=T) Warning message: In knearneigh(coords, k = 2, longlat = T) : knearneigh: identical points found ### I get a warning message on identical points found, is this a problem and how would i deal with this > knear2.nb= knn2nb(knear2) > knear2weight= nb2listw(knear2.nb, style="W",zero.policy=T) ##declaring categorical variables as factor variables > dhsanalysis3$hivpositive.f < factor(dhsanalysis3$hivpositive) > dhsanalysis3$protestant.f < factor(dhsanalysis3$protestant2) > dhsanalysis3$Married.f < factor(dhsanalysis3$Married) > dhsanalysis3$female.f < factor(dhsanalysis3$female) > dhsanalysis3$urban1.f < factor(dhsanalysis3$urban1) > dhsanalysis3$river10kmdum.f < factor(dhsanalysis3$river10kmdum) > dhsanalysis3$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) > dhsanalysis3$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) > dhsanalysis3$Province1.f < factor(dhsanalysis3$Province1) > dhsanalysis3$WealthIndex.f < factor(dhsanalysis3$WealthIndex) > dhsanalysis3$occupation2.f < factor(dhsanalysis3$occupation2) > dhsanalysis3$highested.f < factor(dhsanalysis3$highested) > protestant.f= dhsanalysis3$protestant.f > hivpositive.f=dhsanalysi3$hivpositive.f > Married.f=dhsanalysis3$Married.f > female.f=dhsanalysis3$female.f > urban1.f=dhsanalysis3$urban1.f > river10kmdum.f=dhsanalysis3$river10kmdum.f > Explorer50kmdum.f=dhsanalysis2$Explorer50kmdum.f > Province1.f=dhsanalysis3$Province1.f > WealthIndex.f=dhsanalysis3$WealthIndex.f > occupation2.f=dhsanalysis3$occupation2.f > highested.f=dhsanalysis3$highested.f > Age=dhsanalysis3$Age > Age2=dhsanalysis3$Age2 > HIVKnowledge=dhsanalysis3$HIVKnowledge > churchkm=dhsanalysis3$churchkm > lnhospitalkm=dhsanalysis3$lnhospitalkm > lnElevationMean=dhsanalysis3$lnElevationMean > ###VARIABLES IN SURVEY DESIGN > mydesighdhs$hivpositive.f < factor(dhsanalysis3$hivpositive) > mydesighdhs$protestant.f < factor(dhsanalysis3$protestant2) > mydesighdhs$Married.f < factor(dhsanalysis3$Married) > mydesighdhs$female.f < factor(dhsanalysis3$female) > mydesighdhs$urban1.f < factor(dhsanalysis3$urban1) > mydesighdhs$river10kmdum.f < factor(dhsanalysis3$river10kmdum) > mydesighdhs$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) > mydesighdhs$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) > mydesighdhs$Province1.f < factor(dhsanalysis3$Province1) > mydesighdhs$WealthIndex.f < factor(dhsanalysis3$WealthIndex) > mydesighdhs$occupation2.f < factor(dhsanalysis3$occupation2) > mydesighdhs$highested.f < factor(dhsanalysis3$highested) > mydesighdhs$Age=Age > mydesighdhs$Age2=Age2 > mydesighdhs$HIVKnowledge=HIVKnowledge > mydesighdhs$churchkm=churchkm > mydesighdhs$lnhospitalkm=lnhospitalkm > mydesighdhs$lnElevationMean=lnElevationMean > svyprobitest= svyglm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+ Age+ Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f + WealthIndex.f + HIVKnowledge+ occupation2.f+ highested.f,design=mydesighdhs,family = quasibinomial(link = "probit"),data=dhsanalysis3) Error in .subset2(x, i, exact = exact) : subscript out of bounds ## Iran my probit model without implementing the survey design in the model just to see whether Moran test is working > svyprobitest2= glm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+ Age+ Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f + WealthIndex.f + HIVKnowledge+ occupation2.f+ highested.f,family = quasibinomial(link = "probit"),data=dhsanalysis3) #Moran test > moran.test(residuals.glm(svyprobitest),knear2weight) Error in moran.test(residuals.glm(svyprobitest), knear2weight) : objects of different length Kind Regards, Michael Chanda Chiseni Phd Candidate Department of Economic History Lund University Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund *Africa is not poor, it is poorly managed (Ellen JohnsonSirleaf ). * [[alternative HTML version deleted]] _______________________________________________ RsigGeo mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/rsiggeo 
Administrator

On Tue, 11 Aug 2020, Chanda Chiseni wrote:
> I am trying to perform moran's I test on residuals from by probit model > using knearest neighbour weights, however i run into an error which i > cant seem to find the solution anywhere online the error is > >> moran.test(residuals.glm(svyprobitest),knear2weight) > Error in moran.test(residuals.glm(svyprobitest), knear2weight) : > objects of different length > > > The codes I use to come up to this are below > >> library(foreign) >> dhsanalysis2= read.dta("DHSSpatialreg11.dta") > ##removing NAs in PSU >> dhsanalysis3=subset(dhsanalysis2, !is.na(psu)) > ##generating a survey design >> mydesighdhs= svydesign(ids =~psu, data =dhsanalysis3, weight = ~wgt, > strata = ~v023,nest=TRUE) > ##Defining coordinate colums >> coordinates(dhsanalysis3)= c("longnum","latnum") > #Defining Projection >> proj4string(dhsanalysis3) < CRS("+init=epsg:4326") > ##binding longiude and latitude >> lon2< dhsanalysis3$longnum >> lat2< dhsanalysis3$latnum >> coords< cbind(lon2,lat2) > ##Creating spatial weights based on the nearest neighbour >> knear2= knearneigh(coords,k=2,longlat=T) > Warning message: > In knearneigh(coords, k = 2, longlat = T) : > knearneigh: identical points found ### I get a warning message on > identical points found, is this a problem and how would i deal with this at the same point, which could be an unintended duplicate (copy of the same observation), or it could indicate that any spatial process has multiple values at that point. In geostatistics, one would jitter the points to introduce distance. In your case, this probably isn't households living at one address point. All the observations at the same point will get the same sets of neighbours. If there are many colocated observations, k in knearest may be too small to include them all. >> knear2.nb= knn2nb(knear2) >> knear2weight= nb2listw(knear2.nb, style="W",zero.policy=T) > > ##declaring categorical variables as factor variables >> dhsanalysis3$hivpositive.f < factor(dhsanalysis3$hivpositive) >> dhsanalysis3$protestant.f < factor(dhsanalysis3$protestant2) >> dhsanalysis3$Married.f < factor(dhsanalysis3$Married) >> dhsanalysis3$female.f < factor(dhsanalysis3$female) >> dhsanalysis3$urban1.f < factor(dhsanalysis3$urban1) >> dhsanalysis3$river10kmdum.f < factor(dhsanalysis3$river10kmdum) >> dhsanalysis3$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) >> dhsanalysis3$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) >> dhsanalysis3$Province1.f < factor(dhsanalysis3$Province1) >> dhsanalysis3$WealthIndex.f < factor(dhsanalysis3$WealthIndex) >> dhsanalysis3$occupation2.f < factor(dhsanalysis3$occupation2) >> dhsanalysis3$highested.f < factor(dhsanalysis3$highested) >> protestant.f= dhsanalysis3$protestant.f >> hivpositive.f=dhsanalysi3$hivpositive.f >> Married.f=dhsanalysis3$Married.f >> female.f=dhsanalysis3$female.f >> urban1.f=dhsanalysis3$urban1.f >> river10kmdum.f=dhsanalysis3$river10kmdum.f >> Explorer50kmdum.f=dhsanalysis2$Explorer50kmdum.f >> Province1.f=dhsanalysis3$Province1.f >> WealthIndex.f=dhsanalysis3$WealthIndex.f >> occupation2.f=dhsanalysis3$occupation2.f >> highested.f=dhsanalysis3$highested.f >> Age=dhsanalysis3$Age >> Age2=dhsanalysis3$Age2 >> HIVKnowledge=dhsanalysis3$HIVKnowledge >> churchkm=dhsanalysis3$churchkm >> lnhospitalkm=dhsanalysis3$lnhospitalkm >> lnElevationMean=dhsanalysis3$lnElevationMean > >> ###VARIABLES IN SURVEY DESIGN >> mydesighdhs$hivpositive.f < factor(dhsanalysis3$hivpositive) >> mydesighdhs$protestant.f < factor(dhsanalysis3$protestant2) >> mydesighdhs$Married.f < factor(dhsanalysis3$Married) >> mydesighdhs$female.f < factor(dhsanalysis3$female) >> mydesighdhs$urban1.f < factor(dhsanalysis3$urban1) >> mydesighdhs$river10kmdum.f < factor(dhsanalysis3$river10kmdum) >> mydesighdhs$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) >> mydesighdhs$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) >> mydesighdhs$Province1.f < factor(dhsanalysis3$Province1) >> mydesighdhs$WealthIndex.f < factor(dhsanalysis3$WealthIndex) >> mydesighdhs$occupation2.f < factor(dhsanalysis3$occupation2) >> mydesighdhs$highested.f < factor(dhsanalysis3$highested) >> mydesighdhs$Age=Age >> mydesighdhs$Age2=Age2 >> mydesighdhs$HIVKnowledge=HIVKnowledge >> mydesighdhs$churchkm=churchkm >> mydesighdhs$lnhospitalkm=lnhospitalkm >> mydesighdhs$lnElevationMean=lnElevationMean > >> svyprobitest= svyglm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+ > Age+ > Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f > + WealthIndex.f + HIVKnowledge+ occupation2.f+ > highested.f,design=mydesighdhs,family = quasibinomial(link = > "probit"),data=dhsanalysis3) > Error in .subset2(x, i, exact = exact) : subscript out of bounds > > ## Iran my probit model without implementing the survey design in the model > just to see whether Moran test is working >> svyprobitest2= glm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+ > Age+ > Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f > + WealthIndex.f + HIVKnowledge+ occupation2.f+ highested.f,family = > quasibinomial(link = "probit"),data=dhsanalysis3) > #Moran test >> moran.test(residuals.glm(svyprobitest),knear2weight) > Error in moran.test(residuals.glm(svyprobitest), knear2weight) : > objects of different length > and variance of the statistic are based on a null model (intercept only), not a model with covariates. What are length(residuals.glm(svyprobitest)) and length(knear2.nb)? Did glm drop observations with missing values? If so, you should subset both the data submitted to glm() and the neighbour object so that they match. Hope this helps, Roger > > Kind Regards, > > Michael Chanda Chiseni > > Phd Candidate > > Department of Economic History > > Lund University > > Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund > > > > *Africa is not poor, it is poorly managed (Ellen JohnsonSirleaf ). * > > [[alternative HTML version deleted]] > > _______________________________________________ > RsigGeo mailing list > [hidden email] > https://stat.ethz.ch/mailman/listinfo/rsiggeo > Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N5045 Bergen, Norway. voice: +47 55 95 93 55; email: [hidden email] https://orcid.org/0000000323926140 https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en _______________________________________________ RsigGeo mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/rsiggeo
Roger Bivand
Department of Economics Norwegian School of Economics Helleveien 30 N5045 Bergen, Norway 
Hi Roger
Thank you for your response. I am using DHS individual level data , they cluster individuals into about 2530 people and assign them coordinates depending on where they were surveyed. So about 30 individuals have similar coordinates. Yes I do have more than one observation at one point as you have stated, this could be the source of my problem indeed, My question is how could i go about creating nearest neighbors and weights by clustering individuals that share the same point? I sought for a solution online but could not find any. Sorry I am a novice at spatial econometrics, this may be obvious to some , so instead of testing for spatial correlation in my residuals, I should in essence test for spatial autocorrelation in my outcome variable, for instance in my analysis I am using a dummy variable of whether an individual is hiv positive or negative. And is there a specific way of using a moran.test() on a dummy variable? Kind Regards, Michael Chanda Chiseni Phd Candidate Department of Economic History Lund University Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund *Africa is not poor, it is poorly managed (Ellen JohnsonSirleaf ). * On Wed, Aug 12, 2020 at 12:55 PM Roger Bivand <[hidden email]> wrote: > On Tue, 11 Aug 2020, Chanda Chiseni wrote: > > > I am trying to perform moran's I test on residuals from by probit model > > using knearest neighbour weights, however i run into an error which i > > cant seem to find the solution anywhere online the error is > > > >> moran.test(residuals.glm(svyprobitest),knear2weight) > > Error in moran.test(residuals.glm(svyprobitest), knear2weight) : > > objects of different length > > > > > > The codes I use to come up to this are below > > > >> library(foreign) > >> dhsanalysis2= read.dta("DHSSpatialreg11.dta") > > ##removing NAs in PSU > >> dhsanalysis3=subset(dhsanalysis2, !is.na(psu)) > > ##generating a survey design > >> mydesighdhs= svydesign(ids =~psu, data =dhsanalysis3, weight = ~wgt, > > strata = ~v023,nest=TRUE) > > ##Defining coordinate colums > >> coordinates(dhsanalysis3)= c("longnum","latnum") > > #Defining Projection > >> proj4string(dhsanalysis3) < CRS("+init=epsg:4326") > > ##binding longiude and latitude > >> lon2< dhsanalysis3$longnum > >> lat2< dhsanalysis3$latnum > >> coords< cbind(lon2,lat2) > > ##Creating spatial weights based on the nearest neighbour > >> knear2= knearneigh(coords,k=2,longlat=T) > > Warning message: > > In knearneigh(coords, k = 2, longlat = T) : > > knearneigh: identical points found ### I get a warning message on > > identical points found, is this a problem and how would i deal with this > > It usually indicates muddled thinking. You have more than one observation > at the same point, which could be an unintended duplicate (copy of the > same observation), or it could indicate that any spatial process has > multiple values at that point. In geostatistics, one would jitter the > points to introduce distance. In your case, this probably isn't households > living at one address point. All the observations at the same point will > get the same sets of neighbours. If there are many colocated > observations, k in knearest may be too small to include them all. > > >> knear2.nb= knn2nb(knear2) > >> knear2weight= nb2listw(knear2.nb, style="W",zero.policy=T) > > > > ##declaring categorical variables as factor variables > >> dhsanalysis3$hivpositive.f < factor(dhsanalysis3$hivpositive) > >> dhsanalysis3$protestant.f < factor(dhsanalysis3$protestant2) > >> dhsanalysis3$Married.f < factor(dhsanalysis3$Married) > >> dhsanalysis3$female.f < factor(dhsanalysis3$female) > >> dhsanalysis3$urban1.f < factor(dhsanalysis3$urban1) > >> dhsanalysis3$river10kmdum.f < factor(dhsanalysis3$river10kmdum) > >> dhsanalysis3$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) > >> dhsanalysis3$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) > >> dhsanalysis3$Province1.f < factor(dhsanalysis3$Province1) > >> dhsanalysis3$WealthIndex.f < factor(dhsanalysis3$WealthIndex) > >> dhsanalysis3$occupation2.f < factor(dhsanalysis3$occupation2) > >> dhsanalysis3$highested.f < factor(dhsanalysis3$highested) > >> protestant.f= dhsanalysis3$protestant.f > >> hivpositive.f=dhsanalysi3$hivpositive.f > >> Married.f=dhsanalysis3$Married.f > >> female.f=dhsanalysis3$female.f > >> urban1.f=dhsanalysis3$urban1.f > >> river10kmdum.f=dhsanalysis3$river10kmdum.f > >> Explorer50kmdum.f=dhsanalysis2$Explorer50kmdum.f > >> Province1.f=dhsanalysis3$Province1.f > >> WealthIndex.f=dhsanalysis3$WealthIndex.f > >> occupation2.f=dhsanalysis3$occupation2.f > >> highested.f=dhsanalysis3$highested.f > >> Age=dhsanalysis3$Age > >> Age2=dhsanalysis3$Age2 > >> HIVKnowledge=dhsanalysis3$HIVKnowledge > >> churchkm=dhsanalysis3$churchkm > >> lnhospitalkm=dhsanalysis3$lnhospitalkm > >> lnElevationMean=dhsanalysis3$lnElevationMean > > > >> ###VARIABLES IN SURVEY DESIGN > >> mydesighdhs$hivpositive.f < factor(dhsanalysis3$hivpositive) > >> mydesighdhs$protestant.f < factor(dhsanalysis3$protestant2) > >> mydesighdhs$Married.f < factor(dhsanalysis3$Married) > >> mydesighdhs$female.f < factor(dhsanalysis3$female) > >> mydesighdhs$urban1.f < factor(dhsanalysis3$urban1) > >> mydesighdhs$river10kmdum.f < factor(dhsanalysis3$river10kmdum) > >> mydesighdhs$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) > >> mydesighdhs$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) > >> mydesighdhs$Province1.f < factor(dhsanalysis3$Province1) > >> mydesighdhs$WealthIndex.f < factor(dhsanalysis3$WealthIndex) > >> mydesighdhs$occupation2.f < factor(dhsanalysis3$occupation2) > >> mydesighdhs$highested.f < factor(dhsanalysis3$highested) > >> mydesighdhs$Age=Age > >> mydesighdhs$Age2=Age2 > >> mydesighdhs$HIVKnowledge=HIVKnowledge > >> mydesighdhs$churchkm=churchkm > >> mydesighdhs$lnhospitalkm=lnhospitalkm > >> mydesighdhs$lnElevationMean=lnElevationMean > > > >> svyprobitest= svyglm(hivpositive.f~ churchkm +lnhospitalkm > +protestant.f+ > > Age+ > > > Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f > > + WealthIndex.f + HIVKnowledge+ occupation2.f+ > > highested.f,design=mydesighdhs,family = quasibinomial(link = > > "probit"),data=dhsanalysis3) > > Error in .subset2(x, i, exact = exact) : subscript out of bounds > > So you've an error anyway. > > > > > ## Iran my probit model without implementing the survey design in the > model > > just to see whether Moran test is working > >> svyprobitest2= glm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+ > > Age+ > > > Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f > > + WealthIndex.f + HIVKnowledge+ occupation2.f+ highested.f,family = > > quasibinomial(link = "probit"),data=dhsanalysis3) > > #Moran test > >> moran.test(residuals.glm(svyprobitest),knear2weight) > > Error in moran.test(residuals.glm(svyprobitest), knear2weight) : > > objects of different length > > > > Testing residuals using moran.test() is usually wrong, as the expectation > and variance of the statistic are based on a null model (intercept only), > not a model with covariates. > > What are length(residuals.glm(svyprobitest)) and length(knear2.nb)? Did > glm drop observations with missing values? If so, you should subset both > the data submitted to glm() and the neighbour object so that they match. > > Hope this helps, > > Roger > > > > > Kind Regards, > > > > Michael Chanda Chiseni > > > > Phd Candidate > > > > Department of Economic History > > > > Lund University > > > > Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund > > > > > > > > *Africa is not poor, it is poorly managed (Ellen JohnsonSirleaf ). * > > > > [[alternative HTML version deleted]] > > > > _______________________________________________ > > RsigGeo mailing list > > [hidden email] > > https://stat.ethz.ch/mailman/listinfo/rsiggeo > > > >  > Roger Bivand > Department of Economics, Norwegian School of Economics, > Helleveien 30, N5045 Bergen, Norway. > voice: +47 55 95 93 55; email: [hidden email] > https://orcid.org/0000000323926140 > https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en [[alternative HTML version deleted]] _______________________________________________ RsigGeo mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/rsiggeo 
Administrator

On Wed, 12 Aug 2020, Chanda Chiseni wrote:
> Hi Roger > > Thank you for your response. > > I am using DHS individual level data , they cluster individuals into about > 2530 people and assign them coordinates depending on where they were > surveyed. So about 30 individuals have similar coordinates. Yes I do have > more than one observation at one point as you have stated, this could be > the source of my problem indeed, My question is how could i go about > creating nearest neighbors and weights by clustering individuals that share > the same point? I sought for a solution online but could not find any. ratio. See for example the HSAR package or other multilevel approaches, review in https://doi.org/10.1016/j.spasta.2017.01.002 and in references in HSAR. You have a lower layer of observations in an upper layer of known survey locations, so a multilevel approach respects that structure, I think. Probably an upper level MRF in mgcv::gam() or try the hgml package, or INLA/BayesX for Bayesian approaches. In these you fit a model with spatially structured random effects at the upper level (the weights probably need to be symmetric), which then show whether including this term improves the fit. Hope this helps, Roger > Sorry I am a novice at spatial econometrics, this may be obvious to some , > so instead of testing for spatial correlation in my residuals, I should in > essence test for spatial autocorrelation in my outcome variable, for > instance in my analysis I am using a dummy variable of whether an > individual is hiv positive or negative. And is there a specific way of > using a moran.test() on a dummy variable? > > > Kind Regards, > > Michael Chanda Chiseni > > Phd Candidate > > Department of Economic History > > Lund University > > Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund > > > > *Africa is not poor, it is poorly managed (Ellen JohnsonSirleaf ). * > > > > > > > On Wed, Aug 12, 2020 at 12:55 PM Roger Bivand <[hidden email]> wrote: > >> On Tue, 11 Aug 2020, Chanda Chiseni wrote: >> >>> I am trying to perform moran's I test on residuals from by probit model >>> using knearest neighbour weights, however i run into an error which i >>> cant seem to find the solution anywhere online the error is >>> >>>> moran.test(residuals.glm(svyprobitest),knear2weight) >>> Error in moran.test(residuals.glm(svyprobitest), knear2weight) : >>> objects of different length >>> >>> >>> The codes I use to come up to this are below >>> >>>> library(foreign) >>>> dhsanalysis2= read.dta("DHSSpatialreg11.dta") >>> ##removing NAs in PSU >>>> dhsanalysis3=subset(dhsanalysis2, !is.na(psu)) >>> ##generating a survey design >>>> mydesighdhs= svydesign(ids =~psu, data =dhsanalysis3, weight = ~wgt, >>> strata = ~v023,nest=TRUE) >>> ##Defining coordinate colums >>>> coordinates(dhsanalysis3)= c("longnum","latnum") >>> #Defining Projection >>>> proj4string(dhsanalysis3) < CRS("+init=epsg:4326") >>> ##binding longiude and latitude >>>> lon2< dhsanalysis3$longnum >>>> lat2< dhsanalysis3$latnum >>>> coords< cbind(lon2,lat2) >>> ##Creating spatial weights based on the nearest neighbour >>>> knear2= knearneigh(coords,k=2,longlat=T) >>> Warning message: >>> In knearneigh(coords, k = 2, longlat = T) : >>> knearneigh: identical points found ### I get a warning message on >>> identical points found, is this a problem and how would i deal with this >> >> It usually indicates muddled thinking. You have more than one observation >> at the same point, which could be an unintended duplicate (copy of the >> same observation), or it could indicate that any spatial process has >> multiple values at that point. In geostatistics, one would jitter the >> points to introduce distance. In your case, this probably isn't households >> living at one address point. All the observations at the same point will >> get the same sets of neighbours. If there are many colocated >> observations, k in knearest may be too small to include them all. >> >>>> knear2.nb= knn2nb(knear2) >>>> knear2weight= nb2listw(knear2.nb, style="W",zero.policy=T) >>> >>> ##declaring categorical variables as factor variables >>>> dhsanalysis3$hivpositive.f < factor(dhsanalysis3$hivpositive) >>>> dhsanalysis3$protestant.f < factor(dhsanalysis3$protestant2) >>>> dhsanalysis3$Married.f < factor(dhsanalysis3$Married) >>>> dhsanalysis3$female.f < factor(dhsanalysis3$female) >>>> dhsanalysis3$urban1.f < factor(dhsanalysis3$urban1) >>>> dhsanalysis3$river10kmdum.f < factor(dhsanalysis3$river10kmdum) >>>> dhsanalysis3$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) >>>> dhsanalysis3$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) >>>> dhsanalysis3$Province1.f < factor(dhsanalysis3$Province1) >>>> dhsanalysis3$WealthIndex.f < factor(dhsanalysis3$WealthIndex) >>>> dhsanalysis3$occupation2.f < factor(dhsanalysis3$occupation2) >>>> dhsanalysis3$highested.f < factor(dhsanalysis3$highested) >>>> protestant.f= dhsanalysis3$protestant.f >>>> hivpositive.f=dhsanalysi3$hivpositive.f >>>> Married.f=dhsanalysis3$Married.f >>>> female.f=dhsanalysis3$female.f >>>> urban1.f=dhsanalysis3$urban1.f >>>> river10kmdum.f=dhsanalysis3$river10kmdum.f >>>> Explorer50kmdum.f=dhsanalysis2$Explorer50kmdum.f >>>> Province1.f=dhsanalysis3$Province1.f >>>> WealthIndex.f=dhsanalysis3$WealthIndex.f >>>> occupation2.f=dhsanalysis3$occupation2.f >>>> highested.f=dhsanalysis3$highested.f >>>> Age=dhsanalysis3$Age >>>> Age2=dhsanalysis3$Age2 >>>> HIVKnowledge=dhsanalysis3$HIVKnowledge >>>> churchkm=dhsanalysis3$churchkm >>>> lnhospitalkm=dhsanalysis3$lnhospitalkm >>>> lnElevationMean=dhsanalysis3$lnElevationMean >>> >>>> ###VARIABLES IN SURVEY DESIGN >>>> mydesighdhs$hivpositive.f < factor(dhsanalysis3$hivpositive) >>>> mydesighdhs$protestant.f < factor(dhsanalysis3$protestant2) >>>> mydesighdhs$Married.f < factor(dhsanalysis3$Married) >>>> mydesighdhs$female.f < factor(dhsanalysis3$female) >>>> mydesighdhs$urban1.f < factor(dhsanalysis3$urban1) >>>> mydesighdhs$river10kmdum.f < factor(dhsanalysis3$river10kmdum) >>>> mydesighdhs$Explorer50kmdum.f < factor(dhsanalysis3$Explorer50kmdum) >>>> mydesighdhs$Rail50kmdum.f < factor(dhsanalysis3$Rail50kmdum) >>>> mydesighdhs$Province1.f < factor(dhsanalysis3$Province1) >>>> mydesighdhs$WealthIndex.f < factor(dhsanalysis3$WealthIndex) >>>> mydesighdhs$occupation2.f < factor(dhsanalysis3$occupation2) >>>> mydesighdhs$highested.f < factor(dhsanalysis3$highested) >>>> mydesighdhs$Age=Age >>>> mydesighdhs$Age2=Age2 >>>> mydesighdhs$HIVKnowledge=HIVKnowledge >>>> mydesighdhs$churchkm=churchkm >>>> mydesighdhs$lnhospitalkm=lnhospitalkm >>>> mydesighdhs$lnElevationMean=lnElevationMean >>> >>>> svyprobitest= svyglm(hivpositive.f~ churchkm +lnhospitalkm >> +protestant.f+ >>> Age+ >>> >> Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f >>> + WealthIndex.f + HIVKnowledge+ occupation2.f+ >>> highested.f,design=mydesighdhs,family = quasibinomial(link = >>> "probit"),data=dhsanalysis3) >>> Error in .subset2(x, i, exact = exact) : subscript out of bounds >> >> So you've an error anyway. >> >>> >>> ## Iran my probit model without implementing the survey design in the >> model >>> just to see whether Moran test is working >>>> svyprobitest2= glm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+ >>> Age+ >>> >> Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f >>> + WealthIndex.f + HIVKnowledge+ occupation2.f+ highested.f,family = >>> quasibinomial(link = "probit"),data=dhsanalysis3) >>> #Moran test >>>> moran.test(residuals.glm(svyprobitest),knear2weight) >>> Error in moran.test(residuals.glm(svyprobitest), knear2weight) : >>> objects of different length >>> >> >> Testing residuals using moran.test() is usually wrong, as the expectation >> and variance of the statistic are based on a null model (intercept only), >> not a model with covariates. >> >> What are length(residuals.glm(svyprobitest)) and length(knear2.nb)? Did >> glm drop observations with missing values? If so, you should subset both >> the data submitted to glm() and the neighbour object so that they match. >> >> Hope this helps, >> >> Roger >> >>> >>> Kind Regards, >>> >>> Michael Chanda Chiseni >>> >>> Phd Candidate >>> >>> Department of Economic History >>> >>> Lund University >>> >>> Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund >>> >>> >>> >>> *Africa is not poor, it is poorly managed (Ellen JohnsonSirleaf ). * >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> RsigGeo mailing list >>> [hidden email] >>> https://stat.ethz.ch/mailman/listinfo/rsiggeo >>> >> >>  >> Roger Bivand >> Department of Economics, Norwegian School of Economics, >> Helleveien 30, N5045 Bergen, Norway. >> voice: +47 55 95 93 55; email: [hidden email] >> https://orcid.org/0000000323926140 >> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en > > [[alternative HTML version deleted]] > > _______________________________________________ > RsigGeo mailing list > [hidden email] > https://stat.ethz.ch/mailman/listinfo/rsiggeo > Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N5045 Bergen, Norway. voice: +47 55 95 93 55; email: [hidden email] https://orcid.org/0000000323926140 https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en _______________________________________________ RsigGeo mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/rsiggeo
Roger Bivand
Department of Economics Norwegian School of Economics Helleveien 30 N5045 Bergen, Norway 
Free forum by Nabble  Edit this page 