Imputation of aggregate data for spatial analysis

AMITHA PURANIK
Hello everyone!

I have data on the proportion of screening of cervical cancer at
district-level. I have about 530 data points out of the 545 districts. I
want to get a complete picture of the data in order to obtain the spatial
cluster map using LISA (local indicators of spatial association).

Would it be alright if I impute the missing values of the proportion for
the 15 districts using multiple imputation?

Or is there a better way for imputation in this case?

Any suggestion/ comment is highly appreciated!

Regards,
Amitha

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Re: Imputation of aggregate data for spatial analysis

Roger Bivand
On Mon, 19 Oct 2020, Amitha Puranik wrote:

> Hello everyone!
>
> I have data on the proportion of screening of cervical cancer at
> district-level. I have about 530 data points out of the 545 districts. I
> want to get a complete picture of the data in order to obtain the spatial
> cluster map using LISA (local indicators of spatial association).
>
> Would it be alright if I impute the missing values of the proportion for
> the 15 districts using multiple imputation?

Please do not even consider imputation of this kind of data, especially if
LISA is your tool. It makes no sense at all.

If, however, you model the screening proportions at district level, and
have a model that fits well using multiple well-observed covariates, both
INLA and other model fitting approaches can predict for unobserved
responses. In such cases, though, the model will already have used the
spatial autocorrelation (for example in a BYM model, see
http://www.paulamoraga.com/book-geospatial-info/), so putting the
posterior distributions at unobserved locations into a LISA makes no
sense, because the imputation already uses spatial autocorrelation.
If you have relevant covariates, modelling, preferably with Poisson
regression offset by the log of the age-selected female population,
makes much more sense than LISA.
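
To make the offset idea concrete, here is a minimal sketch in plain Python
of a Poisson regression with a log-population offset, fitted by
Newton-Raphson, and then used to predict the screening rate for a district
with no observed count. Everything specific in it is an assumption made
for illustration: the toy figures are invented, the single covariate
(female literacy) is hypothetical, and the function names are not from any
package. It is deliberately non-spatial, so it does not include the
structured random effect that a BYM model fitted with INLA would add.

```python
import math

def fit_poisson_offset(y, x, pop, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = log(pop) + b0 + b1 * x by Newton-Raphson.

    y   : observed counts (women screened) per district
    x   : one covariate per district (hypothetical)
    pop : age-selected female population per district (the offset)
    """
    # Start from the pooled rate and no covariate effect.
    b0 = math.log(sum(y) / sum(pop))
    b1 = 0.0
    for _ in range(n_iter):
        # Expected counts under the current coefficients.
        mu = [p * math.exp(b0 + b1 * xi) for p, xi in zip(pop, x)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information matrix (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system information * d = score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def predict_rate(b0, b1, x):
    """Predicted screening rate per woman for covariate value x."""
    return math.exp(b0 + b1 * x)

# Invented toy data: (population, literacy, screened or None if missing).
districts = [
    (12000, 0.55, 2400),
    (8000, 0.70, 2100),
    (15000, 0.40, 2300),
    (9500, 0.65, 2300),
    (11000, 0.50, 2000),
    (7000, 0.80, 2200),
    (13000, 0.60, None),  # district with no observed response
]

observed = [d for d in districts if d[2] is not None]
b0, b1 = fit_poisson_offset(
    y=[d[2] for d in observed],
    x=[d[1] for d in observed],
    pop=[d[0] for d in observed],
)

for pop, literacy, screened in districts:
    if screened is None:
        rate = predict_rate(b0, b1, literacy)
        print(f"predicted rate {rate:.3f}, expected count {pop * rate:.0f}")
```

The offset enters with a fixed coefficient of one, so the coefficients
describe the rate per woman rather than the raw count. The prediction for
the unobserved district comes from the covariate alone here; in a spatial
model it would also borrow strength from neighbouring districts, which is
why feeding such predictions into a LISA would be circular.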

Hope this helps,

Roger

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en


Re: Imputation of aggregate data for spatial analysis

AMITHA PURANIK
Dear Prof Roger,

Thanks a lot for that suggestion!

Kind regards,
Amitha.