Sf find number of parts in multigeometry

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Sf find number of parts in multigeometry

Martin Tomko
I am looking for an equivalent to Postgis ST_NumGeometries https://postgis.net/docs/ST_NumGeometries.html
I have multipolygons in an sf df, where most of them are likely to be single part. I want to identify those that are not single part, and possibly retain only the largest polygon part, and cast all into Polygon.
Any advice is appreciated, thanks!

Martin


        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: Sf find number of parts in multigeometry

obrl soil
Hi Martin,

for a multipolygon geometry column, you're working with a list of
lists of lists of matrices, so you can iterate over the geometry
column like

    sapply(st_geometry(x), length)

(or purrr::map_int(), perhaps) to report number of polygon components.
To subset the first polygon component and drop the rest,

    st_geometry(x) <- st_sfc(lapply(st_geometry(x), function(y)
st_polygon(y[[1]])), crs = 4326) # or whatever crs you're using

I would first run st_buffer(x, dist = 0L) and/or
lwgeom::st_make_valid() to ensure the geometries are properly
structured, although this can't solve every problem. Note that AFAIK
there's no inherent sorting to multipolygon components, although you
could probably reorder them by area or number of vertices before
subsetting if you were really keen.

cheers
L
On Tue, Oct 23, 2018 at 12:32 PM Martin Tomko <[hidden email]> wrote:

>
> I am looking for an equivalent to Postgis ST_NumGeometries https://postgis.net/docs/ST_NumGeometries.html
> I have multipolygons in an sf df, where most of them are likely to be single part. I want to identify those that are not single part, and possibly retain only the largest polygon part, and cast all into Polygon.
> Any advice is appreciated, thanks!
>
> Martin
>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: Sf find number of parts in multigeometry

Tim Appelhans
As an addition to obrl soil's answer, I have come to like

lengths(st_geometry(x)) - mind the s!

Best
Tim


On 10/23/2018 06:39 AM, obrl soil wrote:

> Hi Martin,
>
> for a multipolygon geometry column, you're working with a list of
> lists of lists of matrices, so you can iterate over the geometry
> column like
>
>      sapply(st_geometry(x), length)
>
> (or purrr::map_int(), perhaps) to report number of polygon components.
> To subset the first polygon component and drop the rest,
>
>      st_geometry(x) <- st_sfc(lapply(st_geometry(x), function(y)
> st_polygon(y[[1]])), crs = 4326) # or whatever crs you're using
>
> I would first run st_buffer(x, dist = 0L) and/or
> lwgeom::st_make_valid() to ensure the geometries are properly
> structured, although this can't solve every problem. Note that AFAIK
> there's no inherent sorting to multipolygon components, although you
> could probably reorder them by area or number of vertices before
> subsetting if you were really keen.
>
> cheers
> L
> On Tue, Oct 23, 2018 at 12:32 PM Martin Tomko <[hidden email]> wrote:
>> I am looking for an equivalent to Postgis ST_NumGeometries https://postgis.net/docs/ST_NumGeometries.html
>> I have multipolygons in an sf df, where most of them are likely to be single part. I want to identify those that are not single part, and possibly retain only the largest polygon part, and cast all into Polygon.
>> Any advice is appreciated, thanks!
>>
>> Martin
>>
>>
>>          [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> [hidden email]
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: Sf find number of parts in multigeometry

Michael Sumner-2
In reply to this post by Martin Tomko
I see perfectly good answers to this, but can't resist sharing my own
approach.

 I use  gibble::gibble for a summary of parts. See how the multi-part
object is number 4, and has 3 subobjects (subobject will repeat within
object for holes).

library(sf)
x <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
gibble::gibble(x)
#    A tibble: 108 x 5
#     nrow  ncol type         subobject object
#    <int> <int> <chr>            <int>  <int>
#1    27     2 MULTIPOLYGON         1      1
#2    26     2 MULTIPOLYGON         1      2
#3    28     2 MULTIPOLYGON         1      3
#4    26     2 MULTIPOLYGON         1      4
#5     7     2 MULTIPOLYGON         2      4
#6     5     2 MULTIPOLYGON         3      4
#7    34     2 MULTIPOLYGON         1      5
#...

(I use that for mapping out set-based operations on geometry data, it
doesn't make a huge amount of sense on its own. I suppose a rapply scheme
could be constructed to pluck out things, but you'd also want path extents
and sizes and so forth for greater control).

But, if the biggest area of each multipolygon is what you want, give each a
unique ID,  use st_cast to POLYGON, group by parent and slice out the
largest.

library(dplyr)

x %>% mutate(ID = row_number()) %>% st_cast("POLYGON") %>%
group_by(ID) %>% arrange(desc(st_area(.))) %>% slice(1) %>% ungroup()

Note that holes within a part might reduce the area logic, but so will
details of the map projection in use and so on. It's helpful to learn the
structure of the underlying geometry lists to craft your own rogue
solutions.

HTH


On Tue, Oct 23, 2018, 13:32 Martin Tomko <[hidden email]> wrote:

> I am looking for an equivalent to Postgis ST_NumGeometries
> https://postgis.net/docs/ST_NumGeometries.html
> I have multipolygons in an sf df, where most of them are likely to be
> single part. I want to identify those that are not single part, and
> possibly retain only the largest polygon part, and cast all into Polygon.
> Any advice is appreciated, thanks!
>
> Martin
>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: Sf find number of parts in multigeometry

Martin Tomko
Dear all,
Thank you for the hints – I will test and report any surprises ☺
Martin

From: Michael Sumner <[hidden email]>
Date: Tuesday, 23 October 2018 at 5:59 pm
To: Martin Tomko <[hidden email]>
Cc: "[hidden email]" <[hidden email]>
Subject: Re: [R-sig-Geo] Sf find number of parts in multigeometry

I see perfectly good answers to this, but can't resist sharing my own approach.

 I use  gibble::gibble for a summary of parts. See how the multi-part object is number 4, and has 3 subobjects (subobject will repeat within object for holes).

library(sf)
x <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
gibble::gibble(x)
#    A tibble: 108 x 5
#     nrow  ncol type         subobject object
#    <int> <int> <chr>            <int>  <int>
#1    27     2 MULTIPOLYGON         1      1
#2    26     2 MULTIPOLYGON         1      2
#3    28     2 MULTIPOLYGON         1      3
#4    26     2 MULTIPOLYGON         1      4
#5     7     2 MULTIPOLYGON         2      4
#6     5     2 MULTIPOLYGON         3      4
#7    34     2 MULTIPOLYGON         1      5
#...

(I use that for mapping out set-based operations on geometry data, it doesn't make a huge amount of sense on its own. I suppose a rapply scheme could be constructed to pluck out things, but you'd also want path extents and sizes and so forth for greater control).

But, if the biggest area of each multipolygon is what you want, give each a unique ID,  use st_cast to POLYGON, group by parent and slice out the largest.

library(dplyr)

x %>% mutate(ID = row_number()) %>% st_cast("POLYGON") %>%
group_by(ID) %>% arrange(desc(st_area(.))) %>% slice(1) %>% ungroup()

Note that holes within a part might reduce the area logic, but so will details of the map projection in use and so on. It's helpful to learn the structure of the underlying geometry lists to craft your own rogue solutions.

HTH

On Tue, Oct 23, 2018, 13:32 Martin Tomko <[hidden email]<mailto:[hidden email]>> wrote:
I am looking for an equivalent to Postgis ST_NumGeometries https://postgis.net/docs/ST_NumGeometries.html<https://postgis.net/docs/ST_NumGeometries.html>
I have multipolygons in an sf df, where most of them are likely to be single part. I want to identify those that are not single part, and possibly retain only the largest polygon part, and cast all into Polygon.
Any advice is appreciated, thanks!

Martin


        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]<mailto:[hidden email]>
https://stat.ethz.ch/mailman/listinfo/r-sig-geo<https://stat.ethz.ch/mailman/listinfo/r-sig-geo>
--
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo