rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Phil Haines
Dear list,

I have a single Polygons object containing multiple Polygon objects
that share a common border. When I output this using writeOGR() one of
the Polygon objects becomes a hole, as the following example shows.

Create a Polygons object containing two adjoining Polygon objects
> library(rgdal)
> r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
> r2 <- r1; r2[,1] <- r2[,1]+1
> Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
> SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)), data.frame(Example=c("Minimal")))

Perform a write/readOGR() cycle
> fn <- tempfile()
> writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
> SPDF2 <- readOGR(dsn=fn,layer='test')

Second Polygon object is now a hole
> sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
[1] FALSE  TRUE

I see from the sp documentation that "Polygon objects belonging to a
Polygons object should either not overlap one-other, or should be
fully included" but I am not sure how this relates to bordering
Polygon objects. I would welcome any advice as to whether what I am
asking of writeOGR is reasonable?

Thanks in advance for your time,
Phil

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
                         LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rgdal_1.4-4 sp_1.3-1

loaded via a namespace (and not attached):
[1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
lattice_0.20-35

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand
Administrator
On Tue, 13 Aug 2019, Phil Haines wrote:

> Dear list,
>
> I have a single Polygons object containing multiple Polygon objects
> that share a common border. When I output this using writeOGR() one of
> the Polygon objects becomes a hole, as the following example shows.
>
> Create a Polygons object containing two adjoining Polygon objects
>> library(rgdal)
>> r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>> r2 <- r1; r2[,1] <- r2[,1]+1
>> Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>> SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)), data.frame(Example=c("Minimal")))
>
> Perform a write/readOGR() cycle
>> fn <- tempfile()
>> writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>> SPDF2 <- readOGR(dsn=fn,layer='test')
>
> Second Polygon object is now a hole
>> sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
> [1] FALSE  TRUE
>
> I see from the sp documentation that "Polygon objects belonging to a
> Polygons object should either not overlap one-other, or should be
> fully included" but I am not sure how this relates to bordering
> Polygon objects. I would welcome any advice as to whether what I am
> asking of writeOGR is reasonable?

The problem is with the 'ESRI Shapefile' representation and driver:

library(rgdal)
r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
r2 <- r1; r2[,1] <- r2[,1]+1
Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
  data.frame(Example=c("Minimal")))
sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")

# which constructs a MULTIPOLYGON object:

rgeos::writeWKT(SPDF)
library(sf)
st_as_text(st_geometry(st_as_sf(SPDF)))

# The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
# pre-dates it), so the failure occurs by seeing the second exterior ring
# as an interior ring

fn <- tempfile()
writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
SPDF2 <- readOGR(dsn=fn, layer='test')
sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
rgeos::writeWKT(SPDF2)
st_as_text(st_geometry(st_as_sf(SPDF2)))

# This happens with sf too, using the same GDAL driver:

sf2 <- st_read(dsn=fn, layer='test')
st_geometry(sf2)
library(sf)
st_as_text(st_geometry(st_as_sf(sf2)))
rgeos::writeWKT(as(sf2, "Spatial"))

# Adding the comment fix doesn't help:

comment(slot(SPDF, "polygons")[[1]])
SPDF_c <- rgeos::createSPComment(SPDF)
comment(slot(SPDF_c, "polygons")[[1]])
writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
   verbose=TRUE)

# reports
# Object initially classed as: wkbPolygon
# SFS comments in Polygons objects
# Object reclassed as: wkbMultiPolygon

SPDF2_c <- readOGR(dsn=fn, layer='test_c')
sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "ringDir")
rgeos::writeWKT(SPDF2_c)
st_as_text(st_geometry(st_as_sf(SPDF2_c)))



# If the input object is written out with the GeoPackage driver:

fn1 <- tempfile(fileext=".gpkg")
writeOGR(SPDF, fn1, layer="test", driver='GPKG')
sf2a <- st_read(dsn=fn1,layer='test')
st_coordinates(st_geometry(sf2a))
SPDF2a <- readOGR(dsn=fn1)
sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
rgeos::writeWKT(SPDF2a)
st_as_text(st_geometry(st_as_sf(SPDF2a)))

# the issue is resolved. If we separate the exterior rings:

r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
   data.frame(Example=c("Minimal")))
fna <- tempfile()
writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
SPDF2_a <- readOGR(dsn=fna, layer='test')
sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "ringDir")
rgeos::writeWKT(SPDF2_a)
st_as_text(st_geometry(st_as_sf(SPDF2_a)))

# we are OK as the two exterior rings do not touch.

# does using sf make a difference?

fn_s <- tempfile(fileext=".shp")
st_write(st_as_sf(SPDF), dsn=fn_s)
sf_in <- st_read(fn_s)
st_as_text(st_geometry(st_as_sf(sf_in)))

# No

fn_s <- tempfile(fileext=".shp")
st_write(st_as_sf(SPDF_c), dsn=fn_s)
sf_in_c <- st_read(fn_s)
st_as_text(st_geometry(st_as_sf(sf_in_c)))

# nor with the pretend-SF-compliant comment set either.

So the weakness is in the "ESRI Shapefile" write driver, or possibly in
the OGRGeometryFactory::organizePolygons() function in GDAL used in
OGR_write() (a C++ function) called by writeOGR(). If sf::st_write() also
calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that it
has a weakness for the "ESRI Shapefile" driver, but which does not affect
SF-compliant drivers.

This is probably related to a similar but inverse problem with the
SF-compliant GeoJSON driver in 2015:

https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html

continued the next month in:

https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html

The details are in this SVN diff

https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571

up to the end og the list thread, and this one from then until now:

https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733

Summary: could you change drivers, or is it really necessary to fix an EOL
problem? What is your use case?

>
> Thanks in advance for your time,

Thanks for a complete example,

Roger

> Phil
>
>> sessionInfo()
> R version 3.5.1 (2018-07-02)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows >= 8 x64 (build 9200)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
> Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>                         LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] rgdal_1.4-4 sp_1.3-1
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
> lattice_0.20-35
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand
Administrator
On Fri, 16 Aug 2019, Roger Bivand wrote:

> On Tue, 13 Aug 2019, Phil Haines wrote:
>
>>  Dear list,
>>
>>  I have a single Polygons object containing multiple Polygon objects
>>  that share a common border. When I output this using writeOGR() one of
>>  the Polygon objects becomes a hole, as the following example shows.
>>
>>  Create a Polygons object containing two adjoining Polygon objects
>>>  library(rgdal)
>>>  r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>  r2 <- r1; r2[,1] <- r2[,1]+1
>>>  Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>  SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>  data.frame(Example=c("Minimal")))
>>
>>  Perform a write/readOGR() cycle
>>>  fn <- tempfile()
>>>  writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>  SPDF2 <- readOGR(dsn=fn,layer='test')
>>
>>  Second Polygon object is now a hole
>>>  sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
>>  [1] FALSE  TRUE
>>
>>  I see from the sp documentation that "Polygon objects belonging to a
>>  Polygons object should either not overlap one-other, or should be
>>  fully included" but I am not sure how this relates to bordering
>>  Polygon objects. I would welcome any advice as to whether what I am
>>  asking of writeOGR is reasonable?
>
> The problem is with the 'ESRI Shapefile' representation and driver:
>
> library(rgdal)
> r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
> r2 <- r1; r2[,1] <- r2[,1]+1
> Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
> SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
> data.frame(Example=c("Minimal")))
> sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
> sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>
> # which constructs a MULTIPOLYGON object:
>
> rgeos::writeWKT(SPDF)
> library(sf)
> st_as_text(st_geometry(st_as_sf(SPDF)))
>
> #  The 'ESRI Shapefile' driver is not Simple-Feature compliant (it pre-dates
> #  it), so the failure occurs by seeing the second exterior ring
> #  as an interior ring
>
> fn <- tempfile()
> writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
> SPDF2 <- readOGR(dsn=fn, layer='test')
> sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
> sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
> rgeos::writeWKT(SPDF2)
> st_as_text(st_geometry(st_as_sf(SPDF2)))
>
> # This happens with sf too, using the same GDAL driver:
>
> sf2 <- st_read(dsn=fn, layer='test')
> st_geometry(sf2)
> library(sf)
> st_as_text(st_geometry(st_as_sf(sf2)))
> rgeos::writeWKT(as(sf2, "Spatial"))
>
> # Adding the comment fix doesn't help:
>
> comment(slot(SPDF, "polygons")[[1]])
> SPDF_c <- rgeos::createSPComment(SPDF)
> comment(slot(SPDF_c, "polygons")[[1]])
> writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>   verbose=TRUE)
>
> #  reports
> #  Object initially classed as: wkbPolygon
> #  SFS comments in Polygons objects
> #  Object reclassed as: wkbMultiPolygon
>
> SPDF2_c <- readOGR(dsn=fn, layer='test_c')
> sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
> sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "ringDir")
> rgeos::writeWKT(SPDF2_c)
> st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>
>
>
> # If the input object is written out with the GeoPackage driver:
>
> fn1 <- tempfile(fileext=".gpkg")
> writeOGR(SPDF, fn1, layer="test", driver='GPKG')
> sf2a <- st_read(dsn=fn1,layer='test')
> st_coordinates(st_geometry(sf2a))
> SPDF2a <- readOGR(dsn=fn1)
> sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
> sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
> rgeos::writeWKT(SPDF2a)
> st_as_text(st_geometry(st_as_sf(SPDF2a)))
>
> # the issue is resolved. If we separate the exterior rings:
>
> r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
> Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
> SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>  data.frame(Example=c("Minimal")))
> fna <- tempfile()
> writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
> SPDF2_a <- readOGR(dsn=fna, layer='test')
> sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
> sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "ringDir")
> rgeos::writeWKT(SPDF2_a)
> st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>
> # we are OK as the two exterior rings do not touch.
>
> # does using sf make a difference?
>
> fn_s <- tempfile(fileext=".shp")
> st_write(st_as_sf(SPDF), dsn=fn_s)
> sf_in <- st_read(fn_s)
> st_as_text(st_geometry(st_as_sf(sf_in)))
>
> # No
>
> fn_s <- tempfile(fileext=".shp")
> st_write(st_as_sf(SPDF_c), dsn=fn_s)
> sf_in_c <- st_read(fn_s)
> st_as_text(st_geometry(st_as_sf(sf_in_c)))
>
> # nor with the pretend-SF-compliant comment set either.
>
> So the weakness is in the "ESRI Shapefile" write driver, or possibly in the
> OGRGeometryFactory::organizePolygons() function in GDAL used in OGR_write()
> (a C++ function) called by writeOGR(). If sf::st_write() also calls
> OGRGeometryFactory::organizePolygons(), we'd maybe consider that it has a
> weakness for the "ESRI Shapefile" driver, but which does not affect
> SF-compliant drivers.

Without the comment set, OGRGeometryFactory::organizePolygons() is used;
with it set, OGRGeometryFactory::organizePolygons() is not used, because
the object is declared as two exterior rings. In both cases, we have the
output object written out and read back in incorrectly with the ESRI
shapefile driver, but SF-compliant drivers round-trip (in the test
GeoJSON), correctly.

It is likely that the changes made in 2015 to accommodate GeoJSON led to
this possible regression for the ESRI Shapefile driver. I'm adding this
geometry to tests/tripup.R and data files; without code changes the hole
slot is wrong and the ring direction is changes to match, so the
coordinates change too.

Reading using the deprecated maptools::readShapeSpatial() also gets a hole
in the second external ring. However, writing with deprecated
maptools::writeSpatialShape() yields a shapefile that when read with
maptools::readShapeSpatial() gives the correct two exterior ring, no hole
object. When read with sf::st_read() and rgdal::readOGR(), the object is
also correct. So the problem definitely lies in rgdal::writeOGR(), and
sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.

Roger

>
> This is probably related to a similar but inverse problem with the
> SF-compliant GeoJSON driver in 2015:
>
> https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>
> continued the next month in:
>
> https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>
> The details are in this SVN diff
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>
> up to the end og the list thread, and this one from then until now:
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>
> Summary: could you change drivers, or is it really necessary to fix an EOL
> problem? What is your use case?
>
>>
>>  Thanks in advance for your time,
>
> Thanks for a complete example,
>
> Roger
>
>>  Phil
>>
>>>  sessionInfo()
>>  R version 3.5.1 (2018-07-02)
>>  Platform: x86_64-w64-mingw32/x64 (64-bit)
>>  Running under: Windows >= 8 x64 (build 9200)
>>
>>  Matrix products: default
>>
>>  locale:
>>  [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>  Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>                          LC_TIME=English_United Kingdom.1252
>>
>>  attached base packages:
>>  [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>>  other attached packages:
>>  [1] rgdal_1.4-4 sp_1.3-1
>>
>>  loaded via a namespace (and not attached):
>>  [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>  lattice_0.20-35
>>
>>  _______________________________________________
>>  R-sig-Geo mailing list
>>  [hidden email]
>>  https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand
Administrator
Please follow up both here and on:

https://github.com/r-spatial/sf/issues/1130

as the problem is also seen in the sf package using the same GDAL ESRI
Shapefile driver.

Roger

On Fri, 16 Aug 2019, Roger Bivand wrote:

> On Fri, 16 Aug 2019, Roger Bivand wrote:
>
>>  On Tue, 13 Aug 2019, Phil Haines wrote:
>>
>>>   Dear list,
>>>
>>>   I have a single Polygons object containing multiple Polygon objects
>>>   that share a common border. When I output this using writeOGR() one of
>>>   the Polygon objects becomes a hole, as the following example shows.
>>>
>>>   Create a Polygons object containing two adjoining Polygon objects
>>>>   library(rgdal)
>>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>   r2 <- r1; r2[,1] <- r2[,1]+1
>>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>   data.frame(Example=c("Minimal")))
>>>
>>>   Perform a write/readOGR() cycle
>>>>   fn <- tempfile()
>>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>   SPDF2 <- readOGR(dsn=fn,layer='test')
>>>
>>>   Second Polygon object is now a hole
>>>>   sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
>>>   [1] FALSE  TRUE
>>>
>>>   I see from the sp documentation that "Polygon objects belonging to a
>>>   Polygons object should either not overlap one-other, or should be
>>>   fully included" but I am not sure how this relates to bordering
>>>   Polygon objects. I would welcome any advice as to whether what I am
>>>   asking of writeOGR is reasonable?
>>
>>  The problem is with the 'ESRI Shapefile' representation and driver:
>>
>>  library(rgdal)
>>  r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>  r2 <- r1; r2[,1] <- r2[,1]+1
>>  Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>  SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>  data.frame(Example=c("Minimal")))
>>  sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
>>  sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>
>>  # which constructs a MULTIPOLYGON object:
>>
>>  rgeos::writeWKT(SPDF)
>>  library(sf)
>>  st_as_text(st_geometry(st_as_sf(SPDF)))
>>
>> #   The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
>> #   pre-dates it), so the failure occurs by seeing the second exterior ring
>> #   as an interior ring
>>
>>  fn <- tempfile()
>>  writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>  SPDF2 <- readOGR(dsn=fn, layer='test')
>>  sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
>>  sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>  rgeos::writeWKT(SPDF2)
>>  st_as_text(st_geometry(st_as_sf(SPDF2)))
>>
>>  # This happens with sf too, using the same GDAL driver:
>>
>>  sf2 <- st_read(dsn=fn, layer='test')
>>  st_geometry(sf2)
>>  library(sf)
>>  st_as_text(st_geometry(st_as_sf(sf2)))
>>  rgeos::writeWKT(as(sf2, "Spatial"))
>>
>>  # Adding the comment fix doesn't help:
>>
>>  comment(slot(SPDF, "polygons")[[1]])
>>  SPDF_c <- rgeos::createSPComment(SPDF)
>>  comment(slot(SPDF_c, "polygons")[[1]])
>>  writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>>    verbose=TRUE)
>>
>> #   reports
>> #   Object initially classed as: wkbPolygon
>> #   SFS comments in Polygons objects
>> #   Object reclassed as: wkbMultiPolygon
>>
>>  SPDF2_c <- readOGR(dsn=fn, layer='test_c')
>>  sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
>>  sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>  rgeos::writeWKT(SPDF2_c)
>>  st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>>
>>
>>
>>  # If the input object is written out with the GeoPackage driver:
>>
>>  fn1 <- tempfile(fileext=".gpkg")
>>  writeOGR(SPDF, fn1, layer="test", driver='GPKG')
>>  sf2a <- st_read(dsn=fn1,layer='test')
>>  st_coordinates(st_geometry(sf2a))
>>  SPDF2a <- readOGR(dsn=fn1)
>>  sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
>>  sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>  rgeos::writeWKT(SPDF2a)
>>  st_as_text(st_geometry(st_as_sf(SPDF2a)))
>>
>>  # the issue is resolved. If we separate the exterior rings:
>>
>>  r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
>>  Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
>>  SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>>  data.frame(Example=c("Minimal")))
>>  fna <- tempfile()
>>  writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
>>  SPDF2_a <- readOGR(dsn=fna, layer='test')
>>  sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
>>  sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>  rgeos::writeWKT(SPDF2_a)
>>  st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>>
>>  # we are OK as the two exterior rings do not touch.
>>
>>  # does using sf make a difference?
>>
>>  fn_s <- tempfile(fileext=".shp")
>>  st_write(st_as_sf(SPDF), dsn=fn_s)
>>  sf_in <- st_read(fn_s)
>>  st_as_text(st_geometry(st_as_sf(sf_in)))
>>
>>  # No
>>
>>  fn_s <- tempfile(fileext=".shp")
>>  st_write(st_as_sf(SPDF_c), dsn=fn_s)
>>  sf_in_c <- st_read(fn_s)
>>  st_as_text(st_geometry(st_as_sf(sf_in_c)))
>>
>>  # nor with the pretend-SF-compliant comment set either.
>>
>>  So the weakness is in the "ESRI Shapefile" write driver, or possibly in
>>  the OGRGeometryFactory::organizePolygons() function in GDAL used in
>>  OGR_write() (a C++ function) called by writeOGR(). If sf::st_write() also
>>  calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that it
>>  has a weakness for the "ESRI Shapefile" driver, but which does not affect
>>  SF-compliant drivers.
>
> Without the comment set, OGRGeometryFactory::organizePolygons() is used; with
> it set, OGRGeometryFactory::organizePolygons() is not used, because the
> object is declared as two exterior rings. In both cases, we have the output
> object written out and read back in incorrectly with the ESRI shapefile
> driver, but SF-compliant drivers round-trip (in the test GeoJSON), correctly.
>
> It is likely that the changes made in 2015 to accommodate GeoJSON led to this
> possible regression for the ESRI Shapefile driver. I'm adding this geometry
> to tests/tripup.R and data files; without code changes the hole slot is wrong
> and the ring direction is changes to match, so the coordinates change too.
>
> Reading using the deprecated maptools::readShapeSpatial() also gets a hole in
> the second external ring. However, writing with deprecated
> maptools:: writeSpatialShape() yields a shapefile that when read with
> maptools:: readShapeSpatial() gives the correct two exterior ring, no hole
> object. When read with sf::st_read() and rgdal::readOGR(), the object is also
> correct. So the problem definitely lies in rgdal::writeOGR(), and
> sf::st_write() - roundtripping with sf::st_write() and sf::st_read() degrades
> from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
>
> Roger
>
>>
>>  This is probably related to a similar but inverse problem with the
>>  SF-compliant GeoJSON driver in 2015:
>>
>>  https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>>
>>  continued the next month in:
>>
>>  https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>>
>>  The details are in this SVN diff
>>
>>  https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>>
>>  up to the end og the list thread, and this one from then until now:
>>
>>  https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>>
>>  Summary: could you change drivers, or is it really necessary to fix an EOL
>>  problem? What is your use case?
>>
>>>
>>>   Thanks in advance for your time,
>>
>>  Thanks for a complete example,
>>
>>  Roger
>>
>>>   Phil
>>>
>>>>   sessionInfo()
>>>   R version 3.5.1 (2018-07-02)
>>>   Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>   Running under: Windows >= 8 x64 (build 9200)
>>>
>>>   Matrix products: default
>>>
>>>   locale:
>>>   [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>>   Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>>                           LC_TIME=English_United Kingdom.1252
>>>
>>>   attached base packages:
>>>   [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>>   other attached packages:
>>>   [1] rgdal_1.4-4 sp_1.3-1
>>>
>>>   loaded via a namespace (and not attached):
>>>   [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>>   lattice_0.20-35
>>>
>>>   _______________________________________________
>>>   R-sig-Geo mailing list
>>>   [hidden email]
>>>   https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>>
>
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand
Administrator
The reason that the problem occurred is that a MULTIPOLYGON with two
exterior rings becomes invalid if the exterior rings touch along an edge
(this case). It is important to know the use case, to see whether:

library(rgeos)
writeWKT(Ps1)
gIsValid(Ps1, reason=TRUE)
Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
gIsValid(Ps1a, reason=TRUE)
writeWKT(Ps1a)

or equivalently in sf for sf objects, should be applied before trying to
write the object out to file with this driver.

However, because drivers that are compliant with the simple features
standard (which bans exterior rings sharing edges) have been permissive
and do round-trip this invalid object, a relaxation in the OGR ESRI
shapefile driver has been provided and may be included in a future
release.

We need to know (see these issues):

https://github.com/r-spatial/sf/issues/1130
https://github.com/OSGeo/gdal/issues/1787

why it was desirable to write out this object using this driver? Could an
alternative driver have been used, or is ESRI shapefile the only format
used in the workflow?

If it has to be this driver, could the workflow be changed to repair
degenerate cases before writing? If using sp classes, rgeos may be used to
test for and probably repair such geometries before they reach
rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap
degenerate cases does encumber all users with valid geometries with
the time wasted on extra checking, so building checks into sf and rgdal is
not desirable.

I hope it is possible to find out more about the use case quickly, to pass
on to GDAL developers to help motivate a relaxation in their current
policy with regard to this driver, and to encourage them to include the
fix branch in a future release of GDAL.

Roger

On Sat, 17 Aug 2019, Roger Bivand wrote:

> Please follow up both here and on:
>
> https://github.com/r-spatial/sf/issues/1130
>
> as the problem is also seen in the sf package using the same GDAL ESRI
> Shapefile driver.
>
> Roger
>
> On Fri, 16 Aug 2019, Roger Bivand wrote:
>
>>  On Fri, 16 Aug 2019, Roger Bivand wrote:
>>
>>>   On Tue, 13 Aug 2019, Phil Haines wrote:
>>>
>>>>    Dear list,
>>>>
>>>>    I have a single Polygons object containing multiple Polygon objects
>>>>    that share a common border. When I output this using writeOGR() one of
>>>>    the Polygon objects becomes a hole, as the following example shows.
>>>>
>>>>    Create a Polygons object containing two adjoining Polygon objects
>>>>>    library(rgdal)
>>>>>    r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>    r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>    Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>    SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>    data.frame(Example=c("Minimal")))
>>>>
>>>>    Perform a write/readOGR() cycle
>>>>>    fn <- tempfile()
>>>>>    writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>    SPDF2 <- readOGR(dsn=fn,layer='test')
>>>>
>>>>    Second Polygon object is now a hole
>>>>>    sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
>>>>    [1] FALSE  TRUE
>>>>
>>>>    I see from the sp documentation that "Polygon objects belonging to a
>>>>    Polygons object should either not overlap one-other, or should be
>>>>    fully included" but I am not sure how this relates to bordering
>>>>    Polygon objects. I would welcome any advice as to whether what I am
>>>>    asking of writeOGR is reasonable?
>>>
>>>   The problem is with the 'ESRI Shapefile' representation and driver:
>>>
>>>   library(rgdal)
>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>   r2 <- r1; r2[,1] <- r2[,1]+1
>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>   data.frame(Example=c("Minimal")))
>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>
>>>   # which constructs a MULTIPOLYGON object:
>>>
>>>   rgeos::writeWKT(SPDF)
>>>   library(sf)
>>>   st_as_text(st_geometry(st_as_sf(SPDF)))
>>>
>>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
>>> #    pre-dates it), so the failure occurs by seeing the second exterior
>>> #    ring
>>> #    as an interior ring
>>>
>>>   fn <- tempfile()
>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>   SPDF2 <- readOGR(dsn=fn, layer='test')
>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>   rgeos::writeWKT(SPDF2)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2)))
>>>
>>>   # This happens with sf too, using the same GDAL driver:
>>>
>>>   sf2 <- st_read(dsn=fn, layer='test')
>>>   st_geometry(sf2)
>>>   library(sf)
>>>   st_as_text(st_geometry(st_as_sf(sf2)))
>>>   rgeos::writeWKT(as(sf2, "Spatial"))
>>>
>>>   # Adding the comment fix doesn't help:
>>>
>>>   comment(slot(SPDF, "polygons")[[1]])
>>>   SPDF_c <- rgeos::createSPComment(SPDF)
>>>   comment(slot(SPDF_c, "polygons")[[1]])
>>>   writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>>>     verbose=TRUE)
>>>
>>> #    reports
>>> #    Object initially classed as: wkbPolygon
>>> #    SFS comments in Polygons objects
>>> #    Object reclassed as: wkbMultiPolygon
>>>
>>>   SPDF2_c <- readOGR(dsn=fn, layer='test_c')
>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
>>>   "ringDir")
>>>   rgeos::writeWKT(SPDF2_c)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>>>
>>>
>>>
>>>   # If the input object is written out with the GeoPackage driver:
>>>
>>>   fn1 <- tempfile(fileext=".gpkg")
>>>   writeOGR(SPDF, fn1, layer="test", driver='GPKG')
>>>   sf2a <- st_read(dsn=fn1,layer='test')
>>>   st_coordinates(st_geometry(sf2a))
>>>   SPDF2a <- readOGR(dsn=fn1)
>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>   rgeos::writeWKT(SPDF2a)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2a)))
>>>
>>>   # the issue is resolved. If we separate the exterior rings:
>>>
>>>   r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
>>>   Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
>>>   SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>>>   data.frame(Example=c("Minimal")))
>>>   fna <- tempfile()
>>>   writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
>>>   SPDF2_a <- readOGR(dsn=fna, layer='test')
>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
>>>   "ringDir")
>>>   rgeos::writeWKT(SPDF2_a)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>>>
>>>   # we are OK as the two exterior rings do not touch.
>>>
>>>   # does using sf make a difference?
>>>
>>>   fn_s <- tempfile(fileext=".shp")
>>>   st_write(st_as_sf(SPDF), dsn=fn_s)
>>>   sf_in <- st_read(fn_s)
>>>   st_as_text(st_geometry(st_as_sf(sf_in)))
>>>
>>>   # No
>>>
>>>   fn_s <- tempfile(fileext=".shp")
>>>   st_write(st_as_sf(SPDF_c), dsn=fn_s)
>>>   sf_in_c <- st_read(fn_s)
>>>   st_as_text(st_geometry(st_as_sf(sf_in_c)))
>>>
>>>   # nor with the pretend-SF-compliant comment set either.
>>>
>>>   So the weakness is in the "ESRI Shapefile" write driver, or possibly in
>>>   the OGRGeometryFactory::organizePolygons() function in GDAL used in
>>>   OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
>>>   also
>>>   calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
>>>   it
>>>   has a weakness for the "ESRI Shapefile" driver, but which does not
>>>   affect
>>>   SF-compliant drivers.
>>
>>  Without the comment set, OGRGeometryFactory::organizePolygons() is used;
>>  with it set, OGRGeometryFactory::organizePolygons() is not used, because
>>  the object is declared as two exterior rings. In both cases, we have the
>>  output object written out and read back in incorrectly with the ESRI
>>  shapefile driver, but SF-compliant drivers round-trip (in the test
>>  GeoJSON), correctly.
>>
>>  It is likely that the changes made in 2015 to accommodate GeoJSON led to
>>  this possible regression for the ESRI Shapefile driver. I'm adding this
>>  geometry to tests/tripup.R and data files; without code changes the hole
>>  slot is wrong and the ring direction is changes to match, so the
>>  coordinates change too.
>>
>>  Reading using the deprecated maptools::readShapeSpatial() also gets a hole
>>  in the second external ring. However, writing with deprecated
>> maptools::  writeSpatialShape() yields a shapefile that when read with
>> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
>>  object. When read with sf::st_read() and rgdal::readOGR(), the object is
>>  also correct. So the problem definitely lies in rgdal::writeOGR(), and
>>  sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
>>  degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
>>
>>  Roger
>>
>>>
>>>   This is probably related to a similar but inverse problem with the
>>>   SF-compliant GeoJSON driver in 2015:
>>>
>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>>>
>>>   continued the next month in:
>>>
>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>>>
>>>   The details are in this SVN diff
>>>
>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>>>
>>>   up to the end og the list thread, and this one from then until now:
>>>
>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>>>
>>>   Summary: could you change drivers, or is it really necessary to fix an
>>>   EOL
>>>   problem? What is your use case?
>>>
>>>>
>>>>    Thanks in advance for your time,
>>>
>>>   Thanks for a complete example,
>>>
>>>   Roger
>>>
>>>>    Phil
>>>>
>>>>>    sessionInfo()
>>>>    R version 3.5.1 (2018-07-02)
>>>>    Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>    Running under: Windows >= 8 x64 (build 9200)
>>>>
>>>>    Matrix products: default
>>>>
>>>>    locale:
>>>>    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>>>    Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>>>                            LC_TIME=English_United Kingdom.1252
>>>>
>>>>    attached base packages:
>>>>    [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>>    other attached packages:
>>>>    [1] rgdal_1.4-4 sp_1.3-1
>>>>
>>>>    loaded via a namespace (and not attached):
>>>>    [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>>>    lattice_0.20-35
>>>>
>>>>    _______________________________________________
>>>>    R-sig-Geo mailing list
>>>>    [hidden email]
>>>>    https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>
>>>
>>
>>
>
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Phil Haines
Hi Roger,

I originally encountered this issue while trying to reduce the size of
German postal code boundaries. I used rmapshaper::ms_simplify() which
introduced the polygons sharing a common edge. I definitely didn't
need them to be separate polygons, and would happily have merged them
had I been more familiar with gUnaryUnion().

I apologise if my question has resulted in unnecessary effort on your
part. I reported the behaviour because I found it surprising. However,
I now understand that this example is not a valid simple feature
geometry and that the solution is to take steps to ensure I have valid
geometries, and to repair if not.

Additionally I must confess that I only use the ESRI Shapefile driver
because I don't know any better... My use cases are reading and
writing from R (usually to produce maps) and occasionally opening in
QGIS. I can happily switch to any format that supports this workflow.

Thank you very much for your time,
Phil

On Sun, 18 Aug 2019 at 13:59, Roger Bivand <[hidden email]> wrote:

>
> The reason that the problem occurred is that a MULTIPOLYGON with two
> exterior rings becomes invalid if the exterior rings touch along an edge
> (this case). It is important to know the use case, to see whether:
>
> library(rgeos)
> writeWKT(Ps1)
> gIsValid(Ps1, reason=TRUE)
> Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
> gIsValid(Ps1a, reason=TRUE)
> writeWKT(Ps1a)
>
> or equivalently in sf for sf objects, should be applied before trying to
> write the object out to file with this driver.
>
> However, because drivers that are compliant with the simple features
> standard (which bans exterior rings sharing edges) have been permissive
> and do round-trip this invalid object, a relaxation in the OGR ESRI
> shapefile driver has been provided and may be included in a future
> release.
>
> We need to know (see these issues):
>
> https://github.com/r-spatial/sf/issues/1130
> https://github.com/OSGeo/gdal/issues/1787
>
> why it was desirable to write out this object using this driver? Could an
> alternative driver have been used, or is ESRI shapefile the only format
> used in the workflow?
>
> If it has to be this driver, could the workflow be changed to repair
> degenerate cases before writing? If using sp classes, rgeos may be used to
> test for and probably repair such geometries before they reach
> rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap
> degenerate cases does encumber all users with valid geometries with
> the time wasted on extra checking, so building checks into sf and rgdal is
> not desirable.
>
> I hope it is possible to find out more about the use case quickly, to pass
> on to GDAL developers to help motivate a relaxation in their current
> policy with regard to this driver, and to encourage them to include the
> fix branch in a future release of GDAL.
>
> Roger
>
> On Sat, 17 Aug 2019, Roger Bivand wrote:
>
> > Please follow up both here and on:
> >
> > https://github.com/r-spatial/sf/issues/1130
> >
> > as the problem is also seen in the sf package using the same GDAL ESRI
> > Shapefile driver.
> >
> > Roger
> >
> > On Fri, 16 Aug 2019, Roger Bivand wrote:
> >
> >>  On Fri, 16 Aug 2019, Roger Bivand wrote:
> >>
> >>>   On Tue, 13 Aug 2019, Phil Haines wrote:
> >>>
> >>>>    Dear list,
> >>>>
> >>>>    I have a single Polygons object containing multiple Polygon objects
> >>>>    that share a common border. When I output this using writeOGR() one of
> >>>>    the Polygon objects becomes a hole, as the following example shows.
> >>>>
> >>>>    Create a Polygons object containing two adjoining Polygon objects
> >>>>>    library(rgdal)
> >>>>>    r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
> >>>>>    r2 <- r1; r2[,1] <- r2[,1]+1
> >>>>>    Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
> >>>>>    SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
> >>>>>    data.frame(Example=c("Minimal")))
> >>>>
> >>>>    Perform a write/readOGR() cycle
> >>>>>    fn <- tempfile()
> >>>>>    writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
> >>>>>    SPDF2 <- readOGR(dsn=fn,layer='test')
> >>>>
> >>>>    Second Polygon object is now a hole
> >>>>>    sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
> >>>>    [1] FALSE  TRUE
> >>>>
> >>>>    I see from the sp documentation that "Polygon objects belonging to a
> >>>>    Polygons object should either not overlap one-other, or should be
> >>>>    fully included" but I am not sure how this relates to bordering
> >>>>    Polygon objects. I would welcome any advice as to whether what I am
> >>>>    asking of writeOGR is reasonable?
> >>>
> >>>   The problem is with the 'ESRI Shapefile' representation and driver:
> >>>
> >>>   library(rgdal)
> >>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
> >>>   r2 <- r1; r2[,1] <- r2[,1]+1
> >>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
> >>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
> >>>   data.frame(Example=c("Minimal")))
> >>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
> >>>
> >>>   # which constructs a MULTIPOLYGON object:
> >>>
> >>>   rgeos::writeWKT(SPDF)
> >>>   library(sf)
> >>>   st_as_text(st_geometry(st_as_sf(SPDF)))
> >>>
> >>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
> >>> #    pre-dates it), so the failure occurs by seeing the second exterior
> >>> #    ring
> >>> #    as an interior ring
> >>>
> >>>   fn <- tempfile()
> >>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
> >>>   SPDF2 <- readOGR(dsn=fn, layer='test')
> >>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
> >>>   rgeos::writeWKT(SPDF2)
> >>>   st_as_text(st_geometry(st_as_sf(SPDF2)))
> >>>
> >>>   # This happens with sf too, using the same GDAL driver:
> >>>
> >>>   sf2 <- st_read(dsn=fn, layer='test')
> >>>   st_geometry(sf2)
> >>>   library(sf)
> >>>   st_as_text(st_geometry(st_as_sf(sf2)))
> >>>   rgeos::writeWKT(as(sf2, "Spatial"))
> >>>
> >>>   # Adding the comment fix doesn't help:
> >>>
> >>>   comment(slot(SPDF, "polygons")[[1]])
> >>>   SPDF_c <- rgeos::createSPComment(SPDF)
> >>>   comment(slot(SPDF_c, "polygons")[[1]])
> >>>   writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
> >>>     verbose=TRUE)
> >>>
> >>> #    reports
> >>> #    Object initially classed as: wkbPolygon
> >>> #    SFS comments in Polygons objects
> >>> #    Object reclassed as: wkbMultiPolygon
> >>>
> >>>   SPDF2_c <- readOGR(dsn=fn, layer='test_c')
> >>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
> >>>   "ringDir")
> >>>   rgeos::writeWKT(SPDF2_c)
> >>>   st_as_text(st_geometry(st_as_sf(SPDF2_c)))
> >>>
> >>>
> >>>
> >>>   # If the input object is written out with the GeoPackage driver:
> >>>
> >>>   fn1 <- tempfile(fileext=".gpkg")
> >>>   writeOGR(SPDF, fn1, layer="test", driver='GPKG')
> >>>   sf2a <- st_read(dsn=fn1,layer='test')
> >>>   st_coordinates(st_geometry(sf2a))
> >>>   SPDF2a <- readOGR(dsn=fn1)
> >>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
> >>>   rgeos::writeWKT(SPDF2a)
> >>>   st_as_text(st_geometry(st_as_sf(SPDF2a)))
> >>>
> >>>   # the issue is resolved. If we separate the exterior rings:
> >>>
> >>>   r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
> >>>   Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
> >>>   SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
> >>>   data.frame(Example=c("Minimal")))
> >>>   fna <- tempfile()
> >>>   writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
> >>>   SPDF2_a <- readOGR(dsn=fna, layer='test')
> >>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
> >>>   "ringDir")
> >>>   rgeos::writeWKT(SPDF2_a)
> >>>   st_as_text(st_geometry(st_as_sf(SPDF2_a)))
> >>>
> >>>   # we are OK as the two exterior rings do not touch.
> >>>
> >>>   # does using sf make a difference?
> >>>
> >>>   fn_s <- tempfile(fileext=".shp")
> >>>   st_write(st_as_sf(SPDF), dsn=fn_s)
> >>>   sf_in <- st_read(fn_s)
> >>>   st_as_text(st_geometry(st_as_sf(sf_in)))
> >>>
> >>>   # No
> >>>
> >>>   fn_s <- tempfile(fileext=".shp")
> >>>   st_write(st_as_sf(SPDF_c), dsn=fn_s)
> >>>   sf_in_c <- st_read(fn_s)
> >>>   st_as_text(st_geometry(st_as_sf(sf_in_c)))
> >>>
> >>>   # nor with the pretend-SF-compliant comment set either.
> >>>
> >>>   So the weakness is in the "ESRI Shapefile" write driver, or possibly in
> >>>   the OGRGeometryFactory::organizePolygons() function in GDAL used in
> >>>   OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
> >>>   also
> >>>   calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
> >>>   it
> >>>   has a weakness for the "ESRI Shapefile" driver, but which does not
> >>>   affect
> >>>   SF-compliant drivers.
> >>
> >>  Without the comment set, OGRGeometryFactory::organizePolygons() is used;
> >>  with it set, OGRGeometryFactory::organizePolygons() is not used, because
> >>  the object is declared as two exterior rings. In both cases, we have the
> >>  output object written out and read back in incorrectly with the ESRI
> >>  shapefile driver, but SF-compliant drivers round-trip (in the test
> >>  GeoJSON), correctly.
> >>
> >>  It is likely that the changes made in 2015 to accommodate GeoJSON led to
> >>  this possible regression for the ESRI Shapefile driver. I'm adding this
> >>  geometry to tests/tripup.R and data files; without code changes the hole
> >>  slot is wrong and the ring direction is changes to match, so the
> >>  coordinates change too.
> >>
> >>  Reading using the deprecated maptools::readShapeSpatial() also gets a hole
> >>  in the second external ring. However, writing with deprecated
> >> maptools::  writeSpatialShape() yields a shapefile that when read with
> >> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
> >>  object. When read with sf::st_read() and rgdal::readOGR(), the object is
> >>  also correct. So the problem definitely lies in rgdal::writeOGR(), and
> >>  sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
> >>  degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
> >>
> >>  Roger
> >>
> >>>
> >>>   This is probably related to a similar but inverse problem with the
> >>>   SF-compliant GeoJSON driver in 2015:
> >>>
> >>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
> >>>
> >>>   continued the next month in:
> >>>
> >>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
> >>>
> >>>   The details are in this SVN diff
> >>>
> >>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
> >>>
> >>>   up to the end og the list thread, and this one from then until now:
> >>>
> >>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
> >>>
> >>>   Summary: could you change drivers, or is it really necessary to fix an
> >>>   EOL
> >>>   problem? What is your use case?
> >>>
> >>>>
> >>>>    Thanks in advance for your time,
> >>>
> >>>   Thanks for a complete example,
> >>>
> >>>   Roger
> >>>
> >>>>    Phil
> >>>>
> >>>>>    sessionInfo()
> >>>>    R version 3.5.1 (2018-07-02)
> >>>>    Platform: x86_64-w64-mingw32/x64 (64-bit)
> >>>>    Running under: Windows >= 8 x64 (build 9200)
> >>>>
> >>>>    Matrix products: default
> >>>>
> >>>>    locale:
> >>>>    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
> >>>>    Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
> >>>>                            LC_TIME=English_United Kingdom.1252
> >>>>
> >>>>    attached base packages:
> >>>>    [1] stats     graphics  grDevices utils     datasets  methods   base
> >>>>
> >>>>    other attached packages:
> >>>>    [1] rgdal_1.4-4 sp_1.3-1
> >>>>
> >>>>    loaded via a namespace (and not attached):
> >>>>    [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
> >>>>    lattice_0.20-35
> >>>>
> >>>>    _______________________________________________
> >>>>    R-sig-Geo mailing list
> >>>>    [hidden email]
> >>>>    https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>>>
> >>>
> >>>
> >>
> >>
> >
> >
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: [hidden email]
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand
Administrator
Hi Phil,

On Sun, 18 Aug 2019, Phil Haines wrote:

> Hi Roger,
>
> I originally encountered this issue while trying to reduce the size of
> German postal code boundaries. I used rmapshaper::ms_simplify() which
> introduced the polygons sharing a common edge. I definitely didn't
> need them to be separate polygons, and would happily have merged them
> had I been more familiar with gUnaryUnion().

If the German postal code boundaries are open, could you please put (an
affected subset) on an URL, and provide a short script to replicate the
workflow. Then we can engage with the rmapshaper maintainer, and see
whether the underlying problem in the workflow is in the js library or its
R wrapper. I've opened: https://github.com/ateucher/rmapshaper/issues/89.

>
> I apologise if my question has resulted in unnecessary effort on your
> part. I reported the behaviour because I found it surprising. However,
> I now understand that this example is not a valid simple feature
> geometry and that the solution is to take steps to ensure I have valid
> geometries, and to repair if not.
>

No need to apologise!! Your reproducible example was excellent, without it
you wouldn't have contributed to getting the internal shapelib code
changed to make it less restrictive in future releases of GDAL,
potentially benefitting lots of users (who really should drop ESRI
shapefiles, and/or should check geometries for validity, but ... whose
work you've helped save). By the way, ESRI would be very happy if
shapefiles stopped being produced in current workflows, and that new files
should rather be GPKG.

> Additionally I must confess that I only use the ESRI Shapefile driver
> because I don't know any better... My use cases are reading and
> writing from R (usually to produce maps) and occasionally opening in
> QGIS. I can happily switch to any format that supports this workflow.
>

GPKG is now very broadly supported, is SF-compliant, and has the big
user-facing benefits of not having field name length restrictions, and
many fewer encoding issues for field names and string values.

Best wishes,

Roger

> Thank you very much for your time,
> Phil
>
> On Sun, 18 Aug 2019 at 13:59, Roger Bivand <[hidden email]> wrote:
>>
>> The reason that the problem occurred is that a MULTIPOLYGON with two
>> exterior rings becomes invalid if the exterior rings touch along an edge
>> (this case). It is important to know the use case, to see whether:
>>
>> library(rgeos)
>> writeWKT(Ps1)
>> gIsValid(Ps1, reason=TRUE)
>> Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
>> gIsValid(Ps1a, reason=TRUE)
>> writeWKT(Ps1a)
>>
>> or equivalently in sf for sf objects, should be applied before trying to
>> write the object out to file with this driver.
>>
>> However, because drivers that are compliant with the simple features
>> standard (which bans exterior rings sharing edges) have been permissive
>> and do round-trip this invalid object, a relaxation in the OGR ESRI
>> shapefile driver has been provided and may be included in a future
>> release.
>>
>> We need to know (see these issues):
>>
>> https://github.com/r-spatial/sf/issues/1130
>> https://github.com/OSGeo/gdal/issues/1787
>>
>> why it was desirable to write out this object using this driver? Could an
>> alternative driver have been used, or is ESRI shapefile the only format
>> used in the workflow?
>>
>> If it has to be this driver, could the workflow be changed to repair
>> degenerate cases before writing? If using sp classes, rgeos may be used to
>> test for and probably repair such geometries before they reach
>> rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap
>> degenerate cases does encumber all users with valid geometries with
>> the time wasted on extra checking, so building checks into sf and rgdal is
>> not desirable.
>>
>> I hope it is possible to find out more about the use case quickly, to pass
>> on to GDAL developers to help motivate a relaxation in their current
>> policy with regard to this driver, and to encourage them to include the
>> fix branch in a future release of GDAL.
>>
>> Roger
>>
>> On Sat, 17 Aug 2019, Roger Bivand wrote:
>>
>>> Please follow up both here and on:
>>>
>>> https://github.com/r-spatial/sf/issues/1130
>>>
>>> as the problem is also seen in the sf package using the same GDAL ESRI
>>> Shapefile driver.
>>>
>>> Roger
>>>
>>> On Fri, 16 Aug 2019, Roger Bivand wrote:
>>>
>>>>  On Fri, 16 Aug 2019, Roger Bivand wrote:
>>>>
>>>>>   On Tue, 13 Aug 2019, Phil Haines wrote:
>>>>>
>>>>>>    Dear list,
>>>>>>
>>>>>>    I have a single Polygons object containing multiple Polygon objects
>>>>>>    that share a common border. When I output this using writeOGR() one of
>>>>>>    the Polygon objects becomes a hole, as the following example shows.
>>>>>>
>>>>>>    Create a Polygons object containing two adjoining Polygon objects
>>>>>>>    library(rgdal)
>>>>>>>    r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>>>    r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>>>    Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>>>    SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>>>    data.frame(Example=c("Minimal")))
>>>>>>
>>>>>>    Perform a write/readOGR() cycle
>>>>>>>    fn <- tempfile()
>>>>>>>    writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>>>    SPDF2 <- readOGR(dsn=fn,layer='test')
>>>>>>
>>>>>>    Second Polygon object is now a hole
>>>>>>>    sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
>>>>>>    [1] FALSE  TRUE
>>>>>>
>>>>>>    I see from the sp documentation that "Polygon objects belonging to a
>>>>>>    Polygons object should either not overlap one-other, or should be
>>>>>>    fully included" but I am not sure how this relates to bordering
>>>>>>    Polygon objects. I would welcome any advice as to whether what I am
>>>>>>    asking of writeOGR is reasonable?
>>>>>
>>>>>   The problem is with the 'ESRI Shapefile' representation and driver:
>>>>>
>>>>>   library(rgdal)
>>>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>   r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>   data.frame(Example=c("Minimal")))
>>>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>
>>>>>   # which constructs a MULTIPOLYGON object:
>>>>>
>>>>>   rgeos::writeWKT(SPDF)
>>>>>   library(sf)
>>>>>   st_as_text(st_geometry(st_as_sf(SPDF)))
>>>>>
>>>>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
>>>>> #    pre-dates it), so the failure occurs by seeing the second exterior
>>>>> #    ring
>>>>> #    as an interior ring
>>>>>
>>>>>   fn <- tempfile()
>>>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>   SPDF2 <- readOGR(dsn=fn, layer='test')
>>>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>   rgeos::writeWKT(SPDF2)
>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2)))
>>>>>
>>>>>   # This happens with sf too, using the same GDAL driver:
>>>>>
>>>>>   sf2 <- st_read(dsn=fn, layer='test')
>>>>>   st_geometry(sf2)
>>>>>   library(sf)
>>>>>   st_as_text(st_geometry(st_as_sf(sf2)))
>>>>>   rgeos::writeWKT(as(sf2, "Spatial"))
>>>>>
>>>>>   # Adding the comment fix doesn't help:
>>>>>
>>>>>   comment(slot(SPDF, "polygons")[[1]])
>>>>>   SPDF_c <- rgeos::createSPComment(SPDF)
>>>>>   comment(slot(SPDF_c, "polygons")[[1]])
>>>>>   writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>>>>>     verbose=TRUE)
>>>>>
>>>>> #    reports
>>>>> #    Object initially classed as: wkbPolygon
>>>>> #    SFS comments in Polygons objects
>>>>> #    Object reclassed as: wkbMultiPolygon
>>>>>
>>>>>   SPDF2_c <- readOGR(dsn=fn, layer='test_c')
>>>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
>>>>>   "ringDir")
>>>>>   rgeos::writeWKT(SPDF2_c)
>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>>>>>
>>>>>
>>>>>
>>>>>   # If the input object is written out with the GeoPackage driver:
>>>>>
>>>>>   fn1 <- tempfile(fileext=".gpkg")
>>>>>   writeOGR(SPDF, fn1, layer="test", driver='GPKG')
>>>>>   sf2a <- st_read(dsn=fn1,layer='test')
>>>>>   st_coordinates(st_geometry(sf2a))
>>>>>   SPDF2a <- readOGR(dsn=fn1)
>>>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>   rgeos::writeWKT(SPDF2a)
>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2a)))
>>>>>
>>>>>   # the issue is resolved. If we separate the exterior rings:
>>>>>
>>>>>   r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
>>>>>   Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
>>>>>   SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>>>>>   data.frame(Example=c("Minimal")))
>>>>>   fna <- tempfile()
>>>>>   writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
>>>>>   SPDF2_a <- readOGR(dsn=fna, layer='test')
>>>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
>>>>>   "ringDir")
>>>>>   rgeos::writeWKT(SPDF2_a)
>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>>>>>
>>>>>   # we are OK as the two exterior rings do not touch.
>>>>>
>>>>>   # does using sf make a difference?
>>>>>
>>>>>   fn_s <- tempfile(fileext=".shp")
>>>>>   st_write(st_as_sf(SPDF), dsn=fn_s)
>>>>>   sf_in <- st_read(fn_s)
>>>>>   st_as_text(st_geometry(st_as_sf(sf_in)))
>>>>>
>>>>>   # No
>>>>>
>>>>>   fn_s <- tempfile(fileext=".shp")
>>>>>   st_write(st_as_sf(SPDF_c), dsn=fn_s)
>>>>>   sf_in_c <- st_read(fn_s)
>>>>>   st_as_text(st_geometry(st_as_sf(sf_in_c)))
>>>>>
>>>>>   # nor with the pretend-SF-compliant comment set either.
>>>>>
>>>>>   So the weakness is in the "ESRI Shapefile" write driver, or possibly in
>>>>>   the OGRGeometryFactory::organizePolygons() function in GDAL used in
>>>>>   OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
>>>>>   also
>>>>>   calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
>>>>>   it
>>>>>   has a weakness for the "ESRI Shapefile" driver, but which does not
>>>>>   affect
>>>>>   SF-compliant drivers.
>>>>
>>>>  Without the comment set, OGRGeometryFactory::organizePolygons() is used;
>>>>  with it set, OGRGeometryFactory::organizePolygons() is not used, because
>>>>  the object is declared as two exterior rings. In both cases, we have the
>>>>  output object written out and read back in incorrectly with the ESRI
>>>>  shapefile driver, but SF-compliant drivers round-trip (in the test
>>>>  GeoJSON), correctly.
>>>>
>>>>  It is likely that the changes made in 2015 to accommodate GeoJSON led to
>>>>  this possible regression for the ESRI Shapefile driver. I'm adding this
>>>>  geometry to tests/tripup.R and data files; without code changes the hole
>>>>  slot is wrong and the ring direction is changes to match, so the
>>>>  coordinates change too.
>>>>
>>>>  Reading using the deprecated maptools::readShapeSpatial() also gets a hole
>>>>  in the second external ring. However, writing with deprecated
>>>> maptools::  writeSpatialShape() yields a shapefile that when read with
>>>> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
>>>>  object. When read with sf::st_read() and rgdal::readOGR(), the object is
>>>>  also correct. So the problem definitely lies in rgdal::writeOGR(), and
>>>>  sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
>>>>  degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
>>>>
>>>>  Roger
>>>>
>>>>>
>>>>>   This is probably related to a similar but inverse problem with the
>>>>>   SF-compliant GeoJSON driver in 2015:
>>>>>
>>>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>>>>>
>>>>>   continued the next month in:
>>>>>
>>>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>>>>>
>>>>>   The details are in this SVN diff
>>>>>
>>>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>>>>>
>>>>>   up to the end og the list thread, and this one from then until now:
>>>>>
>>>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>>>>>
>>>>>   Summary: could you change drivers, or is it really necessary to fix an
>>>>>   EOL
>>>>>   problem? What is your use case?
>>>>>
>>>>>>
>>>>>>    Thanks in advance for your time,
>>>>>
>>>>>   Thanks for a complete example,
>>>>>
>>>>>   Roger
>>>>>
>>>>>>    Phil
>>>>>>
>>>>>>>    sessionInfo()
>>>>>>    R version 3.5.1 (2018-07-02)
>>>>>>    Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>>>    Running under: Windows >= 8 x64 (build 9200)
>>>>>>
>>>>>>    Matrix products: default
>>>>>>
>>>>>>    locale:
>>>>>>    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>>>>>    Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>>>>>                            LC_TIME=English_United Kingdom.1252
>>>>>>
>>>>>>    attached base packages:
>>>>>>    [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>
>>>>>>    other attached packages:
>>>>>>    [1] rgdal_1.4-4 sp_1.3-1
>>>>>>
>>>>>>    loaded via a namespace (and not attached):
>>>>>>    [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>>>>>    lattice_0.20-35
>>>>>>
>>>>>>    _______________________________________________
>>>>>>    R-sig-Geo mailing list
>>>>>>    [hidden email]
>>>>>>    https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: [hidden email]
>> https://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Phil Haines
Hi Roger,

I have added an example to https://github.com/ateucher/rmapshaper/issues/89

Hopefully it illustrates the behaviour I was seeing from
rmapshaper::ms_simplify() but please let me know if not.

And thanks for the nudge towards GPKG, I will investigate.
Phil

On Mon, 19 Aug 2019 at 15:48, Roger Bivand <[hidden email]> wrote:

>
> Hi Phil,
>
> On Sun, 18 Aug 2019, Phil Haines wrote:
>
> > Hi Roger,
> >
> > I originally encountered this issue while trying to reduce the size of
> > German postal code boundaries. I used rmapshaper::ms_simplify() which
> > introduced the polygons sharing a common edge. I definitely didn't
> > need them to be separate polygons, and would happily have merged them
> > had I been more familiar with gUnaryUnion().
>
> If the German postal code boundaries are open, could you please put (an
> affected subset) on an URL, and provide a short script to replicate the
> workflow. Then we can engage with the rmapshaper maintainer, and see
> whether the underlying problem in the workflow is in the js library or its
> R wrapper. I've opened: https://github.com/ateucher/rmapshaper/issues/89.
>
> >
> > I apologise if my question has resulted in unnecessary effort on your
> > part. I reported the behaviour because I found it surprising. However,
> > I now understand that this example is not a valid simple feature
> > geometry and that the solution is to take steps to ensure I have valid
> > geometries, and to repair if not.
> >
>
> No need to apologise!! Your reproducible example was excellent, without it
> you wouldn't have contributed to getting the internal shapelib code
> changed to make it less restrictive in future releases of GDAL,
> potentially benefitting lots of users (who really should drop ESRI
> shapefiles, and/or should check geometries for validity, but ... whose
> work you've helped save). By the way, ESRI would be very happy if
> shapefiles stopped being produced in current workflows, and that new files
> should rather be GPKG.
>
> > Additionally I must confess that I only use the ESRI Shapefile driver
> > because I don't know any better... My use cases are reading and
> > writing from R (usually to produce maps) and occasionally opening in
> > QGIS. I can happily switch to any format that supports this workflow.
> >
>
> GPKG is now very broadly supported, is SF-compliant, and has the big
> user-facing benefits of not having field name length restrictions, and
> many fewer encoding issues for field names and string values.
>
> Best wishes,
>
> Roger
>
> > Thank you very much for your time,
> > Phil
> >
> > On Sun, 18 Aug 2019 at 13:59, Roger Bivand <[hidden email]> wrote:
> >>
> >> The reason that the problem occurred is that a MULTIPOLYGON with two
> >> exterior rings becomes invalid if the exterior rings touch along an edge
> >> (this case). It is important to know the use case, to see whether:
> >>
> >> library(rgeos)
> >> writeWKT(Ps1)
> >> gIsValid(Ps1, reason=TRUE)
> >> Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
> >> gIsValid(Ps1a, reason=TRUE)
> >> writeWKT(Ps1a)
> >>
> >> or equivalently in sf for sf objects, should be applied before trying to
> >> write the object out to file with this driver.
> >>
> >> However, because drivers that are compliant with the simple features
> >> standard (which bans exterior rings sharing edges) have been permissive
> >> and do round-trip this invalid object, a relaxation in the OGR ESRI
> >> shapefile driver has been provided and may be included in a future
> >> release.
> >>
> >> We need to know (see these issues):
> >>
> >> https://github.com/r-spatial/sf/issues/1130
> >> https://github.com/OSGeo/gdal/issues/1787
> >>
> >> why it was desirable to write out this object using this driver? Could an
> >> alternative driver have been used, or is ESRI shapefile the only format
> >> used in the workflow?
> >>
> >> If it has to be this driver, could the workflow be changed to repair
> >> degenerate cases before writing? If using sp classes, rgeos may be used to
> >> test for and probably repair such geometries before they reach
> >> rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap
> >> degenerate cases does encumber all users with valid geometries with
> >> the time wasted on extra checking, so building checks into sf and rgdal is
> >> not desirable.
> >>
> >> I hope it is possible to find out more about the use case quickly, to pass
> >> on to GDAL developers to help motivate a relaxation in their current
> >> policy with regard to this driver, and to encourage them to include the
> >> fix branch in a future release of GDAL.
> >>
> >> Roger
> >>
> >> On Sat, 17 Aug 2019, Roger Bivand wrote:
> >>
> >>> Please follow up both here and on:
> >>>
> >>> https://github.com/r-spatial/sf/issues/1130
> >>>
> >>> as the problem is also seen in the sf package using the same GDAL ESRI
> >>> Shapefile driver.
> >>>
> >>> Roger
> >>>
> >>> On Fri, 16 Aug 2019, Roger Bivand wrote:
> >>>
> >>>>  On Fri, 16 Aug 2019, Roger Bivand wrote:
> >>>>
> >>>>>   On Tue, 13 Aug 2019, Phil Haines wrote:
> >>>>>
> >>>>>>    Dear list,
> >>>>>>
> >>>>>>    I have a single Polygons object containing multiple Polygon objects
> >>>>>>    that share a common border. When I output this using writeOGR() one of
> >>>>>>    the Polygon objects becomes a hole, as the following example shows.
> >>>>>>
> >>>>>>    Create a Polygons object containing two adjoining Polygon objects
> >>>>>>>    library(rgdal)
> >>>>>>>    r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
> >>>>>>>    r2 <- r1; r2[,1] <- r2[,1]+1
> >>>>>>>    Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
> >>>>>>>    SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
> >>>>>>>    data.frame(Example=c("Minimal")))
> >>>>>>
> >>>>>>    Perform a write/readOGR() cycle
> >>>>>>>    fn <- tempfile()
> >>>>>>>    writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
> >>>>>>>    SPDF2 <- readOGR(dsn=fn,layer='test')
> >>>>>>
> >>>>>>    Second Polygon object is now a hole
> >>>>>>>    sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
> >>>>>>    [1] FALSE  TRUE
> >>>>>>
> >>>>>>    I see from the sp documentation that "Polygon objects belonging to a
> >>>>>>    Polygons object should either not overlap one-other, or should be
> >>>>>>    fully included" but I am not sure how this relates to bordering
> >>>>>>    Polygon objects. I would welcome any advice as to whether what I am
> >>>>>>    asking of writeOGR is reasonable?
> >>>>>
> >>>>>   The problem is with the 'ESRI Shapefile' representation and driver:
> >>>>>
> >>>>>   library(rgdal)
> >>>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
> >>>>>   r2 <- r1; r2[,1] <- r2[,1]+1
> >>>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
> >>>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
> >>>>>   data.frame(Example=c("Minimal")))
> >>>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
> >>>>>
> >>>>>   # which constructs a MULTIPOLYGON object:
> >>>>>
> >>>>>   rgeos::writeWKT(SPDF)
> >>>>>   library(sf)
> >>>>>   st_as_text(st_geometry(st_as_sf(SPDF)))
> >>>>>
> >>>>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
> >>>>> #    pre-dates it), so the failure occurs by seeing the second exterior
> >>>>> #    ring
> >>>>> #    as an interior ring
> >>>>>
> >>>>>   fn <- tempfile()
> >>>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
> >>>>>   SPDF2 <- readOGR(dsn=fn, layer='test')
> >>>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
> >>>>>   rgeos::writeWKT(SPDF2)
> >>>>>   st_as_text(st_geometry(st_as_sf(SPDF2)))
> >>>>>
> >>>>>   # This happens with sf too, using the same GDAL driver:
> >>>>>
> >>>>>   sf2 <- st_read(dsn=fn, layer='test')
> >>>>>   st_geometry(sf2)
> >>>>>   library(sf)
> >>>>>   st_as_text(st_geometry(st_as_sf(sf2)))
> >>>>>   rgeos::writeWKT(as(sf2, "Spatial"))
> >>>>>
> >>>>>   # Adding the comment fix doesn't help:
> >>>>>
> >>>>>   comment(slot(SPDF, "polygons")[[1]])
> >>>>>   SPDF_c <- rgeos::createSPComment(SPDF)
> >>>>>   comment(slot(SPDF_c, "polygons")[[1]])
> >>>>>   writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
> >>>>>     verbose=TRUE)
> >>>>>
> >>>>> #    reports
> >>>>> #    Object initially classed as: wkbPolygon
> >>>>> #    SFS comments in Polygons objects
> >>>>> #    Object reclassed as: wkbMultiPolygon
> >>>>>
> >>>>>   SPDF2_c <- readOGR(dsn=fn, layer='test_c')
> >>>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
> >>>>>   "ringDir")
> >>>>>   rgeos::writeWKT(SPDF2_c)
> >>>>>   st_as_text(st_geometry(st_as_sf(SPDF2_c)))
> >>>>>
> >>>>>
> >>>>>
> >>>>>   # If the input object is written out with the GeoPackage driver:
> >>>>>
> >>>>>   fn1 <- tempfile(fileext=".gpkg")
> >>>>>   writeOGR(SPDF, fn1, layer="test", driver='GPKG')
> >>>>>   sf2a <- st_read(dsn=fn1,layer='test')
> >>>>>   st_coordinates(st_geometry(sf2a))
> >>>>>   SPDF2a <- readOGR(dsn=fn1)
> >>>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
> >>>>>   rgeos::writeWKT(SPDF2a)
> >>>>>   st_as_text(st_geometry(st_as_sf(SPDF2a)))
> >>>>>
> >>>>>   # the issue is resolved. If we separate the exterior rings:
> >>>>>
> >>>>>   r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
> >>>>>   Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
> >>>>>   SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
> >>>>>   data.frame(Example=c("Minimal")))
> >>>>>   fna <- tempfile()
> >>>>>   writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
> >>>>>   SPDF2_a <- readOGR(dsn=fna, layer='test')
> >>>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
> >>>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
> >>>>>   "ringDir")
> >>>>>   rgeos::writeWKT(SPDF2_a)
> >>>>>   st_as_text(st_geometry(st_as_sf(SPDF2_a)))
> >>>>>
> >>>>>   # we are OK as the two exterior rings do not touch.
> >>>>>
> >>>>>   # does using sf make a difference?
> >>>>>
> >>>>>   fn_s <- tempfile(fileext=".shp")
> >>>>>   st_write(st_as_sf(SPDF), dsn=fn_s)
> >>>>>   sf_in <- st_read(fn_s)
> >>>>>   st_as_text(st_geometry(st_as_sf(sf_in)))
> >>>>>
> >>>>>   # No
> >>>>>
> >>>>>   fn_s <- tempfile(fileext=".shp")
> >>>>>   st_write(st_as_sf(SPDF_c), dsn=fn_s)
> >>>>>   sf_in_c <- st_read(fn_s)
> >>>>>   st_as_text(st_geometry(st_as_sf(sf_in_c)))
> >>>>>
> >>>>>   # nor with the pretend-SF-compliant comment set either.
> >>>>>
> >>>>>   So the weakness is in the "ESRI Shapefile" write driver, or possibly in
> >>>>>   the OGRGeometryFactory::organizePolygons() function in GDAL used in
> >>>>>   OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
> >>>>>   also
> >>>>>   calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
> >>>>>   it
> >>>>>   has a weakness for the "ESRI Shapefile" driver, but which does not
> >>>>>   affect
> >>>>>   SF-compliant drivers.
> >>>>
> >>>>  Without the comment set, OGRGeometryFactory::organizePolygons() is used;
> >>>>  with it set, OGRGeometryFactory::organizePolygons() is not used, because
> >>>>  the object is declared as two exterior rings. In both cases, we have the
> >>>>  output object written out and read back in incorrectly with the ESRI
> >>>>  shapefile driver, but SF-compliant drivers round-trip (in the test
> >>>>  GeoJSON), correctly.
> >>>>
> >>>>  It is likely that the changes made in 2015 to accommodate GeoJSON led to
> >>>>  this possible regression for the ESRI Shapefile driver. I'm adding this
> >>>>  geometry to tests/tripup.R and data files; without code changes the hole
> >>>>  slot is wrong and the ring direction is changes to match, so the
> >>>>  coordinates change too.
> >>>>
> >>>>  Reading using the deprecated maptools::readShapeSpatial() also gets a hole
> >>>>  in the second external ring. However, writing with deprecated
> >>>> maptools::  writeSpatialShape() yields a shapefile that when read with
> >>>> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
> >>>>  object. When read with sf::st_read() and rgdal::readOGR(), the object is
> >>>>  also correct. So the problem definitely lies in rgdal::writeOGR(), and
> >>>>  sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
> >>>>  degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
> >>>>
> >>>>  Roger
> >>>>
> >>>>>
> >>>>>   This is probably related to a similar but inverse problem with the
> >>>>>   SF-compliant GeoJSON driver in 2015:
> >>>>>
> >>>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
> >>>>>
> >>>>>   continued the next month in:
> >>>>>
> >>>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
> >>>>>
> >>>>>   The details are in this SVN diff
> >>>>>
> >>>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
> >>>>>
> >>>>>   up to the end og the list thread, and this one from then until now:
> >>>>>
> >>>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
> >>>>>
> >>>>>   Summary: could you change drivers, or is it really necessary to fix an
> >>>>>   EOL
> >>>>>   problem? What is your use case?
> >>>>>
> >>>>>>
> >>>>>>    Thanks in advance for your time,
> >>>>>
> >>>>>   Thanks for a complete example,
> >>>>>
> >>>>>   Roger
> >>>>>
> >>>>>>    Phil
> >>>>>>
> >>>>>>>    sessionInfo()
> >>>>>>    R version 3.5.1 (2018-07-02)
> >>>>>>    Platform: x86_64-w64-mingw32/x64 (64-bit)
> >>>>>>    Running under: Windows >= 8 x64 (build 9200)
> >>>>>>
> >>>>>>    Matrix products: default
> >>>>>>
> >>>>>>    locale:
> >>>>>>    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
> >>>>>>    Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
> >>>>>>                            LC_TIME=English_United Kingdom.1252
> >>>>>>
> >>>>>>    attached base packages:
> >>>>>>    [1] stats     graphics  grDevices utils     datasets  methods   base
> >>>>>>
> >>>>>>    other attached packages:
> >>>>>>    [1] rgdal_1.4-4 sp_1.3-1
> >>>>>>
> >>>>>>    loaded via a namespace (and not attached):
> >>>>>>    [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
> >>>>>>    lattice_0.20-35
> >>>>>>
> >>>>>>    _______________________________________________
> >>>>>>    R-sig-Geo mailing list
> >>>>>>    [hidden email]
> >>>>>>    https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >> --
> >> Roger Bivand
> >> Department of Economics, Norwegian School of Economics,
> >> Helleveien 30, N-5045 Bergen, Norway.
> >> voice: +47 55 95 93 55; e-mail: [hidden email]
> >> https://orcid.org/0000-0003-2392-6140
> >> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
> >
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: [hidden email]
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand
Administrator
Good, thanks very much!

Roger

On Mon, 19 Aug 2019, Phil Haines wrote:

> Hi Roger,
>
> I have added an example to https://github.com/ateucher/rmapshaper/issues/89
>
> Hopefully it illustrates the behaviour I was seeing from
> rmapshaper::ms_simplify() but please let me know if not.
>
> And thanks for the nudge towards GPKG, I will investigate.
> Phil
>
> On Mon, 19 Aug 2019 at 15:48, Roger Bivand <[hidden email]> wrote:
>>
>> Hi Phil,
>>
>> On Sun, 18 Aug 2019, Phil Haines wrote:
>>
>>> Hi Roger,
>>>
>>> I originally encountered this issue while trying to reduce the size of
>>> German postal code boundaries. I used rmapshaper::ms_simplify() which
>>> introduced the polygons sharing a common edge. I definitely didn't
>>> need them to be separate polygons, and would happily have merged them
>>> had I been more familiar with gUnaryUnion().
>>
>> If the German postal code boundaries are open, could you please put (an
>> affected subset) on an URL, and provide a short script to replicate the
>> workflow. Then we can engage with the rmapshaper maintainer, and see
>> whether the underlying problem in the workflow is in the js library or its
>> R wrapper. I've opened: https://github.com/ateucher/rmapshaper/issues/89.
>>
>>>
>>> I apologise if my question has resulted in unnecessary effort on your
>>> part. I reported the behaviour because I found it surprising. However,
>>> I now understand that this example is not a valid simple feature
>>> geometry and that the solution is to take steps to ensure I have valid
>>> geometries, and to repair if not.
>>>
>>
>> No need to apologise!! Your reproducible example was excellent, without it
>> you wouldn't have contributed to getting the internal shapelib code
>> changed to make it less restrictive in future releases of GDAL,
>> potentially benefitting lots of users (who really should drop ESRI
>> shapefiles, and/or should check geometries for validity, but ... whose
>> work you've helped save). By the way, ESRI would be very happy if
>> shapefiles stopped being produced in current workflows, and that new files
>> should rather be GPKG.
>>
>>> Additionally I must confess that I only use the ESRI Shapefile driver
>>> because I don't know any better... My use cases are reading and
>>> writing from R (usually to produce maps) and occasionally opening in
>>> QGIS. I can happily switch to any format that supports this workflow.
>>>
>>
>> GPKG is now very broadly supported, is SF-compliant, and has the big
>> user-facing benefits of not having field name length restrictions, and
>> many fewer encoding issues for field names and string values.
>>
>> Best wishes,
>>
>> Roger
>>
>>> Thank you very much for your time,
>>> Phil
>>>
>>> On Sun, 18 Aug 2019 at 13:59, Roger Bivand <[hidden email]> wrote:
>>>>
>>>> The reason that the problem occurred is that a MULTIPOLYGON with two
>>>> exterior rings becomes invalid if the exterior rings touch along an edge
>>>> (this case). It is important to know the use case, to see whether:
>>>>
>>>> library(rgeos)
>>>> writeWKT(Ps1)
>>>> gIsValid(Ps1, reason=TRUE)
>>>> Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
>>>> gIsValid(Ps1a, reason=TRUE)
>>>> writeWKT(Ps1a)
>>>>
>>>> or equivalently in sf for sf objects, should be applied before trying to
>>>> write the object out to file with this driver.
>>>>
>>>> However, because drivers that are compliant with the simple features
>>>> standard (which bans exterior rings sharing edges) have been permissive
>>>> and do round-trip this invalid object, a relaxation in the OGR ESRI
>>>> shapefile driver has been provided and may be included in a future
>>>> release.
>>>>
>>>> We need to know (see these issues):
>>>>
>>>> https://github.com/r-spatial/sf/issues/1130
>>>> https://github.com/OSGeo/gdal/issues/1787
>>>>
>>>> why it was desirable to write out this object using this driver? Could an
>>>> alternative driver have been used, or is ESRI shapefile the only format
>>>> used in the workflow?
>>>>
>>>> If it has to be this driver, could the workflow be changed to repair
>>>> degenerate cases before writing? If using sp classes, rgeos may be used to
>>>> test for and probably repair such geometries before they reach
>>>> rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap
>>>> degenerate cases does encumber all users with valid geometries with
>>>> the time wasted on extra checking, so building checks into sf and rgdal is
>>>> not desirable.
>>>>
>>>> I hope it is possible to find out more about the use case quickly, to pass
>>>> on to GDAL developers to help motivate a relaxation in their current
>>>> policy with regard to this driver, and to encourage them to include the
>>>> fix branch in a future release of GDAL.
>>>>
>>>> Roger
>>>>
>>>> On Sat, 17 Aug 2019, Roger Bivand wrote:
>>>>
>>>>> Please follow up both here and on:
>>>>>
>>>>> https://github.com/r-spatial/sf/issues/1130
>>>>>
>>>>> as the problem is also seen in the sf package using the same GDAL ESRI
>>>>> Shapefile driver.
>>>>>
>>>>> Roger
>>>>>
>>>>> On Fri, 16 Aug 2019, Roger Bivand wrote:
>>>>>
>>>>>>  On Fri, 16 Aug 2019, Roger Bivand wrote:
>>>>>>
>>>>>>>   On Tue, 13 Aug 2019, Phil Haines wrote:
>>>>>>>
>>>>>>>>    Dear list,
>>>>>>>>
>>>>>>>>    I have a single Polygons object containing multiple Polygon objects
>>>>>>>>    that share a common border. When I output this using writeOGR() one of
>>>>>>>>    the Polygon objects becomes a hole, as the following example shows.
>>>>>>>>
>>>>>>>>    Create a Polygons object containing two adjoining Polygon objects
>>>>>>>>>    library(rgdal)
>>>>>>>>>    r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>>>>>    r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>>>>>    Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>>>>>    SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>>>>>    data.frame(Example=c("Minimal")))
>>>>>>>>
>>>>>>>>    Perform a write/readOGR() cycle
>>>>>>>>>    fn <- tempfile()
>>>>>>>>>    writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>>>>>    SPDF2 <- readOGR(dsn=fn,layer='test')
>>>>>>>>
>>>>>>>>    Second Polygon object is now a hole
>>>>>>>>>    sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
>>>>>>>>    [1] FALSE  TRUE
>>>>>>>>
>>>>>>>>    I see from the sp documentation that "Polygon objects belonging to a
>>>>>>>>    Polygons object should either not overlap one-other, or should be
>>>>>>>>    fully included" but I am not sure how this relates to bordering
>>>>>>>>    Polygon objects. I would welcome any advice as to whether what I am
>>>>>>>>    asking of writeOGR is reasonable?
>>>>>>>
>>>>>>>   The problem is with the 'ESRI Shapefile' representation and driver:
>>>>>>>
>>>>>>>   library(rgdal)
>>>>>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>>>   r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>>>   data.frame(Example=c("Minimal")))
>>>>>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>>>
>>>>>>>   # which constructs a MULTIPOLYGON object:
>>>>>>>
>>>>>>>   rgeos::writeWKT(SPDF)
>>>>>>>   library(sf)
>>>>>>>   st_as_text(st_geometry(st_as_sf(SPDF)))
>>>>>>>
>>>>>>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
>>>>>>> #    pre-dates it), so the failure occurs by seeing the second exterior
>>>>>>> #    ring
>>>>>>> #    as an interior ring
>>>>>>>
>>>>>>>   fn <- tempfile()
>>>>>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>>>   SPDF2 <- readOGR(dsn=fn, layer='test')
>>>>>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>>>   rgeos::writeWKT(SPDF2)
>>>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2)))
>>>>>>>
>>>>>>>   # This happens with sf too, using the same GDAL driver:
>>>>>>>
>>>>>>>   sf2 <- st_read(dsn=fn, layer='test')
>>>>>>>   st_geometry(sf2)
>>>>>>>   library(sf)
>>>>>>>   st_as_text(st_geometry(st_as_sf(sf2)))
>>>>>>>   rgeos::writeWKT(as(sf2, "Spatial"))
>>>>>>>
>>>>>>>   # Adding the comment fix doesn't help:
>>>>>>>
>>>>>>>   comment(slot(SPDF, "polygons")[[1]])
>>>>>>>   SPDF_c <- rgeos::createSPComment(SPDF)
>>>>>>>   comment(slot(SPDF_c, "polygons")[[1]])
>>>>>>>   writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>>>>>>>     verbose=TRUE)
>>>>>>>
>>>>>>> #    reports
>>>>>>> #    Object initially classed as: wkbPolygon
>>>>>>> #    SFS comments in Polygons objects
>>>>>>> #    Object reclassed as: wkbMultiPolygon
>>>>>>>
>>>>>>>   SPDF2_c <- readOGR(dsn=fn, layer='test_c')
>>>>>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
>>>>>>>   "ringDir")
>>>>>>>   rgeos::writeWKT(SPDF2_c)
>>>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   # If the input object is written out with the GeoPackage driver:
>>>>>>>
>>>>>>>   fn1 <- tempfile(fileext=".gpkg")
>>>>>>>   writeOGR(SPDF, fn1, layer="test", driver='GPKG')
>>>>>>>   sf2a <- st_read(dsn=fn1,layer='test')
>>>>>>>   st_coordinates(st_geometry(sf2a))
>>>>>>>   SPDF2a <- readOGR(dsn=fn1)
>>>>>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>>>   rgeos::writeWKT(SPDF2a)
>>>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2a)))
>>>>>>>
>>>>>>>   # the issue is resolved. If we separate the exterior rings:
>>>>>>>
>>>>>>>   r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
>>>>>>>   Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
>>>>>>>   SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>>>>>>>   data.frame(Example=c("Minimal")))
>>>>>>>   fna <- tempfile()
>>>>>>>   writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
>>>>>>>   SPDF2_a <- readOGR(dsn=fna, layer='test')
>>>>>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
>>>>>>>   "ringDir")
>>>>>>>   rgeos::writeWKT(SPDF2_a)
>>>>>>>   st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>>>>>>>
>>>>>>>   # we are OK as the two exterior rings do not touch.
>>>>>>>
>>>>>>>   # does using sf make a difference?
>>>>>>>
>>>>>>>   fn_s <- tempfile(fileext=".shp")
>>>>>>>   st_write(st_as_sf(SPDF), dsn=fn_s)
>>>>>>>   sf_in <- st_read(fn_s)
>>>>>>>   st_as_text(st_geometry(st_as_sf(sf_in)))
>>>>>>>
>>>>>>>   # No
>>>>>>>
>>>>>>>   fn_s <- tempfile(fileext=".shp")
>>>>>>>   st_write(st_as_sf(SPDF_c), dsn=fn_s)
>>>>>>>   sf_in_c <- st_read(fn_s)
>>>>>>>   st_as_text(st_geometry(st_as_sf(sf_in_c)))
>>>>>>>
>>>>>>>   # nor with the pretend-SF-compliant comment set either.
>>>>>>>
>>>>>>>   So the weakness is in the "ESRI Shapefile" write driver, or possibly in
>>>>>>>   the OGRGeometryFactory::organizePolygons() function in GDAL used in
>>>>>>>   OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
>>>>>>>   also
>>>>>>>   calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
>>>>>>>   it
>>>>>>>   has a weakness for the "ESRI Shapefile" driver, but which does not
>>>>>>>   affect
>>>>>>>   SF-compliant drivers.
>>>>>>
>>>>>>  Without the comment set, OGRGeometryFactory::organizePolygons() is used;
>>>>>>  with it set, OGRGeometryFactory::organizePolygons() is not used, because
>>>>>>  the object is declared as two exterior rings. In both cases, we have the
>>>>>>  output object written out and read back in incorrectly with the ESRI
>>>>>>  shapefile driver, but SF-compliant drivers round-trip (in the test
>>>>>>  GeoJSON), correctly.
>>>>>>
>>>>>>  It is likely that the changes made in 2015 to accommodate GeoJSON led to
>>>>>>  this possible regression for the ESRI Shapefile driver. I'm adding this
>>>>>>  geometry to tests/tripup.R and data files; without code changes the hole
>>>>>>  slot is wrong and the ring direction is changes to match, so the
>>>>>>  coordinates change too.
>>>>>>
>>>>>>  Reading using the deprecated maptools::readShapeSpatial() also gets a hole
>>>>>>  in the second external ring. However, writing with deprecated
>>>>>> maptools::  writeSpatialShape() yields a shapefile that when read with
>>>>>> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
>>>>>>  object. When read with sf::st_read() and rgdal::readOGR(), the object is
>>>>>>  also correct. So the problem definitely lies in rgdal::writeOGR(), and
>>>>>>  sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
>>>>>>  degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
>>>>>>
>>>>>>  Roger
>>>>>>
>>>>>>>
>>>>>>>   This is probably related to a similar but inverse problem with the
>>>>>>>   SF-compliant GeoJSON driver in 2015:
>>>>>>>
>>>>>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>>>>>>>
>>>>>>>   continued the next month in:
>>>>>>>
>>>>>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>>>>>>>
>>>>>>>   The details are in this SVN diff
>>>>>>>
>>>>>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>>>>>>>
>>>>>>>   up to the end og the list thread, and this one from then until now:
>>>>>>>
>>>>>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>>>>>>>
>>>>>>>   Summary: could you change drivers, or is it really necessary to fix an
>>>>>>>   EOL
>>>>>>>   problem? What is your use case?
>>>>>>>
>>>>>>>>
>>>>>>>>    Thanks in advance for your time,
>>>>>>>
>>>>>>>   Thanks for a complete example,
>>>>>>>
>>>>>>>   Roger
>>>>>>>
>>>>>>>>    Phil
>>>>>>>>
>>>>>>>>>    sessionInfo()
>>>>>>>>    R version 3.5.1 (2018-07-02)
>>>>>>>>    Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>>>>>    Running under: Windows >= 8 x64 (build 9200)
>>>>>>>>
>>>>>>>>    Matrix products: default
>>>>>>>>
>>>>>>>>    locale:
>>>>>>>>    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>>>>>>>    Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>>>>>>>                            LC_TIME=English_United Kingdom.1252
>>>>>>>>
>>>>>>>>    attached base packages:
>>>>>>>>    [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>>
>>>>>>>>    other attached packages:
>>>>>>>>    [1] rgdal_1.4-4 sp_1.3-1
>>>>>>>>
>>>>>>>>    loaded via a namespace (and not attached):
>>>>>>>>    [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>>>>>>>    lattice_0.20-35
>>>>>>>>
>>>>>>>>    _______________________________________________
>>>>>>>>    R-sig-Geo mailing list
>>>>>>>>    [hidden email]
>>>>>>>>    https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Roger Bivand
>>>> Department of Economics, Norwegian School of Economics,
>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>> voice: +47 55 95 93 55; e-mail: [hidden email]
>>>> https://orcid.org/0000-0003-2392-6140
>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: [hidden email]
>> https://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Andy Teucher
In reply to this post by Phil Haines
Hi Phil and Roger,

To close the loop on this issue, I have confirmed that the mapshaper javascript library (which rmapshaper wraps) can produce invalid polygons (https://github.com/ateucher/rmapshaper/issues/89), and I will address this in a future release of rmapshaper (https://github.com/ateucher/rmapshaper/issues/90). I have also alerted the maintainer of mapshaper and he is going to add functionality to address this as well (https://github.com/mbloch/mapshaper/issues/352).

Thanks for the thorough investigation and bringing this to my attention!

Cheers,
Andy Teucher (rmapshaper package maintainer)

> Date: Tue, 20 Aug 2019 11:55:00 +0200
> From: Roger Bivand <[hidden email]>
> To: Phil Haines <[hidden email]>
> Cc: <[hidden email]>
> Subject: Re: [R-sig-Geo]  rgdal::writeOGR with driver='ESRI Shapefile'
> converts Polygon object into a hole
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset="us-ascii"; Format="flowed"
>
> Good, thanks very much!
>
> Roger
>
> On Mon, 19 Aug 2019, Phil Haines wrote:
>
>> Hi Roger,
>>
>> I have added an example to https://github.com/ateucher/rmapshaper/issues/89
>>
>> Hopefully it illustrates the behaviour I was seeing from
>> rmapshaper::ms_simplify() but please let me know if not.
>>
>> And thanks for the nudge towards GPKG, I will investigate.
>> Phil
>>
>> On Mon, 19 Aug 2019 at 15:48, Roger Bivand <[hidden email]> wrote:
>>>
>>> Hi Phil,
>>>
>>> On Sun, 18 Aug 2019, Phil Haines wrote:
>>>
>>>> Hi Roger,
>>>>
>>>> I originally encountered this issue while trying to reduce the size of
>>>> German postal code boundaries. I used rmapshaper::ms_simplify() which
>>>> introduced the polygons sharing a common edge. I definitely didn't
>>>> need them to be separate polygons, and would happily have merged them
>>>> had I been more familiar with gUnaryUnion().
>>>
>>> If the German postal code boundaries are open, could you please put (an
>>> affected subset) on an URL, and provide a short script to replicate the
>>> workflow. Then we can engage with the rmapshaper maintainer, and see
>>> whether the underlying problem in the workflow is in the js library or its
>>> R wrapper. I've opened: https://github.com/ateucher/rmapshaper/issues/89.
>>>
>>>>
>>>> I apologise if my question has resulted in unnecessary effort on your
>>>> part. I reported the behaviour because I found it surprising. However,
>>>> I now understand that this example is not a valid simple feature
>>>> geometry and that the solution is to take steps to ensure I have valid
>>>> geometries, and to repair if not.
>>>>
>>>
>>> No need to apologise!! Your reproducible example was excellent, without it
>>> you wouldn't have contributed to getting the internal shapelib code
>>> changed to make it less restrictive in future releases of GDAL,
>>> potentially benefitting lots of users (who really should drop ESRI
>>> shapefiles, and/or should check geometries for validity, but ... whose
>>> work you've helped save). By the way, ESRI would be very happy if
>>> shapefiles stopped being produced in current workflows, and that new files
>>> should rather be GPKG.
>>>
>>>> Additionally I must confess that I only use the ESRI Shapefile driver
>>>> because I don't know any better... My use cases are reading and
>>>> writing from R (usually to produce maps) and occasionally opening in
>>>> QGIS. I can happily switch to any format that supports this workflow.
>>>>
>>>
>>> GPKG is now very broadly supported, is SF-compliant, and has the big
>>> user-facing benefits of not having field name length restrictions, and
>>> many fewer encoding issues for field names and string values.
>>>
>>> Best wishes,
>>>
>>> Roger
>>>
>>>> Thank you very much for your time,
>>>> Phil
>>>>
>>>> On Sun, 18 Aug 2019 at 13:59, Roger Bivand <[hidden email]> wrote:
>>>>>
>>>>> The reason that the problem occurred is that a MULTIPOLYGON with two
>>>>> exterior rings becomes invalid if the exterior rings touch along an edge
>>>>> (this case). It is important to know the use case, to see whether:
>>>>>
>>>>> library(rgeos)
>>>>> writeWKT(Ps1)
>>>>> gIsValid(Ps1, reason=TRUE)
>>>>> Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
>>>>> gIsValid(Ps1a, reason=TRUE)
>>>>> writeWKT(Ps1a)
>>>>>
>>>>> or equivalently in sf for sf objects, should be applied before trying to
>>>>> write the object out to file with this driver.
>>>>>
>>>>> However, because drivers that are compliant with the simple features
>>>>> standard (which bans exterior rings sharing edges) have been permissive
>>>>> and do round-trip this invalid object, a relaxation in the OGR ESRI
>>>>> shapefile driver has been provided and may be included in a future
>>>>> release.
>>>>>
>>>>> We need to know (see these issues):
>>>>>
>>>>> https://github.com/r-spatial/sf/issues/1130
>>>>> https://github.com/OSGeo/gdal/issues/1787
>>>>>
>>>>> why it was desirable to write out this object using this driver? Could an
>>>>> alternative driver have been used, or is ESRI shapefile the only format
>>>>> used in the workflow?
>>>>>
>>>>> If it has to be this driver, could the workflow be changed to repair
>>>>> degenerate cases before writing? If using sp classes, rgeos may be used to
>>>>> test for and probably repair such geometries before they reach
>>>>> rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap
>>>>> degenerate cases does encumber all users with valid geometries with
>>>>> the time wasted on extra checking, so building checks into sf and rgdal is
>>>>> not desirable.
>>>>>
>>>>> I hope it is possible to find out more about the use case quickly, to pass
>>>>> on to GDAL developers to help motivate a relaxation in their current
>>>>> policy with regard to this driver, and to encourage them to include the
>>>>> fix branch in a future release of GDAL.
>>>>>
>>>>> Roger
>>>>>
>>>>> On Sat, 17 Aug 2019, Roger Bivand wrote:
>>>>>
>>>>>> Please follow up both here and on:
>>>>>>
>>>>>> https://github.com/r-spatial/sf/issues/1130
>>>>>>
>>>>>> as the problem is also seen in the sf package using the same GDAL ESRI
>>>>>> Shapefile driver.
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>> On Fri, 16 Aug 2019, Roger Bivand wrote:
>>>>>>
>>>>>>> On Fri, 16 Aug 2019, Roger Bivand wrote:
>>>>>>>
>>>>>>>>  On Tue, 13 Aug 2019, Phil Haines wrote:
>>>>>>>>
>>>>>>>>>   Dear list,
>>>>>>>>>
>>>>>>>>>   I have a single Polygons object containing multiple Polygon objects
>>>>>>>>>   that share a common border. When I output this using writeOGR() one of
>>>>>>>>>   the Polygon objects becomes a hole, as the following example shows.
>>>>>>>>>
>>>>>>>>>   Create a Polygons object containing two adjoining Polygon objects
>>>>>>>>>>   library(rgdal)
>>>>>>>>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>>>>>>   r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>>>>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>>>>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>>>>>>   data.frame(Example=c("Minimal")))
>>>>>>>>>
>>>>>>>>>   Perform a write/readOGR() cycle
>>>>>>>>>>   fn <- tempfile()
>>>>>>>>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>>>>>>   SPDF2 <- readOGR(dsn=fn,layer='test')
>>>>>>>>>
>>>>>>>>>   Second Polygon object is now a hole
>>>>>>>>>>   sapply(SPDF2@polygons[[1]]@Polygons,slot,"hole")
>>>>>>>>>   [1] FALSE  TRUE
>>>>>>>>>
>>>>>>>>>   I see from the sp documentation that "Polygon objects belonging to a
>>>>>>>>>   Polygons object should either not overlap one-other, or should be
>>>>>>>>>   fully included" but I am not sure how this relates to bordering
>>>>>>>>>   Polygon objects. I would welcome any advice as to whether what I am
>>>>>>>>>   asking of writeOGR is reasonable?
>>>>>>>>
>>>>>>>>  The problem is with the 'ESRI Shapefile' representation and driver:
>>>>>>>>
>>>>>>>>  library(rgdal)
>>>>>>>>  r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>>>>  r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>>>>  Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>>>>  SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>>>>  data.frame(Example=c("Minimal")))
>>>>>>>>  sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>>  sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>>>>
>>>>>>>>  # which constructs a MULTIPOLYGON object:
>>>>>>>>
>>>>>>>>  rgeos::writeWKT(SPDF)
>>>>>>>>  library(sf)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(SPDF)))
>>>>>>>>
>>>>>>>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it
>>>>>>>> #    pre-dates it), so the failure occurs by seeing the second exterior
>>>>>>>> #    ring
>>>>>>>> #    as an interior ring
>>>>>>>>
>>>>>>>>  fn <- tempfile()
>>>>>>>>  writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>>>>  SPDF2 <- readOGR(dsn=fn, layer='test')
>>>>>>>>  sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>>  sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>>>>  rgeos::writeWKT(SPDF2)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(SPDF2)))
>>>>>>>>
>>>>>>>>  # This happens with sf too, using the same GDAL driver:
>>>>>>>>
>>>>>>>>  sf2 <- st_read(dsn=fn, layer='test')
>>>>>>>>  st_geometry(sf2)
>>>>>>>>  library(sf)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(sf2)))
>>>>>>>>  rgeos::writeWKT(as(sf2, "Spatial"))
>>>>>>>>
>>>>>>>>  # Adding the comment fix doesn't help:
>>>>>>>>
>>>>>>>>  comment(slot(SPDF, "polygons")[[1]])
>>>>>>>>  SPDF_c <- rgeos::createSPComment(SPDF)
>>>>>>>>  comment(slot(SPDF_c, "polygons")[[1]])
>>>>>>>>  writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>>>>>>>>    verbose=TRUE)
>>>>>>>>
>>>>>>>> #    reports
>>>>>>>> #    Object initially classed as: wkbPolygon
>>>>>>>> #    SFS comments in Polygons objects
>>>>>>>> #    Object reclassed as: wkbMultiPolygon
>>>>>>>>
>>>>>>>>  SPDF2_c <- readOGR(dsn=fn, layer='test_c')
>>>>>>>>  sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>>  sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
>>>>>>>>  "ringDir")
>>>>>>>>  rgeos::writeWKT(SPDF2_c)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  # If the input object is written out with the GeoPackage driver:
>>>>>>>>
>>>>>>>>  fn1 <- tempfile(fileext=".gpkg")
>>>>>>>>  writeOGR(SPDF, fn1, layer="test", driver='GPKG')
>>>>>>>>  sf2a <- st_read(dsn=fn1,layer='test')
>>>>>>>>  st_coordinates(st_geometry(sf2a))
>>>>>>>>  SPDF2a <- readOGR(dsn=fn1)
>>>>>>>>  sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>>  sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>>>>>>  rgeos::writeWKT(SPDF2a)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(SPDF2a)))
>>>>>>>>
>>>>>>>>  # the issue is resolved. If we separate the exterior rings:
>>>>>>>>
>>>>>>>>  r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
>>>>>>>>  Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
>>>>>>>>  SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>>>>>>>>  data.frame(Example=c("Minimal")))
>>>>>>>>  fna <- tempfile()
>>>>>>>>  writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
>>>>>>>>  SPDF2_a <- readOGR(dsn=fna, layer='test')
>>>>>>>>  sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>>>>>>  sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
>>>>>>>>  "ringDir")
>>>>>>>>  rgeos::writeWKT(SPDF2_a)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>>>>>>>>
>>>>>>>>  # we are OK as the two exterior rings do not touch.
>>>>>>>>
>>>>>>>>  # does using sf make a difference?
>>>>>>>>
>>>>>>>>  fn_s <- tempfile(fileext=".shp")
>>>>>>>>  st_write(st_as_sf(SPDF), dsn=fn_s)
>>>>>>>>  sf_in <- st_read(fn_s)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(sf_in)))
>>>>>>>>
>>>>>>>>  # No
>>>>>>>>
>>>>>>>>  fn_s <- tempfile(fileext=".shp")
>>>>>>>>  st_write(st_as_sf(SPDF_c), dsn=fn_s)
>>>>>>>>  sf_in_c <- st_read(fn_s)
>>>>>>>>  st_as_text(st_geometry(st_as_sf(sf_in_c)))
>>>>>>>>
>>>>>>>>  # nor with the pretend-SF-compliant comment set either.
>>>>>>>>
>>>>>>>>  So the weakness is in the "ESRI Shapefile" write driver, or possibly in
>>>>>>>>  the OGRGeometryFactory::organizePolygons() function in GDAL used in
>>>>>>>>  OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
>>>>>>>>  also
>>>>>>>>  calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
>>>>>>>>  it
>>>>>>>>  has a weakness for the "ESRI Shapefile" driver, but which does not
>>>>>>>>  affect
>>>>>>>>  SF-compliant drivers.
>>>>>>>
>>>>>>> Without the comment set, OGRGeometryFactory::organizePolygons() is used;
>>>>>>> with it set, OGRGeometryFactory::organizePolygons() is not used, because
>>>>>>> the object is declared as two exterior rings. In both cases, we have the
>>>>>>> output object written out and read back in incorrectly with the ESRI
>>>>>>> shapefile driver, but SF-compliant drivers round-trip (in the test
>>>>>>> GeoJSON), correctly.
>>>>>>>
>>>>>>> It is likely that the changes made in 2015 to accommodate GeoJSON led to
>>>>>>> this possible regression for the ESRI Shapefile driver. I'm adding this
>>>>>>> geometry to tests/tripup.R and data files; without code changes the hole
>>>>>>> slot is wrong and the ring direction is changes to match, so the
>>>>>>> coordinates change too.
>>>>>>>
>>>>>>> Reading using the deprecated maptools::readShapeSpatial() also gets a hole
>>>>>>> in the second external ring. However, writing with deprecated
>>>>>>> maptools::  writeSpatialShape() yields a shapefile that when read with
>>>>>>> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
>>>>>>> object. When read with sf::st_read() and rgdal::readOGR(), the object is
>>>>>>> also correct. So the problem definitely lies in rgdal::writeOGR(), and
>>>>>>> sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
>>>>>>> degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
>>>>>>>
>>>>>>> Roger
>>>>>>>
>>>>>>>>
>>>>>>>>  This is probably related to a similar but inverse problem with the
>>>>>>>>  SF-compliant GeoJSON driver in 2015:
>>>>>>>>
>>>>>>>>  https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>>>>>>>>
>>>>>>>>  continued the next month in:
>>>>>>>>
>>>>>>>>  https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>>>>>>>>
>>>>>>>>  The details are in this SVN diff
>>>>>>>>
>>>>>>>>  https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>>>>>>>>
>>>>>>>>  up to the end og the list thread, and this one from then until now:
>>>>>>>>
>>>>>>>>  https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>>>>>>>>
>>>>>>>>  Summary: could you change drivers, or is it really necessary to fix an
>>>>>>>>  EOL
>>>>>>>>  problem? What is your use case?
>>>>>>>>
>>>>>>>>>
>>>>>>>>>   Thanks in advance for your time,
>>>>>>>>
>>>>>>>>  Thanks for a complete example,
>>>>>>>>
>>>>>>>>  Roger
>>>>>>>>
>>>>>>>>>   Phil
>>>>>>>>>
>>>>>>>>>>   sessionInfo()
>>>>>>>>>   R version 3.5.1 (2018-07-02)
>>>>>>>>>   Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>>>>>>   Running under: Windows >= 8 x64 (build 9200)
>>>>>>>>>
>>>>>>>>>   Matrix products: default
>>>>>>>>>
>>>>>>>>>   locale:
>>>>>>>>>   [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>>>>>>>>   Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>>>>>>>>                           LC_TIME=English_United Kingdom.1252
>>>>>>>>>
>>>>>>>>>   attached base packages:
>>>>>>>>>   [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>>>
>>>>>>>>>   other attached packages:
>>>>>>>>>   [1] rgdal_1.4-4 sp_1.3-1
>>>>>>>>>
>>>>>>>>>   loaded via a namespace (and not attached):
>>>>>>>>>   [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>>>>>>>>   lattice_0.20-35
>>>>>>>>>
>>>>>>>>>   _______________________________________________
>>>>>>>>>   R-sig-Geo mailing list
>>>>>>>>>   [hidden email]
>>>>>>>>>   https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Roger Bivand
>>>>> Department of Economics, Norwegian School of Economics,
>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>> voice: +47 55 95 93 55; e-mail: [hidden email]
>>>>> https://orcid.org/0000-0003-2392-6140
>>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>>
>>>
>>> --
>>> Roger Bivand
>>> Department of Economics, Norwegian School of Economics,
>>> Helleveien 30, N-5045 Bergen, Norway.
>>> voice: +47 55 95 93 55; e-mail: [hidden email]
>>> https://orcid.org/0000-0003-2392-6140
>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: [hidden email]
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo