sp: a package with classes and methods for spatial data in R

Edzer J. Pebesma-2
Dear spatial R package maintainer and/or R-sig-geo subscriber,

we are happy to announce the beta (i.e. pre-CRAN) release of "sp",
a new R package which has new-style classes and methods for spatial data.

Spatial data types that sp implements are: points, grids, lines and
polygons (i.e., rings). Methods include

  + the usual print, summary, plot, [, [[, ...
  + coercion between types (e.g. points and grids, matrices, data.frames)
  + coordinates(x), which returns the spatial coordinates of x
  + bbox(x), which returns the bounding box of x
  + overlay, to query e.g. which polygon or grid cell a point falls in
    (essentially a point-in-polygon or point-in-raster-cell test)
  + spsample, for random sampling methods over a spatial domain.
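
To make two of the methods listed above concrete, here is a minimal base-R sketch of a bounding box and a point-in-polygon test. The function names bbox_of and point_in_ring are hypothetical and this is not sp's own code; it only illustrates the kind of computation that bbox and overlay are described as performing.

```r
# Hypothetical helpers for illustration; not part of sp.

# Bounding box of a two-column coordinate matrix
# (rows: x, y; columns: min, max).
bbox_of <- function(xy) {
  b <- t(apply(xy, 2, range))
  dimnames(b) <- list(c("x", "y"), c("min", "max"))
  b
}

# Ray-casting point-in-polygon test. `ring` is a two-column matrix of
# vertices; it may or may not repeat the first vertex at the end.
point_in_ring <- function(px, py, ring) {
  n <- nrow(ring)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    xi <- ring[i, 1]; yi <- ring[i, 2]
    xj <- ring[j, 1]; yj <- ring[j, 2]
    # Toggle when a horizontal ray from the point crosses edge (j, i).
    if ((yi > py) != (yj > py) &&
        px < (xj - xi) * (py - yi) / (yj - yi) + xi) {
      inside <- !inside
    }
    j <- i
  }
  inside
}

square <- cbind(x = c(0, 2, 2, 0), y = c(0, 0, 2, 2))
pts <- cbind(x = c(1, 3), y = c(1, 1))

bbox_of(square)
inside <- mapply(point_in_ring, pts[, "x"], pts[, "y"],
                 MoreArgs = list(ring = square))
inside  # TRUE for (1, 1), FALSE for (3, 1)
```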

An additional package (spproj) provides coordinate reference system
transformation (projection or re-projection) using the PROJ.4 library
[2]. Others will provide interfaces to GRASS and gdal.

A good deal of work has also gone into providing plotting methods using
base, grid and lattice graphics, through the spplot function, a
front-end to lattice plots for spatial data (see gallery [1]).

The home page of these packages is found at
http://r-spatial.sourceforge.net/

We wrote this package because we think R is an excellent environment
for working with spatial data, but it lacks a uniform way of
representing them. Compared to the handling of dates and times,
which can utilize base classes or those provided in the chron package,
spatial data handling is much more fragmented. As a consequence:

  - various packages make their own assumptions about how spatial data
    are organized
  - spatial data organized for a certain package cannot easily be used
    for another package
  - few (or no) packages address the full range of spatial data types
    (points, grids, lines, polygons)
  - generic spatial functionality (e.g. I/O to GIS, plotting, projection)
    is scattered and limited in functionality.

It also means that many different package authors have to spend time writing
similar data handling code, rather than concentrating on analytical
functions. If the sp package achieves its goals, data I/O will become
many-to-one, and data access for analysis one-to-many, providing a shared
data object layer for which shared methods can be written.

Classes and methods for spatial data are only useful when the spatial
packages support them. Our team includes maintainers of a number
of spatial R packages, but we would value your support in making sp
a success.

First, we would like to ask you to review critically what we are proposing
in the package, and to give us the benefit of your experience in spatial
data handling. Are there classes of data that you feel we define wrongly
or inadequately? Are there clearly better ways of designing the classes
and methods we have tried so far? Are any of our classes unnecessary? We
realise that the documentation needs more work, but first we would like to
get the code into better shape. So your comments will have most value
now: once sp is on CRAN, users will count on the classes (i.e. the slots)
being fixed, whereas at this stage we can still modify them.

Secondly, we would like to invite you to consider supporting the sp
classes directly in your packages. Two possible ways of supporting sp
classes are:

a. to write conversion routines to and from the classes in sp

b. to extend your package with methods for your (main) package functions
    that accept and return the classes provided by sp, allowing the user
    to directly display the results, or use them in other packages.
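
The two routes above can be sketched with S4 tools from the methods package. ToyPoints is a hypothetical stand-in for a shared point class and translate is a made-up package function; sp's real classes and slots differ, so this shows only the pattern.

```r
library(methods)

# Hypothetical stand-in for a shared point class; not an sp class.
setClass("ToyPoints", representation(coords = "matrix"))

# (a) Conversion routines: the package keeps its own format (here a
# data.frame with x and y columns) and converts at the boundary.
setAs("data.frame", "ToyPoints", function(from) {
  new("ToyPoints", coords = as.matrix(from[, c("x", "y")]))
})
setAs("ToyPoints", "data.frame", function(from) {
  data.frame(x = from@coords[, 1], y = from@coords[, 2])
})

# (b) Native support: a package function becomes a generic with a method
# that accepts and returns the shared class.
setGeneric("translate", function(obj, dx, dy) standardGeneric("translate"))
setMethod("translate", "ToyPoints", function(obj, dx, dy) {
  obj@coords <- sweep(obj@coords, 2, c(dx, dy), "+")
  obj
})

df <- data.frame(x = c(0, 1), y = c(0, 1))
moved <- translate(as(df, "ToyPoints"), dx = 10, dy = 20)
out <- as(moved, "data.frame")
out
```

Route (a) leaves existing package code untouched, while route (b) lets results flow straight into other packages without a conversion step.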

If we can help in any way with this process, please let us know.

The development of this package is a joint effort of Virgilio Gomez-Rubio,
Barry Rowlinson, Roger Bivand and Edzer Pebesma, and followed from
discussions held at a pre-DSC2003 workshop [3], announcements on R-sig-geo
[4], and a meeting held last November in Lancaster [5].

With best regards,
--
Roger Bivand and Edzer Pebesma


[1] http://r-spatial.sourceforge.net/
[2] http://www.remotesensing.org/proj/
[3] http://spatial.nhh.no/meetings/vienna/index.html
[4] e.g. https://stat.ethz.ch/pipermail/r-sig-geo/2003-October/000028.html
[5] http://elearning.maths.lancs.ac.uk:8080/RSpatial/




sp: a package with classes and methods for spatial data in R

Tim Keitt
Very cool. I'd been thinking of something similar, but not had the time
to act on it. Looks like we need to add OGR wrappers to rgdal so that
vector data can be read in.

A couple of immediate items I noticed:

1) "sp" is rather terse, no? I like more descriptive package names.

2) I think it is always a good idea to let typing describe internal
states (i.e. making objects stateless). It leads to much cleaner
interfaces and many fewer coding errors. If there are two different ways
to represent a grid, make two classes. If they are perfectly
substitutable (functions can take either), join them with a super-class.
If one is a perfect subset of the other, then make it the parent class
and the extension the child (inherits). See GDALDataset and
GDALReadOnlyDataset. GDALDataset is a perfect superset of
GDALReadOnlyDataset and so can be passed to any function needing the
interface of GDALReadOnlyDataset. This is formalized by having
GDALDataset inherit from GDALReadOnlyDataset. If you are writing
functions that check or update the internal states of your objects
(except ordinary insertion and deletion of contained items), then you
should think twice about the object hierarchy. This is why there are no
"is.writable" or "make.writable" functions in rgdal. That's all handled
by type declarations (meaning you can customize dispatch on types in
your methods ;-)
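
A minimal S4 sketch of point 2, using toy classes that only echo the naming pattern described above. ReadOnlyDataset and Dataset here are hypothetical and are not rgdal's actual definitions; the point is that writability is carried by the type and enforced by method dispatch.

```r
library(methods)

# Toy classes for illustration; not rgdal's definitions.
setClass("ReadOnlyDataset", representation(values = "numeric"))
setClass("Dataset", contains = "ReadOnlyDataset")  # superset: adds writing

setGeneric("readValue", function(ds, i) standardGeneric("readValue"))
setMethod("readValue", "ReadOnlyDataset", function(ds, i) ds@values[i])

setGeneric("writeValue", function(ds, i, v) standardGeneric("writeValue"))
setMethod("writeValue", "Dataset", function(ds, i, v) {
  ds@values[i] <- v
  ds
})

ro <- new("ReadOnlyDataset", values = c(1, 2, 3))
rw <- new("Dataset", values = c(1, 2, 3))

before <- readValue(rw, 2)      # the inherited method works on the subclass
rw <- writeValue(rw, 2, 99)
after <- readValue(rw, 2)

# No is.writable() check is needed: there is simply no writeValue method
# for the read-only type, so dispatch fails.
attempt <- tryCatch(writeValue(ro, 2, 99), error = function(e) "no method")
```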

3) As long as we can override methods of the interface, all will be
well. We can substitute external data representations when we need to
scale to larger datasets. Try allocating a 50Kx50K matrix in R. There
will be times when we will want to have data be in a database or on disk
and still manipulate it with the "sp" interface. I don't see any
barriers to this at the moment. These external wrappers will want to
inherit from existing classes, so see point 2 above.
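
For scale, a rough calculation of the 50Kx50K case, followed by a sketch of how a method could be overridden for data that is not held in memory. The classes MemoryGrid and LazyGrid and the generic getCells are hypothetical; the fetch function merely stands in for a database or disk read.

```r
library(methods)

# A dense 50,000 x 50,000 matrix of doubles needs 8 bytes per cell.
n <- 50000
bytes <- n * n * 8
bytes / 1e9    # 20 (gigabytes)
bytes / 2^30   # about 18.6 (GiB)

# Same interface, two storage strategies (all names are hypothetical).
setGeneric("getCells", function(src, idx) standardGeneric("getCells"))

setClass("MemoryGrid", representation(values = "numeric"))
setMethod("getCells", "MemoryGrid", function(src, idx) src@values[idx])

setClass("LazyGrid", representation(fetch = "function", ncell = "numeric"))
setMethod("getCells", "LazyGrid", function(src, idx) src@fetch(idx))

small <- new("MemoryGrid", values = c(10, 20, 30))
# fetch() stands in for an external read; here it just computes values.
big <- new("LazyGrid", fetch = function(idx) sqrt(idx), ncell = n * n)

getCells(small, 2)
vals <- getCells(big, c(1, 4, 2.5e9))
```

Code written against getCells does not need to know which representation it was given.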

Love the image gallery! Nice work!

THK

In reply to the message from Edzer J. Pebesma of Thu, 2005-04-14 at 18:43 +0200.





sp: a package with classes and methods for spatial data in R

Andy Bunn
> Very cool. I'd been thinking of something similar, but not had the time

Indeed! Bravo. This is a big day for R users (at least it is for me). I
agree with Tim's comment about having db functionality. I run out of RAM
way too fast as it is and will be tempted to do bigger things in R now that
this package exists.

With sincere respect,
-Andy
~`~`~`~`~`~`~`~`~`~`~`~`~`~
Andy Bunn
Post-Doctoral Fellow
Woods Hole Research Center
http://www.whrc.org




sp: a package with classes and methods for spatial data in R

Susumu Tanimura
In reply to this post by Edzer J. Pebesma-2
Dear Roger Bivand and Edzer Pebesma,

I am happy to see the progress of sp and the related packages.  My
request, for easier installation, is that you set up your download site
so that it works with the install.packages() command inside R.

For example, we can install the RGtk package with the contriburl option.
> install.packages("RGtk", contriburl="http://www.omegahat.org/download/R/packages")

Another example,
> install.packages("pmg",contriburl="http://www.math.csi.cuny.edu/pmg")

My trial was
> install.packages("sp",contriburl="http://r-spatial.sourceforge.net")
trying URL `http://r-spatial.sourceforge.net/PACKAGES'
Error in download.file(url = paste(contriburl, "PACKAGES", sep = "/"),  :
        cannot open URL `http://r-spatial.sourceforge.net/PACKAGES'
In addition: Warning message:
cannot open: HTTP status was `404 Not Found'

Improving the R package itself is of course the priority, but I hope
you can give us a little more convenience.

I apologize if you already have such a system on the site.
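
As the 404 above shows, install.packages() looks for an index file named PACKAGES under the contriburl. A sketch of how a maintainer might generate that index with tools::write_PACKAGES follows. The wrapper make_repo_index and the directory argument are hypothetical, and how the directory then reaches the web server is left open.

```r
# Sketch only: `repo_dir` is a hypothetical local directory holding the
# source tarballs (files ending in .tar.gz) to be offered for download.
make_repo_index <- function(repo_dir) {
  # Writes the PACKAGES index files into repo_dir.
  tools::write_PACKAGES(repo_dir, type = "source")
}

# After uploading the directory, users would point contriburl at its URL.
```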

Thank you.

--
Susumu Tanimura




sp: a package with classes and methods for spatial data in R

Edzer J. Pebesma-2
In reply to this post by Tim Keitt
Tim Keitt wrote:

>Very cool. I'd been thinking of something similar, but not had the time
>to act on it. Looks like we need to add OGR wrappers to rgdal so that
>vector data can be read in.
>  
>
I think Barry has been there with Rmaps. It would
be nice to integrate this functionality into rgdal, because OGR
is part of GDAL, and the external-library dependence is still,
well, a bit clumsy for R under Windows.

>A couple of immediate items I noticed:
>
>1) "sp" is rather terse, no? I like more descriptive package names.
>  
>
Yes. Think of ts for time series. We thought of "spatial", but
that name has been taken long ago. sp is the first two letters
in spdep, splancs, spproj, ... Other suggestions, please?

>2) I think it is always a good idea to let typing describe internal
>states (ie making objects stateless). It leads to much cleaner
>interfaces and many fewer coding errors. If there are two different ways
>to represent a grid, make two classes. If they are perfectly
>substitutable (functions can take either), join them with a super-class.
>If one is a perfect subset of the other, then make it the parent class
>and the extension the child (inherits). See GDALDataset and
>GDALReadOnlyDataset. GDALDataset is a perfect superset of
>GDALReadOnlyDataset and so can be passed to any function needing the
>interface of GDALReadOnlyDataset. This is formalized by having
>GDALDataset inherit from GDALReadOnlyDataset. If you are writing
>functions that check or update the internal states of your objects
>(except ordinary insertion and deletion of contained items), then you
>should think twice about the object hierarchy. This is why there are no
>"is.writable" or "make.writable" functions in rgdal. That's all handled
>by type declarations (meaning you can customize dispatch on types in
>your methods ;-)
>  
>
I've been struggling with this for quite a while, and I do value
your suggestions. In an earlier attempt I tried two classes,
but I got strangled by the many setIs() and setAs() constructs
needed, which in addition didn't seem to do what I wanted.

Trouble is: right now SpatialGrid extends SpatialPoints: they're
points that happen to lie on a grid, so they carry grid topology
and an index to which grid cells the points belong. The full grid
representation in some sense is a special case of this, because
all points are there, in ordered form. In terms of data requirements
it is not: full grids need neither coordinates nor an index.
Data-wise, the extension is reversed: a full grid is the most
general case, and the partial grid needs in addition an index (and,
arguably, coordinates). I'll think a bit longer about the
superclass idea, and struggle further with the setIs() stuff.

It's very convenient to have grids that behave like specially
ordered points.
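
One way to picture the superclass idea discussed here is a shared grid-topology parent, with full and partial grids as siblings that derive their coordinates on demand. This is a sketch under assumed class and slot names (GridTopo, FullGrid, PartialGrid), not sp's SpatialGrid design.

```r
library(methods)

# Hypothetical classes; slot names are assumptions for illustration.
setClass("GridTopo", representation(
  origin   = "numeric",  # centre of the lower-left cell (x, y)
  cellsize = "numeric",  # cell width and height
  dims     = "numeric"   # number of columns and rows
))
setClass("FullGrid", contains = "GridTopo")
setClass("PartialGrid", representation(index = "numeric"),
         contains = "GridTopo")  # index: which cells are present

# Cell centres for given cell numbers, counted row by row from the
# lower-left corner. Nothing is stored; coordinates follow from topology.
cell_centres <- function(g, cells) {
  col <- (cells - 1) %% g@dims[1]
  row <- (cells - 1) %/% g@dims[1]
  cbind(x = g@origin[1] + col * g@cellsize[1],
        y = g@origin[2] + row * g@cellsize[2])
}

setGeneric("gridCoords", function(g) standardGeneric("gridCoords"))
setMethod("gridCoords", "FullGrid",
          function(g) cell_centres(g, seq_len(prod(g@dims))))
setMethod("gridCoords", "PartialGrid",
          function(g) cell_centres(g, g@index))

full <- new("FullGrid", origin = c(0, 0), cellsize = c(1, 1), dims = c(3, 2))
part <- new("PartialGrid", origin = c(0, 0), cellsize = c(1, 1),
            dims = c(3, 2), index = c(2, 6))

full_xy <- gridCoords(full)  # all 6 cell centres
part_xy <- gridCoords(part)  # centres of cells 2 and 6 only
```

Both kinds of grid then behave like ordered points through gridCoords, while only the partial grid carries an index.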

>3) As long as we can override methods of the interface, all will be
>well. We can substitute external data representations when we need to
>scale to larger datasets. Try allocating a 50Kx50K matrix in R. There
>will be times when we will want to have data be in a database or on disk
>and still manipulate it with the "sp" interface. I don't see any
>barriers to this at the moment. These external wrappers will want to
>inherit from existing classes, so see point 2 above.
>  
>
I was playing with Europe on a kilometer grid, and wasn't
too happy about how much it filled the machine's memory.
This sounds extremely valuable; could this be done by moving
parts from rgdal into sp, or do we need to copy? Again, full
integration is out of the question because of the Windows/
CRAN/external library issue.

>Love the image gallery! Nice work!
>  
>
Thanks! Again -- what is still missing?
--
Edzer




sp: a package with classes and methods for spatial data in R

Martin Maechler
>>>>> "Edzer" == Edzer J Pebesma <e.pebesma at geo.uu.nl>
>>>>>     on Fri, 15 Apr 2005 15:59:17 +0200 writes:

    Edzer> Tim Keitt wrote:
    >> Very cool. I'd been thinking of something similar, but not had the time
    >> to act on it. Looks like we need to add OGR wrappers to rgdal so that
    >> vector data can be read in.
    >>
    >>
    Edzer> I think Barry has been there with Rmaps. It would
    Edzer> be nice to integrate this functionality into rgdal, because OGR
    Edzer> is part of GDAL, and the external-library dependence is still,
    Edzer> well, a bit clumsy for R under Windows.

    >> A couple of immediate items I noticed:
    >>
    >> 1) "sp" is rather terse, no? I like more descriptive package names.
    >>
    >>
    Edzer> Yes. Think of ts for time series. We thought of "spatial", but
    Edzer> that name has been taken long ago. sp is the first two letters
    Edzer> in spdep, splancs, spproj, ... Other suggestions, please?

Actually, I have liked the "conciseness" of 'sp'
(when I first heard about it from Roger quite a while ago):

It should become *the* basic spatial class infrastructure
package.  Hence it makes sense to emphasize its importance by
attributing a `top level' name.

Martin




sp: a package with classes and methods for spatial data in R

Tim Keitt
I agree, the package is a great foundation for building a richer spatial
analysis framework. I'm nit-picking about the name. Not a real
complaint. Mostly I'm thinking right now about the class relationships.
We need to get that right. This is a very good start. I may suggest some
very minor tweaks. Overall, I like the layout.

THK

In reply to the message from Martin Maechler of Fri, 2005-04-15 at 17:42 +0200.
