rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
Dear all,

my problem is that I want to read a big geotiff raster dataset into R
and convert it to a matrix, which does not work.
The file is big but there is sufficient memory. I need all the data in
the memory at the same time.

The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04 LTS
with the latest version of the packages (see session info below) and
256GB RAM installed.

When loading the raster dataset using rgdal (via readGDAL or
raster::readAll) I get the follwoing error in R 4.0.0:

```
Error in rgdal::getRasterData(con, offset = offs, region.dim = reg, band
= object@data@band) :
   long vectors not supported yet: memory.c:3782
```

In R 3.6.3 is is "... memory.c:3717"

However, I can load the same file with the tiff package and a file of
the same size in the native raster package format (*.grd) with the
raster package but again not with the rgdal package.

gdalinfo (gdalUtils) does not complain (see below). Hence, Even Rouault
assumes the problem is related to rgdal and not gdal
(https://github.com/OSGeo/gdal/issues/2442).

Below you find reproducible code, which generates a raster file, saves
the two formats (.tiff and .grd) and tries to read them with the
different packages.

Is this a known limitation? Any help is greatly appreciated!

Thanks a lot in advance!

Best wishes and stay healthy,
Thorsten



### Steps to reproduce the problem.

R code:

```
library(rgdal) # 1.4-8
library(raster) # 3.1-5
library(tiff) # 0.1-5

## generate and manipulate a big raster dataset
# - generate
rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
0, 48000))) # all fine

# - manipulate
values(rDemTest) = 1 # all fine

# - convert
mDemTest = raster::as.matrix(rDemTest) # all fine
str(mDemTest)

## save a big dataset

# - as raster/gdal
sFileNameTiff = "BigData.tif"
writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE, NAflag =
-9999) # all fine

# - as raster native
sFileNameNative = "BigData.grd"
writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
NAflag = -9999) # all fine


## load the big raster datasets with different packages and options
# - load the tiff data with the gdal package via the raster package
rDem = raster(sFileNameTiff) # all fine
extent(rDem) # all fine
mDem = raster::as.matrix(rDem) # error
rDem = readAll(rDem) # error

# - load the native raster data with the raster package
rDem = raster(sFileNameNative) # all fine
extent(rDem) # all fine
mDem = raster::as.matrix(rDem) # all fine
str(mDem)

# - load the tiff data with the tiff package
mDem = readTIFF(sFileNameTiff) # all fine
str(mDem)

# - load the tiff data with the gdal package
sfDem = readGDAL(sFileNameTiff) # error

# - load the native raster data with the gdal package
sfDem = readGDAL(sFileNameNative) # error

```


### Startup messages when rgdal is attached (requested by Roger Bivand)
 > library(rgdal)
rgdal: version: 1.4-8, (SVN revision 845)
  Geospatial Data Abstraction Library extensions to R successfully loaded
  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
  Path to GDAL shared files:
  GDAL binary built with GEOS: TRUE
  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION: 631]
  Path to PROJ.4 shared files: (autodetected)
  Linking to sp version: 1.4-1


### Session info
 > sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
  [4] LC_COLLATE=de_DE.UTF-8     LC_MONETARY=de_DE.UTF-8
LC_MESSAGES=de_DE.UTF-8
  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods base

other attached packages:
[1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5     
sp_1.4-1

loaded via a namespace (and not attached):
  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6 R.methodsS3_1.8.0
codetools_0.2-16
  [6] grid_4.0.0        iterators_1.0.12  foreach_1.5.0
R.utils_2.9.2     R.oo_1.23.0
[11] lattice_0.20-41


### gdalInfo
 > gdalinfo(sFileNameTiff)
  [1] "Driver: GTiff/GeoTIFF"
  [2] "Files: BigData.tif"
  [3] "Size is 72000, 48000"
  [4] "Origin = (0.000000000000000,48000.000000000000000)"
  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
  [6] "Image Structure Metadata:"
  [7] "  COMPRESSION=LZW"
  [8] "  INTERLEAVE=BAND"
  [9] "Corner Coordinates:"
[10] "Upper Left  (       0.000,   48000.000) "
[11] "Lower Left  (   0.0000000,   0.0000000) "
[12] "Upper Right (   72000.000,   48000.000) "
[13] "Lower Right (   72000.000,       0.000) "
[14] "Center      (   36000.000,   24000.000) "
[15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
[16] "  Min=1.000 Max=1.000 "
[17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
[18] "  NoData Value=-9999"
[19] "  Metadata:"
[20] "    STATISTICS_MAXIMUM=1"
[21] "    STATISTICS_MEAN=nan"
[22] "    STATISTICS_MINIMUM=1"
[23] "    STATISTICS_STDDEV=nan"

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Roger Bivand
Administrator
On Mon, 27 Apr 2020, Thorsten Behrens wrote:

> Dear all,
>
> my problem is that I want to read a big geotiff raster dataset into R and
> convert it to a matrix, which does not work.
> The file is big but there is sufficient memory. I need all the data in the
> memory at the same time.
>
> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04 LTS with
> the latest version of the packages (see session info below) and 256GB RAM
> installed.
>
> When loading the raster dataset using rgdal (via readGDAL or raster::readAll)
> I get the follwoing error in R 4.0.0:
>
> ```
> Error in rgdal::getRasterData(con, offset = offs, region.dim = reg, band =
> object@data@band) :
>   long vectors not supported yet: memory.c:3782
> ```
On a 16GB Fedora platform:

> library(raster) # 3.1-5
> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
0,
+ 48000))) # all fine
> rDemTest
class      : RasterLayer
dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
resolution : 1, 1  (x, y)
extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
crs        : NA

> values(rDemTest) = 1
Error: cannot allocate vector of size 25.7 Gb

So you are deceiving yourself into thinking that all is fine at this
point. Please try to instantiate an example that can be reproduced on a
machine with 8GB RAM.

Further note that rgdal::readGDAL() is not how you handle very large
objects in files, and never has been. raster can handle blocks of data
from bands in file; stars and gdalcubes can use proxy=TRUE for the same
purpose. Why did you choose rgdal::readGDAL() when this is not its
purpose?

You did not say how much RAM is on your platform.

Roger

>
> In R 3.6.3 is is "... memory.c:3717"
>
> However, I can load the same file with the tiff package and a file of the
> same size in the native raster package format (*.grd) with the raster package
> but again not with the rgdal package.
>
> gdalinfo (gdalUtils) does not complain (see below). Hence, Even Rouault
> assumes the problem is related to rgdal and not gdal
> (https://github.com/OSGeo/gdal/issues/2442).
>
> Below you find reproducible code, which generates a raster file, saves the
> two formats (.tiff and .grd) and tries to read them with the different
> packages.
>
> Is this a known limitation? Any help is greatly appreciated!
>
> Thanks a lot in advance!
>
> Best wishes and stay healthy,
> Thorsten
>
>
>
> ### Steps to reproduce the problem.
>
> R code:
>
> ```
> library(rgdal) # 1.4-8
> library(raster) # 3.1-5
> library(tiff) # 0.1-5
>
> ## generate and manipulate a big raster dataset
> # - generate
> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000, 0,
> 48000))) # all fine
>
> # - manipulate
> values(rDemTest) = 1 # all fine
>
> # - convert
> mDemTest = raster::as.matrix(rDemTest) # all fine
> str(mDemTest)
>
> ## save a big dataset
>
> # - as raster/gdal
> sFileNameTiff = "BigData.tif"
> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE, NAflag =
> -9999) # all fine
>
> # - as raster native
> sFileNameNative = "BigData.grd"
> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE, NAflag =
> -9999) # all fine
>
>
> ## load the big raster datasets with different packages and options
> # - load the tiff data with the gdal package via the raster package
> rDem = raster(sFileNameTiff) # all fine
> extent(rDem) # all fine
> mDem = raster::as.matrix(rDem) # error
> rDem = readAll(rDem) # error
>
> # - load the native raster data with the raster package
> rDem = raster(sFileNameNative) # all fine
> extent(rDem) # all fine
> mDem = raster::as.matrix(rDem) # all fine
> str(mDem)
>
> # - load the tiff data with the tiff package
> mDem = readTIFF(sFileNameTiff) # all fine
> str(mDem)
>
> # - load the tiff data with the gdal package
> sfDem = readGDAL(sFileNameTiff) # error
>
> # - load the native raster data with the gdal package
> sfDem = readGDAL(sFileNameNative) # error
>
> ```
>
>
> ### Startup messages when rgdal is attached (requested by Roger Bivand)
>>  library(rgdal)
> rgdal: version: 1.4-8, (SVN revision 845)
>  Geospatial Data Abstraction Library extensions to R successfully loaded
>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>  Path to GDAL shared files:
>  GDAL binary built with GEOS: TRUE
>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION: 631]
>  Path to PROJ.4 shared files: (autodetected)
>  Linking to sp version: 1.4-1
>
>
> ### Session info
>>  sessionInfo()
> R version 4.0.0 (2020-04-24)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 20.04 LTS
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>
> locale:
>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>  [4] LC_COLLATE=de_DE.UTF-8     LC_MONETARY=de_DE.UTF-8
> LC_MESSAGES=de_DE.UTF-8
>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8
> LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods base
>
> other attached packages:
> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5      sp_1.4-1
>
> loaded via a namespace (and not attached):
>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6 R.methodsS3_1.8.0
> codetools_0.2-16
>  [6] grid_4.0.0        iterators_1.0.12  foreach_1.5.0 R.utils_2.9.2    
> R.oo_1.23.0
> [11] lattice_0.20-41
>
>
> ### gdalInfo
>>  gdalinfo(sFileNameTiff)
>  [1] "Driver: GTiff/GeoTIFF"
>  [2] "Files: BigData.tif"
>  [3] "Size is 72000, 48000"
>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>  [6] "Image Structure Metadata:"
>  [7] "  COMPRESSION=LZW"
>  [8] "  INTERLEAVE=BAND"
>  [9] "Corner Coordinates:"
> [10] "Upper Left  (       0.000,   48000.000) "
> [11] "Lower Left  (   0.0000000,   0.0000000) "
> [12] "Upper Right (   72000.000,   48000.000) "
> [13] "Lower Right (   72000.000,       0.000) "
> [14] "Center      (   36000.000,   24000.000) "
> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
> [16] "  Min=1.000 Max=1.000 "
> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
> [18] "  NoData Value=-9999"
> [19] "  Metadata:"
> [20] "    STATISTICS_MAXIMUM=1"
> [21] "    STATISTICS_MEAN=nan"
> [22] "    STATISTICS_MINIMUM=1"
> [23] "    STATISTICS_STDDEV=nan"
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
Roger,

thanks a lot for your reply!

I have 256GB RAM installed (mentioned it somewhere). And there, all is
fine when I run:

rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,

values(rDemTest) = 1

When limiting the memory to about 8GB with ulimit::memory_limit(8000),
the max size which can be allocated seems to be around 10000  x 10000px.
In this case all tests run fine. Unfortunately it seems to be related to
the size of the grid (48000 x 72000) and therefore the problem can't be
reproduced on machines with 8GB RAM. For some processing steps I need
grids of that size in the memory, which is why I have 256 GB installed.

Normally, I use the raster package and not rgdal::readGDAL(). This was
just a desperate attempt to find the source of the problem.

This is what I use in my code:

rDem = raster(sFileNameTiff)
mDem = raster::as.matrix(rDem)

But maybe this is the same...

Any further suggestions are much appreciated!

Thanks again!

Best,

Thorsten




Am 27.04.2020 um 14:50 schrieb Roger Bivand:

> On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>
>> Dear all,
>>
>> my problem is that I want to read a big geotiff raster dataset into R
>> and convert it to a matrix, which does not work.
>> The file is big but there is sufficient memory. I need all the data
>> in the memory at the same time.
>>
>> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
>> LTS with the latest version of the packages (see session info below)
>> and 256GB RAM installed.
>>
>> When loading the raster dataset using rgdal (via readGDAL or
>> raster::readAll) I get the follwoing error in R 4.0.0:
>>
>> ```
>> Error in rgdal::getRasterData(con, offset = offs, region.dim = reg,
>> band = object@data@band) :
>>   long vectors not supported yet: memory.c:3782
>> ```
>
> On a 16GB Fedora platform:
>
>> library(raster) # 3.1-5
>> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
> 0,
> + 48000))) # all fine
>> rDemTest
> class      : RasterLayer
> dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
> resolution : 1, 1  (x, y)
> extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
> crs        : NA
>
>> values(rDemTest) = 1
> Error: cannot allocate vector of size 25.7 Gb
>
> So you are deceiving yourself into thinking that all is fine at this
> point. Please try to instantiate an example that can be reproduced on
> a machine with 8GB RAM.
>
> Further note that rgdal::readGDAL() is not how you handle very large
> objects in files, and never has been. raster can handle blocks of data
> from bands in file; stars and gdalcubes can use proxy=TRUE for the
> same purpose. Why did you choose rgdal::readGDAL() when this is not
> its purpose?
>
> You did not say how much RAM is on your platform.
>
> Roger
>
>>
>> In R 3.6.3 is is "... memory.c:3717"
>>
>> However, I can load the same file with the tiff package and a file of
>> the same size in the native raster package format (*.grd) with the
>> raster package but again not with the rgdal package.
>>
>> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
>> Rouault assumes the problem is related to rgdal and not gdal
>> (https://github.com/OSGeo/gdal/issues/2442).
>>
>> Below you find reproducible code, which generates a raster file,
>> saves the two formats (.tiff and .grd) and tries to read them with
>> the different packages.
>>
>> Is this a known limitation? Any help is greatly appreciated!
>>
>> Thanks a lot in advance!
>>
>> Best wishes and stay healthy,
>> Thorsten
>>
>>
>>
>> ### Steps to reproduce the problem.
>>
>> R code:
>>
>> ```
>> library(rgdal) # 1.4-8
>> library(raster) # 3.1-5
>> library(tiff) # 0.1-5
>>
>> ## generate and manipulate a big raster dataset
>> # - generate
>> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>> 72000, 0, 48000))) # all fine
>>
>> # - manipulate
>> values(rDemTest) = 1 # all fine
>>
>> # - convert
>> mDemTest = raster::as.matrix(rDemTest) # all fine
>> str(mDemTest)
>>
>> ## save a big dataset
>>
>> # - as raster/gdal
>> sFileNameTiff = "BigData.tif"
>> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
>> NAflag = -9999) # all fine
>>
>> # - as raster native
>> sFileNameNative = "BigData.grd"
>> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
>> NAflag = -9999) # all fine
>>
>>
>> ## load the big raster datasets with different packages and options
>> # - load the tiff data with the gdal package via the raster package
>> rDem = raster(sFileNameTiff) # all fine
>> extent(rDem) # all fine
>> mDem = raster::as.matrix(rDem) # error
>> rDem = readAll(rDem) # error
>>
>> # - load the native raster data with the raster package
>> rDem = raster(sFileNameNative) # all fine
>> extent(rDem) # all fine
>> mDem = raster::as.matrix(rDem) # all fine
>> str(mDem)
>>
>> # - load the tiff data with the tiff package
>> mDem = readTIFF(sFileNameTiff) # all fine
>> str(mDem)
>>
>> # - load the tiff data with the gdal package
>> sfDem = readGDAL(sFileNameTiff) # error
>>
>> # - load the native raster data with the gdal package
>> sfDem = readGDAL(sFileNameNative) # error
>>
>> ```
>>
>>
>> ### Startup messages when rgdal is attached (requested by Roger Bivand)
>>>  library(rgdal)
>> rgdal: version: 1.4-8, (SVN revision 845)
>>  Geospatial Data Abstraction Library extensions to R successfully loaded
>>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>>  Path to GDAL shared files:
>>  GDAL binary built with GEOS: TRUE
>>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION:
>> 631]
>>  Path to PROJ.4 shared files: (autodetected)
>>  Linking to sp version: 1.4-1
>>
>>
>> ### Session info
>>>  sessionInfo()
>> R version 4.0.0 (2020-04-24)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 20.04 LTS
>>
>> Matrix products: default
>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>>
>> locale:
>>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>>  [4] LC_COLLATE=de_DE.UTF-8     LC_MONETARY=de_DE.UTF-8
>> LC_MESSAGES=de_DE.UTF-8
>>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>> [10] LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8
>> LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods base
>>
>> other attached packages:
>> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5     
>> sp_1.4-1
>>
>> loaded via a namespace (and not attached):
>>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
>> R.methodsS3_1.8.0 codetools_0.2-16
>>  [6] grid_4.0.0        iterators_1.0.12  foreach_1.5.0
>> R.utils_2.9.2     R.oo_1.23.0
>> [11] lattice_0.20-41
>>
>>
>> ### gdalInfo
>>>  gdalinfo(sFileNameTiff)
>>  [1] "Driver: GTiff/GeoTIFF"
>>  [2] "Files: BigData.tif"
>>  [3] "Size is 72000, 48000"
>>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>>  [6] "Image Structure Metadata:"
>>  [7] "  COMPRESSION=LZW"
>>  [8] "  INTERLEAVE=BAND"
>>  [9] "Corner Coordinates:"
>> [10] "Upper Left  (       0.000,   48000.000) "
>> [11] "Lower Left  (   0.0000000,   0.0000000) "
>> [12] "Upper Right (   72000.000,   48000.000) "
>> [13] "Lower Right (   72000.000,       0.000) "
>> [14] "Center      (   36000.000,   24000.000) "
>> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>> [16] "  Min=1.000 Max=1.000 "
>> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>> [18] "  NoData Value=-9999"
>> [19] "  Metadata:"
>> [20] "    STATISTICS_MAXIMUM=1"
>> [21] "    STATISTICS_MEAN=nan"
>> [22] "    STATISTICS_MINIMUM=1"
>> [23] "    STATISTICS_STDDEV=nan"
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> [hidden email]
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>>
>

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Michael Sumner-2
Try stars it worked for me on a test

On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens, <[hidden email]>
wrote:

> Roger,
>
> thanks a lot for your reply!
>
> I have 256GB RAM installed (mentioned it somewhere). And there, all is
> fine when I run:
>
> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
>
> values(rDemTest) = 1
>
> When limiting the memory to about 8GB with ulimit::memory_limit(8000),
> the max size which can be allocated seems to be around 10000  x 10000px.
> In this case all tests run fine. Unfortunately it seems to be related to
> the size of the grid (48000 x 72000) and therefore the problem can't be
> reproduced on machines with 8GB RAM. For some processing steps I need
> grids of that size in the memory, which is why I have 256 GB installed.
>
> Normally, I use the raster package and not rgdal::readGDAL(). This was
> just a desperate attempt to find the source of the problem.
>
> This is what I use in my code:
>
> rDem = raster(sFileNameTiff)
> mDem = raster::as.matrix(rDem)
>
> But maybe this is the same...
>
> Any further suggestions are much appreciated!
>
> Thanks again!
>
> Best,
>
> Thorsten
>
>
>
>
> Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> >
> >> Dear all,
> >>
> >> my problem is that I want to read a big geotiff raster dataset into R
> >> and convert it to a matrix, which does not work.
> >> The file is big but there is sufficient memory. I need all the data
> >> in the memory at the same time.
> >>
> >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
> >> LTS with the latest version of the packages (see session info below)
> >> and 256GB RAM installed.
> >>
> >> When loading the raster dataset using rgdal (via readGDAL or
> >> raster::readAll) I get the follwoing error in R 4.0.0:
> >>
> >> ```
> >> Error in rgdal::getRasterData(con, offset = offs, region.dim = reg,
> >> band = object@data@band) :
> >>   long vectors not supported yet: memory.c:3782
> >> ```
> >
> > On a 16GB Fedora platform:
> >
> >> library(raster) # 3.1-5
> >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
> > 0,
> > + 48000))) # all fine
> >> rDemTest
> > class      : RasterLayer
> > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
> > resolution : 1, 1  (x, y)
> > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
> > crs        : NA
> >
> >> values(rDemTest) = 1
> > Error: cannot allocate vector of size 25.7 Gb
> >
> > So you are deceiving yourself into thinking that all is fine at this
> > point. Please try to instantiate an example that can be reproduced on
> > a machine with 8GB RAM.
> >
> > Further note that rgdal::readGDAL() is not how you handle very large
> > objects in files, and never has been. raster can handle blocks of data
> > from bands in file; stars and gdalcubes can use proxy=TRUE for the
> > same purpose. Why did you choose rgdal::readGDAL() when this is not
> > its purpose?
> >
> > You did not say how much RAM is on your platform.
> >
> > Roger
> >
> >>
> >> In R 3.6.3 is is "... memory.c:3717"
> >>
> >> However, I can load the same file with the tiff package and a file of
> >> the same size in the native raster package format (*.grd) with the
> >> raster package but again not with the rgdal package.
> >>
> >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
> >> Rouault assumes the problem is related to rgdal and not gdal
> >> (https://github.com/OSGeo/gdal/issues/2442).
> >>
> >> Below you find reproducible code, which generates a raster file,
> >> saves the two formats (.tiff and .grd) and tries to read them with
> >> the different packages.
> >>
> >> Is this a known limitation? Any help is greatly appreciated!
> >>
> >> Thanks a lot in advance!
> >>
> >> Best wishes and stay healthy,
> >> Thorsten
> >>
> >>
> >>
> >> ### Steps to reproduce the problem.
> >>
> >> R code:
> >>
> >> ```
> >> library(rgdal) # 1.4-8
> >> library(raster) # 3.1-5
> >> library(tiff) # 0.1-5
> >>
> >> ## generate and manipulate a big raster dataset
> >> # - generate
> >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >> 72000, 0, 48000))) # all fine
> >>
> >> # - manipulate
> >> values(rDemTest) = 1 # all fine
> >>
> >> # - convert
> >> mDemTest = raster::as.matrix(rDemTest) # all fine
> >> str(mDemTest)
> >>
> >> ## save a big dataset
> >>
> >> # - as raster/gdal
> >> sFileNameTiff = "BigData.tif"
> >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
> >> NAflag = -9999) # all fine
> >>
> >> # - as raster native
> >> sFileNameNative = "BigData.grd"
> >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
> >> NAflag = -9999) # all fine
> >>
> >>
> >> ## load the big raster datasets with different packages and options
> >> # - load the tiff data with the gdal package via the raster package
> >> rDem = raster(sFileNameTiff) # all fine
> >> extent(rDem) # all fine
> >> mDem = raster::as.matrix(rDem) # error
> >> rDem = readAll(rDem) # error
> >>
> >> # - load the native raster data with the raster package
> >> rDem = raster(sFileNameNative) # all fine
> >> extent(rDem) # all fine
> >> mDem = raster::as.matrix(rDem) # all fine
> >> str(mDem)
> >>
> >> # - load the tiff data with the tiff package
> >> mDem = readTIFF(sFileNameTiff) # all fine
> >> str(mDem)
> >>
> >> # - load the tiff data with the gdal package
> >> sfDem = readGDAL(sFileNameTiff) # error
> >>
> >> # - load the native raster data with the gdal package
> >> sfDem = readGDAL(sFileNameNative) # error
> >>
> >> ```
> >>
> >>
> >> ### Startup messages when rgdal is attached (requested by Roger Bivand)
> >>>  library(rgdal)
> >> rgdal: version: 1.4-8, (SVN revision 845)
> >>  Geospatial Data Abstraction Library extensions to R successfully loaded
> >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> >>  Path to GDAL shared files:
> >>  GDAL binary built with GEOS: TRUE
> >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION:
> >> 631]
> >>  Path to PROJ.4 shared files: (autodetected)
> >>  Linking to sp version: 1.4-1
> >>
> >>
> >> ### Session info
> >>>  sessionInfo()
> >> R version 4.0.0 (2020-04-24)
> >> Platform: x86_64-pc-linux-gnu (64-bit)
> >> Running under: Ubuntu 20.04 LTS
> >>
> >> Matrix products: default
> >> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
> >>
> >> locale:
> >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
> >>  [4] LC_COLLATE=de_DE.UTF-8     LC_MONETARY=de_DE.UTF-8
> >> LC_MESSAGES=de_DE.UTF-8
> >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> >> [10] LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8
> >> LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods base
> >>
> >> other attached packages:
> >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
> >> sp_1.4-1
> >>
> >> loaded via a namespace (and not attached):
> >>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
> >> R.methodsS3_1.8.0 codetools_0.2-16
> >>  [6] grid_4.0.0        iterators_1.0.12  foreach_1.5.0
> >> R.utils_2.9.2     R.oo_1.23.0
> >> [11] lattice_0.20-41
> >>
> >>
> >> ### gdalInfo
> >>>  gdalinfo(sFileNameTiff)
> >>  [1] "Driver: GTiff/GeoTIFF"
> >>  [2] "Files: BigData.tif"
> >>  [3] "Size is 72000, 48000"
> >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
> >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
> >>  [6] "Image Structure Metadata:"
> >>  [7] "  COMPRESSION=LZW"
> >>  [8] "  INTERLEAVE=BAND"
> >>  [9] "Corner Coordinates:"
> >> [10] "Upper Left  (       0.000,   48000.000) "
> >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> >> [12] "Upper Right (   72000.000,   48000.000) "
> >> [13] "Lower Right (   72000.000,       0.000) "
> >> [14] "Center      (   36000.000,   24000.000) "
> >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
> >> [16] "  Min=1.000 Max=1.000 "
> >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
> >> [18] "  NoData Value=-9999"
> >> [19] "  Metadata:"
> >> [20] "    STATISTICS_MAXIMUM=1"
> >> [21] "    STATISTICS_MEAN=nan"
> >> [22] "    STATISTICS_MINIMUM=1"
> >> [23] "    STATISTICS_STDDEV=nan"
> >>
> >> _______________________________________________
> >> R-sig-Geo mailing list
> >> [hidden email]
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>
> >>
> >
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
Michael,

Thanks for the hint, it seems to work! Real-world tests will follow in
the next few days...

So it definitely seems to be a problem of rgdal. It would be great if it
could still be solved.

Best,

Thorsten



Am 27.04.2020 um 15:58 schrieb Michael Sumner:

> Try stars it worked for me on a test
>
> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> <[hidden email] <mailto:[hidden email]>>
> wrote:
>
>     Roger,
>
>     thanks a lot for your reply!
>
>     I have 256GB RAM installed (mentioned it somewhere). And there,
>     all is
>     fine when I run:
>
>     rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
>
>     values(rDemTest) = 1
>
>     When limiting the memory to about 8GB with
>     ulimit::memory_limit(8000),
>     the max size which can be allocated seems to be around 10000 x
>     10000px.
>     In this case all tests run fine. Unfortunately it seems to be
>     related to
>     the size of the grid (48000 x 72000) and therefore the problem
>     can't be
>     reproduced on machines with 8GB RAM. For some processing steps I need
>     grids of that size in the memory, which is why I have 256 GB
>     installed.
>
>     Normally, I use the raster package and not rgdal::readGDAL(). This
>     was
>     just a desperate attempt to find the source of the problem.
>
>     This is what I use in my code:
>
>     rDem = raster(sFileNameTiff)
>     mDem = raster::as.matrix(rDem)
>
>     But maybe this is the same...
>
>     Any further suggestions are much appreciated!
>
>     Thanks again!
>
>     Best,
>
>     Thorsten
>
>
>
>
>     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>     > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>     >
>     >> Dear all,
>     >>
>     >> my problem is that I want to read a big geotiff raster dataset
>     into R
>     >> and convert it to a matrix, which does not work.
>     >> The file is big but there is sufficient memory. I need all the
>     data
>     >> in the memory at the same time.
>     >>
>     >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
>     >> LTS with the latest version of the packages (see session info
>     below)
>     >> and 256GB RAM installed.
>     >>
>     >> When loading the raster dataset using rgdal (via readGDAL or
>     >> raster::readAll) I get the follwoing error in R 4.0.0:
>     >>
>     >> ```
>     >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
>     reg,
>     >> band = object@data@band) :
>     >>   long vectors not supported yet: memory.c:3782
>     >> ```
>     >
>     > On a 16GB Fedora platform:
>     >
>     >> library(raster) # 3.1-5
>     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>     72000,
>     > 0,
>     > + 48000))) # all fine
>     >> rDemTest
>     > class      : RasterLayer
>     > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>     > resolution : 1, 1  (x, y)
>     > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>     > crs        : NA
>     >
>     >> values(rDemTest) = 1
>     > Error: cannot allocate vector of size 25.7 Gb
>     >
>     > So you are deceiving yourself into thinking that all is fine at
>     this
>     > point. Please try to instantiate an example that can be
>     reproduced on
>     > a machine with 8GB RAM.
>     >
>     > Further note that rgdal::readGDAL() is not how you handle very
>     large
>     > objects in files, and never has been. raster can handle blocks
>     of data
>     > from bands in file; stars and gdalcubes can use proxy=TRUE for the
>     > same purpose. Why did you choose rgdal::readGDAL() when this is not
>     > its purpose?
>     >
>     > You did not say how much RAM is on your platform.
>     >
>     > Roger
>     >
>     >>
>     >> In R 3.6.3 is is "... memory.c:3717"
>     >>
>     >> However, I can load the same file with the tiff package and a
>     file of
>     >> the same size in the native raster package format (*.grd) with the
>     >> raster package but again not with the rgdal package.
>     >>
>     >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
>     >> Rouault assumes the problem is related to rgdal and not gdal
>     >> (https://github.com/OSGeo/gdal/issues/2442).
>     >>
>     >> Below you find reproducible code, which generates a raster file,
>     >> saves the two formats (.tiff and .grd) and tries to read them with
>     >> the different packages.
>     >>
>     >> Is this a known limitation? Any help is greatly appreciated!
>     >>
>     >> Thanks a lot in advance!
>     >>
>     >> Best wishes and stay healthy,
>     >> Thorsten
>     >>
>     >>
>     >>
>     >> ### Steps to reproduce the problem.
>     >>
>     >> R code:
>     >>
>     >> ```
>     >> library(rgdal) # 1.4-8
>     >> library(raster) # 3.1-5
>     >> library(tiff) # 0.1-5
>     >>
>     >> ## generate and manipulate a big raster dataset
>     >> # - generate
>     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>     >> 72000, 0, 48000))) # all fine
>     >>
>     >> # - manipulate
>     >> values(rDemTest) = 1 # all fine
>     >>
>     >> # - convert
>     >> mDemTest = raster::as.matrix(rDemTest) # all fine
>     >> str(mDemTest)
>     >>
>     >> ## save a big dataset
>     >>
>     >> # - as raster/gdal
>     >> sFileNameTiff = "BigData.tif"
>     >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
>     >> NAflag = -9999) # all fine
>     >>
>     >> # - as raster native
>     >> sFileNameNative = "BigData.grd"
>     >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
>     >> NAflag = -9999) # all fine
>     >>
>     >>
>     >> ## load the big raster datasets with different packages and options
>     >> # - load the tiff data with the gdal package via the raster package
>     >> rDem = raster(sFileNameTiff) # all fine
>     >> extent(rDem) # all fine
>     >> mDem = raster::as.matrix(rDem) # error
>     >> rDem = readAll(rDem) # error
>     >>
>     >> # - load the native raster data with the raster package
>     >> rDem = raster(sFileNameNative) # all fine
>     >> extent(rDem) # all fine
>     >> mDem = raster::as.matrix(rDem) # all fine
>     >> str(mDem)
>     >>
>     >> # - load the tiff data with the tiff package
>     >> mDem = readTIFF(sFileNameTiff) # all fine
>     >> str(mDem)
>     >>
>     >> # - load the tiff data with the gdal package
>     >> sfDem = readGDAL(sFileNameTiff) # error
>     >>
>     >> # - load the native raster data with the gdal package
>     >> sfDem = readGDAL(sFileNameNative) # error
>     >>
>     >> ```
>     >>
>     >>
>     >> ### Startup messages when rgdal is attached (requested by Roger
>     Bivand)
>     >>>  library(rgdal)
>     >> rgdal: version: 1.4-8, (SVN revision 845)
>     >>  Geospatial Data Abstraction Library extensions to R
>     successfully loaded
>     >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>     >>  Path to GDAL shared files:
>     >>  GDAL binary built with GEOS: TRUE
>     >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>     [PJ_VERSION:
>     >> 631]
>     >>  Path to PROJ.4 shared files: (autodetected)
>     >>  Linking to sp version: 1.4-1
>     >>
>     >>
>     >> ### Session info
>     >>>  sessionInfo()
>     >> R version 4.0.0 (2020-04-24)
>     >> Platform: x86_64-pc-linux-gnu (64-bit)
>     >> Running under: Ubuntu 20.04 LTS
>     >>
>     >> Matrix products: default
>     >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>     >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>     >>
>     >> locale:
>     >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>     >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>     >> LC_MESSAGES=de_DE.UTF-8
>     >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>     >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>     >> LC_IDENTIFICATION=C
>     >>
>     >> attached base packages:
>     >> [1] stats     graphics  grDevices utils     datasets methods base
>     >>
>     >> other attached packages:
>     >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
>     >> sp_1.4-1
>     >>
>     >> loaded via a namespace (and not attached):
>     >>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
>     >> R.methodsS3_1.8.0 codetools_0.2-16
>     >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>     >> R.utils_2.9.2     R.oo_1.23.0
>     >> [11] lattice_0.20-41
>     >>
>     >>
>     >> ### gdalInfo
>     >>>  gdalinfo(sFileNameTiff)
>     >>  [1] "Driver: GTiff/GeoTIFF"
>     >>  [2] "Files: BigData.tif"
>     >>  [3] "Size is 72000, 48000"
>     >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>     >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>     >>  [6] "Image Structure Metadata:"
>     >>  [7] "  COMPRESSION=LZW"
>     >>  [8] "  INTERLEAVE=BAND"
>     >>  [9] "Corner Coordinates:"
>     >> [10] "Upper Left  (       0.000,   48000.000) "
>     >> [11] "Lower Left  (   0.0000000,   0.0000000) "
>     >> [12] "Upper Right (   72000.000,   48000.000) "
>     >> [13] "Lower Right (   72000.000,       0.000) "
>     >> [14] "Center      (   36000.000,   24000.000) "
>     >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>     >> [16] "  Min=1.000 Max=1.000 "
>     >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>     >> [18] "  NoData Value=-9999"
>     >> [19] "  Metadata:"
>     >> [20] "    STATISTICS_MAXIMUM=1"
>     >> [21] "    STATISTICS_MEAN=nan"
>     >> [22] "    STATISTICS_MINIMUM=1"
>     >> [23] "    STATISTICS_STDDEV=nan"
>     >>
>     >> _______________________________________________
>     >> R-sig-Geo mailing list
>     >> [hidden email] <mailto:[hidden email]>
>     >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>     >>
>     >>
>     >
>
>     _______________________________________________
>     R-sig-Geo mailing list
>     [hidden email] <mailto:[hidden email]>
>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Roger Bivand
Administrator
On Tue, 28 Apr 2020, Thorsten Behrens wrote:

> Michael,
>
> Thanks for the hint, it seems to work! Real-world tests will follow in
> the next few days...
>
> So it definitely seems to be a problem of rgdal. It would be great if it
> could still be solved.

Not rgdal, but your use of it. Try looking at a sequence of

file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
obj <- GDAL.open(file)
(dims <- dim(obj))
band <- 1
data_vector <- getRasterData(obj, band)
GDAL.close(obj)
str(data_vector)

This does not create any more complicated objects, just a matrix. In some
cases, the rows in the matrix are ordered S -> N, so it may appear the
wrong way up.

rgdal::getRasterData() is lightweight, and has many arguments which may be
helpful. rgdal::readGDAL() is heavyweight, creating a SpatialGridDataFrame
object. This involves much copying of data, but the output object can be
used for example in mapping or analysis without further conversion. My
guess is that rgdal::getRasterData() gives you your matrix directly. Look
at the R code to see how as.is= etc. work (files may include scale and
offset values - recently a user was confused that scale and offset were
"magically" applied to convert Uint16 values to degrees Kelvin on
reading). For example, if as.is == TRUE or scale == 1 and offset == 0, no
copying of the input matrix occurs because it is not converted. If you
could check this route, others following this thread could also benefit;
if I'm wrong, that would also be good to know.

Roger


>
> Best,
>
> Thorsten
>
>
>
> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>> Try stars it worked for me on a test
>>
>> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>> <[hidden email] <mailto:[hidden email]>>
>> wrote:
>>
>>     Roger,
>>
>>     thanks a lot for your reply!
>>
>>     I have 256GB RAM installed (mentioned it somewhere). And there,
>>     all is
>>     fine when I run:
>>
>>     rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
>>
>>     values(rDemTest) = 1
>>
>>     When limiting the memory to about 8GB with
>>     ulimit::memory_limit(8000),
>>     the max size which can be allocated seems to be around 10000 x
>>     10000px.
>>     In this case all tests run fine. Unfortunately it seems to be
>>     related to
>>     the size of the grid (48000 x 72000) and therefore the problem
>>     can't be
>>     reproduced on machines with 8GB RAM. For some processing steps I need
>>     grids of that size in the memory, which is why I have 256 GB
>>     installed.
>>
>>     Normally, I use the raster package and not rgdal::readGDAL(). This
>>     was
>>     just a desperate attempt to find the source of the problem.
>>
>>     This is what I use in my code:
>>
>>     rDem = raster(sFileNameTiff)
>>     mDem = raster::as.matrix(rDem)
>>
>>     But maybe this is the same...
>>
>>     Any further suggestions are much appreciated!
>>
>>     Thanks again!
>>
>>     Best,
>>
>>     Thorsten
>>
>>
>>
>>
>>     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>>    > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>>    >
>>    >> Dear all,
>>    >>
>>    >> my problem is that I want to read a big geotiff raster dataset
>>     into R
>>    >> and convert it to a matrix, which does not work.
>>    >> The file is big but there is sufficient memory. I need all the
>>     data
>>    >> in the memory at the same time.
>>    >>
>>    >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
>>    >> LTS with the latest version of the packages (see session info
>>     below)
>>    >> and 256GB RAM installed.
>>    >>
>>    >> When loading the raster dataset using rgdal (via readGDAL or
>>    >> raster::readAll) I get the follwoing error in R 4.0.0:
>>    >>
>>    >> ```
>>    >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
>>     reg,
>>    >> band = object@data@band) :
>>    >>   long vectors not supported yet: memory.c:3782
>>    >> ```
>>    >
>>    > On a 16GB Fedora platform:
>>    >
>>    >> library(raster) # 3.1-5
>>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>     72000,
>>    > 0,
>>    > + 48000))) # all fine
>>    >> rDemTest
>>    > class      : RasterLayer
>>    > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>>    > resolution : 1, 1  (x, y)
>>    > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>>    > crs        : NA
>>    >
>>    >> values(rDemTest) = 1
>>    > Error: cannot allocate vector of size 25.7 Gb
>>    >
>>    > So you are deceiving yourself into thinking that all is fine at
>>     this
>>    > point. Please try to instantiate an example that can be
>>     reproduced on
>>    > a machine with 8GB RAM.
>>    >
>>    > Further note that rgdal::readGDAL() is not how you handle very
>>     large
>>    > objects in files, and never has been. raster can handle blocks
>>     of data
>>    > from bands in file; stars and gdalcubes can use proxy=TRUE for the
>>    > same purpose. Why did you choose rgdal::readGDAL() when this is not
>>    > its purpose?
>>    >
>>    > You did not say how much RAM is on your platform.
>>    >
>>    > Roger
>>    >
>>    >>
>>    >> In R 3.6.3 is is "... memory.c:3717"
>>    >>
>>    >> However, I can load the same file with the tiff package and a
>>     file of
>>    >> the same size in the native raster package format (*.grd) with the
>>    >> raster package but again not with the rgdal package.
>>    >>
>>    >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
>>    >> Rouault assumes the problem is related to rgdal and not gdal
>>    >> (https://github.com/OSGeo/gdal/issues/2442).
>>    >>
>>    >> Below you find reproducible code, which generates a raster file,
>>    >> saves the two formats (.tiff and .grd) and tries to read them with
>>    >> the different packages.
>>    >>
>>    >> Is this a known limitation? Any help is greatly appreciated!
>>    >>
>>    >> Thanks a lot in advance!
>>    >>
>>    >> Best wishes and stay healthy,
>>    >> Thorsten
>>    >>
>>    >>
>>    >>
>>    >> ### Steps to reproduce the problem.
>>    >>
>>    >> R code:
>>    >>
>>    >> ```
>>    >> library(rgdal) # 1.4-8
>>    >> library(raster) # 3.1-5
>>    >> library(tiff) # 0.1-5
>>    >>
>>    >> ## generate and manipulate a big raster dataset
>>    >> # - generate
>>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>    >> 72000, 0, 48000))) # all fine
>>    >>
>>    >> # - manipulate
>>    >> values(rDemTest) = 1 # all fine
>>    >>
>>    >> # - convert
>>    >> mDemTest = raster::as.matrix(rDemTest) # all fine
>>    >> str(mDemTest)
>>    >>
>>    >> ## save a big dataset
>>    >>
>>    >> # - as raster/gdal
>>    >> sFileNameTiff = "BigData.tif"
>>    >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
>>    >> NAflag = -9999) # all fine
>>    >>
>>    >> # - as raster native
>>    >> sFileNameNative = "BigData.grd"
>>    >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
>>    >> NAflag = -9999) # all fine
>>    >>
>>    >>
>>    >> ## load the big raster datasets with different packages and options
>>    >> # - load the tiff data with the gdal package via the raster package
>>    >> rDem = raster(sFileNameTiff) # all fine
>>    >> extent(rDem) # all fine
>>    >> mDem = raster::as.matrix(rDem) # error
>>    >> rDem = readAll(rDem) # error
>>    >>
>>    >> # - load the native raster data with the raster package
>>    >> rDem = raster(sFileNameNative) # all fine
>>    >> extent(rDem) # all fine
>>    >> mDem = raster::as.matrix(rDem) # all fine
>>    >> str(mDem)
>>    >>
>>    >> # - load the tiff data with the tiff package
>>    >> mDem = readTIFF(sFileNameTiff) # all fine
>>    >> str(mDem)
>>    >>
>>    >> # - load the tiff data with the gdal package
>>    >> sfDem = readGDAL(sFileNameTiff) # error
>>    >>
>>    >> # - load the native raster data with the gdal package
>>    >> sfDem = readGDAL(sFileNameNative) # error
>>    >>
>>    >> ```
>>    >>
>>    >>
>>    >> ### Startup messages when rgdal is attached (requested by Roger
>>     Bivand)
>>    >>>  library(rgdal)
>>    >> rgdal: version: 1.4-8, (SVN revision 845)
>>    >>  Geospatial Data Abstraction Library extensions to R
>>     successfully loaded
>>    >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>>    >>  Path to GDAL shared files:
>>    >>  GDAL binary built with GEOS: TRUE
>>    >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>>     [PJ_VERSION:
>>    >> 631]
>>    >>  Path to PROJ.4 shared files: (autodetected)
>>    >>  Linking to sp version: 1.4-1
>>    >>
>>    >>
>>    >> ### Session info
>>    >>>  sessionInfo()
>>    >> R version 4.0.0 (2020-04-24)
>>    >> Platform: x86_64-pc-linux-gnu (64-bit)
>>    >> Running under: Ubuntu 20.04 LTS
>>    >>
>>    >> Matrix products: default
>>    >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>>    >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>>    >>
>>    >> locale:
>>    >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>>    >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>>    >> LC_MESSAGES=de_DE.UTF-8
>>    >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>>    >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>>    >> LC_IDENTIFICATION=C
>>    >>
>>    >> attached base packages:
>>    >> [1] stats     graphics  grDevices utils     datasets methods base
>>    >>
>>    >> other attached packages:
>>    >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
>>    >> sp_1.4-1
>>    >>
>>    >> loaded via a namespace (and not attached):
>>    >>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
>>    >> R.methodsS3_1.8.0 codetools_0.2-16
>>    >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>>    >> R.utils_2.9.2     R.oo_1.23.0
>>    >> [11] lattice_0.20-41
>>    >>
>>    >>
>>    >> ### gdalInfo
>>    >>>  gdalinfo(sFileNameTiff)
>>    >>  [1] "Driver: GTiff/GeoTIFF"
>>    >>  [2] "Files: BigData.tif"
>>    >>  [3] "Size is 72000, 48000"
>>    >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>>    >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>>    >>  [6] "Image Structure Metadata:"
>>    >>  [7] "  COMPRESSION=LZW"
>>    >>  [8] "  INTERLEAVE=BAND"
>>    >>  [9] "Corner Coordinates:"
>>    >> [10] "Upper Left  (       0.000,   48000.000) "
>>    >> [11] "Lower Left  (   0.0000000,   0.0000000) "
>>    >> [12] "Upper Right (   72000.000,   48000.000) "
>>    >> [13] "Lower Right (   72000.000,       0.000) "
>>    >> [14] "Center      (   36000.000,   24000.000) "
>>    >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>>    >> [16] "  Min=1.000 Max=1.000 "
>>    >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>>    >> [18] "  NoData Value=-9999"
>>    >> [19] "  Metadata:"
>>    >> [20] "    STATISTICS_MAXIMUM=1"
>>    >> [21] "    STATISTICS_MEAN=nan"
>>    >> [22] "    STATISTICS_MINIMUM=1"
>>    >> [23] "    STATISTICS_STDDEV=nan"
>>    >>
>>    >> _______________________________________________
>>    >> R-sig-Geo mailing list
>>    >> [hidden email] <mailto:[hidden email]>
>>    >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>    >>
>>    >>
>>    >
>>
>>     _______________________________________________
>>     R-sig-Geo mailing list
>>     [hidden email] <mailto:[hidden email]>
>>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Michael Sumner-2
In reply to this post by Thorsten Behrens
I reckon that's unlikely, it's also unlikely to scale or be fast enough
ever  given what rgdal is - raster will happily crop sections of it out but
I guess your workflow is to write the geotiff out in one go? You want rgdal
because you want raster?  Have you tried terra? (I forgot to mention that
one, and I haven't tried on that size GeoTIFF with it. It has its own
internal connection to GDAL).

Possibly you can instantiate a GeoTIFF with rgdal of the right size, then
write to it piecewise - but if the convenience of raster is what you're
after that's obviously no good. I think I'm missing some workflow
constraint or motivation you have, but I'm interested (it's an awkward
space).

Cheers, Mike

On Tue, Apr 28, 2020 at 7:14 PM Thorsten Behrens <
[hidden email]> wrote:

> Michael,
>
> Thanks for the hint, it seems to work! Real-world tests will follow in
> the next few days...
>
> So it definitely seems to be a problem of rgdal. It would be great if it
> could still be solved.
>
> Best,
>
> Thorsten
>
>
>
> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> > Try stars it worked for me on a test
> >
> > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> > <[hidden email] <mailto:[hidden email]>>
> > wrote:
> >
> >     Roger,
> >
> >     thanks a lot for your reply!
> >
> >     I have 256GB RAM installed (mentioned it somewhere). And there,
> >     all is
> >     fine when I run:
> >
> >     rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> 72000,
> >
> >     values(rDemTest) = 1
> >
> >     When limiting the memory to about 8GB with
> >     ulimit::memory_limit(8000),
> >     the max size which can be allocated seems to be around 10000 x
> >     10000px.
> >     In this case all tests run fine. Unfortunately it seems to be
> >     related to
> >     the size of the grid (48000 x 72000) and therefore the problem
> >     can't be
> >     reproduced on machines with 8GB RAM. For some processing steps I need
> >     grids of that size in the memory, which is why I have 256 GB
> >     installed.
> >
> >     Normally, I use the raster package and not rgdal::readGDAL(). This
> >     was
> >     just a desperate attempt to find the source of the problem.
> >
> >     This is what I use in my code:
> >
> >     rDem = raster(sFileNameTiff)
> >     mDem = raster::as.matrix(rDem)
> >
> >     But maybe this is the same...
> >
> >     Any further suggestions are much appreciated!
> >
> >     Thanks again!
> >
> >     Best,
> >
> >     Thorsten
> >
> >
> >
> >
> >     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> >     > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> >     >
> >     >> Dear all,
> >     >>
> >     >> my problem is that I want to read a big geotiff raster dataset
> >     into R
> >     >> and convert it to a matrix, which does not work.
> >     >> The file is big but there is sufficient memory. I need all the
> >     data
> >     >> in the memory at the same time.
> >     >>
> >     >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
> >     >> LTS with the latest version of the packages (see session info
> >     below)
> >     >> and 256GB RAM installed.
> >     >>
> >     >> When loading the raster dataset using rgdal (via readGDAL or
> >     >> raster::readAll) I get the follwoing error in R 4.0.0:
> >     >>
> >     >> ```
> >     >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
> >     reg,
> >     >> band = object@data@band) :
> >     >>   long vectors not supported yet: memory.c:3782
> >     >> ```
> >     >
> >     > On a 16GB Fedora platform:
> >     >
> >     >> library(raster) # 3.1-5
> >     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >     72000,
> >     > 0,
> >     > + 48000))) # all fine
> >     >> rDemTest
> >     > class      : RasterLayer
> >     > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
> >     > resolution : 1, 1  (x, y)
> >     > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
> >     > crs        : NA
> >     >
> >     >> values(rDemTest) = 1
> >     > Error: cannot allocate vector of size 25.7 Gb
> >     >
> >     > So you are deceiving yourself into thinking that all is fine at
> >     this
> >     > point. Please try to instantiate an example that can be
> >     reproduced on
> >     > a machine with 8GB RAM.
> >     >
> >     > Further note that rgdal::readGDAL() is not how you handle very
> >     large
> >     > objects in files, and never has been. raster can handle blocks
> >     of data
> >     > from bands in file; stars and gdalcubes can use proxy=TRUE for the
> >     > same purpose. Why did you choose rgdal::readGDAL() when this is not
> >     > its purpose?
> >     >
> >     > You did not say how much RAM is on your platform.
> >     >
> >     > Roger
> >     >
> >     >>
> >     >> In R 3.6.3 is is "... memory.c:3717"
> >     >>
> >     >> However, I can load the same file with the tiff package and a
> >     file of
> >     >> the same size in the native raster package format (*.grd) with the
> >     >> raster package but again not with the rgdal package.
> >     >>
> >     >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
> >     >> Rouault assumes the problem is related to rgdal and not gdal
> >     >> (https://github.com/OSGeo/gdal/issues/2442).
> >     >>
> >     >> Below you find reproducible code, which generates a raster file,
> >     >> saves the two formats (.tiff and .grd) and tries to read them with
> >     >> the different packages.
> >     >>
> >     >> Is this a known limitation? Any help is greatly appreciated!
> >     >>
> >     >> Thanks a lot in advance!
> >     >>
> >     >> Best wishes and stay healthy,
> >     >> Thorsten
> >     >>
> >     >>
> >     >>
> >     >> ### Steps to reproduce the problem.
> >     >>
> >     >> R code:
> >     >>
> >     >> ```
> >     >> library(rgdal) # 1.4-8
> >     >> library(raster) # 3.1-5
> >     >> library(tiff) # 0.1-5
> >     >>
> >     >> ## generate and manipulate a big raster dataset
> >     >> # - generate
> >     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >     >> 72000, 0, 48000))) # all fine
> >     >>
> >     >> # - manipulate
> >     >> values(rDemTest) = 1 # all fine
> >     >>
> >     >> # - convert
> >     >> mDemTest = raster::as.matrix(rDemTest) # all fine
> >     >> str(mDemTest)
> >     >>
> >     >> ## save a big dataset
> >     >>
> >     >> # - as raster/gdal
> >     >> sFileNameTiff = "BigData.tif"
> >     >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
> >     >> NAflag = -9999) # all fine
> >     >>
> >     >> # - as raster native
> >     >> sFileNameNative = "BigData.grd"
> >     >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
> >     >> NAflag = -9999) # all fine
> >     >>
> >     >>
> >     >> ## load the big raster datasets with different packages and
> options
> >     >> # - load the tiff data with the gdal package via the raster
> package
> >     >> rDem = raster(sFileNameTiff) # all fine
> >     >> extent(rDem) # all fine
> >     >> mDem = raster::as.matrix(rDem) # error
> >     >> rDem = readAll(rDem) # error
> >     >>
> >     >> # - load the native raster data with the raster package
> >     >> rDem = raster(sFileNameNative) # all fine
> >     >> extent(rDem) # all fine
> >     >> mDem = raster::as.matrix(rDem) # all fine
> >     >> str(mDem)
> >     >>
> >     >> # - load the tiff data with the tiff package
> >     >> mDem = readTIFF(sFileNameTiff) # all fine
> >     >> str(mDem)
> >     >>
> >     >> # - load the tiff data with the gdal package
> >     >> sfDem = readGDAL(sFileNameTiff) # error
> >     >>
> >     >> # - load the native raster data with the gdal package
> >     >> sfDem = readGDAL(sFileNameNative) # error
> >     >>
> >     >> ```
> >     >>
> >     >>
> >     >> ### Startup messages when rgdal is attached (requested by Roger
> >     Bivand)
> >     >>>  library(rgdal)
> >     >> rgdal: version: 1.4-8, (SVN revision 845)
> >     >>  Geospatial Data Abstraction Library extensions to R
> >     successfully loaded
> >     >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> >     >>  Path to GDAL shared files:
> >     >>  GDAL binary built with GEOS: TRUE
> >     >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
> >     [PJ_VERSION:
> >     >> 631]
> >     >>  Path to PROJ.4 shared files: (autodetected)
> >     >>  Linking to sp version: 1.4-1
> >     >>
> >     >>
> >     >> ### Session info
> >     >>>  sessionInfo()
> >     >> R version 4.0.0 (2020-04-24)
> >     >> Platform: x86_64-pc-linux-gnu (64-bit)
> >     >> Running under: Ubuntu 20.04 LTS
> >     >>
> >     >> Matrix products: default
> >     >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> >     >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
> >     >>
> >     >> locale:
> >     >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
> >     >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> >     >> LC_MESSAGES=de_DE.UTF-8
> >     >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> >     >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> >     >> LC_IDENTIFICATION=C
> >     >>
> >     >> attached base packages:
> >     >> [1] stats     graphics  grDevices utils     datasets methods base
> >     >>
> >     >> other attached packages:
> >     >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
> >     >> sp_1.4-1
> >     >>
> >     >> loaded via a namespace (and not attached):
> >     >>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
> >     >> R.methodsS3_1.8.0 codetools_0.2-16
> >     >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
> >     >> R.utils_2.9.2     R.oo_1.23.0
> >     >> [11] lattice_0.20-41
> >     >>
> >     >>
> >     >> ### gdalInfo
> >     >>>  gdalinfo(sFileNameTiff)
> >     >>  [1] "Driver: GTiff/GeoTIFF"
> >     >>  [2] "Files: BigData.tif"
> >     >>  [3] "Size is 72000, 48000"
> >     >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
> >     >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
> >     >>  [6] "Image Structure Metadata:"
> >     >>  [7] "  COMPRESSION=LZW"
> >     >>  [8] "  INTERLEAVE=BAND"
> >     >>  [9] "Corner Coordinates:"
> >     >> [10] "Upper Left  (       0.000,   48000.000) "
> >     >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> >     >> [12] "Upper Right (   72000.000,   48000.000) "
> >     >> [13] "Lower Right (   72000.000,       0.000) "
> >     >> [14] "Center      (   36000.000,   24000.000) "
> >     >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
> >     >> [16] "  Min=1.000 Max=1.000 "
> >     >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
> >     >> [18] "  NoData Value=-9999"
> >     >> [19] "  Metadata:"
> >     >> [20] "    STATISTICS_MAXIMUM=1"
> >     >> [21] "    STATISTICS_MEAN=nan"
> >     >> [22] "    STATISTICS_MINIMUM=1"
> >     >> [23] "    STATISTICS_STDDEV=nan"
> >     >>
> >     >> _______________________________________________
> >     >> R-sig-Geo mailing list
> >     >> [hidden email] <mailto:[hidden email]>
> >     >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >     >>
> >     >>
> >     >
> >
> >     _______________________________________________
> >     R-sig-Geo mailing list
> >     [hidden email] <mailto:[hidden email]>
> >     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>


--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: [hidden email]

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Michael Sumner-2
In reply to this post by Roger Bivand
Oh nice one, Roger I'd lost sight of all that churn there.

Cheers, Mike


On Tue, Apr 28, 2020 at 8:11 PM Roger Bivand <[hidden email]> wrote:

> On Tue, 28 Apr 2020, Thorsten Behrens wrote:
>
> > Michael,
> >
> > Thanks for the hint, it seems to work! Real-world tests will follow in
> > the next few days...
> >
> > So it definitely seems to be a problem of rgdal. It would be great if it
> > could still be solved.
>
> Not rgdal, but your use of it. Try looking at a sequence of
>
> file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> obj <- GDAL.open(file)
> (dims <- dim(obj))
> band <- 1
> data_vector <- getRasterData(obj, band)
> GDAL.close(obj)
> str(data_vector)
>
> This does not create any more complicated objects, just a matrix. In some
> cases, the rows in the matrix are ordered S -> N, so it may appear the
> wrong way up.
>
> rgdal::getRasterData() is lightweight, and has many arguments which may be
> helpful. rgdal::readGDAL() is heavyweight, creating a SpatialGridDataFrame
> object. This involves much copying of data, but the output object can be
> used for example in mapping or analysis without further conversion. My
> guess is that rgdal::getRasterData() gives you your matrix directly. Look
> at the R code to see how as.is= etc. work (files may include scale and
> offset values - recently a user was confused that scale and offset were
> "magically" applied to convert Uint16 values to degrees Kelvin on
> reading). For example, if as.is == TRUE or scale == 1 and offset == 0, no
> copying of the input matrix occurs because it is not converted. If you
> could check this route, others following this thread could also benefit;
> if I'm wrong, that would also be good to know.
>
> Roger
>
>
> >
> > Best,
> >
> > Thorsten
> >
> >
> >
> > Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> >> Try stars it worked for me on a test
> >>
> >> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> >> <[hidden email] <mailto:[hidden email]>>
> >> wrote:
> >>
> >>     Roger,
> >>
> >>     thanks a lot for your reply!
> >>
> >>     I have 256GB RAM installed (mentioned it somewhere). And there,
> >>     all is
> >>     fine when I run:
> >>
> >>     rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> 72000,
> >>
> >>     values(rDemTest) = 1
> >>
> >>     When limiting the memory to about 8GB with
> >>     ulimit::memory_limit(8000),
> >>     the max size which can be allocated seems to be around 10000 x
> >>     10000px.
> >>     In this case all tests run fine. Unfortunately it seems to be
> >>     related to
> >>     the size of the grid (48000 x 72000) and therefore the problem
> >>     can't be
> >>     reproduced on machines with 8GB RAM. For some processing steps I
> need
> >>     grids of that size in the memory, which is why I have 256 GB
> >>     installed.
> >>
> >>     Normally, I use the raster package and not rgdal::readGDAL(). This
> >>     was
> >>     just a desperate attempt to find the source of the problem.
> >>
> >>     This is what I use in my code:
> >>
> >>     rDem = raster(sFileNameTiff)
> >>     mDem = raster::as.matrix(rDem)
> >>
> >>     But maybe this is the same...
> >>
> >>     Any further suggestions are much appreciated!
> >>
> >>     Thanks again!
> >>
> >>     Best,
> >>
> >>     Thorsten
> >>
> >>
> >>
> >>
> >>     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> >>    > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> >>    >
> >>    >> Dear all,
> >>    >>
> >>    >> my problem is that I want to read a big geotiff raster dataset
> >>     into R
> >>    >> and convert it to a matrix, which does not work.
> >>    >> The file is big but there is sufficient memory. I need all the
> >>     data
> >>    >> in the memory at the same time.
> >>    >>
> >>    >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
> >>    >> LTS with the latest version of the packages (see session info
> >>     below)
> >>    >> and 256GB RAM installed.
> >>    >>
> >>    >> When loading the raster dataset using rgdal (via readGDAL or
> >>    >> raster::readAll) I get the follwoing error in R 4.0.0:
> >>    >>
> >>    >> ```
> >>    >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
> >>     reg,
> >>    >> band = object@data@band) :
> >>    >>   long vectors not supported yet: memory.c:3782
> >>    >> ```
> >>    >
> >>    > On a 16GB Fedora platform:
> >>    >
> >>    >> library(raster) # 3.1-5
> >>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >>     72000,
> >>    > 0,
> >>    > + 48000))) # all fine
> >>    >> rDemTest
> >>    > class      : RasterLayer
> >>    > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
> >>    > resolution : 1, 1  (x, y)
> >>    > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
> >>    > crs        : NA
> >>    >
> >>    >> values(rDemTest) = 1
> >>    > Error: cannot allocate vector of size 25.7 Gb
> >>    >
> >>    > So you are deceiving yourself into thinking that all is fine at
> >>     this
> >>    > point. Please try to instantiate an example that can be
> >>     reproduced on
> >>    > a machine with 8GB RAM.
> >>    >
> >>    > Further note that rgdal::readGDAL() is not how you handle very
> >>     large
> >>    > objects in files, and never has been. raster can handle blocks
> >>     of data
> >>    > from bands in file; stars and gdalcubes can use proxy=TRUE for the
> >>    > same purpose. Why did you choose rgdal::readGDAL() when this is not
> >>    > its purpose?
> >>    >
> >>    > You did not say how much RAM is on your platform.
> >>    >
> >>    > Roger
> >>    >
> >>    >>
> >>    >> In R 3.6.3 is is "... memory.c:3717"
> >>    >>
> >>    >> However, I can load the same file with the tiff package and a
> >>     file of
> >>    >> the same size in the native raster package format (*.grd) with the
> >>    >> raster package but again not with the rgdal package.
> >>    >>
> >>    >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
> >>    >> Rouault assumes the problem is related to rgdal and not gdal
> >>    >> (https://github.com/OSGeo/gdal/issues/2442).
> >>    >>
> >>    >> Below you find reproducible code, which generates a raster file,
> >>    >> saves the two formats (.tiff and .grd) and tries to read them with
> >>    >> the different packages.
> >>    >>
> >>    >> Is this a known limitation? Any help is greatly appreciated!
> >>    >>
> >>    >> Thanks a lot in advance!
> >>    >>
> >>    >> Best wishes and stay healthy,
> >>    >> Thorsten
> >>    >>
> >>    >>
> >>    >>
> >>    >> ### Steps to reproduce the problem.
> >>    >>
> >>    >> R code:
> >>    >>
> >>    >> ```
> >>    >> library(rgdal) # 1.4-8
> >>    >> library(raster) # 3.1-5
> >>    >> library(tiff) # 0.1-5
> >>    >>
> >>    >> ## generate and manipulate a big raster dataset
> >>    >> # - generate
> >>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >>    >> 72000, 0, 48000))) # all fine
> >>    >>
> >>    >> # - manipulate
> >>    >> values(rDemTest) = 1 # all fine
> >>    >>
> >>    >> # - convert
> >>    >> mDemTest = raster::as.matrix(rDemTest) # all fine
> >>    >> str(mDemTest)
> >>    >>
> >>    >> ## save a big dataset
> >>    >>
> >>    >> # - as raster/gdal
> >>    >> sFileNameTiff = "BigData.tif"
> >>    >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
> >>    >> NAflag = -9999) # all fine
> >>    >>
> >>    >> # - as raster native
> >>    >> sFileNameNative = "BigData.grd"
> >>    >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
> >>    >> NAflag = -9999) # all fine
> >>    >>
> >>    >>
> >>    >> ## load the big raster datasets with different packages and
> options
> >>    >> # - load the tiff data with the gdal package via the raster
> package
> >>    >> rDem = raster(sFileNameTiff) # all fine
> >>    >> extent(rDem) # all fine
> >>    >> mDem = raster::as.matrix(rDem) # error
> >>    >> rDem = readAll(rDem) # error
> >>    >>
> >>    >> # - load the native raster data with the raster package
> >>    >> rDem = raster(sFileNameNative) # all fine
> >>    >> extent(rDem) # all fine
> >>    >> mDem = raster::as.matrix(rDem) # all fine
> >>    >> str(mDem)
> >>    >>
> >>    >> # - load the tiff data with the tiff package
> >>    >> mDem = readTIFF(sFileNameTiff) # all fine
> >>    >> str(mDem)
> >>    >>
> >>    >> # - load the tiff data with the gdal package
> >>    >> sfDem = readGDAL(sFileNameTiff) # error
> >>    >>
> >>    >> # - load the native raster data with the gdal package
> >>    >> sfDem = readGDAL(sFileNameNative) # error
> >>    >>
> >>    >> ```
> >>    >>
> >>    >>
> >>    >> ### Startup messages when rgdal is attached (requested by Roger
> >>     Bivand)
> >>    >>>  library(rgdal)
> >>    >> rgdal: version: 1.4-8, (SVN revision 845)
> >>    >>  Geospatial Data Abstraction Library extensions to R
> >>     successfully loaded
> >>    >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> >>    >>  Path to GDAL shared files:
> >>    >>  GDAL binary built with GEOS: TRUE
> >>    >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
> >>     [PJ_VERSION:
> >>    >> 631]
> >>    >>  Path to PROJ.4 shared files: (autodetected)
> >>    >>  Linking to sp version: 1.4-1
> >>    >>
> >>    >>
> >>    >> ### Session info
> >>    >>>  sessionInfo()
> >>    >> R version 4.0.0 (2020-04-24)
> >>    >> Platform: x86_64-pc-linux-gnu (64-bit)
> >>    >> Running under: Ubuntu 20.04 LTS
> >>    >>
> >>    >> Matrix products: default
> >>    >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> >>    >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
> >>    >>
> >>    >> locale:
> >>    >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
> >>    >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> >>    >> LC_MESSAGES=de_DE.UTF-8
> >>    >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> >>    >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> >>    >> LC_IDENTIFICATION=C
> >>    >>
> >>    >> attached base packages:
> >>    >> [1] stats     graphics  grDevices utils     datasets methods base
> >>    >>
> >>    >> other attached packages:
> >>    >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
> >>    >> sp_1.4-1
> >>    >>
> >>    >> loaded via a namespace (and not attached):
> >>    >>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
> >>    >> R.methodsS3_1.8.0 codetools_0.2-16
> >>    >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
> >>    >> R.utils_2.9.2     R.oo_1.23.0
> >>    >> [11] lattice_0.20-41
> >>    >>
> >>    >>
> >>    >> ### gdalInfo
> >>    >>>  gdalinfo(sFileNameTiff)
> >>    >>  [1] "Driver: GTiff/GeoTIFF"
> >>    >>  [2] "Files: BigData.tif"
> >>    >>  [3] "Size is 72000, 48000"
> >>    >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
> >>    >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
> >>    >>  [6] "Image Structure Metadata:"
> >>    >>  [7] "  COMPRESSION=LZW"
> >>    >>  [8] "  INTERLEAVE=BAND"
> >>    >>  [9] "Corner Coordinates:"
> >>    >> [10] "Upper Left  (       0.000,   48000.000) "
> >>    >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> >>    >> [12] "Upper Right (   72000.000,   48000.000) "
> >>    >> [13] "Lower Right (   72000.000,       0.000) "
> >>    >> [14] "Center      (   36000.000,   24000.000) "
> >>    >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
> >>    >> [16] "  Min=1.000 Max=1.000 "
> >>    >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
> >>    >> [18] "  NoData Value=-9999"
> >>    >> [19] "  Metadata:"
> >>    >> [20] "    STATISTICS_MAXIMUM=1"
> >>    >> [21] "    STATISTICS_MEAN=nan"
> >>    >> [22] "    STATISTICS_MINIMUM=1"
> >>    >> [23] "    STATISTICS_STDDEV=nan"
> >>    >>
> >>    >> _______________________________________________
> >>    >> R-sig-Geo mailing list
> >>    >> [hidden email] <mailto:[hidden email]>
> >>    >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>    >>
> >>    >>
> >>    >
> >>
> >>     _______________________________________________
> >>     R-sig-Geo mailing list
> >>     [hidden email] <mailto:[hidden email]>
> >>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > [hidden email]
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: [hidden email]
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>


--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: [hidden email]

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
In reply to this post by Roger Bivand
Roger and Mike,

I really appreciate your help on this!

I had a look at getRasterData(). It results in the same error. Hence, I
made further tests where I compared grids with the following numbers of
cols and rows:

nPx = floor(sqrt(2^31 -1)) # 46340

and

nPx = ceiling(sqrt(2^31 -1)) # 46341

The result is clear. Raster data with 46340 * 46340 px can be loaded
with getRasterData() and with raster(), raster::as.matrix(), whereas
datasets with 46341 * 46341 px cannot and result in the error:

Error in getRasterData(gdCeil, band = 1) :
long vectors not supported yet: memory.c:3782

read_stars() works for both. You find the corresponding code below.

Are there any other option I can try?

Thorsten


Reproducible code:

## generate raster datasets
# 46340 * 46340 grid dataset
sFileNameFloor = "Floor.tif"

nPx = floor(sqrt(2^31 -1)) # 46340
nPx

rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0, nPx)))
values(rFloor) = 1

writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE, NAflag =
-9999)


# 46341 * 46341 grid dataset
sFileNameCeil = "Ceil.tif"

nPx = ceiling(sqrt(2^31 -1))
nPx

rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0, nPx)))
values(rCeil) = 1

writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE, NAflag =
-9999)


## load raster datasets with different methods

# load Ceil
gdCeil = GDAL.open(sFileNameCeil)
dim(gdCeil)

vnCeil = getRasterData(gdCeil, band = 1) # error

GDAL.close(gdCeil)
str(vnCeil)

stCeil = read_stars(sFileNameCeil) # all fine
str(stCeil[[1]])

rCeil = raster(sFileNameCeil)
mCeil = raster::as.matrix(rCeil) # error
str(mCeil)


# load Floor
gdFloor = GDAL.open(sFileNameFloor)
dim(gdFloor)

vnData = getRasterData(gdFloor, band = 1) # all fine

GDAL.close(gdFloor)
str(vnData)

stFloor= read_stars(sFileNameFloor) # all fine
str(stFloor[[1]])

rFloor = raster(sFileNameFloor)
mFloor = raster::as.matrix(rFloor) # all fine
str(mFloor)




Am 28.04.2020 um 12:10 schrieb Roger Bivand:

> On Tue, 28 Apr 2020, Thorsten Behrens wrote:
>
>> Michael,
>>
>> Thanks for the hint, it seems to work! Real-world tests will follow in
>> the next few days...
>>
>> So it definitely seems to be a problem of rgdal. It would be great if it
>> could still be solved.
>
> Not rgdal, but your use of it. Try looking at a sequence of
>
> file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> obj <- GDAL.open(file)
> (dims <- dim(obj))
> band <- 1
> data_vector <- getRasterData(obj, band)
> GDAL.close(obj)
> str(data_vector)
>
> This does not create any more complicated objects, just a matrix. In
> some cases, the rows in the matrix are ordered S -> N, so it may
> appear the wrong way up.
>
> rgdal::getRasterData() is lightweight, and has many arguments which
> may be helpful. rgdal::readGDAL() is heavyweight, creating a
> SpatialGridDataFrame object. This involves much copying of data, but
> the output object can be used for example in mapping or analysis
> without further conversion. My guess is that rgdal::getRasterData()
> gives you your matrix directly. Look at the R code to see how as.is=
> etc. work (files may include scale and offset values - recently a user
> was confused that scale and offset were "magically" applied to convert
> Uint16 values to degrees Kelvin on reading). For example, if as.is ==
> TRUE or scale == 1 and offset == 0, no copying of the input matrix
> occurs because it is not converted. If you could check this route,
> others following this thread could also benefit; if I'm wrong, that
> would also be good to know.
>
> Roger
>
>
>>
>> Best,
>>
>> Thorsten
>>
>>
>>
>> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>>> Try stars it worked for me on a test
>>>
>>> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>>> <[hidden email] <mailto:[hidden email]>>
>>> wrote:
>>>
>>>     Roger,
>>>
>>>     thanks a lot for your reply!
>>>
>>>     I have 256GB RAM installed (mentioned it somewhere). And there,
>>>     all is
>>>     fine when I run:
>>>
>>>     rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>> 72000,
>>>
>>>     values(rDemTest) = 1
>>>
>>>     When limiting the memory to about 8GB with
>>>     ulimit::memory_limit(8000),
>>>     the max size which can be allocated seems to be around 10000 x
>>>     10000px.
>>>     In this case all tests run fine. Unfortunately it seems to be
>>>     related to
>>>     the size of the grid (48000 x 72000) and therefore the problem
>>>     can't be
>>>     reproduced on machines with 8GB RAM. For some processing steps I
>>> need
>>>     grids of that size in the memory, which is why I have 256 GB
>>>     installed.
>>>
>>>     Normally, I use the raster package and not rgdal::readGDAL(). This
>>>     was
>>>     just a desperate attempt to find the source of the problem.
>>>
>>>     This is what I use in my code:
>>>
>>>     rDem = raster(sFileNameTiff)
>>>     mDem = raster::as.matrix(rDem)
>>>
>>>     But maybe this is the same...
>>>
>>>     Any further suggestions are much appreciated!
>>>
>>>     Thanks again!
>>>
>>>     Best,
>>>
>>>     Thorsten
>>>
>>>
>>>
>>>
>>>     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>>>    > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>>>    >
>>>    >> Dear all,
>>>    >>
>>>    >> my problem is that I want to read a big geotiff raster dataset
>>>     into R
>>>    >> and convert it to a matrix, which does not work.
>>>    >> The file is big but there is sufficient memory. I need all the
>>>     data
>>>    >> in the memory at the same time.
>>>    >>
>>>    >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu
>>> 20.04
>>>    >> LTS with the latest version of the packages (see session info
>>>     below)
>>>    >> and 256GB RAM installed.
>>>    >>
>>>    >> When loading the raster dataset using rgdal (via readGDAL or
>>>    >> raster::readAll) I get the follwoing error in R 4.0.0:
>>>    >>
>>>    >> ```
>>>    >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
>>>     reg,
>>>    >> band = object@data@band) :
>>>    >>   long vectors not supported yet: memory.c:3782
>>>    >> ```
>>>    >
>>>    > On a 16GB Fedora platform:
>>>    >
>>>    >> library(raster) # 3.1-5
>>>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>>     72000,
>>>    > 0,
>>>    > + 48000))) # all fine
>>>    >> rDemTest
>>>    > class      : RasterLayer
>>>    > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>>>    > resolution : 1, 1  (x, y)
>>>    > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>>>    > crs        : NA
>>>    >
>>>    >> values(rDemTest) = 1
>>>    > Error: cannot allocate vector of size 25.7 Gb
>>>    >
>>>    > So you are deceiving yourself into thinking that all is fine at
>>>     this
>>>    > point. Please try to instantiate an example that can be
>>>     reproduced on
>>>    > a machine with 8GB RAM.
>>>    >
>>>    > Further note that rgdal::readGDAL() is not how you handle very
>>>     large
>>>    > objects in files, and never has been. raster can handle blocks
>>>     of data
>>>    > from bands in file; stars and gdalcubes can use proxy=TRUE for the
>>>    > same purpose. Why did you choose rgdal::readGDAL() when this is
>>> not
>>>    > its purpose?
>>>    >
>>>    > You did not say how much RAM is on your platform.
>>>    >
>>>    > Roger
>>>    >
>>>    >>
>>>    >> In R 3.6.3 is is "... memory.c:3717"
>>>    >>
>>>    >> However, I can load the same file with the tiff package and a
>>>     file of
>>>    >> the same size in the native raster package format (*.grd) with
>>> the
>>>    >> raster package but again not with the rgdal package.
>>>    >>
>>>    >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
>>>    >> Rouault assumes the problem is related to rgdal and not gdal
>>>    >> (https://github.com/OSGeo/gdal/issues/2442).
>>>    >>
>>>    >> Below you find reproducible code, which generates a raster file,
>>>    >> saves the two formats (.tiff and .grd) and tries to read them
>>> with
>>>    >> the different packages.
>>>    >>
>>>    >> Is this a known limitation? Any help is greatly appreciated!
>>>    >>
>>>    >> Thanks a lot in advance!
>>>    >>
>>>    >> Best wishes and stay healthy,
>>>    >> Thorsten
>>>    >>
>>>    >>
>>>    >>
>>>    >> ### Steps to reproduce the problem.
>>>    >>
>>>    >> R code:
>>>    >>
>>>    >> ```
>>>    >> library(rgdal) # 1.4-8
>>>    >> library(raster) # 3.1-5
>>>    >> library(tiff) # 0.1-5
>>>    >>
>>>    >> ## generate and manipulate a big raster dataset
>>>    >> # - generate
>>>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>>    >> 72000, 0, 48000))) # all fine
>>>    >>
>>>    >> # - manipulate
>>>    >> values(rDemTest) = 1 # all fine
>>>    >>
>>>    >> # - convert
>>>    >> mDemTest = raster::as.matrix(rDemTest) # all fine
>>>    >> str(mDemTest)
>>>    >>
>>>    >> ## save a big dataset
>>>    >>
>>>    >> # - as raster/gdal
>>>    >> sFileNameTiff = "BigData.tif"
>>>    >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
>>>    >> NAflag = -9999) # all fine
>>>    >>
>>>    >> # - as raster native
>>>    >> sFileNameNative = "BigData.grd"
>>>    >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite =
>>> TRUE,
>>>    >> NAflag = -9999) # all fine
>>>    >>
>>>    >>
>>>    >> ## load the big raster datasets with different packages and
>>> options
>>>    >> # - load the tiff data with the gdal package via the raster
>>> package
>>>    >> rDem = raster(sFileNameTiff) # all fine
>>>    >> extent(rDem) # all fine
>>>    >> mDem = raster::as.matrix(rDem) # error
>>>    >> rDem = readAll(rDem) # error
>>>    >>
>>>    >> # - load the native raster data with the raster package
>>>    >> rDem = raster(sFileNameNative) # all fine
>>>    >> extent(rDem) # all fine
>>>    >> mDem = raster::as.matrix(rDem) # all fine
>>>    >> str(mDem)
>>>    >>
>>>    >> # - load the tiff data with the tiff package
>>>    >> mDem = readTIFF(sFileNameTiff) # all fine
>>>    >> str(mDem)
>>>    >>
>>>    >> # - load the tiff data with the gdal package
>>>    >> sfDem = readGDAL(sFileNameTiff) # error
>>>    >>
>>>    >> # - load the native raster data with the gdal package
>>>    >> sfDem = readGDAL(sFileNameNative) # error
>>>    >>
>>>    >> ```
>>>    >>
>>>    >>
>>>    >> ### Startup messages when rgdal is attached (requested by Roger
>>>     Bivand)
>>>    >>>  library(rgdal)
>>>    >> rgdal: version: 1.4-8, (SVN revision 845)
>>>    >>  Geospatial Data Abstraction Library extensions to R
>>>     successfully loaded
>>>    >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>>>    >>  Path to GDAL shared files:
>>>    >>  GDAL binary built with GEOS: TRUE
>>>    >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>>>     [PJ_VERSION:
>>>    >> 631]
>>>    >>  Path to PROJ.4 shared files: (autodetected)
>>>    >>  Linking to sp version: 1.4-1
>>>    >>
>>>    >>
>>>    >> ### Session info
>>>    >>>  sessionInfo()
>>>    >> R version 4.0.0 (2020-04-24)
>>>    >> Platform: x86_64-pc-linux-gnu (64-bit)
>>>    >> Running under: Ubuntu 20.04 LTS
>>>    >>
>>>    >> Matrix products: default
>>>    >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>>>    >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>>>    >>
>>>    >> locale:
>>>    >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>>>    >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>>>    >> LC_MESSAGES=de_DE.UTF-8
>>>    >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>>>    >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>>>    >> LC_IDENTIFICATION=C
>>>    >>
>>>    >> attached base packages:
>>>    >> [1] stats     graphics  grDevices utils datasets methods base
>>>    >>
>>>    >> other attached packages:
>>>    >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
>>>    >> sp_1.4-1
>>>    >>
>>>    >> loaded via a namespace (and not attached):
>>>    >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
>>>    >> R.methodsS3_1.8.0 codetools_0.2-16
>>>    >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>>>    >> R.utils_2.9.2     R.oo_1.23.0
>>>    >> [11] lattice_0.20-41
>>>    >>
>>>    >>
>>>    >> ### gdalInfo
>>>    >>>  gdalinfo(sFileNameTiff)
>>>    >>  [1] "Driver: GTiff/GeoTIFF"
>>>    >>  [2] "Files: BigData.tif"
>>>    >>  [3] "Size is 72000, 48000"
>>>    >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>>>    >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>>>    >>  [6] "Image Structure Metadata:"
>>>    >>  [7] "  COMPRESSION=LZW"
>>>    >>  [8] "  INTERLEAVE=BAND"
>>>    >>  [9] "Corner Coordinates:"
>>>    >> [10] "Upper Left  (       0.000,   48000.000) "
>>>    >> [11] "Lower Left  (   0.0000000,   0.0000000) "
>>>    >> [12] "Upper Right (   72000.000,   48000.000) "
>>>    >> [13] "Lower Right (   72000.000,       0.000) "
>>>    >> [14] "Center      (   36000.000,   24000.000) "
>>>    >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>>>    >> [16] "  Min=1.000 Max=1.000 "
>>>    >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>>>    >> [18] "  NoData Value=-9999"
>>>    >> [19] "  Metadata:"
>>>    >> [20] "    STATISTICS_MAXIMUM=1"
>>>    >> [21] "    STATISTICS_MEAN=nan"
>>>    >> [22] "    STATISTICS_MINIMUM=1"
>>>    >> [23] "    STATISTICS_STDDEV=nan"
>>>    >>
>>>    >> _______________________________________________
>>>    >> R-sig-Geo mailing list
>>>    >> [hidden email] <mailto:[hidden email]>
>>>    >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>    >>
>>>    >>
>>>    >
>>>
>>>     _______________________________________________
>>>     R-sig-Geo mailing list
>>>     [hidden email] <mailto:[hidden email]>
>>>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>>     [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> [hidden email]
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
In reply to this post by Michael Sumner-2
Mike,

I'll look into terra as well. Overlooked it... Thanks!

My current code uses the raster package to load the datasets, hence I
would be happy if I do not have to change my code  :-)

I am trying to implement some multi-scale mapping in R/Rcpp (*Behrens,
T.*, Schmidt, K., MacMillan, R.A., Viscarra Rossel, R. (2018).
Multi-scale Digital Soil Mapping with deep learning. Scientific Reports
8,15244.). Reading and writing it piecewise does not really work for
this (especially for the coarse scales) and would complicate things a lot.

Cheers,

Thorsten


Am 28.04.2020 um 12:12 schrieb Michael Sumner:

> I reckon that's unlikely, it's also unlikely to scale or be fast
> enough ever  given what rgdal is - raster will happily crop sections
> of it out but I guess your workflow is to write the geotiff out in one
> go? You want rgdal because you want raster?  Have you tried terra? (I
> forgot to mention that one, and I haven't tried on that size GeoTIFF
> with it. It has its own internal connection to GDAL).
>
> Possibly you can instantiate a GeoTIFF with rgdal of the right size,
> then write to it piecewise - but if the convenience of raster is what
> you're after that's obviously no good. I think I'm missing some
> workflow constraint or motivation you have, but I'm interested (it's
> an awkward space).
>
> Cheers, Mike
>
> On Tue, Apr 28, 2020 at 7:14 PM Thorsten Behrens
> <[hidden email] <mailto:[hidden email]>>
> wrote:
>
>     Michael,
>
>     Thanks for the hint, it seems to work! Real-world tests will
>     follow in
>     the next few days...
>
>     So it definitely seems to be a problem of rgdal. It would be great
>     if it
>     could still be solved.
>
>     Best,
>
>     Thorsten
>
>
>
>     Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>     > Try stars it worked for me on a test
>     >
>     > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>     > <[hidden email]
>     <mailto:[hidden email]>
>     <mailto:[hidden email]
>     <mailto:[hidden email]>>>
>     > wrote:
>     >
>     >     Roger,
>     >
>     >     thanks a lot for your reply!
>     >
>     >     I have 256GB RAM installed (mentioned it somewhere). And there,
>     >     all is
>     >     fine when I run:
>     >
>     >     rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>     extent(c(0, 72000,
>     >
>     >     values(rDemTest) = 1
>     >
>     >     When limiting the memory to about 8GB with
>     >     ulimit::memory_limit(8000),
>     >     the max size which can be allocated seems to be around 10000 x
>     >     10000px.
>     >     In this case all tests run fine. Unfortunately it seems to be
>     >     related to
>     >     the size of the grid (48000 x 72000) and therefore the problem
>     >     can't be
>     >     reproduced on machines with 8GB RAM. For some processing
>     steps I need
>     >     grids of that size in the memory, which is why I have 256 GB
>     >     installed.
>     >
>     >     Normally, I use the raster package and not
>     rgdal::readGDAL(). This
>     >     was
>     >     just a desperate attempt to find the source of the problem.
>     >
>     >     This is what I use in my code:
>     >
>     >     rDem = raster(sFileNameTiff)
>     >     mDem = raster::as.matrix(rDem)
>     >
>     >     But maybe this is the same...
>     >
>     >     Any further suggestions are much appreciated!
>     >
>     >     Thanks again!
>     >
>     >     Best,
>     >
>     >     Thorsten
>     >
>     >
>     >
>     >
>     >     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>     >     > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>     >     >
>     >     >> Dear all,
>     >     >>
>     >     >> my problem is that I want to read a big geotiff raster
>     dataset
>     >     into R
>     >     >> and convert it to a matrix, which does not work.
>     >     >> The file is big but there is sufficient memory. I need
>     all the
>     >     data
>     >     >> in the memory at the same time.
>     >     >>
>     >     >> The error occurs under R 3.6.3 as well as 4.0.0 using
>     Ubuntu 20.04
>     >     >> LTS with the latest version of the packages (see session info
>     >     below)
>     >     >> and 256GB RAM installed.
>     >     >>
>     >     >> When loading the raster dataset using rgdal (via readGDAL or
>     >     >> raster::readAll) I get the follwoing error in R 4.0.0:
>     >     >>
>     >     >> ```
>     >     >> Error in rgdal::getRasterData(con, offset = offs,
>     region.dim =
>     >     reg,
>     >     >> band = object@data@band) :
>     >     >>   long vectors not supported yet: memory.c:3782
>     >     >> ```
>     >     >
>     >     > On a 16GB Fedora platform:
>     >     >
>     >     >> library(raster) # 3.1-5
>     >     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>     extent(c(0,
>     >     72000,
>     >     > 0,
>     >     > + 48000))) # all fine
>     >     >> rDemTest
>     >     > class      : RasterLayer
>     >     > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>     >     > resolution : 1, 1  (x, y)
>     >     > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>     >     > crs        : NA
>     >     >
>     >     >> values(rDemTest) = 1
>     >     > Error: cannot allocate vector of size 25.7 Gb
>     >     >
>     >     > So you are deceiving yourself into thinking that all is
>     fine at
>     >     this
>     >     > point. Please try to instantiate an example that can be
>     >     reproduced on
>     >     > a machine with 8GB RAM.
>     >     >
>     >     > Further note that rgdal::readGDAL() is not how you handle very
>     >     large
>     >     > objects in files, and never has been. raster can handle blocks
>     >     of data
>     >     > from bands in file; stars and gdalcubes can use proxy=TRUE
>     for the
>     >     > same purpose. Why did you choose rgdal::readGDAL() when
>     this is not
>     >     > its purpose?
>     >     >
>     >     > You did not say how much RAM is on your platform.
>     >     >
>     >     > Roger
>     >     >
>     >     >>
>     >     >> In R 3.6.3 is is "... memory.c:3717"
>     >     >>
>     >     >> However, I can load the same file with the tiff package and a
>     >     file of
>     >     >> the same size in the native raster package format (*.grd)
>     with the
>     >     >> raster package but again not with the rgdal package.
>     >     >>
>     >     >> gdalinfo (gdalUtils) does not complain (see below).
>     Hence, Even
>     >     >> Rouault assumes the problem is related to rgdal and not gdal
>     >     >> (https://github.com/OSGeo/gdal/issues/2442).
>     >     >>
>     >     >> Below you find reproducible code, which generates a
>     raster file,
>     >     >> saves the two formats (.tiff and .grd) and tries to read
>     them with
>     >     >> the different packages.
>     >     >>
>     >     >> Is this a known limitation? Any help is greatly appreciated!
>     >     >>
>     >     >> Thanks a lot in advance!
>     >     >>
>     >     >> Best wishes and stay healthy,
>     >     >> Thorsten
>     >     >>
>     >     >>
>     >     >>
>     >     >> ### Steps to reproduce the problem.
>     >     >>
>     >     >> R code:
>     >     >>
>     >     >> ```
>     >     >> library(rgdal) # 1.4-8
>     >     >> library(raster) # 3.1-5
>     >     >> library(tiff) # 0.1-5
>     >     >>
>     >     >> ## generate and manipulate a big raster dataset
>     >     >> # - generate
>     >     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>     extent(c(0,
>     >     >> 72000, 0, 48000))) # all fine
>     >     >>
>     >     >> # - manipulate
>     >     >> values(rDemTest) = 1 # all fine
>     >     >>
>     >     >> # - convert
>     >     >> mDemTest = raster::as.matrix(rDemTest) # all fine
>     >     >> str(mDemTest)
>     >     >>
>     >     >> ## save a big dataset
>     >     >>
>     >     >> # - as raster/gdal
>     >     >> sFileNameTiff = "BigData.tif"
>     >     >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite =
>     TRUE,
>     >     >> NAflag = -9999) # all fine
>     >     >>
>     >     >> # - as raster native
>     >     >> sFileNameNative = "BigData.grd"
>     >     >> writeRaster(rDemTest, sFileNameNative, "raster",
>     overwrite = TRUE,
>     >     >> NAflag = -9999) # all fine
>     >     >>
>     >     >>
>     >     >> ## load the big raster datasets with different packages
>     and options
>     >     >> # - load the tiff data with the gdal package via the
>     raster package
>     >     >> rDem = raster(sFileNameTiff) # all fine
>     >     >> extent(rDem) # all fine
>     >     >> mDem = raster::as.matrix(rDem) # error
>     >     >> rDem = readAll(rDem) # error
>     >     >>
>     >     >> # - load the native raster data with the raster package
>     >     >> rDem = raster(sFileNameNative) # all fine
>     >     >> extent(rDem) # all fine
>     >     >> mDem = raster::as.matrix(rDem) # all fine
>     >     >> str(mDem)
>     >     >>
>     >     >> # - load the tiff data with the tiff package
>     >     >> mDem = readTIFF(sFileNameTiff) # all fine
>     >     >> str(mDem)
>     >     >>
>     >     >> # - load the tiff data with the gdal package
>     >     >> sfDem = readGDAL(sFileNameTiff) # error
>     >     >>
>     >     >> # - load the native raster data with the gdal package
>     >     >> sfDem = readGDAL(sFileNameNative) # error
>     >     >>
>     >     >> ```
>     >     >>
>     >     >>
>     >     >> ### Startup messages when rgdal is attached (requested by
>     Roger
>     >     Bivand)
>     >     >>>  library(rgdal)
>     >     >> rgdal: version: 1.4-8, (SVN revision 845)
>     >     >>  Geospatial Data Abstraction Library extensions to R
>     >     successfully loaded
>     >     >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>     >     >>  Path to GDAL shared files:
>     >     >>  GDAL binary built with GEOS: TRUE
>     >     >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>     >     [PJ_VERSION:
>     >     >> 631]
>     >     >>  Path to PROJ.4 shared files: (autodetected)
>     >     >>  Linking to sp version: 1.4-1
>     >     >>
>     >     >>
>     >     >> ### Session info
>     >     >>>  sessionInfo()
>     >     >> R version 4.0.0 (2020-04-24)
>     >     >> Platform: x86_64-pc-linux-gnu (64-bit)
>     >     >> Running under: Ubuntu 20.04 LTS
>     >     >>
>     >     >> Matrix products: default
>     >     >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>     >     >> LAPACK:
>     /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>     >     >>
>     >     >> locale:
>     >     >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
>     LC_TIME=de_DE.UTF-8
>     >     >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>     >     >> LC_MESSAGES=de_DE.UTF-8
>     >     >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>     >     >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>     >     >> LC_IDENTIFICATION=C
>     >     >>
>     >     >> attached base packages:
>     >     >> [1] stats     graphics  grDevices utils datasets methods base
>     >     >>
>     >     >> other attached packages:
>     >     >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8 tiff_0.1-5 raster_3.1-5
>     >     >> sp_1.4-1
>     >     >>
>     >     >> loaded via a namespace (and not attached):
>     >     >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
>     >     >> R.methodsS3_1.8.0 codetools_0.2-16
>     >     >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>     >     >> R.utils_2.9.2     R.oo_1.23.0
>     >     >> [11] lattice_0.20-41
>     >     >>
>     >     >>
>     >     >> ### gdalInfo
>     >     >>>  gdalinfo(sFileNameTiff)
>     >     >>  [1] "Driver: GTiff/GeoTIFF"
>     >     >>  [2] "Files: BigData.tif"
>     >     >>  [3] "Size is 72000, 48000"
>     >     >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>     >     >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>     >     >>  [6] "Image Structure Metadata:"
>     >     >>  [7] "  COMPRESSION=LZW"
>     >     >>  [8] "  INTERLEAVE=BAND"
>     >     >>  [9] "Corner Coordinates:"
>     >     >> [10] "Upper Left  (       0.000, 48000.000) "
>     >     >> [11] "Lower Left  (   0.0000000, 0.0000000) "
>     >     >> [12] "Upper Right (   72000.000, 48000.000) "
>     >     >> [13] "Lower Right (   72000.000, 0.000) "
>     >     >> [14] "Center      (   36000.000, 24000.000) "
>     >     >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>     >     >> [16] "  Min=1.000 Max=1.000 "
>     >     >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>     >     >> [18] "  NoData Value=-9999"
>     >     >> [19] "  Metadata:"
>     >     >> [20] "    STATISTICS_MAXIMUM=1"
>     >     >> [21] "    STATISTICS_MEAN=nan"
>     >     >> [22] "    STATISTICS_MINIMUM=1"
>     >     >> [23] "    STATISTICS_STDDEV=nan"
>     >     >>
>     >     >> _______________________________________________
>     >     >> R-sig-Geo mailing list
>     >     >> [hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     >     >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>     >     >>
>     >     >>
>     >     >
>     >
>     >     _______________________________________________
>     >     R-sig-Geo mailing list
>     > [hidden email] <mailto:[hidden email]>
>     <mailto:[hidden email] <mailto:[hidden email]>>
>     > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>     >
>
>             [[alternative HTML version deleted]]
>
>     _______________________________________________
>     R-sig-Geo mailing list
>     [hidden email] <mailto:[hidden email]>
>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
>
> --
> Michael Sumner
> Software and Database Engineer
> Australian Antarctic Division
> Hobart, Australia
> e-mail: [hidden email] <mailto:[hidden email]>

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Mathias Moser
In reply to this post by Thorsten Behrens
Dear all,

sorry if this is an obvious observation (or even completely wrong), but
I got interested in what's happening here: I think you are hitting the
long vector limit of R [1] and from a look at rgdal's code this happens
because it uses 32-bit types INTSXP/REALSXP in RGDAL_GetRasterData()
which are limited to 2^31-1 n. Making getRasterData() work for your
data set would probably require patching rgdal for long vector support.

(read_gdal_data() of stars _seems_ to use R_xlen_t instead to
accomodate for 52bits [2]).

Best wishes,

Mathias


[1]
https://cloud.r-project.org/doc/manuals/r-patched/R-ints.html#Long-vectors
[2]
https://github.com/r-spatial/sf/blob/ea1bd716769ab8140d3451e3d902cfc79bc895d5/src/stars.cpp#L172


On Tue, 2020-04-28 at 13:39 +0200, Thorsten Behrens wrote:

> Roger and Mike,
>
> I really appreciate your help on this!
>
> I had a look at getRasterData(). It results in the same error. Hence,
> I
> made further tests where I compared grids with the following numbers
> of
> cols and rows:
>
> nPx = floor(sqrt(2^31 -1)) # 46340
>
> and
>
> nPx = ceiling(sqrt(2^31 -1)) # 46341
>
> The result is clear. Raster data with 46340 * 46340 px can be loaded
> with getRasterData() and with raster(), raster::as.matrix(), whereas
> datasets with 46341 * 46341 px cannot and result in the error:
>
> Error in getRasterData(gdCeil, band = 1) :
> long vectors not supported yet: memory.c:3782
>
> read_stars() works for both. You find the corresponding code below.
>
> Are there any other option I can try?
>
> Thorsten
>
>
> Reproducible code:
>
> ## generate raster datasets
> # 46340 * 46340 grid dataset
> sFileNameFloor = "Floor.tif"
>
> nPx = floor(sqrt(2^31 -1)) # 46340
> nPx
>
> rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> nPx)))
> values(rFloor) = 1
>
> writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE, NAflag
> =
> -9999)
>
>
> # 46341 * 46341 grid dataset
> sFileNameCeil = "Ceil.tif"
>
> nPx = ceiling(sqrt(2^31 -1))
> nPx
>
> rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> nPx)))
> values(rCeil) = 1
>
> writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE, NAflag
> =
> -9999)
>
>
> ## load raster datasets with different methods
>
> # load Ceil
> gdCeil = GDAL.open(sFileNameCeil)
> dim(gdCeil)
>
> vnCeil = getRasterData(gdCeil, band = 1) # error
>
> GDAL.close(gdCeil)
> str(vnCeil)
>
> stCeil = read_stars(sFileNameCeil) # all fine
> str(stCeil[[1]])
>
> rCeil = raster(sFileNameCeil)
> mCeil = raster::as.matrix(rCeil) # error
> str(mCeil)
>
>
> # load Floor
> gdFloor = GDAL.open(sFileNameFloor)
> dim(gdFloor)
>
> vnData = getRasterData(gdFloor, band = 1) # all fine
>
> GDAL.close(gdFloor)
> str(vnData)
>
> stFloor= read_stars(sFileNameFloor) # all fine
> str(stFloor[[1]])
>
> rFloor = raster(sFileNameFloor)
> mFloor = raster::as.matrix(rFloor) # all fine
> str(mFloor)
>
>
>
>
> Am 28.04.2020 um 12:10 schrieb Roger Bivand:
> > On Tue, 28 Apr 2020, Thorsten Behrens wrote:
> >
> > > Michael,
> > >
> > > Thanks for the hint, it seems to work! Real-world tests will
> > > follow in
> > > the next few days...
> > >
> > > So it definitely seems to be a problem of rgdal. It would be
> > > great if it
> > > could still be solved.
> >
> > Not rgdal, but your use of it. Try looking at a sequence of
> >
> > file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> > obj <- GDAL.open(file)
> > (dims <- dim(obj))
> > band <- 1
> > data_vector <- getRasterData(obj, band)
> > GDAL.close(obj)
> > str(data_vector)
> >
> > This does not create any more complicated objects, just a matrix.
> > In
> > some cases, the rows in the matrix are ordered S -> N, so it may
> > appear the wrong way up.
> >
> > rgdal::getRasterData() is lightweight, and has many arguments
> > which
> > may be helpful. rgdal::readGDAL() is heavyweight, creating a
> > SpatialGridDataFrame object. This involves much copying of data,
> > but
> > the output object can be used for example in mapping or analysis
> > without further conversion. My guess is that
> > rgdal::getRasterData()
> > gives you your matrix directly. Look at the R code to see how
> > as.is=
> > etc. work (files may include scale and offset values - recently a
> > user
> > was confused that scale and offset were "magically" applied to
> > convert
> > Uint16 values to degrees Kelvin on reading). For example, if as.is
> > ==
> > TRUE or scale == 1 and offset == 0, no copying of the input matrix
> > occurs because it is not converted. If you could check this route,
> > others following this thread could also benefit; if I'm wrong,
> > that
> > would also be good to know.
> >
> > Roger
> >
> >
> > > Best,
> > >
> > > Thorsten
> > >
> > >
> > >
> > > Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> > > > Try stars it worked for me on a test
> > > >
> > > > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> > > > <
> > > > [hidden email]
> > > >  <mailto:
> > > > [hidden email]
> > > > >>
> > > > wrote:
> > > >
> > > >     Roger,
> > > >
> > > >     thanks a lot for your reply!
> > > >
> > > >     I have 256GB RAM installed (mentioned it somewhere). And
> > > > there,
> > > >     all is
> > > >     fine when I run:
> > > >
> > > >     rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > extent(c(0,
> > > > 72000,
> > > >
> > > >     values(rDemTest) = 1
> > > >
> > > >     When limiting the memory to about 8GB with
> > > >     ulimit::memory_limit(8000),
> > > >     the max size which can be allocated seems to be around
> > > > 10000 x
> > > >     10000px.
> > > >     In this case all tests run fine. Unfortunately it seems to
> > > > be
> > > >     related to
> > > >     the size of the grid (48000 x 72000) and therefore the
> > > > problem
> > > >     can't be
> > > >     reproduced on machines with 8GB RAM. For some processing
> > > > steps I
> > > > need
> > > >     grids of that size in the memory, which is why I have 256
> > > > GB
> > > >     installed.
> > > >
> > > >     Normally, I use the raster package and not
> > > > rgdal::readGDAL(). This
> > > >     was
> > > >     just a desperate attempt to find the source of the problem.
> > > >
> > > >     This is what I use in my code:
> > > >
> > > >     rDem = raster(sFileNameTiff)
> > > >     mDem = raster::as.matrix(rDem)
> > > >
> > > >     But maybe this is the same...
> > > >
> > > >     Any further suggestions are much appreciated!
> > > >
> > > >     Thanks again!
> > > >
> > > >     Best,
> > > >
> > > >     Thorsten
> > > >
> > > >
> > > >
> > > >
> > > >     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> > > >    > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> > > >    >
> > > >    >> Dear all,
> > > >    >>
> > > >    >> my problem is that I want to read a big geotiff raster
> > > > dataset
> > > >     into R
> > > >    >> and convert it to a matrix, which does not work.
> > > >    >> The file is big but there is sufficient memory. I need
> > > > all the
> > > >     data
> > > >    >> in the memory at the same time.
> > > >    >>
> > > >    >> The error occurs under R 3.6.3 as well as 4.0.0 using
> > > > Ubuntu
> > > > 20.04
> > > >    >> LTS with the latest version of the packages (see session
> > > > info
> > > >     below)
> > > >    >> and 256GB RAM installed.
> > > >    >>
> > > >    >> When loading the raster dataset using rgdal (via readGDAL
> > > > or
> > > >    >> raster::readAll) I get the follwoing error in R 4.0.0:
> > > >    >>
> > > >    >> ```
> > > >    >> Error in rgdal::getRasterData(con, offset = offs,
> > > > region.dim =
> > > >     reg,
> > > >    >> band = object@data@band) :
> > > >    >>   long vectors not supported yet: memory.c:3782
> > > >    >> ```
> > > >    >
> > > >    > On a 16GB Fedora platform:
> > > >    >
> > > >    >> library(raster) # 3.1-5
> > > >    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > extent(c(0,
> > > >     72000,
> > > >    > 0,
> > > >    > + 48000))) # all fine
> > > >    >> rDemTest
> > > >    > class      : RasterLayer
> > > >    > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
> > > >    > resolution : 1, 1  (x, y)
> > > >    > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
> > > >    > crs        : NA
> > > >    >
> > > >    >> values(rDemTest) = 1
> > > >    > Error: cannot allocate vector of size 25.7 Gb
> > > >    >
> > > >    > So you are deceiving yourself into thinking that all is
> > > > fine at
> > > >     this
> > > >    > point. Please try to instantiate an example that can be
> > > >     reproduced on
> > > >    > a machine with 8GB RAM.
> > > >    >
> > > >    > Further note that rgdal::readGDAL() is not how you handle
> > > > very
> > > >     large
> > > >    > objects in files, and never has been. raster can handle
> > > > blocks
> > > >     of data
> > > >    > from bands in file; stars and gdalcubes can use proxy=TRUE
> > > > for the
> > > >    > same purpose. Why did you choose rgdal::readGDAL() when
> > > > this is
> > > > not
> > > >    > its purpose?
> > > >    >
> > > >    > You did not say how much RAM is on your platform.
> > > >    >
> > > >    > Roger
> > > >    >
> > > >    >>
> > > >    >> In R 3.6.3 is is "... memory.c:3717"
> > > >    >>
> > > >    >> However, I can load the same file with the tiff package
> > > > and a
> > > >     file of
> > > >    >> the same size in the native raster package format (*.grd)
> > > > with
> > > > the
> > > >    >> raster package but again not with the rgdal package.
> > > >    >>
> > > >    >> gdalinfo (gdalUtils) does not complain (see below).
> > > > Hence, Even
> > > >    >> Rouault assumes the problem is related to rgdal and not
> > > > gdal
> > > >    >> (
> > > > https://github.com/OSGeo/gdal/issues/2442
> > > > ).
> > > >    >>
> > > >    >> Below you find reproducible code, which generates a
> > > > raster file,
> > > >    >> saves the two formats (.tiff and .grd) and tries to read
> > > > them
> > > > with
> > > >    >> the different packages.
> > > >    >>
> > > >    >> Is this a known limitation? Any help is greatly
> > > > appreciated!
> > > >    >>
> > > >    >> Thanks a lot in advance!
> > > >    >>
> > > >    >> Best wishes and stay healthy,
> > > >    >> Thorsten
> > > >    >>
> > > >    >>
> > > >    >>
> > > >    >> ### Steps to reproduce the problem.
> > > >    >>
> > > >    >> R code:
> > > >    >>
> > > >    >> ```
> > > >    >> library(rgdal) # 1.4-8
> > > >    >> library(raster) # 3.1-5
> > > >    >> library(tiff) # 0.1-5
> > > >    >>
> > > >    >> ## generate and manipulate a big raster dataset
> > > >    >> # - generate
> > > >    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > extent(c(0,
> > > >    >> 72000, 0, 48000))) # all fine
> > > >    >>
> > > >    >> # - manipulate
> > > >    >> values(rDemTest) = 1 # all fine
> > > >    >>
> > > >    >> # - convert
> > > >    >> mDemTest = raster::as.matrix(rDemTest) # all fine
> > > >    >> str(mDemTest)
> > > >    >>
> > > >    >> ## save a big dataset
> > > >    >>
> > > >    >> # - as raster/gdal
> > > >    >> sFileNameTiff = "BigData.tif"
> > > >    >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite =
> > > > TRUE,
> > > >    >> NAflag = -9999) # all fine
> > > >    >>
> > > >    >> # - as raster native
> > > >    >> sFileNameNative = "BigData.grd"
> > > >    >> writeRaster(rDemTest, sFileNameNative, "raster",
> > > > overwrite =
> > > > TRUE,
> > > >    >> NAflag = -9999) # all fine
> > > >    >>
> > > >    >>
> > > >    >> ## load the big raster datasets with different packages
> > > > and
> > > > options
> > > >    >> # - load the tiff data with the gdal package via the
> > > > raster
> > > > package
> > > >    >> rDem = raster(sFileNameTiff) # all fine
> > > >    >> extent(rDem) # all fine
> > > >    >> mDem = raster::as.matrix(rDem) # error
> > > >    >> rDem = readAll(rDem) # error
> > > >    >>
> > > >    >> # - load the native raster data with the raster package
> > > >    >> rDem = raster(sFileNameNative) # all fine
> > > >    >> extent(rDem) # all fine
> > > >    >> mDem = raster::as.matrix(rDem) # all fine
> > > >    >> str(mDem)
> > > >    >>
> > > >    >> # - load the tiff data with the tiff package
> > > >    >> mDem = readTIFF(sFileNameTiff) # all fine
> > > >    >> str(mDem)
> > > >    >>
> > > >    >> # - load the tiff data with the gdal package
> > > >    >> sfDem = readGDAL(sFileNameTiff) # error
> > > >    >>
> > > >    >> # - load the native raster data with the gdal package
> > > >    >> sfDem = readGDAL(sFileNameNative) # error
> > > >    >>
> > > >    >> ```
> > > >    >>
> > > >    >>
> > > >    >> ### Startup messages when rgdal is attached (requested by
> > > > Roger
> > > >     Bivand)
> > > >    >>>  library(rgdal)
> > > >    >> rgdal: version: 1.4-8, (SVN revision 845)
> > > >    >>  Geospatial Data Abstraction Library extensions to R
> > > >     successfully loaded
> > > >    >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> > > >    >>  Path to GDAL shared files:
> > > >    >>  GDAL binary built with GEOS: TRUE
> > > >    >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
> > > >     [PJ_VERSION:
> > > >    >> 631]
> > > >    >>  Path to PROJ.4 shared files: (autodetected)
> > > >    >>  Linking to sp version: 1.4-1
> > > >    >>
> > > >    >>
> > > >    >> ### Session info
> > > >    >>>  sessionInfo()
> > > >    >> R version 4.0.0 (2020-04-24)
> > > >    >> Platform: x86_64-pc-linux-gnu (64-bit)
> > > >    >> Running under: Ubuntu 20.04 LTS
> > > >    >>
> > > >    >> Matrix products: default
> > > >    >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-
> > > > pthread/libblas.so.3
> > > >    >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-
> > > > pthread/liblapack.so.3
> > > >    >>
> > > >    >> locale:
> > > >    >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
> > > > LC_TIME=de_DE.UTF-8
> > > >    >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> > > >    >> LC_MESSAGES=de_DE.UTF-8
> > > >    >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> > > >    >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> > > >    >> LC_IDENTIFICATION=C
> > > >    >>
> > > >    >> attached base packages:
> > > >    >> [1] stats     graphics  grDevices utils datasets methods
> > > > base
> > > >    >>
> > > >    >> other attached packages:
> > > >    >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5
> > > > raster_3.1-5
> > > >    >> sp_1.4-1
> > > >    >>
> > > >    >> loaded via a namespace (and not attached):
> > > >    >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
> > > >    >> R.methodsS3_1.8.0 codetools_0.2-16
> > > >    >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
> > > >    >> R.utils_2.9.2     R.oo_1.23.0
> > > >    >> [11] lattice_0.20-41
> > > >    >>
> > > >    >>
> > > >    >> ### gdalInfo
> > > >    >>>  gdalinfo(sFileNameTiff)
> > > >    >>  [1] "Driver: GTiff/GeoTIFF"
> > > >    >>  [2] "Files: BigData.tif"
> > > >    >>  [3] "Size is 72000, 48000"
> > > >    >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
> > > >    >>  [5] "Pixel Size = (1.000000000000000,-
> > > > 1.000000000000000)"
> > > >    >>  [6] "Image Structure Metadata:"
> > > >    >>  [7] "  COMPRESSION=LZW"
> > > >    >>  [8] "  INTERLEAVE=BAND"
> > > >    >>  [9] "Corner Coordinates:"
> > > >    >> [10] "Upper Left  (       0.000,   48000.000) "
> > > >    >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> > > >    >> [12] "Upper Right (   72000.000,   48000.000) "
> > > >    >> [13] "Lower Right (   72000.000,       0.000) "
> > > >    >> [14] "Center      (   36000.000,   24000.000) "
> > > >    >> [15] "Band 1 Block=72000x1 Type=Float32,
> > > > ColorInterp=Gray"
> > > >    >> [16] "  Min=1.000 Max=1.000 "
> > > >    >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan,
> > > > StdDev=nan"
> > > >    >> [18] "  NoData Value=-9999"
> > > >    >> [19] "  Metadata:"
> > > >    >> [20] "    STATISTICS_MAXIMUM=1"
> > > >    >> [21] "    STATISTICS_MEAN=nan"
> > > >    >> [22] "    STATISTICS_MINIMUM=1"
> > > >    >> [23] "    STATISTICS_STDDEV=nan"
> > > >    >>
> > > >    >> _______________________________________________
> > > >    >> R-sig-Geo mailing list
> > > >    >>
> > > > [hidden email]
> > > >  <mailto:
> > > > [hidden email]
> > > > >
> > > >    >>
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > >
> > > >    >>
> > > >    >>
> > > >    >
> > > >
> > > >     _______________________________________________
> > > >     R-sig-Geo mailing list
> > > >    
> > > > [hidden email]
> > > >  <mailto:
> > > > [hidden email]
> > > > >
> > > >    
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > >
> > > >
> > >
> > >     [[alternative HTML version deleted]]
> > >
> > > _______________________________________________
> > > R-sig-Geo mailing list
> > > [hidden email]
> > >
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > >
> > >
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
>
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
Hi Mathias and all,

thanks! This confirms my test with the grid sizes related to 2^31-1

My current solution is to use

mLargeRaster = read_stars("Raster.tif")[[1]]

instead of

rLargeRaster = raster("Raster.tif")
mLargeRaster = raster::as.matrix(rLargeRaster)

Cheers,
Thorsten



Am 28.04.2020 um 16:29 schrieb Mathias Moser:

> Dear all,
>
> sorry if this is an obvious observation (or even completely wrong), but
> I got interested in what's happening here: I think you are hitting the
> long vector limit of R [1] and from a look at rgdal's code this happens
> because it uses 32-bit types INTSXP/REALSXP in RGDAL_GetRasterData()
> which are limited to 2^31-1 n. Making getRasterData() work for your
> data set would probably require patching rgdal for long vector support.
>
> (read_gdal_data() of stars _seems_ to use R_xlen_t instead to
> accomodate for 52bits [2]).
>
> Best wishes,
>
> Mathias
>
>
> [1]
> https://cloud.r-project.org/doc/manuals/r-patched/R-ints.html#Long-vectors
> [2]
> https://github.com/r-spatial/sf/blob/ea1bd716769ab8140d3451e3d902cfc79bc895d5/src/stars.cpp#L172
>
>
> On Tue, 2020-04-28 at 13:39 +0200, Thorsten Behrens wrote:
>> Roger and Mike,
>>
>> I really appreciate your help on this!
>>
>> I had a look at getRasterData(). It results in the same error. Hence,
>> I
>> made further tests where I compared grids with the following numbers
>> of
>> cols and rows:
>>
>> nPx = floor(sqrt(2^31 -1)) # 46340
>>
>> and
>>
>> nPx = ceiling(sqrt(2^31 -1)) # 46341
>>
>> The result is clear. Raster data with 46340 * 46340 px can be loaded
>> with getRasterData() and with raster(), raster::as.matrix(), whereas
>> datasets with 46341 * 46341 px cannot and result in the error:
>>
>> Error in getRasterData(gdCeil, band = 1) :
>> long vectors not supported yet: memory.c:3782
>>
>> read_stars() works for both. You find the corresponding code below.
>>
>> Are there any other option I can try?
>>
>> Thorsten
>>
>>
>> Reproducible code:
>>
>> ## generate raster datasets
>> # 46340 * 46340 grid dataset
>> sFileNameFloor = "Floor.tif"
>>
>> nPx = floor(sqrt(2^31 -1)) # 46340
>> nPx
>>
>> rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
>> nPx)))
>> values(rFloor) = 1
>>
>> writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE, NAflag
>> =
>> -9999)
>>
>>
>> # 46341 * 46341 grid dataset
>> sFileNameCeil = "Ceil.tif"
>>
>> nPx = ceiling(sqrt(2^31 -1))
>> nPx
>>
>> rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
>> nPx)))
>> values(rCeil) = 1
>>
>> writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE, NAflag
>> =
>> -9999)
>>
>>
>> ## load raster datasets with different methods
>>
>> # load Ceil
>> gdCeil = GDAL.open(sFileNameCeil)
>> dim(gdCeil)
>>
>> vnCeil = getRasterData(gdCeil, band = 1) # error
>>
>> GDAL.close(gdCeil)
>> str(vnCeil)
>>
>> stCeil = read_stars(sFileNameCeil) # all fine
>> str(stCeil[[1]])
>>
>> rCeil = raster(sFileNameCeil)
>> mCeil = raster::as.matrix(rCeil) # error
>> str(mCeil)
>>
>>
>> # load Floor
>> gdFloor = GDAL.open(sFileNameFloor)
>> dim(gdFloor)
>>
>> vnData = getRasterData(gdFloor, band = 1) # all fine
>>
>> GDAL.close(gdFloor)
>> str(vnData)
>>
>> stFloor= read_stars(sFileNameFloor) # all fine
>> str(stFloor[[1]])
>>
>> rFloor = raster(sFileNameFloor)
>> mFloor = raster::as.matrix(rFloor) # all fine
>> str(mFloor)
>>
>>
>>
>>
>> Am 28.04.2020 um 12:10 schrieb Roger Bivand:
>>> On Tue, 28 Apr 2020, Thorsten Behrens wrote:
>>>
>>>> Michael,
>>>>
>>>> Thanks for the hint, it seems to work! Real-world tests will
>>>> follow in
>>>> the next few days...
>>>>
>>>> So it definitely seems to be a problem of rgdal. It would be
>>>> great if it
>>>> could still be solved.
>>> Not rgdal, but your use of it. Try looking at a sequence of
>>>
>>> file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
>>> obj <- GDAL.open(file)
>>> (dims <- dim(obj))
>>> band <- 1
>>> data_vector <- getRasterData(obj, band)
>>> GDAL.close(obj)
>>> str(data_vector)
>>>
>>> This does not create any more complicated objects, just a matrix.
>>> In
>>> some cases, the rows in the matrix are ordered S -> N, so it may
>>> appear the wrong way up.
>>>
>>> rgdal::getRasterData() is lightweight, and has many arguments
>>> which
>>> may be helpful. rgdal::readGDAL() is heavyweight, creating a
>>> SpatialGridDataFrame object. This involves much copying of data,
>>> but
>>> the output object can be used for example in mapping or analysis
>>> without further conversion. My guess is that
>>> rgdal::getRasterData()
>>> gives you your matrix directly. Look at the R code to see how
>>> as.is=
>>> etc. work (files may include scale and offset values - recently a
>>> user
>>> was confused that scale and offset were "magically" applied to
>>> convert
>>> Uint16 values to degrees Kelvin on reading). For example, if as.is
>>> ==
>>> TRUE or scale == 1 and offset == 0, no copying of the input matrix
>>> occurs because it is not converted. If you could check this route,
>>> others following this thread could also benefit; if I'm wrong,
>>> that
>>> would also be good to know.
>>>
>>> Roger
>>>
>>>
>>>> Best,
>>>>
>>>> Thorsten
>>>>
>>>>
>>>>
>>>> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>>>>> Try stars it worked for me on a test
>>>>>
>>>>> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>>>>> <
>>>>> [hidden email]
>>>>>   <mailto:
>>>>> [hidden email]
>>>>> wrote:
>>>>>
>>>>>      Roger,
>>>>>
>>>>>      thanks a lot for your reply!
>>>>>
>>>>>      I have 256GB RAM installed (mentioned it somewhere). And
>>>>> there,
>>>>>      all is
>>>>>      fine when I run:
>>>>>
>>>>>      rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>>>>> extent(c(0,
>>>>> 72000,
>>>>>
>>>>>      values(rDemTest) = 1
>>>>>
>>>>>      When limiting the memory to about 8GB with
>>>>>      ulimit::memory_limit(8000),
>>>>>      the max size which can be allocated seems to be around
>>>>> 10000 x
>>>>>      10000px.
>>>>>      In this case all tests run fine. Unfortunately it seems to
>>>>> be
>>>>>      related to
>>>>>      the size of the grid (48000 x 72000) and therefore the
>>>>> problem
>>>>>      can't be
>>>>>      reproduced on machines with 8GB RAM. For some processing
>>>>> steps I
>>>>> need
>>>>>      grids of that size in the memory, which is why I have 256
>>>>> GB
>>>>>      installed.
>>>>>
>>>>>      Normally, I use the raster package and not
>>>>> rgdal::readGDAL(). This
>>>>>      was
>>>>>      just a desperate attempt to find the source of the problem.
>>>>>
>>>>>      This is what I use in my code:
>>>>>
>>>>>      rDem = raster(sFileNameTiff)
>>>>>      mDem = raster::as.matrix(rDem)
>>>>>
>>>>>      But maybe this is the same...
>>>>>
>>>>>      Any further suggestions are much appreciated!
>>>>>
>>>>>      Thanks again!
>>>>>
>>>>>      Best,
>>>>>
>>>>>      Thorsten
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>      Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>>>>>     > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>>>>>     >
>>>>>     >> Dear all,
>>>>>     >>
>>>>>     >> my problem is that I want to read a big geotiff raster
>>>>> dataset
>>>>>      into R
>>>>>     >> and convert it to a matrix, which does not work.
>>>>>     >> The file is big but there is sufficient memory. I need
>>>>> all the
>>>>>      data
>>>>>     >> in the memory at the same time.
>>>>>     >>
>>>>>     >> The error occurs under R 3.6.3 as well as 4.0.0 using
>>>>> Ubuntu
>>>>> 20.04
>>>>>     >> LTS with the latest version of the packages (see session
>>>>> info
>>>>>      below)
>>>>>     >> and 256GB RAM installed.
>>>>>     >>
>>>>>     >> When loading the raster dataset using rgdal (via readGDAL
>>>>> or
>>>>>     >> raster::readAll) I get the follwoing error in R 4.0.0:
>>>>>     >>
>>>>>     >> ```
>>>>>     >> Error in rgdal::getRasterData(con, offset = offs,
>>>>> region.dim =
>>>>>      reg,
>>>>>     >> band = object@data@band) :
>>>>>     >>   long vectors not supported yet: memory.c:3782
>>>>>     >> ```
>>>>>     >
>>>>>     > On a 16GB Fedora platform:
>>>>>     >
>>>>>     >> library(raster) # 3.1-5
>>>>>     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>>>>> extent(c(0,
>>>>>      72000,
>>>>>     > 0,
>>>>>     > + 48000))) # all fine
>>>>>     >> rDemTest
>>>>>     > class      : RasterLayer
>>>>>     > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>>>>>     > resolution : 1, 1  (x, y)
>>>>>     > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>>>>>     > crs        : NA
>>>>>     >
>>>>>     >> values(rDemTest) = 1
>>>>>     > Error: cannot allocate vector of size 25.7 Gb
>>>>>     >
>>>>>     > So you are deceiving yourself into thinking that all is
>>>>> fine at
>>>>>      this
>>>>>     > point. Please try to instantiate an example that can be
>>>>>      reproduced on
>>>>>     > a machine with 8GB RAM.
>>>>>     >
>>>>>     > Further note that rgdal::readGDAL() is not how you handle
>>>>> very
>>>>>      large
>>>>>     > objects in files, and never has been. raster can handle
>>>>> blocks
>>>>>      of data
>>>>>     > from bands in file; stars and gdalcubes can use proxy=TRUE
>>>>> for the
>>>>>     > same purpose. Why did you choose rgdal::readGDAL() when
>>>>> this is
>>>>> not
>>>>>     > its purpose?
>>>>>     >
>>>>>     > You did not say how much RAM is on your platform.
>>>>>     >
>>>>>     > Roger
>>>>>     >
>>>>>     >>
>>>>>     >> In R 3.6.3 is is "... memory.c:3717"
>>>>>     >>
>>>>>     >> However, I can load the same file with the tiff package
>>>>> and a
>>>>>      file of
>>>>>     >> the same size in the native raster package format (*.grd)
>>>>> with
>>>>> the
>>>>>     >> raster package but again not with the rgdal package.
>>>>>     >>
>>>>>     >> gdalinfo (gdalUtils) does not complain (see below).
>>>>> Hence, Even
>>>>>     >> Rouault assumes the problem is related to rgdal and not
>>>>> gdal
>>>>>     >> (
>>>>> https://github.com/OSGeo/gdal/issues/2442
>>>>> ).
>>>>>     >>
>>>>>     >> Below you find reproducible code, which generates a
>>>>> raster file,
>>>>>     >> saves the two formats (.tiff and .grd) and tries to read
>>>>> them
>>>>> with
>>>>>     >> the different packages.
>>>>>     >>
>>>>>     >> Is this a known limitation? Any help is greatly
>>>>> appreciated!
>>>>>     >>
>>>>>     >> Thanks a lot in advance!
>>>>>     >>
>>>>>     >> Best wishes and stay healthy,
>>>>>     >> Thorsten
>>>>>     >>
>>>>>     >>
>>>>>     >>
>>>>>     >> ### Steps to reproduce the problem.
>>>>>     >>
>>>>>     >> R code:
>>>>>     >>
>>>>>     >> ```
>>>>>     >> library(rgdal) # 1.4-8
>>>>>     >> library(raster) # 3.1-5
>>>>>     >> library(tiff) # 0.1-5
>>>>>     >>
>>>>>     >> ## generate and manipulate a big raster dataset
>>>>>     >> # - generate
>>>>>     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>>>>> extent(c(0,
>>>>>     >> 72000, 0, 48000))) # all fine
>>>>>     >>
>>>>>     >> # - manipulate
>>>>>     >> values(rDemTest) = 1 # all fine
>>>>>     >>
>>>>>     >> # - convert
>>>>>     >> mDemTest = raster::as.matrix(rDemTest) # all fine
>>>>>     >> str(mDemTest)
>>>>>     >>
>>>>>     >> ## save a big dataset
>>>>>     >>
>>>>>     >> # - as raster/gdal
>>>>>     >> sFileNameTiff = "BigData.tif"
>>>>>     >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite =
>>>>> TRUE,
>>>>>     >> NAflag = -9999) # all fine
>>>>>     >>
>>>>>     >> # - as raster native
>>>>>     >> sFileNameNative = "BigData.grd"
>>>>>     >> writeRaster(rDemTest, sFileNameNative, "raster",
>>>>> overwrite =
>>>>> TRUE,
>>>>>     >> NAflag = -9999) # all fine
>>>>>     >>
>>>>>     >>
>>>>>     >> ## load the big raster datasets with different packages
>>>>> and
>>>>> options
>>>>>     >> # - load the tiff data with the gdal package via the
>>>>> raster
>>>>> package
>>>>>     >> rDem = raster(sFileNameTiff) # all fine
>>>>>     >> extent(rDem) # all fine
>>>>>     >> mDem = raster::as.matrix(rDem) # error
>>>>>     >> rDem = readAll(rDem) # error
>>>>>     >>
>>>>>     >> # - load the native raster data with the raster package
>>>>>     >> rDem = raster(sFileNameNative) # all fine
>>>>>     >> extent(rDem) # all fine
>>>>>     >> mDem = raster::as.matrix(rDem) # all fine
>>>>>     >> str(mDem)
>>>>>     >>
>>>>>     >> # - load the tiff data with the tiff package
>>>>>     >> mDem = readTIFF(sFileNameTiff) # all fine
>>>>>     >> str(mDem)
>>>>>     >>
>>>>>     >> # - load the tiff data with the gdal package
>>>>>     >> sfDem = readGDAL(sFileNameTiff) # error
>>>>>     >>
>>>>>     >> # - load the native raster data with the gdal package
>>>>>     >> sfDem = readGDAL(sFileNameNative) # error
>>>>>     >>
>>>>>     >> ```
>>>>>     >>
>>>>>     >>
>>>>>     >> ### Startup messages when rgdal is attached (requested by
>>>>> Roger
>>>>>      Bivand)
>>>>>     >>>  library(rgdal)
>>>>>     >> rgdal: version: 1.4-8, (SVN revision 845)
>>>>>     >>  Geospatial Data Abstraction Library extensions to R
>>>>>      successfully loaded
>>>>>     >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>>>>>     >>  Path to GDAL shared files:
>>>>>     >>  GDAL binary built with GEOS: TRUE
>>>>>     >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>>>>>      [PJ_VERSION:
>>>>>     >> 631]
>>>>>     >>  Path to PROJ.4 shared files: (autodetected)
>>>>>     >>  Linking to sp version: 1.4-1
>>>>>     >>
>>>>>     >>
>>>>>     >> ### Session info
>>>>>     >>>  sessionInfo()
>>>>>     >> R version 4.0.0 (2020-04-24)
>>>>>     >> Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>     >> Running under: Ubuntu 20.04 LTS
>>>>>     >>
>>>>>     >> Matrix products: default
>>>>>     >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-
>>>>> pthread/libblas.so.3
>>>>>     >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-
>>>>> pthread/liblapack.so.3
>>>>>     >>
>>>>>     >> locale:
>>>>>     >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
>>>>> LC_TIME=de_DE.UTF-8
>>>>>     >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>>>>>     >> LC_MESSAGES=de_DE.UTF-8
>>>>>     >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>>>>>     >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>>>>>     >> LC_IDENTIFICATION=C
>>>>>     >>
>>>>>     >> attached base packages:
>>>>>     >> [1] stats     graphics  grDevices utils datasets methods
>>>>> base
>>>>>     >>
>>>>>     >> other attached packages:
>>>>>     >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5
>>>>> raster_3.1-5
>>>>>     >> sp_1.4-1
>>>>>     >>
>>>>>     >> loaded via a namespace (and not attached):
>>>>>     >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
>>>>>     >> R.methodsS3_1.8.0 codetools_0.2-16
>>>>>     >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>>>>>     >> R.utils_2.9.2     R.oo_1.23.0
>>>>>     >> [11] lattice_0.20-41
>>>>>     >>
>>>>>     >>
>>>>>     >> ### gdalInfo
>>>>>     >>>  gdalinfo(sFileNameTiff)
>>>>>     >>  [1] "Driver: GTiff/GeoTIFF"
>>>>>     >>  [2] "Files: BigData.tif"
>>>>>     >>  [3] "Size is 72000, 48000"
>>>>>     >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>>>>>     >>  [5] "Pixel Size = (1.000000000000000,-
>>>>> 1.000000000000000)"
>>>>>     >>  [6] "Image Structure Metadata:"
>>>>>     >>  [7] "  COMPRESSION=LZW"
>>>>>     >>  [8] "  INTERLEAVE=BAND"
>>>>>     >>  [9] "Corner Coordinates:"
>>>>>     >> [10] "Upper Left  (       0.000,   48000.000) "
>>>>>     >> [11] "Lower Left  (   0.0000000,   0.0000000) "
>>>>>     >> [12] "Upper Right (   72000.000,   48000.000) "
>>>>>     >> [13] "Lower Right (   72000.000,       0.000) "
>>>>>     >> [14] "Center      (   36000.000,   24000.000) "
>>>>>     >> [15] "Band 1 Block=72000x1 Type=Float32,
>>>>> ColorInterp=Gray"
>>>>>     >> [16] "  Min=1.000 Max=1.000 "
>>>>>     >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan,
>>>>> StdDev=nan"
>>>>>     >> [18] "  NoData Value=-9999"
>>>>>     >> [19] "  Metadata:"
>>>>>     >> [20] "    STATISTICS_MAXIMUM=1"
>>>>>     >> [21] "    STATISTICS_MEAN=nan"
>>>>>     >> [22] "    STATISTICS_MINIMUM=1"
>>>>>     >> [23] "    STATISTICS_STDDEV=nan"
>>>>>     >>
>>>>>     >> _______________________________________________
>>>>>     >> R-sig-Geo mailing list
>>>>>     >>
>>>>> [hidden email]
>>>>>   <mailto:
>>>>> [hidden email]
>>>>>     >>
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>>     >>
>>>>>     >>
>>>>>     >
>>>>>
>>>>>      _______________________________________________
>>>>>      R-sig-Geo mailing list
>>>>>      
>>>>> [hidden email]
>>>>>   <mailto:
>>>>> [hidden email]
>>>>>      
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>>
>>>>      [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> [hidden email]
>>>>
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> [hidden email]
>>
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Roger Bivand
Administrator
In reply to this post by Mathias Moser
On Tue, 28 Apr 2020, Mathias Moser wrote:

> Dear all,
>
> sorry if this is an obvious observation (or even completely wrong), but
> I got interested in what's happening here: I think you are hitting the
> long vector limit of R [1] and from a look at rgdal's code this happens
> because it uses 32-bit types INTSXP/REALSXP in RGDAL_GetRasterData()
> which are limited to 2^31-1 n. Making getRasterData() work for your
> data set would probably require patching rgdal for long vector support.

This looks probable, I'd got to the same point, as now I can max out swap
on assigning values. Changes to lines 1574 and 1664 in
src/gdal-bindings.cpp committed to SVN, revision 961, will be installable
with

https://r-forge.r-project.org/R/?group_id=884 and if status green,
install.packages("rgdal", repos="http://R-Forge.R-project.org")

in about 2 hours (R-Forge builds tarballs, slowly), or source tree now by

svn checkout svn://scm.r-forge.r-project.org/svnroot/rgdal/

I cannot check this because I have no access to a platform with enough
RAM, so I need help here. I haven't been able to confirm that
LENGTH(sRStorage) returns an R_xlen_t, or double, in lines 1672, 1685 or
1693. Is there a way of generating a large file using a GDAL app, perhaps,
then I could just try reading a larger file if LENGTH plays up?

Roger

>
> (read_gdal_data() of stars _seems_ to use R_xlen_t instead to
> accomodate for 52bits [2]).
>
> Best wishes,
>
> Mathias
>
>
> [1]
> https://cloud.r-project.org/doc/manuals/r-patched/R-ints.html#Long-vectors
> [2]
> https://github.com/r-spatial/sf/blob/ea1bd716769ab8140d3451e3d902cfc79bc895d5/src/stars.cpp#L172
>
>
> On Tue, 2020-04-28 at 13:39 +0200, Thorsten Behrens wrote:
>> Roger and Mike,
>>
>> I really appreciate your help on this!
>>
>> I had a look at getRasterData(). It results in the same error. Hence,
>> I
>> made further tests where I compared grids with the following numbers
>> of
>> cols and rows:
>>
>> nPx = floor(sqrt(2^31 -1)) # 46340
>>
>> and
>>
>> nPx = ceiling(sqrt(2^31 -1)) # 46341
>>
>> The result is clear. Raster data with 46340 * 46340 px can be loaded
>> with getRasterData() and with raster(), raster::as.matrix(), whereas
>> datasets with 46341 * 46341 px cannot and result in the error:
>>
>> Error in getRasterData(gdCeil, band = 1) :
>> long vectors not supported yet: memory.c:3782
>>
>> read_stars() works for both. You find the corresponding code below.
>>
>> Are there any other option I can try?
>>
>> Thorsten
>>
>>
>> Reproducible code:
>>
>> ## generate raster datasets
>> # 46340 * 46340 grid dataset
>> sFileNameFloor = "Floor.tif"
>>
>> nPx = floor(sqrt(2^31 -1)) # 46340
>> nPx
>>
>> rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
>> nPx)))
>> values(rFloor) = 1
>>
>> writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE, NAflag
>> =
>> -9999)
>>
>>
>> # 46341 * 46341 grid dataset
>> sFileNameCeil = "Ceil.tif"
>>
>> nPx = ceiling(sqrt(2^31 -1))
>> nPx
>>
>> rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
>> nPx)))
>> values(rCeil) = 1
>>
>> writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE, NAflag
>> =
>> -9999)
>>
>>
>> ## load raster datasets with different methods
>>
>> # load Ceil
>> gdCeil = GDAL.open(sFileNameCeil)
>> dim(gdCeil)
>>
>> vnCeil = getRasterData(gdCeil, band = 1) # error
>>
>> GDAL.close(gdCeil)
>> str(vnCeil)
>>
>> stCeil = read_stars(sFileNameCeil) # all fine
>> str(stCeil[[1]])
>>
>> rCeil = raster(sFileNameCeil)
>> mCeil = raster::as.matrix(rCeil) # error
>> str(mCeil)
>>
>>
>> # load Floor
>> gdFloor = GDAL.open(sFileNameFloor)
>> dim(gdFloor)
>>
>> vnData = getRasterData(gdFloor, band = 1) # all fine
>>
>> GDAL.close(gdFloor)
>> str(vnData)
>>
>> stFloor= read_stars(sFileNameFloor) # all fine
>> str(stFloor[[1]])
>>
>> rFloor = raster(sFileNameFloor)
>> mFloor = raster::as.matrix(rFloor) # all fine
>> str(mFloor)
>>
>>
>>
>>
>> Am 28.04.2020 um 12:10 schrieb Roger Bivand:
>>> On Tue, 28 Apr 2020, Thorsten Behrens wrote:
>>>
>>>> Michael,
>>>>
>>>> Thanks for the hint, it seems to work! Real-world tests will
>>>> follow in
>>>> the next few days...
>>>>
>>>> So it definitely seems to be a problem of rgdal. It would be
>>>> great if it
>>>> could still be solved.
>>>
>>> Not rgdal, but your use of it. Try looking at a sequence of
>>>
>>> file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
>>> obj <- GDAL.open(file)
>>> (dims <- dim(obj))
>>> band <- 1
>>> data_vector <- getRasterData(obj, band)
>>> GDAL.close(obj)
>>> str(data_vector)
>>>
>>> This does not create any more complicated objects, just a matrix.
>>> In
>>> some cases, the rows in the matrix are ordered S -> N, so it may
>>> appear the wrong way up.
>>>
>>> rgdal::getRasterData() is lightweight, and has many arguments
>>> which
>>> may be helpful. rgdal::readGDAL() is heavyweight, creating a
>>> SpatialGridDataFrame object. This involves much copying of data,
>>> but
>>> the output object can be used for example in mapping or analysis
>>> without further conversion. My guess is that
>>> rgdal::getRasterData()
>>> gives you your matrix directly. Look at the R code to see how
>>> as.is=
>>> etc. work (files may include scale and offset values - recently a
>>> user
>>> was confused that scale and offset were "magically" applied to
>>> convert
>>> Uint16 values to degrees Kelvin on reading). For example, if as.is
>>> ==
>>> TRUE or scale == 1 and offset == 0, no copying of the input matrix
>>> occurs because it is not converted. If you could check this route,
>>> others following this thread could also benefit; if I'm wrong,
>>> that
>>> would also be good to know.
>>>
>>> Roger
>>>
>>>
>>>> Best,
>>>>
>>>> Thorsten
>>>>
>>>>
>>>>
>>>> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>>>>> Try stars it worked for me on a test
>>>>>
>>>>> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>>>>> <
>>>>> [hidden email]
>>>>>  <mailto:
>>>>> [hidden email]
>>>>>>>
>>>>> wrote:
>>>>>
>>>>>     Roger,
>>>>>
>>>>>     thanks a lot for your reply!
>>>>>
>>>>>     I have 256GB RAM installed (mentioned it somewhere). And
>>>>> there,
>>>>>     all is
>>>>>     fine when I run:
>>>>>
>>>>>     rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>>>>> extent(c(0,
>>>>> 72000,
>>>>>
>>>>>     values(rDemTest) = 1
>>>>>
>>>>>     When limiting the memory to about 8GB with
>>>>>     ulimit::memory_limit(8000),
>>>>>     the max size which can be allocated seems to be around
>>>>> 10000 x
>>>>>     10000px.
>>>>>     In this case all tests run fine. Unfortunately it seems to
>>>>> be
>>>>>     related to
>>>>>     the size of the grid (48000 x 72000) and therefore the
>>>>> problem
>>>>>     can't be
>>>>>     reproduced on machines with 8GB RAM. For some processing
>>>>> steps I
>>>>> need
>>>>>     grids of that size in the memory, which is why I have 256
>>>>> GB
>>>>>     installed.
>>>>>
>>>>>     Normally, I use the raster package and not
>>>>> rgdal::readGDAL(). This
>>>>>     was
>>>>>     just a desperate attempt to find the source of the problem.
>>>>>
>>>>>     This is what I use in my code:
>>>>>
>>>>>     rDem = raster(sFileNameTiff)
>>>>>     mDem = raster::as.matrix(rDem)
>>>>>
>>>>>     But maybe this is the same...
>>>>>
>>>>>     Any further suggestions are much appreciated!
>>>>>
>>>>>     Thanks again!
>>>>>
>>>>>     Best,
>>>>>
>>>>>     Thorsten
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>>>>>   > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>>>>>   >
>>>>>   >> Dear all,
>>>>>   >>
>>>>>   >> my problem is that I want to read a big geotiff raster
>>>>> dataset
>>>>>     into R
>>>>>   >> and convert it to a matrix, which does not work.
>>>>>   >> The file is big but there is sufficient memory. I need
>>>>> all the
>>>>>     data
>>>>>   >> in the memory at the same time.
>>>>>   >>
>>>>>   >> The error occurs under R 3.6.3 as well as 4.0.0 using
>>>>> Ubuntu
>>>>> 20.04
>>>>>   >> LTS with the latest version of the packages (see session
>>>>> info
>>>>>     below)
>>>>>   >> and 256GB RAM installed.
>>>>>   >>
>>>>>   >> When loading the raster dataset using rgdal (via readGDAL
>>>>> or
>>>>>   >> raster::readAll) I get the follwoing error in R 4.0.0:
>>>>>   >>
>>>>>   >> ```
>>>>>   >> Error in rgdal::getRasterData(con, offset = offs,
>>>>> region.dim =
>>>>>     reg,
>>>>>   >> band = object@data@band) :
>>>>>   >>   long vectors not supported yet: memory.c:3782
>>>>>   >> ```
>>>>>   >
>>>>>   > On a 16GB Fedora platform:
>>>>>   >
>>>>>   >> library(raster) # 3.1-5
>>>>>   >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>>>>> extent(c(0,
>>>>>     72000,
>>>>>   > 0,
>>>>>   > + 48000))) # all fine
>>>>>   >> rDemTest
>>>>>   > class      : RasterLayer
>>>>>   > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>>>>>   > resolution : 1, 1  (x, y)
>>>>>   > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>>>>>   > crs        : NA
>>>>>   >
>>>>>   >> values(rDemTest) = 1
>>>>>   > Error: cannot allocate vector of size 25.7 Gb
>>>>>   >
>>>>>   > So you are deceiving yourself into thinking that all is
>>>>> fine at
>>>>>     this
>>>>>   > point. Please try to instantiate an example that can be
>>>>>     reproduced on
>>>>>   > a machine with 8GB RAM.
>>>>>   >
>>>>>   > Further note that rgdal::readGDAL() is not how you handle
>>>>> very
>>>>>     large
>>>>>   > objects in files, and never has been. raster can handle
>>>>> blocks
>>>>>     of data
>>>>>   > from bands in file; stars and gdalcubes can use proxy=TRUE
>>>>> for the
>>>>>   > same purpose. Why did you choose rgdal::readGDAL() when
>>>>> this is
>>>>> not
>>>>>   > its purpose?
>>>>>   >
>>>>>   > You did not say how much RAM is on your platform.
>>>>>   >
>>>>>   > Roger
>>>>>   >
>>>>>   >>
>>>>>   >> In R 3.6.3 is is "... memory.c:3717"
>>>>>   >>
>>>>>   >> However, I can load the same file with the tiff package
>>>>> and a
>>>>>     file of
>>>>>   >> the same size in the native raster package format (*.grd)
>>>>> with
>>>>> the
>>>>>   >> raster package but again not with the rgdal package.
>>>>>   >>
>>>>>   >> gdalinfo (gdalUtils) does not complain (see below).
>>>>> Hence, Even
>>>>>   >> Rouault assumes the problem is related to rgdal and not
>>>>> gdal
>>>>>   >> (
>>>>> https://github.com/OSGeo/gdal/issues/2442
>>>>> ).
>>>>>   >>
>>>>>   >> Below you find reproducible code, which generates a
>>>>> raster file,
>>>>>   >> saves the two formats (.tiff and .grd) and tries to read
>>>>> them
>>>>> with
>>>>>   >> the different packages.
>>>>>   >>
>>>>>   >> Is this a known limitation? Any help is greatly
>>>>> appreciated!
>>>>>   >>
>>>>>   >> Thanks a lot in advance!
>>>>>   >>
>>>>>   >> Best wishes and stay healthy,
>>>>>   >> Thorsten
>>>>>   >>
>>>>>   >>
>>>>>   >>
>>>>>   >> ### Steps to reproduce the problem.
>>>>>   >>
>>>>>   >> R code:
>>>>>   >>
>>>>>   >> ```
>>>>>   >> library(rgdal) # 1.4-8
>>>>>   >> library(raster) # 3.1-5
>>>>>   >> library(tiff) # 0.1-5
>>>>>   >>
>>>>>   >> ## generate and manipulate a big raster dataset
>>>>>   >> # - generate
>>>>>   >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>>>>> extent(c(0,
>>>>>   >> 72000, 0, 48000))) # all fine
>>>>>   >>
>>>>>   >> # - manipulate
>>>>>   >> values(rDemTest) = 1 # all fine
>>>>>   >>
>>>>>   >> # - convert
>>>>>   >> mDemTest = raster::as.matrix(rDemTest) # all fine
>>>>>   >> str(mDemTest)
>>>>>   >>
>>>>>   >> ## save a big dataset
>>>>>   >>
>>>>>   >> # - as raster/gdal
>>>>>   >> sFileNameTiff = "BigData.tif"
>>>>>   >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite =
>>>>> TRUE,
>>>>>   >> NAflag = -9999) # all fine
>>>>>   >>
>>>>>   >> # - as raster native
>>>>>   >> sFileNameNative = "BigData.grd"
>>>>>   >> writeRaster(rDemTest, sFileNameNative, "raster",
>>>>> overwrite =
>>>>> TRUE,
>>>>>   >> NAflag = -9999) # all fine
>>>>>   >>
>>>>>   >>
>>>>>   >> ## load the big raster datasets with different packages
>>>>> and
>>>>> options
>>>>>   >> # - load the tiff data with the gdal package via the
>>>>> raster
>>>>> package
>>>>>   >> rDem = raster(sFileNameTiff) # all fine
>>>>>   >> extent(rDem) # all fine
>>>>>   >> mDem = raster::as.matrix(rDem) # error
>>>>>   >> rDem = readAll(rDem) # error
>>>>>   >>
>>>>>   >> # - load the native raster data with the raster package
>>>>>   >> rDem = raster(sFileNameNative) # all fine
>>>>>   >> extent(rDem) # all fine
>>>>>   >> mDem = raster::as.matrix(rDem) # all fine
>>>>>   >> str(mDem)
>>>>>   >>
>>>>>   >> # - load the tiff data with the tiff package
>>>>>   >> mDem = readTIFF(sFileNameTiff) # all fine
>>>>>   >> str(mDem)
>>>>>   >>
>>>>>   >> # - load the tiff data with the gdal package
>>>>>   >> sfDem = readGDAL(sFileNameTiff) # error
>>>>>   >>
>>>>>   >> # - load the native raster data with the gdal package
>>>>>   >> sfDem = readGDAL(sFileNameNative) # error
>>>>>   >>
>>>>>   >> ```
>>>>>   >>
>>>>>   >>
>>>>>   >> ### Startup messages when rgdal is attached (requested by
>>>>> Roger
>>>>>     Bivand)
>>>>>   >>>  library(rgdal)
>>>>>   >> rgdal: version: 1.4-8, (SVN revision 845)
>>>>>   >>  Geospatial Data Abstraction Library extensions to R
>>>>>     successfully loaded
>>>>>   >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>>>>>   >>  Path to GDAL shared files:
>>>>>   >>  GDAL binary built with GEOS: TRUE
>>>>>   >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>>>>>     [PJ_VERSION:
>>>>>   >> 631]
>>>>>   >>  Path to PROJ.4 shared files: (autodetected)
>>>>>   >>  Linking to sp version: 1.4-1
>>>>>   >>
>>>>>   >>
>>>>>   >> ### Session info
>>>>>   >>>  sessionInfo()
>>>>>   >> R version 4.0.0 (2020-04-24)
>>>>>   >> Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>   >> Running under: Ubuntu 20.04 LTS
>>>>>   >>
>>>>>   >> Matrix products: default
>>>>>   >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-
>>>>> pthread/libblas.so.3
>>>>>   >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-
>>>>> pthread/liblapack.so.3
>>>>>   >>
>>>>>   >> locale:
>>>>>   >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
>>>>> LC_TIME=de_DE.UTF-8
>>>>>   >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>>>>>   >> LC_MESSAGES=de_DE.UTF-8
>>>>>   >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>>>>>   >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>>>>>   >> LC_IDENTIFICATION=C
>>>>>   >>
>>>>>   >> attached base packages:
>>>>>   >> [1] stats     graphics  grDevices utils datasets methods
>>>>> base
>>>>>   >>
>>>>>   >> other attached packages:
>>>>>   >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5
>>>>> raster_3.1-5
>>>>>   >> sp_1.4-1
>>>>>   >>
>>>>>   >> loaded via a namespace (and not attached):
>>>>>   >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
>>>>>   >> R.methodsS3_1.8.0 codetools_0.2-16
>>>>>   >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>>>>>   >> R.utils_2.9.2     R.oo_1.23.0
>>>>>   >> [11] lattice_0.20-41
>>>>>   >>
>>>>>   >>
>>>>>   >> ### gdalInfo
>>>>>   >>>  gdalinfo(sFileNameTiff)
>>>>>   >>  [1] "Driver: GTiff/GeoTIFF"
>>>>>   >>  [2] "Files: BigData.tif"
>>>>>   >>  [3] "Size is 72000, 48000"
>>>>>   >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>>>>>   >>  [5] "Pixel Size = (1.000000000000000,-
>>>>> 1.000000000000000)"
>>>>>   >>  [6] "Image Structure Metadata:"
>>>>>   >>  [7] "  COMPRESSION=LZW"
>>>>>   >>  [8] "  INTERLEAVE=BAND"
>>>>>   >>  [9] "Corner Coordinates:"
>>>>>   >> [10] "Upper Left  (       0.000,   48000.000) "
>>>>>   >> [11] "Lower Left  (   0.0000000,   0.0000000) "
>>>>>   >> [12] "Upper Right (   72000.000,   48000.000) "
>>>>>   >> [13] "Lower Right (   72000.000,       0.000) "
>>>>>   >> [14] "Center      (   36000.000,   24000.000) "
>>>>>   >> [15] "Band 1 Block=72000x1 Type=Float32,
>>>>> ColorInterp=Gray"
>>>>>   >> [16] "  Min=1.000 Max=1.000 "
>>>>>   >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan,
>>>>> StdDev=nan"
>>>>>   >> [18] "  NoData Value=-9999"
>>>>>   >> [19] "  Metadata:"
>>>>>   >> [20] "    STATISTICS_MAXIMUM=1"
>>>>>   >> [21] "    STATISTICS_MEAN=nan"
>>>>>   >> [22] "    STATISTICS_MINIMUM=1"
>>>>>   >> [23] "    STATISTICS_STDDEV=nan"
>>>>>   >>
>>>>>   >> _______________________________________________
>>>>>   >> R-sig-Geo mailing list
>>>>>   >>
>>>>> [hidden email]
>>>>>  <mailto:
>>>>> [hidden email]
>>>>>>
>>>>>   >>
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>>   >>
>>>>>   >>
>>>>>   >
>>>>>
>>>>>     _______________________________________________
>>>>>     R-sig-Geo mailing list
>>>>>
>>>>> [hidden email]
>>>>>  <mailto:
>>>>> [hidden email]
>>>>>>
>>>>>
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>>
>>>>
>>>>     [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> [hidden email]
>>>>
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> [hidden email]
>>
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: [hidden email]
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Mathias Moser
> I cannot check this because I have no access to a platform with
> enough
> RAM, so I need help here. I haven't been able to confirm that
> LENGTH(sRStorage) returns an R_xlen_t, or double, in lines 1672, 1685
> or
> 1693. Is there a way of generating a large file using a GDAL app,
> perhaps,
> then I could just try reading a larger file if LENGTH plays up?

LENGTH() can not handle the R_xlen_t and again throws a long vector
error. I have replaced the three occurrences with XLENGTH() and tested
using Thorsten's Ceil.tif file, looks good so far:

> > library(rgdal)
> Loading required package: sp
> rgdal: version: 1.5-8, (SVN revision (unknown))
>  Geospatial Data Abstraction Library extensions to R successfully
> loaded
>  Loaded GDAL runtime: GDAL 3.0.1, released 2019/06/28
>  Path to GDAL shared files: /usr/local/share/gdal
>  GDAL binary built with GEOS: TRUE
>  Loaded PROJ runtime: Rel. 6.2.0, September 1st, 2019, [PJ_VERSION:
> 620]
>  Path to PROJ shared files: /usr/local/share/proj
>  Linking to sp version: 1.4-1
> > mCeil = raster::as.matrix(raster::raster("Ceil.tif"))
> > str(mCeil)
>  num [1:46341, 1:46341] 1 1 1 1 1 1 1 1 1 1 ...

HTH, best wishes,

Mathias

>
> > (read_gdal_data() of stars _seems_ to use R_xlen_t instead to
> > accomodate for 52bits [2]).
> >
> > Best wishes,
> >
> > Mathias
> >
> >
> > [1]
> > https://cloud.r-project.org/doc/manuals/r-patched/R-ints.html#Long-vectors
> >
> > [2]
> > https://github.com/r-spatial/sf/blob/ea1bd716769ab8140d3451e3d902cfc79bc895d5/src/stars.cpp#L172
> >
> >
> >
> > On Tue, 2020-04-28 at 13:39 +0200, Thorsten Behrens wrote:
> > > Roger and Mike,
> > >
> > > I really appreciate your help on this!
> > >
> > > I had a look at getRasterData(). It results in the same error.
> > > Hence,
> > > I
> > > made further tests where I compared grids with the following
> > > numbers
> > > of
> > > cols and rows:
> > >
> > > nPx = floor(sqrt(2^31 -1)) # 46340
> > >
> > > and
> > >
> > > nPx = ceiling(sqrt(2^31 -1)) # 46341
> > >
> > > The result is clear. Raster data with 46340 * 46340 px can be
> > > loaded
> > > with getRasterData() and with raster(), raster::as.matrix(),
> > > whereas
> > > datasets with 46341 * 46341 px cannot and result in the error:
> > >
> > > Error in getRasterData(gdCeil, band = 1) :
> > > long vectors not supported yet: memory.c:3782
> > >
> > > read_stars() works for both. You find the corresponding code
> > > below.
> > >
> > > Are there any other option I can try?
> > >
> > > Thorsten
> > >
> > >
> > > Reproducible code:
> > >
> > > ## generate raster datasets
> > > # 46340 * 46340 grid dataset
> > > sFileNameFloor = "Floor.tif"
> > >
> > > nPx = floor(sqrt(2^31 -1)) # 46340
> > > nPx
> > >
> > > rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> > > nPx)))
> > > values(rFloor) = 1
> > >
> > > writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE,
> > > NAflag
> > > =
> > > -9999)
> > >
> > >
> > > # 46341 * 46341 grid dataset
> > > sFileNameCeil = "Ceil.tif"
> > >
> > > nPx = ceiling(sqrt(2^31 -1))
> > > nPx
> > >
> > > rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> > > nPx)))
> > > values(rCeil) = 1
> > >
> > > writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE,
> > > NAflag
> > > =
> > > -9999)
> > >
> > >
> > > ## load raster datasets with different methods
> > >
> > > # load Ceil
> > > gdCeil = GDAL.open(sFileNameCeil)
> > > dim(gdCeil)
> > >
> > > vnCeil = getRasterData(gdCeil, band = 1) # error
> > >
> > > GDAL.close(gdCeil)
> > > str(vnCeil)
> > >
> > > stCeil = read_stars(sFileNameCeil) # all fine
> > > str(stCeil[[1]])
> > >
> > > rCeil = raster(sFileNameCeil)
> > > mCeil = raster::as.matrix(rCeil) # error
> > > str(mCeil)
> > >
> > >
> > > # load Floor
> > > gdFloor = GDAL.open(sFileNameFloor)
> > > dim(gdFloor)
> > >
> > > vnData = getRasterData(gdFloor, band = 1) # all fine
> > >
> > > GDAL.close(gdFloor)
> > > str(vnData)
> > >
> > > stFloor= read_stars(sFileNameFloor) # all fine
> > > str(stFloor[[1]])
> > >
> > > rFloor = raster(sFileNameFloor)
> > > mFloor = raster::as.matrix(rFloor) # all fine
> > > str(mFloor)
> > >
> > >
> > >
> > >
> > > Am 28.04.2020 um 12:10 schrieb Roger Bivand:
> > > > On Tue, 28 Apr 2020, Thorsten Behrens wrote:
> > > >
> > > > > Michael,
> > > > >
> > > > > Thanks for the hint, it seems to work! Real-world tests will
> > > > > follow in
> > > > > the next few days...
> > > > >
> > > > > So it definitely seems to be a problem of rgdal. It would be
> > > > > great if it
> > > > > could still be solved.
> > > >
> > > > Not rgdal, but your use of it. Try looking at a sequence of
> > > >
> > > > file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> > > > obj <- GDAL.open(file)
> > > > (dims <- dim(obj))
> > > > band <- 1
> > > > data_vector <- getRasterData(obj, band)
> > > > GDAL.close(obj)
> > > > str(data_vector)
> > > >
> > > > This does not create any more complicated objects, just a
> > > > matrix.
> > > > In
> > > > some cases, the rows in the matrix are ordered S -> N, so it
> > > > may
> > > > appear the wrong way up.
> > > >
> > > > rgdal::getRasterData() is lightweight, and has many arguments
> > > > which
> > > > may be helpful. rgdal::readGDAL() is heavyweight, creating a
> > > > SpatialGridDataFrame object. This involves much copying of
> > > > data,
> > > > but
> > > > the output object can be used for example in mapping or
> > > > analysis
> > > > without further conversion. My guess is that
> > > > rgdal::getRasterData()
> > > > gives you your matrix directly. Look at the R code to see how
> > > > as.is=
> > > > etc. work (files may include scale and offset values - recently
> > > > a
> > > > user
> > > > was confused that scale and offset were "magically" applied to
> > > > convert
> > > > Uint16 values to degrees Kelvin on reading). For example, if
> > > > as.is
> > > > ==
> > > > TRUE or scale == 1 and offset == 0, no copying of the input
> > > > matrix
> > > > occurs because it is not converted. If you could check this
> > > > route,
> > > > others following this thread could also benefit; if I'm wrong,
> > > > that
> > > > would also be good to know.
> > > >
> > > > Roger
> > > >
> > > >
> > > > > Best,
> > > > >
> > > > > Thorsten
> > > > >
> > > > >
> > > > >
> > > > > Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> > > > > > Try stars it worked for me on a test
> > > > > >
> > > > > > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> > > > > > <
> > > > > > [hidden email]
> > > > > >
> > > > > >  <mailto:
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > >     Roger,
> > > > > >
> > > > > >     thanks a lot for your reply!
> > > > > >
> > > > > >     I have 256GB RAM installed (mentioned it somewhere).
> > > > > > And
> > > > > > there,
> > > > > >     all is
> > > > > >     fine when I run:
> > > > > >
> > > > > >     rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > extent(c(0,
> > > > > > 72000,
> > > > > >
> > > > > >     values(rDemTest) = 1
> > > > > >
> > > > > >     When limiting the memory to about 8GB with
> > > > > >     ulimit::memory_limit(8000),
> > > > > >     the max size which can be allocated seems to be around
> > > > > > 10000 x
> > > > > >     10000px.
> > > > > >     In this case all tests run fine. Unfortunately it seems
> > > > > > to
> > > > > > be
> > > > > >     related to
> > > > > >     the size of the grid (48000 x 72000) and therefore the
> > > > > > problem
> > > > > >     can't be
> > > > > >     reproduced on machines with 8GB RAM. For some
> > > > > > processing
> > > > > > steps I
> > > > > > need
> > > > > >     grids of that size in the memory, which is why I have
> > > > > > 256
> > > > > > GB
> > > > > >     installed.
> > > > > >
> > > > > >     Normally, I use the raster package and not
> > > > > > rgdal::readGDAL(). This
> > > > > >     was
> > > > > >     just a desperate attempt to find the source of the
> > > > > > problem.
> > > > > >
> > > > > >     This is what I use in my code:
> > > > > >
> > > > > >     rDem = raster(sFileNameTiff)
> > > > > >     mDem = raster::as.matrix(rDem)
> > > > > >
> > > > > >     But maybe this is the same...
> > > > > >
> > > > > >     Any further suggestions are much appreciated!
> > > > > >
> > > > > >     Thanks again!
> > > > > >
> > > > > >     Best,
> > > > > >
> > > > > >     Thorsten
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> > > > > >   > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> > > > > >   >
> > > > > >   >> Dear all,
> > > > > >   >>
> > > > > >   >> my problem is that I want to read a big geotiff raster
> > > > > > dataset
> > > > > >     into R
> > > > > >   >> and convert it to a matrix, which does not work.
> > > > > >   >> The file is big but there is sufficient memory. I need
> > > > > > all the
> > > > > >     data
> > > > > >   >> in the memory at the same time.
> > > > > >   >>
> > > > > >   >> The error occurs under R 3.6.3 as well as 4.0.0 using
> > > > > > Ubuntu
> > > > > > 20.04
> > > > > >   >> LTS with the latest version of the packages (see
> > > > > > session
> > > > > > info
> > > > > >     below)
> > > > > >   >> and 256GB RAM installed.
> > > > > >   >>
> > > > > >   >> When loading the raster dataset using rgdal (via
> > > > > > readGDAL
> > > > > > or
> > > > > >   >> raster::readAll) I get the follwoing error in R 4.0.0:
> > > > > >   >>
> > > > > >   >> ```
> > > > > >   >> Error in rgdal::getRasterData(con, offset = offs,
> > > > > > region.dim =
> > > > > >     reg,
> > > > > >   >> band = object@data@band) :
> > > > > >   >>   long vectors not supported yet: memory.c:3782
> > > > > >   >> ```
> > > > > >   >
> > > > > >   > On a 16GB Fedora platform:
> > > > > >   >
> > > > > >   >> library(raster) # 3.1-5
> > > > > >   >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > extent(c(0,
> > > > > >     72000,
> > > > > >   > 0,
> > > > > >   > + 48000))) # all fine
> > > > > >   >> rDemTest
> > > > > >   > class      : RasterLayer
> > > > > >   > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol,
> > > > > > ncell)
> > > > > >   > resolution : 1, 1  (x, y)
> > > > > >   > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin,
> > > > > > ymax)
> > > > > >   > crs        : NA
> > > > > >   >
> > > > > >   >> values(rDemTest) = 1
> > > > > >   > Error: cannot allocate vector of size 25.7 Gb
> > > > > >   >
> > > > > >   > So you are deceiving yourself into thinking that all is
> > > > > > fine at
> > > > > >     this
> > > > > >   > point. Please try to instantiate an example that can be
> > > > > >     reproduced on
> > > > > >   > a machine with 8GB RAM.
> > > > > >   >
> > > > > >   > Further note that rgdal::readGDAL() is not how you
> > > > > > handle
> > > > > > very
> > > > > >     large
> > > > > >   > objects in files, and never has been. raster can handle
> > > > > > blocks
> > > > > >     of data
> > > > > >   > from bands in file; stars and gdalcubes can use
> > > > > > proxy=TRUE
> > > > > > for the
> > > > > >   > same purpose. Why did you choose rgdal::readGDAL() when
> > > > > > this is
> > > > > > not
> > > > > >   > its purpose?
> > > > > >   >
> > > > > >   > You did not say how much RAM is on your platform.
> > > > > >   >
> > > > > >   > Roger
> > > > > >   >
> > > > > >   >>
> > > > > >   >> In R 3.6.3 is is "... memory.c:3717"
> > > > > >   >>
> > > > > >   >> However, I can load the same file with the tiff
> > > > > > package
> > > > > > and a
> > > > > >     file of
> > > > > >   >> the same size in the native raster package format
> > > > > > (*.grd)
> > > > > > with
> > > > > > the
> > > > > >   >> raster package but again not with the rgdal package.
> > > > > >   >>
> > > > > >   >> gdalinfo (gdalUtils) does not complain (see below).
> > > > > > Hence, Even
> > > > > >   >> Rouault assumes the problem is related to rgdal and
> > > > > > not
> > > > > > gdal
> > > > > >   >> (
> > > > > > https://github.com/OSGeo/gdal/issues/2442
> > > > > >
> > > > > > ).
> > > > > >   >>
> > > > > >   >> Below you find reproducible code, which generates a
> > > > > > raster file,
> > > > > >   >> saves the two formats (.tiff and .grd) and tries to
> > > > > > read
> > > > > > them
> > > > > > with
> > > > > >   >> the different packages.
> > > > > >   >>
> > > > > >   >> Is this a known limitation? Any help is greatly
> > > > > > appreciated!
> > > > > >   >>
> > > > > >   >> Thanks a lot in advance!
> > > > > >   >>
> > > > > >   >> Best wishes and stay healthy,
> > > > > >   >> Thorsten
> > > > > >   >>
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### Steps to reproduce the problem.
> > > > > >   >>
> > > > > >   >> R code:
> > > > > >   >>
> > > > > >   >> ```
> > > > > >   >> library(rgdal) # 1.4-8
> > > > > >   >> library(raster) # 3.1-5
> > > > > >   >> library(tiff) # 0.1-5
> > > > > >   >>
> > > > > >   >> ## generate and manipulate a big raster dataset
> > > > > >   >> # - generate
> > > > > >   >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > extent(c(0,
> > > > > >   >> 72000, 0, 48000))) # all fine
> > > > > >   >>
> > > > > >   >> # - manipulate
> > > > > >   >> values(rDemTest) = 1 # all fine
> > > > > >   >>
> > > > > >   >> # - convert
> > > > > >   >> mDemTest = raster::as.matrix(rDemTest) # all fine
> > > > > >   >> str(mDemTest)
> > > > > >   >>
> > > > > >   >> ## save a big dataset
> > > > > >   >>
> > > > > >   >> # - as raster/gdal
> > > > > >   >> sFileNameTiff = "BigData.tif"
> > > > > >   >> writeRaster(rDemTest, sFileNameTiff, "GTiff",
> > > > > > overwrite =
> > > > > > TRUE,
> > > > > >   >> NAflag = -9999) # all fine
> > > > > >   >>
> > > > > >   >> # - as raster native
> > > > > >   >> sFileNameNative = "BigData.grd"
> > > > > >   >> writeRaster(rDemTest, sFileNameNative, "raster",
> > > > > > overwrite =
> > > > > > TRUE,
> > > > > >   >> NAflag = -9999) # all fine
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ## load the big raster datasets with different
> > > > > > packages
> > > > > > and
> > > > > > options
> > > > > >   >> # - load the tiff data with the gdal package via the
> > > > > > raster
> > > > > > package
> > > > > >   >> rDem = raster(sFileNameTiff) # all fine
> > > > > >   >> extent(rDem) # all fine
> > > > > >   >> mDem = raster::as.matrix(rDem) # error
> > > > > >   >> rDem = readAll(rDem) # error
> > > > > >   >>
> > > > > >   >> # - load the native raster data with the raster
> > > > > > package
> > > > > >   >> rDem = raster(sFileNameNative) # all fine
> > > > > >   >> extent(rDem) # all fine
> > > > > >   >> mDem = raster::as.matrix(rDem) # all fine
> > > > > >   >> str(mDem)
> > > > > >   >>
> > > > > >   >> # - load the tiff data with the tiff package
> > > > > >   >> mDem = readTIFF(sFileNameTiff) # all fine
> > > > > >   >> str(mDem)
> > > > > >   >>
> > > > > >   >> # - load the tiff data with the gdal package
> > > > > >   >> sfDem = readGDAL(sFileNameTiff) # error
> > > > > >   >>
> > > > > >   >> # - load the native raster data with the gdal package
> > > > > >   >> sfDem = readGDAL(sFileNameNative) # error
> > > > > >   >>
> > > > > >   >> ```
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### Startup messages when rgdal is attached (requested
> > > > > > by
> > > > > > Roger
> > > > > >     Bivand)
> > > > > >   >>>  library(rgdal)
> > > > > >   >> rgdal: version: 1.4-8, (SVN revision 845)
> > > > > >   >>  Geospatial Data Abstraction Library extensions to R
> > > > > >     successfully loaded
> > > > > >   >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> > > > > >   >>  Path to GDAL shared files:
> > > > > >   >>  GDAL binary built with GEOS: TRUE
> > > > > >   >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th,
> > > > > > 2020,
> > > > > >     [PJ_VERSION:
> > > > > >   >> 631]
> > > > > >   >>  Path to PROJ.4 shared files: (autodetected)
> > > > > >   >>  Linking to sp version: 1.4-1
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### Session info
> > > > > >   >>>  sessionInfo()
> > > > > >   >> R version 4.0.0 (2020-04-24)
> > > > > >   >> Platform: x86_64-pc-linux-gnu (64-bit)
> > > > > >   >> Running under: Ubuntu 20.04 LTS
> > > > > >   >>
> > > > > >   >> Matrix products: default
> > > > > >   >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-
> > > > > > pthread/libblas.so.3
> > > > > >   >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-
> > > > > > pthread/liblapack.so.3
> > > > > >   >>
> > > > > >   >> locale:
> > > > > >   >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
> > > > > > LC_TIME=de_DE.UTF-8
> > > > > >   >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> > > > > >   >> LC_MESSAGES=de_DE.UTF-8
> > > > > >   >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> > > > > >   >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> > > > > >   >> LC_IDENTIFICATION=C
> > > > > >   >>
> > > > > >   >> attached base packages:
> > > > > >   >> [1] stats     graphics  grDevices utils datasets
> > > > > > methods
> > > > > > base
> > > > > >   >>
> > > > > >   >> other attached packages:
> > > > > >   >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5
> > > > > > raster_3.1-5
> > > > > >   >> sp_1.4-1
> > > > > >   >>
> > > > > >   >> loaded via a namespace (and not attached):
> > > > > >   >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
> > > > > >   >> R.methodsS3_1.8.0 codetools_0.2-16
> > > > > >   >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
> > > > > >   >> R.utils_2.9.2     R.oo_1.23.0
> > > > > >   >> [11] lattice_0.20-41
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### gdalInfo
> > > > > >   >>>  gdalinfo(sFileNameTiff)
> > > > > >   >>  [1] "Driver: GTiff/GeoTIFF"
> > > > > >   >>  [2] "Files: BigData.tif"
> > > > > >   >>  [3] "Size is 72000, 48000"
> > > > > >   >>  [4] "Origin =
> > > > > > (0.000000000000000,48000.000000000000000)"
> > > > > >   >>  [5] "Pixel Size = (1.000000000000000,-
> > > > > > 1.000000000000000)"
> > > > > >   >>  [6] "Image Structure Metadata:"
> > > > > >   >>  [7] "  COMPRESSION=LZW"
> > > > > >   >>  [8] "  INTERLEAVE=BAND"
> > > > > >   >>  [9] "Corner Coordinates:"
> > > > > >   >> [10] "Upper Left  (       0.000,   48000.000) "
> > > > > >   >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> > > > > >   >> [12] "Upper Right (   72000.000,   48000.000) "
> > > > > >   >> [13] "Lower Right (   72000.000,       0.000) "
> > > > > >   >> [14] "Center      (   36000.000,   24000.000) "
> > > > > >   >> [15] "Band 1 Block=72000x1 Type=Float32,
> > > > > > ColorInterp=Gray"
> > > > > >   >> [16] "  Min=1.000 Max=1.000 "
> > > > > >   >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan,
> > > > > > StdDev=nan"
> > > > > >   >> [18] "  NoData Value=-9999"
> > > > > >   >> [19] "  Metadata:"
> > > > > >   >> [20] "    STATISTICS_MAXIMUM=1"
> > > > > >   >> [21] "    STATISTICS_MEAN=nan"
> > > > > >   >> [22] "    STATISTICS_MINIMUM=1"
> > > > > >   >> [23] "    STATISTICS_STDDEV=nan"
> > > > > >   >>
> > > > > >   >> _______________________________________________
> > > > > >   >> R-sig-Geo mailing list
> > > > > >   >>
> > > > > > [hidden email]
> > > > > >
> > > > > >  <mailto:
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > >   >>
> > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > >
> > > > > >
> > > > > >   >>
> > > > > >   >>
> > > > > >   >
> > > > > >
> > > > > >     _______________________________________________
> > > > > >     R-sig-Geo mailing list
> > > > > >
> > > > > > [hidden email]
> > > > > >
> > > > > >  <mailto:
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >     [[alternative HTML version deleted]]
> > > > >
> > > > > _______________________________________________
> > > > > R-sig-Geo mailing list
> > > > > [hidden email]
> > > > >
> > > > >
> > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > >
> > > > >
> > > > >
> > >
> > > _______________________________________________
> > > R-sig-Geo mailing list
> > > [hidden email]
> > >
> > >
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > >
> > >
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > [hidden email]
> >
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
> >

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Roger Bivand
Administrator
Thanks, as I thought. Committed to R-forge as SVN rev. 962. Really we'd need a test rig for reading, manipulation in sp/rgdal and raster, and writing, to be sure that all int assumptions that need conversion to R_xlen_t.

Roger

--
Roger Bivand
Norwegian School of Economics
Helleveien 30, 5045 Bergen, Norway
[hidden email]


________________________________________
Fra: Mathias Moser <[hidden email]>
Sendt: tirsdag 28. april 2020 22.14
Til: Roger Bivand
Kopi: RsigGeo
Emne: Re: [R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

> I cannot check this because I have no access to a platform with
> enough
> RAM, so I need help here. I haven't been able to confirm that
> LENGTH(sRStorage) returns an R_xlen_t, or double, in lines 1672, 1685
> or
> 1693. Is there a way of generating a large file using a GDAL app,
> perhaps,
> then I could just try reading a larger file if LENGTH plays up?

LENGTH() can not handle the R_xlen_t and again throws a long vector
error. I have replaced the three occurrences with XLENGTH() and tested
using Thorsten's Ceil.tif file, looks good so far:

> > library(rgdal)
> Loading required package: sp
> rgdal: version: 1.5-8, (SVN revision (unknown))
>  Geospatial Data Abstraction Library extensions to R successfully
> loaded
>  Loaded GDAL runtime: GDAL 3.0.1, released 2019/06/28
>  Path to GDAL shared files: /usr/local/share/gdal
>  GDAL binary built with GEOS: TRUE
>  Loaded PROJ runtime: Rel. 6.2.0, September 1st, 2019, [PJ_VERSION:
> 620]
>  Path to PROJ shared files: /usr/local/share/proj
>  Linking to sp version: 1.4-1
> > mCeil = raster::as.matrix(raster::raster("Ceil.tif"))
> > str(mCeil)
>  num [1:46341, 1:46341] 1 1 1 1 1 1 1 1 1 1 ...

HTH, best wishes,

Mathias

>
> > (read_gdal_data() of stars _seems_ to use R_xlen_t instead to
> > accomodate for 52bits [2]).
> >
> > Best wishes,
> >
> > Mathias
> >
> >
> > [1]
> > https://cloud.r-project.org/doc/manuals/r-patched/R-ints.html#Long-vectors
> >
> > [2]
> > https://github.com/r-spatial/sf/blob/ea1bd716769ab8140d3451e3d902cfc79bc895d5/src/stars.cpp#L172
> >
> >
> >
> > On Tue, 2020-04-28 at 13:39 +0200, Thorsten Behrens wrote:
> > > Roger and Mike,
> > >
> > > I really appreciate your help on this!
> > >
> > > I had a look at getRasterData(). It results in the same error.
> > > Hence,
> > > I
> > > made further tests where I compared grids with the following
> > > numbers
> > > of
> > > cols and rows:
> > >
> > > nPx = floor(sqrt(2^31 -1)) # 46340
> > >
> > > and
> > >
> > > nPx = ceiling(sqrt(2^31 -1)) # 46341
> > >
> > > The result is clear. Raster data with 46340 * 46340 px can be
> > > loaded
> > > with getRasterData() and with raster(), raster::as.matrix(),
> > > whereas
> > > datasets with 46341 * 46341 px cannot and result in the error:
> > >
> > > Error in getRasterData(gdCeil, band = 1) :
> > > long vectors not supported yet: memory.c:3782
> > >
> > > read_stars() works for both. You find the corresponding code
> > > below.
> > >
> > > Are there any other option I can try?
> > >
> > > Thorsten
> > >
> > >
> > > Reproducible code:
> > >
> > > ## generate raster datasets
> > > # 46340 * 46340 grid dataset
> > > sFileNameFloor = "Floor.tif"
> > >
> > > nPx = floor(sqrt(2^31 -1)) # 46340
> > > nPx
> > >
> > > rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> > > nPx)))
> > > values(rFloor) = 1
> > >
> > > writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE,
> > > NAflag
> > > =
> > > -9999)
> > >
> > >
> > > # 46341 * 46341 grid dataset
> > > sFileNameCeil = "Ceil.tif"
> > >
> > > nPx = ceiling(sqrt(2^31 -1))
> > > nPx
> > >
> > > rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> > > nPx)))
> > > values(rCeil) = 1
> > >
> > > writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE,
> > > NAflag
> > > =
> > > -9999)
> > >
> > >
> > > ## load raster datasets with different methods
> > >
> > > # load Ceil
> > > gdCeil = GDAL.open(sFileNameCeil)
> > > dim(gdCeil)
> > >
> > > vnCeil = getRasterData(gdCeil, band = 1) # error
> > >
> > > GDAL.close(gdCeil)
> > > str(vnCeil)
> > >
> > > stCeil = read_stars(sFileNameCeil) # all fine
> > > str(stCeil[[1]])
> > >
> > > rCeil = raster(sFileNameCeil)
> > > mCeil = raster::as.matrix(rCeil) # error
> > > str(mCeil)
> > >
> > >
> > > # load Floor
> > > gdFloor = GDAL.open(sFileNameFloor)
> > > dim(gdFloor)
> > >
> > > vnData = getRasterData(gdFloor, band = 1) # all fine
> > >
> > > GDAL.close(gdFloor)
> > > str(vnData)
> > >
> > > stFloor= read_stars(sFileNameFloor) # all fine
> > > str(stFloor[[1]])
> > >
> > > rFloor = raster(sFileNameFloor)
> > > mFloor = raster::as.matrix(rFloor) # all fine
> > > str(mFloor)
> > >
> > >
> > >
> > >
> > > Am 28.04.2020 um 12:10 schrieb Roger Bivand:
> > > > On Tue, 28 Apr 2020, Thorsten Behrens wrote:
> > > >
> > > > > Michael,
> > > > >
> > > > > Thanks for the hint, it seems to work! Real-world tests will
> > > > > follow in
> > > > > the next few days...
> > > > >
> > > > > So it definitely seems to be a problem of rgdal. It would be
> > > > > great if it
> > > > > could still be solved.
> > > >
> > > > Not rgdal, but your use of it. Try looking at a sequence of
> > > >
> > > > file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> > > > obj <- GDAL.open(file)
> > > > (dims <- dim(obj))
> > > > band <- 1
> > > > data_vector <- getRasterData(obj, band)
> > > > GDAL.close(obj)
> > > > str(data_vector)
> > > >
> > > > This does not create any more complicated objects, just a
> > > > matrix.
> > > > In
> > > > some cases, the rows in the matrix are ordered S -> N, so it
> > > > may
> > > > appear the wrong way up.
> > > >
> > > > rgdal::getRasterData() is lightweight, and has many arguments
> > > > which
> > > > may be helpful. rgdal::readGDAL() is heavyweight, creating a
> > > > SpatialGridDataFrame object. This involves much copying of
> > > > data,
> > > > but
> > > > the output object can be used for example in mapping or
> > > > analysis
> > > > without further conversion. My guess is that
> > > > rgdal::getRasterData()
> > > > gives you your matrix directly. Look at the R code to see how
> > > > as.is=
> > > > etc. work (files may include scale and offset values - recently
> > > > a
> > > > user
> > > > was confused that scale and offset were "magically" applied to
> > > > convert
> > > > Uint16 values to degrees Kelvin on reading). For example, if
> > > > as.is
> > > > ==
> > > > TRUE or scale == 1 and offset == 0, no copying of the input
> > > > matrix
> > > > occurs because it is not converted. If you could check this
> > > > route,
> > > > others following this thread could also benefit; if I'm wrong,
> > > > that
> > > > would also be good to know.
> > > >
> > > > Roger
> > > >
> > > >
> > > > > Best,
> > > > >
> > > > > Thorsten
> > > > >
> > > > >
> > > > >
> > > > > Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> > > > > > Try stars it worked for me on a test
> > > > > >
> > > > > > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> > > > > > <
> > > > > > [hidden email]
> > > > > >
> > > > > >  <mailto:
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > >     Roger,
> > > > > >
> > > > > >     thanks a lot for your reply!
> > > > > >
> > > > > >     I have 256GB RAM installed (mentioned it somewhere).
> > > > > > And
> > > > > > there,
> > > > > >     all is
> > > > > >     fine when I run:
> > > > > >
> > > > > >     rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > extent(c(0,
> > > > > > 72000,
> > > > > >
> > > > > >     values(rDemTest) = 1
> > > > > >
> > > > > >     When limiting the memory to about 8GB with
> > > > > >     ulimit::memory_limit(8000),
> > > > > >     the max size which can be allocated seems to be around
> > > > > > 10000 x
> > > > > >     10000px.
> > > > > >     In this case all tests run fine. Unfortunately it seems
> > > > > > to
> > > > > > be
> > > > > >     related to
> > > > > >     the size of the grid (48000 x 72000) and therefore the
> > > > > > problem
> > > > > >     can't be
> > > > > >     reproduced on machines with 8GB RAM. For some
> > > > > > processing
> > > > > > steps I
> > > > > > need
> > > > > >     grids of that size in the memory, which is why I have
> > > > > > 256
> > > > > > GB
> > > > > >     installed.
> > > > > >
> > > > > >     Normally, I use the raster package and not
> > > > > > rgdal::readGDAL(). This
> > > > > >     was
> > > > > >     just a desperate attempt to find the source of the
> > > > > > problem.
> > > > > >
> > > > > >     This is what I use in my code:
> > > > > >
> > > > > >     rDem = raster(sFileNameTiff)
> > > > > >     mDem = raster::as.matrix(rDem)
> > > > > >
> > > > > >     But maybe this is the same...
> > > > > >
> > > > > >     Any further suggestions are much appreciated!
> > > > > >
> > > > > >     Thanks again!
> > > > > >
> > > > > >     Best,
> > > > > >
> > > > > >     Thorsten
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> > > > > >   > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> > > > > >   >
> > > > > >   >> Dear all,
> > > > > >   >>
> > > > > >   >> my problem is that I want to read a big geotiff raster
> > > > > > dataset
> > > > > >     into R
> > > > > >   >> and convert it to a matrix, which does not work.
> > > > > >   >> The file is big but there is sufficient memory. I need
> > > > > > all the
> > > > > >     data
> > > > > >   >> in the memory at the same time.
> > > > > >   >>
> > > > > >   >> The error occurs under R 3.6.3 as well as 4.0.0 using
> > > > > > Ubuntu
> > > > > > 20.04
> > > > > >   >> LTS with the latest version of the packages (see
> > > > > > session
> > > > > > info
> > > > > >     below)
> > > > > >   >> and 256GB RAM installed.
> > > > > >   >>
> > > > > >   >> When loading the raster dataset using rgdal (via
> > > > > > readGDAL
> > > > > > or
> > > > > >   >> raster::readAll) I get the follwoing error in R 4.0.0:
> > > > > >   >>
> > > > > >   >> ```
> > > > > >   >> Error in rgdal::getRasterData(con, offset = offs,
> > > > > > region.dim =
> > > > > >     reg,
> > > > > >   >> band = object@data@band) :
> > > > > >   >>   long vectors not supported yet: memory.c:3782
> > > > > >   >> ```
> > > > > >   >
> > > > > >   > On a 16GB Fedora platform:
> > > > > >   >
> > > > > >   >> library(raster) # 3.1-5
> > > > > >   >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > extent(c(0,
> > > > > >     72000,
> > > > > >   > 0,
> > > > > >   > + 48000))) # all fine
> > > > > >   >> rDemTest
> > > > > >   > class      : RasterLayer
> > > > > >   > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol,
> > > > > > ncell)
> > > > > >   > resolution : 1, 1  (x, y)
> > > > > >   > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin,
> > > > > > ymax)
> > > > > >   > crs        : NA
> > > > > >   >
> > > > > >   >> values(rDemTest) = 1
> > > > > >   > Error: cannot allocate vector of size 25.7 Gb
> > > > > >   >
> > > > > >   > So you are deceiving yourself into thinking that all is
> > > > > > fine at
> > > > > >     this
> > > > > >   > point. Please try to instantiate an example that can be
> > > > > >     reproduced on
> > > > > >   > a machine with 8GB RAM.
> > > > > >   >
> > > > > >   > Further note that rgdal::readGDAL() is not how you
> > > > > > handle
> > > > > > very
> > > > > >     large
> > > > > >   > objects in files, and never has been. raster can handle
> > > > > > blocks
> > > > > >     of data
> > > > > >   > from bands in file; stars and gdalcubes can use
> > > > > > proxy=TRUE
> > > > > > for the
> > > > > >   > same purpose. Why did you choose rgdal::readGDAL() when
> > > > > > this is
> > > > > > not
> > > > > >   > its purpose?
> > > > > >   >
> > > > > >   > You did not say how much RAM is on your platform.
> > > > > >   >
> > > > > >   > Roger
> > > > > >   >
> > > > > >   >>
> > > > > >   >> In R 3.6.3 is is "... memory.c:3717"
> > > > > >   >>
> > > > > >   >> However, I can load the same file with the tiff
> > > > > > package
> > > > > > and a
> > > > > >     file of
> > > > > >   >> the same size in the native raster package format
> > > > > > (*.grd)
> > > > > > with
> > > > > > the
> > > > > >   >> raster package but again not with the rgdal package.
> > > > > >   >>
> > > > > >   >> gdalinfo (gdalUtils) does not complain (see below).
> > > > > > Hence, Even
> > > > > >   >> Rouault assumes the problem is related to rgdal and
> > > > > > not
> > > > > > gdal
> > > > > >   >> (
> > > > > > https://github.com/OSGeo/gdal/issues/2442
> > > > > >
> > > > > > ).
> > > > > >   >>
> > > > > >   >> Below you find reproducible code, which generates a
> > > > > > raster file,
> > > > > >   >> saves the two formats (.tiff and .grd) and tries to
> > > > > > read
> > > > > > them
> > > > > > with
> > > > > >   >> the different packages.
> > > > > >   >>
> > > > > >   >> Is this a known limitation? Any help is greatly
> > > > > > appreciated!
> > > > > >   >>
> > > > > >   >> Thanks a lot in advance!
> > > > > >   >>
> > > > > >   >> Best wishes and stay healthy,
> > > > > >   >> Thorsten
> > > > > >   >>
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### Steps to reproduce the problem.
> > > > > >   >>
> > > > > >   >> R code:
> > > > > >   >>
> > > > > >   >> ```
> > > > > >   >> library(rgdal) # 1.4-8
> > > > > >   >> library(raster) # 3.1-5
> > > > > >   >> library(tiff) # 0.1-5
> > > > > >   >>
> > > > > >   >> ## generate and manipulate a big raster dataset
> > > > > >   >> # - generate
> > > > > >   >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > extent(c(0,
> > > > > >   >> 72000, 0, 48000))) # all fine
> > > > > >   >>
> > > > > >   >> # - manipulate
> > > > > >   >> values(rDemTest) = 1 # all fine
> > > > > >   >>
> > > > > >   >> # - convert
> > > > > >   >> mDemTest = raster::as.matrix(rDemTest) # all fine
> > > > > >   >> str(mDemTest)
> > > > > >   >>
> > > > > >   >> ## save a big dataset
> > > > > >   >>
> > > > > >   >> # - as raster/gdal
> > > > > >   >> sFileNameTiff = "BigData.tif"
> > > > > >   >> writeRaster(rDemTest, sFileNameTiff, "GTiff",
> > > > > > overwrite =
> > > > > > TRUE,
> > > > > >   >> NAflag = -9999) # all fine
> > > > > >   >>
> > > > > >   >> # - as raster native
> > > > > >   >> sFileNameNative = "BigData.grd"
> > > > > >   >> writeRaster(rDemTest, sFileNameNative, "raster",
> > > > > > overwrite =
> > > > > > TRUE,
> > > > > >   >> NAflag = -9999) # all fine
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ## load the big raster datasets with different
> > > > > > packages
> > > > > > and
> > > > > > options
> > > > > >   >> # - load the tiff data with the gdal package via the
> > > > > > raster
> > > > > > package
> > > > > >   >> rDem = raster(sFileNameTiff) # all fine
> > > > > >   >> extent(rDem) # all fine
> > > > > >   >> mDem = raster::as.matrix(rDem) # error
> > > > > >   >> rDem = readAll(rDem) # error
> > > > > >   >>
> > > > > >   >> # - load the native raster data with the raster
> > > > > > package
> > > > > >   >> rDem = raster(sFileNameNative) # all fine
> > > > > >   >> extent(rDem) # all fine
> > > > > >   >> mDem = raster::as.matrix(rDem) # all fine
> > > > > >   >> str(mDem)
> > > > > >   >>
> > > > > >   >> # - load the tiff data with the tiff package
> > > > > >   >> mDem = readTIFF(sFileNameTiff) # all fine
> > > > > >   >> str(mDem)
> > > > > >   >>
> > > > > >   >> # - load the tiff data with the gdal package
> > > > > >   >> sfDem = readGDAL(sFileNameTiff) # error
> > > > > >   >>
> > > > > >   >> # - load the native raster data with the gdal package
> > > > > >   >> sfDem = readGDAL(sFileNameNative) # error
> > > > > >   >>
> > > > > >   >> ```
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### Startup messages when rgdal is attached (requested
> > > > > > by
> > > > > > Roger
> > > > > >     Bivand)
> > > > > >   >>>  library(rgdal)
> > > > > >   >> rgdal: version: 1.4-8, (SVN revision 845)
> > > > > >   >>  Geospatial Data Abstraction Library extensions to R
> > > > > >     successfully loaded
> > > > > >   >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> > > > > >   >>  Path to GDAL shared files:
> > > > > >   >>  GDAL binary built with GEOS: TRUE
> > > > > >   >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th,
> > > > > > 2020,
> > > > > >     [PJ_VERSION:
> > > > > >   >> 631]
> > > > > >   >>  Path to PROJ.4 shared files: (autodetected)
> > > > > >   >>  Linking to sp version: 1.4-1
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### Session info
> > > > > >   >>>  sessionInfo()
> > > > > >   >> R version 4.0.0 (2020-04-24)
> > > > > >   >> Platform: x86_64-pc-linux-gnu (64-bit)
> > > > > >   >> Running under: Ubuntu 20.04 LTS
> > > > > >   >>
> > > > > >   >> Matrix products: default
> > > > > >   >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-
> > > > > > pthread/libblas.so.3
> > > > > >   >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-
> > > > > > pthread/liblapack.so.3
> > > > > >   >>
> > > > > >   >> locale:
> > > > > >   >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
> > > > > > LC_TIME=de_DE.UTF-8
> > > > > >   >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> > > > > >   >> LC_MESSAGES=de_DE.UTF-8
> > > > > >   >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> > > > > >   >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> > > > > >   >> LC_IDENTIFICATION=C
> > > > > >   >>
> > > > > >   >> attached base packages:
> > > > > >   >> [1] stats     graphics  grDevices utils datasets
> > > > > > methods
> > > > > > base
> > > > > >   >>
> > > > > >   >> other attached packages:
> > > > > >   >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5
> > > > > > raster_3.1-5
> > > > > >   >> sp_1.4-1
> > > > > >   >>
> > > > > >   >> loaded via a namespace (and not attached):
> > > > > >   >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
> > > > > >   >> R.methodsS3_1.8.0 codetools_0.2-16
> > > > > >   >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
> > > > > >   >> R.utils_2.9.2     R.oo_1.23.0
> > > > > >   >> [11] lattice_0.20-41
> > > > > >   >>
> > > > > >   >>
> > > > > >   >> ### gdalInfo
> > > > > >   >>>  gdalinfo(sFileNameTiff)
> > > > > >   >>  [1] "Driver: GTiff/GeoTIFF"
> > > > > >   >>  [2] "Files: BigData.tif"
> > > > > >   >>  [3] "Size is 72000, 48000"
> > > > > >   >>  [4] "Origin =
> > > > > > (0.000000000000000,48000.000000000000000)"
> > > > > >   >>  [5] "Pixel Size = (1.000000000000000,-
> > > > > > 1.000000000000000)"
> > > > > >   >>  [6] "Image Structure Metadata:"
> > > > > >   >>  [7] "  COMPRESSION=LZW"
> > > > > >   >>  [8] "  INTERLEAVE=BAND"
> > > > > >   >>  [9] "Corner Coordinates:"
> > > > > >   >> [10] "Upper Left  (       0.000,   48000.000) "
> > > > > >   >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> > > > > >   >> [12] "Upper Right (   72000.000,   48000.000) "
> > > > > >   >> [13] "Lower Right (   72000.000,       0.000) "
> > > > > >   >> [14] "Center      (   36000.000,   24000.000) "
> > > > > >   >> [15] "Band 1 Block=72000x1 Type=Float32,
> > > > > > ColorInterp=Gray"
> > > > > >   >> [16] "  Min=1.000 Max=1.000 "
> > > > > >   >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan,
> > > > > > StdDev=nan"
> > > > > >   >> [18] "  NoData Value=-9999"
> > > > > >   >> [19] "  Metadata:"
> > > > > >   >> [20] "    STATISTICS_MAXIMUM=1"
> > > > > >   >> [21] "    STATISTICS_MEAN=nan"
> > > > > >   >> [22] "    STATISTICS_MINIMUM=1"
> > > > > >   >> [23] "    STATISTICS_STDDEV=nan"
> > > > > >   >>
> > > > > >   >> _______________________________________________
> > > > > >   >> R-sig-Geo mailing list
> > > > > >   >>
> > > > > > [hidden email]
> > > > > >
> > > > > >  <mailto:
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > >   >>
> > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > >
> > > > > >
> > > > > >   >>
> > > > > >   >>
> > > > > >   >
> > > > > >
> > > > > >     _______________________________________________
> > > > > >     R-sig-Geo mailing list
> > > > > >
> > > > > > [hidden email]
> > > > > >
> > > > > >  <mailto:
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >     [[alternative HTML version deleted]]
> > > > >
> > > > > _______________________________________________
> > > > > R-sig-Geo mailing list
> > > > > [hidden email]
> > > > >
> > > > >
> > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > >
> > > > >
> > > > >
> > >
> > > _______________________________________________
> > > R-sig-Geo mailing list
> > > [hidden email]
> > >
> > >
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > >
> > >
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > [hidden email]
> >
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
> >

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand
Department of Economics
Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway
Reply | Threaded
Open this post in threaded view
|

Re: rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens
Roger and Mathias,

I tried the latest version (rev 962) and it works! Perfect!

Thanks a lot!

Cheers,
Thorsten


29.04.2020 11:27:18 Roger Bivand <[hidden email]>:

> Thanks, as I thought. Committed to R-forge as SVN rev. 962. Really we'd need a test rig for reading, manipulation in sp/rgdal and raster, and writing, to be sure that all int assumptions that need conversion to R_xlen_t.
>
> Roger
>
> --
> Roger Bivand
> Norwegian School of Economics
> Helleveien 30, 5045 Bergen, Norway
> [hidden email]
>
>
> ________________________________________
> Fra: Mathias Moser <[hidden email]>
> Sendt: tirsdag 28. april 2020 22.14
> Til: Roger Bivand
> Kopi: RsigGeo
> Emne: Re: [R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)
>
>
> > I cannot check this because I have no access to a platform with
> > enough
> > RAM, so I need help here. I haven't been able to confirm that
> > LENGTH(sRStorage) returns an R_xlen_t, or double, in lines 1672, 1685
> > or
> > 1693. Is there a way of generating a large file using a GDAL app,
> > perhaps,
> > then I could just try reading a larger file if LENGTH plays up?
> >
>
> LENGTH() can not handle the R_xlen_t and again throws a long vector
> error. I have replaced the three occurrences with XLENGTH() and tested
> using Thorsten's Ceil.tif file, looks good so far:
>
>
> >
> > > library(rgdal)
> > >
> > Loading required package: sp
> > rgdal: version: 1.5-8, (SVN revision (unknown))
> > Geospatial Data Abstraction Library extensions to R successfully
> > loaded
> > Loaded GDAL runtime: GDAL 3.0.1, released 2019/06/28
> > Path to GDAL shared files: /usr/local/share/gdal
> > GDAL binary built with GEOS: TRUE
> > Loaded PROJ runtime: Rel. 6.2.0, September 1st, 2019, [PJ_VERSION:
> > 620]
> > Path to PROJ shared files: /usr/local/share/proj
> > Linking to sp version: 1.4-1
> >
> > > mCeil = raster::as.matrix(raster::raster("Ceil.tif"))
> > > str(mCeil)
> > >
> > num [1:46341, 1:46341] 1 1 1 1 1 1 1 1 1 1 ...
> >
>
> HTH, best wishes,
>
> Mathias
>
>
> >
> >
> > > (read_gdal_data() of stars _seems_ to use R_xlen_t instead to
> > > accomodate for 52bits [2]).
> > >
> > > Best wishes,
> > >
> > > Mathias
> > >
> > >
> > > [1]
> > > https://cloud.r-project.org/doc/manuals/r-patched/R-ints.html#Long-vectors
> > >
> > > [2]
> > > https://github.com/r-spatial/sf/blob/ea1bd716769ab8140d3451e3d902cfc79bc895d5/src/stars.cpp#L172
> > >
> > >
> > >
> > > On Tue, 2020-04-28 at 13:39 +0200, Thorsten Behrens wrote:
> > >
> > > > Roger and Mike,
> > > >
> > > > I really appreciate your help on this!
> > > >
> > > > I had a look at getRasterData(). It results in the same error.
> > > > Hence,
> > > > I
> > > > made further tests where I compared grids with the following
> > > > numbers
> > > > of
> > > > cols and rows:
> > > >
> > > > nPx = floor(sqrt(2^31 -1)) # 46340
> > > >
> > > > and
> > > >
> > > > nPx = ceiling(sqrt(2^31 -1)) # 46341
> > > >
> > > > The result is clear. Raster data with 46340 * 46340 px can be
> > > > loaded
> > > > with getRasterData() and with raster(), raster::as.matrix(),
> > > > whereas
> > > > datasets with 46341 * 46341 px cannot and result in the error:
> > > >
> > > > Error in getRasterData(gdCeil, band = 1) :
> > > > long vectors not supported yet: memory.c:3782
> > > >
> > > > read_stars() works for both. You find the corresponding code
> > > > below.
> > > >
> > > > Are there any other option I can try?
> > > >
> > > > Thorsten
> > > >
> > > >
> > > > Reproducible code:
> > > >
> > > > ## generate raster datasets
> > > > # 46340 * 46340 grid dataset
> > > > sFileNameFloor = "Floor.tif"
> > > >
> > > > nPx = floor(sqrt(2^31 -1)) # 46340
> > > > nPx
> > > >
> > > > rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> > > > nPx)))
> > > > values(rFloor) = 1
> > > >
> > > > writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE,
> > > > NAflag
> > > > =
> > > > -9999)
> > > >
> > > >
> > > > # 46341 * 46341 grid dataset
> > > > sFileNameCeil = "Ceil.tif"
> > > >
> > > > nPx = ceiling(sqrt(2^31 -1))
> > > > nPx
> > > >
> > > > rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0,
> > > > nPx)))
> > > > values(rCeil) = 1
> > > >
> > > > writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE,
> > > > NAflag
> > > > =
> > > > -9999)
> > > >
> > > >
> > > > ## load raster datasets with different methods
> > > >
> > > > # load Ceil
> > > > gdCeil = GDAL.open(sFileNameCeil)
> > > > dim(gdCeil)
> > > >
> > > > vnCeil = getRasterData(gdCeil, band = 1) # error
> > > >
> > > > GDAL.close(gdCeil)
> > > > str(vnCeil)
> > > >
> > > > stCeil = read_stars(sFileNameCeil) # all fine
> > > > str(stCeil[[1]])
> > > >
> > > > rCeil = raster(sFileNameCeil)
> > > > mCeil = raster::as.matrix(rCeil) # error
> > > > str(mCeil)
> > > >
> > > >
> > > > # load Floor
> > > > gdFloor = GDAL.open(sFileNameFloor)
> > > > dim(gdFloor)
> > > >
> > > > vnData = getRasterData(gdFloor, band = 1) # all fine
> > > >
> > > > GDAL.close(gdFloor)
> > > > str(vnData)
> > > >
> > > > stFloor= read_stars(sFileNameFloor) # all fine
> > > > str(stFloor[[1]])
> > > >
> > > > rFloor = raster(sFileNameFloor)
> > > > mFloor = raster::as.matrix(rFloor) # all fine
> > > > str(mFloor)
> > > >
> > > >
> > > >
> > > >
> > > > Am 28.04.2020 um 12:10 schrieb Roger Bivand:
> > > >
> > > > > On Tue, 28 Apr 2020, Thorsten Behrens wrote:
> > > > >
> > > > >
> > > > > > Michael,
> > > > > >
> > > > > > Thanks for the hint, it seems to work! Real-world tests will
> > > > > > follow in
> > > > > > the next few days...
> > > > > >
> > > > > > So it definitely seems to be a problem of rgdal. It would be
> > > > > > great if it
> > > > > > could still be solved.
> > > > > >
> > > > >
> > > > > Not rgdal, but your use of it. Try looking at a sequence of
> > > > >
> > > > > file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> > > > > obj <- GDAL.open(file)
> > > > > (dims <- dim(obj))
> > > > > band <- 1
> > > > > data_vector <- getRasterData(obj, band)
> > > > > GDAL.close(obj)
> > > > > str(data_vector)
> > > > >
> > > > > This does not create any more complicated objects, just a
> > > > > matrix.
> > > > > In
> > > > > some cases, the rows in the matrix are ordered S -> N, so it
> > > > > may
> > > > > appear the wrong way up.
> > > > >
> > > > > rgdal::getRasterData() is lightweight, and has many arguments
> > > > > which
> > > > > may be helpful. rgdal::readGDAL() is heavyweight, creating a
> > > > > SpatialGridDataFrame object. This involves much copying of
> > > > > data,
> > > > > but
> > > > > the output object can be used for example in mapping or
> > > > > analysis
> > > > > without further conversion. My guess is that
> > > > > rgdal::getRasterData()
> > > > > gives you your matrix directly. Look at the R code to see how
> > > > > as.is=
> > > > > etc. work (files may include scale and offset values - recently
> > > > > a
> > > > > user
> > > > > was confused that scale and offset were "magically" applied to
> > > > > convert
> > > > > Uint16 values to degrees Kelvin on reading). For example, if
> > > > > as.is
> > > > > ==
> > > > > TRUE or scale == 1 and offset == 0, no copying of the input
> > > > > matrix
> > > > > occurs because it is not converted. If you could check this
> > > > > route,
> > > > > others following this thread could also benefit; if I'm wrong,
> > > > > that
> > > > > would also be good to know.
> > > > >
> > > > > Roger
> > > > >
> > > > >
> > > > >
> > > > > > Best,
> > > > > >
> > > > > > Thorsten
> > > > > >
> > > > > >
> > > > > >
> > > > > > Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> > > > > >
> > > > > > > Try stars it worked for me on a test
> > > > > > >
> > > > > > > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> > > > > > > <
> > > > > > > [hidden email]
> > > > > > >
> > > > > > > <mailto:
> > > > > > > [hidden email]
> > > > > > >
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > Roger,
> > > > > > >
> > > > > > > thanks a lot for your reply!
> > > > > > >
> > > > > > > I have 256GB RAM installed (mentioned it somewhere).
> > > > > > > And
> > > > > > > there,
> > > > > > > all is
> > > > > > > fine when I run:
> > > > > > >
> > > > > > > rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > > extent(c(0,
> > > > > > > 72000,
> > > > > > >
> > > > > > > values(rDemTest) = 1
> > > > > > >
> > > > > > > When limiting the memory to about 8GB with
> > > > > > > ulimit::memory_limit(8000),
> > > > > > > the max size which can be allocated seems to be around
> > > > > > > 10000 x
> > > > > > > 10000px.
> > > > > > > In this case all tests run fine. Unfortunately it seems
> > > > > > > to
> > > > > > > be
> > > > > > > related to
> > > > > > > the size of the grid (48000 x 72000) and therefore the
> > > > > > > problem
> > > > > > > can't be
> > > > > > > reproduced on machines with 8GB RAM. For some
> > > > > > > processing
> > > > > > > steps I
> > > > > > > need
> > > > > > > grids of that size in the memory, which is why I have
> > > > > > > 256
> > > > > > > GB
> > > > > > > installed.
> > > > > > >
> > > > > > > Normally, I use the raster package and not
> > > > > > > rgdal::readGDAL(). This
> > > > > > > was
> > > > > > > just a desperate attempt to find the source of the
> > > > > > > problem.
> > > > > > >
> > > > > > > This is what I use in my code:
> > > > > > >
> > > > > > > rDem = raster(sFileNameTiff)
> > > > > > > mDem = raster::as.matrix(rDem)
> > > > > > >
> > > > > > > But maybe this is the same...
> > > > > > >
> > > > > > > Any further suggestions are much appreciated!
> > > > > > >
> > > > > > > Thanks again!
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Thorsten
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> > > > > > > > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> > > > > > > >
> > > > > > > >> Dear all,
> > > > > > > >>
> > > > > > > >> my problem is that I want to read a big geotiff raster
> > > > > > > dataset
> > > > > > > into R
> > > > > > > >> and convert it to a matrix, which does not work.
> > > > > > > >> The file is big but there is sufficient memory. I need
> > > > > > > all the
> > > > > > > data
> > > > > > > >> in the memory at the same time.
> > > > > > > >>
> > > > > > > >> The error occurs under R 3.6.3 as well as 4.0.0 using
> > > > > > > Ubuntu
> > > > > > > 20.04
> > > > > > > >> LTS with the latest version of the packages (see
> > > > > > > session
> > > > > > > info
> > > > > > > below)
> > > > > > > >> and 256GB RAM installed.
> > > > > > > >>
> > > > > > > >> When loading the raster dataset using rgdal (via
> > > > > > > readGDAL
> > > > > > > or
> > > > > > > >> raster::readAll) I get the follwoing error in R 4.0.0:
> > > > > > > >>
> > > > > > > >> ```
> > > > > > > >> Error in rgdal::getRasterData(con, offset = offs,
> > > > > > > region.dim =
> > > > > > > reg,
> > > > > > > >> band = object@data@band) :
> > > > > > > >> long vectors not supported yet: memory.c:3782
> > > > > > > >> ```
> > > > > > > >
> > > > > > > > On a 16GB Fedora platform:
> > > > > > > >
> > > > > > > >> library(raster) # 3.1-5
> > > > > > > >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > > extent(c(0,
> > > > > > > 72000,
> > > > > > > > 0,
> > > > > > > > + 48000))) # all fine
> > > > > > > >> rDemTest
> > > > > > > > class : RasterLayer
> > > > > > > > dimensions : 48000, 72000, 3.456e+09 (nrow, ncol,
> > > > > > > ncell)
> > > > > > > > resolution : 1, 1 (x, y)
> > > > > > > > extent : 0, 72000, 0, 48000 (xmin, xmax, ymin,
> > > > > > > ymax)
> > > > > > > > crs : NA
> > > > > > > >
> > > > > > > >> values(rDemTest) = 1
> > > > > > > > Error: cannot allocate vector of size 25.7 Gb
> > > > > > > >
> > > > > > > > So you are deceiving yourself into thinking that all is
> > > > > > > fine at
> > > > > > > this
> > > > > > > > point. Please try to instantiate an example that can be
> > > > > > > reproduced on
> > > > > > > > a machine with 8GB RAM.
> > > > > > > >
> > > > > > > > Further note that rgdal::readGDAL() is not how you
> > > > > > > handle
> > > > > > > very
> > > > > > > large
> > > > > > > > objects in files, and never has been. raster can handle
> > > > > > > blocks
> > > > > > > of data
> > > > > > > > from bands in file; stars and gdalcubes can use
> > > > > > > proxy=TRUE
> > > > > > > for the
> > > > > > > > same purpose. Why did you choose rgdal::readGDAL() when
> > > > > > > this is
> > > > > > > not
> > > > > > > > its purpose?
> > > > > > > >
> > > > > > > > You did not say how much RAM is on your platform.
> > > > > > > >
> > > > > > > > Roger
> > > > > > > >
> > > > > > > >>
> > > > > > > >> In R 3.6.3 is is "... memory.c:3717"
> > > > > > > >>
> > > > > > > >> However, I can load the same file with the tiff
> > > > > > > package
> > > > > > > and a
> > > > > > > file of
> > > > > > > >> the same size in the native raster package format
> > > > > > > (*.grd)
> > > > > > > with
> > > > > > > the
> > > > > > > >> raster package but again not with the rgdal package.
> > > > > > > >>
> > > > > > > >> gdalinfo (gdalUtils) does not complain (see below).
> > > > > > > Hence, Even
> > > > > > > >> Rouault assumes the problem is related to rgdal and
> > > > > > > not
> > > > > > > gdal
> > > > > > > >> (
> > > > > > > https://github.com/OSGeo/gdal/issues/2442
> > > > > > >
> > > > > > > ).
> > > > > > > >>
> > > > > > > >> Below you find reproducible code, which generates a
> > > > > > > raster file,
> > > > > > > >> saves the two formats (.tiff and .grd) and tries to
> > > > > > > read
> > > > > > > them
> > > > > > > with
> > > > > > > >> the different packages.
> > > > > > > >>
> > > > > > > >> Is this a known limitation? Any help is greatly
> > > > > > > appreciated!
> > > > > > > >>
> > > > > > > >> Thanks a lot in advance!
> > > > > > > >>
> > > > > > > >> Best wishes and stay healthy,
> > > > > > > >> Thorsten
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> ### Steps to reproduce the problem.
> > > > > > > >>
> > > > > > > >> R code:
> > > > > > > >>
> > > > > > > >> ```
> > > > > > > >> library(rgdal) # 1.4-8
> > > > > > > >> library(raster) # 3.1-5
> > > > > > > >> library(tiff) # 0.1-5
> > > > > > > >>
> > > > > > > >> ## generate and manipulate a big raster dataset
> > > > > > > >> # - generate
> > > > > > > >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
> > > > > > > extent(c(0,
> > > > > > > >> 72000, 0, 48000))) # all fine
> > > > > > > >>
> > > > > > > >> # - manipulate
> > > > > > > >> values(rDemTest) = 1 # all fine
> > > > > > > >>
> > > > > > > >> # - convert
> > > > > > > >> mDemTest = raster::as.matrix(rDemTest) # all fine
> > > > > > > >> str(mDemTest)
> > > > > > > >>
> > > > > > > >> ## save a big dataset
> > > > > > > >>
> > > > > > > >> # - as raster/gdal
> > > > > > > >> sFileNameTiff = "BigData.tif"
> > > > > > > >> writeRaster(rDemTest, sFileNameTiff, "GTiff",
> > > > > > > overwrite =
> > > > > > > TRUE,
> > > > > > > >> NAflag = -9999) # all fine
> > > > > > > >>
> > > > > > > >> # - as raster native
> > > > > > > >> sFileNameNative = "BigData.grd"
> > > > > > > >> writeRaster(rDemTest, sFileNameNative, "raster",
> > > > > > > overwrite =
> > > > > > > TRUE,
> > > > > > > >> NAflag = -9999) # all fine
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> ## load the big raster datasets with different
> > > > > > > packages
> > > > > > > and
> > > > > > > options
> > > > > > > >> # - load the tiff data with the gdal package via the
> > > > > > > raster
> > > > > > > package
> > > > > > > >> rDem = raster(sFileNameTiff) # all fine
> > > > > > > >> extent(rDem) # all fine
> > > > > > > >> mDem = raster::as.matrix(rDem) # error
> > > > > > > >> rDem = readAll(rDem) # error
> > > > > > > >>
> > > > > > > >> # - load the native raster data with the raster
> > > > > > > package
> > > > > > > >> rDem = raster(sFileNameNative) # all fine
> > > > > > > >> extent(rDem) # all fine
> > > > > > > >> mDem = raster::as.matrix(rDem) # all fine
> > > > > > > >> str(mDem)
> > > > > > > >>
> > > > > > > >> # - load the tiff data with the tiff package
> > > > > > > >> mDem = readTIFF(sFileNameTiff) # all fine
> > > > > > > >> str(mDem)
> > > > > > > >>
> > > > > > > >> # - load the tiff data with the gdal package
> > > > > > > >> sfDem = readGDAL(sFileNameTiff) # error
> > > > > > > >>
> > > > > > > >> # - load the native raster data with the gdal package
> > > > > > > >> sfDem = readGDAL(sFileNameNative) # error
> > > > > > > >>
> > > > > > > >> ```
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> ### Startup messages when rgdal is attached (requested
> > > > > > > by
> > > > > > > Roger
> > > > > > > Bivand)
> > > > > > > >>> library(rgdal)
> > > > > > > >> rgdal: version: 1.4-8, (SVN revision 845)
> > > > > > > >> Geospatial Data Abstraction Library extensions to R
> > > > > > > successfully loaded
> > > > > > > >> Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> > > > > > > >> Path to GDAL shared files:
> > > > > > > >> GDAL binary built with GEOS: TRUE
> > > > > > > >> Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th,
> > > > > > > 2020,
> > > > > > > [PJ_VERSION:
> > > > > > > >> 631]
> > > > > > > >> Path to PROJ.4 shared files: (autodetected)
> > > > > > > >> Linking to sp version: 1.4-1
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> ### Session info
> > > > > > > >>> sessionInfo()
> > > > > > > >> R version 4.0.0 (2020-04-24)
> > > > > > > >> Platform: x86_64-pc-linux-gnu (64-bit)
> > > > > > > >> Running under: Ubuntu 20.04 LTS
> > > > > > > >>
> > > > > > > >> Matrix products: default
> > > > > > > >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-
> > > > > > > pthread/libblas.so.3
> > > > > > > >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-
> > > > > > > pthread/liblapack.so.3
> > > > > > > >>
> > > > > > > >> locale:
> > > > > > > >> [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
> > > > > > > LC_TIME=de_DE.UTF-8
> > > > > > > >> [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> > > > > > > >> LC_MESSAGES=de_DE.UTF-8
> > > > > > > >> [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
> > > > > > > >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> > > > > > > >> LC_IDENTIFICATION=C
> > > > > > > >>
> > > > > > > >> attached base packages:
> > > > > > > >> [1] stats graphics grDevices utils datasets
> > > > > > > methods
> > > > > > > base
> > > > > > > >>
> > > > > > > >> other attached packages:
> > > > > > > >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8 tiff_0.1-5
> > > > > > > raster_3.1-5
> > > > > > > >> sp_1.4-1
> > > > > > > >>
> > > > > > > >> loaded via a namespace (and not attached):
> > > > > > > >> [1] compiler_4.0.0 tools_4.0.0 Rcpp_1.0.4.6
> > > > > > > >> R.methodsS3_1.8.0 codetools_0.2-16
> > > > > > > >> [6] grid_4.0.0 iterators_1.0.12 foreach_1.5.0
> > > > > > > >> R.utils_2.9.2 R.oo_1.23.0
> > > > > > > >> [11] lattice_0.20-41
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> ### gdalInfo
> > > > > > > >>> gdalinfo(sFileNameTiff)
> > > > > > > >> [1] "Driver: GTiff/GeoTIFF"
> > > > > > > >> [2] "Files: BigData.tif"
> > > > > > > >> [3] "Size is 72000, 48000"
> > > > > > > >> [4] "Origin =
> > > > > > > (0.000000000000000,48000.000000000000000)"
> > > > > > > >> [5] "Pixel Size = (1.000000000000000,-
> > > > > > > 1.000000000000000)"
> > > > > > > >> [6] "Image Structure Metadata:"
> > > > > > > >> [7] " COMPRESSION=LZW"
> > > > > > > >> [8] " INTERLEAVE=BAND"
> > > > > > > >> [9] "Corner Coordinates:"
> > > > > > > >> [10] "Upper Left ( 0.000, 48000.000) "
> > > > > > > >> [11] "Lower Left ( 0.0000000, 0.0000000) "
> > > > > > > >> [12] "Upper Right ( 72000.000, 48000.000) "
> > > > > > > >> [13] "Lower Right ( 72000.000, 0.000) "
> > > > > > > >> [14] "Center ( 36000.000, 24000.000) "
> > > > > > > >> [15] "Band 1 Block=72000x1 Type=Float32,
> > > > > > > ColorInterp=Gray"
> > > > > > > >> [16] " Min=1.000 Max=1.000 "
> > > > > > > >> [17] " Minimum=1.000, Maximum=1.000, Mean=nan,
> > > > > > > StdDev=nan"
> > > > > > > >> [18] " NoData Value=-9999"
> > > > > > > >> [19] " Metadata:"
> > > > > > > >> [20] " STATISTICS_MAXIMUM=1"
> > > > > > > >> [21] " STATISTICS_MEAN=nan"
> > > > > > > >> [22] " STATISTICS_MINIMUM=1"
> > > > > > > >> [23] " STATISTICS_STDDEV=nan"
> > > > > > > >>
> > > > > > > >> _______________________________________________
> > > > > > > >> R-sig-Geo mailing list
> > > > > > > >>
> > > > > > > [hidden email]
> > > > > > >
> > > > > > > <mailto:
> > > > > > > [hidden email]
> > > > > > >
> > > > > > >
> > > > > > > >>
> > > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > > >
> > > > > > >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > R-sig-Geo mailing list
> > > > > > >
> > > > > > > [hidden email]
> > > > > > >
> > > > > > > <mailto:
> > > > > > > [hidden email]
> > > > > > >
> > > > > > >
> > > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > [[alternative HTML version deleted]]
> > > > > >
> > > > > > _______________________________________________
> > > > > > R-sig-Geo mailing list
> > > > > > [hidden email]
> > > > > >
> > > > > >
> > > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > > _______________________________________________
> > > > R-sig-Geo mailing list
> > > > [hidden email]
> > > >
> > > >
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > >
> > > >
> > > >
> > >
> > > _______________________________________________
> > > R-sig-Geo mailing list
> > > [hidden email]
> > >
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > >
> > >
> > >
> >
>
> _______________________________________________
> R-sig-Geo mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>

_______________________________________________
R-sig-Geo mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo