Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#5444 closed defect (fixed)

VRT opens datasets for GetMinimum() call

Reported by: warmerdam Owned by: warmerdam
Priority: low Milestone: 1.11.1
Component: GDAL_Raster Version: 1.9.0
Severity: normal Keywords: vrt
Cc: Even Rouault

Description

The current implementation of VRTSourcedRasterBand::GetMinimum() will call GetMinimum() on each of the sourced datasets, which forces them to be opened when that would not otherwise have been required. This is undesirable behavior (IMHO) as it makes something like a gdalinfo on a VRT catalog unbearably expensive. A significant part of the goal of VRTs with many sources is to only touch those needed.

Change History (7)

comment:1 Changed 7 years ago by Even Rouault

Frank,

It must happen only with certain sources, right? With a VRT of TIFF files, I've verified that VRTSourcedRasterBand::GetMinimum() will not cause the TIFF files to be opened, since the GDALRasterBand::GetMinimum() implementation just returns 0 (e.g. for the Byte datatype).

AFAIR, GetMinimum() was implemented on VRT to speed up VRT opening in QGIS.

comment:2 Changed 7 years ago by warmerdam

Cc: Even Rouault added

This is a debug traceback from gdalinfo and a VRT. I don't see what should be avoiding opening the file here. I *think* I'm using GDAL trunk.

GDAL: GDALOpen(out.vrt, this=0x6320a0) succeeds as VRT.
Driver: VRT/Virtual Raster
GDAL: GDALDefaultOverviews::OverviewScan()
Files: out.vrt
       /home/warmerdam/LC82150642014011LGN00_B1.TIF
Size is 7631, 7781
Coordinate System is:
PROJCS["WGS 84 / UTM zone 24N",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-39],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","32624"]]
Origin = (668085.000000000000000,-523785.000000000000000)
Pixel Size = (30.000000000000000,-30.000000000000000)
Metadata:
  AREA_OR_POINT=Point
OGRCT: PROJ >= 4.8.0 features enabled
OGRCT: Source: +proj=utm +zone=24 +datum=WGS84 +units=m +no_defs 
OGRCT: Target: +proj=longlat +datum=WGS84 +no_defs 
Corner Coordinates:
Upper Left  (  668085.000, -523785.000) ( 37d29' 4.18"W,  4d44'13.48"S)
Lower Left  (  668085.000, -757215.000) ( 37d28'43.88"W,  6d50'52.90"S)
Upper Right (  897015.000, -523785.000) ( 35d25'20.38"W,  4d43'46.14"S)
Lower Right (  897015.000, -757215.000) ( 35d24'32.56"W,  6d50'13.29"S)
Center      (  782550.000, -640500.000) ( 36d26'55.31"W,  5d47'19.91"S)
Band 1 Block=128x128 Type=UInt16, ColorInterp=Gray

Breakpoint 1, GDALOpen (pszFilename=0x635338 "/home/warmerdam/LC82150642014011LGN00_B1.TIF", eAccess=GA_ReadOnly) at gdaldataset.cpp:2258
(gdb) where
#0  GDALOpen (pszFilename=0x635338 "/home/warmerdam/LC82150642014011LGN00_B1.TIF", eAccess=GA_ReadOnly) at gdaldataset.cpp:2258
#1  0x00007ffff727cef2 in GDALDatasetPool::_RefDataset (this=0x634880, pszFileName=0x635338 "/home/warmerdam/LC82150642014011LGN00_B1.TIF", eAccess=GA_ReadOnly) at gdalproxypool.cpp:298
#2  0x00007ffff727d2a9 in GDALDatasetPool::RefDataset (pszFileName=0x635338 "/home/warmerdam/LC82150642014011LGN00_B1.TIF", eAccess=GA_ReadOnly) at gdalproxypool.cpp:387
#3  0x00007ffff727d8fb in GDALProxyPoolDataset::RefUnderlyingDataset (this=0x6351f0) at gdalproxypool.cpp:587
#4  0x00007ffff727e533 in GDALProxyPoolRasterBand::RefUnderlyingRasterBand (this=0x635370) at gdalproxypool.cpp:917
#5  0x00007ffff727b5f7 in GDALProxyRasterBand::GetMinimum (this=0x635370, pbSuccess=0x7fffffff3b90) at gdalproxydataset.cpp:208
#6  0x00007ffff721b763 in VRTSimpleSource::GetMinimum (this=0x634720, nXSize=7631, nYSize=7781, pbSuccess=0x7fffffff3b90) at vrtsources.cpp:935
#7  0x00007ffff7216459 in VRTSourcedRasterBand::GetMinimum (this=0x6350c0, pbSuccess=0x7fffffff3c88) at vrtsourcedrasterband.cpp:271
#8  0x00007ffff7281989 in GDALGetRasterMinimum (hBand=0x6350c0, pbSuccess=0x7fffffff3c88) at gdalrasterband.cpp:1796
#9  0x00000000004034e1 in main (argc=<optimized out>, argv=0x630750) at gdalinfo.c:598
(gdb) 
<VRTDataset rasterXSize="7631" rasterYSize="7781">
  <SRS>PROJCS["WGS 84 / UTM zone 24N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-39],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AUTHORITY["EPSG","32624"]]</SRS>
  <GeoTransform>  6.6808500000000000e+05,  3.0000000000000000e+01,  0.0000000000000000e+00, -5.2378500000000000e+05,  0.0000000000000000e+00, -3.0000000000000000e+01</GeoTransform>
  <Metadata>
    <MDI key="AREA_OR_POINT">Point</MDI>
  </Metadata>
  <VRTRasterBand dataType="UInt16" band="1">
    <Metadata />
    <ColorInterp>Gray</ColorInterp>
    <SimpleSource>
      <SourceFilename relativeToVRT="1">LC82150642014011LGN00_B1.TIF</SourceFilename>
      <SourceBand>1</SourceBand>
      <SourceProperties RasterXSize="7631" RasterYSize="7781" DataType="UInt16" BlockXSize="256" BlockYSize="256" />
      <SrcRect xOff="0" yOff="0" xSize="7631" ySize="7781" />
      <DstRect xOff="0" yOff="0" xSize="7631" ySize="7781" />
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>

This is hopefully a minimal demonstration of the problem, but the case that matters to me is one where opening the data source is actually quite expensive.

comment:3 Changed 7 years ago by Even Rouault

Hmm, actually I've profiled against a huge VRT, and it does indeed open the first tile in it, but as the default GetMinimum(pbSuccess) sets *pbSuccess = FALSE, it stops iterating over the other files. But yes, if the opening of that file is slow, then GetMinimum() could be slow.
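
The early exit described here can be modelled with a small pure-Python sketch. It does not use GDAL; LazySource and band_minimum are hypothetical names standing in for a VRT source and for the loop in VRTSourcedRasterBand::GetMinimum(), and the (value, success) pair is an assumed stand-in for the pbSuccess out-parameter.

```python
class LazySource:
    """Hypothetical stand-in for a VRT source opened only on demand."""

    def __init__(self, name, minimum=None):
        self.name = name
        self.minimum = minimum  # None models a driver with no known minimum
        self.opened = False

    def get_minimum(self):
        # Asking a source for its minimum forces the underlying open,
        # as the traceback in comment:2 shows for the proxy pool band.
        self.opened = True
        return self.minimum, self.minimum is not None


def band_minimum(sources):
    """Return (minimum, success); give up at the first source that fails."""
    best = None
    for src in sources:
        value, ok = src.get_minimum()
        if not ok:
            return 0.0, False
        best = value if best is None else min(best, value)
    return (best, True) if best is not None else (0.0, False)


# A catalog of 1000 tiles whose driver reports no minimum.
tiles = [LazySource("tile%d.tif" % i) for i in range(1000)]
result = band_minimum(tiles)
opened = [t.name for t in tiles if t.opened]
print(result, opened)
```

In this model only the first tile is opened before the loop gives up, so the cost is one open rather than one per source. That single open is still the expensive step when the source itself is slow to open.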

comment:4 Changed 7 years ago by warmerdam

Ah, I see. OK, let's leave this as low priority. It isn't a very significant issue.

comment:5 Changed 7 years ago by Even Rouault

The following patch would default to the base implementation of GetMinimum()/GetMaximum() while preserving the possibility of doing it the slow way if VRT_MIN_MAX_FROM_SOURCES=YES is defined:

Index: frmts/vrt/vrtsourcedrasterband.cpp
===================================================================
--- frmts/vrt/vrtsourcedrasterband.cpp	(revision 27237)
+++ frmts/vrt/vrtsourcedrasterband.cpp	(working copy)
@@ -243,6 +243,9 @@
 
 double VRTSourcedRasterBand::GetMinimum( int *pbSuccess )
 {
+    if( !CSLTestBoolean(CPLGetConfigOption("VRT_MIN_MAX_FROM_SOURCES", "FALSE")) )
+        return GDALRasterBand::GetMinimum(pbSuccess);
+
     const char *pszValue = NULL;
 
     if( (pszValue = GetMetadataItem("STATISTICS_MINIMUM")) != NULL )
@@ -294,6 +297,9 @@
 
 double VRTSourcedRasterBand::GetMaximum(int *pbSuccess )
 {
+    if( !CSLTestBoolean(CPLGetConfigOption("VRT_MIN_MAX_FROM_SOURCES", "FALSE")) )
+        return GDALRasterBand::GetMaximum(pbSuccess);
+
     const char *pszValue = NULL;
 
     if( (pszValue = GetMetadataItem("STATISTICS_MAXIMUM")) != NULL )
Index: ../autotest/gcore/vrt_read.py
===================================================================
--- ../autotest/gcore/vrt_read.py	(revision 27226)
+++ ../autotest/gcore/vrt_read.py	(working copy)
@@ -273,6 +273,8 @@
     # Now compute source statistics
     mem_ds.GetRasterBand(1).ComputeStatistics(False)
 
+    gdal.SetConfigOption('VRT_MIN_MAX_FROM_SOURCES', 'YES')
+
     if vrt_ds.GetRasterBand(1).GetMinimum() != 74:
         gdaltest.post_reason('got bad minimum value')
         print(vrt_ds.GetRasterBand(1).GetMinimum())
@@ -282,6 +284,8 @@
         print(vrt_ds.GetRasterBand(1).GetMaximum())
         return 'fail'
 
+    gdal.SetConfigOption('VRT_MIN_MAX_FROM_SOURCES', None)
+
     mem_ds = None
     vrt_ds = None

comment:6 Changed 7 years ago by Even Rouault

Milestone: 1.11.1
Resolution: fixed
Status: new → closed
Version: svn-trunk → 1.9.0

trunk r27540, branches/1.11 r27541 "VRT: implement heuristics to determine if GetMinimum()/GetMaximum() should use the implementation of their sources of not. Can be overriden by setting VRT_MIN_MAX_FROM_SOURCES = YES/NO (#5444)"

comment:7 Changed 7 years ago by Even Rouault

Actually the fix in 1.11 is in r27542
