Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#3386 closed enhancement (fixed)

Patch: performance improvement for HDF4_EOS:EOS_GRID

Reported by: vanbogaert Owned by: warmerdam
Priority: normal Milestone: 1.8.0
Component: GDAL_Raster Version: 1.7.0
Severity: normal Keywords: HDF4 HDFEOS MODIS
Cc:

Description

Improve the performance in reading MODIS and MERIS HDFEOS GRID product

Method:

for HDF4_EOS:EOS_GRID 2D only
1. Reading the GDchunkinfo
2. Use of GDreadtile
3. if not chunked apply a better BlockSize ( BlockYSize = RasterYSize only if in a reasonable range )

Tested:

on Linux OpenSuse? 11.1 and gdal-1.6.3, gdal-1.7.0 and gdal-1.8dev

Patch:

diff -crB gdal_ref20100204/frmts/hdf4/hdf4imagedataset.cpp gdal/frmts/hdf4/hdf4imagedataset.cpp
*** gdal_ref20100204/frmts/hdf4/hdf4imagedataset.cpp	2010-02-04 14:51:37.783219412 +0100
--- gdal/frmts/hdf4/hdf4imagedataset.cpp	2010-02-04 15:38:19.047224965 +0100
***************
*** 135,140 ****
--- 135,151 ----
      virtual int         GetGCPCount();
      virtual const char  *GetGCPProjection();
      virtual const GDAL_GCP *GetGCPs();
+ /* -------------------------------------------------------------------- */
+ /*      Decode the get tiledinfo for map                                */
+ /*      and set nBlockXSize,nBlockYSize for accessing HDF4_EOS:EOS_GRID */
+ /*      in a more efficient way                                         */
+ /*      UCL/Geomatics (BELGUIM)                                         */
+ /*      author: eric.vanbogaert@uclouvain.be                            */
+ /* -------------------------------------------------------------------- */
+   private:
+ 	int nBlockSize[2];
+ 	bool bReadTile;
+ /*----------------------------------------------------------------------*/
  };
  
  /************************************************************************/
***************
*** 203,209 ****
--- 214,237 ----
          nBlockYSize = MAX(1,MIN(nBlockYSize,poDS->GetRasterYSize()));
      }
      else
+     {
          nBlockYSize = 1;
+ /* -------------------------------------------------------------------- */
+ /*      Decode the get tiledinfo for map                                */
+ /*      and set nBlockXSize,nBlockYSize for accessing HDF4_EOS:EOS_GRID */
+ /*      in a more efficient way                                         */
+ /*      UCL/Geomatics (BELGUIM)                                         */
+ /*      author: eric.vanbogaert@uclouvain.be                            */
+ /* -------------------------------------------------------------------- */
+ 	if (
+ 		( poDS->nBlockSize[0] > 0 ) &&
+ 		( poDS->nBlockSize[1] > 0 )
+ 	)
+ 	{
+ 		nBlockXSize = poDS->nBlockSize[0];
+ 		nBlockYSize = poDS->nBlockSize[1];
+ 	}
+     }
  }
  
  /************************************************************************/
***************
*** 390,396 ****
                      aiEdges[poGDS->iXDim] = nBlockXSize;
                      break;
                  }
!                 if( GDreadfield( hGD, poGDS->pszFieldName,
                                   aiStart, NULL, aiEdges, pImage ) < 0 )
                  {
                      CPLError( CE_Failure, CPLE_AppDefined, 
--- 418,442 ----
                      aiEdges[poGDS->iXDim] = nBlockXSize;
                      break;
                  }
! /* -------------------------------------------------------------------- */
! /*      Decode the get tiledinfo for map                                */
! /*      and set nBlockXSize,nBlockYSize for accessing HDF4_EOS:EOS_GRID */
! /*      in a more efficient way                                         */
! /*      UCL/Geomatics (BELGUIM)                                         */
! /*      author: eric.vanbogaert@uclouvain.be                            */
! /* -------------------------------------------------------------------- */
! 		if ( poGDS->bReadTile )
! 		{
! 			//intn GDreadtile(int32 gridID, char *fieldname, int32 tilecoords[], VOIDP buffer)
! 			//CPLDebug( "HDF4_EOS:EOS_GRID:","GDreadtile for grid %s", poGDS->pszFieldName );
! 			int32 tilecoords[] = { nBlockYOff , nBlockXOff };
! 			if( GDreadtile( hGD, poGDS->pszFieldName , tilecoords , pImage ) != 0 )
! 			{
! 				CPLError( CE_Failure, CPLE_AppDefined, "GDreadtile() failed for block." );
! 				eErr = CE_Failure;
!                 	}
! 		}
! 		else if( GDreadfield( hGD, poGDS->pszFieldName,
                                   aiStart, NULL, aiEdges, pImage ) < 0 )
                  {
                      CPLError( CE_Failure, CPLE_AppDefined, 
***************
*** 2299,2304 ****
--- 2345,2361 ----
      HDF4ImageDataset    *poDS;
  
      poDS = new HDF4ImageDataset( );
+ /* -------------------------------------------------------------------- */
+ /*      Decode the get tiledinfo for map                                */
+ /*      and set nBlockXSize,nBlockYSize for accessing HDF4_EOS:EOS_GRID */
+ /*      in a more efficient way                                         */
+ /*      UCL/Geomatics (BELGUIM)                                         */
+ /*      author: eric.vanbogaert@uclouvain.be                            */
+ /* -------------------------------------------------------------------- */
+     poDS->nBlockSize[0] = -1;
+     poDS->nBlockSize[1] = -1;
+     poDS->bReadTile = false;
+ /* -------------------------------------------------------------------- */
  
      poDS->fp = poOpenInfo->fp;
      poOpenInfo->fp = NULL;
***************
*** 2554,2559 ****
--- 2613,2675 ----
                            poDS->pszFieldName, szDimList);
  #endif
                  poDS->GetImageDimensions( szDimList );
+ /* -------------------------------------------------------------------- */
+ /*      Decode the get tiledinfo for map                                */
+ /*      and set nBlockXSize,nBlockYSize for accessing HDF4_EOS:EOS_GRID */
+ /*	in a more efficient way                                         */
+ /*	UCL/Geomatics (BELGUIM)                                         */
+ /*      author: eric.vanbogaert@uclouvain.be                            */
+ /* -------------------------------------------------------------------- */
+ 		int32 tilecode, tilerank;
+ 		int32 *tiledims = NULL;
+ 		if( GDtileinfo( hGD , poDS->pszFieldName , &tilecode, &tilerank, NULL ) == 0 )
+ 		{
+ 			if ( tilecode == HDFE_TILE )
+ 			{
+ 				tiledims = (int32 *)calloc( tilerank , sizeof( int32 ) );
+ 				GDtileinfo( hGD , poDS->pszFieldName , &tilecode, &tilerank, tiledims );
+ 				if ( ( tilerank == 2 ) && ( poDS->iRank == tilerank  ) )
+ 				{
+ 					poDS->nBlockSize[0] = tiledims[1];
+ 					poDS->nBlockSize[1] = tiledims[0];
+ 					poDS->bReadTile = true;
+ 				}
+ #ifdef DEBUG
+ 				else
+ 				{
+ 					CPLDebug( "HDF4_EOS:EOS_GRID:","tilerank in grid %s: %d not supported", poDS->pszFieldName , (int)tilerank );
+ 				}
+ 				CPLDebug( "HDF4_EOS:EOS_GRID:","tilerank in grid %s: %d", poDS->pszFieldName , (int)tilerank );
+ 				CPLDebug( "HDF4_EOS:EOS_GRID:","tiledimens in grid %s: %d,%d", poDS->pszFieldName , (int)tiledims[0] , (int)tiledims[1] );
+ #endif
+ 			}
+ 			else
+ 			{
+ 				if ( 
+ 					( ( poDS->aiDimSizes[1] * poDS->aiDimSizes[0] ) < ( 1024*1024*128 ) ) && // 128 Mega is actually a good limit
+ 					( poDS->iRank == 2 )
+ 				)
+ 				{
+ 					poDS->nBlockSize[0] = poDS->aiDimSizes[1];
+ 					poDS->nBlockSize[1] = poDS->aiDimSizes[0];
+ 				}
+ #ifdef DEBUG
+ 				CPLDebug( "HDF4_EOS:EOS_GRID:","tilecode == HDFE_NOTILE in grid %s BlockSize: (%d,%d) %d",
+ 					poDS->pszFieldName ,
+ 					poDS->nBlockSize[0] ,
+ 					poDS->nBlockSize[1] ,
+ 					(int)poDS->iRank
+ 					);
+ #endif
+ 			}
+ 		}
+ #ifdef DEBUG
+ 		else
+ 		{
+ 			CPLDebug( "HDF4_EOS:EOS_GRID:","ERROR GDtileinfo %s", poDS->pszFieldName );
+ 		}
+ #endif
+ /* -------------------------------------------------------------------- */
  
  /* -------------------------------------------------------------------- */
  /*      Fetch projection information                                    */

Apply:

#! /bin/sh

export SRC="gdal-1.7.0.tar.gz"
export PKG="gdal-1.7.0"

export PREFIX="/usr/local"

export LD_LIBRARY_PATH=${PREFIX}/lib:$LD_LIBRARY_PATH
export PATH=${PREFIX}/bin:$PATH
export PYTHONPATH=${PREFIX}/lib/python

rm -rf ${PKG}
tar -zxvf ${SRC} &&
cd ${PKG} &&
patch -p1 -i ../gdal_frmts_hdf4.patch &&
./configure \
 --prefix=${PREFIX} \
 --with-local=${PREFIX} \
 --with-threads \
 --with-gnu-ld \
 --with-pic \
 --without-perl \
 --with-python --with-pymoddir=${PREFIX}/lib/python \
 --with-hdf4=${PREFIX} \
 --with-hdf5=${PREFIX} \
 --with-jpeg=internal \
 --with-jasper=${PREFIX} \
 --with-libtiff=internal \
 --with-geotiff=internal \
 --without-netcdf \
 --with-geos=${PREFIX}/bin/geos-config \
 --without-expat \
 2>&1 | tee ../${PKG}.config &&
make -j 4

Attachments (1)

gdal_frmts_hdf4.patch (7.2 KB) - added by vanbogaert 10 years ago.
Patch: hdf4imagedataset.cpp

Download all attachments as: .zip

Change History (8)

Changed 10 years ago by vanbogaert

Attachment: gdal_frmts_hdf4.patch added

Patch: hdf4imagedataset.cpp

comment:1 Changed 10 years ago by vanbogaert

Owner: changed from warmerdam to vanbogaert
Status: newassigned

comment:2 Changed 10 years ago by vanbogaert

Owner: changed from vanbogaert to warmerdam
Status: assignednew

comment:3 Changed 10 years ago by warmerdam

Milestone: 1.7.1

As a non-critical enhancement with no one assigned to incorporate it, I don't see any reason for this ticket to have a milestone.

comment:4 Changed 10 years ago by Even Rouault

Eric,

I've tried your patch, but I haven't found any HDF4_EOS:EOS_GRID dataset where the tile size returned by GDtileinfo is different from the dataset size. For example, with ftp://ladsweb.nascom.nasa.gov/allData/5/MOD09Q1G_EVI/2006/233/MOD09Q1G_EVI.A2006233.h07v03.005.2008338190308.hdf, I get

HDF4_EOS:EOS_GRID:: tilerank in grid MODIS_EVI: 2
HDF4_EOS:EOS_GRID:: tiledimens in grid MODIS_EVI: 4800,4800

which is the dataset dimensions.

Could you point to a public dataset that has block size smaller than dataset dimension ?

comment:5 in reply to:  4 Changed 10 years ago by vanbogaert

Replying to rouault: Dear Rouault

A modis daily 500m reflectance TERRA or AQUA have a tile size of 2400x32 for a rastersize of 2400x2400

ftp://e4ftl01u.ecs.nasa.gov/MOLT/MOD09GA.005/2010.02.24/MOD09GA.A2010055.h17v04.005.2010058075104.hdf

gdalinfo HDF4_EOS:EOS_GRID:"MOD09GA.A2010055.h17v04.005.2010058075104.hdf":MODIS_Grid_500m_2D:sur_refl_b07_1

Band 1 Block=2400x32 Type=Int16, ColorInterp=Gray
NoData Value=-28672

comment:6 Changed 10 years ago by Even Rouault

Milestone: 1.8.0
Resolution: fixed
Status: newclosed

r18989 /trunk/gdal/frmts/hdf4/hdf4imagedataset.cpp: HDF4_EOS_GRID : detect tile dimensions and use them as block size (adapted from original patch by Eric Vanbogaert, UCL/Geomatics (BELGUIM)); raise HDF4_BLOCK_PIXELS default value to 1,000,000 (#3386)

r18990 /trunk/autotest/gcore/hdf4_read.py: Test #3386


Eric,

you'll find in r18989 a variation of your patch. The differences are :

  • Rename the nBlockSize[] array into nBlockPreferredXSize/nBlockPreferredYSize
  • Use that only if nBlockPreferredXSize == nRasterXSize as the IReadBlock() can only handle such case
  • Don't use GDreadtile() for the most bottom block if nRasterYSize is not a multiple of nBlockPreferredYSize. This is for example the case for HDF4_EOS:EOS_GRID:"MOD09GA.A2010055.h17v04.005.2010058075104.hdf":MODIS_Grid_1km_2D:num_observations_1km whose dimension is 1200x1200 and preferred block size is 1200x32. But 1200 / 32 = 37.5
  • I've dropped the part where you set nBlockPreferredXSize/nBlockPreferredYSize to be the dataset dimensions (with the limit of 128 MB) when GDtileinfo() fails. I've raised the default value of the HDF4_BLOCK_PIXELS environment variable (that can be used for HDF4_EOS datasets since yesterday) to 1 MB, so it will read more lines at once. You can raise the value of the environment variable if necessary
  • I've dropped all the mentions to UCL/Geomatics and yourself. If you wish, I could add a new copyright line in the header ?

Please re-open if after some testing, you find I've made something wrong.

comment:7 Changed 10 years ago by Even Rouault

r19069 /trunk/gdal/frmts/hdf4/hdf4imagedataset.cpp: HDF4_EOS_GRID : fix comparison operator for r18989; avoid using preferred block height when it is only 1 as it leads to very poor read performance with some datasets (#3386)

r19070 /trunk/autotest/gcore/hdf4_read.py: Test reading a HDF4_EOS:EOS_GRID where preferred block height reported would be 1 but that will lead to very poor performance - fix done in r19069 (#3386)

Note: See TracTickets for help on using tickets.