Ticket #2266 (assigned defect)

Opened 6 months ago

Last modified 5 months ago

Large performance overhead going through RasterIO as compared to ReadBlock

Reported by: grovduck
Assigned to: warmerdam (accepted)
Priority: normal
Milestone: 1.6.0
Component: default
Version: unspecified
Severity: normal
Keywords: RasterIO, ReadBlock, performance
Cc: rouault, chowell

Description

GDALRasterBand.RasterIO() and GDALRasterBand.ReadBlock() should have similar performance when RasterIO is used on block boundaries. In a simple test that reads a large single-band GTiff (8000x13000) 200 times, ReadBlock appears to be up to 4x faster than RasterIO.
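To make "on block boundaries" concrete, here is a minimal sketch of how block-aligned request windows could be enumerated. ReadBlock is addressed by block index, while RasterIO takes a pixel window, so each block maps to one window. The helper name block_windows and the 256x256 block size are hypothetical; the ticket does not state the block layout of the test file.

```python
def block_windows(raster_x, raster_y, block_x, block_y):
    """Yield (block_col, block_row, xoff, yoff, xsize, ysize) for every block.

    Hypothetical helper: the block indices are what a ReadBlock-style call
    would take, the pixel window is what a RasterIO-style call would take.
    """
    blocks_across = (raster_x + block_x - 1) // block_x  # ceiling division
    blocks_down = (raster_y + block_y - 1) // block_y
    for row in range(blocks_down):
        for col in range(blocks_across):
            xoff = col * block_x
            yoff = row * block_y
            # Blocks on the right and bottom edges may be partial.
            xsize = min(block_x, raster_x - xoff)
            ysize = min(block_y, raster_y - yoff)
            yield col, row, xoff, yoff, xsize, ysize


# Raster size from the ticket; the 256x256 block size is an assumption.
windows = list(block_windows(8000, 13000, 256, 256))
total_pixels = sum(w[4] * w[5] for w in windows)
```

The windows cover 8000 * 13000 = 104,000,000 pixels in total. At one byte per pixel (also an assumption) that is consistent with the roughly 100 MB product size mentioned later in this ticket.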

The original post to the gdal-dev mailing list (including code to reproduce the issue) is at http://lists.osgeo.org/pipermail/gdal-dev/2008-March/016353.html. Frank Warmerdam confirmed the unacceptable performance overhead going through RasterIO.

Change History

03/07/08 13:13:47 changed by grovduck

  • keywords set to RasterIO, ReadBlock, performance.

03/16/08 19:12:47 changed by rouault

  • cc set to rouault.

I confirm the performance overhead with RasterIO.

On my machine (AMD Athlon(tm) 64 Processor 3200+ with 512 MB RAM, Linux 2.6.22 32-bit), the slowdown is about 1.5x - 2x. I've profiled a bit with sysprof and top, and it appears that:

  • in the ReadBlock case, we have about 55% CPU usage for userland process, 45% CPU usage for kernel
  • in the RasterIO case, we have about 70% CPU usage for userland process, 30% CPU usage for kernel

The increased CPU usage seems to come mainly from the memcpy at line 115 of rasterio.cpp (the copy from the cached block to the destination buffer), which accounts for about 15-20%, plus a few percent (about 7%) in cache management. Note: I measured the cost of the memcpy in rasterio.cpp by wrapping it in a GDALmemcpy function so that it shows up in the sysprof report.

For the RasterIO case, varying the cache size has no noticeable effect on performance as long as it stays under 100 MB (the size of the test product). Above 100 MB, of course, RasterIO clearly beats ReadBlock. So cache thrashing is clearly affecting this test.
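The cache-size behaviour described above can be reproduced with a small simulation. This is only a sketch: it assumes a strict least-recently-used eviction policy and a test that scans every block in order on each pass, and simulate_lru_hits is a hypothetical name, not a GDAL function.

```python
from collections import OrderedDict


def simulate_lru_hits(num_blocks, cache_capacity, passes):
    """Return (hits, misses) for repeated sequential scans through an LRU cache.

    Assumption: strict LRU eviction, with capacity counted in blocks.
    """
    cache = OrderedDict()
    hits = 0
    misses = 0
    for _ in range(passes):
        for block in range(num_blocks):
            if block in cache:
                cache.move_to_end(block)  # mark as most recently used
                hits += 1
            else:
                misses += 1
                cache[block] = True
                if len(cache) > cache_capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits, misses


# Cache one block too small: every block is evicted before it is reused.
too_small = simulate_lru_hits(num_blocks=100, cache_capacity=99, passes=200)

# Cache holds the whole dataset: only the first pass misses.
large_enough = simulate_lru_hits(num_blocks=100, cache_capacity=100, passes=200)
```

Under these assumptions a cache even slightly smaller than the dataset gives no hits at all on a sequential scan, so every size below the dataset size performs the same, while a cache that holds the whole dataset serves every pass after the first from memory.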

Apart from potentially complex anti-thrashing strategies, it's not obvious to me how to improve this. There is already a test for cache thrashing in gdalrasterband.cpp, which could be used as a hint to disable caching for that raster band. But the figures above show that the cache management cost is small compared to the extra cost of copying the blocks to satisfy the RasterIO operation.

03/26/08 12:10:38 changed by warmerdam

  • status changed from new to assigned.
  • cc changed from rouault to rouault, chowell.

Adding Chris who may do some work on this issue.