Opened 16 years ago

Closed 5 years ago

#2266 closed defect (wontfix)

Large performance overhead going through RasterIO as compared to ReadBlock

Reported by: grovduck Owned by: warmerdam
Priority: normal Milestone: closed_because_of_github_migration
Component: default Version: unspecified
Severity: normal Keywords: RasterIO, ReadBlock, performance
Cc: Even Rouault, chowell, antonio, rprinceley, Mateusz Łoskot

Description

GDALRasterBand.RasterIO() and GDALRasterBand.ReadBlock() should have similar performance when RasterIO is used on block boundaries. In a simple test that read a large single-band GTiff (8000x13000) 200 times, ReadBlock was much faster than RasterIO (up to 4x).

The original post to the gdal-dev mailing list (including code to replicate the issue) is at http://lists.osgeo.org/pipermail/gdal-dev/2008-March/016353.html. Frank Warmerdam confirmed that the performance overhead of going through RasterIO is unacceptable.
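
For readers without access to the original listing, here is a minimal sketch of the kind of comparison described above. It is not the code from the mailing list post. It assumes the GDAL Python bindings (osgeo.gdal) and their Band.ReadBlock and Band.ReadRaster methods; the helper names (block_grid, read_with_readblock, read_with_rasterio, time_passes, run_benchmark) are made up for illustration.

```python
import time


def block_grid(xsize, ysize, block_x, block_y):
    """Return (blocks across, blocks down), counting partial edge blocks."""
    return (-(-xsize // block_x), -(-ysize // block_y))


def read_with_readblock(band):
    """Read every natural block with ReadBlock; return the bytes received."""
    bx, by = band.GetBlockSize()
    nx, ny = block_grid(band.XSize, band.YSize, bx, by)
    total = 0
    for j in range(ny):
        for i in range(nx):
            # ReadBlock takes block indices, not pixel offsets.
            total += len(band.ReadBlock(i, j))
    return total


def read_with_rasterio(band):
    """Read the same blocks through the RasterIO path (ReadRaster in Python)."""
    bx, by = band.GetBlockSize()
    nx, ny = block_grid(band.XSize, band.YSize, bx, by)
    total = 0
    for j in range(ny):
        for i in range(nx):
            xoff, yoff = i * bx, j * by
            # Clip the window at the right and bottom edges of the raster.
            width = min(bx, band.XSize - xoff)
            height = min(by, band.YSize - yoff)
            total += len(band.ReadRaster(xoff, yoff, width, height))
    return total


def time_passes(read_func, band, passes=200):
    """Return the wall-clock seconds taken by `passes` full reads of the band."""
    start = time.perf_counter()
    for _ in range(passes):
        read_func(band)
    return time.perf_counter() - start


def run_benchmark(path, passes=200):
    """Time both access paths on band 1 of `path` (requires GDAL installed)."""
    from osgeo import gdal  # imported lazily so the helpers work without GDAL

    band = gdal.Open(path).GetRasterBand(1)
    return {
        "ReadBlock": time_passes(read_with_readblock, band, passes),
        "RasterIO": time_passes(read_with_rasterio, band, passes),
    }
```

Calling run_benchmark("test.tif") with a hypothetical file name returns the two timings in seconds. The byte totals of the two readers can differ, because ReadBlock returns whole blocks even at the raster edges while the RasterIO windows are clipped.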

Change History (9)

comment:1 by grovduck, 16 years ago

Keywords: RasterIO ReadBlock performance added

comment:2 by Even Rouault, 16 years ago

Cc: Even Rouault added

I confirm the performance overhead with RasterIO.

On my machine (AMD Athlon(tm) 64 Processor 3200+ with 512MB RAM, Linux 2.6.22 32bit), the slowdown is about 1.5x - 2x. I've profiled a bit with sysprof and top, and it appears that:

  • in the ReadBlock case, we have about 55% CPU usage for userland process, 45% CPU usage for kernel
  • in the RasterIO case, we have about 70% CPU usage for userland process, 30% CPU usage for kernel

The increased CPU usage seems to come from the memcpy at line 115 of rasterio.cpp (copy from the cached block to the destination buffer) - about 15-20% - and a few percent in cache management - about 7%. Note: I measured the cost of the memcpy in rasterio.cpp by wrapping it in a GDALmemcpy function, so that it appears in the sysprof report.

For the RasterIO case, varying the cache size has no noticeable effect on performance as long as it stays under 100 MB (the size of the test product). Above 100 MB, of course, RasterIO clearly outperforms ReadBlock. So cache thrashing is clearly affecting this test.
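
The 100 MB threshold is what one would expect if the block cache evicts the least recently used block. The following toy simulation illustrates the effect; it assumes a plain LRU policy and a strictly sequential scan, and it is not GDAL's cache code. The function name count_cache_hits is hypothetical.

```python
from collections import OrderedDict


def count_cache_hits(n_blocks, capacity, passes):
    """Count cache hits when scanning n_blocks sequentially `passes` times.

    The cache is a toy LRU holding at most `capacity` blocks (an assumption
    for illustration, not a model of GDAL internals).
    """
    cache = OrderedDict()
    hits = 0
    for _ in range(passes):
        for block in range(n_blocks):
            if block in cache:
                hits += 1
                cache.move_to_end(block)  # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits
```

With 100 blocks and 3 passes, any capacity below 100 gives zero hits, because each block is evicted just before the scan returns to it. A capacity of 100 gives 200 hits, since every read after the first pass is served from the cache.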

Apart from potentially complex strategies to prevent cache thrashing, it's not obvious to me how to improve that. There is already a test for cache thrashing in gdalrasterband.cpp, and it could be used as a hint for disabling caching for that raster band. But the figures above show that the cache management cost is small compared with the extra cost of copying the blocks to satisfy the RasterIO operation.

comment:3 by warmerdam, 16 years ago

Cc: chowell added
Status: new → assigned

Adding Chris who may do some work on this issue.

comment:4 by antonio, 15 years ago

Cc: antonio added

comment:5 by rprinceley, 15 years ago

Cc: rprinceley added

comment:6 by warmerdam, 12 years ago

Milestone: 1.6.4

Remove non-serious milestone.

comment:7 by Mateusz Łoskot, 11 years ago

Cc: Mateusz Łoskot added

comment:8 by Jukka Rahkonen, 9 years ago

We need data from a new benchmark run with GDAL 1.11 or 2.0-dev. It would be excellent to include some code, or a whole runnable Python script, to ensure that the results are comparable.

comment:9 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: assigned → closed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking; the project has migrated to GitHub. If you believe this ticket is still valid, you may file it at https://github.com/OSGeo/gdal/issues if it is not already reported there.
