Opened 13 years ago

Closed 13 years ago

#3693 closed defect (fixed)

MapServer Raster Overview Performance Concerns

Reported by: warmerdam Owned by: warmerdam
Priority: normal Milestone:
Component: GDAL Support Version: svn-trunk (development)
Severity: normal Keywords:
Cc: tbonfort

Description

Frank, thanks for taking the time to answer.

The problem is that I am experiencing the slowdown without going through the warper or the resampler (i.e. no reprojection going on, and no RESAMPLE processing option). The test results I'm attaching clearly show (I hope) that there is a problem when using GDAL overviews:

test case: the full-resolution image is a ~130000x130000 pixel BigTIFF, tiled, LZW-compressed, band interleaved.

setup 1: an external overview is added to the image:

gdaladdo -r average --config INTERLEAVE_OVERVIEW BAND --config COMPRESS_OVERVIEW LZW --config BIGTIFF_OVERVIEW YES -ro $infile 2 4 8 16 32 64 128 256 512

the mapserver layer used is basic: no processing options, and projection identical to the map projection.

setup 2: "manual" overviews are created by repeating the gdalwarp command, each time multiplying the -tr resolution option by 2:

for level in `seq 1 $levels`; do
   curres=`echo "scale=10; $curres*2" | bc`
   target="l$level-$file"
   echo "making file $target, resolution $curres"
   rm $target
   gdalwarp -of GTiff -co TILED=YES -co INTERLEAVE=BAND -co COMPRESS=LZW \
      -co BIGTIFF=YES -r bilinear -tr $curres $curres $prev $target
   prev=$target
done

the mapfile is configured with 10 layers that switch on or off depending on the requested scale. the minscale/maxscale for each layer are set so that only downsampling can occur, not oversampling.
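As a rough illustration of the layer layout described above, the following Python sketch lists the pixel size each of the 10 layers would serve, doubling at every level. The 38 m base resolution is taken from the first test URL mentioned later in this ticket; the function name and the half-open resolution ranges are assumptions for illustration, not the actual mapfile values.

```python
def overview_resolutions(base_res, n_layers):
    """Return the pixel size of each layer, doubling at every level."""
    return [base_res * 2 ** level for level in range(n_layers)]


# Assumption: 38 m full resolution, 10 layers (full res plus 9 reduced levels).
resolutions = overview_resolutions(38, 10)

for level, res in enumerate(resolutions):
    # Each layer only serves requests at or coarser than its own pixel size,
    # so its file is downsampled and never oversampled. The coarsest layer
    # has no upper bound.
    upper = resolutions[level + 1] if level + 1 < len(resolutions) else None
    limit = f"< {upper} m" if upper is not None else "and coarser"
    print(f"level {level}: {res} m/pixel, serves requests >= {res} m {limit}")
```

The coarsest level comes out at 19456 m per pixel, which matches the roughly 20 km pixels of the last test request.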

it is my assumption that both setups are basically doing the same thing (except for the resampling method, which should not affect the performance results that follow). Am I correct?

attached is the result I'm getting, when requesting images at *exactly* (to the extent of rounding errors) the resolutions of the overviews. I'll also point out that for the gdal .ovr overview case, I also tested by slightly modifying the final resolution, so as to rule out a rounding error that would cause gdal/mapserver to use an oversampled overview, to no avail.

for reference, I also include the cases when forcing the resampler, by adding the RESAMPLE=NEAREST processing option (with oversample_ratio set to 1 and 2).

best regards, thomas

Attachments (1)

overviews.png (7.2 KB ) - added by warmerdam 13 years ago.
Thomas' graph of performance results.


Change History (5)

comment:1 by warmerdam, 13 years ago

Status: new → assigned

Digging into now based on a detailed test setup provided by Thomas.

by warmerdam, 13 years ago

Attachment: overviews.png added

Thomas' graph of performance results.

comment:2 by warmerdam, 13 years ago

I am seeing somewhat different results from Thomas, as I understand his graph. I get:

With "MapServer Overviews":

Requests per second:    51.99 [#/sec] (mean)
Requests per second:    62.56 [#/sec] (mean)
Requests per second:    65.17 [#/sec] (mean)
Requests per second:    64.85 [#/sec] (mean)
Requests per second:    62.74 [#/sec] (mean)
Requests per second:    63.66 [#/sec] (mean)
Requests per second:    62.70 [#/sec] (mean)
Requests per second:    69.66 [#/sec] (mean)
Requests per second:    65.56 [#/sec] (mean)
Requests per second:    70.55 [#/sec] (mean)


With GDAL Overviews:

Requests per second:    25.75 [#/sec] (mean)
Requests per second:    26.85 [#/sec] (mean)
Requests per second:    24.95 [#/sec] (mean)
Requests per second:    26.25 [#/sec] (mean)
Requests per second:    25.55 [#/sec] (mean)
Requests per second:    25.92 [#/sec] (mean)
Requests per second:    26.69 [#/sec] (mean)
Requests per second:    26.82 [#/sec] (mean)
Requests per second:    26.55 [#/sec] (mean)
Requests per second:    81.19 [#/sec] (mean)

The first url in the urls.txt is the 38m resolution request (full res) through to the last being the most reduced resolution (20km x 20km pixels). So in this testing GDAL overviews also perform consistently twice as slow or worse than mapserver overviews with the exception of the last level which is fast for both.
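To make the "twice as slow or worse" comparison concrete, here is a small Python sketch that computes the per-level speed ratio from the two result lists above. The variable names are only illustrative; the numbers are copied from the benchmark output in this comment.

```python
# Requests per second copied from the two result lists above,
# ordered from full resolution to the most reduced level.
mapserver_rps = [51.99, 62.56, 65.17, 64.85, 62.74, 63.66, 62.70, 69.66, 65.56, 70.55]
gdal_rps = [25.75, 26.85, 24.95, 26.25, 25.55, 25.92, 26.69, 26.82, 26.55, 81.19]

# Ratio > 1 means the separate per-level files were faster than the .ovr file.
ratios = [m / g for m, g in zip(mapserver_rps, gdal_rps)]

for level, ratio in enumerate(ratios):
    print(f"level {level}: {ratio:.2f}x")
```

For the first nine levels the ratio stays between roughly 2.0 and 2.6; only the last level drops below 1, where the GDAL overview is actually the faster of the two.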

My preliminary guess is that GDAL is having to read tile pointer information for each overview level during the open operation, when it enumerates the available levels, and this is quite expensive. The demonstration file has 884925 tiles in the base layer, so reading the TileOffsets and TileByteCounts values for that layer alone will require processing roughly 15MB. This substantially dwarfs the amount of actual image data that needs to be accessed.
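The size estimate can be checked with a quick back-of-the-envelope calculation. This Python sketch assumes both tile arrays are stored with 8-byte entries, which BigTIFF permits; the actual entry width in the test file is not stated in this ticket, and 4-byte entries would halve the figure.

```python
TILE_COUNT = 884925          # tiles in the base layer, as stated above
ARRAYS = 2                   # one offsets array and one byte-counts array
BYTES_PER_ENTRY = 8          # assumption: 64-bit entries, as BigTIFF allows

tag_bytes = TILE_COUNT * ARRAYS * BYTES_PER_ENTRY
print(f"{tag_bytes} bytes, about {tag_bytes / 1e6:.1f} MB")
```

Under that assumption the two arrays come to 14158800 bytes, about 14.2 MB, in line with the rough 15MB quoted above.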

I will dig more deeply and try to determine why GDAL overviews perform well only for the final layer resolution.

comment:3 by warmerdam, 13 years ago

I prepared a mapscript script to do something similar to the cgi based test and then broke in gdb quite a number of times to sample what is taking time. This confirms that a surprising amount of the time is being spent reading TIFF tags (tile offsets, and tile sizes).

I am going to think a bit more about what can be done to optimize this. Perhaps I can adjust libtiff to defer loading these tags until needed. This could be a big win for files with very large tile/strip maps, particularly where the imagery isn't accessed when scanning directories.

comment:4 by warmerdam, 13 years ago

Resolution: fixed
Status: assigned → closed

I have incorporated support for deferring loading of tile offset/size values from directories until they are actually needed. This is primarily useful for the initial scan for overviews. It is enabled with the --enable-defer-strile-load configure option of libtiff, and will be the default when using the internal libtiff with GDAL 1.9 and later.
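The general idea of deferred loading can be sketched in a few lines of Python. This is only a toy model of the pattern, not libtiff's actual C implementation: the `Directory` class, its `loads` counter, and the tile counts are all hypothetical.

```python
from functools import cached_property


class Directory:
    """Toy stand-in for a TIFF directory (one per overview level)."""

    loads = 0  # counts how many times a tile index was actually read

    def __init__(self, tile_count):
        self.tile_count = tile_count

    @cached_property
    def tile_offsets(self):
        # Expensive step: only runs on first access, then the result is cached.
        Directory.loads += 1
        return list(range(self.tile_count))


# Opening the file and enumerating overview levels touches no tile index.
levels = [Directory(4 ** n) for n in range(5, -1, -1)]
assert Directory.loads == 0

# Reading pixels from one level loads the index for that level only.
first_offset = levels[3].tile_offsets[0]
print(Directory.loads)
```

In this model, scanning all levels costs nothing, and the large index of the base level is never read unless a request actually needs full-resolution tiles.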

See also:

http://fwarmerdam.blogspot.com/2011/02/mapserver-tiff-overview-performance.html
