Opened 17 years ago

Closed 16 years ago

#1758 closed defect (fixed)

Creating overview on a TILED LZW GTiff fails

Reported by: Even Rouault
Owned by: warmerdam
Priority: normal
Milestone: 1.5.0
Component: GDAL_Raster
Version: svn-trunk
Severity: normal
Keywords: geotiff lzw tiled
Cc:

Description

I created a 23040 x 3072 GeoTIFF with 4 bands (RGBA) of type Byte with:

gdal_translate -co "TILED=YES" -co "COMPRESS=LZW" SOURCE_FILE.XXX gtiff_tiled_lzw.tif

When I run "gdaladdo -r average gtiff_tiled_lzw.tif 2 4 8 16", I get lots of warnings and errors at around 15% progress:

0...10...Warning 1: gtiff_tiled_lzw.tif:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: gtiff_tiled_lzw.tif:LZWDecode: Not enough data at scanline 0 (short 30136 bytes)
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1

Tested on GDAL SVN with internal libtiff and libgeotiff.

If the LZW GTiff is not tiled, it works fine.

The source GTiff is not corrupted (I can translate it successfully).

I've reproduced the problem with a smaller image too (4608 x 7680). The warning appears around 40% progress.

So it's really a problem with overview building on tiled LZW Geotiff...

Attachments (2)

gdal_svn_bug1758.patch (1.7 KB ) - added by Even Rouault 17 years ago.
gdal_svn_bug1758_with_workaround_for_mb_lzw.patch (2.4 KB ) - added by Even Rouault 17 years ago.
Add a workaround for multi-band compressed overviews


Change History (18)

comment:1 by Even Rouault, 17 years ago

Keywords: geotiff lzw tiled added

comment:2 by Even Rouault, 17 years ago

I've tested it again and it works now! So I guess that the recent libtiff4 commit fixed the problem.

comment:3 by Even Rouault, 17 years ago

Actually, I was wrong yesterday: I did not reproduce under exactly the same conditions. Libtiff4 doesn't solve the issue at all. The resampling method matters for reproducing the problem: with 'nearest' it works fine, but not with 'average'. And if you generate the overviews in two passes, like this:

gdaladdo -r average gtiff_lzw.tif 2 4
gdaladdo -r average gtiff_lzw.tif 8 16

It works.

I commented out the following optimization in overview.cpp :

    if( EQUALN(pszResampling,"AVER",4) && nOverviews > 1 )
        return GDALRegenerateCascadingOverviews( poSrcBand, 
                                                 nOverviews, papoOvrBands,
                                                 pszResampling, 
                                                 pfnProgress,
                                                 pProgressData );

Doing so hides the problem. My feeling is that some corruption occurs (in libtiff?) when reading back the data from the 1/4 overview that has just been generated in the previous iteration. The warning actually occurs in the following call in GDALRegenerateOverviews when generating the 1/8 overview:

        /* read chunk */
        poSrcBand->RasterIO( GF_Read, 0, nChunkYOff, nWidth, nFullResYChunk, 
                             pafChunk, nWidth, nFullResYChunk, eType,
                             0, 0 );
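
To illustrate why the cascading path reads back data that has only just been written, here is a minimal sketch of cascaded 2x averaging on a single row of samples. It is not GDAL code: HalveByAverage and CascadeOverviews are hypothetical names, and real overviews are two-dimensional and go through RasterIO. The only point is that each level is computed from the previous level's output rather than from the full-resolution data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of GDAL): average adjacent pairs of samples,
// producing a row half as long. The input length must be even.
std::vector<double> HalveByAverage(const std::vector<double>& row)
{
    assert(row.size() % 2 == 0);
    std::vector<double> out;
    out.reserve(row.size() / 2);
    for (std::size_t i = 0; i + 1 < row.size(); i += 2)
        out.push_back((row[i] + row[i + 1]) / 2.0);
    return out;
}

// Hypothetical helper: build nLevels overview rows, each from the previous one.
// Index 0 is the 1/2 overview, index 1 the 1/4 overview, and so on.
std::vector<std::vector<double>> CascadeOverviews(std::vector<double> row,
                                                  int nLevels)
{
    std::vector<std::vector<double>> levels;
    for (int i = 0; i < nLevels; ++i)
    {
        // Reads the level produced in the previous iteration.
        row = HalveByAverage(row);
        levels.push_back(row);
    }
    return levels;
}
```

In the real code path the "previous level" lives in the TIFF file, so the 1/8 overview can only be correct if the 1/4 overview was completely written before it is read back.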

Additional note: a few complementary tests show that the problem can also happen with non-tiled LZW GTiff. The image doesn't need to be that big either, because I also hit the bug with images of size 1000x1000.

You can reproduce the problem with 001zc013.on1, for example:

gdal_translate -co "COMPRESS=LZW"  001zc013.on1 gtiff_lzw.tif
gdaladdo -r average gtiff_lzw.tif 2 4 8

And you get :

0...10...20...30...40...50...60...70...Warning 1: LZWDecode:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: LZWDecode:Not enough data at scanline 0 (short 1249 bytes)
ERROR 1: TIFFReadEncodedTile() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
81..90..Warning 1: LZWDecode:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: LZWDecode:Not enough data at scanline 0 (short 175 bytes)
ERROR 1: TIFFReadEncodedTile() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
.100 - done.

comment:4 by warmerdam, 17 years ago

Component: default → GDAL_Raster
Status: new → assigned

This sounds like it may be related to #1738.

comment:5 by Even Rouault, 17 years ago

I read #1738 and it appears that it's related to band interleaving. But here 001zc013.on1 and the translated gtiff_lzw.tif have just 1 band (Band 1 Block=1536x5 Type=Byte, ColorInterp=Palette)

comment:6 by Even Rouault, 17 years ago

In fact, Frank, you are right. In a sense, this bug is conceptually the same as #1738...

Here's a proposal fix that works on the above cases.

When using average mode, the Nth overview band is generated from the (N-1)th overview band. To be sure that the (N-1)th band is really written, there is already a "papoOvrBands[iOverview]->FlushCache();" in overview.cpp. However, GTiffRasterBand doesn't override FlushCache(). In this case, there may still be bytes in the LZW encoded stream that have not been flushed out, hence the problem when reading the band afterwards.

My patch implements FlushCache() for GTiffRasterBand: in addition to calling the base FlushCache(), it calls FlushCache() on the GTiffDataset. This is a bit radical, but we can't be too careful when doing FlushCache. However, that alone is not enough: we must also call TIFFFlush(hTIFF) in GTiffDataset::FlushCache.
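
The shape of the proposed change can be sketched with stand-in classes. This is not the attached patch and not the real GDAL class hierarchy: ToyDataset, ToyBand and the log vector are hypothetical, and the "TIFFFlush" log entry only marks where the real code would call TIFFFlush(hTIFF). The sketch shows the band-level flush delegating to the dataset-level flush, which in turn flushes the TIFF handle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for GTiffDataset; records the order of flush steps.
struct ToyDataset
{
    std::vector<std::string> log;

    void FlushCache()
    {
        log.push_back("dataset block cache flushed");
        // The real code would call TIFFFlush(hTIFF) here so that pending
        // compressed bytes reach the file.
        log.push_back("TIFFFlush");
    }
};

// Hypothetical stand-in for GTiffRasterBand.
struct ToyBand
{
    ToyDataset* poDS;

    explicit ToyBand(ToyDataset* ds) : poDS(ds) { assert(ds != nullptr); }

    void FlushCache()
    {
        // Base-class behaviour: write out this band's cached blocks.
        poDS->log.push_back("band block cache flushed");
        // Added step: also flush the owning dataset.
        poDS->FlushCache();
    }
};
```

With this arrangement, a single call to FlushCache() on the band ends with the TIFF-level flush, which is the ordering the cascading overview code relies on before it reads the band back.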

by Even Rouault, 17 years ago

Attachment: gdal_svn_bug1758.patch added

comment:7 by Even Rouault, 17 years ago

Unfortunately, the patch is not sufficient on LZW interleaved multi-band TIFF.

comment:8 by Even Rouault, 17 years ago

I've attached another version of the patch that contains a workaround for the LZW interleaved multi-band case. When there is compression, the overview is generated with PLANARCONFIG_SEPARATE. It's clearly a workaround and a better fix should be found.

by Even Rouault, 17 years ago

Attachment: gdal_svn_bug1758_with_workaround_for_mb_lzw.patch added

Add a workaround for multi-band compressed overviews

comment:9 by warmerdam, 17 years ago

Milestone: 1.5.0

Even,

Can you attach a modestly sized file with which I can reproduce this problem, even with the first patch applied?

comment:10 by Even Rouault, 17 years ago

In fact I think almost any sufficiently big image (let's say at least 700x700) with "normal" data will hit the bug.

For example, you can reproduce the problem with 001zc013.on1, which is already included in the GDAL test suite (http://dl.maptools.org/dl/gdal/data/nitf/cadrg/)

And then :

gdal_translate -co "COMPRESS=LZW"  001zc013.on1 gtiff_lzw.tif
gdaladdo -r average gtiff_lzw.tif 2 4 8

comment:11 by Even Rouault, 17 years ago

This bug had already been reported as #1205

in reply to:  10 comment:12 by Monty, 17 years ago

Hi rouault,

I have the same problem with big TIFF/LZW files (8-bit color, 12000x12000 pixels, topographic map 1:10000):

D:\>D:\gdal\gdalwin32-1.4.2\bin\gdaladdo.exe -r average 3240nocol.tif 2 4 8 16
0...10...20...30...40...50...60...70..Warning 1: 3240nocol.tif:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: 3240nocol.tif:LZWDecode: Not enough data at scanline 0 (short 11905 bytes)
ERROR 1: TIFFReadEncodedTile() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
.80...90.Warning 1: 3240nocol.tif:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: 3240nocol.tif:LZWDecode: Not enough data at scanline 0 (short 11481 bytes)
ERROR 1: TIFFReadEncodedTile() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
..100 - done.

If I save the file (with IrfanView) in TIFF/LZW format with 24-bit color, I get many more errors: "LZWDecode: Corrupted LZW table at scanline 4096".

If I save the file in uncompressed TIFF format, I get no errors.

gdaladdo 1.4.2


comment:13 by Even Rouault, 16 years ago

Also reported for deflate tiff in #1980.

comment:14 by warmerdam, 16 years ago

The problem was that a mixture of reading and writing resulted in libtiff writing out the contents of one of the tiles twice, doubling the size of the tile and writing over most of the following tile. When the following tile was read it was corrupt.

I fixed this in libtiff by introducing the TIFF_BUF4WRITE flag, which keeps track of whether the rawcc buffer contents are really pending a write or are the result of a recent read. In the libtiff ChangeLog I wrote:

  • tif_dir.c, tif_dirread.c, tif_dirwrite.c, tif_read.c, tif_write.c, tiffiop.h: Added TIFF_BUF4WRITE flag to indicate if contents of the rawcp/rawcc buffer are for writing and thus may require flushing. Necessary to distinguish whether they need to be written to disk when in mixed read/write mode and doing a mixture of writing followed by reading.

Currently this change is only in libtiff4, but it will need to be retrofitted to the 3.9 branch as well. The updated code has been imported into trunk as r12954.
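
The idea behind the flag can be shown with a toy model. This is not libtiff code and does not reproduce its internals: ToyTiff, StageWrite, ReadBack and CountWrites are hypothetical names, and the model only counts how many times buffer contents are written out. It assumes a single raw buffer shared by reads and writes, with a flush that either treats any non-empty buffer as pending (the old behaviour as described above) or consults a dedicated "for write" flag (the role TIFF_BUF4WRITE plays).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Toy model, not libtiff code: one raw buffer shared by reads and writes.
class ToyTiff
{
public:
    explicit ToyTiff(bool trackPendingWrite)
        : useFlag_(trackPendingWrite), buf4write_(false) {}

    // Stage encoded bytes; they stay in the raw buffer until flushed.
    void StageWrite(const Bytes& encoded)
    {
        assert(!encoded.empty());
        Flush();
        raw_ = encoded;
        buf4write_ = true;
    }

    // Load a previously written tile into the same raw buffer.
    void ReadBack(std::size_t tileIndex)
    {
        Flush();
        assert(tileIndex < written_.size());
        raw_ = written_[tileIndex];
        buf4write_ = false;  // buffer holds read data, nothing is pending
    }

    // Write the raw buffer out if it is considered pending.
    void Flush()
    {
        // Without the flag, any non-empty buffer looks like pending output.
        const bool pending = useFlag_ ? buf4write_ : !raw_.empty();
        if (pending)
        {
            written_.push_back(raw_);
            raw_.clear();
        }
        buf4write_ = false;
    }

    std::size_t WriteCount() const { return written_.size(); }

private:
    bool useFlag_;
    bool buf4write_;
    Bytes raw_;
    std::vector<Bytes> written_;
};

// Write tile A, read it back, write tile B, then flush.
std::size_t CountWrites(bool trackPendingWrite)
{
    ToyTiff tif(trackPendingWrite);
    tif.StageWrite(Bytes(4, 0xAA));  // tile A
    tif.ReadBack(0);                 // forces A out, then reloads it
    tif.StageWrite(Bytes(4, 0xBB));  // tile B
    tif.Flush();
    return tif.WriteCount();
}
```

In this model the write-read-write sequence records tile A twice when the flag is not tracked and once when it is, which mirrors the duplicated tile write described in this comment.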

comment:15 by kyngchaos, 16 years ago

Latest libtiff svn works for me now without errors (re: bug #1205).

comment:16 by warmerdam, 16 years ago

Resolution: fixed
Status: assigned → closed

The changes were ported back to libtiff 3.9 a week or so ago. Closing.
