Opened 15 years ago

Closed 15 years ago

#2820 closed defect (fixed)

gdalwarp with compression corrupts output

Reported by: msieczka Owned by: warmerdam
Priority: normal Milestone: 1.6.1
Component: GDAL_Raster Version: svn-trunk
Severity: critical Keywords: gtiff
Cc:

Description

  1. gdalwarp -co "COMPRESS=LZW" some.tif some_lzw.tif
  1. gdal_translate some_lzw.tif some_lzw2.tif - error:
0ERROR 1: LZWDecode:Corrupted LZW table at scanline 0
ERROR 1: TIFFReadEncodedStrip() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 0
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 0

In case of DEFLATE:

0ERROR 1: ZIPDecode:Decoding error at scanline 0, incorrect header check
ERROR 1: ZIPDecode:Decoding error at scanline 0, too many length or distance symbols
ERROR 1: TIFFReadEncodedStrip() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 0
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 0

GDAL 1.6.0 +SVN r16186

Attachments (3)

map.tif (28.1 KB ) - added by msieczka 15 years ago.
sample tiff to reproduce the error with
ovr11.tif (7.8 KB ) - added by Even Rouault 15 years ago.
"gdaladdo ovr11.tif 2" generates a broken file on that file
fix_for_2820_in_1_6branch.patch (68.0 KB ) - added by Even Rouault 15 years ago.

Download all attachments as: .zip

Change History (16)

comment:1 by warmerdam, 15 years ago

Is this using the internal libtiff, or a system libtiff? Some aspects of update in place with TIFF and compression are very sensitive to the libtiff version. If it is the internal version, then please provide details on the input file (a gdalinfo report for instance).

in reply to:  1 comment:2 by msieczka, 15 years ago

Replying to warmerdam:

Is this using the internal libtiff, or a system libtiff?

Internal. My configure:

--prefix=/usr/local --with-libz=internal --with-png=internal --with-libtiff=internal --with-geotiff=internal --with-jpeg=internal --with-gif=internal --with-geos=yes --with-pg=/usr/bin/pg_config --with-sqlite=/usr/local/lib --with-xerces=yes --with-ecw=yes --with-python --without-ogpython --with-curl --with-hdf4 --with-threads=yes --mandir=/usr/local/share/man

please provide details on the input file (a gdalinfo report for instance).

Any raster map I tried. A small sample 'map.tif' attached. To reproduce the error:

$ gdalwarp -co "COMPRESS=LZW" map.tif  map2.tif

$ gdal_translate map2.tif map3.tifInput file size is 155, 161
0ERROR 1: LZWDecode:Corrupted LZW table at scanline 0
ERROR 1: TIFFReadEncodedStrip() failed.

ERROR 1: IReadBlock failed at X offset 0, Y offset 0
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 0

by msieczka, 15 years ago

Attachment: map.tif added

sample tiff to reproduce the error with

comment:3 by Even Rouault, 15 years ago

Maciej, thanks for reporting that ! I can confirm that :

  • gdal_translate on map2.tif (or gdalinfo -checksum) works with official release 1.6.0
  • it fails with latest SVN branches/1.6
  • it fails with latest SVN trunk

(all the 3 versions being configured against internal libtiff)

So this is likely a regression introduced by a recent commit in the GeoTIFF driver. It seems to be related to the presence of the color table in the source file, because if I do the test again with the same file but without color table, it works.

comment:4 by warmerdam, 15 years ago

Keywords: gtiff added
Status: newassigned

This appears to be related to changes made for #2421 with regard to color table handling. I'll dig into this further.

comment:5 by warmerdam, 15 years ago

I have done quite a bit of rewriting with regard to how calls to TIFFFlush(), TIFFWriteDirectory() and TIFFRewriteDirectory() are controlled in geotiff.cpp - still some more testing to ensure it is correct, but now for this particular case the directory only gets written out once (at Crystalize()) which does not occur till image IO starts.

comment:6 by warmerdam, 15 years ago

I have spent most of today continuing to fiddle with this extremely fragile code, but now I'm reasonable confident I have a safer approach. The changes are currently only in trunk (r16222), but will likely also need to go into 1.6.1 branch - possibly in a reduced impact form. But I'd like to give them a bit of time to settle in trunk.

comment:7 by Even Rouault, 15 years ago

Just a reminder of a discussion started with FrankW on IRC : r16222 makes tiff_ovr_11 fails on 32bit Linux (with latest r16225 internal libtiff, or with libtiff 3.9 CVS, or with Ubuntu 8.04 libtiff 3.8.2). This can be reduced to the fact that gdaladdo ovr11.tif 2 fails (or generate a broken tiff) on the attached ovr11.tif.

"The error seems to be related to the fact that tmp/ovr11.tif has the 3 following conditions : 1) is uncompressed 2) has 3 bands 3) has nodata value. GTiffDataset::FlushDirectory() is called once when beginning to write the first block of data, but the bNeedsRewrite flag is set again to TRUE by SetNoDataValue() on the second and third band, so in the destructor of the overview dataset,TIFFRewriteDirectory() is called again. apparently that's the reason for the error because before the RewriteDirectory() that triggers the error, a tiffdump on the file shows 2 directories, and after the call, it shows an invalid third directory"

by Even Rouault, 15 years ago

Attachment: ovr11.tif added

"gdaladdo ovr11.tif 2" generates a broken file on that file

comment:8 by warmerdam, 15 years ago

I have corrected this problem in trunk (r16243).

I believe things are now working smoothly and pretty safely in trunk, but the changes are scary enough in aggregate that I'm hesitant to move them into 1.6 branch.

comment:9 by Even Rouault, 15 years ago

I had to commit in r16248 an additional fix for a new regression introduced by the removal of TIFFFlush() in GTiffDataset::IBuildOverviews().

The issue could be seen with the following python snippet :

    ds = gdal.GetDriverByName('GTiff').Create('tmp/ovr25.tif',100,100,1)
    ds.GetRasterBand(1).Fill(1)
    ds.GetRasterBand(1).FlushCache()
    ds.BuildOverviews('NEAR', overviewlist = [2])
    ds = None

The FlushDirectory() does essentially a TIFFFlush(), which was done in GDAL 1.6.0 and previous versions.

New tests added in r16249

comment:10 by Even Rouault, 15 years ago

In r16250 I've commited another fix that was necessary to make tiff_write_52 and tiff_write_53 still pass when we compile and link against libtiff 3.X (this used to work with GDAL 1.6.0, so it was a regression).

We set bNeedsRewrite in GTiffRasterBand::SetColorTable(), so that TIFFRewriteDirectory() get called in GTiffDataset::FlushDirectory() at dataset destruction. This is necessary because TIFFFlush() in libtiff 3.X doesn't rewrite the directory because TIFFFlushData() returns 0 if TIFF_BEENWRITING is not set. This has been corrected in libtiff 4.X. Side note : the comment before TIFFFlushData() in CVS head of 3.9 branch (and down to 3.6.0 actually) says that it returns 1, but it actually still returns 0...

Now, trunk doesn't show any regression w.r.t. GDAL 1.6.0 (for tiff_write.py, mask.py and tiff_ovr.py ), when linked against libtiff 3.6.0, 3.8.2-ubuntu, 3.9 CVS head or internal...

comment:11 by Even Rouault, 15 years ago

Frank,

I'm attaching fix_for_2820_in_1_6branch.patch that's a quite impressive backport of geotiff.cpp (before recent changes to overview and differed GetProjectioRef()) and libtiff from trunk. With that patch, there's no more regression in 1.6 branch for autotest and Maciej's testcase with internal libtiff or 3.8.2 external libtiff. Do you think that it can be applied ?

by Even Rouault, 15 years ago

comment:12 by warmerdam, 15 years ago

Even, please go ahead.

comment:13 by Even Rouault, 15 years ago

Resolution: fixed
Status: assignedclosed

Done in r16836

Note: See TracTickets for help on using tickets.