Opened 17 years ago
Closed 16 years ago
#1758 closed defect (fixed)
Creating overview on a TILED LZW GTiff fails
Reported by: | Even Rouault | Owned by: | warmerdam |
---|---|---|---|
Priority: | normal | Milestone: | 1.5.0 |
Component: | GDAL_Raster | Version: | svn-trunk |
Severity: | normal | Keywords: | geotiff lzw tiled |
Cc: |
Description
I created a 23040 x 3072 GeoTIFF with 4 bands (RGBA) of type Byte using: gdal_translate -co "TILED=YES" -co "COMPRESS=LZW" SOURCE_FILE.XXX gtiff_tiled_lzw.tif
When I run "gdaladdo -r average gtiff_tiled_lzw.tif 2 4 8 16", I get lots of warnings at around 15% progress:
0...10...Warning 1: gtiff_tiled_lzw.tif:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: gtiff_tiled_lzw.tif:LZWDecode: Not enough data at scanline 0 (short 30136 bytes)
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
Tested on GDAL SVN with internal libtiff and libgeotiff.
If the LZW GTiff is not tiled, it works fine.
The source GTiff is not corrupted (I can translate it successfully).
I've reproduced the problem with a smaller image too (4608 x 7680). The warning appears around 40% progress.
So it's really a problem with overview building on tiled LZW Geotiff...
Attachments (2)
Change History (18)
comment:1 by , 17 years ago
Keywords: | geotiff lzw tiled added |
---|
comment:2 by , 17 years ago
comment:3 by , 17 years ago
Actually, I was wrong yesterday: I did not reproduce the problem under exactly the same conditions. Libtiff4 doesn't solve the issue at all. The resampling method matters for reproducing the problem: with 'nearest' it works fine, but not with 'average'. And if you generate the overviews in two passes, like:
gdaladdo -r average gtiff_lzw.tif 2 4
gdaladdo -r average gtiff_lzw.tif 8 16
It works.
I commented out the following optimization in overview.cpp :
if( EQUALN(pszResampling,"AVER",4) && nOverviews > 1 )
    return GDALRegenerateCascadingOverviews( poSrcBand, nOverviews, papoOvrBands,
                                             pszResampling, pfnProgress, pProgressData );
That hides the problem. So my feeling is that there's some corruption (in libtiff?) when reading back the data from the 1/4 overview that was just generated in the previous iteration. The warning actually occurs in the following call in GDALRegenerateOverviews when generating the 1/8 overview:
/* read chunk */
poSrcBand->RasterIO( GF_Read, 0, nChunkYOff, nWidth, nFullResYChunk,
                     pafChunk, nWidth, nFullResYChunk, eType, 0, 0 );
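The cascading scheme discussed in this comment can be pictured with a small self-contained C++ sketch. It is not GDAL code: `Row`, `HalveByAverage` and `BuildCascade` are hypothetical names, and a 1-D row of doubles stands in for a raster band. It only illustrates that each level is computed from the previous one, so that level has to be read back intact first.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for one row of a band (hypothetical, not a GDAL type).
using Row = std::vector<double>;

// Average adjacent pairs: one 2x decimation step.
// Assumes an even number of samples to keep the sketch short.
Row HalveByAverage(const Row& src) {
    assert(src.size() % 2 == 0);
    Row dst(src.size() / 2);
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = (src[2 * i] + src[2 * i + 1]) / 2.0;
    }
    return dst;
}

// Cascading generation: level k is built from level k-1 (or from the
// full-resolution row for the first level), never directly from the source.
std::vector<Row> BuildCascade(const Row& fullRes, int nLevels) {
    std::vector<Row> levels;
    for (int i = 0; i < nLevels; ++i) {
        const Row& src = levels.empty() ? fullRes : levels.back();
        Row next = HalveByAverage(src);
        levels.push_back(std::move(next));
    }
    return levels;
}
```

In this model, any damage to a level when it is read back propagates to every later level, which would be consistent with the errors showing up only once a deeper overview is being built.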
Additional note: a few complementary tests show that the problem can also happen with non-tiled LZW GTiff. The image doesn't need to be that big either: I also hit the bug with 1000x1000 images.
You can reproduce the problem with 001zc013.on1, for example.
gdal_translate -co "COMPRESS=LZW" 001zc013.on1 gtiff_lzw.tif
gdaladdo -r average gtiff_lzw.tif 2 4 8
And you get :
0...10...20...30...40...50...60...70...Warning 1: LZWDecode:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: LZWDecode:Not enough data at scanline 0 (short 1249 bytes)
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
81..90..Warning 1: LZWDecode:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: LZWDecode:Not enough data at scanline 0 (short 175 bytes)
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
.100 - done.
comment:4 by , 17 years ago
Component: | default → GDAL_Raster |
---|---|
Status: | new → assigned |
This sounds like it may be related to #1738.
comment:5 by , 17 years ago
I read #1738 and it appears that it's related to band interleaving. But here 001zc013.on1 and the translated gtiff_lzw.tif have just 1 band (Band 1 Block=1536x5 Type=Byte, ColorInterp=Palette)
comment:6 by , 17 years ago
In fact, Frank, you are right. In a sense, this bug is conceptually the same as #1738...
Here's a proposed fix that works on the above cases.
When using average mode, the Nth overview band is generated from the (N-1)th overview band. To make sure that the (N-1)th band is really written, there is already a "papoOvrBands[iOverview]->FlushCache();" in overview.cpp. However, GTiffRasterBand doesn't override FlushCache(), so there may still be bytes in the LZW-encoded stream that have not been flushed out. Hence the problem when reading the band afterwards.
My patch implements FlushCache() for GTiffRasterBand: in addition to calling the base FlushCache(), it calls FlushCache() on the GTiffDataset. That's a bit radical, but we can't be too careful when doing FlushCache... However, that alone is not enough: we must also call TIFFFlush(hTIFF) in GTiffDataset::FlushCache.
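The shape of the proposed patch can be sketched with a toy model. Everything here is hypothetical and written only for illustration: `ToyDataset`, `ToyBandBase` and `ToyPatchedBand` are not GDAL classes, `FlushEncoder` merely plays the role that TIFFFlush(hTIFF) plays in the patch, and strings stand in for encoded bytes. The point is only that a band-level flush which stops at the encoder leaves bytes unwritten, while one that also flushes the dataset does not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy dataset: bytes the encoder has produced but not yet written,
// and the bytes a later read would actually see.
struct ToyDataset {
    std::string pendingEncoded;
    std::string disk;

    // Plays the role of TIFFFlush(hTIFF) in the patch (assumption).
    void FlushEncoder() {
        disk += pendingEncoded;
        pendingEncoded.clear();
    }

    // Dataset-level flush, which in the patch also flushes the encoder.
    void FlushCache() { FlushEncoder(); }
};

// Toy band without an override: flushing only hands blocks to the encoder.
struct ToyBandBase {
    ToyDataset* ds;
    std::vector<std::string> dirtyBlocks;

    explicit ToyBandBase(ToyDataset* d) : ds(d) { assert(ds != nullptr); }
    virtual ~ToyBandBase() = default;

    virtual void FlushCache() {
        for (const std::string& block : dirtyBlocks) {
            ds->pendingEncoded += block;
        }
        dirtyBlocks.clear();
    }
};

// Toy band with the override described above: base flush, then dataset flush.
struct ToyPatchedBand : ToyBandBase {
    using ToyBandBase::ToyBandBase;

    void FlushCache() override {
        ToyBandBase::FlushCache();
        ds->FlushCache();
    }
};
```

With the base band, a flush of two dirty blocks leaves `disk` empty and the data in `pendingEncoded`; with the patched band, the same flush moves everything to `disk`.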
by , 17 years ago
Attachment: | gdal_svn_bug1758.patch added |
---|
comment:7 by , 17 years ago
Unfortunately, the patch is not sufficient for LZW interleaved multi-band TIFFs.
comment:8 by , 17 years ago
I've attached another version of the patch that contains a workaround for the LZW interleaved multi-band case. When there is compression, the overview is generated with PLANARCONFIG_SEPARATE. It's clearly a workaround and a better fix should be found.
by , 17 years ago
Attachment: | gdal_svn_bug1758_with_workaround_for_mb_lzw.patch added |
---|
Add a workaround for multi-band compressed overviews
comment:9 by , 17 years ago
Milestone: | → 1.5.0 |
---|
Even,
Can you attach a modestly sized file with which I can reproduce this problem, even with the first patch applied?
follow-up: 12 comment:10 by , 17 years ago
In fact I think almost any sufficiently big image (let's say at least 700x700) with "normal" data will hit the bug.
For example, you can reproduce the problem with 001zc013.on1, which is already included in the GDAL test suite (http://dl.maptools.org/dl/gdal/data/nitf/cadrg/)
And then :
gdal_translate -co "COMPRESS=LZW" 001zc013.on1 gtiff_lzw.tif
gdaladdo -r average gtiff_lzw.tif 2 4 8
comment:12 by , 17 years ago
Hi rouault,
I have the same problem with big TIFF/LZW files (8-bit color, 12000x12000 pixels, topographic map 1:10000).
D:\>D:\gdal\gdalwin32-1.4.2\bin\gdaladdo.exe -r average 3240nocol.tif 2 4 8 16
0...10...20...30...40...50...60...70..Warning 1: 3240nocol.tif:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: 3240nocol.tif:LZWDecode: Not enough data at scanline 0 (short 11905 bytes)
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
.80...90.Warning 1: 3240nocol.tif:LZWDecode: Strip -1 not terminated with EOI code
ERROR 1: 3240nocol.tif:LZWDecode: Not enough data at scanline 0 (short 11481 bytes)
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: IReadBlock failed at X offset 0, Y offset 1
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 1
..100 - done.
If I save the file (with IrfanView) in TIFF/LZW format with 24-bit color, I get many more errors: "LZWDecode: Corrupted LZW table at scanline 4096".
If I save the file in uncompressed TIFF format, I get no errors.
gdaladdo 1.4.2
Replying to rouault:
comment:14 by , 16 years ago
The problem was that a mixture of reading and writing resulted in libtiff writing out the contents of one of the tiles twice, doubling the size of the tile and writing over most of the following tile. When the following tile was read it was corrupt.
I fixed this in libtiff with the introduction of the TIFF_BUF4WRITE flag to keep track of whether the rawcc buffer contents are really for a pending write or the result of a recent read. In the libtiff ChangeLog I wrote:
- tif_dir.c, tif_dirread.c, tif_dirwrite.c, tif_read.c, tif_write.c, tiffiop.h: Added TIFF_BUF4WRITE flag to indicate if contents of the rawcp/rawcc buffer are for writing and thus may require flushing. Necessary to distinguish whether they need to be written to disk when in mixed read/write mode and doing a mixture of writing followed by reading.
Currently this change is only in libtiff4, but it will need to be retrofitted to the 3.9 branch as well. The updated code has been imported into trunk as r12954.
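The idea behind the flag can be illustrated with a toy model of a raw buffer shared between reads and writes. This is a sketch under stated assumptions, not libtiff's implementation: `ToyRawBuffer` and its members are hypothetical, `pendingWrite_` only plays the role the comment assigns to TIFF_BUF4WRITE, and each recorded disk write stands in for a tile being written out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of one raw buffer used for both reading and writing.
class ToyRawBuffer {
public:
    // trackWriteIntent == false models the behaviour before the fix.
    explicit ToyRawBuffer(bool trackWriteIntent)
        : trackWriteIntent_(trackWriteIntent) {}

    // Buffer freshly encoded data that still has to reach disk.
    void Write(const std::string& encoded) {
        assert(!encoded.empty());  // empty means "nothing buffered" here
        Flush();
        buf_ = encoded;
        pendingWrite_ = true;
    }

    // Load data from disk into the same buffer; nothing is pending.
    void Read(const std::string& encodedFromDisk) {
        Flush();
        buf_ = encodedFromDisk;
        pendingWrite_ = false;
    }

    // Write the buffer out. Without the flag, contents that came from a
    // read are indistinguishable from a pending write and are written again.
    void Flush() {
        if (buf_.empty()) return;
        const bool shouldWrite = trackWriteIntent_ ? pendingWrite_ : true;
        if (shouldWrite) diskWrites_.push_back(buf_);
        buf_.clear();
        pendingWrite_ = false;
    }

    const std::vector<std::string>& DiskWrites() const { return diskWrites_; }

private:
    bool trackWriteIntent_;
    bool pendingWrite_ = false;
    std::string buf_;
    std::vector<std::string> diskWrites_;
};
```

In this model, a write followed by a read and a flush records the tile twice when the flag is ignored and once when it is honoured, which mirrors the duplicated tile write described above.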
comment:16 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
The changes were ported back to libtiff 3.9 a week or so ago. Closing.
I've tested it again and it works now! So I guess that the recent libtiff4 commit fixed the problem.