Opened 4 years ago

Closed 4 years ago

#5102 closed defect (fixed)

Gdalwarp artifacts in output GTIFF

Reported by: sshak
Owned by: warmerdam
Priority: normal
Milestone: 1.10.1
Component: GDAL_Raster
Version: 1.9.2
Severity: major
Keywords: jpeg nodata mask msb
Cc:

Description

I'm using gdalwarp to combine and reproject multiple tiles into a single image. The input set is a large number of JPEG files (JFIF format), each with a corresponding JGW world file. They start in EPSG:25832 and end up in EPSG:4326 in an uncompressed GTIFF. I select a target region, locate all input files that contain data within that boundary, and then run gdalwarp:

gdalwarp -s_srs EPSG:25832 -t_srs EPSG:4326 -te <bounds> --config GDAL_CACHEMAX 4096 -wm 4096 -srcnodata 0 -dstnodata 0 <input files> <outputfile>

When the output file covers a larger area than the input data, the output shows vertical artifacts. They appear to coincide with each "edge" input tile: one line at the inner edge of the tile and one at the outer edge. overview.JPG shows the right edge of the tiles (black is nodata, filled where no tiles cover those coordinates) with a white pattern in two vertical lines. detail.JPG shows the inner corruption, which follows a similar pattern.

Attachments (3)

overview.JPG (216.3 KB) - added by sshak 4 years ago.
Overview of artifacts.
detail.JPG (120.0 KB) - added by sshak 4 years ago.
Detail of the artifacts.
detail2.JPG (94.1 KB) - added by sshak 4 years ago.
Detail of the nodata boundary

Change History (15)

Changed 4 years ago by sshak

Attachment: overview.JPG added

Overview of artifacts.

Changed 4 years ago by sshak

Attachment: detail.JPG added

Detail of the artifacts.

Changed 4 years ago by sshak

Attachment: detail2.JPG added

Detail of the nodata boundary

comment:1 Changed 4 years ago by Even Rouault

This *might* be caused by JPEG compression if the borders of your source images are at 0, since JPEG compression produces artifacts at the border between nodata and imagery. Perhaps you could run the nearblack utility on the source JPEGs to produce TIFFs that no longer have the artifacts, and then run gdalwarp on those TIFFs. If that still doesn't solve the issue, could you provide sample images that can be used to reproduce it, as well as the gdalwarp command line you used?

comment:2 in reply to:  1 Changed 4 years ago by sshak

Replying to rouault:

The command I used is listed in the original post:

gdalwarp -s_srs EPSG:25832 -t_srs EPSG:4326 -te <bounds> --config GDAL_CACHEMAX 4096 -wm 4096 -srcnodata 0 -dstnodata 0 <input files> <outputfile>

Would your hypothesis explain why I only get vertical lines, and no horizontal lines at the upper edge of the imagery? I also do not get this error at the internal boundaries (where I'm not reading nodata), except on the inner edge of a tile. detail2.JPG shows the "outer edge", the easternmost boundary of the JPEG tile, and detail.JPG shows the "inner edge", where the westernmost boundary of the same JPEG tile meets another JPEG tile with no nodata between them.

As for sample data, I will have to see about creating some artificial source images: the actual images are under copyright, so I cannot provide anything beyond screen captures. I will see if I can make some artificial images as a test.

comment:3 Changed 4 years ago by Even Rouault

The best approach would be to try the nearblack utility first. As for whether my hypothesis is correct, it is just a guess... If you can provide the sample data, the *exact* command line (i.e. with <bounds> expanded to the actual values) would be great.

comment:4 in reply to:  3 Changed 4 years ago by sshak

I managed to get permission to use the exact dataset. I'm attaching a .zip file. I use a script to generate a lot of the commands, so I've decoded it to the single command for this test dataset as well.

Exact command line:

gdalwarp -s_srs EPSG:25832 -t_srs EPSG:4326 -te 8.375 55.000 8.500 55.125 -wm 2048 --config GDAL_CACHEMAX 2048 -srcnodata 0 -dstnodata 0 --optfile optfile test.tif

comment:5 Changed 4 years ago by sshak

The Zip file was 15M which is over the upload limit for the tracker. Due to the firewall here, I can't easily create a link online for download. Either I can work to create a oneshot link via dropbox for the file, or if you have a direct email I can send the file to, that would work.

comment:6 Changed 4 years ago by Even Rouault

dropbox would be fine

comment:8 in reply to:  7 Changed 4 years ago by sshak

Did the download link work for you?

comment:9 Changed 4 years ago by Even Rouault

Yes, the link works. Could you perhaps reduce the number of tiles needed to reproduce the problem? I'm afraid I will not have the motivation to analyze a process that takes so much time and so many tiles to run...

comment:10 Changed 4 years ago by Even Rouault

Never mind, I managed to reproduce it with:

gdalwarp -s_srs EPSG:25832 -t_srs EPSG:4326 -srcnodata 0 -dstnodata 0 464/dop_464_6100.jpg 465/dop_465_6100.jpg out.tif -overwrite

comment:11 Changed 4 years ago by sshak

Glad the download worked and reproduced the issue on your setup, and thanks for assisting on this bit of a confusing issue.

comment:12 Changed 4 years ago by Even Rouault

Component: default → GDAL_Raster
Keywords: jpeg nodata mask msb added
Milestone: 1.10.1
Resolution: fixed
Status: new → closed

trunk(r26063) and branches/1.10 (r26064): "JPEG: add autodetection of bitmasks that are msb ordered. The JPEG_MASK_BIT_ORDER config option can also be set to MSB if the heuristics fails. (#5102)"

OK, the issue was that some of your JPEG images have a nodata bitmask with an unusual convention. Basically, a bitmask is an array of the size of the image where, for each pixel, 0 indicates that the pixel is invalid (transparent) and 1 indicates that it is valid (opaque). A byte (8 bits) holds the values of 8 pixels. The usual convention is that the least significant bit of a byte is the leftmost pixel. In your images, the convention was different: the leftmost pixel was the most significant bit. Hence the weird effects at the left of the image, at the boundary between opaque and transparent areas, and at the right of the image. The above commit auto-detects that unusual ordering in your particular case. You can also force it by setting the JPEG_MASK_BIT_ORDER config option to MSB (e.g. --config JPEG_MASK_BIT_ORDER MSB on the gdalwarp command line).
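
To make the two bit-order conventions concrete, here is a small Python sketch that unpacks one row of a packed 1-bit mask both ways. It is only an illustration of the idea, not the code from the commit: the function name unpack_mask and the sample row are made up for this example.

```python
def unpack_mask(packed, width, msb_first=False):
    """Unpack one row of a 1-bit-per-pixel validity mask into 0/1 flags.

    Illustrative helper only (hypothetical name, not GDAL's implementation).
    """
    flags = []
    for x in range(width):
        byte = packed[x // 8]          # each byte holds 8 consecutive pixels
        bit = x % 8                    # position of the pixel inside its byte
        shift = 7 - bit if msb_first else bit
        flags.append((byte >> shift) & 1)
    return flags


# A 16-pixel row written MSB-first: the 3 leftmost pixels are invalid (0),
# the remaining 13 are valid (1).
row = bytes([0b00011111, 0b11111111])

as_msb = unpack_mask(row, 16, msb_first=True)   # read with the writer's convention
as_lsb = unpack_mask(row, 16, msb_first=False)  # read with the usual convention

print(as_msb)
print(as_lsb)
```

Read with the wrong convention, the three invalid pixels move from columns 0-2 to columns 5-7 of the same 8-pixel group. Valid and invalid pixels only swap places inside a byte that mixes both, which is consistent with artifacts showing up as narrow vertical stripes near the left and right edges and at opaque/transparent boundaries.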

If you cannot upgrade yet, the workaround is to gdal_translate each of your JPEG files into a TIFF for example, and to delete the *.msk files produced (those files contain the -invalid- nodata bitmask). Afterwards, you can run gdalwarp on those TIFF files instead of the original JPEG files.
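
A rough Python sketch of that workaround for a batch of files is given below. It assumes gdal_translate is on the PATH and that the mask sidecars end in .msk and land in the output directory next to the TIFFs; the function names and the directory layout are hypothetical, so adapt them to your own script.

```python
import subprocess
from pathlib import Path


def translate_command(jpeg, out_dir):
    """Build the gdal_translate call that converts one JPEG to a GeoTIFF."""
    tiff = out_dir / (jpeg.stem + ".tif")
    return ["gdal_translate", "-of", "GTiff", str(jpeg), str(tiff)]


def remove_mask_sidecars(out_dir):
    """Delete the *.msk files in out_dir and return the names removed.

    Assumption: the sidecars sit next to the converted TIFFs.
    """
    removed = []
    for msk in sorted(out_dir.glob("*.msk")):
        msk.unlink()
        removed.append(msk.name)
    return removed


def convert_all(jpegs, out_dir):
    """Convert every JPEG, then drop the invalid mask sidecars."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for jpeg in jpegs:
        # check=True aborts on the first failed conversion
        subprocess.run(translate_command(Path(jpeg), out_dir), check=True)
    return remove_mask_sidecars(out_dir)
```

Calling convert_all with the list of source JPEGs and an output directory should leave only the TIFFs behind, which can then be passed to gdalwarp in place of the original JPEG files.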
