Opened 20 years ago

Closed 20 years ago

Last modified 20 years ago

#888 closed defect (fixed)

Display of WMS raster layer causes gdal 1.2.1.0 to crash

Reported by: hliu@…
Owned by: warmerdam
Priority: high
Milestone:
Component: GDAL Support
Version: 4.2
Severity: major
Keywords:
Cc:

Description

MapServer 4.2.3 and GDAL 1.2.1.0 on Linux Fedora Core release 1.
When attempting to display a WMS raster layer, GDAL 1.2.1.0 crashes with a
segmentation fault.

The WMS vector display is fine, and the sample raster works as a regular
(non-WMS) MapServer layer.

We have placed a sample (data and mapfile) that produces the error here:
http://www.firstbasesolutions.com/tmp_download/simcoe_10k_6m.zip

The bug can be reproduced using an HTTP WMS GetMap call, or by calling shp2img.

The following is a traceback of shp2img from gdb:

> (gdb) run -m test_wms.map
> Starting program: /root/build/mapserver-4.2.2/shp2img -m test_wms.map
> (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
> (no debugging symbols found)...[Thread debugging using libthread_db enabled]
> [New Thread -1085163296 (LWP 22595)]
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1085163296 (LWP 22595)]
> 0x009bd6ad in TIFFFillStrip (tif=0x827dc98, strip=1) at tif_read.c:252
> 252             bytecount = td->td_stripbytecount[strip];
> (gdb) where
> #0  0x009bd6ad in TIFFFillStrip (tif=0x827dc98, strip=1) at tif_read.c:252
> #1  0x0059f971 in TIFFReadEncodedStrip () from /usr/local/lib/libgdal.so.1
> #2  0x0053dadb in GTiffDataset::LoadBlockBuf(int) () from /usr/local/lib/libgdal.so.1
> #3  0x0053c16f in GTiffRasterBand::IReadBlock(int, int, void*) () from /usr/local/lib/libgdal.so.1
> #4  0x005b1043 in GDALRasterBand::GetBlockRef(int, int, int) () from /usr/local/lib/libgdal.so.1
> #5  0x005b3795 in GDALRasterBand::IRasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, int, int) () from /usr/local/lib/libgdal.so.1
> #6  0x005b0aad in GDALRasterBand::RasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, int, int) () from /usr/local/lib/libgdal.so.1
> #7  0x005b0b52 in GDALRasterIO () from /usr/local/lib/libgdal.so.1
> #8  0x080ac921 in LoadGDALImage ()
> #9  0x080aadf5 in msDrawRasterLayerGDAL ()
> #10 0x0809e6b6 in msDrawRasterLayerLow ()
> #11 0x080676b2 in msDrawLayer ()
> #12 0x08066cd9 in msDrawMap ()
> #13 0x0804e342 in main ()
> ************************************************************************

Mapserver config:
>MapServer Version  MapServer version 4.2.3 OUTPUT=PNG OUTPUT=JPEG OUTPUT=WBMP 
>OUTPUT=PDF OUTPUT=SWF SUPPORTS=PROJ SUPPORTS=FREETYPE SUPPORTS=WMS_SERVER 
>SUPPORTS=WMS_CLIENT SUPPORTS=WFS_SERVER SUPPORTS=WFS_CLIENT INPUT=POSTGIS 
>INPUT=OGR INPUT=GDAL INPUT=SHAPEFILE  

Gdal version:
>gdalinfo --version
>GDAL 1.2.1.0, released 2004/06/23

O.S version:
>[root@pc82 support]# cat /etc/redhat-release
>Fedora Core release 1 (Yarrow)
>[root@pc82 support]# uname -a
>Linux pc82 2.4.22-1.2199.nptl #1 Wed Aug 4 12:21:48 EDT 2004 i686 i686 i386 
>GNU/Linux

thanks

Change History (4)

comment:1 by fwarmerdam, 20 years ago

Status: new → assigned
I have tried this with my Linux build of GDAL (current) and MapServer 4.3 
(current) with no problems (via shp2img). 

I think this is some sort of subtle build issue, and it would be easiest
to diagnose "in place".  Can you provide an "ssh'able" account on the 
system in question?  If not, I should likely just pop over to your
office. 

Thursday morning could work for me. 



comment:2 by hliu@…, 20 years ago

thanks Frank.
I am in a workshop tomorrow, but you can contact Jeremy Hanna 
905 477 3600 to set a time on Thursday morning.

comment:3 by fwarmerdam, 20 years ago

Resolution: fixed
Status: assigned → closed
It was determined (onsite) that the problem was a copy of libtiff code in
pdflib that had a different TIFF structure layout.  Removing pdflib from
MapServer resolved the problem. 

comment:4 by fwarmerdam, 20 years ago

Sent the following email to the pdflib maintainers via pdflib@yahoogroups.com:

--

Dear PDFlib maintainers, 

In PDFlib-Lite-6.0.0p1/doc/pdflib/readme-source-unix.txt, the section
on auxiliary libraries indicates that PDFlib includes modified copies of 
several libraries, including libtiff.   It also indicates that "Due to the
prefixed function names an application can link against both PDFlib (including
all auxilary libraries) and standard versions of these libs without any name
conflicts". 

This does not appear to be entirely true for libtiff at least.  A dump with "nm"
of libpdf.a (libs/pdflib/.libs/libpdf.a in a pretty standard build on Linux - 
Red Hat - Fedora Core 2) has a number of TIFF functions *not* prefixed with
"pdf_".  Among others is the TIFFFillStrip function. 

I spent several hours today assisting a user of UMN MapServer 
(http://mapserver.gis.umn.edu) who was experiencing crashes when reading
TIFF files. The problem turned out to be that they had configured in pdflib
support along with libtiff.  Deep in the code the libtiff functions ended up
calling the pdflib TIFFFillStrip() instead of the one from libtiff itself.
Due to differences in code versions, structures were not properly aligned and
a crash resulted.   The user had already spent a number of man-days trying
to work out what was going wrong. 

My point is if you are going to include a copy of libtiff (and similar issues
may apply to other auxiliary libraries), it is necessary to prefix all functions
and exportable tables with your pdf_ prefix.  Please check the resulting 
object code with nm to ensure you have everything.  In this case TIFFFillStrip()
is declared only within tif_read.c and is not part of the normal public interface.
But with your libtiff linked into the application the real libtiff's 
TIFFReadEncodedStrip() function (also in tif_read.c) actually ended up calling
the pdflib TIFFFillStrip() even though another TIFFFillStrip() was in the very
same object file as TIFFReadEncodedStrip()!

I don't know if this is an issue on all unix platforms, or just Linux.  I 
don't *think* it is an issue on Win32 where functions need to be explicitly
exported from DLLs. 

By the way, in the readme-source-unix.txt, it says more fully:

"""
PDFlib includes modified copies of the libjpeg libtiff, zlib, and libpng
libraries as part of the source code package. These libraries have been
modified for use with PDFlib in several ways:

- all function names are prefixed with a PDFlib-private tag
- code which is not required for PDFlib has been removed
- a number of portability changes have been applied
- bugs have been fixed
"""

If you are building in a copy of libtiff because the normal one has bugs,
then I would appreciate your interfacing with the libtiff team (including
me) to correct them.  If it is because you want to hook memory or IO
functions then there are other ways of doing this without having to hold
an internal copy.  Basically, I would like you to offer a means to build
pdflib with an external libtiff, even if you carry around an internal copy
for some platforms or for "pdflib only" applications.  It would make it
a great deal safer to use your library in larger applications using many
other components. 

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent
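
For reference, the "nm" check suggested in the email above could be sketched in Python roughly as follows. This is only an illustration and was not part of the fix. It assumes GNU nm on Linux with its default three-column text output. The helper names (run_nm, defined_globals, unprefixed_exports, colliding_symbols) are invented for this sketch, and the two sample listings are hand-written stand-ins, not real nm output from libtiff or PDFlib. In particular, pdf_TIFFReadEncodedStrip and pdf_local_helper are hypothetical names.

```python
import subprocess


def run_nm(path):
    """Return GNU nm's listing of defined external symbols for a library.

    Assumes GNU binutils `nm` is on PATH; for a shared object the dynamic
    symbol table (`nm -D`) may be the more relevant listing.
    """
    result = subprocess.run(
        ["nm", "-g", "--defined-only", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def defined_globals(nm_output):
    """Collect names of defined, externally visible symbols from nm text."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type letter, name.
        # Undefined entries, blank lines and archive member headers have fewer.
        if len(parts) != 3:
            continue
        _address, kind, name = parts
        # Uppercase type letters mark global symbols; lowercase ones are local.
        if kind.isupper() and kind != "U":
            names.add(name)
    return names


def unprefixed_exports(nm_output, prefix="pdf_"):
    """List global symbols that lack the expected private prefix."""
    return sorted(n for n in defined_globals(nm_output) if not n.startswith(prefix))


def colliding_symbols(nm_a, nm_b):
    """List global symbols that both libraries define."""
    return sorted(defined_globals(nm_a) & defined_globals(nm_b))


# Hand-written sample listings (hypothetical, for illustration only).
SAMPLE_LIBPDF = """
tif_read.o:
00000000 T TIFFFillStrip
00000120 T pdf_TIFFReadEncodedStrip
00000300 t pdf_local_helper
         U malloc
"""

SAMPLE_LIBTIFF = """
tif_read.o:
00000000 T TIFFFillStrip
00000120 T TIFFReadEncodedStrip
         U malloc
"""

if __name__ == "__main__":
    print(unprefixed_exports(SAMPLE_LIBPDF))
    print(colliding_symbols(SAMPLE_LIBPDF, SAMPLE_LIBTIFF))
```

Against real libraries one would pass run_nm("libpdf.a") and run_nm("libtiff.a") (paths depend on the build) instead of the sample strings. Any name reported by colliding_symbols is a candidate for the kind of cross-library call described in this ticket.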
