Opened 17 years ago

Closed 17 years ago

#1899 closed defect (fixed)

Plugin ABI troubles

Reported by: Markus Neteler
Owned by: Even Rouault
Priority: normal
Milestone: 1.5.0
Component: ConfigBuild
Version: svn-trunk
Severity: normal
Keywords: plugin abi
Cc: warmerdam

Description (last modified by warmerdam)

Frank,

I have tried today with today's SVN HEAD on a RHEL4 64bit box to read SRTM:

gdb gdalinfo
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/tls/libthread_db.so.1".

(gdb) r N59E013.hgt
Starting program: /usr/local/bin/gdalinfo N59E013.hgt
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 182909824480 (LWP 31739)]
Driver: SRTMHGT/SRTMHGT File Format
Files: N59E013.hgt
Size is 1201, 1201
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        TOWGS84[0,0,0,0,0,0,0],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9108"]],
    AXIS["Lat",NORTH],
    AXIS["Long",EAST],
    AUTHORITY["EPSG","4326"]]
Origin = (12.999583333333334,60.000416666666666)
Pixel Size = (0.000833333333333,-0.000833333333333)
Corner Coordinates:
Upper Left  (  12.9995833,  60.0004167) ( 12d59'58.50"E, 60d 0'1.50"N)
Lower Left  (  12.9995833,  58.9995833) ( 12d59'58.50"E, 58d59'58.50"N)
Upper Right (  14.0004167,  60.0004167) ( 14d 0'1.50"E, 60d 0'1.50"N)
Lower Right (  14.0004167,  58.9995833) ( 14d 0'1.50"E, 58d59'58.50"N)
Center      (  13.5000000,  59.5000000) ( 13d30'0.00"E, 59d30'0.00"N)
Band 1 Block=1201x1 Type=Int16, ColorInterp=Undefined
  NoData Value=-32768
  Unit Type: m
*** glibc detected *** double free or corruption (!prev): 0x000000000050dc10 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 182909824480 (LWP 31739)]
0x0000003e1452e21d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000003e1452e21d in raise () from /lib64/tls/libc.so.6
#1  0x0000003e1452fa1e in abort () from /lib64/tls/libc.so.6
#2  0x0000003e14563451 in __libc_message () from /lib64/tls/libc.so.6
#3  0x0000003e1456906e in _int_free () from /lib64/tls/libc.so.6
#4  0x0000003e145693b6 in free () from /lib64/tls/libc.so.6
#5  0x0000003e15cae29e in operator delete () from /usr/lib64/libstdc++.so.6
#6  0x0000002a957f1ae9 in GDALDriverManager::~GDALDriverManager$delete () from /usr/local/lib/libgdal.so.1
#7  0x000000000040302d in main ()

If you want me to debug more, let me know. Same thing when using gdalwarp on the SRTM file.

Markus

Attachments (2)

gdal-grass-1.4.3-check-gdal-version.patch (2.2 KB ) - added by Even Rouault 17 years ago.
Patch to check that gdal-grass plugin is run against the GDAL version it was compiled with
gdal_svn_trunk_ABI_check_1899.patch (15.5 KB ) - added by Even Rouault 17 years ago.
Addition of GDALCheckVersion to GDAL/OGR API


Change History (15)

comment:1 by warmerdam, 17 years ago

Cc: warmerdam added
Description: modified (diff)
Keywords: srtmhgt added
Milestone: 1.5.0
Owner: changed from warmerdam to Even Rouault

Even,

Could you look into this?

comment:2 by Even Rouault, 17 years ago

I have downloaded the same dataset (from http://netgis.geo.uw.edu.pl/srtm/Europe/) and tried gdalinfo on it, on my Ubuntu in 32bit and 64bit environments. I could not reproduce your crash. I ran it under valgrind and did not notice any error either.

From your stack trace, I can see that the crash occurs in GDALDriverManager::~GDALDriverManager. Does the crash occur only on SRTMHGT datasets?

It would help if you completely rebuilt your GDAL tree in debug mode (CFG=debug make clean all) and tried again, so that we get source line numbers. Running under valgrind could be helpful too (although, in my experience, the source of "*** glibc detected *** double free or corruption (!prev)" is rather hard to track down).

comment:3 by Markus Neteler, 17 years ago

(gdb) bt
#0  0x0000003e1452e21d in raise () from /lib64/tls/libc.so.6
#1  0x0000003e1452fa1e in abort () from /lib64/tls/libc.so.6
#2  0x0000003e14563451 in __libc_message () from /lib64/tls/libc.so.6
#3  0x0000003e1456906e in _int_free () from /lib64/tls/libc.so.6
#4  0x0000003e145693b6 in free () from /lib64/tls/libc.so.6
#5  0x0000003e15cae29e in operator delete () from /usr/lib64/libstdc++.so.6
#6  0x0000002a95895944 in ~GDALDriver (this=0x50dc70) at gdaldriver.cpp:61
#7  0x0000002a95897a28 in ~GDALDriverManager (this=0x506fa0) at gdaldrivermanager.cpp:143
#8  0x0000002a9589853a in GDALDestroyDriverManager () at gdaldrivermanager.cpp:664
#9  0x000000000040363e in main (argc=2, argv=0x52f630) at gdalinfo.c:544

(gdb) bt full
#0  0x0000003e1452e21d in raise () from /lib64/tls/libc.so.6
No symbol table info available.
#1  0x0000003e1452fa1e in abort () from /lib64/tls/libc.so.6
No symbol table info available.
#2  0x0000003e14563451 in __libc_message () from /lib64/tls/libc.so.6
No symbol table info available.
#3  0x0000003e1456906e in _int_free () from /lib64/tls/libc.so.6
No symbol table info available.
#4  0x0000003e145693b6 in free () from /lib64/tls/libc.so.6
No symbol table info available.
#5  0x0000003e15cae29e in operator delete () from /usr/lib64/libstdc++.so.6
No symbol table info available.
#6  0x0000002a95895944 in ~GDALDriver (this=0x50dc70) at gdaldriver.cpp:61
No locals.
#7  0x0000002a95897a28 in ~GDALDriverManager (this=0x506fa0) at gdaldrivermanager.cpp:143
        poDriver = (class GDALDriver *) 0x50dc70
#8  0x0000002a9589853a in GDALDestroyDriverManager () at gdaldrivermanager.cpp:664
No locals.
#9  0x000000000040363e in main (argc=2, argv=0x52f630) at gdalinfo.c:544
        hDataset = 0x530690
        hBand = 0x5309f0
        i = 1
        iBand = 1
        adfGeoTransform = {12.999583333333334, 0.00083333333333333339, 0, 60.000416666666666, 0, -0.00083333333333333339}
        hDriver = 0x52aac0
        papszMetadata = (char **) 0x0
        bComputeMinMax = 0
        bSample = 0
        bShowGCPs = 1
        bShowMetadata = 1
        bStats = 0
        bApproxStats = 1
        iMDD = 0
        pszFilename = 0x52f670 "@?R"
        papszExtraMDDomains = (char **) 0x0
        papszFileList = (char **) 0x5317c0
        pszProjection = 0x2a95aea528 "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT["...
        hTransform = 0x0

But, bingo, also with GeoTIFF/Gauss-Boaga (Italy):

gdalinfo /hardmnt/eden0/ssi/eden_data/radiation/linke_data/GB1/lin10.tif
Driver: GTiff/GeoTIFF
Files: /hardmnt/eden0/ssi/eden_data/radiation/linke_data/GB1/lin10.tif
Size is 23, 27
Coordinate System is:
PROJCS["Monte_Mario_Transverse_Mercator",
    GEOGCS["GCS_Monte_Mario",
        DATUM["unknown",
            SPHEROID["unnamed",6378388,297.0000000000014]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",9],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",1500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]]]
Origin = (1594738.499216475989670,5229821.309640650637448)
Pixel Size = (7684.606098286027191,-7684.606098286027191)
Metadata:
  AREA_OR_POINT=Area
Corner Coordinates:
Upper Left  ( 1594738.499, 5229821.310) ( 10d15'3.95"E, 47d12'50.91"N)
Lower Left  ( 1594738.499, 5022336.945) ( 10d12'33.50"E, 45d20'50.26"N)
Upper Right ( 1771484.439, 5229821.310) ( 12d34'55.22"E, 47d 9'53.79"N)
Lower Right ( 1771484.439, 5022336.945) ( 12d27'45.41"E, 45d18'4.24"N)
Center      ( 1683111.469, 5126079.127) ( 11d22'33.54"E, 46d15'45.78"N)
Band 1 Block=23x27 Type=Float32, ColorInterp=Gray
*** glibc detected *** double free or corruption (!prev): 0x000000000050dc70 ***
Aborted

It used to work on that machine.

For valgrind I would need to know how to do it (also not installed there and I am not root user).

uname -a
Linux eden 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:24:31 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

I configured like this:

./configure \
    --with-netcdf=no \
    --with-mrsid=no \
    --with-hdf4 \
    --without-hdf5 \
    --with-pymoddir=/usr/local/lib/python2.3 \
    --with-grass=no \
    --with-ogdi=no \
    --with-sqlite \
    --with-mysql \
    --with-xerces=no \
    --with-geotiff=internal --with-libtiff=internal

Markus

comment:4 by Markus Neteler, 17 years ago

Ouch, forgot wiki formatting - sorry...

comment:5 by Even Rouault, 17 years ago

Hum, the stack trace, even with line numbers, unfortunately doesn't give me much of a clue. Maybe something goes wrong when destroying a driver? IMHO, the fact that it also crashes when reading a dataset recognized by another driver suggests that it is not related to SRTMHGT.

As far as valgrind is concerned, you don't need to be root I think. Download http://valgrind.org/downloads/valgrind-3.2.3.tar.bz2, unpack and the usual build chain should work (I haven't tested it):

./configure --prefix=somewhere_where_you_have_the_right_to_write
make
make install

Then launch gdalinfo under valgrind with something like the following line, run from the directory where you built GDAL:

LD_LIBRARY_PATH=.libs somewhere_where_you_have_the_right_to_write/bin/valgrind apps/.libs/gdalinfo some_dataset

Another thing I would try is to edit frmts/gdalallregister.cpp and comment out most GDALRegister_XXX calls, keeping only GDALRegister_GTiff for example. Then rebuild and try again on your GeoTIFF.

You mention that it used to work on your machine. Do you remember when, and with which version?

comment:6 by Markus Neteler, 17 years ago

Resolution: invalid
Status: changed from new to closed

Thanks for the excellent instructions! This helped to identify me as [censored]:

eden:gdal[19892.37] valgrind apps/.libs/gdalinfo /tmp/N59E013.hgt
==6527== Memcheck, a memory error detector.
==6527== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==6527== Using LibVEX rev 1732, a library for dynamic binary translation.
==6527== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==6527== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==6527== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==6527== For more details, rerun with: -v
==6527==
==6527== Invalid write of size 8
==6527==    at 0x4D477AF: GDALDriver::GDALDriver() (gdaldriver.cpp:47)
==6527==    by 0x5B7AC40: GDALRegister_GRASS (in /usr/local/lib/gdalplugins/gdal_GRASS.so)
==6527==    by 0x4D4A4D3: GDALDriverManager::AutoLoadDrivers() (gdaldrivermanager.cpp:632)
==6527==    by 0x4BAAAFE: GDALAllRegister (gdalallregister.cpp:75)
==6527==    by 0x402203: main (gdalinfo.c:77)
...

This indicates a plugin issue. It works now that I have recompiled the GRASS-GDAL plugin, which I forgot to do yesterday after the update. Shame on me, sorry for wasting your time.
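One plausible way a stale plugin can produce an invalid write inside a library constructor is a class layout change between versions: the plugin allocates an object using the size it saw at compile time, while the library's constructor initializes the newer, larger layout. The sketch below only illustrates that size mismatch. Both structs and the function name are hypothetical and are not the real GDALDriver; whether this is what actually happened here is an assumption.

```cpp
#include <cstddef>

// Hypothetical layout the plugin was compiled against (NOT the real GDALDriver).
struct DriverAsPluginSawIt
{
    void *pFieldA;
    void *pFieldB;
};

// Hypothetical layout in the updated library: one member was added.
struct DriverAsLibrarySeesIt
{
    void *pFieldA;
    void *pFieldB;
    void *pAddedLater;
};

// Number of bytes the library-side constructor would write beyond the
// block that the plugin-side allocation reserved.
std::size_t BytesPastAllocation()
{
    return sizeof(DriverAsLibrarySeesIt) - sizeof(DriverAsPluginSawIt);
}
```

With these made-up layouts, BytesPastAllocation() equals sizeof(void*), so on a platform with 8-byte pointers a single pointer-sized store would land just past the end of the allocation.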

As a small attempt to redeem myself: is there a way to detect that a plugin build (it can be any plugin) is outdated? From the memory corruption it is not that obvious. Kudos to you and valgrind :)

Best regards, Markus

comment:7 by Even Rouault, 17 years ago

Resolution: invalid removed
Status: changed from closed to reopened

I'm pleased you managed to solve your crash.

You raise a good point, however, about ABI compatibility. I think that GDAL officially supports C API/ABI compatibility, but I don't think there is any such claim for the C++ part. When digging around on the Internet, I found this email from Frank Warmerdam, whose conclusion is that plugins should check that the GDAL version is exactly the one they were compiled for. (In theory, two different GDAL versions could have the same ABI, so this test may be a bit too strict. On the other hand, for the development version the ABI may change constantly, so the test is not sufficient there... All in all, it's a simple idea that should satisfy most people. Maintaining a separate ABI version number is very error-prone, because it's easy to forget to increment it.)

I'm attaching a patch towards gdal-grass-1.4.3 that includes such a test.

My Ubuntu distro uses GDAL 1.3.2 by default. I've compiled and installed gdal-grass-1.4.3 against it. When using GDAL 1.5-dev, I now get the following message:

ERROR 1: This version of the GDAL/GRASS plugin was compiled against GDAL 1320 but it is run with GDAL 1500.
As this will (probably) not work, the GDAL/GRASS plugin is disabled.

by Even Rouault, 17 years ago

Patch to check that gdal-grass plugin is run against the GDAL version it was compiled with

comment:8 by Even Rouault, 17 years ago

Component: changed from GDAL_Raster to ConfigBuild
Keywords: plugin abi added; srtmhgt removed
Summary: changed from "SRTM HGT driver: glibc detected *** double free or corruption (!prev)" to "Plugin ABI troubles"

comment:9 by Markus Neteler, 17 years ago

The proposed error message sounds good from a user's point of view. I guess I have already run into this problem more than three times (only this time it unfortunately took longer to remember that the plugin was there).

comment:10 by warmerdam, 17 years ago

Even,

I think that we should only be testing the major and minor version number, not the whole GDAL_VERSION_NUM. GDAL revisions (like 1.4.0/1.4.1/1.4.2) are intended to be ABI compatible so plugins should work across them.

I would suggest we implement the check as a library function. Something like:

       if( !GDALCheckVersion( GDAL_VERSION_MAJOR, GDAL_VERSION_MINOR ) )
           return;
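A minimal sketch of a major/minor-only check follows. It is not the code from the attached patch. It assumes the version number is encoded as major*1000 + minor*100 + revision*10 + build, which is consistent with 1320 for GDAL 1.3.2 and 1500 for 1.5.0 in the message quoted earlier in this ticket. The names DecodeVersionNum and CheckMajorMinor are hypothetical.

```cpp
#include <cstdio>

// Decoded version triple. Assumed encoding of the packed number:
// major*1000 + minor*100 + revision*10 + build.
struct Version
{
    int nMajor;
    int nMinor;
    int nRevision;
};

// Hypothetical helper: split a packed version number into its parts.
Version DecodeVersionNum(int nVersionNum)
{
    Version v;
    v.nMajor = nVersionNum / 1000;
    v.nMinor = (nVersionNum / 100) % 10;
    v.nRevision = (nVersionNum / 10) % 10;
    return v;
}

// Hypothetical check: accept the plugin only when major and minor match.
// The revision is ignored on purpose, since revisions are intended to be
// ABI compatible.
bool CheckMajorMinor(int nCompiledNum, int nRuntimeNum, const char *pszComponent)
{
    const Version c = DecodeVersionNum(nCompiledNum);
    const Version r = DecodeVersionNum(nRuntimeNum);
    if (c.nMajor == r.nMajor && c.nMinor == r.nMinor)
        return true;

    std::fprintf(stderr,
                 "%s was compiled against GDAL %d.%d but is running with "
                 "GDAL %d.%d; disabling it.\n",
                 pszComponent, c.nMajor, c.nMinor, r.nMajor, r.nMinor);
    return false;
}
```

With this sketch, CheckMajorMinor(1400, 1420, "plugin") accepts the plugin because only the revision differs, while CheckMajorMinor(1320, 1500, "plugin") rejects it and prints a message to stderr.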

I'd also note that CPLError has "printf" style support, so you don't have to do:

       char szErrorMsg[256];
       sprintf(szErrorMsg, "This version of the GDAL/GRASS plugin "
                           "was compiled against GDAL %d but it is run with GDAL %s.\n"
                           "As this will (probably) not work, the GDAL/GRASS plugin is disabled.\n",
                           GDAL_VERSION_NUM, GDALVersionInfo("VERSION_NUM"));
       CPLError(CE_Failure, CPLE_AppDefined, szErrorMsg);

Instead just do:

       CPLError( CE_Failure, CPLE_AppDefined,
                 "This version of the GDAL/GRASS plugin "
                 "was compiled against GDAL %d but it is run with GDAL %s.\n"
                 "As this will (probably) not work, the GDAL/GRASS plugin is disabled.\n",
                 GDAL_VERSION_NUM, GDALVersionInfo("VERSION_NUM") );
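For readers unfamiliar with how a "printf"-style reporting function works, here is a minimal sketch of the general technique using a variadic function and vsnprintf. FormatError is a hypothetical stand-in written for this illustration. It is not CPLError and does not reflect how CPLError is implemented.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a printf-style error reporter: the caller
// passes the format and its arguments directly, so no intermediate
// buffer is needed on the calling side.
std::string FormatError(const char *pszFormat, ...)
{
    char szBuffer[512];
    va_list args;
    va_start(args, pszFormat);
    // vsnprintf never writes past the buffer; overlong messages are truncated.
    std::vsnprintf(szBuffer, sizeof(szBuffer), pszFormat, args);
    va_end(args);
    return std::string(szBuffer);
}
```

A call such as FormatError("compiled against GDAL %d but run with GDAL %s", 1320, "1500") yields "compiled against GDAL 1320 but run with GDAL 1500". Passing the format directly also avoids a pitfall of the pre-formatted variant: a message that is built first and then handed over as the format argument is parsed for conversions a second time, which misbehaves if the message happens to contain a '%' character.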

If you would like to handle this issue, go ahead and do so, or if you wish we can reassign it to Mateusz. Once the library function is ready it ought to be called from the registration functions of all drivers likely to be built as plugins (for instance optional drivers).

Also, it would be good to separately prototype the function in ogr_core.h so we don't have to include gdal.h in OGR drivers.

comment:11 by Even Rouault, 17 years ago

I'm attaching a patch for review that implements your proposal.

by Even Rouault, 17 years ago

Addition of GDALCheckVersion to GDAL/OGR API

comment:12 by warmerdam, 17 years ago

It looks good. Go ahead and apply it.

comment:13 by Even Rouault, 17 years ago

Resolution: fixed
Status: changed from reopened to closed

Applied in trunk in r12396
