Opened 10 years ago

Closed 10 years ago

#4476 closed defect (invalid)

segfault on static deinitialization

Reported by: strk
Owned by: warmerdam
Priority: normal
Milestone:
Component: default
Version: unspecified
Severity: normal
Keywords:
Cc: Mateusz Łoskot

Description

Current GDAL trunk (1.9.0) seems to be doing something dangerous on library deinitialization. See http://hub.qgis.org/issues/4912 for some backtraces.

Change History (6)

comment:1 Changed 10 years ago by Even Rouault

Cc: Mateusz Łoskot added

This mechanism was introduced in http://trac.osgeo.org/gdal/ticket/3824 . CC'ing Mateusz in case he has a clue on what is going on...

Sandro, I've read the qgis ticket, and I don't understand why this is related to topology. Does it do anything special with GDAL (does it use GDAL at all)? For example, if you open qgis, load a GeoTIFF, and exit, do you still see the issue?

Are there components in qgis that load GDAL a second time through some mechanism other than standard dynamic library loading?

If you have a GDAL build with -DDEBUG defined, could you set CPL_DEBUG=ON as an environment variable, run qgis, try the crashing scenario, and check whether "In GDALDestroy - unloading GDAL shared library." appears once or twice (or run qgis under gdb and set a breakpoint in GDALDestroy)? In theory, though, I would expect GDALDestroy() to be safe to call several times: it should clean things up in such a way that repeated calls do no harm.
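To illustrate what "safe to call several times" means here, the following is a minimal sketch of an idempotent teardown function. It is not GDAL's actual code: `Manager`, `GetManager`, `DestroyManager`, and `g_cleanup_count` are hypothetical names invented for the example, and the pattern simply resets the global pointer after freeing it so that a second call finds nothing left to release.

```cpp
#include <cstdlib>

// Hypothetical stand-in for a library-wide singleton.
// This is NOT GDAL's driver manager, only an illustration.
struct Manager {
    int *buffer;
    Manager() : buffer(static_cast<int *>(std::malloc(16 * sizeof(int)))) {}
    ~Manager() { std::free(buffer); }
};

static Manager *g_manager = nullptr;  // hypothetical global instance
static int g_cleanup_count = 0;       // how many times real cleanup work ran

// Lazily create the singleton on first use.
Manager *GetManager() {
    if (g_manager == nullptr) {
        g_manager = new Manager();
    }
    return g_manager;
}

// Idempotent teardown: the first call frees the singleton, and any
// later call returns early because the pointer was reset to nullptr.
void DestroyManager() {
    if (g_manager == nullptr) {
        return;  // already destroyed, or never created
    }
    delete g_manager;
    g_manager = nullptr;  // guard against a double free on the next call
    ++g_cleanup_count;
}
```

With this shape, calling `DestroyManager()` once explicitly and a second time from an automatic library destructor would free the memory only once. Whether GDALDestroy() actually behaves this way in trunk is exactly what the debug output requested above would help confirm.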

comment:2 Changed 10 years ago by strk

I've no idea how it relates to topology. There's really nothing special in there.

GDB finds GDALDestroy called once:

Breakpoint 1, GDALDestroy () at gdaldllmain.cpp:66
66          CPLDebug("GDAL", "In GDALDestroy - unloading GDAL shared library.");
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff2b2576c in malloc_consolidate (av=0x7ffff2e2ce40) at malloc.c:5144
5144    malloc.c: No such file or directory.
        in malloc.c
(gdb) bt
#0  0x00007ffff2b2576c in malloc_consolidate (av=0x7ffff2e2ce40) at malloc.c:5144
#1  0x00007ffff2b28460 in _int_free (av=0x7ffff2e2ce40, p=0x1096ac0) at malloc.c:5017
#2  0x00007ffff2b2be83 in *__GI___libc_free (mem=<value optimized out>) at malloc.c:3738
#3  0x00007ffff3fe5d79 in CPLCleanupTLSList (papTLSList=0x112b030) at cpl_multiproc.cpp:184
#4  0x00007ffff3fa5e3a in ~GDALDriverManager (this=0x1cb2360, __in_chrg=<value optimized out>) at gdaldrivermanager.cpp:234
#5  0x00007ffff3fa54fe in GDALDestroy () at gdaldllmain.cpp:67
#6  0x00007ffff3ce887f in __do_global_dtors_aux () from /usr/local/lib/libgdal.so
#7  0x0000000000000000 in ?? ()

comment:3 in reply to:  1 Changed 10 years ago by Mateusz Łoskot

Replying to rouault:

This mechanism was introduced in http://trac.osgeo.org/gdal/ticket/3824 . CC'ing Mateusz in case he has a clue on what is going on...

I don't think the change I applied is related, see gdaldllmain.cpp

comment:4 Changed 10 years ago by Even Rouault

Mateusz, yes, it was initially disabled and later enabled by me. I CC'ed you just in case you had an idea.

Note: my casual use of GDAL trunk with qgis under Linux has never exhibited that problem, but I haven't specifically tried Sandro's scenario.

Sandro, looking at your last comments in the qgis ticket, it is not very clear that the root of the problem is in GDAL. It could be memory corruption that happens earlier. Ideally we would have some simple code that reproduces this without QGIS in the equation.

Ah, and is there no Valgrind trace at shutdown that shows a problem in those GDAL symbols? In the qgis ticket, you've noted a few Valgrind warnings in qgis code (IMHO the second one, in QBasicAtomicInt::deref(), is a real one, whereas the first one, in PyObject_Free(), can be ignored: it reminds me of similar warnings I have seen with Python in other circumstances that did not come from application bugs).

comment:5 Changed 10 years ago by strk

Indeed, valgrind isn't blaming gdal at all, but you can see it all happens inside the tear-down phase, so the automatic library cleanup could be involved.

comment:6 Changed 10 years ago by Even Rouault

Milestone: 1.9.1
Resolution: invalid
Status: new → closed

Looking at the current status of http://hub.qgis.org/issues/4912, it looks like the issue is on the QGIS end. Closing.
