Opened 6 years ago

Closed 6 years ago

#7113 closed defect (invalid)

CPLSetTLSWithFreeFuncEx causes signal 6 (SIGABRT)

Reported by: mj10777
Owned by: warmerdam
Priority: normal
Milestone:
Component: OGR_SRS
Version: svn-trunk
Severity: critical
Keywords:
Cc:

Description

For unknown reasons, QGIS sometimes calls OGRSpatialReference::importFromEPSGAInternal at a point where CSVFilename( "gcs.csv" ) does not return a valid path:

  • '/not_existing_dir/not_existing_path'

When this happens, signal 6 (SIGABRT) is raised and the application crashes.

Before this happens, importFromEPSGAInternal is called 4 times with a correct path.

There are also multiple messages of the following kind during this stage:

  • CPLGetTLSList(): pthread_setspecific() failed!

Here is a list of the printf messages that I added to QGIS and GDAL:

"QgsCoordinateReferenceSystem::loadFromDatabase db[/home/mj10777/000_links/apps_local/qgis_300.spatialite/share/qgis/resources/srs.db]"
"QgsCoordinateReferenceSystem::loadFromDatabase sql[select srs_id,description,projection_acronym,ellipsoid_acronym,parameters,srid,auth_name||':'||auth_id,is_geo from tbl_srs where lower(auth_name||':'||auth_id)='epsg:3068' order by deprecated]"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:3068]"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:3068] before OSRDestroySpatialReference"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:3068] after OSRDestroySpatialReference"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:3068] before OSRNewSpatialReference( nullptr )"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[epsg:3068] after OSRNewSpatialReference( nullptr )"
CSVScanFileByName CSVFilename[/home/mj10777/000_links/apps_local/libQt592/share/gdal/gcs.csv]
"QgsCoordinateReferenceSystem::loadFromDatabase isValid[1]"
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
"QgsCoordinateReferenceSystem::loadFromDatabase db[/home/mj10777/000_links/apps_local/qgis_300.spatialite/share/qgis/resources/srs.db]"
"QgsCoordinateReferenceSystem::loadFromDatabase sql[select srs_id,description,projection_acronym,ellipsoid_acronym,parameters,srid,auth_name||':'||auth_id,is_geo from tbl_srs where srs_id='3239' order by deprecated]"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:4030]"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:4030] before OSRDestroySpatialReference"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:4030] after OSRDestroySpatialReference"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[EPSG:4030] before OSRNewSpatialReference( nullptr )"
"QgsCoordinateReferenceSystem::loadFromDatabase mAuthId[epsg:4030] after OSRNewSpatialReference( nullptr )"
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CSVScanFileByName CSVFilename[/not_existing_dir/not_existing_path]
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
QGIS died on signal 11

And the output of valgrind:

==28104==
==28104== Invalid write of size 8
==28104==    at 0x9D5CE07: CPLSetTLSWithFreeFuncEx (cpl_multiproc.cpp:2241)
==28104==    by 0x9D478B4: CPLErrorV (cpl_error.cpp:269)
==28104==    by 0x9D479AE: CPLError (cpl_error.cpp:239)
==28104==    by 0x93399EC: OGRSpatialReference::importFromEPSGAInternal(int, char const*) (ogr_fromepsg.cpp:2238)
==28104==    by 0x933A4CB: OGRSpatialReference::importFromEPSG(int) (ogr_fromepsg.cpp:2137)
==28104==    by 0x93A258B: OGRSpatialReference::SetFromUserInput(char const*) (ogrspatialreference.cpp:2047)
==28104==    by 0x646DCD6: QgsCoordinateReferenceSystem::loadFromDatabase(QString const&, QString const&, QString const&) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==    by 0x646CCF5: QgsCoordinateReferenceSystem::createFromSrsId(long) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==    by 0x646FEB3: QgsCoordinateReferenceSystem::createFromProj4(QString const&) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==    by 0x646B0C6: QgsCoordinateReferenceSystem::fromProj4(QString const&) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==    by 0x64C6195: QgsEllipsoidUtils::ellipsoidParameters(QString const&) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==    by 0x64B9733: QgsDistanceArea::setEllipsoid(QString const&) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==  Address 0x28 is not stack'd, malloc'd or (recently) free'd
==28104==
==28104==
==28104== Process terminating with default action of signal 6 (SIGABRT)
==28104==    at 0x87E7E37: raise (raise.c:56)
==28104==    by 0x87E9527: abort (abort.c:89)
==28104==    by 0x40BA87: qgisCrash(int) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/bin/qgis)
==28104==    by 0x87E7EAF: ??? (in /lib/x86_64-linux-gnu/libc-2.19.so)
==28104==    by 0x9D5CE06: CPLSetTLSWithFreeFuncEx (cpl_multiproc.cpp:2241)
==28104==    by 0x9D478B4: CPLErrorV (cpl_error.cpp:269)
==28104==    by 0x9D479AE: CPLError (cpl_error.cpp:239)
==28104==    by 0x93399EC: OGRSpatialReference::importFromEPSGAInternal(int, char const*) (ogr_fromepsg.cpp:2238)
==28104==    by 0x933A4CB: OGRSpatialReference::importFromEPSG(int) (ogr_fromepsg.cpp:2137)
==28104==    by 0x93A258B: OGRSpatialReference::SetFromUserInput(char const*) (ogrspatialreference.cpp:2047)
==28104==    by 0x646DCD6: QgsCoordinateReferenceSystem::loadFromDatabase(QString const&, QString const&, QString const&) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==    by 0x646CCF5: QgsCoordinateReferenceSystem::createFromSrsId(long) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/lib/libqgis_core.so.2.99.0)
==28104==
==28104== HEAP SUMMARY:
==28104==     in use at exit: 30,544,831 bytes in 165,103 blocks
==28104==   total heap usage: 1,416,965 allocs, 1,251,862 frees, 376,766,452 bytes allocated

Speculation:

The string containing '/not_existing_dir/not_existing_path' may have been freed and then later used as a default value.


What could be the cause of 'CPLGetTLSList(): pthread_setspecific() failed!', which until a month ago never turned up?

This message turned up twice between the call to OSRSetFromUserInput and the if( CSVScanFileByName( CSVFilename( "gcs.csv" ) check inside OGRSpatialReference::importFromEPSGAInternal, and may be the reason why the value of 'GDAL_DATA' has been lost.
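A side note on the valgrind report above: the invalid write of size 8 lands at address 0x28. If the base of a pointer array were null on a 64-bit build, that address would correspond to slot 5 of the array (0x28 = 40 = 5 × 8). This is only one reading of the trace and is not confirmed anywhere in this ticket. The helper name below is hypothetical and exists just to make the arithmetic explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of GDAL or valgrind): if a write faulted at
// `faultAddress` and the base pointer of a pointer array was null, return the
// index of the pointer-sized slot that the code was trying to write.
inline std::size_t slotIndexFromNullBase(std::uintptr_t faultAddress,
                                         std::size_t pointerSize) {
    assert(pointerSize != 0);  // guard against division by zero
    return static_cast<std::size_t>(faultAddress / pointerSize);
}
```

Calling slotIndexFromNullBase(0x28, 8) gives the slot index under the assumption of 8-byte pointers, which matches the x86_64 libc path shown in the trace.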

Change History (10)

comment:1 by Even Rouault, 6 years ago

I'm not sure if it is a GDAL bug, or rather an integration issue of GDAL in QGIS. The error looks as if CPLFinalizeTLS() (likely through GDALDestroy()) had been called before OGRSpatialReference::SetFromUserInput(). You could check if you see a "In GDALDestroy - unloading GDAL shared library" trace (with CPL_DEBUG=ON) appearing before the crash.
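The hazard described here, using per-thread storage after it has been finalized, can be sketched with a toy model. This is not GDAL's implementation (the real code is in cpl_multiproc.cpp); the class name, slot count and method names are all assumptions made for illustration. It only shows why a setter that checks for a missing table fails cleanly, whereas an unchecked write would go through a null pointer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

// Toy model only: stands in for a per-thread table of pointer slots.
class ToyThreadSlots {
public:
    static const std::size_t kSlotCount = 32;  // hypothetical size

    ToyThreadSlots() : slots_(new std::array<void*, kSlotCount>()) {}

    // Models a finalize call that releases the per-thread table.
    void finalize() { slots_.reset(); }

    // Returns false instead of writing through a null table.
    bool set(std::size_t slot, void* value) {
        if (!slots_ || slot >= kSlotCount) {
            return false;
        }
        (*slots_)[slot] = value;
        return true;
    }

    // Returns nullptr when the table is gone or the slot is out of range.
    void* get(std::size_t slot) const {
        if (!slots_ || slot >= kSlotCount) {
            return nullptr;
        }
        return (*slots_)[slot];
    }

private:
    std::unique_ptr<std::array<void*, kSlotCount>> slots_;
};

// Usage sketch: the write after finalize() is rejected instead of crashing.
inline bool writeAfterFinalizeIsRejected() {
    ToyThreadSlots slots;
    int payload = 42;
    assert(slots.set(5, &payload));
    slots.finalize();
    return !slots.set(5, &payload);
}
```

In this model the caller learns about the teardown from the return value. Whether the crash in this ticket follows the same sequence is exactly what the CPL_DEBUG=ON check is meant to establish.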

comment:2 by mj10777, 6 years ago

With CPL_DEBUG=ON, 'In GDALDestroy - unloading GDAL shared library' does not turn up at all when this error occurs.

The source is a fresh copy of master, with changes of a pull request applied. The error occurs during the start up process, with no project being loaded. Since writing this report, I have done a 'make clean' and then rebuilt.

With another copy of 'master', with the git source of the same date, this error does not occur and the above message comes out when the application ends.

comment:3 by mj10777, 6 years ago

QgsCoordinateReferenceSystem::loadFromDatabase:

      OSRDestroySpatialReference( d->mCRS );
      d->mCRS = OSRNewSpatialReference( nullptr );
      // Error occurs during 'OSRSetFromUserInput'
      d->mIsValid = OSRSetFromUserInput( d->mCRS, d->mAuthId.toLower().toLatin1() ) == OGRERR_NONE;
      setMapUnits();

comment:4 by Even Rouault, 6 years ago

OK, so it looks like it was a transient error due to an inconsistent build state?

The 'In GDALDestroy - unloading GDAL shared library' trace at QGIS closing is expected.

comment:5 by mj10777, 6 years ago

No, I was hoping that with the 'make clean' it would go away, but that was not the case.

comment:6 by mj10777, 6 years ago

I am in the process of recompiling with Debug, which is not quite finished.

I replaced the core.so and started the application. The debug output reports an error in getRecord, and then the first 'CPLGetTLSList(): pthread_setspecific() failed!' turns up.

The build is now just completing, so I will try to find the exact point where this happens.

comment:7 by mj10777, 6 years ago

I have been able to isolate where the 'CPLGetTLSList(): pthread_setspecific() failed!' messages start:

QgsCoordinateReferenceSystem::setProj4String

QGis code:

QgsLocaleNumC l;

  OSRDestroySpatialReference( d->mCRS );
  d->mCRS = OSRNewSpatialReference( nullptr );
  QgsDebugMsgLevel( QString("-I-> QgsCoordinateReferenceSystem::setProj4String -1- [%1] ").arg(proj4String), 4 );
  d->mIsValid = OSRImportFromProj4( d->mCRS, proj4String.trimmed().toLatin1().constData() ) == OGRERR_NONE;
  QgsDebugMsgLevel( QString("-I-> QgsCoordinateReferenceSystem::setProj4String -2- [%1] [%2]").arg(d->mIsValid).arg(proj4String), 4 );
 

and the output:

src/core/qgscoordinatereferencesystem.cpp: 1151: (setProj4String) [0ms] -I-> QgsCoordinateReferenceSystem::setProj4String -1- [+proj=longlat +ellps=WGS84 +no_defs] 
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
CPLGetTLSList(): pthread_setspecific() failed!
src/core/qgscoordinatereferencesystem.cpp: 1153: (setProj4String) [0ms] -I-> QgsCoordinateReferenceSystem::setProj4String -2- [1] [+proj=longlat +ellps=WGS84 +no_defs]

The only difference I can see is that *toLatin1()* is being used instead of *.toLocal8Bit().constData()*

I also noticed very mixed usage in the *sqlite3_prepare*, OGR and pj calls, which use

  • sql.toLatin1()
  • sql.toUtf8()
  • sql.toUtf8().constData()

instead of *sql.toLocal8Bit().constData()*. I wonder if something is getting confused.
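On the question of whether the conversion call itself could be the problem: in C++, a temporary created in a call argument lives until the end of the full expression, so passing something like toLatin1() straight into a C function is safe for the duration of that call. The sketch below uses std::string as a stand-in for QByteArray, and both helper functions are hypothetical, so it illustrates the language rule rather than the Qt or GDAL code. For ASCII-only input such as the proj4 string in the log, the Latin-1 and UTF-8 conversions produce the same bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for a C API that only reads the string during the call
// (hypothetical function, not a GDAL call).
inline std::size_t consumeCString(const char* text) {
    assert(text != nullptr);
    return std::strlen(text);
}

// Stand-in for a conversion such as QString::toLatin1(): returns an owning
// byte buffer by value.
inline std::string toBytes(const std::string& s) {
    return s;
}

// Safe: the temporary returned by toBytes() lives until the end of the full
// expression, so the pointer stays valid for the whole call.
inline std::size_t passTemporaryDirectly(const std::string& s) {
    return consumeCString(toBytes(s).c_str());
}

// Also safe: the owning buffer is a named variable that outlives the pointer.
inline std::size_t passViaNamedBuffer(const std::string& s) {
    const std::string bytes = toBytes(s);
    const char* ptr = bytes.c_str();
    return consumeCString(ptr);
}
```

The pattern that would dangle is storing the pointer from a temporary, for example const char* p = toBytes(s).c_str();, and then using p in a later statement. None of the snippets quoted in this ticket do that.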

comment:8 by mj10777, 6 years ago

After adding the Debug Messages to the working master version, I have noticed that instead of the 'failed' messages the following turns up:

src/core/qgscoordinatereferencesystem.cpp: 1149: (setProj4String) [0ms] -I-> QgsCoordinateReferenceSystem::setProj4String -1- [+proj=longlat +ellps=WGS84 +no_defs]
OGRCT: PROJ >= 4.8.0 features enabled
OGRCT: Using locale-safe proj version
src/core/qgscoordinatereferencesystem.cpp: 1151: (setProj4String) [1ms] -I-> QgsCoordinateReferenceSystem::setProj4String -2- [1] [+proj=longlat +ellps=WGS84 +no_defs]

comment:9 by mj10777, 6 years ago

"OK, so it looks like it was a transient error due to an inconsistent build state?"

It would seem so, since it suddenly started to work properly.

For the master version of QGIS it always worked, so I assumed something got confused.

If anything, it would be interesting to know the cause of the following, so that it can be avoided:

==28104==    by 0x40BA87: qgisCrash(int) (in /media/mj10777/tb_4/apps_local/qgis_300.spatialite/bin/qgis)
==28104==    by 0x87E7EAF: ??? (in /lib/x86_64-linux-gnu/libc-2.19.so)
==28104==    by 0x9D5CE06: CPLSetTLSWithFreeFuncEx (cpl_multiproc.cpp:2241)
==28104==    by 0x9D478B4: CPLErrorV (cpl_error.cpp:269)
==28104==    by 0x9D479AE: CPLError (cpl_error.cpp:239)

but otherwise this could be closed.

comment:10 by Even Rouault, 6 years ago

Resolution: invalid
Status: new → closed