Opened 11 years ago
Closed 11 years ago
#5024 closed defect (fixed)
Conflict between ECW SDK deinitialization and GDAL deinitialization
Reported by: | Even Rouault | Owned by: | Even Rouault |
---|---|---|---|
Priority: | normal | Milestone: | 1.9.3 |
Component: | GDAL_Raster | Version: | unspecified |
Severity: | normal | Keywords: | |
Cc: | Mateusz Łoskot |
Description
The issue was demonstrated by running the gdalecwjp2 test from imageio-ext 1.1, which ends up with a segmentation fault.
Those tests open a GDAL dataset at some point (see the trace below) that isn't explicitly closed in the Java code.
java.lang.Exception: Stack trace
    at java.lang.Thread.dumpStack(Thread.java:1266)
    at it.geosolutions.imageio.gdalframework.GDALUtilities.acquireDataSet(GDALUtilities.java:295)
    at it.geosolutions.imageio.gdalframework.GDALCommonIIOImageMetadata.<init>(GDALCommonIIOImageMetadata.java:106)
    at it.geosolutions.imageio.gdalframework.GDALCommonIIOImageMetadata.<init>(GDALCommonIIOImageMetadata.java:85)
    at it.geosolutions.imageio.gdalframework.GDALImageReader.createDatasetMetadata(GDALImageReader.java:230)
    at it.geosolutions.imageio.gdalframework.GDALImageReader.setInput(GDALImageReader.java:738)
    at com.sun.media.jai.imageioimpl.ImageReadCRIF.getImageReader(ImageReadCRIF.java:250)
    at com.sun.media.jai.imageioimpl.ImageReadCRIF.create(ImageReadCRIF.java:277)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at javax.media.jai.FactoryCache.invoke(FactoryCache.java:122)
    at javax.media.jai.OperationRegistry.invokeFactory(OperationRegistry.java:1674)
    at javax.media.jai.ThreadSafeOperationRegistry.invokeFactory(ThreadSafeOperationRegistry.java:473)
    at javax.media.jai.registry.RIFRegistry.create(RIFRegistry.java:332)
    at javax.media.jai.RenderedOp.createInstance(RenderedOp.java:819)
    at javax.media.jai.RenderedOp.createRendering(RenderedOp.java:867)
    at javax.media.jai.RenderedOp.getMinX(RenderedOp.java:2161)
    at it.geosolutions.imageio.plugins.jp2ecw.JP2KReadTest.testJaiOperations(JP2KReadTest.java:164)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
    at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
    at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
    at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
    at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
    at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:127)
    at org.apache.maven.surefire.Surefire.run(Surefire.java:177)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:345)
    at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1009)
Hence this dataset is destroyed by GDALDestroy() when the process terminates. The problem is that the ECW SDK 3.3 also has a static resource, CNCSJP2File::CNCSJP2FileVector CNCSJP2File::sm_Files (in NCSJP2File.cpp). The issue arises when this variable is destroyed before GDALDestroy() is called (the CNCSJP2FileVector destructor closes all the remaining CNCSJP2File objects). When GDALDestroy() is then called, it destroys the remaining ECW dataset, but the underlying ECW SDK object is now invalid:
==24843== Invalid read of size 8
==24843==    at 0x170404AE: CNCSJP2FileView::GetStream() (NCSJP2FileView.cpp:1855)
==24843==    by 0xF5E55B4: ECWDataset::~ECWDataset() (ecwdataset.cpp:553)
==24843==    by 0xF9E52C9: GDALDriverManager::~GDALDriverManager() (gdaldrivermanager.cpp:196)
==24843==    by 0xF9E613B: GDALDestroyDriverManager (gdaldrivermanager.cpp:811)
==24843==    by 0xF9E48AE: GDALDestroy() (gdaldllmain.cpp:67)
==24843==    by 0xF51D7AE: ??? (in /home/even/gdal/svn/trunk/gdal/libgdal.so)
==24843==    by 0x100320A0: ??? (in /home/even/gdal/svn/trunk/gdal/libgdal.so)
==24843==    by 0x56A7311: exit (exit.c:78)
==24843==    by 0x62C84B6: vm_direct_exit(int) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CB74B: VM_Operation::evaluate() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CA469: VMThread::evaluate_operation(VM_Operation*) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CAA45: VMThread::loop() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==  Address 0x1f60b4f0 is 3,504 bytes inside a block of size 3,584 free'd
==24843==    at 0x4C283A4: operator delete(void*) (vg_replace_malloc.c:480)
==24843==    by 0x17033703: CNCSJP2File::~CNCSJP2File() (NCSJP2File.cpp:164)
==24843==    by 0x1703221B: CNCSJP2File::CNCSJP2FileVector::CloseAll() (NCSJP2File.cpp:1239)
==24843==    by 0x17039D2A: CNCSJP2File::CNCSJP2FileVector::~CNCSJP2FileVector() (NCSJP2File.h:98)
==24843==    by 0x56A7311: exit (exit.c:78)
==24843==    by 0x62C84B6: vm_direct_exit(int) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CB74B: VM_Operation::evaluate() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CA469: VMThread::evaluate_operation(VM_Operation*) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CAA45: VMThread::loop() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CAD41: VMThread::run() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x6485401: java_start(Thread*) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x504E9C9: start_thread (pthread_create.c:300)
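The ordering problem can be reproduced in miniature without either library. The following is only a sketch: SdkFile, SdkRegistry, Dataset and ShutdownIsSafe are hypothetical stand-ins, not ECW SDK or GDAL types, and an "open" flag replaces the real freed memory so that the stale access can be observed without undefined behaviour.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SDK file object (not the real CNCSJP2File).
struct SdkFile {
    bool open = true;
};

// Hypothetical stand-in for the SDK's static file vector: on teardown it
// closes every file that is still registered.
class SdkRegistry {
public:
    std::shared_ptr<SdkFile> Open() {
        auto file = std::make_shared<SdkFile>();
        files_.push_back(file);
        return file;
    }
    void CloseAll() {
        for (auto& file : files_) file->open = false;
        files_.clear();
    }
private:
    std::vector<std::shared_ptr<SdkFile>> files_;
};

// Hypothetical stand-in for a dataset that was never closed by the caller.
class Dataset {
public:
    explicit Dataset(std::shared_ptr<SdkFile> file) : file_(std::move(file)) {
        assert(file_ != nullptr);
    }
    // Returns true if the SDK file was still open when we closed it,
    // false if the SDK had already torn it down (the crashing case).
    bool Close() {
        if (!file_->open) return false;
        file_->open = false;
        return true;
    }
private:
    std::shared_ptr<SdkFile> file_;
};

enum class Order { LibraryFirst, SdkFirst };

// Simulates process exit with the two possible teardown orders.
bool ShutdownIsSafe(Order order) {
    SdkRegistry registry;
    Dataset leaked(registry.Open());
    if (order == Order::SdkFirst) {
        registry.CloseAll();    // SDK static is destroyed first
        return leaked.Close();  // library teardown then sees a stale object
    }
    bool ok = leaked.Close();   // library teardown runs first
    registry.CloseAll();
    return ok;
}
```

In this model only the SdkFirst order reports a stale object, which corresponds to the order shown in the valgrind output above.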
The only workaround I found, without patching the ECW SDK, is to disable the destruction of the ECW SDK resources in the ECWDataset destructor when the GDAL_CLOSE_JP2ECW_RESOURCE config option is set to NO. This config option is set to NO by GDALDestroy(). As a provision, I also introduced a GDAL_DESTROY config option that can be set to NO to prevent any code from being run in GDALDestroy(), in case similar issues are discovered in other places.
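The control flow of this workaround can be sketched roughly as follows. This is not the committed code: Config, SdkHandle, CloseDataset and LibraryDestroy are hypothetical names for illustration, and the exact string comparison with "NO" is a simplification of however GDAL actually evaluates boolean config options. Only the two option names come from this ticket.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical minimal config store standing in for the library's config options.
class Config {
public:
    void Set(const std::string& key, const std::string& value) {
        values_[key] = value;
    }
    std::string Get(const std::string& key, const std::string& fallback) const {
        auto it = values_.find(key);
        return it == values_.end() ? fallback : it->second;
    }
private:
    std::map<std::string, std::string> values_;
};

// Hypothetical SDK handle; "valid" becomes false once the SDK has freed it.
struct SdkHandle {
    bool valid = true;
};

// Model of the dataset destructor: releases the SDK handle unless
// GDAL_CLOSE_JP2ECW_RESOURCE is NO. Returns true if the handle was released.
bool CloseDataset(const Config& config, SdkHandle& handle) {
    if (config.Get("GDAL_CLOSE_JP2ECW_RESOURCE", "YES") == "NO") {
        return false;  // leave the SDK resource alone
    }
    assert(handle.valid);  // touching a freed handle would be the original bug
    handle.valid = false;
    return true;
}

// Model of the library teardown: does nothing when GDAL_DESTROY is NO;
// otherwise it disables SDK resource release before closing leftovers.
// Returns true if teardown code ran.
bool LibraryDestroy(Config& config, SdkHandle& leftover) {
    if (config.Get("GDAL_DESTROY", "YES") == "NO") {
        return false;
    }
    config.Set("GDAL_CLOSE_JP2ECW_RESOURCE", "NO");
    CloseDataset(config, leftover);
    return true;
}
```

With this arrangement a dataset closed explicitly during normal operation still releases its SDK handle, while a dataset left over at teardown never touches a handle the SDK may already have freed.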
Fix/workaround committed in r25719 (trunk) and r25720 (branches/1.9)