Opened 11 years ago

Closed 11 years ago

#5024 closed defect (fixed)

Conflict between ECW SDK deinitialization and GDAL deinitialization

Reported by: Even Rouault
Owned by: Even Rouault
Priority: normal
Milestone: 1.9.3
Component: GDAL_Raster
Version: unspecified
Severity: normal
Keywords:
Cc: Mateusz Łoskot

Description

The issue was demonstrated by running the gdalecwjp2 tests from imageio-ext 1.1, which end with a segmentation fault.

Those tests open a GDAL dataset at some point (see the trace below) that isn't explicitly closed in the Java code.

java.lang.Exception: Stack trace
	at java.lang.Thread.dumpStack(Thread.java:1266)
	at it.geosolutions.imageio.gdalframework.GDALUtilities.acquireDataSet(GDALUtilities.java:295)
	at it.geosolutions.imageio.gdalframework.GDALCommonIIOImageMetadata.<init>(GDALCommonIIOImageMetadata.java:106)
	at it.geosolutions.imageio.gdalframework.GDALCommonIIOImageMetadata.<init>(GDALCommonIIOImageMetadata.java:85)
	at it.geosolutions.imageio.gdalframework.GDALImageReader.createDatasetMetadata(GDALImageReader.java:230)
	at it.geosolutions.imageio.gdalframework.GDALImageReader.setInput(GDALImageReader.java:738)
	at com.sun.media.jai.imageioimpl.ImageReadCRIF.getImageReader(ImageReadCRIF.java:250)
	at com.sun.media.jai.imageioimpl.ImageReadCRIF.create(ImageReadCRIF.java:277)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at javax.media.jai.FactoryCache.invoke(FactoryCache.java:122)
	at javax.media.jai.OperationRegistry.invokeFactory(OperationRegistry.java:1674)
	at javax.media.jai.ThreadSafeOperationRegistry.invokeFactory(ThreadSafeOperationRegistry.java:473)
	at javax.media.jai.registry.RIFRegistry.create(RIFRegistry.java:332)
	at javax.media.jai.RenderedOp.createInstance(RenderedOp.java:819)
	at javax.media.jai.RenderedOp.createRendering(RenderedOp.java:867)
	at javax.media.jai.RenderedOp.getMinX(RenderedOp.java:2161)
	at it.geosolutions.imageio.plugins.jp2ecw.JP2KReadTest.testJaiOperations(JP2KReadTest.java:164)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
	at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
	at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
	at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
	at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
	at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
	at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
	at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
	at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
	at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
	at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
	at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
	at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:127)
	at org.apache.maven.surefire.Surefire.run(Surefire.java:177)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:345)
	at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1009)

Hence this dataset is destroyed by GDALDestroy() when the process terminates. The problem is that the ECW SDK 3.3 also has a static resource, CNCSJP2File::CNCSJP2FileVector CNCSJP2File::sm_Files (in NCSJP2File.cpp). The issue arises when this variable is destroyed before GDALDestroy() is called (the CNCSJP2FileVector destructor closes all the remaining CNCSJP2File objects). When GDALDestroy() is then called, it destroys the remaining ECW dataset, but the underlying ECW SDK object is now invalid...

==24843== Invalid read of size 8
==24843==    at 0x170404AE: CNCSJP2FileView::GetStream() (NCSJP2FileView.cpp:1855)
==24843==    by 0xF5E55B4: ECWDataset::~ECWDataset() (ecwdataset.cpp:553)
==24843==    by 0xF9E52C9: GDALDriverManager::~GDALDriverManager() (gdaldrivermanager.cpp:196)
==24843==    by 0xF9E613B: GDALDestroyDriverManager (gdaldrivermanager.cpp:811)
==24843==    by 0xF9E48AE: GDALDestroy() (gdaldllmain.cpp:67)
==24843==    by 0xF51D7AE: ??? (in /home/even/gdal/svn/trunk/gdal/libgdal.so)
==24843==    by 0x100320A0: ??? (in /home/even/gdal/svn/trunk/gdal/libgdal.so)
==24843==    by 0x56A7311: exit (exit.c:78)
==24843==    by 0x62C84B6: vm_direct_exit(int) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CB74B: VM_Operation::evaluate() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CA469: VMThread::evaluate_operation(VM_Operation*) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CAA45: VMThread::loop() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==  Address 0x1f60b4f0 is 3,504 bytes inside a block of size 3,584 free'd
==24843==    at 0x4C283A4: operator delete(void*) (vg_replace_malloc.c:480)
==24843==    by 0x17033703: CNCSJP2File::~CNCSJP2File() (NCSJP2File.cpp:164)
==24843==    by 0x1703221B: CNCSJP2File::CNCSJP2FileVector::CloseAll() (NCSJP2File.cpp:1239)
==24843==    by 0x17039D2A: CNCSJP2File::CNCSJP2FileVector::~CNCSJP2FileVector() (NCSJP2File.h:98)
==24843==    by 0x56A7311: exit (exit.c:78)
==24843==    by 0x62C84B6: vm_direct_exit(int) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CB74B: VM_Operation::evaluate() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CA469: VMThread::evaluate_operation(VM_Operation*) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CAA45: VMThread::loop() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x65CAD41: VMThread::run() (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x6485401: java_start(Thread*) (in /usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server/libjvm.so)
==24843==    by 0x504E9C9: start_thread (pthread_create.c:300)
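
The ordering problem visible in the trace above can be modelled in miniature. This is only a sketch: SdkFileTable, Dataset and DanglingAtExit are hypothetical stand-ins for the SDK's static file table, the leftover GDAL dataset and the exit sequence, and the "free" is simulated with a set of valid handles so that nothing actually dangles.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the SDK's static table of open files.
struct SdkFileTable {
    std::set<int> open;   // handles the SDK still considers valid
    int next = 1;

    int Open() { int h = next++; open.insert(h); return h; }
    // Models what the table's destructor does at process exit.
    void CloseAll() { open.clear(); }
    bool IsValid(int h) const { return open.count(h) != 0; }
};

// Hypothetical stand-in for a dataset that was never closed by the caller.
struct Dataset {
    SdkFileTable& sdk;
    int handle;

    explicit Dataset(SdkFileTable& s) : sdk(s), handle(s.Open()) {}
    // True if closing the dataset now would use a handle the SDK already released.
    bool CloseWouldTouchFreedHandle() const { return !sdk.IsValid(handle); }
};

// Replays the two possible teardown orders and reports whether the
// dataset destructor would end up reading released SDK state.
bool DanglingAtExit(bool sdkTableDestroyedFirst) {
    SdkFileTable sdk;
    Dataset ds(sdk);
    if (sdkTableDestroyedFirst) {
        sdk.CloseAll();   // SDK static goes away before the library cleanup runs
    }
    return ds.CloseWouldTouchFreedHandle();
}
```

Only the order in which the SDK table is torn down first corresponds to the invalid read reported by Valgrind.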

The only workaround I found, without patching the ECW SDK, is to disable the destruction of the ECW SDK resources in the ECWDataset destructor when the GDAL_CLOSE_JP2ECW_RESOURCE config option is set to NO. GDALDestroy() sets this option to NO. As a precaution, I also introduced a GDAL_DESTROY config option that can be set to NO to prevent any code from running in GDALDestroy(), in case similar issues are discovered elsewhere.
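
The shape of that workaround can be sketched as follows. This is not the committed patch: g_config, ConfigIsYes, FakeEcwDataset and FakeDestroy are hypothetical names, and a plain std::map stands in for GDAL's config options (which real code would reach through CPLSetConfigOption / CPLGetConfigOption). Only the option name GDAL_CLOSE_JP2ECW_RESOURCE comes from the ticket.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical config store standing in for GDAL's config options.
std::map<std::string, std::string> g_config;

// Treats an unset option as the given default, and "NO" as false.
bool ConfigIsYes(const std::string& key, bool defaultValue) {
    auto it = g_config.find(key);
    if (it == g_config.end()) return defaultValue;
    return it->second != "NO";
}

// Counts how often the (simulated) SDK close routine is reached.
int g_sdkCloseCalls = 0;

// Hypothetical stand-in for ECWDataset.
struct FakeEcwDataset {
    ~FakeEcwDataset() {
        // Skip the SDK cleanup when the option says the SDK may already be gone.
        if (ConfigIsYes("GDAL_CLOSE_JP2ECW_RESOURCE", true)) {
            ++g_sdkCloseCalls;
        }
    }
};

// Hypothetical stand-in for the library-wide cleanup at process exit.
void FakeDestroy(std::unique_ptr<FakeEcwDataset> leftover) {
    g_config["GDAL_CLOSE_JP2ECW_RESOURCE"] = "NO";
    leftover.reset();   // destructor runs, but no longer touches the SDK
}

// Returns the number of SDK close calls for a dataset that is either
// closed normally or left over until the exit-time cleanup.
int SdkCloseCalls(bool closedAtExit) {
    g_config.clear();
    g_sdkCloseCalls = 0;
    std::unique_ptr<FakeEcwDataset> ds(new FakeEcwDataset);
    if (closedAtExit) {
        FakeDestroy(std::move(ds));
    } else {
        ds.reset();     // normal close: SDK resources are released
    }
    return g_sdkCloseCalls;
}
```

A dataset closed during normal operation still releases its SDK resources; only a dataset that survives until the exit-time cleanup skips that step, trading a leak at process exit for the crash.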

Change History (1)

comment:1 by Even Rouault, 11 years ago

Resolution: fixed
Status: new → closed

Fix/workaround committed in r25719 (trunk) and r25720 (branches/1.9)
