Ticket #4231 (assigned defect)

Opened 15 months ago

Last modified 15 months ago

crash with many threads in java

Reported by: nbrachet Owned by: unicoletti
Priority: normal Milestone: 6.0.2 release
Component: MapScript Version: unspecified
Severity: normal Keywords: mapscript java thread
Cc: sdlime, tamas

Description

How to reproduce:

Modify MapThread.java to remove the call to System.gc(). For this problem to reproduce one thread has to delete the mapObj object, while simultaneously another one deletes a layerObj object (from the same mapObj object), which happens when garbage collection runs.

Run with many threads for a while... because this is a threading issue it can work for a while without problem. In our testing we would crash within a half hour on a 8-core machine running 20 threads with JRE 1.6.

Analysis:

Reference counting isn't thread-safe.

layerObj objects obtained from a call to mapObj.get(int) are really shared between the layerObj and the mapObj objects. When garbage collection runs it may actually delete the layerObj object simultaneously from 2 different threads: one deleting the layerObj, the other one deleting the mapObj. freeLayer() calls MS_REFCNT_DECR_IS_NOT_ZERO(), itself implemented in terms of MS_REFCNT_DECR() which is a simple macro around operator++... not thread-safe. In some (rare) case both threads think the ref counter went to zero because of their own call to MS_REFCNT_DECR and they end up both calling free() causing the crash.

Solution:

A simple solution is to use atomic int operations, __sync_fetch_and_add/__sync_sub_and_fetch for GCC, InterlockedIncrement/InterlockedDecrement on WIN32, etc... Attached is a patch against 6.0.2 that implements this change for GCC.

Attachments

diff Download (0.9 KB) - added by nbrachet 15 months ago.
patch for GCC

Change History

Changed 15 months ago by nbrachet

  • attachment diff Download added

patch for GCC

  Changed 15 months ago by sdlime

  • owner changed from sdlime to unicoletti
  • component changed from MapServer C Library to MapScript-Java

  Changed 15 months ago by sdlime

  • cc sdlime added

follow-up: ↓ 5   Changed 15 months ago by unicoletti

  • owner changed from unicoletti to sdlime
  • component changed from MapScript-Java to MapScript

Thanks for the patch, I'll look into it asap. I'm wondering what options are you passing to the vm? Perhaps UseParallelGC or something like that?

  Changed 15 months ago by unicoletti

  • cc tamas added
  • owner changed from sdlime to unicoletti
  • status changed from new to assigned

Changing the component as Mapscript because this change potentially impacts other mapscript implementations as well. cc'ing tamas maybe he can come up with a fix for c# too.

in reply to: ↑ 3 ; follow-up: ↓ 6   Changed 15 months ago by nbrachet

Replying to unicoletti:

Thanks for the patch, I'll look into it asap. I'm wondering what options are you passing to the vm? Perhaps UseParallelGC or something like that?

Not specifically but it first happened with the -server option (which turns UseParallelGC on right?). I also reproduced on a default (-client no UseParallelGC) JVM.

I am running 64-bit JVM. Not that it should matter.

in reply to: ↑ 5   Changed 15 months ago by unicoletti

Replying to nbrachet:

Replying to unicoletti:

Thanks for the patch, I'll look into it asap. I'm wondering what options are you passing to the vm? Perhaps UseParallelGC or something like that?

Not specifically but it first happened with the -server option (which turns UseParallelGC on right?). I also reproduced on a default (-client no UseParallelGC) JVM.

I think that starting from 1.6 -server does just that. Patch looks good to me.

I am running 64-bit JVM. Not that it should matter.

  Changed 15 months ago by rouault

You could borrow  http://trac.osgeo.org/gdal/browser/trunk/gdal/port/cpl_atomic_ops.cpp from GDAL that has implementation of atomic incrementation for various platforms/compilers

Note: See TracTickets for help on using tickets.