Opened 15 years ago

Closed 15 years ago

#160 closed defect (fixed)

CSW server sometimes gives ConcurrentModificationException

Reported by: simonp Owned by: geonetwork-devel@…
Priority: critical Milestone: v2.4.1
Component: General Version: v2.5.0
Keywords: Cc:

Description

Reported by James Q Wilson (james.q.wilson@…):

I've been running load tests against the GeoNetwork 2.4.1 CSW GetRecords service and I am getting java.util.ConcurrentModificationException errors thrown by Jeeves.

What I've done is load 1000 test records onto the server and write a simple Python client that sequentially fires CSW GetRecords queries at the server, each asking for a randomly generated bounding box. With a single client, the server can handle ~60 requests/second. If I fire up multiple instances of the client, I occasionally get a java.util.ConcurrentModificationException and the client quits. I've spent some time looking through the code, and the error is thrown somewhere within CatalogDispatcher.java.

Although I doubt a production server would be hit this hard, it does cause some concern. Are there other subtle concurrency bugs out there waiting to bite, for example ones that affect data integrity?

The obvious (and painful) fix is to make the app single-threaded by synchronizing the whole servlet (i.e. synchronizing Jeeves doGet and doPost). Of course, this introduces significant latency when the server is under load.

Has anyone else seen errors of this type, or does anybody have suggestions as to how they could be tracked down? I don't have much expertise in concurrency issues.

Change History (2)

comment:1 by simonp, 15 years ago

Or to put it more clearly: I've tested it and it doesn't happen when hsListeners in DbmsPool.java is synchronized. But more analysis is required.

Simon Pigot wrote:

Had a brief look at this in 2.4.1 using a couple of sessions that send requests via curl.

Saw the following exception:

2009-10-14 14:19:23,058 DEBUG [jeeves.service] - Raised exception while executing service
<error id="error">
  <message />
  <class>ConcurrentModificationException</class>
  <stack>
    <at class="java.util.HashMap$HashIterator" file="HashMap.java" line="793" method="nextEntry" />
    <at class="java.util.HashMap$KeyIterator" file="HashMap.java" line="828" method="next" />
    <at class="jeeves.resources.dbms.DbmsPool" file="DbmsPool.java" line="173" method="close" />
    <at class="jeeves.server.resources.ResourceManager" file="ResourceManager.java" line="117" method="release" />
    <at class="jeeves.server.resources.ResourceManager" file="ResourceManager.java" line="83" method="close" />
    <at class="jeeves.server.dispatchers.ServiceInfo" file="ServiceInfo.java" line="245" method="execService" />
    <at class="jeeves.server.dispatchers.ServiceInfo" file="ServiceInfo.java" line="141" method="execServices" />
    <at class="jeeves.server.dispatchers.ServiceManager" file="ServiceManager.java" line="377" method="dispatch" />
    <at class="jeeves.server.JeevesEngine" file="JeevesEngine.java" line="621" method="dispatch" />
    <at class="jeeves.server.sources.http.JeevesServlet" file="JeevesServlet.java" line="174" method="execute" />
  </stack>
  <request>

Is this what you are seeing James?

It seems this can be fixed by making hsListeners a synchronized set in DbmsPool.java, but this is a fairly shallow analysis: curiously, the exception doesn't seem to happen in an earlier version of BlueNetMEST (Jetty 5 vs Jetty 6, maybe?) into which we've backported the CSW 2.0.2 implementation.

Cheers, Simon
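
For illustration, here is a minimal Java sketch of the failure mode in the stack trace above and of the synchronized-set workaround. It is not the DbmsPool code: the class name ListenerSetSketch, the method names, and the strings standing in for listeners are all assumptions made for the example. HashSet iterators are fail-fast, so a structural change made while an iteration is in progress causes the next call to next() to throw ConcurrentModificationException. The sketch triggers this from a single thread to keep it deterministic; under load the change would typically come from another request thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a class that keeps a set of listeners.
public class ListenerSetSketch {

    // Iterates a plain HashSet while adding to it; returns true if the
    // fail-fast iterator detects the change and throws.
    static boolean unsafeIterationFails() {
        Set<String> listeners = new HashSet<String>();
        listeners.add("a");
        listeners.add("b");
        try {
            for (String l : listeners) {
                // Structural modification during iteration.
                listeners.add("added-while-iterating-" + l);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Uses a synchronized wrapper and iterates over a snapshot taken while
    // holding the set's lock; returns the final size of the set.
    static int safeIteration() {
        Set<String> listeners =
            Collections.synchronizedSet(new HashSet<String>());
        listeners.add("a");
        listeners.add("b");

        List<String> snapshot;
        // The wrapper does not protect iteration or bulk copies by itself:
        // callers must hold the set's monitor while traversing it.
        synchronized (listeners) {
            snapshot = new ArrayList<String>(listeners);
        }
        for (String l : snapshot) {
            // Safe: the set changes, but we are iterating the snapshot.
            listeners.add("added-after-snapshot-" + l);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        System.out.println("unsafe iteration threw: " + unsafeIterationFails());
        System.out.println("size after safe iteration: " + safeIteration());
    }
}
```

Note that Collections.synchronizedSet only makes individual calls such as add and remove atomic. Iterating the returned set still requires synchronizing on it manually, which is why the sketch copies it under the lock first.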

comment:2 by simonp, 15 years ago

Resolution: fixed
Status: new → closed

Further analysis: src/org/fao/geonet/services/main/CswDispatcher.java was explicitly opening a Dbms resource to create a new SettingManager rather than using the one provided in the context via gc.getSettingManager.

Switching to the setting manager from the context fixes the problem, so there is no need to synchronize the hsListeners set in Jeeves (although this behaviour is interesting!).

Fixed in svn commit 5373

This did not appear in the backported CSW 2.0.2 implementation because the service change that uses the SettingManager to check whether the CSW service is enabled had not been backported!
