Opened 8 years ago
Closed 8 years ago
#2596 closed defect (fixed)
Constant crashes under high load/many concurrent requests
Reported by: | andymorf | Owned by: | |
---|---|---|---|
Priority: | high | Milestone: | 3.1 |
Component: | Server | Version: | 3.0.0 |
Severity: | blocker | Keywords: | crash, high load |
Cc: | Andreas, Morf | External ID: |
Description
Firing 100-200 concurrent QUERYMAPFEATURES requests (as used for maptips) at the mapagent makes mgserver crash every time. Before the crash, a lot of exceptions are logged:
Error: Invalid argument(s): String argument is empty: className
StackTrace:
  - MgRenderingServiceHandler.ProcessOperation line 83 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\RenderingServiceHandler.cpp
  - MgOpQueryFeatures.Execute line 125 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\OpQueryFeatures.cpp
  - MgServerRenderingService.QueryFeatures line 1093 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\ServerRenderingService.cpp
  - MgServerRenderingService.RenderForSelection line 1826 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\ServerRenderingService.cpp
  - MgServerFeatureService.SelectFeatures line 451 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\feature\ServerFeatureService.cpp
  - MgServerSelectFeatures.SelectFeatures line 331 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\feature\ServerSelectFeatures.cpp
  - MgServerSelectFeatures::ValidateParam line 826 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\feature\ServerSelectFeatures.cpp
<2016-06-06T20:41:30> 8788 MgStress 127.0.0.1 Administrator
Error: The specified class was not found.
StackTrace:
  - MgRenderingServiceHandler.ProcessOperation line 83 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\RenderingServiceHandler.cpp
  - MgOpQueryFeatures.Execute line 125 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\OpQueryFeatures.cpp
  - MgServerRenderingService.QueryFeatures line 1093 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\ServerRenderingService.cpp
  - MgServerRenderingService.RenderForSelection line 1826 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\ServerRenderingService.cpp
  - MgServerDescribeSchema.GetClassDefinition line 1029 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\feature\ServerDescribeSchema.cpp
<2016-06-06T20:41:30> 6236 MgStress 127.0.0.1 Administrator
Error: An exception occurred in FDO component. Error occurred in Feature Source (Library://oradata/av.FeatureSource): c_KgOraSelectCommand.Execute : ERROR: FindClassDefinition() return NULL (Cause: , Root Cause: c_KgOraSelectCommand.Execute : ERROR: FindClassDefinition() return NULL )
StackTrace:
  - MgRenderingServiceHandler.ProcessOperation line 83 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\RenderingServiceHandler.cpp
  - MgOpQueryFeatures.Execute line 125 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\OpQueryFeatures.cpp
  - MgServerRenderingService.QueryFeatures line 1093 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\ServerRenderingService.cpp
  - MgServerRenderingService.RenderForSelection line 1826 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\rendering\ServerRenderingService.cpp
  - MgServerFeatureService.SelectFeatures line 451 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\feature\ServerFeatureService.cpp
  - MgServerSelectFeatures.SelectFeatures line 331 file c:\working\build_area\mapguide\3.1.0\x64\mgdev\server\src\services\feature\ServerSelectFeatures.cpp
- If the requests are sent one after another, every one of them is executed and returns a correct response.
- KingOra is used
Attachments (2)
Change History (7)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
My current findings thus far.
I was finally able to reproduce this problem and strongly suspect that it is some kind of thread-safety issue around strings.
This error from the provider
c_KgOraSelectCommand.Execute : ERROR: FindClassDefinition() return NULL
is most likely because it was fed a corrupted string as a result of intense multi-threaded usage.
I found an old mapguide-internals thread which highlighted some pitfalls around STL string usage in a multi-threaded environment (http://osgeo-org.1560.x6.nabble.com/std-string-not-thread-safe-on-Linux-td4210940i20.html)
This leads me to suspect a thread-safety issue around MgUtil and its static sm_classNameQualifier STL string member (https://trac.osgeo.org/mapguide/browser/trunk/MgDev/Common/Foundation/System/Util.cpp#L1232). I think corruption is being introduced by multiple threads concatenating from that same static STL string.
Why do I think this? Because this is the file where VS generally breaks when access violations start occurring in my load tests. The other places where it breaks on an access violation also happen to involve an FDO class name that would have gone through MgUtil one way or another.
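For illustration, here is a minimal sketch of a concatenation pattern that avoids sharing any mutable string state between threads. It assumes a shared static separator comparable to sm_classNameQualifier. The names kQualifier, QualifyClassName and CountCorruptedNames are hypothetical and not taken from the MapGuide source, and the ":" separator is an assumption. Each thread only reads the const shared string and builds its result in a string it owns.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared static qualifier string.
// Declared const so that threads only ever read it.
static const std::wstring kQualifier = L":";

// Builds "schema:class" in a string owned by the caller (hypothetical helper).
std::wstring QualifyClassName(const std::wstring& schema, const std::wstring& cls)
{
    std::wstring result;
    result.reserve(schema.size() + kQualifier.size() + cls.size());
    result += schema;
    result += kQualifier;
    result += cls;
    return result;
}

// Runs the concatenation from many threads and counts wrong results.
int CountCorruptedNames(int threadCount, int iterations)
{
    std::vector<int> failures(threadCount, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t)
    {
        workers.emplace_back([&failures, t, iterations]() {
            for (int i = 0; i < iterations; ++i)
            {
                if (QualifyClassName(L"Default", L"Parcels") != L"Default:Parcels")
                    ++failures[t]; // each thread writes only its own slot
            }
        });
    }
    for (auto& w : workers)
        w.join();

    int total = 0;
    for (int f : failures)
        total += f;
    return total;
}
```

A check of this kind could be run as a quick stress test, for example CountCorruptedNames(8, 1000). It does not prove that the real MgUtil code is at fault; it only shows a usage pattern that is safe for concurrent readers.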
comment:3 by , 8 years ago
The last few tests of mgserver (release) with an attached debugger always crashed within geos::io::WKTReader.read(). I then stumbled over http://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function which states that the setlocale() function used in CLocalizer.cpp operates globally (per process), so saving and restoring the locale via pointers seemed risky to me.
So as a first attempt I commented out the line “CLocalizer clocale;” in WKTReader.read() and WKTWriter.writeFormatted() and rebuilt geos.dll. Astonishingly, my load tests have not been able to bring down mgserver so far. Simply leaving the locale handling commented out may not be the definitive solution, though.
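As one possible alternative to removing the locale handling, here is a minimal sketch of parsing a number without touching the process-wide locale. The stream is imbued with the classic "C" locale, so the decimal separator is always '.' regardless of what setlocale() has been set to elsewhere. ParseOrdinate is a hypothetical helper, not GEOS code, and whether the GEOS reader and writer could adopt this approach is an assumption.

```cpp
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a decimal number using the classic "C"
// locale on a per-stream basis, leaving the global locale untouched.
double ParseOrdinate(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic()); // affects only this stream
    double value = 0.0;
    if (!(in >> value))
        throw std::invalid_argument("not a number: " + text);
    return value;
}
```

Because the locale is attached to the stream rather than set per process, concurrent calls from several threads do not interfere with each other.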
comment:4 by , 8 years ago
Is there a preprocessor flag in GEOS that we can define so that the code in question can be #ifdef'd out?
I don't want to modify Oem sources if we don't have to.
by , 8 years ago
Attachment: | 2596_v4.patch added |
---|
comment:5 by , 8 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Do you have any load testing code / data for reproducing this?