Opened 17 years ago

Closed 13 years ago

#480 closed defect (fixed)

Crash with MgInvalidStreamHeaderException with GETTILEIMAGE (load test)

Reported by: zspitzer
Owned by:
Priority: high
Milestone: 2.4
Component: Tile Service
Version: 2.1.0
Severity: major
Keywords: out of memory, #520
Cc: walt.welton-lair
External ID:

Description

I have been doing some simple load testing with MapGuide 2.0.0 and JMeter (http://jakarta.apache.org/jmeter/index.html).

The test simply requests tiles from the Sheboygan sample (about 382k of PNG8 tiles). The server name and port can be configured in the attached JMeter test plan, under the 'HTTP Request Defaults' section.

The load is 50 users, each making a request every 100 ms plus a random 50 ms deviation, ramping up over 10 seconds.
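For readers without JMeter to hand, the load profile above can be written down as a small C++ sketch. The `LoadProfile` struct and both helper functions are hypothetical names made up for illustration. The sketch assumes that users are started at evenly spaced intervals during the ramp-up and that the "random 50 ms deviation" is a uniform extra pause between 0 and 50 ms; the timer actually configured in the attached test plan may behave differently.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

using Millis = std::chrono::milliseconds;

// Numbers taken from the ticket description; the struct itself is illustrative.
struct LoadProfile {
    int users = 50;         // concurrent simulated users
    Millis rampUp{10000};   // all users are started within 10 seconds
    Millis baseDelay{100};  // fixed pause between one user's requests
    Millis maxJitter{50};   // assumption: extra uniform pause in [0, 50] ms
};

// Start offset of each user, assuming evenly spaced starts across the ramp-up.
std::vector<Millis> startOffsets(const LoadProfile& p) {
    assert(p.users > 0);
    std::vector<Millis> offsets;
    offsets.reserve(static_cast<std::size_t>(p.users));
    for (int i = 0; i < p.users; ++i) {
        offsets.push_back(p.rampUp * i / p.users);
    }
    return offsets;
}

// Pause before a user's next request: base delay plus uniform random jitter.
template <typename Rng>
Millis nextDelay(const LoadProfile& p, Rng& rng) {
    std::uniform_int_distribution<Millis::rep> jitter(0, p.maxJitter.count());
    return p.baseDelay + Millis(jitter(rng));
}
```

Under these assumptions the mean pause is 125 ms, so 50 users could issue at most roughly 400 requests per second if every response were instantaneous.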

The server will crash with a 559 response code and MgInvalidStreamHeaderException. The exception isn't written to any log file, nor does it show up when the service is run interactively with CPL_DEBUG=on. The service then needs to be restarted. Occasionally I have seen it crash with an MgConnectionFailedException instead.

On my laptop I can achieve a bit over 200 requests per second and a throughput of about 950Kb/s.

The crash is repeatable, but the number of requests it takes varies: sometimes it happens after 1,500 requests, once after 36,000. Slowing down the request frequency only made the test run longer; so far it has always crashed on me.

Attachments (3)

MG 2.0 Sheboygan Tile Test.jmx (597.0 KB ) - added by zspitzer 17 years ago.
Jmeter Test Plan for the Tiled Sheboygan sample
Mg 2 Sheybogan Jmeter Load Test 40 Users.zip (25.1 KB ) - added by zspitzer 17 years ago.
Test Case with more tile requests, 40 users with 300ms jitter, configurable hostname and port
ByteSinkFix-#480-#520.patch (2.1 KB ) - added by uvlite 16 years ago.
fixes some bad behaviour in ByteSink under limited memory conditions


Change History (18)

by zspitzer, 17 years ago

Jmeter Test Plan for the Tiled Sheboygan sample

comment:1 by tomfukushima, 17 years ago

What OS and what web tier is this using? Thanks, Tom

comment:2 by zspitzer, 17 years ago

I was using Windows XP with the bundled Apache.

comment:3 by zspitzer, 17 years ago

If you want to make it crash quicker, open one of the tiles under Server\Repositories\TileCache\Samples_Sheboygan_MapsTiled_Sheboygan and replace it with a screenshot of your desktop, i.e. a big image, and the test will crash within 1000 requests.

I'm not sure whether there is an architectural design limit (i.e. the tile cache might be optimised for smaller images) that makes this approach invalid, but I was able to generate 20 Mb per second of traffic before it crashed.

comment:4 by MaksimS, 17 years ago

Another GETTILEIMAGE quirk that may be related to Zac's observations:

There's a common problem with Autodesk MapGuide Enterprise 2008 and the Autodesk Raster provider. It happens most often if there are several layers within a base layer group, or a base layer displaying heavy raster imagery. During the base layer tiling operation some tiles simply get "skipped", and stay that way (cached as "empty" tiles) even after the MapGuide service is restarted. Tiles get skipped (rendered empty) in a random fashion, with no errors reported whatsoever.

Testing platform:

  • Windows Server 2003 / IIS6
  • Autodesk MapGuide Enterprise 2008

I hope Autodesk will get this fixed in 2009 version.

by zspitzer, 17 years ago

Test Case with more tile requests, 40 users with 300ms jitter, configurable hostname and port

comment:5 by zspitzer, 17 years ago

This may be Windows-only. I just repeated this test against a MapGuide 2.0 debug build on CentOS 4.6 and it didn't crash after over 100,000 requests; the same test failed after about 39,000 requests on Win32.

The new test case can be configured via the "HTTP Request Defaults" to point to a specific server and port. The usage pattern is simply zooming in and out and around the Sheboygan sample map as an anonymous user.

comment:6 by zspitzer, 16 years ago

Resolution: fixed
Status: new → closed

this problem doesn't occur with 2.0.2, closing

comment:7 by amitmarty, 16 years ago

Priority: medium → high
Resolution: fixed
Status: closed → reopened

comment:8 by amitmarty, 16 years ago

Problem is still happening in 2.0.1 and 2.0.2. Please see ticket #726

http://trac.osgeo.org/mapguide/ticket/726

just found this one yesterday.

Thank You

comment:9 by uvlite, 16 years ago

Cc: walt.welton-lair@… added
Keywords: out of memory #520 added
Version: 2.0.0 → 2.1.0

This error seems to be caused by a missing cleanup at ByteSink.cpp:246, in MgByteSink::ToFile(CREFSTRING filename). This method allocates a huge buffer (1MB) to export the byteSource into the given file, and it is the first place where a system with exhausted memory resources fails. The buffer is on the stack, so the allocation itself is not the problem, but when an exception is triggered we are left with an empty file, hence the MgInvalidStreamHeaderException on the client side.
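The kind of cleanup described here can be sketched in standard C++. This is not the attached patch and not MapGuide code: `writeAllOrNothing` and the `ReadFn` callback are hypothetical names, and unlike the method described above the sketch uses a small heap-allocated buffer. It only illustrates the idea that a failed export should not leave an empty or truncated file behind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical data source: fills `buf` with up to `cap` bytes and returns
// the number of bytes produced, or 0 when the source is exhausted.
using ReadFn = std::function<std::size_t(char* buf, std::size_t cap)>;

// Copies the source into `path`. If anything throws after the file has been
// opened, the partial file is removed and the exception is rethrown.
void writeAllOrNothing(const std::string& path, const ReadFn& read) {
    assert(read && "a read callback is required");
    // Allocate first: if this throws std::bad_alloc, no file has been touched.
    std::vector<char> buffer(64 * 1024);
    try {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        if (!out) throw std::runtime_error("cannot open " + path);
        while (true) {
            const std::size_t n = read(buffer.data(), buffer.size());
            if (n == 0) break;
            out.write(buffer.data(), static_cast<std::streamsize>(n));
            if (!out) throw std::runtime_error("write failed for " + path);
        }
        out.close();
        if (!out) throw std::runtime_error("close failed for " + path);
    } catch (...) {
        // `out` was destroyed during stack unwinding, so the file is closed
        // by now and can be deleted before the error is passed on.
        std::remove(path.c_str());
        throw;
    }
}
```

With this shape a reader of the file sees either a complete export or no file at all, so a later request cannot pick up an empty stream.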

This seems to be related to ticket #520.

by uvlite, 16 years ago

Attachment: ByteSinkFix-#480-#520.patch added

fixes some bad behaviour in ByteSink under limited memory conditions

comment:10 by waltweltonlair, 16 years ago

Cc: walt.welton-lair added; walt.welton-lair@… removed

Additional fix submitted in the trunk stream - see https://trac.osgeo.org/mapguide/changeset/3787.

comment:11 by waltweltonlair, 16 years ago

UV - if you have a chance please retest using the updated code. If you're satisfied with the behavior then close the ticket.

comment:12 by uvlite, 16 years ago, in reply to: 11

Hi Walt,

I tried to render the tiles for our map on a 1GB Windows Server machine using the most recent build from the build server.

The server falls over but the exception from FDO does not get propagated to MapGuide correctly. I think this needs a bit of thought.

Generally, the behaviour of the server when memory runs out is bad because the tileserver returns SUCCESS for wrongly rendered tiles!!

[Approach1] Do a wait-and-retry for each out-of-memory exception. This alone should serialize memory requests sufficiently to make things stable. We could create a clever macro for all MEMALLOCS that does exactly this. Maybe the allocator plugin for the standard library containers could be used for that as well.
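A minimal sketch of the wait-and-retry idea, written as a function template rather than a macro. `retryOnBadAlloc` is a hypothetical helper, not part of MapGuide or FDO, and it only catches `std::bad_alloc`; out-of-memory conditions reported through other exception types would need their own handling.

```cpp
#include <cassert>
#include <chrono>
#include <new>
#include <thread>

// Calls `fn` and returns its result. If `fn` throws std::bad_alloc, waits for
// `wait` and tries again, up to `maxAttempts` calls in total. The last
// failure is rethrown to the caller.
template <typename Fn>
auto retryOnBadAlloc(Fn&& fn, int maxAttempts, std::chrono::milliseconds wait)
    -> decltype(fn()) {
    assert(maxAttempts >= 1);
    for (int attempt = 1;; ++attempt) {
        try {
            return fn();
        } catch (const std::bad_alloc&) {
            if (attempt >= maxAttempts) throw;  // give up, report the failure
            std::this_thread::sleep_for(wait);  // let other requests free memory
        }
    }
}
```

Retrying only helps if other threads release memory during the wait, so the attempt limit matters: without it a server that is genuinely out of memory would stall instead of failing.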

[Approach2] At least propagate more meaningful errors from FDO to MapGuide so this can be dealt with correctly (e.g. return an error message for the tile but don't create the tile in the cache). That's still better than wrong tiles.
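The second approach comes down to storing a tile only when rendering reports success. The sketch below uses an in-memory map and made-up types (`TileResult`, `TileCache`) purely for illustration; the real tile cache is file based and its interfaces are not shown in this ticket.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Hypothetical render outcome: either image data or an error message.
struct TileResult {
    bool ok;
    std::string error;  // empty when ok is true
    Bytes image;        // empty when ok is false
};

class TileCache {
public:
    // Returns the cached tile if there is one, otherwise calls `render`.
    // A failed render is returned to the caller but never stored.
    TileResult get(const std::string& key,
                   const std::function<TileResult()>& render) {
        assert(render && "a render callback is required");
        auto hit = tiles_.find(key);
        if (hit != tiles_.end()) {
            return TileResult{true, "", hit->second};
        }
        TileResult fresh = render();
        if (fresh.ok) {
            tiles_[key] = fresh.image;
        }
        return fresh;
    }

    bool contains(const std::string& key) const {
        return tiles_.count(key) != 0;
    }

private:
    std::map<std::string, Bytes> tiles_;
};
```

Because failures are not cached, a tile that could not be rendered while memory was short is simply rendered again on the next request.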

# Log Type: Error Log

# Log Parameters: CLIENT,CLIENTIP,USER,ERROR,STACKTRACE

<2009-04-20T13:30:01> 2088

Success: Server started.

<2009-04-20T14:04:41> 5212 192.168.99.99 Anonymous

Error: Failed to stylize layer: aug 06-ROADS - HWY

Cannot establish connection.

StackTrace:

<2009-04-20T14:07:28> 5540 192.168.99.99 Anonymous

Error: Failed to stylize layer: aug 06-ROADS - FWY

An exception occurred in FDO component. An exception occurred in FDO component. An error occurred during SDF database access.

StackTrace:

<2009-04-20T14:07:29> 1768 192.168.99.99 Anonymous

Error: Failed to stylize layer: aug 06-ROADS - MAJ

An exception occurred in FDO component. An exception occurred in FDO component. An error occurred during SDF database access.

StackTrace:

comment:13 by crispinatime, 13 years ago

Milestone: 2.4

This is still present in MG 2.2. Below is an extract from the server logs from when the service and w3wp.exe hung; hope this helps. Will v2.4, based on the new MSVC build, help with the error in msvcr90.dll?

Event Type: Warning
Event Source: ASP.NET 2.0.50727.0
Event Category: Web Event
Event ID: 1309
Date: 24/01/2012
Time: 10:56:42
User: N/A
Computer:
Description:
Event code: 3005
Event message: An unhandled exception has occurred.
Event time: 24/01/2012 10:56:42
Event time (UTC): 24/01/2012 10:56:42
Event ID: 8880e0790f384f39a9d991ef223b0060
Event sequence: 250
Event occurrence: 1
Event detail code: 0

Application information:
Application domain: /LM/W3SVC/1/Root/-1-129718669592750903
Trust level: Full
Application Virtual Path: /
Application Path: E:\Projects\\
Machine name: HOBRON02WEB

Process information:
Process ID: 6764
Process name: w3wp.exe
Account name: NT AUTHORITY\NETWORK SERVICE

Exception information:
Exception type: Exception
Exception message: Failed to connect, please check network connection and login information.

Extended error info:
The remote server returned an error: (559) MgInvalidStreamHeaderException.

Request information:
Request URL: http://
Request path: /_
User host address:
User:
Is authenticated: False
Authentication Type:
Thread account name: NT AUTHORITY\NETWORK SERVICE

Thread information:
Thread ID: 17
Thread account name: NT AUTHORITY\NETWORK SERVICE
Is impersonating: False
Stack trace:
at OSGeo.MapGuide.MaestroAPI.HttpServerConnection.InitConnection(Uri hosturl, String username, String password, String locale, Boolean allowUntestedVersion)
at OSGeo.MapGuide.MaestroAPI.HttpServerConnection..ctor(NameValueCollection initParams)

Custom event details:

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

Event Type: Information
Event Source: Application Error
Event Category: (100)
Event ID: 1004
Date: 24/01/2012
Time: 11:34:11
User: N/A
Computer:
Description: Reporting queued error: faulting application w3wp.exe, version 6.0.3790.3959, faulting module msvcr90.dll, version 9.0.30729.5570, fault address 0x0003bedb.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

Data:
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 46 61 69 6c ion Fail
0010: 75 72 65 20 20 77 33 77 ure  w3w
0018: 70 2e 65 78 65 20 36 2e p.exe 6.
0020: 30 2e 33 37 39 30 2e 33 0.3790.3
0028: 39 35 39 20 69 6e 20 6d 959 in m
0030: 73 76 63 72 39 30 2e 64 svcr90.d
0038: 6c 6c 20 39 2e 30 2e 33 ll 9.0.3
0040: 30 37 32 39 2e 35 35 37 0729.557
0048: 30 20 61 74 20 6f 66 66 0 at off
0050: 73 65 74 20 30 30 30 33 set 0003
0058: 62 65 64 62 bedb

comment:14 by jng, 13 years ago

Ran the jmeter test on the 2.4 beta (IIS config). Could not crash the MapGuide Server.

Trying same test for Apache config.

comment:15 by jng, 13 years ago

Resolution: fixed
Status: reopened → closed

Same result for Apache config. Could not crash the MapGuide Server. No errors logged

Closing as fixed.
