Opened 16 years ago
Closed 13 years ago
#722 closed defect (fixed)
Failed Mapguide Service gets completely stuck 'stopping'
Reported by: | zspitzer | Owned by: | |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | Server | Version: | 2.0.2 |
Severity: | critical | Keywords: | |
Cc: | External ID: |
Description
The MapGuide server service on Windows often gets completely stuck and cannot be stopped or restarted. This can be reproduced with the GDAL provider (#462).
The only way to kill the service is to use something like Zerowave, which makes a ring 0 (kernel-level) call to nuke the service.
The service is left in the 'stopping' state under Services.
For users unaware of the Zerowave approach (which is considered risky but has proved fine), the only solution is to reboot the server.
Change History (6)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
The problem with just aborting an operation is that if it involves updating the repository, it could lead to data loss or, in the worst case, data corruption.
Perhaps the service-stop logic should wait only for repository operations to complete and abort everything else, so that stopping the service happens much more quickly.
Have you noticed this with other providers or only with GDAL?
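The stop policy suggested above could be sketched roughly as follows. This is only an illustration in Python, not MapGuide code: the `Operation` class, the `modifies_repository` flag and `stop_service` are hypothetical names, and the sketch assumes every operation can be classified up front as either a repository update or read-only work.

```python
import threading
import time


class Operation(threading.Thread):
    """Hypothetical unit of server work; not an actual MapGuide class."""

    def __init__(self, name, steps, modifies_repository):
        super().__init__(name=name)
        self.steps = steps
        self.modifies_repository = modifies_repository
        self.cancel = threading.Event()
        self.completed = False

    def run(self):
        for _ in range(self.steps):
            # Only non-repository work honours the cancel flag, so a
            # repository update is never interrupted half way through.
            if self.cancel.is_set() and not self.modifies_repository:
                return
            time.sleep(0.01)  # stand-in for one slice of real work
        self.completed = True


def stop_service(operations, write_timeout=5.0):
    """Abort read-only work, then wait only for repository updates."""
    for op in operations:
        if not op.modifies_repository:
            op.cancel.set()
    deadline = time.monotonic() + write_timeout
    for op in operations:
        if op.modifies_repository:
            op.join(max(0.0, deadline - time.monotonic()))
    # Names of repository updates still running after the timeout, so the
    # caller can decide whether to keep waiting or escalate.
    return [op.name for op in operations
            if op.modifies_repository and op.is_alive()]
```

With this shape, a long tile render is dropped almost immediately while a short repository update is allowed to finish, and the stop request is bounded by `write_timeout` rather than by the slowest request.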
comment:3 by , 16 years ago
At the moment I am totally avoiding GDAL; it's not production ready and unfortunately always seems to just crash. It is, however, the easiest way to reproduce the problem.
I'm pretty much working with a read-only repository, which is being used to generate tiles for a big map from about 9.6 GB of SDF / SHP files. Thankfully 2.0.2 is pretty damn stable.
So there are two items here. First, like you said, the shutdown logic needs to abort any read-only rendering operations. For dynamic maps, ideally those operations should also be aborted if the HTTP connection is dropped: the result isn't cached, so there is no point in finishing them.
The second is that problems with rogue FDO providers don't appear to be handled robustly enough and cause this stuck state. GDAL reproduces the problem quite well, and I have also seen it with a problematic SHP file.
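For the dynamic map case in the first item, one possible shape is to check between units of work whether the client is still there. The sketch below is Python and purely illustrative: `render_dynamic_map`, `ClientDisconnected` and the `client_connected` callback are hypothetical names, and it assumes the web tier can report whether the HTTP connection is still open.

```python
class ClientDisconnected(Exception):
    """Raised when rendering is abandoned because the client went away."""


def render_dynamic_map(layers, client_connected):
    """Render layers one at a time, giving up if the client disconnects.

    `client_connected` is a hypothetical callable that returns False once
    the HTTP connection has been dropped.
    """
    rendered = []
    for layer in layers:
        # Checking before each layer means at most one layer's worth of
        # work is wasted after the client disconnects.
        if not client_connected():
            raise ClientDisconnected(
                f"aborted after {len(rendered)} layer(s)")
        rendered.append(f"rendered:{layer}")  # stand-in for real rendering
    return rendered
```

The same check point could also observe a service-wide stop flag, so that a dropped connection and a stop request share one cancellation path.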
comment:4 by , 15 years ago
Has this been addressed? I seem to remember something about connections auto-closing after a certain timeout with the 2.1 changes?
comment:5 by , 15 years ago
The shutdown should be better now, but I'm sure it could still be improved. :)
Unfortunately, rogue FDO providers can still be a problem for MapGuide because they are loaded into the same process space as the server.
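To illustrate why the shared process space matters, here is a minimal Python sketch of the alternative: running provider work in a separate child process that can be killed on a timeout without taking the server down. This is not how MapGuide loads FDO providers; `call_provider_isolated` is a hypothetical helper, and the provider is simulated by a snippet of Python source run in a child interpreter.

```python
import subprocess
import sys


def call_provider_isolated(provider_code, timeout_seconds):
    """Run simulated provider work in a child process; kill it if it hangs.

    `provider_code` stands in for launching a real provider host; here it
    is just Python source executed by a child interpreter.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", provider_code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child before re-raising.
        return ("timeout", None)
    if result.returncode != 0:
        # A crash in the child is reported instead of propagating.
        return ("crashed", result.returncode)
    return ("ok", result.stdout.strip())
```

A hung or crashing provider then shows up as a `"timeout"` or `"crashed"` status in the caller rather than as a service stuck in the 'stopping' state. The trade-off is the cost of process startup and of passing data between processes.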
I think this may be related to the way the MapGuide service responds to a stop request: rather than simply aborting all operations, it seems to wait for all requests to complete.
This is particularly problematic on a production server that is getting a lot of traffic, even when just trying to stop the service while it is still servicing requests and isn't in the failed, locked-up state.