Opened 12 years ago

Closed 12 years ago

#4093 closed defect (fixed)

mapserver deadlock on signal

Reported by: laurin Owned by: sdlime
Priority: high Milestone:
Component: MapServer FastCGI Version: 6.0
Severity: normal Keywords: fastcgi signal SIGUSR1 SIGTERM deadlock
Cc: warmerdam, tbonfort

Description

There is a bug in how MapServer handles the SIGUSR1 and SIGTERM signals when running in FastCGI mode. msCleanupOnSignal calls msCleanup, which calls gdFontCacheShutdown, which tries to acquire a mutex that the interrupted code may already hold, deadlocking the process.

A sample stack trace showing the symptoms:

Process 12533 attached - interrupt to quit
futex(0x7ffdcb1b7d80, FUTEX_WAIT_PRIVATE, 2, NULL

(gdb) bt
#0  0x00007fdf9ec99be4 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007fdf9ec950e9 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007fdf9ec94f0b in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x00007fdfa19edf1a in gdFontCacheShutdown () from /usr/lib/libgd.so.2
#4  0x000000000047e890 in msCleanup ()
#5  0x000000000044b4d0 in msCleanupOnSignal ()
#6  <signal handler called>
#7  0x00007fdfa154b538 in TT_RunIns () from /usr/lib/libfreetype.so.6
#8  0x00007fdfa1546073 in ?? () from /usr/lib/libfreetype.so.6
#9  0x00007fdfa154edc1 in ?? () from /usr/lib/libfreetype.so.6
#10 0x00007fdfa154f47d in ?? () from /usr/lib/libfreetype.so.6
#11 0x00007fdfa153b4aa in FT_Load_Glyph () from /usr/lib/libfreetype.so.6
#12 0x00007fdfa19eccb3 in gdImageStringFTEx () from /usr/lib/libgd.so.2
#13 0x00007fdfa19ed89b in gdImageStringFT () from /usr/lib/libgd.so.2
#14 0x00000000004cd151 in msDrawTextLineGD ()
#15 0x000000000057de7e in msDrawTextLine ()
#16 0x00000000004b8286 in msDrawLabelCache ()
#17 0x00000000004ad595 in msDrawMap ()
#18 0x00000000005a6aea in msWMSGetMap ()
#19 0x00000000005aa52d in msWMSDispatch ()
#20 0x000000000050ce91 in msOWSDispatch ()
#21 0x000000000044b8b8 in main ()

Attachments (1)

02_signalhandling.dpatch (1.9 KB ) - added by laurin 12 years ago.
proposed patch

Change History (6)

by laurin, 12 years ago

Attachment: 02_signalhandling.dpatch added

proposed patch

comment:1 by laurin, 12 years ago

Priority: normal → high
Version: 5.6 → 6.0

Applies to the latest version as well.

comment:2 by tbonfort, 12 years ago

Cc: warmerdam added

The FastCGI documentation ( http://www.fastcgi.com/docs/faq.html#Signals ) states that on SIGUSR1 and SIGTERM the process can wait for normal termination of the currently running task, so the proposed implementation seems correct.

CCing Frank in case he has an objection to this; if not, I will commit the fix.

comment:3 by warmerdam, 12 years ago

I do not have a strong position on this, but a normal map request can take quite long in some cases, and we will have completely disabled the use of these signals to clean up an essentially hung process.

Perhaps it would be better in the cleanup function to first clear all locks?

comment:4 by laurin, 12 years ago

The lock in question is private to libgd and can't be 'cleared' externally.

comment:5 by tbonfort, 12 years ago

Resolution: fixed
Status: new → closed

Committed in r12958.

Concerning Frank's reservations about hung processes: we could add a check in msCleanupOnSignal that falls back to the previous msCleanup + exit sequence when a second signal is received (the log message could become "Received exit signal, waiting for current task to terminate. Resend to force quit").
