Opened 6 years ago

Closed 5 years ago

#4175 closed defect (fixed)

CPLSetErrorHandler causes segmentation fault

Reported by: rbanfield
Owned by: warmerdam
Priority: normal
Milestone: 1.10.0
Component: default
Version: 1.8.1
Severity: normal
Keywords:
Cc: akrherz

Description

Not every call I make to CPLSetErrorHandler causes a segfault, but there is one place in my code where the call always causes a segmentation fault. Post-mortem, it looks like the error function is being called repeatedly until the stack overflows.

This appears to be the result of pthread_setspecific() failing. I don't know why it would fail.

This does not occur in GDAL 1.7.3. I have not tried 1.8.0.

Here's the backtrace (first 145485 lines omitted).

#145486 0x00000000005f2df1 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0xba4d90 "pthread_setspecific() failed!", args=0x7fffffffbd20) at cpl_error.cpp:145
#145487 0x00000000005f30f1 in CPLError (eErrClass=<value optimized out>, err_no=<value optimized out>, fmt=<value optimized out>) at cpl_error.cpp:135
#145488 0x00000000005f909a in CPLGetTLSList () at cpl_multiproc.cpp:1054
#145489 0x00000000005f911b in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1070
#145490 0x00000000005f248e in CPLGetErrorContext () at cpl_error.cpp:78

#145491 0x00000000005f2df1 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0xba4d90 "pthread_setspecific() failed!", args=0x7fffffffbed0) at cpl_error.cpp:145
#145492 0x00000000005f30f1 in CPLError (eErrClass=<value optimized out>, err_no=<value optimized out>, fmt=<value optimized out>) at cpl_error.cpp:135
#145493 0x00000000005f909a in CPLGetTLSList () at cpl_multiproc.cpp:1054
#145494 0x00000000005f911b in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1070
#145495 0x00000000005f248e in CPLGetErrorContext () at cpl_error.cpp:78

#145496 0x00000000005f2d1b in CPLSetErrorHandler (pfnErrorHandlerNew=0x452670 <_ZL16GDALErrorHandler6CPLErriPKc>) at cpl_error.cpp:654

Change History (18)

comment:1 Changed 6 years ago by warmerdam

I'm going to have to investigate why pthread_setspecific might be failing in this context, but it is clear that calling CPLErrorV() to report it is a bad plan!

I'll try to dig into this tonight.

It would be helpful if you could provide a minimal program to demonstrate this, and perhaps give some context (operating system, etc).

comment:2 Changed 6 years ago by warmerdam

Resolution: worksforme
Status: new → closed

I can't see an obvious reason you would encounter the problem in question, with the possible exception of heap corruption. I have nonetheless tried to improve trunk by introducing a lower-level CPLEmergencyError() (r22825), so that we no longer issue errors in response to failures to issue errors. I'm not sure that it will help you specifically though.
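
To illustrate the idea of a lower-level reporting path, here is a minimal sketch of a re-entrancy guard. It is not the code committed in r22825: the names ReportEmergency, EmergencySink and WriteToStderr are hypothetical, and the sketch assumes that writing directly to stderr needs no per-thread state.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical names; this is not GDAL's CPLEmergencyError().
using EmergencySink = void (*)(const char *);

// Default sink: writes straight to stderr and touches no per-thread state.
inline void WriteToStderr(const char *msg)
{
    std::fprintf(stderr, "FATAL: %s\n", msg);
}

// Reports msg through sink unless an emergency report is already in progress.
// Returns true if the message was delivered, false if the call was nested.
inline bool ReportEmergency(const char *msg, EmergencySink sink = WriteToStderr)
{
    assert(sink != nullptr);
    // Process-wide flag, deliberately not thread-local, so the guard itself
    // can never depend on the mechanism that failed.
    static std::atomic<bool> inEmergency{false};
    if (inEmergency.exchange(true))
        return false;  // already reporting: refuse to recurse
    sink(msg);
    inEmergency.store(false);
    return true;
}
```

A real fatal path would presumably call abort() after the report rather than return. Because the flag is process-wide, a second thread that hits an emergency at the same moment would have its message dropped, which seems an acceptable trade-off when the process is about to terminate anyway.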

comment:3 Changed 6 years ago by warmerdam

Milestone: 1.9.0

comment:4 Changed 5 years ago by akrherz

Hi, I appear to be hitting this issue with GDAL 1.9.1 on 64-bit Linux. Here's a snippet of my gdb backtrace:

#352492 0x00002af64527eccb in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1113
        papTLSList = <value optimized out>
#352493 0x00002af645274cee in CPLGetErrorContext () at cpl_error.cpp:80
        psCtx = <value optimized out>
#352494 0x00002af645275801 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0x2af6455dad48 "pthread_setspecific() failed!", args=0x7fff5c99e640)
    at cpl_error.cpp:169
        psCtx = <value optimized out>
#352495 0x00002af645275b13 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:159
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7fff5c99e720, 
            reg_save_area = 0x7fff5c99e660}}
#352496 0x00002af64527ec4c in CPLGetTLSList () at cpl_multiproc.cpp:1097
        papTLSList = 0x2af6fa464e00
#352497 0x00002af64527eccb in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1113
        papTLSList = <value optimized out>
#352498 0x00002af645274cee in CPLGetErrorContext () at cpl_error.cpp:80
        psCtx = <value optimized out>

Reproducing this is easy for me: I just need to reload or gracefully restart Apache, which sometimes causes the MapServer PHP and MapServer FCGI Apache threads to run away. Thanks.

comment:5 Changed 5 years ago by akrherz

Cc: akrherz added

comment:6 Changed 5 years ago by warmerdam

Resolution: worksforme
Status: closed → reopened

I've applied a small patch in trunk (r25710) that should force an emergency fatal error instead of stack recursion. Could you see if that helps?

Ultimately, you are presumably running into an out-of-memory condition, so this change will just make the failure more orderly.

comment:7 Changed 5 years ago by akrherz

@warmerdam thanks. I tried this patch. I still get an Apache thread that runs away and explodes memory, although the process did eventually exit. Your comment seems to indicate this is still expected?

comment:8 Changed 5 years ago by akrherz

Here's an updated portion of the gdb backtrace showing the loop, with trunk checked out moments ago:

#249593 0x00002b820cccfc53 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:162
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7ffdec9cfb00, 
            reg_save_area = 0x7ffdec9cfa40}}
#249594 0x00002b820ccdad2c in CPLGetTLSList () at cpl_multiproc.cpp:1553
        papTLSList = 0x2b839db7a2f0
#249595 0x00002b820ccdadab in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1569
        papTLSList = <value optimized out>
#249596 0x00002b820cccf3ee in CPLGetErrorContext () at cpl_error.cpp:80
        psCtx = <value optimized out>
#249597 0x00002b820cccf931 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0x2b820d0371a6 "pthread_setspecific() failed!", args=0x7ffdec9cfbd0)
    at cpl_error.cpp:172
        psCtx = <value optimized out>
#249598 0x00002b820cccfc53 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:162
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7ffdec9cfcb0, 
            reg_save_area = 0x7ffdec9cfbf0}}
#249599 0x00002b820ccdad2c in CPLGetTLSList () at cpl_multiproc.cpp:1553
        papTLSList = 0x2b839db7a0e0
#249600 0x00002b820ccdadab in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1569
        papTLSList = <value optimized out>
#249601 0x00002b820cccf3ee in CPLGetErrorContext () at cpl_error.cpp:80
        psCtx = <value optimized out>
#249602 0x00002b820cccf931 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0x2b820d0371a6 "pthread_setspecific() failed!", args=0x7ffdec9cfd80)
    at cpl_error.cpp:172
        psCtx = <value optimized out>
#249603 0x00002b820cccfc53 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:162
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7ffdec9cfe60, 
            reg_save_area = 0x7ffdec9cfda0}}

comment:9 Changed 5 years ago by warmerdam

Try to avoid triggering normal errors from the TLS allocator (r25767).

comment:10 Changed 5 years ago by akrherz

Thanks, but still am seeing it :(

#160103 0x00002ba14eb72b13 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:159
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7ffe90ab6e40, 
            reg_save_area = 0x7ffe90ab6d80}}
#160104 0x00002ba14eb7bc4c in CPLGetTLSList () at cpl_multiproc.cpp:1097
        papTLSList = 0x2ba2c9c71490
#160105 0x00002ba14eb7bccb in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1113
        papTLSList = <value optimized out>
#160106 0x00002ba14eb71cee in CPLGetErrorContext () at cpl_error.cpp:80
        psCtx = <value optimized out>
#160107 0x00002ba14eb72801 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0x2ba14eed7d48 "pthread_setspecific() failed!", args=0x7ffe90ab6f10)
    at cpl_error.cpp:169
        psCtx = <value optimized out>
#160108 0x00002ba14eb72b13 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:159
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7ffe90ab6ff0, 
            reg_save_area = 0x7ffe90ab6f30}}
#160109 0x00002ba14eb7bc4c in CPLGetTLSList () at cpl_multiproc.cpp:1097
        papTLSList = 0x2ba2c9c71280
#160110 0x00002ba14eb7bccb in CPLGetTLS (nIndex=5) at cpl_multiproc.cpp:1113
        papTLSList = <value optimized out>
#160111 0x00002ba14eb71cee in CPLGetErrorContext () at cpl_error.cpp:80
        psCtx = <value optimized out>
#160112 0x00002ba14eb72801 in CPLErrorV(CPLErr, int, const char *, typedef __va_list_tag __va_list_tag *) (eErrClass=CE_Fatal, err_no=1, 
    fmt=0x2ba14eed7d48 "pthread_setspecific() failed!", args=0x7ffe90ab70c0)
    at cpl_error.cpp:169
        psCtx = <value optimized out>
#160113 0x00002ba14eb72b13 in CPLError (eErrClass=<value optimized out>, 
    err_no=<value optimized out>, fmt=<value optimized out>)
    at cpl_error.cpp:159
        args = {{gp_offset = 24, fp_offset = 48, 
            overflow_arg_area = 0x7ffe90ab71a0, 
            reg_save_area = 0x7ffe90ab70e0}}

comment:11 in reply to:  10 Changed 5 years ago by akrherz

Replying to akrherz:

Thanks, but still am seeing it :(

Sorry, those line numbers are not accurate: the system had the 1.9.1 debuginfo RPM installed, which is what gdb picked up.

comment:12 Changed 5 years ago by warmerdam

Hmm, somehow I failed to fix the pthread case. Applied now in trunk (r25768). Could you try that?

comment:13 in reply to:  12 Changed 5 years ago by akrherz

Replying to warmerdam:

Hmm, somehow I failed to fix the pthread case. Applied now in trunk (r25768). Could you try that?

Thanks, but unfortunately I get the same error. The loop is:

in CPLGetTLS() cpl_multiproc.cpp:1577 calls CPLGetTLSList()
in CPLGetTLSList() cpl_multiproc.cpp:1561 calls CPLError()
in CPLError() cpl_error.cpp:162 calls CPLErrorV()
in CPLErrorV() cpl_error.cpp:172 calls CPLGetErrorContext()
in CPLGetErrorContext() cpl_error.cpp:80 calls CPLGetTLS()
in CPLGetTLS() cpl_multiproc.cpp:1577 calls CPLGetTLSList()
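
One way to break a cycle of this shape can be sketched as follows. This is an illustration only, not the change made in any of the revisions mentioned in this ticket: ErrorContext, TryGetThreadContext, GetErrorContext and ReportError are hypothetical stand-ins, and the simulateFailure flag merely imitates a failing thread-local lookup. The point is that the lookup signals failure through its return value and never calls back into the error-reporting code.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-ins; none of these are GDAL functions.
struct ErrorContext
{
    int lastErrNo = 0;
    char lastMsg[128] = {};
};

// Stand-in for a thread-local lookup that can fail. It signals failure by
// returning nullptr and never calls the error-reporting code itself.
inline ErrorContext *TryGetThreadContext(bool simulateFailure)
{
    thread_local ErrorContext ctx;
    return simulateFailure ? nullptr : &ctx;
}

// Falls back to a shared static context when the per-thread one is
// unavailable, so reporting a lookup failure cannot re-enter the lookup.
inline ErrorContext *GetErrorContext(bool simulateFailure)
{
    static ErrorContext fallback;
    ErrorContext *ctx = TryGetThreadContext(simulateFailure);
    return ctx != nullptr ? ctx : &fallback;
}

// Records the error in whichever context is available.
inline void ReportError(int errNo, const char *msg, bool simulateFailure)
{
    assert(msg != nullptr);
    ErrorContext *ctx = GetErrorContext(simulateFailure);
    ctx->lastErrNo = errNo;
    std::snprintf(ctx->lastMsg, sizeof(ctx->lastMsg), "%s", msg);
}
```

With this arrangement the call chain is strictly one-directional: reporting depends on the lookup, but the lookup never depends on reporting. The shared fallback context is not thread-safe, which this sketch accepts because it is only used once thread-local storage has already failed.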

comment:14 Changed 5 years ago by warmerdam

Another crack at it in trunk (r25780).

comment:15 in reply to:  14 ; Changed 5 years ago by akrherz

Replying to warmerdam:

Another crack at it in trunk (r25780).

Thanks, but am getting this recursive loop now:

#536702 0x00002b4539931ac5 in CPLGetTLSList () at cpl_multiproc.cpp:1558
        papTLSList = 0x2b46357fd9c0
#536703 0x00002b4539931b5b in CPLGetTLS (nIndex=14) at cpl_multiproc.cpp:1574
        papTLSList = <value optimized out>
#536704 0x00002b4539923303 in CPLGetConfigOption (
    pszKey=0x2b4539c8d0e3 "CPL_MAX_ERROR_REPORTS",
    pszDefault=0x2b4539c324b8 "1000") at cpl_conv.cpp:1552
        pszResult = 0x0
        papszTLConfigOptions = <value optimized out>
#536705 0x00002b45399260a3 in CPLDefaultErrorHandler (eErrClass=CE_Fatal,
    nError=1,
    pszErrorMsg=0x2b4539c8e250 "CPLGetTLSList(): pthread_setspecific() failed!") at cpl_error.cpp:539
        bLogInit = 0
        fpLog = 0x2b4529a66860
        nCount = 0
        nMaxErrors = -1
#536706 0x00002b4539926129 in CPLEmergencyError (
    pszMessage=<value optimized out>) at cpl_error.cpp:305
        psCtx = <value optimized out>
        bInEmergencyError = 1
#536707 0x00002b4539931ac5 in CPLGetTLSList () at cpl_multiproc.cpp:1558
        papTLSList = 0x2b46357fd7b0
#536708 0x00002b4539931b5b in CPLGetTLS (nIndex=14) at cpl_multiproc.cpp:1574
        papTLSList = <value optimized out>

comment:16 in reply to:  15 Changed 5 years ago by akrherz

Replying to akrherz:

Replying to warmerdam:

Another crack at it in trunk (r25780).

Thanks, but am getting this recursive loop now:

Sorry, the line numbers are fouled up again, but the loop is there: CPLEmergencyError ends up calling CPLGetTLS.
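
In the trace above, the emergency path re-enters the TLS code through a configuration lookup for CPL_MAX_ERROR_REPORTS. A minimal sketch of how such a lookup could be bypassed while an emergency is in progress is shown below. It is not the fix that eventually landed in trunk: MaxErrorReports, ConfigLookup and kDefaultMaxErrorReports are hypothetical names, and the lookup parameter stands in for any configuration getter that may consult thread-local state.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical sketch; this is not GDAL's CPLDefaultErrorHandler().
// ConfigLookup stands in for a configuration getter that may consult
// thread-local state.
using ConfigLookup = const char *(*)(const char *key);

// Default value, matching the "1000" default visible in the trace.
constexpr int kDefaultMaxErrorReports = 1000;

// Returns the maximum number of error reports. While an emergency is in
// progress the lookup is skipped entirely and the default is used.
inline int MaxErrorReports(bool inEmergency, ConfigLookup lookup)
{
    assert(lookup != nullptr);
    if (inEmergency)
        return kDefaultMaxErrorReports;
    const char *value = lookup("CPL_MAX_ERROR_REPORTS");
    return value != nullptr ? std::atoi(value) : kDefaultMaxErrorReports;
}
```

The emergency branch returns before the lookup is ever called, so nothing on that path can reach thread-local storage.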

comment:17 Changed 5 years ago by akrherz

trunk no longer reproduces the infinite recursion issue, THANK YOU! whew

comment:18 Changed 5 years ago by Even Rouault

Milestone: 1.9.0 → 1.10.0
Resolution: fixed
Status: reopened → closed