Opened 15 months ago

Closed 15 months ago

Last modified 15 months ago

#3644 closed defect (fixed)

interrupt_relate test hangs

Reported by: strk Owned by: pramsey
Priority: blocker Milestone: PostGIS 2.3.1
Component: postgis Version: trunk
Keywords: Cc:

Description

Running "make check" hangs indefinitely at the second "interrupt_relate .." line. The first one gets through.

This is on a just-upgraded "Ubuntu 16.04".

ps(1) shows the postgres backend sleeping:

31269 ?        Ss     0:00 postgres: strk postgis_reg [local] SELECT

The pg_stat_activity view shows the query as still active:

datid            | 56073418
datname          | postgis_reg
pid              | 31269
usesysid         | 10
usename          | strk
application_name | psql
client_addr      | 
client_hostname  | 
client_port      | -1
backend_start    | 2016-09-27 09:12:52.207264+02
xact_start       | 2016-09-27 09:12:52.248314+02
query_start      | 2016-09-27 09:12:52.248314+02
state_change     | 2016-09-27 09:12:52.248315+02
waiting          | f
state            | active
query            | select ST_Contains(g,g) from _inputs WHERE id = 1;

now() is:

now | 2016-09-27 09:18:03.375686+02

So after 5 minutes the process is still there, sleeping.

Change History (9)

comment:1 Changed 15 months ago by strk

PostgreSQL 9.3.9 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit

postgis_reg=# select postgis_full_version();
NOTICE:  Function postgis_gdal_version() not found.  Is raster support enabled and rtpostgis.sql installed?
DEBUG:  Function postgis_topology_scripts_installed() not found. Is topology support enabled and topology.sql installed?
NOTICE:  Function postgis_raster_scripts_installed() not found. Is raster support enabled and rtpostgis.sql installed?
NOTICE:  Function postgis_raster_lib_version() not found. Is raster support enabled and rtpostgis.sql installed?
-[ RECORD 1 ]--------+----------------------------------------------------------------------------------------------------------------------------------------------
postgis_full_version | POSTGIS="2.4.0dev r15155" GEOS="3.6.0dev-CAPI-1.10.0 r4257" SFCGAL="1.2.2" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.3" LIBJSON="0.12"

comment:2 Changed 15 months ago by strk

The process seems to be stuck due to a deadlock on IO:

#0  __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x00007f51a6e1f64a in __GI___libc_malloc (bytes=139988672314144, bytes@entry=4096)
    at malloc.c:2912
#2  0x00007f51a6e09185 in __GI__IO_file_doallocate (fp=0x7f51a7160620 <_IO_2_1_stdout_>)
    at filedoalloc.c:127
#3  0x00007f51a6e174c4 in __GI__IO_doallocbuf (fp=fp@entry=0x7f51a7160620 <_IO_2_1_stdout_>)
    at genops.c:398
#4  0x00007f51a6e16828 in _IO_new_file_overflow (f=0x7f51a7160620 <_IO_2_1_stdout_>, ch=-1)
    at fileops.c:820
#5  0x00007f51a6e151bd in _IO_new_file_xsputn (f=0x7f51a7160620 <_IO_2_1_stdout_>, 
    data=0x7f51a2c45f76, n=19) at fileops.c:1331
#6  0x00007f51a6e0b678 in _IO_puts (str=0x7f51a2c45f76 "Interrupt requested") at ioputs.c:40
#7  0x00007f51a2bc625d in handleInterrupt ()
   from /usr/src/postgis/postgis/regress/00-regress-install/lib/postgis-2.4.so
#8  <signal handler called>

comment:3 Changed 15 months ago by strk

Same problem with PostgreSQL 9.3.14. This is happening with libc6

comment:4 Changed 15 months ago by strk

Removing this line from "handleInterrupt" fixes the problem for me:

printf("Interrupt requested\n"); fflush(stdout);

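It turns out it was never a good idea to call printf from the interrupt handler: printf is not async-signal-safe. If the signal arrives while the interrupted code holds a libc lock (the malloc lock, judging by the backtrace in comment:2), the handler waits on that lock forever and the process deadlocks.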

See http://stackoverflow.com/questions/8738951/printk-inside-an-interrupt-handler-is-it-really-that-bad

I'm going to push a fix

comment:5 Changed 15 months ago by strk

Resolution: fixed
Status: new → closed

In 15156:

Do not call printf from interrupt handler, fixing deadlocks

Closes #3644

comment:6 Changed 15 months ago by strk

In 15157:

Do not call printf from interrupt handler, fixing deadlocks

Closes #3644 for 2.3 branch

comment:7 Changed 15 months ago by strk

In 15158:

Do not call printf from interrupt handler, fixing deadlocks

Closes #3644 for 2.2 branch

comment:8 Changed 15 months ago by strk

In 15159:

Do not call printf from interrupt handler, fixing deadlocks

Closes #3644 for 2.1 branch

comment:9 Changed 15 months ago by strk

A good ref if we want to further improve the code by blocking signals:

http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html
