Opened 3 years ago

Closed 3 years ago

#4985 closed defect (fixed)

32-bit crashers on cluster regress

Reported by: robe
Owned by: komzpa
Priority: blocker
Milestone: PostGIS 3.2.0
Component: postgis
Version: master
Keywords:
Cc:

Description

bessie32 and berrie have been crashing after commit

[45e49124/git]

From berrie (32-bit Raspberry Pi):

22:23:25 PATH is /home/jenkins/workspace/pg/label/berrie/rel/pg12w32/bin:/home/jenkins/workspace/pg/label/berrie/rel/pg12w32/lib:/usr/local/bin:/usr/bin:/bin:/usr/games
22:23:25 Checking for shp2pgsql ... found
22:23:25 Checking for pgsql2shp ... found
22:23:25 Checking for raster2pgsql ... found
22:23:25 TMPDIR is /tmp/pgis_reg
22:23:25 Database postgis_reg already exists, dropping.
22:23:25 Creating database 'postgis_reg' 
22:23:25 Preparing db 'postgis_reg' using: CREATE EXTENSION postgis SCHEMA public
22:23:28 Preparing db 'postgis_reg' using: CREATE EXTENSION postgis_topology
22:23:28 Preparing db 'postgis_reg' using: CREATE EXTENSION postgis_raster SCHEMA public
22:23:29 PostgreSQL 12.6 on aarch64-unknown-linux-gnu, compiled by gcc (Raspbian 8.3.0-6+rpi1) 8.3.0, 32-bit
22:23:29   Postgis 3.2.0dev - (5f4b44e) - 2021-08-31 02:17:48
22:23:29   scripts 3.2.0dev 5f4b44e
22:23:29   raster scripts 3.2.0dev 5f4b44e
22:23:29   GEOS: 3.7.1-CAPI-1.11.1 27a5e771
22:23:29   PROJ: Rel. 5.2.0, September 15th, 2018
22:23:29   GDAL: GDAL 2.4.0, released 2018/12/14
22:23:29 
22:23:29 Running tests

22:23:29 
22:23:29  after-create-script /home/jenkins/workspace/PostGIS_Worker_Run/label/berrie/5f4b44edb30c8609d3f812d1f0bc6fd07d750f0f/regress/hooks/hook-after-create.sql .. ok 
22:23:29  ./regress/core/affine .. ok in 81 ms
22:23:29  ./regress/core/bestsrid .. ok in 47 ms
22:23:29  ./regress/core/binary .. ok in 119 ms
22:23:29  ./regress/core/boundary .. ok in 49 ms
22:23:29  ./regress/core/chaikin .. ok in 43 ms
22:23:29  ./regress/core/filterm .. ok in 40 ms
22:23:29  ./regress/core/cluster ..Died at /home/jenkins/workspace/PostGIS_Worker_Run/label/berrie/5f4b44edb30c8609d3f812d1f0bc6fd07d750f0f/regress/run_test.pl line 744.
22:23:30  failed (psql exited with an error: /tmp/pgis_reg/test_8_out)
22:23:30 -----------------------------------------------------------------------------
22:23:30 t1|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:23:30 t1|GEOMETRYCOLLECTION(LINESTRING(6 6,7 7))
22:23:30 t1|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:23:30 t2|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:23:30 t2|GEOMETRYCOLLECTION(LINESTRING(6 6,7 7))
22:23:30 t2|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:23:30 t3|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:23:30 t3|GEOMETRYCOLLECTION(LINESTRING(6 6,7 7))
22:23:30 t3|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:23:30 t4|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),LINESTRING(6 6,7 7),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:23:30 t4|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:23:30 t101|1|0
22:23:30 t101|2|0
22:23:30 t101|3|0
22:23:30 t101|4|1
22:23:30 t101|5|1
22:23:30 t101|6|1
22:23:30 t102|1|
22:23:30 t102|2|
22:23:30 t102|3|
22:23:30 t102|4|
22:23:30 t102|5|
22:23:30 t102|6|
22:23:30 t103|1|
22:23:30 t103|2|
22:23:30 t103|3|
22:23:30 t103|4|0
22:23:30 t103|5|0
22:23:30 t103|6|0
22:23:30 #3612a|
22:23:30 #3612a|
22:23:30 #3612b|
22:23:30 psql:cluster.sql:50: server closed the connection unexpectedly
22:23:30 	This probably means the server terminated abnormally
22:23:30 	before or while processing the request.
22:23:30 psql:cluster.sql:50: fatal: connection to server was lost
22:23:30 -----------------------------------------------------------------------------
22:23:30 make: *** [regress/runtest.mk:11: check-regress] Error 2
22:23:30 Build step 'Execute shell' marked build as failure
22:23:38 Finished: FAILURE

From bessie32 (FreeBSD 12, 32-bit):

22:21:23 PostgreSQL 12.7 on i386-portbld-freebsd12.2, compiled by gcc10 (FreeBSD Ports Collection) 10.2.0, 32-bit
22:21:23   Postgis 3.2.0dev - (5f4b44e) - 2021-08-30 02:55:55
22:21:23   scripts 3.2.0dev 5f4b44e
22:21:23   raster scripts 3.2.0dev 5f4b44e
22:21:23   GEOS: 3.9.1-CAPI-1.14.2
22:21:23   PROJ: 7.2.1
22:21:23   SFCGAL: 1.3.8
22:21:23   GDAL: GDAL 3.2.1, released 2020/12/29
22:21:23 

22:21:25  ./regress/core/cluster ..Died at /usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/5f4b44edb30c8609d3f812d1f0bc6fd07d750f0f/regress/run_test.pl line 744.
22:21:29  failed (psql exited with an error: /home/jenkins/tmp/pgis_reg_5f4b44edb30c8609d3f812d1f0bc6fd07d750f0f/test_8_out)
22:21:29 -----------------------------------------------------------------------------
22:21:29 t1|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:21:29 t1|GEOMETRYCOLLECTION(LINESTRING(6 6,7 7))
22:21:29 t1|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:21:29 t2|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:21:29 t2|GEOMETRYCOLLECTION(LINESTRING(6 6,7 7))
22:21:29 t2|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:21:29 t3|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:21:29 t3|GEOMETRYCOLLECTION(LINESTRING(6 6,7 7))
22:21:29 t3|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:21:29 t4|GEOMETRYCOLLECTION(LINESTRING(0 0,1 1),LINESTRING(5 5,4 4),LINESTRING(0 0,-1 -1),LINESTRING(6 6,7 7),POLYGON((0 0,4 0,4 4,0 4,0 0)))
22:21:29 t4|GEOMETRYCOLLECTION(POLYGON EMPTY)
22:21:29 t101|1|0
22:21:29 t101|2|0
22:21:29 t101|3|0
22:21:29 t101|4|1
22:21:29 t101|5|1
22:21:29 t101|6|1
22:21:29 t102|1|
22:21:29 t102|2|
22:21:29 t102|3|
22:21:29 t102|4|
22:21:29 t102|5|
22:21:29 t102|6|
22:21:29 t103|1|
22:21:29 t103|2|
22:21:29 t103|3|
22:21:29 t103|4|0
22:21:29 t103|5|0
22:21:29 t103|6|0
22:21:29 #3612a|
22:21:29 #3612a|
22:21:29 #3612b|
22:21:29 psql:cluster.sql:50: server closed the connection unexpectedly
22:21:29 	This probably means the server terminated abnormally
22:21:29 	before or while processing the request.
22:21:29 psql:cluster.sql:50: fatal: connection to server was lost
22:21:29 -----------------------------------------------------------------------------
22:21:29 gmake[1]: *** [regress/runtest.mk:11: check-regress] Error 2
22:21:29 gmake[1]: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/5f4b44edb30c8609d3f812d1f0bc6fd07d750f0f'
22:21:29 *** Error code 2
22:21:29 
22:21:29 Stop.

Attachments (2)

btfull_berrie.txt (10.5 KB ) - added by robe 3 years ago.
btfull_berrie_7a7736ae.txt (10.1 KB ) - added by robe 3 years ago.


Change History (7)

comment:1 by komzpa, 3 years ago

Owner: changed from pramsey to komzpa

comment:2 by robe, 3 years ago

Here is the backtrace from berrie. I also tried bessie32, but I guess because I am using stock postgres on her, it gives me no line numbers into our code.

Reading symbols from /home/jenkins/workspace/pg/label/berrie/rel/pg12w32/bin/postgres...done.
Reading symbols from /usr/lib/arm-linux-gnueabihf/libarmmem-v8l.so...(no debugging symbols found)...done.
Reading symbols from /lib/arm-linux-gnueabihf/libpthread.so.0...Reading symbols from /usr/lib/debug/.build-id/79/58164ddcdf86b06e4a06700f80a4655a80c40e.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Reading symbols from /lib/arm-linux-gnueabihf/librt.so.1...Reading symbols from /usr/lib/debug/.build-id/1d/5428c68f929c6b4c0b781a608fc95ccb886efb.debug...done.
done.
Reading symbols from /lib/arm-linux-gnueabihf/libdl.so.2...Reading symbols from /usr/lib/debug/.build-id/bc/c9ee6666973f3f41da5bdf8c8d7c299e4a09a2.debug...done.
done.
Reading symbols from /lib/arm-linux-gnueabihf/libm.so.6...Reading symbols from /usr/lib/debug/.build-id/24/22b5cf95895f5f4f69dae68ad9f99198e43121.debug...done.
done.
Reading symbols from /lib/arm-linux-gnueabihf/libc.so.6...Reading symbols from /usr/lib/debug/.build-id/ef/dd27c16f5283e5c53dcbd1bbc3ef136e312d1b.debug...done.
done.
Reading symbols from /lib/ld-linux-armhf.so.3...Reading symbols from /usr/lib/debug/.build-id/fb/85e699c11db06c7b24f74de2cdada3146442a8.debug...done.
done.
Reading symbols from /lib/arm-linux-gnueabihf/libnss_files.so.2...Reading symbols from /usr/lib/debug/.build-id/d4/cee2c3a91545cb062bc7bcb38c08abeb15d0ec.debug...done.
done.
0xf769c98c in epoll_wait (epfd=<optimized out>, events=0x20d0dc0, maxevents=maxevents@entry=1, timeout=-1, timeout@entry=2) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xee3fcf1c in ST_ClusterKMeans (fcinfo=0xffca400c) at lwgeom_window.c:211
211                     max_radius = DatumGetFloat8(WinGetFuncArgCurrent(winobj, 2, &isnull));

comment:3 by robe, 3 years ago

bt full

gives

0xee3fcf1c in ST_ClusterKMeans (fcinfo=0xffca400c) at lwgeom_window.c:211
211                     max_radius = DatumGetFloat8(WinGetFuncArgCurrent(winobj, 2, &isnull));
(gdb) bt full
#0  0xee3fcf1c in ST_ClusterKMeans (fcinfo=0xffca400c) at lwgeom_window.c:211
        i = <optimized out>
        k = 35830033
        geoms = <optimized out>
        N = 3
        isnull = true
        isout = false
        max_radius = 0
        r = <optimized out>
        winobj = 0x222cdb0
        context = 0x21a8ed0
        curpos = <optimized out>
        rowcount = <optimized out>
#1  0x00261ff4 in eval_windowfunction (perfuncstate=0x222cbc8, result=0x222cb98, isnull=0x222cbb0, winstate=<optimized out>, winstate=<optimized out>) at nodeWindowAgg.c:1055
        fcinfodata = {fcinfo = {flinfo = 0x222cbd4, context = 0x222cdb0, resultinfo = 0x0, fncollation = 0, isnull = false, nargs = 3, args = 0xffca4020},
          fcinfo_data = "\324\313\"\002\260\315\"\002\000\000\000\000\000\000\000\000\000K\003\000\200\002\000\000\001\206p\000\000\000\000\000\001\000\000\000\351\375\000\000\001\232\030\002@V!\002X\263M\000\350\v$\002\260\236c\367\324\367p\367\240@\312\377\300K\"\002\305K\"\002\000\000\000\000\300k\"\000\300K\"\002\310K\"\002\305K\"\002\070w\"\000\330\350\"\002\324\367p\367\b\370p\367\377\017\000\000\b\370p\367\324\367p\367X\362p\367\000\000\000\000\b\200\000\000\224mc\367\000\200\000\000\000\000\000\000\020\000\000\000\310K\"\002\305K\"\002\230\203 \002\fDM\000\000\353\033\002\020\000\000\000\274\206p\000\001\000\000\000\000\000\000\000@\354\033\002\003\000\000\000\001\000\000\000"...}
        fcinfo = 0xffca4004
        oldContext = 0x2188da8
#2  0x00264ce0 in ExecWindowAgg (pstate=0x21893a8) at nodeWindowAgg.c:2197
        perfuncstate = <optimized out>
        winstate = 0x21893a8
        econtext = 0x222cbb0
        i = 0
        numfuncs = 1
        __func__ = "ExecWindowAgg"
#3  0x00253224 in ExecProcNode (node=0x21893a8) at ../../../src/include/executor/executor.h:242
No locals.
#4  ExecLimit (pstate=0x2189230) at nodeLimit.c:95
        node = 0x2189230
        direction = ForwardScanDirection
        slot = <optimized out>
        outerPlan = 0x21893a8
        __func__ = "ExecLimit"
#5  0x00243c90 in ExecProcNode (node=0x2189230) at ../../../src/include/executor/executor.h:242
No locals.
#6  fetch_input_tuple (aggstate=aggstate@entry=0x2188fb0) at nodeAgg.c:406
        slot = <optimized out>
#7  0x002455c0 in agg_retrieve_direct (aggstate=0x2188fb0) at nodeAgg.c:1748
        econtext = <optimized out>
        firstSlot = <optimized out>
        numGroupingSets = 1
        node = 0x220bee0
        tmpcontext = 0x21890d8
        peragg = 0x20f60d8
        outerslot = <optimized out>
        nextSetSize = <optimized out>
        pergroups = 0x220c358
        result = <optimized out>
        hasGroupingSets = true
        currentSet = <optimized out>
        numReset = 1
        i = <optimized out>
        node = <optimized out>
        econtext = <optimized out>
        tmpcontext = <optimized out>
        peragg = <optimized out>
        pergroups = <optimized out>
        outerslot = <optimized out>
        firstSlot = <optimized out>
        result = <optimized out>
        hasGroupingSets = <optimized out>
        numGroupingSets = <optimized out>
        currentSet = <optimized out>
        nextSetSize = <optimized out>
        numReset = <optimized out>
        i = <optimized out>
#8  ExecAgg (pstate=0x2188fb0) at nodeAgg.c:1563

by robe, 3 years ago

Attachment: btfull_berrie.txt added

comment:4 by robe, 3 years ago

Here is the bt full after [7a7736ae/git]

#0  0xee5c2f1c in ST_ClusterKMeans (fcinfo=0xffb570fc) at lwgeom_window.c:214
        i = <optimized out>
        k = 41982649
        geoms = <optimized out>
        N = 3
        isnull = true
        isout = false
        max_radius = 0
        r = <optimized out>
        argdatum = 0
        winobj = 0x280b0b8
        context = 0x2786f48
        curpos = <optimized out>
        rowcount = <optimized out>
#1  0x00261ff4 in eval_windowfunction (perfuncstate=0x280ad70, result=0x280ad40, isnull=0x280ad58, winstate=<optimized out>, winstate=<optimized out>) at nodeWindowAgg.c:1055
        fcinfodata = {fcinfo = {flinfo = 0x280ad7c, context = 0x280b0b8, resultinfo = 0x0, fncollation = 0, isnull = false, nargs = 3, args = 0xffb57110},
          fcinfo_data = "|\255\200\002\270\260\200\002\000\000\000\000\000\000\000\000\000+\003\000\200\002\000\000\001\206p\000\000\000\000\000\001\000\000\000\351\375\000\000\001\330v\002\330\026\177\002X\263M\000\210\354\201\002\260\376\177\367\324W\215\367\220q\265\377`,\200\002e,\200\002\000\000\000\000\300k\"\000`,\200\002h,\200\002e,\200\002\070w\"\000pɀ\002\324W\215\367\bX\215\367\377\017\000\000\bX\215\367\324W\215\367XR\215\367\000\000\000\000\b\200\000\000\224\315\177\367\000\200\000\000\000\000\000\000\020\000\000\000h,\200\002e,\200\002\020d~\002\fDM\000x\313y\002\020\000\000\000\274\206p\000\001\000\000\000\000\000\000\000\270\314y\002\003\000\000\000\001\000\000\000X"...}
        fcinfo = 0xffb570f4
        oldContext = 0x276caf0
#2  0x00264ce0 in ExecWindowAgg (pstate=0x276d0f0) at nodeWindowAgg.c:2197
        perfuncstate = <optimized out>
        winstate = 0x276d0f0
        econtext = 0x280ad58
        i = 0
        numfuncs = 1
        __func__ = "ExecWindowAgg"
#3  0x00253224 in ExecProcNode (node=0x276d0f0) at ../../../src/include/executor/executor.h:242

Full output attached

by robe, 3 years ago

Attachment: btfull_berrie_7a7736ae.txt added

comment:5 by kalenikaliaksandr <kalenik.aliaksandr@…>, 3 years ago

Resolution: fixed
Status: changed from new to closed

In 81bcf3d/git:

fix kmeans crash on 32-bit. Closes #4985
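
For context: both backtraces above show `isnull = true` (and, in the second one, `argdatum = 0`) at the crashing `DatumGetFloat8` call. On 32-bit builds PostgreSQL passes float8 by reference, so converting a zero Datum means dereferencing a null pointer, which would explain why only the 32-bit bots crash. The following is a minimal standalone sketch of that failure mode and the usual guard. It does not use PostgreSQL headers; `Datum`, `datum_get_float8_byref` and `get_max_radius` are stand-ins invented for illustration, and this is not the actual change made in 81bcf3d.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum: a pointer-sized integer. */
typedef uintptr_t Datum;

/*
 * Mimics a pass-by-reference float8: the Datum holds a pointer to the
 * double. Calling this with a zero Datum dereferences a null pointer.
 */
static double datum_get_float8_byref(Datum d)
{
    return *(const double *) d;
}

/*
 * Hypothetical guard: convert the argument only when it is not NULL,
 * otherwise return a caller-supplied fallback.
 */
static double get_max_radius(Datum arg, bool isnull, double fallback)
{
    if (isnull)
        return fallback;
    return datum_get_float8_byref(arg);
}
```

With a guard of this shape, a NULL third argument yields the fallback instead of reaching the dereference; on 64-bit builds, where float8 is passed by value, the unguarded conversion simply returns 0 and the problem stays hidden.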
