Opened 5 months ago

Closed 3 months ago

#4125 closed defect (fixed)

postgis 2.5 crashes with PG11 on Debian stretch/amd64

Reported by: myon Owned by: robe
Priority: high Milestone: PostGIS 2.5.0
Component: documentation Version: trunk
Keywords: Cc:

Description

PostGIS svn-trunk (git ef537b71c, also 2.5.0beta1) crashes with PG11 (11beta2) on Debian stretch/amd64. Other PG versions and other Debian/Ubuntu releases on amd64 are not affected.

How to reproduce:

  • Debian stretch/amd64
  • PostgreSQL 11~beta2-2.pgdg90+1 from apt.postgresql.org
  • POSTGIS="2.5.0beta2dev r16647" [EXTENSION] PGSQL="110" GEOS="3.5.1-CAPI-1.9.1 r4246" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.1.2, released 2016/10/24" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.2.1" RASTER

Running regress/tickets.sql crashes the server:

\pset pager off
\i regress/tickets.sql
2018-07-17 10:42:20.218 CEST [26234] LOG:  server process (PID 32436) was terminated by signal 11: Segmentation fault
2018-07-17 10:42:20.218 CEST [26234] DETAIL:  Failed process was running: SELECT '#408.3', st_isvalid('0106000020BB0B000001000000010300000005000000D6000000000000C0F1A138410AD7A3103190524114AE4721F7A138410000000030905241713D0A57FAA1384185EB51982C9052417B14AE87FAA13841000000402A905241AE47E13AFBA1384114AE474128905241EC51B81EFDA138413D0AD7632690524152B81E85FFA13841D7A3707D259052415

Running this query on its own does not crash the server, but the crash is reproducible when running the full tickets.sql file.

Change History (13)

comment:1 Changed 5 months ago by TobWen

I can verify the same crash for stretch/amd64 with GEOS 3.7.0~beta1-1~exp1 backported from Debian experimental.

comment:2 Changed 5 months ago by robe

Priority: medium → blocker

comment:3 Changed 5 months ago by robe

comment:4 Changed 5 months ago by komzpa

I installed a fresh stretch/amd64, added the pgdg repos, and took PostGIS trunk, with the following versions:

psql (11beta2 (Ubuntu 11~beta2-2.pgdg18.04+1)) PostgreSQL 11beta2 (Ubuntu 11~beta2-2.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0, 64-bit

Postgis 2.5.0beta2dev - r16657 - 2018-07-21 18:55:19 scripts 2.5.0beta2dev r16657 GEOS: 3.5.1-CAPI-1.9.1 r4246 PROJ: Rel. 4.9.3, 15 August 2016

It didn't crash.

I just noticed that I had taken PostgreSQL from the bionic pgdg repos instead. I'll recheck with stretch's.

comment:5 Changed 5 months ago by komzpa

Unlike bionic's, stretch's PostgreSQL package brings in clang-3.9. After a recompile, this leads to a crash:

Thread 1 (Thread 0x7faee2354900 (LWP 29243)):
#0  0x00007faedc0282de in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00007faedc0287d9 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#2  0x00007faedc029076 in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#3  0x00007faedc025b13 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007faedc026d30 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007faedc0271de in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007faedc2bd2bc in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007faecf2b1dbc in geos::geom::LinearRing::validateConstruction() () from /usr/lib/x86_64-linux-gnu/libgeos-3.5.1.so
#8  0x00007faecf2b1f3b in geos::geom::LinearRing::LinearRing(geos::geom::CoordinateSequence*, geos::geom::GeometryFactory const*) () from /usr/lib/x86_64-linux-gnu/libgeos-3.5.1.so
#9  0x00007faecf2af6f5 in geos::geom::GeometryFactory::createLinearRing(geos::geom::CoordinateSequence*) const () from /usr/lib/x86_64-linux-gnu/libgeos-3.5.1.so
#10 0x00007faed0634a7a in GEOSGeom_createLinearRing_r () from /usr/lib/x86_64-linux-gnu/libgeos_c.so.1
#11 0x00007faed08d9e00 in ptarray_to_GEOSLinearRing (autofix=<optimized out>, pa=<optimized out>) at lwgeom_geos.c:304
#12 LWGEOM2GEOS (lwgeom=0x55967cde1de8, autofix=<optimized out>) at lwgeom_geos.c:426
#13 0x00007faed08d9d09 in LWGEOM2GEOS (lwgeom=lwgeom@entry=0x55967cde1d80, autofix=autofix@entry=0 '\000') at lwgeom_geos.c:467
#14 0x00007faed08722cd in isvalid (fcinfo=0x55967ce42330) at lwgeom_geos.c:1414
#15 0x000055967ac52b11 in ?? ()
#16 0x000055967ad0a59b in ?? ()
#17 0x000055967ad0de83 in ?? ()
#18 0x000055967ad0c9ea in ?? ()
#19 0x000055967aca50cf in expression_tree_mutator ()
#20 0x000055967ad0c7a2 in ?? ()
#21 0x000055967aca533b in expression_tree_mutator ()
#22 0x000055967ad0c7a2 in ?? ()
#23 0x000055967ad0dccf in eval_const_expressions ()
#24 0x000055967acf5347 in ?? ()
#25 0x000055967acfb98b in subquery_planner ()
#26 0x000055967acfcca5 in standard_planner ()
#27 0x000055967ada7150 in pg_plan_query ()
#28 0x000055967ada7226 in pg_plan_queries ()
#29 0x000055967ada773e in ?? ()
#30 0x000055967ada9473 in PostgresMain ()
#31 0x000055967aab8910 in ?? ()
#32 0x000055967ad34ff3 in PostmasterMain ()
#33 0x000055967aab9e04 in main ()

comment:6 Changed 5 months ago by komzpa

More symbols:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: 11/main: root postgis_reg [local] SELECT                            '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007faedc0282de in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1

Thread 1 (Thread 0x7faee2354900 (LWP 1276)):
#0  0x00007faedc0282de in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00007faedc0287d9 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#2  0x00007faedc029076 in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#3  0x00007faedc025b13 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#4  0x00007faedc026d30 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007faedc0271de in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007faedc2bd2bc in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007faecf2b1dbc in geos::geom::LinearRing::validateConstruction() () from /usr/lib/x86_64-linux-gnu/libgeos-3.5.1.so
#8  0x00007faecf2b1f3b in geos::geom::LinearRing::LinearRing(geos::geom::CoordinateSequence*, geos::geom::GeometryFactory const*) () from /usr/lib/x86_64-linux-gnu/libgeos-3.5.1.so
#9  0x00007faecf2af6f5 in geos::geom::GeometryFactory::createLinearRing(geos::geom::CoordinateSequence*) const () from /usr/lib/x86_64-linux-gnu/libgeos-3.5.1.so
#10 0x00007faed0634a7a in GEOSGeom_createLinearRing_r () from /usr/lib/x86_64-linux-gnu/libgeos_c.so.1
#11 0x00007faed08d9e00 in ptarray_to_GEOSLinearRing (autofix=<optimized out>, pa=<optimized out>) at lwgeom_geos.c:304
#12 LWGEOM2GEOS (lwgeom=0x55967cde1de8, autofix=<optimized out>) at lwgeom_geos.c:426
#13 0x00007faed08d9d09 in LWGEOM2GEOS (lwgeom=lwgeom@entry=0x55967cde1d80, autofix=autofix@entry=0 '\000') at lwgeom_geos.c:467
#14 0x00007faed08722cd in isvalid (fcinfo=0x55967ce42350) at lwgeom_geos.c:1414
#15 0x000055967ac52b11 in ExecInterpExpr (state=0x55967ce42270, econtext=0x55967ce42b70, isnull=<optimized out>) at ./build/../src/backend/executor/execExprInterp.c:678
#16 0x000055967ad0a59b in ExecEvalExprSwitchContext (isNull=0x7ffed2780e74, econtext=<optimized out>, state=0x55967ce42270) at ./build/../src/include/executor/executor.h:303
#17 evaluate_expr (expr=<optimized out>, result_type=result_type@entry=16, result_typmod=result_typmod@entry=-1, result_collation=result_collation@entry=0) at ./build/../src/backend/optimizer/util/clauses.c:4880
#18 0x000055967ad0de83 in evaluate_function (context=0x7ffed27811f0, func_tuple=0x7faee22b1408, funcvariadic=false, args=0x55967ce49078, input_collid=0, result_collid=0, result_typmod=-1, result_type=16, funcid=79732) at ./build/../src/backend/optimizer/util/clauses.c:4422
#19 simplify_function (funcid=79732, result_type=16, result_typmod=-1, result_collid=result_collid@entry=0, input_collid=input_collid@entry=0, args_p=args_p@entry=0x7ffed2781010, funcvariadic=false, process_args=true, allow_non_const=true, context=0x7ffed27811f0) at ./build/../src/backend/optimizer/util/clauses.c:4062
#20 0x000055967ad0c9ea in eval_const_expressions_mutator (node=0x55967cd09fd0, context=0x7ffed27811f0) at ./build/../src/backend/optimizer/util/clauses.c:2674
#21 0x000055967aca50cf in expression_tree_mutator (node=node@entry=0x55967ce47fd0, mutator=mutator@entry=0x55967ad0c740 <eval_const_expressions_mutator>, context=context@entry=0x7ffed27811f0) at ./build/../src/backend/nodes/nodeFuncs.c:3033
#22 0x000055967ad0c7a2 in eval_const_expressions_mutator (node=0x55967ce47fd0, context=0x7ffed27811f0) at ./build/../src/backend/optimizer/util/clauses.c:3669
#23 0x000055967aca533b in expression_tree_mutator (node=node@entry=0x55967cd091f0, mutator=mutator@entry=0x55967ad0c740 <eval_const_expressions_mutator>, context=context@entry=0x7ffed27811f0) at ./build/../src/backend/nodes/nodeFuncs.c:2914
#24 0x000055967ad0c7a2 in eval_const_expressions_mutator (node=0x55967cd091f0, context=context@entry=0x7ffed27811f0) at ./build/../src/backend/optimizer/util/clauses.c:3669
#25 0x000055967ad0dccf in eval_const_expressions (root=root@entry=0x55967cd09bc0, node=<optimized out>) at ./build/../src/backend/optimizer/util/clauses.c:2472
#26 0x000055967acf5347 in preprocess_expression (root=root@entry=0x55967cd09bc0, expr=<optimized out>, kind=kind@entry=1) at ./build/../src/backend/optimizer/plan/planner.c:1041
#27 0x000055967acfb98b in subquery_planner (glob=glob@entry=0x55967ce496c8, parse=parse@entry=0x55967cd09020, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at ./build/../src/backend/optimizer/plan/planner.c:732
#28 0x000055967acfcca5 in standard_planner (parse=0x55967cd09020, cursorOptions=256, boundParams=<optimized out>) at ./build/../src/backend/optimizer/plan/planner.c:405
#29 0x000055967ada7150 in pg_plan_query (querytree=querytree@entry=0x55967cd09020, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at ./build/../src/backend/tcop/postgres.c:809
#30 0x000055967ada7226 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at ./build/../src/backend/tcop/postgres.c:875
#31 0x000055967ada773e in exec_simple_query (query_string=0x55967cf74678 "SELECT '#408.3', st_isvalid('0106000020BB0B000001000000010300000005000000D6", '0' <repeats 12 times>, "C0F1A138410AD7A3103190524114AE4721F7A138410000000030905241713D0A57FAA1384185EB51982C9052417B14AE87FAA138410000004"...) at ./build/../src/backend/tcop/postgres.c:1050
#32 0x000055967ada9473 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x55967cd50208, dbname=<optimized out>, username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4153
#33 0x000055967aab8910 in BackendRun (port=0x55967cd48770) at ./build/../src/backend/postmaster/postmaster.c:4361
#34 BackendStartup (port=0x55967cd48770) at ./build/../src/backend/postmaster/postmaster.c:4033
#35 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1706
#36 0x000055967ad34ff3 in PostmasterMain (argc=5, argv=0x55967cd02e40) at ./build/../src/backend/postmaster/postmaster.c:1379
#37 0x000055967aab9e04 in main (argc=5, argv=0x55967cd02e40) at ./build/../src/backend/main/main.c:228

I see a very similar stack trace with PG11 on Travis, PR: https://github.com/postgis/postgis/pull/262

Another thing I noticed is that clang is used there to emit LLVM code:

/usr/bin/clang-3.9 -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2  -I../liblwgeom -g -O1 -I../libpgcommon  -I/usr/include    -I/usr/include/libxml2 -I/usr/include -DHAVE_SFCGAL    -fPIC -I/usr/include -DHAVE_SFCGAL -I. -I./ -I/usr/include/postgresql/11/server -I/usr/include/postgresql/internal -I/usr/include/x86_64-linux-gnu   -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include/mit-krb5 -flto=thin -emit-llvm -c -o postgis_module.bc postgis_module.c

comment:7 Changed 5 months ago by komzpa

Milestone: PostGIS 2.5.0 → PostGIS PostgreSQL

On the recommendation of Myon and RhodiumToad on #postgresql@freenode, I set the configuration variable jit=off. The test suite passed. I am writing a letter to postgresql-hackers@.
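For anyone needing the same workaround, here is a minimal sketch of running the regression file with JIT disabled for one session only. It assumes psql picks up libpq's PGOPTIONS environment variable and that the test database is called postgis_reg (the name visible in the core dump output above). The function name and the PSQL override are hypothetical, added only for illustration.

```shell
# Hypothetical helper: run regress/tickets.sql with jit=off for this session only.
# PGOPTIONS is read by libpq and passes "-c jit=off" to the backend at connect time.
# PSQL may be overridden to point at a specific psql binary (assumption, not from the ticket).
run_tickets_without_jit() {
    db=${1:-postgis_reg}
    PGOPTIONS="-c jit=off" "${PSQL:-psql}" -X -d "$db" -f regress/tickets.sql
}
```

Calling run_tickets_without_jit postgis_reg from the PostGIS source directory should leave the server-wide jit setting untouched, so other sessions still exercise the JIT code path.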

comment:9 Changed 5 months ago by myon

I just switched postgresql-11 on stretch to clang-4.0 (instead of 3.9). It still crashes on the same query in tickets.sql.

comment:10 Changed 3 months ago by myon

In the meantime, the problem has been diagnosed as LLVM < 5 having problems handling C++ exceptions. We fixed it on apt.postgresql.org for stretch by upgrading to LLVM 6. (We'll soon go to 7 because 6.0.1 supports x86 only.)

I suggest documenting that PostGIS shouldn't be used with PG11+JIT before LLVM 6.
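A rough sketch of how a packaging or CI script might enforce that suggestion. The helper name is hypothetical and the cutoff of 6 is taken from this comment. How the LLVM version string is obtained is left as an assumption (for example from llvm-config --version, whose binary name varies between distributions).

```shell
# Hypothetical helper: succeed only if the given LLVM version is new enough
# for PostGIS with PG11 JIT, using the "LLVM 6 or later" rule suggested above.
# Usage (assumption): llvm_ok_for_postgis_jit "$(llvm-config --version)"
llvm_ok_for_postgis_jit() {
    major=${1%%.*}                      # keep everything before the first dot
    case $major in
        ''|*[!0-9]*) return 2 ;;        # empty or non-numeric: cannot decide
    esac
    [ "$major" -ge 6 ]
}
```

The function returns 0 for 6.x and later, 1 for older versions such as 3.9 or 4.0, and 2 when the version string cannot be parsed.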

comment:11 Changed 3 months ago by robe

Component: postgis → documentation
Milestone: PostGIS PostgreSQL → PostGIS 2.5.0
Owner: changed from pramsey to robe
Priority: blocker → high

Will close once we've added to documentation that LLVM >= 5 is needed

comment:12 Changed 3 months ago by robe

In 16805:

Document that LLVM >= 6 is required for JIT compiles.
References #4125 for PostGIS 3.0.0

comment:13 Changed 3 months ago by robe

Resolution: → fixed
Status: new → closed

In 16806:

Document that LLVM >= 6 is required for JIT compiles.
Closes #4125 for PostGIS 2.5.0
