Opened 4 years ago
Last modified 4 years ago
#4776 new defect
st_geometrytype causes 100% CPU loop in postgres13 on specific linestring
Reported by: | tvijlbrief | Owned by: | pramsey |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS PostgreSQL |
Component: | postgis | Version: | 3.0.x |
Keywords: | Cc: |
Description
I have a dataset which works fine with postgres12/postgis3.0.2 but a specific valid large linestring causes st_geometrytype() to hang:
=====demo13.sh fragment======================
create table demo(wkb_geometry geometry(Geometry,28992));
COPY demo (wkb_geometry) FROM stdin; 0102000020 data skipped (see attached for full script)
\.
select st_isvalid(wkb_geometry) from demo;
select ST_numpoints(wkb_geometry) from demo;
— Next statement will hang Postgres13 server in 100% CPU loop:
select ST_GeometryType(wkb_geometry) from demo;
=============================================
Output from script:
DROP TABLE CREATE TABLE COPY 1
st_isvalid
t
(1 row)
st_numpoints
1518
(1 row)
Script/server hangs….
Attachments (1)
Change History (16)
by , 4 years ago
comment:1 by , 4 years ago
To be clear, this script runs fine on postgres12 but fails with postgres13
comment:2 by , 4 years ago
I have tested this with postgresql-13-postgis-3 from deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main
but also with a Postgis compiled from source and the postgres:13.0-alpine docker image.
comment:3 by , 4 years ago
This might be a PostGIS 3.0 probably
I tried on my development version PostGIS 3.1
POSTGIS="3.1.0dev 3.1.0alpha2-111-gd781ecf29" [EXTENSION] PGSQL="130" GEOS="3.9.0-CAPI-1.14.0" PROJ="6.3.2" GDAL="GDAL 3.2.0, released 2020/10/26" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)" RASTER PostgreSQL 13.0 on x86_64-w64-mingw32, compiled by gcc.exe (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0, 64-bit
DROP TABLE CREATE TABLE COPY 1 st_isvalid ------------ t (1 row) st_numpoints -------------- 1518 (1 row) st_geometrytype ----------------- ST_LineString (1 row)
But my PostgreSQL 13 with PostGIS 3.0 bad news - same issue
POSTGIS="3.0.2 3.0.2" [EXTENSION] PGSQL="130" GEOS="3.8.1-CAPI-1.13.3" SFCGAL="1.3.8" PROJ="6.3.2" GDAL="GDAL 3.2.0, released 2020/10/26 GDAL_DATA not found" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.4.3 (Internal)" TOPOLOGY RASTER PostgreSQL 13.0, compiled by Visual C++ build 1914, 64-bit
Gets this far
DROP TABLE CREATE TABLE COPY 1 st_isvalid ------------ t (1 row) st_numpoints -------------- 1518 (1 row) hangs
comment:4 by , 4 years ago
Milestone: | PostGIS 3.1.0 → PostGIS 3.0.3 |
---|---|
Priority: | medium → blocker |
comment:5 by , 4 years ago
Running on the latest Pg13 and the PostGIS 3.0 stable branch I cannot replicate this. Could have been fixed in Pg, or in PostGIS? Or platform dependent? (I'm MacOS)
comment:6 by , 4 years ago
comment:7 by , 4 years ago
I also tried it with current Postgis master (3.1) and it does fail again with a Debian postgres13 docker base image. Looking at the Postgis source I cannot think of a reason why this particular linestring fails with st_geometrytype() and works OK with eg st_numpoints().
The only difference is that st_numpoints() fetches the whole TOAST object and st_geometrytype() just requests a slice at the front of the TOAST object. This looks to me like a low level issue with Postgres13 and TOAST objects of a specific size and or compression behavior.
I have monitored https://www.postgresql.org/list/pgsql-bugs/2020-10/ to see if a generic Postgres13 issue popped up, but it did not, so I decided to start with reporting this as a Postgis issue. Perhaps we should report this issue as a PostgreSQL-13 issue referring to this ticket?
It's a strange issue, also because it does not effect MacOS with this specific instance, and most linestrings work just OK. I was very relieved to see that @robe could reproduce the error.
comment:8 by , 4 years ago
Could this be related to
https://github.com/postgres/postgres/commit/11a078cf87ffb611d19c7dec6df68b41084ad9c9
This commit is new in PostgreSQL REL_13
comment:10 by , 4 years ago
When I disable compression the hang is fixed:
create table demo(wkb_geometry geometry(Geometry,28992)); alter table demo alter column wkb_geometry set storage external;
comment:11 by , 4 years ago
It's important o understand was the hanging in 3.0.2 caused by some hard to reproduce bug in PG or something within PostGIS. I would be very greatfull if you could share a call stack of hanging postgres process.
comment:12 by , 4 years ago
Top 10 stack entries when hanging:
#0 pglz_decompress (source=source@entry=0x55584d6c39b0 "", slen=<optimized out>, dest=dest@entry=0x55584d6c3a3c "", rawsize=rawsize@entry=52, check_complete=check_complete@entry=false) at ./build/../src/common/pg_lzcompress.c:767 #1 0x000055584c5906cb in toast_decompress_datum_slice (slicelength=52, attr=0x55584d6c39a8) at ./build/../src/backend/access/common/detoast.c:483 #2 detoast_attr_slice (attr=<optimized out>, sliceoffset=sliceoffset@entry=0, slicelength=52) at ./build/../src/backend/access/common/detoast.c:274 #3 0x000055584c9e6f2a in pg_detoast_datum_slice (datum=<optimized out>, first=first@entry=0, count=<optimized out>) at ./build/../src/backend/utils/fmgr/fmgr.c:1754 #4 0x00007f0bc257a7c1 in geometry_geometrytype (fcinfo=0x55584d6bea60) at lwgeom_ogc.c:199 #5 0x000055584c740c9e in ExecInterpExpr (state=0x55584d6be578, econtext=0x55584d6be260, isnull=<optimized out>) at ./build/../src/backend/executor/execExprInterp.c:699 #6 0x000055584c74ec62 in ExecEvalExprSwitchContext (isNull=0x7ffec8369497, econtext=0x55584d6be260, state=0x55584d6be578) at ./build/../src/include/executor/executor.h:313 #7 ExecProject (projInfo=0x55584d6be570) at ./build/../src/include/executor/executor.h:347 #8 ExecScan (node=<optimized out>, accessMtd=0x55584c773130 <SeqNext>, recheckMtd=0x55584c7731c0 <SeqRecheck>) at ./build/../src/backend/executor/execScan.c:238 #9 0x000055584c744bcd in ExecProcNode (node=0x55584d6be150) at ./build/../src/include/executor/executor.h:245 #10 ExecutePlan (execute_once=<optimized out>, dest=0x55584d6b8640, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x55584d6be150, estate=0x55584d6bded8) at ./build/../src/backend/executor/execMain.c:1646
comment:13 by , 4 years ago
I filed a Postgresql report:
https://www.postgresql.org/message-id/16694-f107871e499ec114%40postgresql.org
comment:15 by , 4 years ago
Milestone: | PostGIS 3.0.3 → PostGIS PostgreSQL |
---|---|
Priority: | blocker → medium |
I'll keep it open until PostgreSQL 13.1 is released but flipped to a PostgreSQL problem. I guess that is why it worked on my development PostgreSQL 13 cause that was compiled from git where the fix was probably already in place and my PostGIS 3.0.2/PostgreSQL 13.0 are the released versions in the wild.
Script to reproduce the problem