#5139 closed defect (fixed)
PostGIS causes to_jsonb to no longer be parallel safe, ST_AsGeoJSON and ST_AsGML are also parallel unsafe
Reported by: | samuelspurling | Owned by: | pramsey |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.1.7 |
Component: | postgis | Version: | 3.2.x |
Keywords: | Cc: |
Description
Hi,
Below was using PostgreSQL 13 and PostGIS 3.2.1.
I believe that the automatic conversion of the geometry field to geojson when using to_json or to_jsonb is causing a query of ours to fail.
Below query has been trimmed down a lot to try and show the issue. In the below case 'a_large_table' needs to be big enough to cause the query planner to use parallel queries (or force it using other settings).
-- Does not work WITH object_list AS ( SELECT guid,'a_large_table' AS table_name, to_jsonb((a_large_table.geom)) AS json_data FROM a_large_table WHERE reference like 'A00171%' ) SELECT * INTO TEMPORARY TABLE object_list_temp FROM object_list;
This gives the output: ERROR: cannot start commands during a parallel operation
Where: SQL statement "SELECT auth_name ':' auth_srid FROM public.spatial_ref_sys WHERE srid='27700'"
-- If casting the geometry to text the query then works as expected WITH object_list AS ( SELECT guid,'a_large_table' AS table_name, to_jsonb((a_large_table.geom::text)) AS json_data FROM a_large_table WHERE reference like 'A00171%' ) SELECT * INTO TEMPORARY TABLE object_list_temp FROM object_list;
Above works correctly for comparison, only difference being the geometry casting to text inside the to_jsonb. It is also worth noting that the part inside the CTE does work correctly when run on its own in either case.
I believe the selection into the temporary table causes it to be parallel restricted (? I am not a DBA, please correct me if incorrect) which is why after surrounding the other query something inside it is not marked as parallel safe and fails.
Any help appreciated, but I believe this a bug.
Thanks
Change History (14)
comment:1 by , 2 years ago
comment:2 by , 2 years ago
I was able to trigger the same error -
ERROR: cannot start commands during a parallel operation CONTEXT: SQL statement "SELECT auth_name||':'||auth_srid FROM public.spatial_ref_sys WHERE srid='27700'"
on both {{{ PostgreSQL 13.4, compiled by Visual C++ build 1914, 64-bit POSTGIS="3.2.0 3.2.0" [EXTENSION] PGSQL="130" GEOS="3.10.1-CAPI-1.16.0" PROJ="7.2.1" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)" }}}
and:
PostgreSQL 14.1, compiled by Visual C++ build 1914, 64-bit POSTGIS="3.2.1 3.2.1" [EXTENSION] PGSQL="140" GEOS="3.10.2-CAPI-1.16.0" PROJ="7.2.1" GDAL="GDAL 3.4.2, released 2022/03/08" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)" RASTER
Using the below script
-- create a fake table -- 28 secs 21 msec. DROP TABLE IF EXISTS a_large_table; CREATE TABLE a_large_table AS SELECT x AS id, ST_Point(40000 + random()*10000, -100000 + random()*10000, 27700) AS geom FROM generate_series(1,10000000) AS x;
— this triggers the error
set max_parallel_workers_per_gather=4; DROP TABLE IF EXISTS object_list_temp; WITH object_list AS ( SELECT id,'a_large_table' AS table_name, to_jsonb((a_large_table.geom)) AS json_data FROM a_large_table WHERE id::text LIKE '10000' ) SELECT * INTO TEMPORARY TABLE object_list_temp FROM object_list;
The explain looks like this:
Gather (cost=1000.00..133115.60 rows=50000 width=68) Workers Planned: 4 -> Parallel Seq Scan on a_large_table (cost=0.00..127115.60 rows=12500 width=68) Filter: ((id)::text ~~ '10000'::text)
It also fails for me with 2 parallel workers with same error. Using id > 9900000 also triggers the error.
@pramsey are you able to replicate with that above?
I thought maybe we had a short-circuit for 4326 so tried with the same in the example. I confirmed I can trigger with 4326 also.
However, if the dataset has no srid, then the error doesn't happen.
e.g.
-- create a fake table -- 28 secs 21 msec. DROP TABLE IF EXISTS a_large_table; CREATE TABLE a_large_table AS SELECT x AS id, ST_Point(40000 + random()*10000, -100000 + random()*10000) AS geom FROM generate_series(1,10000000) AS x; DROP TABLE IF EXISTS object_list_temp; -- no error set max_parallel_workers_per_gather=4; WITH object_list AS ( SELECT id,'a_large_table' AS table_name, to_jsonb(a_large_table.geom) AS json_data FROM a_large_table WHERE id::text LIKE '10000' ) SELECT * INTO TEMPORARY TABLE object_list_temp FROM object_list;
The explain of that looks the same as the failing ones
So the issue is the srid call that is causing the problem. Using a_large_table.geom::text simply avoids using ST_AsGeoJSON and the output is not a geojson object and no srid query needs to be done to pull the spatial ref sys metadata.
comment:3 by , 2 years ago
As I recall I think perhaps in 3.1 we didn't do a call to spatial_ref_sys to look up the auth and srid. I think we just hard-coded EPSG:<srid> of the geometry. So that change to look it up, cause "the auth could be different", probably triggered this issue. I'm not sure there is a fix short of going back to the way we used to do it. We could probably at least hard-code the 4326 case.
comment:4 by , 2 years ago
AS expected this is also an issue with ST_AsGeoJSON
WITH object_list AS ( SELECT id,'a_large_table' AS table_name, ST_AsGeoJSON(a_large_table.geom) AS json_data FROM a_large_table WHERE id::text LIKE '10000' ) SELECT * INTO TEMPORARY TABLE object_list_temp FROM object_list;
triggers the same:
ERROR: cannot start commands during a parallel operation CONTEXT: SQL statement "SELECT auth_name||':'||auth_srid FROM public.spatial_ref_sys WHERE srid='27700'"
comment:5 by , 2 years ago
Summary: | PostGIS causes to_jsonb to no longer be parallel safe → PostGIS causes to_jsonb to no longer be parallel safe, ST_AsGeoJSON also parallel unsafe |
---|
comment:6 by , 2 years ago
I wanted to confirm that all output functions that try to output spatial_ref_sys metadata fail.
I confirmed that ST_AsGML also fails.
What is interesting is ST_AsKML doesn't. I assume it's because it's using ST_Transform to convert to WGS84 long lat, but why ST_Transform doesn't fail when it has to also query spatial_ref_sys at some point I haven't quite figured out. It looks like for the most part the projcache is used most of the time instead of directly calling into spatial_ref_sys, so that might be why.
comment:7 by , 2 years ago
Summary: | PostGIS causes to_jsonb to no longer be parallel safe, ST_AsGeoJSON also parallel unsafe → PostGIS causes to_jsonb to no longer be parallel safe, ST_AsGeoJSON and ST_AsGML are also parallel unsafe |
---|
comment:8 by , 2 years ago
Milestone: | PostGIS 3.2.2 → PostGIS 3.1.7 |
---|
I'm pushing this to PostGIS 3.1. Even though the issue here manifests because of the auto-cast in to_jsonb, to now do proper casting of postgis geometry (instead of just returning the canonical hex), the underlying issue is the ST_AsGeoJSON, ST_AsGML and their call of the spatial_ref_sys table. Perhaps if we switched those to using cached values like ST_Transform, that might fix the issue or at least make it less common.
comment:10 by , 2 years ago
I'm going to backport the above to 3.2 and 3.1 after I have a test in place.
Does this occur on PgSQL 14? I'm trying to replicate and failing.