#4129 closed defect (fixed)
Cannot update SecondarySnapshot during a parallel operation
Reported by: | davidp | Owned by: | dustymugs |
---|---|---|---|
Priority: | high | Milestone: | PostGIS 2.4.9 |
Component: | raster | Version: | 2.4.x |
Keywords: | SecondarySnapshot parallel | Cc: |
Description
Hi, since we upgraded our Postgres/Postgis stack to version 10/2.4 this error starts to appear in our log:
ERROR: cannot update SecondarySnapshot during a parallel operation
Most often in this context:
SQL statement: "SELECT proj4text FROM public.spatial_ref_sys WHERE srid = 5514 LIMIT 1"
or
SQL statement "SELECT TRUE" PL/pgSQL function _st_intersects(geometry,raster,integer)
There are many diferent sources of this error (PHP, Python, Mapnik), so i thing, there is marginal chance to be caused by our code (SQL statements). When i wrap problematic statements into PL/pgSQL function (which are parallel restricted by default), then this error disappear. So my suspicion falls on PostGIS functions and how they behave in PostgreSQL parallel mode.
For now i thing there is issue in these PostGIS functions:
- ST_Transform(geometry, int) which executes "select proj4text from…"
- ST_Intersects(geometry, raster)
Our system info is:
- PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
- POSTGIS="2.4.4 r16526" PGSQL="100" GEOS="3.5.1-CAPI-1.9.1 r4246" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.1.2, released 2016/10/24" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.2.1" RASTER
And PostreSQL cluster resides behind PGBouncer in session pooling mode.
Best Regards, David
Attachments (2)
Change History (15)
follow-up: 2 comment:1 by , 6 years ago
Component: | postgis → raster |
---|---|
Owner: | changed from | to
comment:2 by , 6 years ago
Replying to Algunenano:
This means that, by default rules, it's parallel unsafe, which makes anyone using it also unsafe. Now, I don't know if the parallelism declaration is missing or it really is unsafe, but that's where I would start looking into.
I didn't thing so… PARALLEL UNSAFE apply only downstream (by my observations). So, function marked as parallel safe is executed in parallel mode, even if internally calls parallel unsafe function. Contrary, parallel unsafe function never executes in parallel mode, no matter how internally called functions are labeled.
Ind in our DB, functions are declared as follows:
CREATE OR REPLACE FUNCTION public._st_intersects(geom geometry, rast raster, nband integer DEFAULT NULL::integer) RETURNS boolean LANGUAGE plpgsql IMMUTABLE PARALLEL SAFE COST 1000 ...
and
CREATE OR REPLACE FUNCTION public.st_intersects(geom geometry, rast raster, nband integer DEFAULT NULL::integer) RETURNS boolean LANGUAGE sql IMMUTABLE PARALLEL SAFE COST 1000 AS $function$ SELECT $1 OPERATOR(public.&&) $2::geometry AND public._st_intersects($1, $2, $3); $function$
And one correction to my original post… function default is PARALLEL UNSAFE, not RESTRICTED. Sorry, my fault.
comment:3 by , 6 years ago
So, function marked as parallel safe is executed in parallel mode, even if internally calls parallel unsafe function.
Yes, Postgresql trust the programmer when he declares a function parallel but, by definition, any function that calls another one that is parallel unsafe should be marked as unsafe too. If you don't, then your unsafe function might get called in parallel which leads to errors like this one or worse.
What I meant is, that if ST_ConverHull
is unsafe (should be reviewed and tagged appropriately) then st_intersects(geom, raster, int)
should be marked as unsafe too, but I haven't investigated any further.
comment:4 by , 6 years ago
Milestone: | PostGIS 2.4.5 → PostGIS 2.4.6 |
---|
comment:6 by , 6 years ago
So, that particular error message arises from GetLatestSnapshot which is not called at all in the PostGIS code base and pretty infrequently in the PgSQL code base. The only places that I can see that might happen during run time is some replication code paths? Are you running with replication on? I'm looking particularly at the st_transform() code line in PostGIS, and it all seems like it should be fine WRT parallelism. Any hints on replicating this?
comment:7 by , 6 years ago
Yes, we are using synchronous streaming replication. Pretty simle scenario with one Master and one Hot Stanby.
There is SQL query that causes errors:
with adrm as( select *, ST_Distance( ST_TRANSFORM( ST_SETSRID( ST_MAKEPOINT( $1, $2 ), $4 ), 5514 ), definicni_bod) distance from ruian.rn_adresni_misto where ST_DWithin( ST_TRANSFORM( ST_SETSRID( ST_MAKEPOINT( $1, $2 ), $4 ), 5514 ), definicni_bod, $5 ) order by distance asc limit $3 ) select json_agg( json_build_object( 'address', address, 'coords', coords, 'distance', distance )) as result from(select json_build_object( 'adrm_kod', adrm.kod, 'kraj', ruian.rn_vusc.nazev, 'kraj_kod', ruian.rn_okres.vusc_kod, 'okres', ruian.rn_okres.nazev, 'okres_kod', ruian.rn_obec.okres_kod, 'obec', ruian.rn_obec.nazev, 'obec_kod', ruian.rn_obec.kod, 'cast_obce', ruian.rn_cast_obce.nazev, 'cast_obce_kod', ruian.rn_cast_obce.kod, 'ulice', ruian.rn_ulice.nazev, 'ulice_kod', ruian.rn_ulice.kod, 'stavebni_objekt', ruian.rn_stavebni_objekt.kod, 'co', cislo_orientacni_hodnota, 'co_pism', cislo_orientacni_pismeno, 'cd', cislo_domovni, 'psc', adrp_psc ) address, st_asgeojson( ST_Transform( adrm.definicni_bod, $4 ))::json -> 'coordinates' coords, round( adrm.distance::numeric, 2 ) distance from adrm left join ruian.rn_ulice on adrm.ulice_kod = ruian.rn_ulice.kod left join ruian.rn_stavebni_objekt on ruian.rn_stavebni_objekt.kod = adrm.stavobj_kod left join ruian.rn_cast_obce on ruian.rn_cast_obce.kod = ruian.rn_stavebni_objekt.cobce_kod left join ruian.rn_obec on ruian.rn_ulice.obec_kod = ruian.rn_obec.kod or ruian.rn_cast_obce.obec_kod = ruian.rn_obec.kod left join ruian.rn_okres on ruian.rn_obec.okres_kod = ruian.rn_okres.kod left join ruian.rn_vusc on ruian.rn_okres.vusc_kod = ruian.rn_vusc.kod) a;
where: $1 = X coordinate, $2 = Y coordinate, $3 = row limit, $4 = coordinates SRID, $5 = search radius
Error appears randomly when this query is executed in batch mode - eg 20k of executions… When i transformed this query into function (parallel unsafe), errors disappears.
Hope that helps…
by , 6 years ago
Attachment: | pgbadger-error-report-1.png added |
---|
by , 6 years ago
Attachment: | pgbadger-error-report-2.png added |
---|
comment:8 by , 5 years ago
Milestone: | PostGIS 2.4.6 → PostGIS 2.4.7 |
---|
comment:9 by , 5 years ago
Milestone: | PostGIS 2.4.7 → PostGIS 2.4.9 |
---|
comment:10 by , 5 years ago
Is this still an issue. I thought it might have had to do with our calls to spatial_ref_sys in C code not being schema qualified, I think we fixed it in newer versions of PostGIS , but I forget if we went as far back as 2.4
comment:11 by , 4 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I suspect this is a similar issue to #4661 and is fixed by
This looks like there is some Postgis function marked as PARALLEL SAFE when it isn't.
ST_Transform
might execute a query like the one you've posted above, but it's read only and doesn't do anything strange so it should be ok.On the other hand
ST_Intersects(geometry, raster)
is defined as:And if we look at ST_ConverHull(raster) it is missing the parallel declaration:
This means that, by default rules, it's parallel unsafe, which makes anyone using it also unsafe. Now, I don't know if the parallelism declaration is missing or it really is unsafe, but that's where I would start looking into.