Opened 2 years ago

Closed 2 years ago

#5186 closed defect (wontfix)

PostGIS performance regressions on identical queries

Reported by: dracos Owned by: pramsey
Priority: medium Milestone: PostGIS 3.0.7
Component: postgis Version: 3.1.x
Keywords: Cc:

Description

Hi, we have recently upgraded a server from PostgreSQL 9.6 and PostGIS 2.3.1 (debian stretch) to PostgreSQL 13 and PostGIS 3.1.1 (debian bullseye). Everything went well (thanks :) ), but two queries we had both started performing worse.

The first case involved a CTE that (since PostgreSQL 12, see https://paquier.xyz/postgresql-2/postgres-12-with-materialize/ ) was being 'inlined' - adding MATERIALIZED restored the old fast behaviour. The second did not involve a CTE, but rewriting the query so that it did use one (and again, with MATERIALIZED) got the performance back to what it was previously.

Below I provide query plans for that second non-CTE query on old and new versions. The reason I'm opening this here rather than PostgreSQL might well be wrong, apologies if so, but I wasn't sure what was involved in deciding what the query planner did, and I wondered if there was something PostGIS could do in this situation - in the last two query plans, the difference appears to be (again, sorry if misunderstood) st_transform isn't called once up front but is called every iteration. The cost is far higher for the non-materialized version.

PostgreSQL 9.6 query plan:

mapit=# explain analyze
SELECT * FROM "mapit_postcode"
WHERE (
ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326))
AND
location && ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326)
)
LIMIT 1;
                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=25.19..38.88 rows=1 width=43) (actual time=32.082..32.082 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  GroupAggregate  (cost=0.42..12.27 rows=2 width=36) (actual time=0.570..0.570 rows=1 loops=1)
           Group Key: mapit_geometry.area_id
           ->  Index Scan using mapit_geometry_area_id on mapit_geometry  (cost=0.42..12.23 rows=2 width=13783) (actual time=0.007..0.007 rows=1 loops=1)
                 Index Cond: (area_id = 2577)
   InitPlan 2 (returns $1)
     ->  GroupAggregate  (cost=0.42..12.27 rows=2 width=36) (actual time=0.282..0.282 rows=1 loops=1)
           Group Key: mapit_geometry_1.area_id
           ->  Index Scan using mapit_geometry_area_id on mapit_geometry mapit_geometry_1  (cost=0.42..12.23 rows=2 width=13783) (actual time=0.019..0.020 rows=1 loops=1) 
                 Index Cond: (area_id = 2577)
   ->  Index Scan using postcodes_postcode_location_id on mapit_postcode  (cost=0.66..849.44 rows=62 width=43) (actual time=32.080..32.080 rows=1 loops=1)
         Index Cond: (location && st_transform($1, 4326))
         Filter: st_coveredby(location, st_transform($0, 4326))
 Planning time: 0.428 ms
 Execution time: 36.386 ms
(16 rows)

PostgreSQL 13 query plan on the same query:

mapit=# explain analyze
SELECT * FROM "mapit_postcode"
WHERE (
ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326)) 
AND
location && ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326)
)
LIMIT 1;
                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=90.23..148.25 rows=1 width=43) (actual time=9843.067..9843.072 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual time=1.473..1.475 rows=1 loops=1)
           Group Key: mapit_geometry.area_id
           ->  Index Scan using mapit_geometry_area_id on mapit_geometry  (cost=0.42..15.75 rows=3 width=13437) (actual time=0.012..0.016 rows=1 loops=1)
                 Index Cond: (area_id = 2577)
   InitPlan 2 (returns $1)
     ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual time=1.354..1.355 rows=1 loops=1)
           Group Key: mapit_geometry_1.area_id
           ->  Index Scan using mapit_geometry_area_id on mapit_geometry mapit_geometry_1  (cost=0.42..15.75 rows=3 width=13437) (actual time=0.026..0.029 rows=1 loops=1) 
                 Index Cond: (area_id = 2577)
   ->  Index Scan using postcodes_postcode_location_id on mapit_postcode  (cost=50.41..108.43 rows=1 width=43) (actual time=9843.065..9843.065 rows=1 loops=1)
         Index Cond: ((location @ st_transform($0, 4326)) AND (location && st_transform($1, 4326)))
         Filter: st_coveredby(location, st_transform($0, 4326))
         Rows Removed by Filter: 406
 Planning Time: 0.294 ms
 Execution Time: 9844.084 ms
(17 rows)

Here is PostgreSQL 13 on a non-materialized CTE version of the same query:

mapit=# explain analyze WITH target AS  (
    SELECT ST_Transform((select ST_Collect(polygon) from mapit_geometry
            where area_id=2577 group by area_id), 4326) AS polygon )
SELECT "mapit_postcode"."id", "mapit_postcode"."postcode", "mapit_postcode"."location"::bytea
  FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.polygon) LIMIT 1;
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=45.32..583.80 rows=1 width=43) (actual time=9297.474..9297.479 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual time=2.005..2.008 rows=1 loops=1)
           Group Key: mapit_geometry.area_id
           ->  Index Scan using mapit_geometry_area_id on mapit_geometry  (cost=0.42..15.75 rows=3 width=13437) (actual time=0.026..0.029 rows=1 loops=1)
                 Index Cond: (area_id = 2577)
   ->  Index Scan using postcodes_postcode_location_id on mapit_postcode  (cost=25.41..105027.91 rows=195 width=43) (actual time=9297.473..9297.473 rows=1 loops=1)
         Index Cond: (location @ st_transform($0, 4326))
         Filter: st_coveredby(location, st_transform($0, 4326))
         Rows Removed by Filter: 406
 Planning Time: 0.330 ms
 Execution Time: 9297.806 ms
(12 rows)

And here is the PostgreSQL 13 query plan on my rewritten query:

mapit=# explain analyze WITH target AS MATERIALIZED (
    SELECT ST_Transform((select ST_Collect(polygon) from mapit_geometry
            where area_id=2577 group by area_id), 4326) AS polygon )
SELECT "mapit_postcode"."id", "mapit_postcode"."postcode", "mapit_postcode"."location"::bytea
  FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.polygon) LIMIT 1;
                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=120.01..148.00 rows=1 width=43) (actual time=48.101..48.107 rows=1 loops=1)
   CTE target
     ->  Result  (cost=19.91..44.92 rows=1 width=32) (actual time=39.050..39.053 rows=1 loops=1)
           InitPlan 1 (returns $0)
             ->  GroupAggregate  (cost=0.42..19.91 rows=3 width=36) (actual time=0.986..0.988 rows=1 loops=1)
                   Group Key: mapit_geometry.area_id
                   ->  Index Scan using mapit_geometry_area_id on mapit_geometry  (cost=0.42..15.75 rows=3 width=13437) (actual time=0.014..0.017 rows=1 loops=1) 
                         Index Cond: (area_id = 2577)
   ->  Nested Loop  (cost=75.09..54659.02 rows=1950 width=43) (actual time=48.099..48.101 rows=1 loops=1)
         ->  CTE Scan on target  (cost=0.00..0.02 rows=1 width=32) (actual time=39.171..39.171 rows=1 loops=1)
         ->  Bitmap Heap Scan on mapit_postcode  (cost=75.09..54413.30 rows=195 width=43) (actual time=8.909..8.910 rows=1 loops=1)
               Filter: st_coveredby(location, target.polygon)
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on postcodes_postcode_location_id  (cost=0.00..75.04 rows=1950 width=0) (actual time=8.355..8.355 rows=29757 loops=1)        
                     Index Cond: (location @ target.polygon)
 Planning Time: 0.299 ms
 Execution Time: 48.335 ms
(17 rows)

Change History (6)

comment:1 by robe, 2 years ago

at a glance key thing I am seeing is the index condition

PG 13 appears to use an index for @ (which I suspect comes from ST_Coveredby support function)

  ->  Index Scan using postcodes_postcode_location_id on mapit_postcode  (cost=50.41..108.43 rows=1 width=43) (actual time=9843.065..9843.065 rows=1 loops=1)
         Index Cond: ((location @ st_transform($0, 4326)) AND (location && st_transform($1, 4326)))
         Filter: st_coveredby(location, st_transform($0, 4326))

compared to PG 12

   ->  Index Scan using postcodes_postcode_location_id on mapit_postcode  (cost=0.66..849.44 rows=62 width=43) (actual time=32.080..32.080 rows=1 loops=1)
         Index Cond: (location && st_transform($1, 4326))
         Filter: st_coveredby(location, st_transform($0, 4326))

My guess is the @ is being picked up before the && and is costlier, though I wouldn't expect it to be that much more costly.

comment:2 by robe, 2 years ago

Curious what happens if you flip the order, so write it like this:

SELECT * FROM "mapit_postcode"
WHERE (
location && ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326) AND
ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326)) 
)
LIMIT 1;

OR just leave out the first, it seems redundant, though you might have had it there because it made things faster. I'd be interested to see how the below fairs on your old (if you still have it around) and your new. If my suspicions are right, I would expect the 9.6 version to be come just as slow as your 13. If not well that is an added piece of information

SELECT * FROM "mapit_postcode"
WHERE (

ST_CoveredBy(location, ST_Transform((select ST_Collect(polygon) from mapit_geometry where area_id=2577 group by area_id), 4326)) 
)
LIMIT 1;

comment:3 by dracos, 2 years ago

Hi robe, thanks for looking. I do still have the old. Swapping the order round gives the same query plan and execution time on both 9.6 and 13. If I leave out the explicit index part, then on PostgreSQL 13 it makes no difference (as you'd hope), but on PostgreSQL 9.6 it makes the query take many minutes to run, because it doesn't use the index automatically.

in reply to:  3 comment:4 by robe, 2 years ago

Replying to dracos:

Hi robe, thanks for looking. I do still have the old. Swapping the order round gives the same query plan and execution time on both 9.6 and 13. If I leave out the explicit index part, then on PostgreSQL 13 it makes no difference (as you'd hope), but on PostgreSQL 9.6 it makes the query take many minutes to run, because it doesn't use the index automatically.

So execution time, with && and ST_CoveredBy (does that make the 13 faster or the 9.6 slower?)

comment:5 by robe, 2 years ago

Milestone: PostGIS 3.2.2PostGIS 3.0.7

I'm pushing this to 3.0.7 as it sounds like it's an issue with 3.1 as well (from what you said) and given where I think the issue is, I think it is probably an issue with 3.0 as well for PostgreSQL 12+

comment:6 by robe, 2 years ago

Resolution: wontfix
Status: newclosed

@dracos,

Unfortunately we are unable to replicate this in our system and didn't get an answer to my last question. I suspect there is not much we can do here.

Note: See TracTickets for help on using tickets.