Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#4732 closed enhancement (wontfix)

long planning time

Reported by: michal Owned by: pramsey
Priority: low Milestone: PostGIS 3.1.0
Component: postgis Version: 3.0.x
Keywords: Cc:

Description

I have table with geography column way. I filter rows within approximately 300metre radius from a point. I have two options: where way && st_buffer ('0101000020E61000008E18E2FD5D8036406DD0F29771854840'::geography, 300) vs. where st_dwithin(way ,'0101000020E61000008E18E2FD5D8036406DD0F29771854840'::geography, 300)

both use indexes, both have more-less similar results, both have very short execution time. however the first one has extremely long planning time (40ms vs 1ms on low RAM server, 8ms vs 0.5ms on high RAM server).

what could be the issue? can the planning time be reduced?

Change History (6)

comment:1 by robe, 4 years ago

michal would help if you stated what version of PostgreSQL / PostGIS and on what platform

SELECT version() || ' ' || postgis_full_version();

output

comment:2 by michal, 4 years ago

Debian testing: low RAM server: PostgreSQL 12.3 (Debian 12.3-1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.3.0-13) 9.3.0, 64-bit POSTGIS="3.0.1 ec2a9aa" [EXTENSION] PGSQL="120" GEOS="3.8.1-CAPI-1.13.3" PROJ="7.0.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.4.3 (Internal)"

high RAM server: PostgreSQL 12.3 (Debian 12.3-1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.3.0-13) 9.3.0, 64-bit POSTGIS="3.0.1 rec2a9aa" [EXTENSION] PGSQL="120" GEOS="3.8.1-CAPI-1.13.3" PROJ="7.1.0" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.4.3 (Internal)" (core procs from "3.0.0 r17983" need upgrade) TOPOLOGY (topology procs from "3.0.0 r17983" need upgrade)

comment:3 by michal, 4 years ago

steps to reproduce (create some data, two cities with a lot of points plus some points in between):

drop table if exists tmp_within;
create table tmp_within (name text, way geography);
insert into tmp_within select left(md5(i::text), 8) as desc, geography(st_makepoint(48+random(), 17+random())) from generate_series(1,25000) as i;
insert into tmp_within select left(md5(i::text), 8) as desc, geography(st_makepoint(48+random(), 19+random())) from generate_series(25000,50000) as i;
insert into tmp_within select left(md5(i::text), 8) as desc, geography(st_makepoint(48+random(), 17+3*random())) from generate_series(50000,75000) as i;
create index on tmp_within using gist(way) ;
analyze tmp_within;

and then select which points are near our point (48.9,17.1) - within roughly 1km (I generally don't care about the difference between square and round buffer, as long as it is fast).

selecting rows within (square) buffer:

explain analyze select name from tmp_within where way && st_buffer(geography(st_makepoint(48.9, 17.1)), 1000);

Index Scan using tmp_within_way_idx on tmp_within  (cost=0.28..3.90 rows=1 width=9) (actual time=0.096..0.116 rows=13 loops=1)
   Index Cond: (way && '0103000020E610000001000000210000002BB56A00677448405C3505F89F19314030735BBB617448406A20C2592C193140EC5A10D6507448401DCD6BEEBC18314004EFD4F63474484098061EFE551831407F8DEB2F0F74484007137F7DFB173140FC1901F5E07348404E74D7E6B0173140ED65E30CAC734840EA3BDE1779173140C70F078072734840B5FB8935561731401CC18884367348400A1DFB964917314032496E68FA72484083974EB8531731406822FD7AC07248402E4DDA357417314065DB04F68A72484086B501D0A9173140624DFCE75B72484073207D77F2173140BF7BC81F3572484010EF9A614B1831408BBBF31A18724840B7D0B323B118314058EF03F705724840BD5CC4D41F19314023208066FF714840D8F6E13393193140E4AB11AA047248401D1E11D2061A31403A01058E1572484078ECE13D761A314029F3426C317248403DF1252FDD1A314091BAAF3257724840446911B1371B3140076CB16D8572484039332949821B314044AC7656BA724840BA307E19BA1B3140E55A71E4F37248400FDEE4FCDC1B3140633659E12F734840740A139CE91B31408864F1FE6B734840A77DD37ADF1B31403F4EBAEDA5734840F04CCDFCBE1B31403D0FB073DB73484057A6AF61891B31405B3735820A744840F2B6E7B8401B31400ACB514A3174484018BD59CDE71A31400DFF7E4E4E7448405616E509821A3140C872507160744840301DC257131A31402BB56A00677448405C3505F89F193140'::geography)
 Planning Time: 15.122 ms
 Execution Time: 0.135 ms

selecting rows with st_dwithin (which is round):

explain analyze select name from tmp_within where st_dwithin(way,geography(st_makepoint(48.9, 17.1)), 1000);

Index Scan using tmp_within_way_idx on tmp_within  (cost=0.53..29.15 rows=8 width=9) (actual time=0.697..0.793 rows=12 loops=1)
   Index Cond: (way && _st_expand('0101000020E610000033333333337348409A99999999193140'::geography, '1000'::double precision))
   Filter: st_dwithin(way, '0101000020E610000033333333337348409A99999999193140'::geography, '1000'::double precision, true)
   Rows Removed by Filter: 5
 Planning Time: 0.300 ms
 Execution Time: 0.838 ms

The first approach, which is far less accurate, spends long time in planning (and less time in execution, which is expectable, since it is far less accurate).

the fastest approach should be (I am not sure, why the _ in front of _st_expand is neccessary):

explain analyze select name from tmp_within where way && _st_expand(geography(st_makepoint(48.9, 17.1)), 1000);

Index Scan using tmp_within_way_idx on tmp_within  (cost=0.28..3.90 rows=1 width=9) (actual time=0.182..0.280 rows=17 loops=1)
   Index Cond: (way && '0101000020E610000033333333337348409A99999999193140'::geography)
 Planning Time: 0.419 ms
 Execution Time: 0.313 ms


Why is there such huge difference in planning time between the approaches?

comment:4 by Algunenano, 4 years ago

Resolution: wontfix
Status: newclosed

I think that the planning difference comes from calling ST_Buffer(geography). This requires calling ST_Transform multiple times and from PROJ6+ the initialization to do the transformation is much slower than before.

For example:

explain analyze Select * from populated_places_10_200 where the_geom && st_buffer(geography(st_makepoint(48.9, 17.1)), 1000);

 Rows Removed by Filter: 10
 Planning Time: 18.755 ms
 Execution Time: 0.047 ms

But if we precalculate the buffer call it doesn't need to be calculated during the planning process:

explain analyze Select * from populated_places_10_200 where the_geom && '0103000020E610000001000000210000002BB56A00677448405C3505F89F19314030735BBB617448406A20C2592C193140EC5A10D6507448401DCD6BEEBC18314004EFD4F63474484098061EFE551831407F8DEB2F0F74484007137F7DFB173140FC1901F5E07348404E74D7E6B0173140ED65E30CAC734840EA3BDE1779173140C70F078072734840B5FB8935561731401CC18884367348400A1DFB964917314032496E68FA72484083974EB8531731406822FD7AC07248402E4DDA357417314065DB04F68A72484086B501D0A9173140624DFCE75B72484073207D77F2173140BF7BC81F3572484010EF9A614B1831408BBBF31A18724840B7D0B323B118314058EF03F705724840BD5CC4D41F19314023208066FF714840D8F6E13393193140E4AB11AA047248401D1E11D2061A31403A01058E1572484078ECE13D761A314029F3426C317248403DF1252FDD1A314091BAAF3257724840446911B1371B3140076CB16D8572484039332949821B314044AC7656BA724840BA307E19BA1B3140E55A71E4F37248400FDEE4FCDC1B3140633659E12F734840740A139CE91B31408864F1FE6B734840A77DD37ADF1B31403F4EBAEDA5734840F04CCDFCBE1B31403D0FB073DB73484057A6AF61891B31405B3735820A744840F2B6E7B8401B31400ACB514A3174484018BD59CDE71A31400DFF7E4E4E7448405616E509821A3140C872507160744840301DC257131A31402BB56A00677448405C3505F89F193140'::geography;

 Rows Removed by Filter: 10
 Planning Time: 0.138 ms
 Execution Time: 0.076 ms

AFAIK, nothing can be done on our side to speed up PROJ.

comment:5 by michal, 4 years ago

just to understand: the most expensive step is to calculate perfectly round buffer around the point, which is calculated in the planning time (and not execution time).

but then execution is significantly higher in the case of pre-calculated buffer than the st_buffer buffer.

comment:6 by Algunenano, 4 years ago

just to understand: the most expensive step is to calculate perfectly round buffer around the point, which is calculated in the planning time (and not execution time).

No, the most expensive step is to convert the geography to the best SRS as this requires initializing PROJ. Once you have the best SRS, generating the buffer is really fast.

Note: See TracTickets for help on using tickets.