#4732 closed enhancement (wontfix)
long planning time
Reported by: | michal | Owned by: | pramsey |
---|---|---|---|
Priority: | low | Milestone: | PostGIS 3.1.0 |
Component: | postgis | Version: | 3.0.x |
Keywords: | Cc: |
Description
I have table with geography column way. I filter rows within approximately 300metre radius from a point. I have two options:
where way && st_buffer ('0101000020E61000008E18E2FD5D8036406DD0F29771854840'::geography, 300)
vs.
where st_dwithin(way ,'0101000020E61000008E18E2FD5D8036406DD0F29771854840'::geography, 300)
both use indexes, both have more-less similar results, both have very short execution time. however the first one has extremely long planning time (40ms vs 1ms on low RAM server, 8ms vs 0.5ms on high RAM server).
what could be the issue? can the planning time be reduced?
Change History (6)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
Debian testing:
low RAM server:
PostgreSQL 12.3 (Debian 12.3-1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.3.0-13) 9.3.0, 64-bit POSTGIS="3.0.1 ec2a9aa" [EXTENSION] PGSQL="120" GEOS="3.8.1-CAPI-1.13.3" PROJ="7.0.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.4.3 (Internal)"
high RAM server:
PostgreSQL 12.3 (Debian 12.3-1+b1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.3.0-13) 9.3.0, 64-bit POSTGIS="3.0.1 rec2a9aa" [EXTENSION] PGSQL="120" GEOS="3.8.1-CAPI-1.13.3" PROJ="7.1.0" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.4.3 (Internal)" (core procs from "3.0.0 r17983" need upgrade) TOPOLOGY (topology procs from "3.0.0 r17983" need upgrade)
comment:3 by , 4 years ago
steps to reproduce (create some data, two cities with a lot of points plus some points in between):
drop table if exists tmp_within; create table tmp_within (name text, way geography); insert into tmp_within select left(md5(i::text), 8) as desc, geography(st_makepoint(48+random(), 17+random())) from generate_series(1,25000) as i; insert into tmp_within select left(md5(i::text), 8) as desc, geography(st_makepoint(48+random(), 19+random())) from generate_series(25000,50000) as i; insert into tmp_within select left(md5(i::text), 8) as desc, geography(st_makepoint(48+random(), 17+3*random())) from generate_series(50000,75000) as i; create index on tmp_within using gist(way) ; analyze tmp_within;
and then select which points are near our point (48.9,17.1) - within roughly 1km (I generally don't care about the difference between square and round buffer, as long as it is fast).
selecting rows within (square) buffer:
explain analyze select name from tmp_within where way && st_buffer(geography(st_makepoint(48.9, 17.1)), 1000); Index Scan using tmp_within_way_idx on tmp_within (cost=0.28..3.90 rows=1 width=9) (actual time=0.096..0.116 rows=13 loops=1) Index Cond: (way && '0103000020E610000001000000210000002BB56A00677448405C3505F89F19314030735BBB617448406A20C2592C193140EC5A10D6507448401DCD6BEEBC18314004EFD4F63474484098061EFE551831407F8DEB2F0F74484007137F7DFB173140FC1901F5E07348404E74D7E6B0173140ED65E30CAC734840EA3BDE1779173140C70F078072734840B5FB8935561731401CC18884367348400A1DFB964917314032496E68FA72484083974EB8531731406822FD7AC07248402E4DDA357417314065DB04F68A72484086B501D0A9173140624DFCE75B72484073207D77F2173140BF7BC81F3572484010EF9A614B1831408BBBF31A18724840B7D0B323B118314058EF03F705724840BD5CC4D41F19314023208066FF714840D8F6E13393193140E4AB11AA047248401D1E11D2061A31403A01058E1572484078ECE13D761A314029F3426C317248403DF1252FDD1A314091BAAF3257724840446911B1371B3140076CB16D8572484039332949821B314044AC7656BA724840BA307E19BA1B3140E55A71E4F37248400FDEE4FCDC1B3140633659E12F734840740A139CE91B31408864F1FE6B734840A77DD37ADF1B31403F4EBAEDA5734840F04CCDFCBE1B31403D0FB073DB73484057A6AF61891B31405B3735820A744840F2B6E7B8401B31400ACB514A3174484018BD59CDE71A31400DFF7E4E4E7448405616E509821A3140C872507160744840301DC257131A31402BB56A00677448405C3505F89F193140'::geography) Planning Time: 15.122 ms Execution Time: 0.135 ms
selecting rows with st_dwithin (which is round):
explain analyze select name from tmp_within where st_dwithin(way,geography(st_makepoint(48.9, 17.1)), 1000); Index Scan using tmp_within_way_idx on tmp_within (cost=0.53..29.15 rows=8 width=9) (actual time=0.697..0.793 rows=12 loops=1) Index Cond: (way && _st_expand('0101000020E610000033333333337348409A99999999193140'::geography, '1000'::double precision)) Filter: st_dwithin(way, '0101000020E610000033333333337348409A99999999193140'::geography, '1000'::double precision, true) Rows Removed by Filter: 5 Planning Time: 0.300 ms Execution Time: 0.838 ms
The first approach, which is far less accurate, spends long time in planning (and less time in execution, which is expectable, since it is far less accurate).
the fastest approach should be (I am not sure, why the _ in front of _st_expand is neccessary):
explain analyze select name from tmp_within where way && _st_expand(geography(st_makepoint(48.9, 17.1)), 1000); Index Scan using tmp_within_way_idx on tmp_within (cost=0.28..3.90 rows=1 width=9) (actual time=0.182..0.280 rows=17 loops=1) Index Cond: (way && '0101000020E610000033333333337348409A99999999193140'::geography) Planning Time: 0.419 ms Execution Time: 0.313 ms
Why is there such huge difference in planning time between the approaches?
comment:4 by , 4 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I think that the planning difference comes from calling ST_Buffer(geography). This requires calling ST_Transform multiple times and from PROJ6+ the initialization to do the transformation is much slower than before.
For example:
explain analyze Select * from populated_places_10_200 where the_geom && st_buffer(geography(st_makepoint(48.9, 17.1)), 1000); Rows Removed by Filter: 10 Planning Time: 18.755 ms Execution Time: 0.047 ms
But if we precalculate the buffer call it doesn't need to be calculated during the planning process:
explain analyze Select * from populated_places_10_200 where the_geom && '0103000020E610000001000000210000002BB56A00677448405C3505F89F19314030735BBB617448406A20C2592C193140EC5A10D6507448401DCD6BEEBC18314004EFD4F63474484098061EFE551831407F8DEB2F0F74484007137F7DFB173140FC1901F5E07348404E74D7E6B0173140ED65E30CAC734840EA3BDE1779173140C70F078072734840B5FB8935561731401CC18884367348400A1DFB964917314032496E68FA72484083974EB8531731406822FD7AC07248402E4DDA357417314065DB04F68A72484086B501D0A9173140624DFCE75B72484073207D77F2173140BF7BC81F3572484010EF9A614B1831408BBBF31A18724840B7D0B323B118314058EF03F705724840BD5CC4D41F19314023208066FF714840D8F6E13393193140E4AB11AA047248401D1E11D2061A31403A01058E1572484078ECE13D761A314029F3426C317248403DF1252FDD1A314091BAAF3257724840446911B1371B3140076CB16D8572484039332949821B314044AC7656BA724840BA307E19BA1B3140E55A71E4F37248400FDEE4FCDC1B3140633659E12F734840740A139CE91B31408864F1FE6B734840A77DD37ADF1B31403F4EBAEDA5734840F04CCDFCBE1B31403D0FB073DB73484057A6AF61891B31405B3735820A744840F2B6E7B8401B31400ACB514A3174484018BD59CDE71A31400DFF7E4E4E7448405616E509821A3140C872507160744840301DC257131A31402BB56A00677448405C3505F89F193140'::geography; Rows Removed by Filter: 10 Planning Time: 0.138 ms Execution Time: 0.076 ms
AFAIK, nothing can be done on our side to speed up PROJ.
comment:5 by , 4 years ago
just to understand: the most expensive step is to calculate perfectly round buffer around the point, which is calculated in the planning time (and not execution time).
but then execution is significantly higher in the case of pre-calculated buffer than the st_buffer buffer.
comment:6 by , 4 years ago
just to understand: the most expensive step is to calculate perfectly round buffer around the point, which is calculated in the planning time (and not execution time).
No, the most expensive step is to convert the geography to the best SRS as this requires initializing PROJ. Once you have the best SRS, generating the buffer is really fast.
michal would help if you stated what version of PostgreSQL / PostGIS and on what platform
output