Estimate returned by geography_gist_selectivity results in slow query plan for ST_DWithin
|Reported by:||realityexists||Owned by:||pramsey|
I've run into a query performance problem that appears to be due to the && opeartor selectivity estimate (geography_gist_selectivity function).
I have a large, complex query which joins the results of several set-returning functions with some tables and filters them by calling another function, which involves 3 ST_DWithin(geog) calls. This used to run in about 10 seconds until I changed the functions to allow them to be inlined. (They previously had "SET search_path FROM current", which prevented inlining.) Now the query doesn't return in 10 minutes! If I again prevent the filtering function from being inlined (eg. by adding STRICT or SECURITY DEFINER or SET) the time goes down to 20 seconds. If I do the same to one of the set-returning functions it goes down to 15 seconds. It seems to change the query plan at the top level: without inlining it picks a Hash Join or Merge Join (fast), but with inlining it picks a Nested Loop (slow).
I asked about this on pgsql-general http://archives.postgresql.org/pgsql-general/2012-05/msg00370.php and Tom Lane suggested that it might be due to a PostGIS selectivity function returning a poor estimate. When the function is not inlined it uses the default estimate, which happens to be good in this case. If I hack postgis.sql to remove the RESTRICT and JOIN clauses, ie.
CREATE OPERATOR && (
LEFTARG = geography, RIGHTARG = geography, PROCEDURE = geography_overlaps, COMMUTATOR = '&&'
then the query runs in 5 seconds, which seems to support this theory.
I can reproduce the problem with the attached simplified test case, running on PostgreSQL 9.1.3 with PostGIS 2.0.0. Run initialise.sql, then query.sql. Note that the query is initially fast (140 ms), but after running ANALYZE the query plan changes from Hash Join to Nested Loop and it takes 15000 ms. If you then delete the table statistics again it goes back to the fast plan. If I mark test_points_are_near as STRICT it uses the fast plan. If I remove one of the ST_DWithin calls it uses the fast plan (see plans.txt).