Opened 16 years ago
Last modified 13 years ago
#139 closed enhancement
Default selectivity returned when geometry used for restricted join comes from another table, but accurate when from WKT — at Version 2
Reported by: | mose.andre | Owned by: | mcayland |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 1.4.3 |
Component: | postgis | Version: | |
Keywords: | Cc: |
Description (last modified by )
What steps will reproduce the problem?
— TEST CASE
CREATE TABLE points(point_id bigint, point geometry); — Some points in the unit square INSERT INTO points (point_id, point) VALUES (generate_series(1, 100000), ST_MakePoint(random(), random())); CREATE INDEX points_point_index ON points USING gist (point); ANALYZE points;
CREATE TABLE shapes(shape_id bigint, shape geometry); — A shape covering a bit of the unit square INSERT INTO shapes (shape_id, shape) VALUES (1, GeomFromText('POLYGON((0 0, 0 .1, .1 .1, .1 0, 0 0))'));
— Restricted join (1) EXPLAIN ANALYZE SELECT
count(*)
FROM
points INNER JOIN shapes ON ST_Contains(shape, point)
WHERE
shape_id=1;
— Constant restriction (2) EXPLAIN ANALYZE SELECT
count(*)
FROM
points
WHERE
ST_Contains(GeomFromText('POLYGON((0 0, 0 .1, .1 .1, .1 0, 0 0))'), point);
What is the expected output? What do you see instead?
In case (1) I expected to see an accurate estimate of the rows returned but see 1 instead of something ~1000. For more information, please see:
http://postgis.refractions.net/pipermail/postgis-devel/2008-September/003547.html
What version of the product are you using? On what operating system?
postgis_full_version | POSTGIS='1.3.3' GEOS='3.0.0-CAPI-1.4.1' PROJ='Rel. 4.5.0, 22 Oct 2006' USE_STATS
version | PostgreSQL 8.3.3 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)
Please provide any additional information below.
Another instance of someone running into a similar thing is in this thread.
Maybe there's a reason we shouldn't be expecting real selectivity estimate
in this case…
http://postgis.refractions.net/pipermail/postgis-users/2009-March/023043.html
Change History (2)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Description: | modified (diff) |
---|---|
Milestone: | → 1.4.X |
Owner: | set to |
Just to add my 2 cents.
I tested the above on 'POSTGIS='1.3.5' GEOS='3.0.3-CAPI-1.4.2' PROJ='Rel. 4.6.1, 21 August 2008' USE_STATS', PostgreSQL 8.3.7
(the _ST_Contains goes into a nested loop for the non constant geometry ) but the && is applied first non-nested and both timings are about the same on my box 92 ms)
But this one even on my 8.3.7 performs really badly (5-6 times worse than the above queries - 600 ms) which is closer to what Robert was trying to do) and bizarrely doesn't even try to use an index. So there seems to be a distinct disadvantage to doing subselects like this.
SELECT
FROM