Opened 10 years ago

Closed 7 years ago

#139 closed enhancement (wontfix)

Default selectivity returned when geometry used for restricted join comes from another table, but accurate when from WKT

Reported by: mose.andre Owned by: mcayland
Priority: medium Milestone: PostGIS 1.4.3
Component: postgis Version:
Keywords: Cc:

Description (last modified by robe)

What steps will reproduce the problem?

-- TEST CASE

CREATE TABLE points(point_id bigint, point geometry); -- Some points in the unit square INSERT INTO points (point_id, point) VALUES (generate_series(1, 100000), ST_MakePoint(random(), random())); CREATE INDEX points_point_index ON points USING gist (point); ANALYZE points;

CREATE TABLE shapes(shape_id bigint, shape geometry); -- A shape covering a bit of the unit square INSERT INTO shapes (shape_id, shape) VALUES (1, GeomFromText?('POLYGON((0 0, 0 .1, .1 .1, .1 0, 0 0))'));

-- Restricted join (1) EXPLAIN ANALYZE SELECT

count(*)

FROM

points INNER JOIN shapes ON ST_Contains(shape, point)

WHERE

shape_id=1;

-- Constant restriction (2) EXPLAIN ANALYZE SELECT

count(*)

FROM

points

WHERE

ST_Contains(GeomFromText?('POLYGON((0 0, 0 .1, .1 .1, .1 0, 0 0))'), point);

What is the expected output? What do you see instead?

In case (1) I expected to see an accurate estimate of the rows returned but see 1 instead of something ~1000. For more information, please see:

http://postgis.refractions.net/pipermail/postgis-devel/2008-September/003547.html

What version of the product are you using? On what operating system?

postgis_full_version | POSTGIS='1.3.3' GEOS='3.0.0-CAPI-1.4.1' PROJ='Rel. 4.5.0, 22 Oct 2006' USE_STATS

version | PostgreSQL 8.3.3 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)

Please provide any additional information below.

Another instance of someone running into a similar thing is in this thread.

Maybe there's a reason we shouldn't be expecting real selectivity estimate

in this case...

http://postgis.refractions.net/pipermail/postgis-users/2009-March/023043.html

Change History (6)

comment:1 Changed 10 years ago by lr1234567

Just to add my 2 cents.

I tested the above on 'POSTGIS='1.3.5' GEOS='3.0.3-CAPI-1.4.2' PROJ='Rel. 4.6.1, 21 August 2008' USE_STATS', PostgreSQL 8.3.7

The above two return in about the same speed (though plans are very different --

(the _ST_Contains goes into a nested loop for the non constant geometry ) but the && is applied first non-nested and both timings are about the same on my box 92 ms)

But this one even on my 8.3.7 performs really badly (5-6 times worse than the above queries - 600 ms) which is closer to what Robert was trying to do) and bizarrely doesn't even try to use an index. So there seems to be a distinct disadvantage to doing subselects like this.

SELECT

count(*)

FROM

points where ST_Contains((SELECT shape FROM shapes WHERE shape_id = 1), point);

comment:2 Changed 10 years ago by robe

Description: modified (diff)
Milestone: 1.4.X
Owner: set to mcayland

comment:3 Changed 10 years ago by robe

Type: taskenhancement

comment:4 Changed 10 years ago by pramsey

Milestone: postgis 1.4.1postgis 1.4.2

comment:5 Changed 9 years ago by pramsey

Milestone: PostGIS 1.4.2PostGIS 1.4.3

comment:6 Changed 7 years ago by pramsey

Resolution: wontfix
Status: newclosed

Let's be frank, this isn't changing in the 1.4 series. If it manifests in trunk an enhancement should be filed there.

Note: See TracTickets for help on using tickets.