Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#2196 closed defect (worksforme)

ST_Intersects performs worse on 2.0 than 1.5

Reported by: zwerner Owned by: pramsey
Priority: medium Milestone: PostGIS 2.0.4
Component: postgis Version: 2.0.x
Keywords: Cc: werner@…

Description

When using the same dataset and same query, Explain Analyze shows the same query plan but the time for the filter on st_intersects takes much longer.

postgresql 8.4 with postgis 1.5.4 (trimmed to show differences):

'->  Bitmap Heap Scan on recording r  (cost=2467.53..140276.51 rows=32280 width=29) (actual time=452.074..4896.323 rows=699591 loops=1)'
'    Recheck Cond: ('<geom>'::geometry && geom))'
'    Filter: _st_intersects('<geom>'::geometry, geom)'
'       ->  Bitmap Index Scan on geom_idx  (cost=0.00..2459.46 rows=96841 width=0) (actual time=358.077..358.077 rows=699598 loops=1)'
'           Index Cond: ('<geom>'::geometry && geom))'

postgresql 9.2 with postgis 2.0.2 (trimmed again):

'->  Bitmap Heap Scan on recording r  (cost=2395.49..137026.40 rows=31248 width=29) (actual time=438.094..11424.818 rows=699591 loops=1)'
'    Recheck Cond: ('<geom>'::geometry && geom)'
'    Filter: _st_intersects('<geom>'::geometry, geom)'
'      ->  Bitmap Index Scan on geom_idx  (cost=0.00..2387.68 rows=93743 width=0) (actual time=342.217..342.217 rows=699591 loops=1)'
'          Index Cond: ('<geom>'::geometry && geom))'

As you can see, the st_intersects filter takes 6.5 seconds longer than the equivalent query on a previous version.

Attachments (1)

test.sql (4.4 KB ) - added by zwerner 11 years ago.
Sample data/table/queries

Download all attachments as: .zip

Change History (14)

comment:1 by zwerner, 11 years ago

Full versions: psql 8.4.11 POSTGIS="1.5.3" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.5.0, 22 Oct 2006" LIBXML="2.6.26" USE_STATS

psql 9.2.2 POSTGIS="2.0.2 r10789" GEOS="3.3.6-CAPI-1.7.6" PROJ="Rel. 4.8.0, 6 March 2012" LIBXML="2.6.26" LIBJSON="UNKNOWN"

comment:2 by pramsey, 11 years ago

Actually I don't see. Where's the 6.5?

comment:3 by zwerner, 11 years ago

On the first lines listing the actual times taken by the filter

8.4: (actual time=452.074..4896.323 rows=699591 loops=1) 9.2: (actual time=438.094..11424.818 rows=699591 loops=1)

comment:4 by pramsey, 11 years ago

Can you provide a data bundle and SQL statement?

by zwerner, 11 years ago

Attachment: test.sql added

Sample data/table/queries

comment:5 by zwerner, 11 years ago

The query is on 5.2 million rows (so a bit too big to post here), but it's a bunch of junk data anyway. Basically it's a random distribution from random points around the world, and a bigger part of the table/query is additional information that is being matched as well. To keep things simple, if you were to make a table with just the geometries in a random distribution, generate a simple index on that column and then do a simple intersects query off of it, you would get these results. I attached (above) a sample file to do all of this along with the sample data I'm using.

comment:6 by zwerner, 11 years ago

Would it be possible to get an update on this? Did the script work for you? Are you seeing similar results? I would appreciate any feedback you have.

comment:7 by robe, 11 years ago

zwerner,

The script worked fine for me. Not sure about Paul, but I didn't get a chance to setup on 8.4 to do a compare and also a 9.2 with 1.5. The issue that makes this tricky is that 9.2 has changed a lot too. So not sure if the issue you are having is related to 9.2 or 2.0.

For example on the geocode logic I have I had to revise some of my queries to work efficiently on 9.2 and at the time I ruled out 2.0 as the issue since I had 2.0 running on 8.4 and it worked faster.

If you are able of installing 2.0 on 8.4 or 1.5 on 9.2 that would of course help comparing apples with apples.

comment:8 by robe, 11 years ago

Just for the record I just tried your query on my 8.4 32-bit running (1.5) on windows vs. my 9.1 64-bit running (2.0.1) on the same windows 7 64-bit box. The timings don't look that different to me.

EXPLAIN ANALYZE VERBOSE SELECT id FROM public.test_geom_table WHERE ST_Intersects(ST_PolygonFromText('POLYGON(( -132.71472222222 51.508611111111, -132.71472222222 18.646111111111, -57.831944444444 18.646111111111, -57.831944444444 51.508611111111, -132.71472222222 51.508611111111 ))', 4326), geom) ;

outputs:

PostgreSQL 9.1.7, compiled by Visual C++ build 1500, 64-bit POSTGIS="2.0.1 r9979" GEOS="3.3.5-CAPI-1.7.5" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.1, released 2012/05/15" LIBXML="2.7.8" LIBJSON="UNKNOWN" 

Bitmap Heap Scan on public.test_geom_table  (cost=26272.52..312210.90 rows=228337 width=4) (actual time=185.047..5102.375 rows=692395 loops=1)
  Output: id
  Recheck Cond: ('0103000020E6100000010000000500000019452301DF9660C04F4D3C2B1AC1494019452301DF9660C0D1CDAB8967A53240E97CD2277DEA4CC0D1CDAB8967A53240E97CD2277DEA4CC04F4D3C2B1AC1494019452301DF9660C04F4D3C2B1AC14940'::geometry && test_geom_table.geom)
  Filter: _st_intersects('0103000020E6100000010000000500000019452301DF9660C04F4D3C2B1AC1494019452301DF9660C0D1CDAB8967A53240E97CD2277DEA4CC0D1CDAB8967A53240E97CD2277DEA4CC04F4D3C2B1AC1494019452301DF9660C04F4D3C2B1AC14940'::geometry, test_geom_table.geom)
  ->  Bitmap Index Scan on test_geom_table_geom_idx  (cost=0.00..26215.43 rows=685011 width=0) (actual time=184.039..184.039 rows=692395 loops=1)
        Index Cond: ('0103000020E6100000010000000500000019452301DF9660C04F4D3C2B1AC1494019452301DF9660C0D1CDAB8967A53240E97CD2277DEA4CC0D1CDAB8967A53240E97CD2277DEA4CC04F4D3C2B1AC1494019452301DF9660C04F4D3C2B1AC14940'::geometry && test_geom_table.geom)
Total runtime: 5149.532 ms

For my 8.4 instance running 32-bit First time took 5140ms, second 4970ms

PostgreSQL 8.4.8, compiled by Visual C++ build 1400, 32-bit POSTGIS="1.5.6" GEOS="3.3.6-CAPI-1.7.6" PROJ="Rel. 4.6.1, 21 August 2008" LIBXML="2.7.8" USE_STATS

Bitmap Heap Scan on test_geom_table  (cost=26616.92..303346.06 rows=231344 width=4) (actual time=773.432..4928.097 rows=692762 loops=1)
  Output: id
  Recheck Cond: ('0103000020E6100000010000000500000019452301DF9660C04F4D3C2B1AC1494019452301DF9660C0D1CDAB8967A53240E97CD2277DEA4CC0D1CDAB8967A53240E97CD2277DEA4CC04F4D3C2B1AC1494019452301DF9660C04F4D3C2B1AC14940'::geometry && geom)
  Filter: _st_intersects('0103000020E6100000010000000500000019452301DF9660C04F4D3C2B1AC1494019452301DF9660C0D1CDAB8967A53240E97CD2277DEA4CC0D1CDAB8967A53240E97CD2277DEA4CC04F4D3C2B1AC1494019452301DF9660C04F4D3C2B1AC14940'::geometry, geom)
  ->  Bitmap Index Scan on test_geom_table_geom_idx  (cost=0.00..26559.08 rows=694031 width=0) (actual time=771.016..771.016 rows=692762 loops=1)
        Index Cond: ('0103000020E6100000010000000500000019452301DF9660C04F4D3C2B1AC1494019452301DF9660C0D1CDAB8967A53240E97CD2277DEA4CC0D1CDAB8967A53240E97CD2277DEA4CC04F4D3C2B1AC1494019452301DF9660C04F4D3C2B1AC14940'::geometry && geom)
Total runtime: 4970.246 ms

I still need to install 2.0 on my 8.4 but looking at the numbers, I'm not seeing a 6.5 second longer difference. At most probably around 250ms which can be easily dismissed giving I ran the data creation script instead of running against the same generated table.

comment:9 by robe, 11 years ago

oh yah and where are you getting 6.5 seconds from again? - can you show your total run time?

comment:10 by zwerner, 11 years ago

I looked at what was causing the issue and assumed it had to be postgis, as I didn't know how postgresql would make it that much slower. Of course it does make sense to compare versions of postgis on the same version of postgresql to rule it out, and I will try to do that soon. I think I will also try it on 9.1 vs 9.2 and see if that helps.

If postgresql is the problem, is there a way to fix it? Or do I just submit a bug to them and hope for the best?

As for the times, you can see on the first line of each blurb (actual time='<start time>..<end time>'): 4896.323 vs 11424.818.

comment:11 by robe, 11 years ago

Well Like I said, I don't think even PostgreSQL is the culprit because I tested 2.0 on both 8.4 and 9.2 and also tested 1.5. It could be the difference in your server configs. You are using EXPLAIN ANALYZE right and not the time it takes to dump out the results. The time is very different and can be even more so the further you are away from the server.

Also I think in 2.0 the time to dump out would be longer since the geometries are slightly fatter.

comment:12 by robe, 11 years ago

Resolution: worksforme
Status: newclosed

zwerner, Were you able to retest and see. I'm closing this out since I can't replicate the issue.

comment:13 by zwerner, 11 years ago

Sorry for the delayed response. I'm a contract worker, and we've stopped work on the upgrade from 8.4 to 9.2 for now, so I've not made any more progress on it. Once we continue, I will dive more into it and post any updates.

Note: See TracTickets for help on using tickets.