Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#3071 closed defect (invalid)

CLUSTER on geometry index taking forever

Reported by: robe
Owned by: pramsey
Priority: medium
Milestone: PostGIS 2.2.0
Component: postgis
Version: master
Keywords:
Cc:

Description

I'm pretty sure I've done a CLUSTER on tables bigger than this and it did not take this long.

I grabbed the building footprints dataset from https://data.sfgov.org/Geographic-Locations-and-Boundaries/Building-Footprints-Zipped-Shapefile-Format-/jezr-5bxm

and ran:

shp2pgsql -D -I -t 2D building_footprint data.sfo_buildings | psql -d workshop_1

The data had a ton of self-intersections, so I ran this:

UPDATE data.sfo_buildings 
SET geom = ST_Multi(ST_MakeValid(geom))
WHERE NOT ST_IsValid(geom);

which was fine, and then proceeded to cluster:

ALTER TABLE data.sfo_buildings
  CLUSTER ON sfo_buildings_geom_idx;

CLUSTER data.sfo_buildings;

I've been waiting for 10 minutes already and the table has only 80,000-odd records. That is much longer than I have patience for. I'm pretty sure I've done an exercise like this with many more multipolygons and it was way faster (under a minute).

I'll try to reproduce with a random dataset and also with PostGIS 2.1.5
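A rough sketch of what such a random-dataset reproduction could look like. The table name `random_buildings`, the row count (chosen to roughly match the 80,000-odd records above), and the coordinate and radius ranges are all assumptions for illustration; the geometries are simple buffered points, not real footprints.

```sql
-- Hypothetical reproduction table: names, row count, and coordinate
-- ranges are assumptions, not taken from the SF data.
CREATE TABLE random_buildings AS
SELECT i AS gid,
       ST_Multi(
         ST_Buffer(ST_MakePoint(random() * 10000, random() * 10000),
                   5 + random() * 20,  -- buffer radius
                   2)                  -- segments per quarter circle
       ) AS geom
FROM generate_series(1, 80000) AS i;

-- Same steps as for the real table: GiST index, then cluster on it.
CREATE INDEX random_buildings_geom_idx
  ON random_buildings USING gist (geom);

ALTER TABLE random_buildings
  CLUSTER ON random_buildings_geom_idx;

CLUSTER random_buildings;
```

Turning on `\timing` in psql before the final statement gives a number to compare across PostGIS and PostgreSQL versions.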

This is running:

POSTGIS="2.2.0dev r13298" GEOS="3.5.0dev-CAPI-1.9.0 r4048" SFCGAL="1.0.5" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.1, released 2014/09/24" LIBXML="2.7.8" LIBJSON="0.12" RASTER PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit

Change History (5)

comment:1 by pramsey, 10 years ago

I ran the same commands on the same data, and the cluster finished in < 3 seconds.

PostgreSQL 9.3.5 on x86_64-apple-darwin13.4.0
POSTGIS="2.2.0dev r13311"

Check for gremlins.

comment:2 by robe, 10 years ago

Okay, it finally finished after 16 minutes.

I did the same exercise on my:

POSTGIS="2.1.5 r13152" GEOS="3.4.2-CAPI-1.8.2 r3924" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.1, released 2014/09/24" LIBXML="2.7.8" LIBJSON="UNKNOWN" (core procs from "2.1.3 r12547" need upgrade) RASTER (raster procs from "2.1.3 r12547" need upgrade) PostgreSQL 9.3.5, compiled by Visual C++ build 1600, 64-bit

and it finished in 3.5 SECONDS

So if there is something wrong here, I suppose it might be something with PostgreSQL 9.4.1.

comment:3 by robe, 10 years ago

My 9.4.0 dev (not 9.4.1) running 2.1.5 seems to be okay:

POSTGIS="2.1.5 r13152" GEOS="3.5.0dev-CAPI-1.9.0 r4038" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.1, released 2014/09/24" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER PostgreSQL 9.4.0, compiled by Visual C++ build 1800, 64-bit

Took 2.5 seconds to run.

Next I'll test against 2.2 and then upgrade to 9.4.1.

comment:4 by robe, 10 years ago

Resolution: invalid
Status: new → closed

My 9.4.0 on

POSTGIS="2.2.0dev r13180" GEOS="3.5.0dev-CAPI-1.9.0 r4038" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.1, released 2014/09/24" LIBXML="2.7.8" LIBJSON="0.12" RASTER

was even faster: 1640 ms.

I'll reopen if I can replicate on 9.4.1, but given pramsey said his 9.4.1 with trunk works fine, gremlins might be the only plausible answer.

comment:5 by robe, 10 years ago

Definitely a gremlin. I just deleted the table and reran the steps on

POSTGIS="2.2.0dev r13298" GEOS="3.5.0dev-CAPI-1.9.0 r4048" SFCGAL="1.0.5" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.1, released 2014/09/24" LIBXML="2.7.8" LIBJSON="0.12" RASTER PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit

completed in 3790 ms (still slower than others but in acceptable range).

Perhaps I had a query locking the table before.
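CLUSTER needs an ACCESS EXCLUSIVE lock on the table, so any other session holding a lock on it (for example one left idle in a transaction) would make it wait. A minimal sketch of how that could be checked from a second session while CLUSTER appears stuck, assuming the table name from the description and the `pg_locks` / `pg_stat_activity` columns available in PostgreSQL 9.2 and later:

```sql
-- Run from a second session while CLUSTER appears to hang.
-- w = the session waiting for a lock on the table,
-- h = sessions that currently hold a granted lock on it.
SELECT w.pid        AS waiting_pid,
       w_act.query  AS waiting_query,
       h.pid        AS holding_pid,
       h.mode       AS held_mode,
       h_act.state  AS holder_state,
       h_act.query  AS holder_query
FROM pg_locks AS w
JOIN pg_locks AS h
  ON h.locktype = 'relation'
 AND h.database = w.database
 AND h.relation = w.relation
 AND h.granted
 AND h.pid <> w.pid
JOIN pg_stat_activity AS w_act ON w_act.pid = w.pid
JOIN pg_stat_activity AS h_act ON h_act.pid = h.pid
WHERE w.locktype = 'relation'
  AND NOT w.granted
  AND w.relation = 'data.sfo_buildings'::regclass;
```

The query does not filter by lock mode, which is acceptable here only because ACCESS EXCLUSIVE conflicts with every other table-level lock. An empty result while CLUSTER is slow would point away from the locking theory.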
