Opened 9 years ago

Closed 9 years ago

#2976 closed enhancement (duplicate)

geometry_columns performance on large databases

Reported by: rnewton Owned by: pramsey
Priority: medium Milestone: PostGIS 2.2.0
Component: postgis Version: 2.1.x
Keywords: pgsql2shp, geometry_columns Cc:

Description

I discovered this performance issue when using pgsql2shp. Our environment uses thousands of tables in a single database where the general interaction is to not perform lookups via the postgres tables (pg_class, pg_namespace, etc), but rather directly by name.

The query that retrieves the projection string in pgsql2shp uses the geometry_columns view to find the proper projection. In our environment, this lookup is very slow because it has to enumerate all the tables in order to find every one that includes a geometry column.

This had the effect of causing our shapefile queries to timeout in some cases. To export the shape for a single city was taking about 15 minutes.

As a work-around, we recreated the geometry_columns view as a materialized view. Even without indexes, this greatly improves the performance, in our environment. The same single-city shapefile will be generated in about 1 second. We can schedule the sync of the materialized view on a nightly basis or resync when adding tables with new geometry columns, which is infrequent for us.

Ours might be a special case due to the huge number of tables in our database, ability to use PostgreSQL 9.3+ and stable number of tables with geometry columns.

A possible, related fix would be to include a flag to pgsql2shp for specifying the projection. This wouldn't replace the geometry_columns check, but rather provide a short cut.

My suggestion would be to use a materialized view for geometry_columns (as we're doing now) and triggering the resync when a new geometry column is added to a table (the same hook that used to exist when geometry_columns was a table).

Attachments (1)

geometry_columns_session.txt (6.4 KB ) - added by rnewton 9 years ago.
Performance comparison

Download all attachments as: .zip

Change History (3)

by rnewton, 9 years ago

Performance comparison

comment:1 by robe, 9 years ago

Milestone: PostGIS 2.1.5PostGIS 2.2.0

rnewton,

I've considered that and mentioned such a hack here.

http://www.postgresonline.com/journal/archives/314-Materialized-geometry_columns-using-Event-Triggers.html

I'm hesistant build that right into PostGIS because in 9.3 rebuilding a materialized view causes a read lock on the view so would just replace one set of problems for another. In 9.4 that is much more of a possiblity to consider since it doesn't lock if we add a unique index on the view (which is easy to do), but since that doesn't work for prior versions and we have to support prior versions in 2.2, I'm afraid its still too much keep track of versions to make the benefit not more work to maintain than the gain.

The idea of augmenting pgsql2shp to allow to explicitly specifying the projection might be worthwhile though and something we can look at in 2.2.0

comment:2 by robe, 9 years ago

Resolution: duplicate
Status: newclosed

marking this as duplicate of #3092.

Note: See TracTickets for help on using tickets.