Opened 17 years ago

Closed 15 years ago

#1476 closed defect (fixed)

Improper handling of Postgis tables with multiple geometry columns

Reported by: aaime@… Owned by: Even Rouault
Priority: normal Milestone: 1.6.0
Component: OGR_SF Version: 1.4.0
Severity: minor Keywords: PostgreSQL PostGIS
Cc: warmerdam, Ari Jolma

Description (last modified by Mateusz Łoskot)

OGR is not able to handle properly feature types with multiple geometries. In the specific case of a Postgis table with 4 geometry columns we do have the following output:

ogrinfo -so PG:"host=localhost port=5432 user=myuser
 password=mystery dbname=tiger2005fe" major_roads

INFO: Open of `PG:host=localhost port=5432 user=myuser password=mystery dbname=tiger2005fe' using driver `PostgreSQL' successful.

Layer name: major_roads
Geometry: Unknown (any)
Feature Count: 105628
Extent: (-170.837387, -14.377902) - (144.914001, 66.922287)
Layer SRS WKT:
(unknown)
Geometry Column = gen_full
fid: Integer (0.0)
interstate: Integer (0.0)
othername: String (0.0)
state: String (0.0)
statehighway: Integer (0.0)
ushighway: Integer (0.0)

Layer name: major_roads
Geometry: Unknown (any)
Feature Count: 105628
Extent: (-170.837387, -14.377902) - (144.914001, 66.922287)
Layer SRS WKT:
(unknown)
Geometry Column = gen_full
fid: Integer (0.0)
interstate: Integer (0.0)
othername: String (0.0)
state: String (0.0)
statehighway: Integer (0.0)
ushighway: Integer (0.0)

Layer name: major_roads
Geometry: Unknown (any)
Feature Count: 105628
Extent: (-170.837387, -14.377902) - (144.914001, 66.922287)
Layer SRS WKT:
(unknown)
Geometry Column = gen_full
fid: Integer (0.0)
interstate: Integer (0.0)
othername: String (0.0)
state: String (0.0)
statehighway: Integer (0.0)
ushighway: Integer (0.0)

Layer name: major_roads
Geometry: Unknown (any)
Feature Count: 105628
Extent: (-170.837387, -14.377902) - (144.914001, 66.922287)
Layer SRS WKT:
(unknown)
Geometry Column = gen_full
fid: Integer (0.0)
interstate: Integer (0.0)
othername: String (0.0)
state: String (0.0)
statehighway: Integer (0.0)
ushighway: Integer (0.0)

whilst the table definition is:

tiger2005fe=# \d major_roads
                             Table "public.major_roads"
    Column    |   Type   |                         Modifiers
--------------+----------+-----------------------------------------------------------
 state        | text     |
 gen_full     | geometry |
 gen_1        | geometry |
 gen_2        | geometry |
 gen_3        | geometry |
 interstate   | integer  |
 ushighway    | integer  |
 statehighway | integer  |
 othername    | text     |
 fid          | integer  | not null default nextval('major_roads_fid_seq'::regclass)
Indexes:
    "fid_pkey" PRIMARY KEY, btree (fid)
    "major_roads_spatial_ind" gist (gen_full)
    "major_roads_spatial_ind1" gist (gen_1)
    "major_roads_spatial_ind2" gist (gen_2)
    "major_roads_spatial_ind3" gist (gen_3)

Attachments (1)

pg_columns.diff (13.2 KB ) - added by Ari Jolma 15 years ago.
A quick and dirty patch to add column names to pg table layers

Download all attachments as: .zip

Change History (20)

comment:1 by warmerdam, 17 years ago

Assigning to Mateusz.

Please discuss any planned solution with me.  If it is not very disruptive 
if would be nice to fix things in 1.4 branch, but not if the changes are
very substantial.

comment:2 by Mateusz Łoskot, 17 years ago

Frank,

Current data model used by OGR is based on assumption that cardinality of the 
relation between a feature and associated geometry is 1:1.

feature -|----------|- geometry

If we want, in OGR, to support PostGIS ability to handle multiple geometry
columns will likely change the cardinality to 1:N

feature -|----------|< geometry

This change of relation cardinality will require substantial changes to
interface of some of OGR classes (as well as corresponding OGR C API).

Currently, OGRFeature provides access to single instance of associated geometry.
This will need to be changed to provide access to collection of geometries.

Here is list of public interfaces that will require changes to support
multiple geometries per feature:

1.
OGRFeature::GetGeometryRef
OGRFeature::StealGeometry
OGRFeature::SetGeometryDirectly
OGRFeature::SetGeometry

- access to geometries by index
- access default geometry (ie. with index 0, first column)

2. 
OGRFeatureDefn::GetGeomType
OGRFeatureDefn::SetGeomType

- specify type of which geometry is requested/set

3. 
OGRLayer::GetGeometryColumn

- return geometry by index, with default index = 0

4.
OGRLayer::GetSpatialRef

- specify SRS of which geometry is requested


Here are operations of OGRLayer that might require some changes:

OGRLayer::GetSpatialFilter
OGRLayer::SetSpatialFilter
OGRLayer::SetSpatialFilterRect

- these 3 operations listed above might require changes of their semantic:
-- how to specify to which geometry a filter should be applied?
-- do we want to provide all-by-one filtering where a common filter is
   applied to all geometries?


Certainly, this is a very first analysis and it is only supposed to help
us to decide if we want to support multiple geometries sooner or later
(I assume we need to support it at some time).

comment:3 by warmerdam, 17 years ago

Mateusz, 

In the short to medium term we can't practically reengineer OGR to support
multiple geometry columns properly.  What we might consider are special ways 
of selecting the geometry column to use.

I can quickly imagine 3 scenarios that might apply. 

1) We actually offer distinct appearing layers for each geometry column.  So 
if the table roads has two geometry columns, one called centerline, and one
called boundaries we might show the application a layer called
roads_centerline, and roads_boundaries.  

2) We could provide a mechanism in the datasource connection string where
a particular geometry column could be selected for a given table.  For
instance I think we already support listing the tables to be represented
as layers in the PG driver connection string.  We could allow adding a
geometry column in brackets after the table name as a way of selecting the
geometry column to use.  If unspecified, the first geometry column would
be used. 

3) We could extend the OGR API with a way of listing the geometry columns
available for a layer, and a method for selecting a particular one.  Note,
we already have the informative GetGeometryColumn() method which for some
RDBMS providers returns the name of the currently used geometry column. 

I'm inclined to pursue option (1) in the postgis driver for the time being
as it requires no changes to public apis and applications.  I think option 
(3) should be a longer term goal. 

comment:6 by warmerdam, 17 years ago

Description: modified (diff)
Milestone: 1.5.0
Priority: highestnormal
Severity: blockerminor

Re-prioritizing. Target 1.5.0.

comment:7 by warmerdam, 17 years ago

Cc: warmerdam added

comment:8 by Mateusz Łoskot, 17 years ago

Description: modified (diff)
Status: newassigned

comment:9 by Mateusz Łoskot, 17 years ago

Keywords: PostgreSQL PostGIS added

comment:10 by warmerdam, 16 years ago

Milestone: 1.5.01.6.0

comment:11 by Ari Jolma, 15 years ago

Cc: Ari Jolma added

comment:12 by Ari Jolma, 15 years ago

I guess my suggestion is to add column name to the constructor of a OGRPgTableLayer. It could be NULL, and then the behavior would be the same as now, but if it is not NULL, it would be used, maybe with some caution. The PGDataSource::Open could pick up the column names if possible (when it has geometry_columns available) and use those in OpenTable. PGDataSource::GetLayerByName could detect a column name in its argument.

QGIS uses, at least in its add postgis table interface, the syntax table_name(column_name). Maybe similar syntax could be used in PGDataSource::GetLayerByName.

by Ari Jolma, 15 years ago

Attachment: pg_columns.diff added

A quick and dirty patch to add column names to pg table layers

comment:13 by Even Rouault, 15 years ago

Owner: changed from Mateusz Łoskot to Even Rouault
Status: assignednew

I'm working on Ari's patch to make it work in a fully backward compatible way

The idea is that if a table 'foo' has only one geometry column (or zero), we continue to report it as 'foo'. If 'foo' has the default 'wkb_geometry' column and a new one called 'new_geometry_column', then 2 layers will be reported : 'foo' and 'foo(new_geometry_column)'.

comment:14 by Even Rouault, 15 years ago

Resolution: fixed
Status: newclosed

Support added in r15721, test added in r15722

comment:15 by Ari Jolma, 15 years ago

Resolution: fixed
Status: closedreopened

I think OGRPGTableLayer::GetSpatialRef is failing in the case where a table has a wkb_geometry field and other geometry fields. Then the select srid from geometry_columns does not return unique value because now f_geometry_column is not specified. Maybe add

osCommand += CPLString().Printf(" AND f_geometry_column = 'wkb_geometry'", pszGeomColumn);

as the default case?

I can't test just now but I'm observing this.

comment:16 by Ari Jolma, 15 years ago

One more thing: "DELETE FROM geometry_columns.." in OGRPGDataSource::CreateLayer is a bit too brutal now that we understand multiple geometry columns.

comment:17 by warmerdam, 15 years ago

Ari,

I'm not sure why you think the:

"DELETE FROM geometry_columns WHERE f_table_name = '%s' AND f_table_schema = '%s'"

is inappropriate. If we are creating a new table we don't want any old entries in the geometry_columns from past incarnations of that table. Likewise, DeleteLayer() needs to wipe all the entries in geometry_columns. It occurs to me that the existing DropGeometryColumn() invocation may not be sufficient for tables with multiple geometry column entries. I would say that DeleteLayer() is not brutal enough.

comment:18 by Even Rouault, 15 years ago

For OGRPGTableLayer::GetSpatialRef(), I think we'd need an example where it fails because if pszGeomColumn == NULL, it means that the table had only one geometry column (see the end of OGRPGDataSource::Open).

For OGRPGDataSource::CreateLayer() and DeleteLayer(), I generally concur with Frank's analysis. However the semantics of CreateLayer() and DeleteLayer() on a table with multiple geometry columns is not clear because when you delete layer "foo(bar)" (or override it), you also indirectly delete "foo(baz)"... I suppose that multiple geometry column support should be considered as a read-only feature for the moment.

comment:19 by Ari Jolma, 15 years ago

The first report (table with a wkb_geometry and an another geometry column) is invalid. Perhaps I was using an incomplete build or something, but testing now on a different build the problem does not exist. Sorry.

The second point is a genuine problem I believe. Assume you create a new layer -- note: a layer is not necessarily a table -- from a table which already defines a layer. In that case you don't want to remove the existing layer. I'd say that the semantics issue arises from confusion between a layer and a table. A table with multiple geometries defines multiple layers and thus multiple entries in geometry_columns.

I suggest removing the DELETE FROM... command from CreateLayer and adding a column name constraint to SELECT DropGeometryColumn.. in DeleteLayer. But I assume this needs more thought and testing.

Interestingly this issue exists also in other FOSS4G:

http://jira.codehaus.org/browse/UDIG-1134 https://lists.berlios.de/pipermail/mapnik-users/2008-May/000902.html

comment:20 by warmerdam, 15 years ago

Ari,

I agree there may be differing reasonable interpretations of whether deleting a layer means actually destroying the table, or just cleaning away the one geometry column if there are others.

However, I lean towards destroying a layer means destroying the table, and ideally cleaning up any other layers also dependent on that table. Our using two layers to represent a table with two geometry columns is a work around for OGR's inability to handle multiple geometry properties for features. I don't think we should overcomplicate this. But nevertheless, the issue is not entirely clear.

comment:21 by Ari Jolma, 15 years ago

Resolution: fixed
Status: reopenedclosed

This issue will remain a bit open since OGR will not be able to handle features with multiple geometry attributes any time soon. Such features are, by the way, still "simple features" in the OGC sense (I hadn't realized this).

It is perfectly ok to use multiple rows in geometry_columns for a single table. I found such usage in OGC specifications. The users - or middleware - just need to exercise care when creating new layers or deleting layers with the Pg driver.

I'm closing this as it was me who reopened it.

Note: See TracTickets for help on using tickets.