Opened 11 years ago

Closed 9 years ago

#5131 closed defect (fixed)

PG Driver less tolerant with SRID definition in 1.10 vs 1.9

Reported by: AlexGacon Owned by: warmerdam
Priority: normal Milestone: 1.10.2
Component: OGR_SF Version: 1.10.0
Severity: normal Keywords: srid
Cc:

Description

I am migrating my project from GDAL 1.9.0 to 1.10.0 and I have regressions in some features of my software.

The problem is related with a postgres view which is not declared in geometry_columns.

In version 1.9.0, since GDAL always asks the SRID with getsrid, it retrieved the SRID without problem.

In version 1.10.0, GDAL find a geometry column, a table name and a schema in the request and thus retrieve the SRID from geometry_columns. The SRID is not found, so we get a -1 value and the request fails.

I manage to fix the problem in my software by changing in OGRPGResultLayer::GetSpatialRef, line 356:

if( nSRSId == UNDETERMINED_SRID )

by

if( nSRSId == UNDETERMINED_SRID || nSRSId == -1)

Could this fix be integrated in the trunk ?

Change History (41)

comment:1 by Even Rouault, 11 years ago

I suppose you are using PostGIS 1.X and not PostGIS 2.0 ? Well, the current behaviour doesn't particularly shock me. Spatial tables or views should be registered in the geometry_columns table. By the way, I've tried with 1.9 and I've the same result as in 1.10

I've created a view like this :

ogr2ogr -f postgresql pg:dbname=autotest poly.shp -overwrite
ogrinfo pg:dbname=autotest -sql "create view poly_view as select * from poly"

And neither "ogrinfo pg:dbname=autotest poly_view" nor "ogrinfo pg:dbname=autotest -sql "select * from poly_view"" report the SRS in 1.9 or 1.10

But if I do : INSERT INTO geometry_columns VALUES ( , 'public', 'poly_view', 'wkb_geometry', 2, 4328, 'POLYGON')

then both "ogrinfo pg:dbname=autotest poly_view" nor "ogrinfo pg:dbname=autotest -sql "select * from poly_view"" report the SRS

So perhaps could you give a reproducable way of how to create the table, view and the query you do on it. But I think that inserting into geometry_columns will be anyway the proper way of fixing your issue.

comment:2 by AlexGacon, 11 years ago

First, I don't work with GDAL through ogrinfo and ogr2ogr but directly with the C++ API.

Second, the problem occurs when you use a spatial filter : in this case, the additional where clause generated by GDAL is wrong, postgis reports an error in the request, because of a difference of SRID between the filter and the query.

Moreover if you want to compare a previous behavior, you have to take care to test against the version 1.9.0, the version 1.9.2 already includes the change.

I have already several customers who have an installation with the view and I would like to avoid to send them a patch to be able to use the next version of my software.

comment:3 by strk, 11 years ago

I'm not sure if our problem is related but we're missing a .prj file in postgis->shp conversion due to geometry_columns NOT constraining the SRID. I mean I don't know if this is a regression,but surely is something that would be nice to do better (checking SRID from values, not only from constraint)

comment:4 by strk, 11 years ago

So now I'm pretty sure the problem is related, as we actually _always_ fetch the data from a query (so it's always impossible to query geometry_column for SRID). I guess this started happening for us as soon as we upgraded to 1.10

comment:5 by strk, 11 years ago

Actually I can't confirm that 1.10.1 is unable to fetch srid from values. This call does produce a .prj file:

ogr2ogr -f 'ESRI Shapefile' /tmp/t.shp PG: -sql "SELECT g FROM (SELECT 'SRID=4326;POINT(0 0)'::geometry as g) as f"

comment:6 by strk, 11 years ago

Ok, now I see a difference, which is pretty crazy. Depending on the query ogr2ogr does or does not produce a .prj. It looks like it tries to parse the query.

This produces it:

SELECT * FROM (SELECT the_geom from untitle_table_4) as f

This doesn't:

SELECT * FROM (SELECT * from untitle_table_4 limit 1) as f

It looks like if ogr2ogr sees a "SELECT * FROM something" then it solely perform a lookup into geometry_columns, otherwise it does fetch the SRID from the values.

Fetching the SRID from the values is always good, so why not doing it always instead ?

comment:7 by strk, 11 years ago

But then again, this produces no .prj file:

SELECT st_setsrid(st_makepoint(0,0),4326)

comment:8 by strk, 11 years ago

OGRPGTableLayer::ResolveSRID gets invoked rather than OGRPGResultLayer::ResolveSRID by the previous call, is that expected ? Where's the code that decides that the above is a table, not a "result" ?

comment:9 by strk, 11 years ago

It's OGRPGDataSource being too picky and only wanting "from" or "FROM" in a query:

        if (EQUALN(pszSQLCommand, "SELECT", 6) == FALSE ||
            (strstr(pszSQLCommand, "from") == NULL && strstr(pszSQLCommand, "FROM") == NULL))

why !?

comment:10 by strk, 11 years ago

And now I also realized the difference between select the_geom and select * in that the table also contains another geometry with SRID=3857 which for some reason result in no .prj too (although OGRSpatialReference->importFromWkt returns no error)

comment:11 by Even Rouault, 11 years ago

Sandro,

Several observations :

  • reading SRID from geometry_columns has the advantage to work even with 0 rows tables. OGRPGResultLayer::ResolveSRID() tries to read from the table information if pszGeomTableName != NULL, which is the case if PQftable() != InvalidOid (see line 79 and following in ogrpresultlayer.cpp). We could also try to read the SRID from rows but that would require some restructuring to avoid bad interactions with GetNextFeature() (i.e. make sure that ResolveSRID() doesn't consume a row

)

  • there's indeed a particular behaviour with "SELECT some_function_without_from_clause". The rationale is that most of the time the function could have a side-effect, and when doing "ogrinfo -al -sql 'SELECT that_function'", the statement could be run several times due to ResetReading() being called, which is not desirable (imagine it to be AddGeometryColumn()). That special code has for now just a dummy implementation of ResolveSRID() ( OGRPGNoResetResultLayer::ResolveSRID() ).
  • I believe I've reproduced your issue with "select the_geom" vs "select *" when converting from a PostGIS table with 2 geometry columns to a shapefile. This issue should be specific to trunk and the new multiple geometry per feature capability. I've pushed a fix for that in r26490.

comment:12 by strk, 11 years ago

Thanks for your observations, Evan. I think it'd still make sense to try reading the first row IFF the SRID is unknown by geometry_columns. Unknown would be a SRID <= 0, with any PostGIS version. About interaction with iterating, I think you can easily move backward in a declared cursor.

About calling function, presence of "from" doesnt' really tell you that there are no function calls, or side-effects for that matter. Indeed I have to workaround that by changing the query to: SELECT * FROM ( SELECT myfunc() ) . Sounds like a subclass of the problem you referenced in first item (not visiting a single row twice).

Fantastic work in r26490, it fixes our case! Alex, is that enough to fix your case too ?

comment:13 by strk, 11 years ago

Actually, I can't confirm it fixes our case, as we're hitting the case where we do have FROM, and we have a PQftable() != InvalidOi, but there's no constraint on the field thus geometry_column has unknown SRID, but values do have a known one.

The only fix for us would be to NOT accept a SRID <= 0 read from geometry_column but still fetch the first row in that case.

comment:14 by strk, 11 years ago

This patch fixes my case:

diff --git a/gdal/ogr/ogrsf_frmts/pg/ogrpgresultlayer.cpp b/gdal/ogr/ogrsf_frmts/pg/ogrpgresultlayer.cpp
index 0aedf13..09b0e34 100644
--- a/gdal/ogr/ogrsf_frmts/pg/ogrpgresultlayer.cpp
+++ b/gdal/ogr/ogrsf_frmts/pg/ogrpgresultlayer.cpp
@@ -345,7 +345,7 @@ void OGRPGResultLayer::ResolveSRID(OGRPGGeomFieldDefn* poGFldDefn)
             }
         }
 
-        if( nSRSId == UNDETERMINED_SRID )
+        if( nSRSId <= 0 )
         {
             CPLString osGetSRID;
 

comment:15 by strk, 11 years ago

NOTE: similar issue is being discussed for pgsql2shp: http://trac.osgeo.org/postgis/ticket/2417

comment:16 by Even Rouault, 11 years ago

Fixed differently in r26501.

comment:17 by Even Rouault, 11 years ago

r26502 "Fix test with simulation of PostGIS disabled in a PostGIS 2.0 DB (fix previous commit)"

comment:18 by strk, 11 years ago

Milestone: 1.10.2

Uhm, works fine with simple selects but still fails with literals:

d=4; ogr2ogr -f "ESRI Shapefile" /tmp/${d} PG:"dbname=cartodb_dev_user_1_db tables=fake" -sql "select 'SRID=4326;POINT(0 0)'::geometry as g"; ls /tmp/${d}

The resulting shapefile has no .prj

comment:19 by strk, 11 years ago

I just realized we're passing "tables=fake" parameter to the source URI, as a trick to skip query of geometry_columns. Now I don't remember the rationale of that, but the effect is probably that the query is considered the same as a literal query.

comment:20 by strk, 11 years ago

Ah,now I remember the rationale for "tables=fake": our user has no access to geometry_columns! So, EvanR, the only possible way to fetch srid is from actual value. Possible at all ? See https://github.com/CartoDB/CartoDB-SQL-API/issues/110

comment:21 by strk, 11 years ago

beside, when geometry_columns is NOT readable, GDAL currently raises a FATAL error and gives up. Instead, if it considered that error the same as finding a SRID=0 (unknown) things would probably work just fine.

comment:22 by strk, 11 years ago

Bingo, I found the culprit being a transaction aborted when it's time to call ST_Srid:

OGRPGTableLayer::ResolveSRID called
OGRPGTableLayer::ResolveSRID: nSRIDId=0, ePostgisType:1
OGRPGTableLayer::ResolveSRID query SELECT ST_SRID("the_geom") FROM "european_countries_export" LIMIT 1 error: ERROR:  current transaction is aborted, commands ignored until end of transaction block

Note that this is my own debugging lines added so theres lack of error reporting earlier in the process (probably when querying geometry_columns which is read-only for my user)

comment:23 by strk, 11 years ago

Commenting out poDS->SoftStartTransaction(); and poDS->SoftCommit(); in OGRPGTableLayer::ReadTableDefinition() fixes the non-literal query for me (the one that fetches data from real tables).

Literal queries are still not fixed.

NOTE: for a proper testcase you should connect to the database as a user that has no permission to read the "geometry_columns" table/view.

comment:24 by strk, 11 years ago

Sorry, I meant in ::ResolveSrid. Here's a patch:

diff --git a/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp b/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp
index 58d7ec4..c9b1644 100644
--- a/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp
+++ b/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp
@@ -2784,7 +2784,7 @@ void OGRPGTableLayer::ResolveSRID(OGRPGGeomFieldDefn* poGFldDefn)
 
     int nSRSId = poDS->GetUndefinedSRID();
 
-    poDS->SoftStartTransaction();
+    //poDS->SoftStartTransaction();
 
     osCommand.Printf(
                 "SELECT srid FROM geometry_columns "
@@ -2842,7 +2842,7 @@ void OGRPGTableLayer::ResolveSRID(OGRPGGeomFieldDefn* poGFldDefn)
         OGRPGClearResult( hResult );
     }
 
-    poDS->SoftCommit();
+    //poDS->SoftCommit();
 
     poGFldDefn->nSRSId = nSRSId;
 }

comment:25 by strk, 11 years ago

An alternative, smaller but still working patch:

diff --git a/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp b/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp
index 58d7ec4..f11ed17 100644
--- a/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp
+++ b/gdal/ogr/ogrsf_frmts/pg/ogrpgtablelayer.cpp
@@ -2809,6 +2809,7 @@ void OGRPGTableLayer::ResolveSRID(OGRPGGeomFieldDefn* poGFldDefn)
     }
 
     OGRPGClearResult( hResult );
+    poDS->FlushSoftTransaction();
 
     /* With PostGIS 2.0, SRID = 0 can also mean that there's no constraint */
     /* so we need to fetch from values */

comment:26 by strk, 11 years ago

About the literal query, I keep not understanding the logic behind NoResetLayer, as I can cheat it very easily with this:

select * from ( select 'SRID=4326;POINT(0 0)'::geometry as g ) as _xx

Which triggers use of OGRPGResultLayer and does give me the correct .prj file back

comment:27 by strk, 10 years ago

I've filed a pull request with the oneliner patch above (no testcase, sorry) https://github.com/OSGeo/gdal/pull/22

comment:28 by strk, 10 years ago

Another bug found: the changeset of r26501 should also be ported into OgrPGResultLayer. I can commit that part to the pull request too, if you want.

comment:29 by strk, 10 years ago

pushed second fix

comment:30 by Even Rouault, 10 years ago

About the issue with transaction, I'm not sure why this is done in transaction in the first place. I would just drop SoftStartTransaction() and SoftCommit()

The logic behind NoResetLayer is for "SELECT AddGeometryColumn()" or similar stuff. Of course someone can "cheat", but it is their problem if they want to do crazy stuff. Aynway, I'm not sure there's a convincing use case for trying to retrieve the SRID for case with a "SELECT a_function_without_where_clause()"

Are you sure that https://github.com/strk/gdal/commit/4c282def0e3262e0ce99b631a8239074dcd11341 is necessary after r26501 has been applied ? I don't think so. If the geometry column of a PGResultLayer comes from a table, since r26501, we have already tried in two ways (geometry_columns and then ST_SRID()) to retrieve the SRID, and doing it again with ST_SRID() in OGRPGResultLayer with ST_SRID() will not help

comment:31 by strk, 10 years ago

I'm sure the commit is needed because I'm testing locally and OGRTableResult wasn't being used in my case (maybe it was the select * from (select literalthing)). I guess the testsuite needs to be enhanced to include all these cases. Note that I also had a difference between using a privileged and an unprivileged user, so there might be another spot where the missing transaction was triggering the need for that second commit (which anyway applies a valid change, if we accept it as valid in the other class)

I agree about dropping the transaction.

One case for "SELECT a_function_without_where_clause()" is a grid-generating function, to produce a shapefile with an hexagon grid, for example.

comment:32 by strk, 10 years ago

Here's my test script:

#!/bin/sh

# -sql "select the_geom from ( select * from european_countries_export ) as _xx"
# -sql "select 'SRID=4326;POINT(0 0)'::geometry as g"
# -sql "select * from ( select 'SRID=4326;POINT(0 0)'::geometry as g ) as _xx"

out=/tmp/missing_srid_$$
ogr2ogr -f "ESRI Shapefile" ${out} \
  PG:"host=localhost dbname=cartodb_dev_user_1_db user=development_cartodb_user_1 password= tables=fake" \
 -sql "select * from ( select 'SRID=4326;POINT(0 0)'::geometry as g ) as _xx"
echo ${out}
ls ${out}
rm -rf ${out}

I manually change the -sql line between those 3 queries and manually change the "user" line between "development_cartodb_user_1" (privileged) and "publicuser" (unprivileged).

With my patches they all produce a .prj file except the second one (the one with no "from")

comment:33 by Even Rouault, 10 years ago

With r26507 "PG: fix OGRPGTableLayer::ResolveSRID() when the current user has no select rights on geometry_columns table (#5131)", I don't see the need of changing if( nSRSId == UNDETERMINED_SRID ) to if( nSRSId <= 0 ) in with OGRPGResultLayer::ResolveSRID() your scenarios

comment:34 by Even Rouault, 10 years ago

r26508 "PG: fix retrieval of SRID on a table without SRID constraint, and when the datasource is opened with ' tables=fake' (#5131)"

comment:35 by strk, 10 years ago

Thank you, I confirm all of my cases, except the "query with no FROM" succeed as of r26508. The unsupported case is not going to be a real case for cartodb. Are you going to backport this to the 0.10 branch ?

NOTE: I filed a ticket about having SRID=null in geometry_columns here http://trac.osgeo.org/postgis/ticket/2491 (not sure if we'll ever go there, but worth mentioning in case it'll change behavior again).

comment:36 by strk, 10 years ago

I meant 1.10 branch, of course...

comment:37 by Even Rouault, 10 years ago

I'm not really keen on backporting this. This is kind of a new feature (supporting geometry tables without srid constraint and/or no select rights on geometry_columns), and the driver has been reworked in trunk for the RFC 41 changes so the backport isn't staightforward.

comment:38 by strk, 10 years ago

Another case related to SRID detection: https://github.com/CartoDB/CartoDB-SQL-API/issues/116

In this case the SRID value of a simple "select * from mytable" where "mytable" has two distinct geometric columns is extracted wrong by the user that _does_have_ permission to read geometry_columns table. Does it need a separate ticket ?

comment:39 by Even Rouault, 10 years ago

Yes, a new ticket would be appropriate. This one has accumulated too much stuff. Is it with GDAL trunk ? There has been major changes in trunk to accomodate with multiple geometry columns per table. And it would be great if you could have a way of reproducing which is standalone ( a few CREATE TABLE and INSERT )

comment:40 by strk, 10 years ago

Filed new ticket for the issue in previous comment: #5287

comment:41 by Even Rouault, 9 years ago

Resolution: fixed
Status: newclosed

The activity on this ticket is just insanely too long. I'll assume we can close it.

Note: See TracTickets for help on using tickets.