Opened 2 months ago

Closed 2 months ago

Last modified 2 months ago

#7007 closed defect (fixed)

ogr2ogr: FID index attribute not preserved

Reported by: hansw Owned by: warmerdam
Priority: normal Milestone:
Component: default Version: svn-trunk
Severity: normal Keywords: ogr2ogr GeoPackage Spatialite Index
Cc: tamas

Description

Prerequisites

Windows x64, GDAL version: 2.3.0 dev, trunk revision r39844, build from gisinternals.com.

I used GeoPackage v1.0 GDAL Technology Test Data Set and the Esri Sample Data Set. Both from the official GeoPackage site. The same behavior occurs with GeoPackage v1.2 files.

Overview

My aim is to automagically copy certain GeoPackage layers into a Spatialite database, as complete as possible. During testing I encountered inconsistent behaviour of ogr2ogr while copying the Index-attributes.

Detailed attribute behaviour

Copy behaviour of "GDAL Technology Test Data Set" Layer "linestring2d" attributes:

original (GeoPackage) created target (Spatialite)
- OGC_FID [only index attribute]
fid [only index attribute] -
geom [Geometry] the_geom [Geometry]

Copy behaviour of "Esri Sample Data Set" Layer "counties" attributes:

original (GeoPackage) created target (Spatialite)
- OGC_FID [only index attribute]
OBJECTID [only index attribute] OBJECTID [no index]
NAME NAME
STATE_NAME STATE_NAME
STATE_FIPS STATE_FIPS
... ...
Shape [Geometry] the_geom [Geometry]

Detailed description

When I set the "-unsetFid" command line option I understood that the if a "fid"-Index attribute exists in the original layer it will not be used as a index attribute in the target layer. So ogr2ogr will create a new Index attribute in the target layer. In the case of the SQLite/Spatialite driver I can conveniently name the new index with the -lco "FID=XXX" command line option.

But for the sake of completeness the original Index attributes (e.g. "fid" or, "OBJECTID") should still be preserved as "normal" attributes. In the case of the "Esri Sample Data" it keeps the "OBJECTID"-Index as a "normal" (non-index) attribute just as expected. But in the "GDAL Technology Test Data Set" the "fid"-Index attribute is just not copied. As consistent behaviour I expected the original "fid"-Index attribute to be copied as non-Index attribute. In my situation this can lead to data loss as indices can be important.

Used command line arguments

For the "GDAL Technology Test Data Set":

ogr2ogr.exe -f "SQLite" "C:\Temp\gdal_sample_v1.0.sqlite" "C:\Temp\gdal_sample.gpkg" -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -sql "SELECT * FROM \"linestring2d\"" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES" -unsetFid

For the "Esri Sample Data":

ogr2ogr.exe -f "SQLite" "C:\Temp\esri_sample_v1.0.sqlite" "C:\Temp\sample1_0.gpkg" -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -sql "SELECT * FROM \"counties\"" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES" -unsetFid

Change History (22)

comment:1 Changed 2 months ago by Even Rouault

Why don't you drop the -unsetFid option ? The behaviour you see when using it is the one intended.

Alternatively, you could modify your SQL statement like : -sql "SELECT rowid + 0 AS SRC_FID, * FROM \"counties\""

comment:2 Changed 2 months ago by hansw

Thanks for your suggestions!

Detailed attribute behaviour if i don't use the "-unsetFid" option

Copy behaviour of "GDAL Technology Test Data Set" Layer "linestring2d" attributes:

original (GeoPackage) created target (Spatialite)
fid [only index attribute] fid [only index attribute]
geom [Geometry] the_geom [Geometry]

Copy behaviour of "Esri Sample Data Set" Layer "counties" attributes:

original (GeoPackage) created target (Spatialite)
- OGC_FID [only index attribute]
OBJECTID [only index attribute] OBJECTID [no index]
NAME NAME
STATE_NAME STATE_NAME
STATE_FIPS STATE_FIPS
... ...
Shape [Geometry] the_geom [Geometry]

Detailed description

I need consistency while adding attributes. In my case it will produce inconsistencies if attributes will be added for the one set of layers (e.g. "Esri Sample Data Set") but not for another set of layers ("GDAL Technology Test Data Set").
Both of your suggestions have this inconsistent behaviour.

The reason is in ogr2ogr's behaviour:

  • Copying the Index attribute for the one case (e.g. "GDAL Technology Test Data Set"), as it "thinks" this index is appropriate to use for the new layer.
  • Creating a new Index attribute, while keeping the old index as non-index attribute, in the other case (e.g. "Esri Sample Data Set"), as it "thinks" this index is not appropriate to use for the new layer.
Last edited 2 months ago by hansw (previous) (diff)

comment:3 Changed 2 months ago by Even Rouault

I don't get the same results as you. If I do

ogr2ogr  -f "SQLite" out2.sqlite sample1_0.gpkg -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -sql "SELECT * FROM \"counties\"" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES"

the OBJECTID column is properly copied as the feature index column in the target Spatialite database, and no OGC_FID column is created as expected

$ ogrinfo out2.sqlite -al -so
INFO: Open of `out2.sqlite'
      using driver `SQLite' successful.

Layer name: import1
Geometry: Multi Polygon
Feature Count: 3141
Extent: (-178.217598, 18.921786) - (-66.969271, 71.406235)
Layer SRS WKT:
GEOGCS["NAD83",
    DATUM["North_American_Datum_1983",
        SPHEROID["GRS 1980",6378137,298.257222101,
            AUTHORITY["EPSG","7019"]],
        TOWGS84[0,0,0,0,0,0,0],
        AUTHORITY["EPSG","6269"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AXIS["Latitude",NORTH],
    AXIS["Longitude",EAST],
    AUTHORITY["EPSG","4269"]]
FID Column = OBJECTID
Geometry Column = the_geom
NAME: String (0.0)
STATE_NAME: String (0.0)
STATE_FIPS: String (0.0)
CNTY_FIPS: String (0.0)
FIPS: String (0.0)
AREA: Real (0.0)
POP1990: Real (0.0)
POP2000: Real (0.0)
POP90_SQMI: Integer (0.0)

comment:4 Changed 2 months ago by hansw

Thanks for your feedback! This seems very strange.
I don't have any time to find the issue right now. I will revisit this issue in the end of next week and give some feedback.

comment:5 Changed 2 months ago by Jukka Rahkonen

I donwloaded a fresh build from gisinternals (http://download.gisinternals.com/sdk/downloads/release-1800-x64-gdal-mapserver.zip) and re-used Even's command. My result is different:

FID Column = OGC_FID
Geometry Column = the_geom
OBJECTID: Integer64 (0.0)
NAME: String (0.0) 

comment:6 Changed 2 months ago by Even Rouault

Hum might come from a difference regarding enabling sqlite3 metadata, but normally gisinternals builds now enable it : https://github.com/gisinternals/buildsystem/issues/106

Could you retry by replacing '-sql "select * from counties"' by just counties ?

comment:7 Changed 2 months ago by Jukka Rahkonen

I believe that sqlite3 metadata is enabled, at least if it can be tested with "pragma table_info('import1')". However, if I remove the -sql select and give only the layer name then OBJECTID is preserved as FID and PK.

Pragma shows some differences in the primary key column

Original gpkg
0	OBJECTID	INTEGER	1	NULL	1
Spatialite with -sql
0	OGC_FID	        INTEGER	0	NULL	1
Spatialite with "layer"
0	OBJECTID	INTEGER	0	NULL	1

The difference is in the "not null" but we both know from the discussion on the OGC GeoPackage issue tracker that it does not really change anything to have not null or not to have it in "CREATE TABLE counties (OBJECTID INTEGER primary key autoincrement not null"

Last edited 2 months ago by Jukka Rahkonen (previous) (diff)

comment:8 Changed 2 months ago by Even Rouault

Cc: tamas added

This was indeed an issue with sqlite metadata not being enabled in gisinternals binaries (the fact that pragma table_info works is something else). I've submitted a pull request to the gisinternals script per https://github.com/gisinternals/buildsystem/pull/111

Adding Tamas in CC so he's aware of it

comment:9 Changed 2 months ago by Even Rouault

Tamas has just merged the pull request, so next nightly builds should have the fix.

comment:10 Changed 2 months ago by Jukka Rahkonen

This seems to be the same option that needs to be enabled for using spatial index with a view http://osgeo-org.1560.x6.nabble.com/gdal-dev-GeoPackege-views-and-spatial-index-td5312752.html. Is there any simple method how end users could check this feature? Serious geopackage users might appreciate it. I can imagine a situation where Mapserver is serving WMS nicely from geopackage views until system admins update GDAL which is built with sqlite metadata disabled and Mapserver admin would go nuts.

comment:11 Changed 2 months ago by Even Rouault

In 39870:

SQLITE/GPKG: report SQLITE_HAS_COLUMN_METADATA=YES as driver metadata item when it is available (refs #7007)

comment:12 Changed 2 months ago by Even Rouault

In 39871:

gdalinfo --format / ogrinfo --format: report extra metada items in a 'Other metadata items:' section (refs #7007)

comment:13 Changed 2 months ago by Even Rouault

In 39872:

GPKG doc: explain how to check if SQLITE_HAS_COLUMN_METADATA is enabled (refs #7007)

comment:14 Changed 2 months ago by hansw

Thanks for all your input! I used the new revision r39906 from here for this test.

Detailed attribute behaviour if I don't use the "-unsetFid" option

Copy behaviour of "GDAL Technology Test Data Set" Layer "linestring2d" attributes:

original (GeoPackage) created target (Spatialite)
fid [only index attribute] fid [only index attribute]
geom [Geometry] the_geom [Geometry]

Copy behaviour of "Esri Sample Data Set" Layer "counties" attributes:

original (GeoPackage) created target (Spatialite)
- OGC_FID [only index attribute]
OBJECTID [only index attribute] OBJECTID [no index]
NAME NAME
STATE_NAME STATE_NAME
STATE_FIPS STATE_FIPS
... ...
Shape [Geometry] the_geom [Geometry]
ogr2ogr.exe -f "SQLite" "C:\Temp\gdal_sample_v1.0.sqlite" "C:\Temp\gdal_sample.gpkg" -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -sql "SELECT * FROM \"linestring2d\"" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES"

Detailed attribute behaviour if I use the "-unsetFid" option

Copy behaviour of "GDAL Technology Test Data Set" Layer "linestring2d" attributes:

original (GeoPackage) created target (Spatialite)
fid [only index attribute] -
- OGC_FID [only index attribute]
geom [Geometry] the_geom [Geometry]

Copy behaviour of "Esri Sample Data Set" Layer "counties" attributes:

original (GeoPackage) created target (Spatialite)
- OGC_FID [only index attribute]
OBJECTID [only index attribute] OBJECTID [no index]
NAME NAME
STATE_NAME STATE_NAME
STATE_FIPS STATE_FIPS
... ...
Shape [Geometry] the_geom [Geometry]
ogr2ogr.exe -f "SQLite" "C:\Temp\gdal_sample_v1.0.sqlite" "C:\Temp\gdal_sample.gpkg" -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -sql "SELECT * FROM \"linestring2d\"" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES" -unsetFid

Detailed description

In my test non of the inconsistent behaviour of ogr2ogr as described in my posts above is fixed.
I need consistency while adding attributes (even Index attributes). For one set of layers (e.g. "Esri Sample Data Set") attributes will be added, but not for another set of layers ("GDAL Technology Test Data Set"). This happens independently of the "-unsetFid" option.

I want to achieve one of two behaviours:

  1. The original (GeoPackage) Index attribute (independently of its name) will be copied as the new (Spatialite) Index attribute with the same name (or the name I specified with the "-lco "FID=myFID"" command line option).

or

  1. The original (GeoPackage) index attribute (independently of its name) will be copied as a "normal" attribute with the same name. And a new Index attribute, with a standard name or (the name specified with the "-lco "FID=myFID"" command line option), will be created.



The behaviour of ogr2ogr right now is so inconsistent that I'm not able to achive the expected behaviour for a random GeoPackage with unknown Index attribute name.

Possible reason

I assume this behaviour occurs, because ogr2ogr identifies the Index-Attribute with the name "fid" as "real" Index and Index attributes with other names not.

comment:15 Changed 2 months ago by Even Rouault

Are you sure you're really using the gisinternal version you're thinking too ?

I just downloaded http://download.gisinternals.com/sdk/downloads/release-1500-gdal-mapserver.zip

and ran (no -unsetFid)

ogr2ogr  -f "SQLite" out2.sqlite sample1_0.gpkg -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -sql "SELECT * FROM \"counties\"" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES"

and the resulting out2.sqlite has, as expected,

FID Column = OBJECTID
Geometry Column = the_geom
NAME: String (0.0)
STATE_NAME: String (0.0)
STATE_FIPS: String (0.0)
CNTY_FIPS: String (0.0)
FIPS: String (0.0)
AREA: Real (0.0)
POP1990: Real (0.0)
POP2000: Real (0.0)
POP90_SQMI: Integer (0.0)
OGRFeature(import1):1
  NAME (String) = Lake of the Woods
  STATE_NAME (String) = Minnesota
  STATE_FIPS (String) = 27
  CNTY_FIPS (String) = 077
  FIPS (String) = 27077
  AREA (Real) = 1784.0634
  POP1990 (Real) = 4076
  POP2000 (Real) = 4651
  POP90_SQMI (Integer) = 2
ogrinfo --format GPKG

should report (at the end of the output)

  Other metadata items:
    SQLITE_HAS_COLUMN_METADATA=YES

comment:16 Changed 2 months ago by Jukka Rahkonen

Hi,

Did you also try what happens if you use just the layer name, instead of having

-sql "SELECT * FROM \"counties\""

use just this

ogr2ogr  -f "SQLite" -dsco "SPATIALITE=YES" out2.sqlite sample1_0.gpkg counties -nln "import1" -lco "LAUNDER=NO"  -lco "GEOMETRY_NAME=the_geom"

This syntax preserved the FID even last week for me.

comment:17 in reply to:  16 Changed 2 months ago by hansw

Replying to Jukka Rahkonen:

ogr2ogr  -f "SQLite" -dsco "SPATIALITE=YES" out2.sqlite sample1_0.gpkg counties -nln "import1" -lco "LAUNDER=NO"  -lco "GEOMETRY_NAME=the_geom"

Thank you! This worked exactly like expected. Only one problem remains:

Remaining question

How do I use layer names with a space in its name? Just using double quotes around the layer name does not work.

Remaining inconsistent behaviour

Maybe the above behaviour of ogr2ogr while using a "-sql SELECT ..." command needs to be corrected? I feel it is unintuitive if the normal behaviour changes. But maybe I'm the only one experiencing this?

comment:18 Changed 2 months ago by Even Rouault

ogr2ogr -f gpkg out.gpkg in.gpkg "foo bar" works for me.

Regarding the inconsistent behaviour with a sql SELECT, I don't reproduce it as you saw from my above comment. Which SDK exactly did you use ? Perhaps there's an issue with a particular MSVC version.

comment:19 Changed 2 months ago by hansw

Now it works with the double quotes. Thanks.
I did not build GDAL myself. But you can find information about the build here.

Other issue

I tested the import with a custom GeoPackage file (created with QGIS):

  • layer name: "te st"
  • index attribute name: "123 456"

The following error occurred:

ERROR 1: sqlite3_exec(CREATE TABLE 'import1' ( 123 456 INTEGER PRIMARY KEY AUTOINCREMENT)) failed: near "123": syntax error

In this SQL statement the Index attribute name should be in single quotes. Should I create a new bug report for this issue?

test file
Command line:

ogr2ogr.exe -f "SQLite" "test.sqlite" "test.gpkg" "te st" -gt 65536 --config OGR_SQLITE_PRAGMA "SYNCHRONOUS=OFF" -nln "import1" -lco "SPATIAL_INDEX=YES" -lco "FORMAT=SPATIALITE" -lco "LAUNDER=NO" -lco "GEOMETRY_NAME=the_geom" -dsco "SPATIALITE=YES"

comment:20 Changed 2 months ago by Even Rouault

Resolution: fixed
Status: newclosed

In 39913:

SQLite: escape integer primary key column name on table creation (fixes #7007)

comment:21 Changed 2 months ago by Even Rouault

In 39914:

SQLite: escape integer primary key column name on table creation (fixes #7007)

comment:22 Changed 2 months ago by hansw

Thanks for the fast fix!

Note: See TracTickets for help on using tickets.