#2409 closed defect (fixed)
ogr2ogr -skipfailures backs out previous successful insert to postgis
Reported by: | kyngchaos | Owned by: | warmerdam |
---|---|---|---|
Priority: | normal | Milestone: | 1.5.2 |
Component: | Utilities | Version: | 1.5.1 |
Severity: | normal | Keywords: | ogr2ogr |
Cc: |
Description
When appending to a PostGIS table with the -skipfailures flag, each time an error is encountered and skipped, the previously successful feature insert appears to get reverted/backed out. An example: appending a TIGER county shapefile to a table that already contains an adjacent county, so the lines along the shared county border are duplicated. I have the 'tlid' field set as the primary key (changed from the default ogc_fid field that OGR adds on initial import).
ogr2ogr -update -append -skipfailures -a_srs EPSG:4269 -nln edges -f PostgreSQL PG:'dbname=tiger' fe_2007_55021_edges.shp
Here FIPS county 55025 (the adjacent county to the south) has already been successfully and completely imported.
There are many failure errors, as expected, for the duplicate shared lines along the common border. But there are also many missing lines inside the newly imported county that are not shared county boundary lines.
When I look at the missing lines in the original shapefile, they appear to be the records just before a skipped county line, i.e.:

- shape 1: OK
- shape 2: missing interior line
- shape 3: missing shared county boundary, the one skipped by ogr2ogr
- shape 4: OK
- ...
None of the missing interior lines appear in the error log, only the ones that were skipped.
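The pattern above is what a grouped transaction produces: PostgreSQL aborts the whole transaction when one INSERT fails, so every feature already inserted in that group is lost along with the duplicate. A minimal sketch of that failure mode, using Python's sqlite3 purely as a stand-in (the rollback is issued explicitly to mimic PostgreSQL's abort-on-error semantics; the table and tlid values are hypothetical):

```python
import sqlite3

# Model: features are inserted in one grouped transaction, and the whole
# group is backed out when a single INSERT hits a primary-key conflict.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE edges (tlid INTEGER PRIMARY KEY)")
con.execute("INSERT INTO edges VALUES (100)")  # county already imported
con.commit()

group = [1, 2, 100, 4]  # 100 duplicates a shared border line
try:
    for tlid in group:
        con.execute("INSERT INTO edges VALUES (?)", (tlid,))
    con.commit()
except sqlite3.IntegrityError:
    con.rollback()  # PostgreSQL-style: the whole group is backed out

rows = [r[0] for r in con.execute("SELECT tlid FROM edges ORDER BY tlid")]
print(rows)  # only the pre-existing row survives: [100]
```

Features 1 and 2 were inserted successfully yet vanish with the failed duplicate, which matches the missing interior lines reported above.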
Change History (9)
comment:1 by , 16 years ago
Status: | new → assigned |
---|
comment:2 by , 16 years ago
Ah, that does it. I ended up doing a similar thing with shp2pgsql: I had to strip out all the BEGIN; and END; lines in the SQL before piping it into psql.
It was quite a shock when my 25 GB of TIGER data in PostGIS turned out to be missing a lot of records! I'll use the -tg flag until the fix makes it into the next release.
PS. Could there be a runtime-configurable value for MAX_NUMBER_OF_ERRORS_REPORTED in cpl_error/CPLDefaultErrorHandler()? To keep track of all the skipped features and verify that those are the only errors, the #defined limit of 1000 was too low, so I commented it out in order to see all the errors. Preferably there would be some way to specify no limit (0 or -1?).
comment:3 by , 16 years ago
comment:4 by , 16 years ago (follow-up: comment:5)
Component: | OGR_SF → Utilities |
---|---|
Keywords: | ogr2ogr added |
Milestone: | → 1.5.2 |
Resolution: | → fixed |
Severity: | critical → normal |
Status: | assigned → closed |
William,
I have added a config option for the default error handler in trunk (r14632).
The option is "CPL_MAX_ERROR_REPORTS".
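With that option in place, the report cap can be raised at run time via GDAL's --config switch. A sketch, assuming the option accepts a plain numeric value (the limit of 1000000 here is arbitrary):

```shell
# Raise the error-report cap so every skipped feature shows up in the log.
ogr2ogr --config CPL_MAX_ERROR_REPORTS 1000000 \
    -update -append -skipfailures -a_srs EPSG:4269 -nln edges \
    -f PostgreSQL PG:'dbname=tiger' fe_2007_55021_edges.shp
```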
comment:5 by , 16 years ago
comment:7 by , 8 years ago
comment:8 by , 8 years ago
Hi!
Is this problem only affecting 'append' operations? If so, it would be great to also check for the 'append' flag in the if condition. That would avoid problems like the one we are having, where non-append operations run with 1 feature per transaction instead of the default 20k, slowing the process down a lot.
Thank you!
comment:9 by , 8 years ago
It does not only affect "append". "skipfailures" must always lead to using gt=1, so that rows are inserted and committed one at a time and only the failures are skipped.
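What gt=1 achieves can be sketched with the same sqlite3 stand-in as before (again with hypothetical tlid values, and an explicit rollback to mimic PostgreSQL): each feature gets its own transaction, so a failed INSERT skips only that feature and never backs out an earlier successful one.

```python
import sqlite3

# Model of transaction group size 1: commit after every feature.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE edges (tlid INTEGER PRIMARY KEY)")
con.execute("INSERT INTO edges VALUES (100)")  # county already imported
con.commit()

skipped = []
for tlid in [1, 2, 100, 4]:  # 100 is a duplicate shared border line
    try:
        con.execute("INSERT INTO edges VALUES (?)", (tlid,))
        con.commit()  # one transaction per feature
    except sqlite3.IntegrityError:
        con.rollback()
        skipped.append(tlid)  # only the duplicate is skipped

rows = [r[0] for r in con.execute("SELECT tlid FROM edges ORDER BY tlid")]
print(rows, skipped)  # [1, 2, 4, 100] [100]
```

All non-duplicate features survive, at the cost of one commit per row, which is why the previous comment objects to paying that price when -skipfailures was not requested.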
Instead of "skipfailures", it is sometimes possible to filter out faulty input geometries with the SQLite dialect:
-dialect SQLite -sql "SELECT * from source WHERE ST_IsValid(geometry)=1"
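Plugged into the command from the ticket description, such a filter might look like the following sketch. It assumes the shapefile's layer name is fe_2007_55021_edges and that the geometry column is exposed as "geometry" under the SQLite dialect:

```shell
# Filter out invalid geometries at read time instead of skipping failures.
ogr2ogr -update -append -a_srs EPSG:4269 -nln edges \
    -f PostgreSQL PG:'dbname=tiger' fe_2007_55021_edges.shp \
    -dialect SQLite \
    -sql "SELECT * FROM fe_2007_55021_edges WHERE ST_IsValid(geometry)=1"
```

Note that this only helps when the failures are caused by invalid geometries; it does nothing for primary-key conflicts like the duplicate border lines in this ticket.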
I see your point. You should be able to use "-tg 1", an undocumented switch that sets the transaction group size to 1 instead of the default 200. That way only one feature is handled per transaction.
If this fixes things for you, I will modify ogr2ogr so that -skipfailures implicitly changes the transaction group size to 1.
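Applied to the command from the ticket description, the workaround would look roughly like this. A sketch: the comment above spells the switch -tg, while the transaction-group-size switch in ogr2ogr's documentation is spelled -gt, so check which spelling your build accepts:

```shell
# One feature per transaction: a skipped failure can no longer
# back out the previous successful insert.
ogr2ogr -update -append -skipfailures -tg 1 -a_srs EPSG:4269 -nln edges \
    -f PostgreSQL PG:'dbname=tiger' fe_2007_55021_edges.shp
```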