Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#6137 closed defect (fixed)

Segfault in ogr2ogr when importing csv file

Reported by: rtorre Owned by: warmerdam
Priority: normal Milestone: 2.1.0
Component: OGR_SF Version: svn-trunk
Severity: normal Keywords: crash, segfault, csv, csv_drv, GEOMETRY_NAME
Cc:

Description

When importing a CSV file with the options -lco GEOMETRY_NAME=the_geom and -oo KEEP_GEOM_COLUMNS=NO, ogr2ogr crashes with a segmentation fault if the input file already contains a field named the_geom.

Tested against r30744.

This is the stack trace I got from the crash:

(gdb) bt
#0  0x00000000004289c4 in OGRFieldDefn::GetType (this=0x0) at /home/developer/src/ogr2ogr-package/gdal/ogr/ogr_feature.h:87
#1  0x0000000000be45d7 in OGRCSVLayer::GetNextUnfilteredFeature (this=0x2ca2890) at ogrcsvlayer.cpp:1362
#2  0x0000000000be563e in OGRCSVLayer::GetNextFeature (this=0x2ca2890) at ogrcsvlayer.cpp:1618
#3  0x00000000004276f7 in LayerTranslator::Translate (this=0x7ffe19df1fe0, psInfo=0x2cef430, nCountLayerFeatures=0, pnReadFeatureCount=0x0, pfnProgress=0, pProgressArg=0x0) at ogr2ogr.cpp:3781
#4  0x0000000000423a69 in main (nArgc=32, papszArgv=0x2ca1530) at ogr2ogr.cpp:2458

I have attached the input CSV file and a small script that reproduces the issue.

Attachments (2)

reproduce.sh (1.0 KB ) - added by rtorre 9 years ago.
ne_10m_populated_places_simple.csv (7.9 KB ) - added by rtorre 9 years ago.

Change History (8)

by rtorre, 9 years ago

Attachment: reproduce.sh added

by rtorre, 9 years ago

Attachment: ne_10m_populated_places_simple.csv added

comment:1 by Even Rouault, 9 years ago

Resolution: fixed
Status: new → closed

trunk r30750 "CSV: fix crash in case of multiple candidate geometry columns and KEEP_GEOM_COLUMNS=NO (#6137, trunk only)"

comment:2 by Even Rouault, 9 years ago

trunk r30751 "CSV: further fix for #6137"

comment:3 by rtorre, 9 years ago

With the same input file and reproduction script, I'm now getting this error:

ERROR 1: ERROR:  column "the_geom" specified more than once

ERROR 1: CREATE TABLE "public"."populated_places_simple" ( ogc_fid SERIAL, PRIMARY KEY (ogc_fid), "scalerank" INTEGER, "natscale" INTEGER, "labelrank" INTEGER, "featurecla" VARCHAR, "name" VARCHAR, "namepar" VARCHAR, "namealt" VARCHAR, "diffascii" INTEGER, "nameascii" VARCHAR, "adm0cap" FLOAT8, "capalt" FLOAT8, "capin" VARCHAR, "worldcity" FLOAT8, "megacity" INTEGER, "sov0name" VARCHAR, "sov_a3" VARCHAR, "adm0name" VARCHAR, "adm0_a3" VARCHAR, "adm1name" VARCHAR, "iso_a2" VARCHAR, "note" VARCHAR, "changed" FLOAT8, "namediff" INTEGER, "diffnote" VARCHAR, "pop_max" INTEGER, "pop_min" INTEGER, "pop_other" INTEGER, "geonameid" FLOAT8, "meganame" VARCHAR, "ls_name" VARCHAR, "ls_match" INTEGER, "checkme" INTEGER, "the_geom" VARCHAR, "cartodb_id" INTEGER, "created_at" timestamp with time zone, "updated_at" timestamp with time zone, "the_geom_webmercator" VARCHAR, "the_geom" geometry(MULTIPOINT,4326) )
ERROR:  column "the_geom" specified more than once

Am I doing something wrong? Should I open a new ticket for that?

comment:4 by Even Rouault, 9 years ago

Well, the error at PostgreSQL insertion can be logically explained. With -lco GEOMETRY_NAME=the_geom you asked for the PostgreSQL geometry column to be named the_geom, but your input dataset also contains a string column called the_geom, hence the conflict. The issue is that the CSV has both longitude,latitude and the_geom fields: longitude,latitude gets picked as the geometry column, and the_geom is then treated as a text field. A workaround would be to remove -oo GEOM_POSSIBLE_NAMES=the_geom in that case.
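The name clash described above can be detected before running the import. The following is only a minimal sketch: find_geometry_name_conflicts is a hypothetical helper, not part of GDAL/OGR, and the case-insensitive comparison is an assumption about how the names end up being compared in the target table.

```python
def find_geometry_name_conflicts(attribute_fields, geometry_name):
    """Return the attribute field names that collide with the geometry column name."""
    # Assumption: names are compared case-insensitively. Adjust this if the
    # target table is expected to keep case-sensitive column names.
    target = geometry_name.lower()
    return [name for name in attribute_fields if name.lower() == target]


# A few of the attribute columns from the CREATE TABLE statement in comment 3.
fields = ["name", "the_geom", "cartodb_id", "the_geom_webmercator"]
conflicts = find_geometry_name_conflicts(fields, "the_geom")
print(conflicts)
```

A non-empty result means the value given to GEOMETRY_NAME would be specified twice in the generated CREATE TABLE statement.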

comment:5 by rtorre, 9 years ago

I tried removing the_geom from the possible names, but the same error is raised. I'll figure out a better approach to driving the ogr2ogr options.

comment:6 by Even Rouault, 9 years ago

Ah sorry, my suggestion to remove -oo GEOM_POSSIBLE_NAMES=the_geom was indeed wrong. Apart from doing an explicit -sql and selecting all the fields but the_geom, I have no better suggestion. Actually, some time ago there was discussion about allowing a "-select *,!the_geom" syntax in ogr2ogr to select all the fields but the one(s) mentioned.
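The explicit -sql approach could be scripted rather than typed by hand. This is a rough sketch under stated assumptions: build_select_excluding is a hypothetical helper, the layer name "places" is made up for the example (the real layer name depends on the input file), and double-quoted identifiers are assumed to be accepted by the SQL dialect in use.

```python
import csv
import io


def build_select_excluding(header_line, layer_name, excluded):
    """Build a SELECT statement listing every CSV column except those in `excluded`."""
    # Parse the header with the csv module so quoted column names are handled.
    fields = next(csv.reader(io.StringIO(header_line)))
    excluded_lower = {name.lower() for name in excluded}
    kept = [f for f in fields if f.lower() not in excluded_lower]
    if not kept:
        raise ValueError("no fields left to select")
    # Assumption: the SQL dialect accepts double-quoted identifiers.
    quoted = ", ".join('"{}"'.format(f) for f in kept)
    return 'SELECT {} FROM "{}"'.format(quoted, layer_name)


# Hypothetical, shortened header; the real CSV has many more columns.
header = "name,the_geom,longitude,latitude"
sql = build_select_excluding(header, "places", ["the_geom"])
print(sql)
```

The resulting string is meant to be passed as the value of ogr2ogr's -sql option, so that the text column named the_geom never reaches the output table.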