Opened 12 years ago

Closed 12 years ago

#2966 closed defect (fixed)

OGR GPX driver creates duplicate field names in shapefile

Reported by: hamish
Owned by: warmerdam
Priority: normal
Milestone: 1.6.1
Component: OGR_SF
Version: 1.5.2
Severity: normal
Keywords: gpx
Cc:

Description

Hi,

If I try to convert a GPX file to a shapefile the resulting shapefile DBF contains a repeated column name:

ogr2ogr -f "ESRI Shapefile"  gps_track gps_track_29apr2009.gpx

(it creates a number of shapefiles, e.g. track_points.shp)

dbview -e gps_track/track_points.dbf | less
Field Name      Type    Length  Decimal Pos
track fid         N        11       0
track seg         N        11       0
track seg         N        11       0
ele               N        24      15
time              D         8       0
magvar            N        24      15
geoidheigh        N        24      15
name              C        80       0
cmt               C        80       0
...

Note the repeated "track_seg_" field.

The ogr2ogr -sql option could be used to remap field names, but it is ambiguous which of the two identically named fields would be renamed.

Hamish

Change History (2)

comment:1 Changed 12 years ago by Even Rouault

This is not really an issue with the GPX driver. It reports 2 distinct field names: "track_seg_id" and "track_seg_point_id".

The main culprit is the DBF format itself, which doesn't accept more than 10 characters for field names, so both get truncated to "track_seg_". Maybe the shape driver and/or ogr2ogr could detect that translating the GPX layer will yield duplicate names after truncation, but I don't see any obvious way of fixing this. The process should be roughly:

1) The OGRShapeLayer::CreateField() method detects that there will be a duplicated field name after truncation to 10 characters, so it starts a map between "long names" and "truncated names". It could use the MS-DOS technique to create truncated names, like "track_se~1"

2) The OGRFeature::SetFrom() method uses this map to copy the source feature into the destination feature

This is just a draft that should be improved. Much pain and API changes for a marginal use case...
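
The two steps above could be sketched roughly as follows. This is only an illustration in Python, not OGR code: build_truncation_map and copy_attributes are hypothetical helpers, records are plain dicts, and the "~N" suffix rule is one possible reading of the MS-DOS technique mentioned in step 1.

```python
DBF_MAX_FIELD_NAME = 10  # DBF field names are limited to 10 characters


def build_truncation_map(long_names, max_len=DBF_MAX_FIELD_NAME):
    """Map each long field name to a unique name of at most max_len characters.

    Hypothetical helper (not part of OGR). A name that would collide after
    plain truncation gets an MS-DOS style "~N" suffix instead.
    """
    mapping = {}
    used = set()
    for name in long_names:
        candidate = name[:max_len]
        counter = 1
        while candidate in used:
            suffix = "~%d" % counter
            candidate = name[:max_len - len(suffix)] + suffix
            counter += 1
        used.add(candidate)
        mapping[name] = candidate
    return mapping


def copy_attributes(source, mapping):
    """Rename the keys of a source record through the map (step 2, simplified)."""
    return {mapping[key]: value for key, value in source.items()}


fields = ["track_fid", "track_seg_id", "track_seg_point_id", "ele"]
name_map = build_truncation_map(fields)
record = copy_attributes(
    {"track_fid": 0, "track_seg_id": 1, "track_seg_point_id": 7, "ele": 12.5},
    name_map,
)
print(name_map)
print(record)
```

With this rule the first field keeps the plain truncation "track_seg_" and only the later one becomes "track_se~1", so the result depends on field order.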

I don't want to change the field names reported by the GPX driver by default, as people may already rely on them in their code (they have been in 2 stable versions already). So I'll probably add an environment variable "OGR_GPX_SHORT_NAMES" that, when set to YES, will report (and expect):

  • "track_seg_id" --> "trksegid"
  • "track_seg_point_id" --> "trksegptid"
  • "route_point_id" --> "rteptid"

A few other field names (geoidheight, ageofdgpsdata) are slightly above 10 characters but their truncation doesn't cause ambiguity.
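
As a quick sanity check of the names above, the following Python sketch truncates each name to 10 characters and reports any collisions. It is an illustration only: truncated_collisions is a hypothetical helper, and the names mentioned in this comment are pooled into one list even though they come from different GPX layers.

```python
DBF_MAX_FIELD_NAME = 10  # DBF field names are limited to 10 characters

# Short names as listed in this comment.
SHORT_NAMES = {
    "track_seg_id": "trksegid",
    "track_seg_point_id": "trksegptid",
    "route_point_id": "rteptid",
}


def truncated_collisions(names, max_len=DBF_MAX_FIELD_NAME):
    """Return {truncated name: [original names]} for truncations shared by several fields."""
    groups = {}
    for name in names:
        groups.setdefault(name[:max_len], []).append(name)
    return {short: longs for short, longs in groups.items() if len(longs) > 1}


default_names = [
    "track_seg_id",
    "track_seg_point_id",
    "route_point_id",
    "geoidheight",
    "ageofdgpsdata",
]
short_names = [SHORT_NAMES.get(name, name) for name in default_names]

default_collisions = truncated_collisions(default_names)
short_collisions = truncated_collisions(short_names)
print(default_collisions)
print(short_collisions)
```

Only the two "track_seg_" fields collide with the default names. With the short names no collision remains, and "geoidheight" and "ageofdgpsdata" are still truncated but stay distinct.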

comment:2 Changed 12 years ago by Even Rouault

Milestone: 1.6.1
Resolution: fixed
Status: new → closed

Done in trunk (r16886) and in branches/1.6 (r16887). The option is called "GPX_SHORT_NAMES" to be consistent with other option names in the GPX driver.
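
For the conversion from the original report, the option could be passed to ogr2ogr along these lines. This is a hedged sketch: build_ogr2ogr_command is a hypothetical wrapper, it assumes a build that contains the fix, and it uses the generic "--config KEY VALUE" switch of the GDAL/OGR command line utilities to set the option.

```python
import os
import shutil
import subprocess


def build_ogr2ogr_command(src_gpx, dst_dir, short_names=True):
    """Build an ogr2ogr argument list for a GPX to shapefile conversion."""
    cmd = ["ogr2ogr"]
    if short_names:
        # --config KEY VALUE sets a GDAL/OGR configuration option for this run.
        cmd += ["--config", "GPX_SHORT_NAMES", "YES"]
    cmd += ["-f", "ESRI Shapefile", dst_dir, src_gpx]
    return cmd


command = build_ogr2ogr_command("gps_track_29apr2009.gpx", "gps_track")
print(" ".join(command))

# Only run the conversion when ogr2ogr is installed and the input file exists.
if shutil.which("ogr2ogr") and os.path.exists("gps_track_29apr2009.gpx"):
    subprocess.run(command, check=True)
```

Setting GPX_SHORT_NAMES=YES in the environment before calling ogr2ogr should have the same effect, since configuration options can also be read from environment variables.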
