Opened 13 years ago

Last modified 13 years ago

#1250 closed defect (fixed)

when GML contains namespace, ogr2ogr writes namespace in dbf column names

Reported by: bartvde@… Owned by: Mateusz Łoskot
Priority: highest Milestone:
Component: default Version: unspecified
Severity: blocker Keywords:
Cc:

Description

When a GML file uses a namespace prefix (as Mapserver does with ms), the attribute names in the DBF come out as ms:XXXX. This does not work well. Ideally the ms: part would be removed, leaving an attribute name without the XML namespace prefix.

Attaching a Mapserver GML file to reproduce.

Attachments (1)

dtb.xml (614.4 KB) - added by bartvde@… 13 years ago.
gml file from Mapserver


Change History (10)

Changed 13 years ago by bartvde@…

Attachment: dtb.xml added

gml file from Mapserver

comment:1 Changed 13 years ago by Mateusz Łoskot

I'm taking this bug.

comment:2 Changed 13 years ago by Mateusz Łoskot

I have just fixed this bug, so I'm going to close it.

The Shape driver now normalizes each field name before adding it to the DBF table.
Normalization is done with the CPLScanString function, which replaces the colon character with an underscore.

Here is the debug output (CPL_DEBUG=ON) from converting Bart's GML file to a shapefile:

mloskot:~/dev/gdal/bugs/1250$ ogr2ogr -f "ESRI Shapefile" test.shp dtb.xml
OGR: OGROpen(dtb.xml/0x804f278) succeeded as GML.
Shape: Normalized field name: ms_CTE
Shape: Normalized field name: ms_DATUM
Shape: Normalized field name: ms_DTM
Shape: Normalized field name: ms_EIGNAM
Shape: Normalized field name: ms_LAYER
Shape: Normalized field name: ms_OMSCHR
Shape: Normalized field name: ms_THEMA
Shape: Normalized field name: ms_THEMB

and the DBF table dumped using Shapelib's dbfdump utility:

mloskot:~/dev/gdal/bugs/1250$ dbfdump -h test | head -n 8
Field 0: Type=String, Title=`ms_CTE', Width=8, Decimals=0
Field 1: Type=Integer, Title=`ms_DATUM', Width=11, Decimals=0
Field 2: Type=String, Title=`ms_DTM', Width=1, Decimals=0
Field 3: Type=String, Title=`ms_EIGNAM', Width=80, Decimals=0
Field 4: Type=Integer, Title=`ms_LAYER', Width=11, Decimals=0
Field 5: Type=String, Title=`ms_OMSCHR', Width=25, Decimals=0
Field 6: Type=String, Title=`ms_THEMA', Width=12, Decimals=0
Field 7: Type=String, Title=`ms_THEMB', Width=80, Decimals=0
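
As an illustration only, the colon-to-underscore normalization described above can be sketched in a few lines of Python. This is not the driver's C++ code; the function name `normalize_field_name` is hypothetical, and the sample field names are taken from the debug output above.

```python
def normalize_field_name(name: str) -> str:
    """Replace every colon in a field name with an underscore.

    Hypothetical helper mirroring the behaviour described in this ticket;
    it is not the actual Shape driver implementation.
    """
    return name.replace(":", "_")


# Field names as they appear in the Mapserver GML, with the ms: prefix.
gml_fields = ["ms:CTE", "ms:DATUM", "ms:DTM"]
normalized = [normalize_field_name(f) for f in gml_fields]
print(normalized)
```

Names without a colon pass through unchanged, so the sketch leaves attributes that carry no namespace prefix as they are.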

I hope it is OK.

comment:3 Changed 13 years ago by bartvde@…

Hi Mateusz,

it will work this way, but it would be a lot more convenient for us if the ms_ part would be removed (the namespace part). 

Now we get users complaining that their attribute names have changed.

Bart

comment:4 Changed 13 years ago by bartvde@…

Maybe the GML reader should be adapted so it will ignore the namespace?

Or maybe Mapserver WFS should be adapted to use ms as the default namespace, so it does not have to use ms: explicitly in the XML document.

What do you think Mateusz?

comment:5 Changed 13 years ago by warmerdam

I think I have flip flopped at least once on the "stripping namespace" issue
in the GML driver without coming to any clear conclusion.  One problem is
that several name spaces can be used in GML and then preserving the name space
qualifier can be necessary to keep fields distinct. 
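
To make the concern concrete, here is a small Python sketch of how stripping prefixes can merge fields that were distinct. The field list is invented for illustration (it is not taken from the attached file), and `find_clashes` is a hypothetical helper, not part of OGR.

```python
from collections import defaultdict


def find_clashes(qualified_names):
    """Group prefixed names by local part; return only groups with more than one member."""
    by_local = defaultdict(list)
    for qualified in qualified_names:
        # Everything after the first colon is the local name;
        # a name without a colon is its own local name.
        local = qualified.split(":", 1)[-1]
        by_local[local].append(qualified)
    return {local: names for local, names in by_local.items() if len(names) > 1}


# Hypothetical fields drawn from two namespaces in one document.
fields = ["ms:name", "gml:name", "ms:DATUM"]
clashes = find_clashes(fields)
print(clashes)
```

With the prefixes kept, all three fields are distinct; with them stripped, the first two would both become `name`.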

On the other hand, the namespace isn't really a proper part of the element
name. 

Garr. 

Well, I'd suggest resubmitting a bug report suggesting stripping of namespaces.

comment:6 Changed 13 years ago by Mateusz Łoskot

(In reply to comment #4)
> Hi Mateusz,
> 
> it will work this way, but it would be a lot more convenient for us if the ms_
> part would be removed (the namespace part). 
> 
> Now we get users complaining that their attribute names have changed.

Bart,

I've considered possible problems caused by my fix of this Bug 1250.
However, I've not found any better solution.
This bug applies to user-defined namespaces which are not part of the GML spec, so I'm not sure if removing *all* custom namespaces is a good approach.
Is it?
Or is it better to test for and remove *only* the ms: namespace?

comment:7 Changed 13 years ago by Mateusz Łoskot

(In reply to comment #5)
> Maybe the GML reader should be adapted so it will ignore the namespace?

What do you mean?
To ignore user-defined namespace?
I think it's a bad idea, see Frank's comment.

> Or maybe Mapserver WFS should be adapted to use ms as the default
> namespace, so
> it does not have to use ms: explicitly in the XML document.

Hmm, Frank's comment applies here too.
I'm not sure I see any solution that works well for everyone :-(

comment:8 Changed 13 years ago by Mateusz Łoskot

(In reply to comment #6)
> I think I have flip flopped at least once on the "stripping namespace" issue
> in the GML driver without coming to any clear conclusion.

Yes, I can understand it.
I faced the same problem.

> One problem is
> that several name spaces can be used in GML and then preserving the name space
> qualifier can be necessary to keep fields distinct.

Yup, that's my concern too.

> On the other hand, the namespace isn't really a proper
> part of the element name. 

Not a proper part of XML element name?
Sorry, I don't understand it.

> Well, I'd suggest resubmitting a bug report suggesting
> stripping of namespaces.

Or maybe we could add a new option to ogr2ogr and the GML driver to control the following behaviour:
- strip the namespace from field names (the user has to check whether name clashes are possible)
- replace the colon in the XML element name with one character from the set valid in DBF field names (for example, _, [A-Z], [a-z], [0-9])
- replace the namespace part (colon included) with a custom prefix; for example, ms: could be replaced with "" (the empty string, giving the same result as the first option), or with 'ms' to give msmyfield, etc.
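
The three behaviours listed above could look roughly like this in Python. It is only a sketch of the proposal: the function `rewrite_field_name`, its `mode` values, and the `substitute` parameter are all hypothetical names, and no such option exists in ogr2ogr or the GML driver as far as this ticket shows.

```python
def rewrite_field_name(name: str, mode: str = "replace_colon", substitute: str = "_") -> str:
    """Rewrite a prefixed field name according to one of three hypothetical modes.

    strip           -> drop the prefix and the colon
    replace_colon   -> keep the prefix, swap the colon for `substitute`
    replace_prefix  -> swap the prefix and the colon for `substitute`
    """
    prefix, sep, local = name.partition(":")
    if not sep:
        # No namespace prefix present; nothing to rewrite.
        return name
    if mode == "strip":
        return local
    if mode == "replace_colon":
        return prefix + substitute + local
    if mode == "replace_prefix":
        return substitute + local
    raise ValueError(f"unknown mode: {mode}")


print(rewrite_field_name("ms:myfield", "strip"))
print(rewrite_field_name("ms:myfield", "replace_colon", "_"))
print(rewrite_field_name("ms:myfield", "replace_prefix", "ms"))
```

Note that `replace_prefix` with an empty `substitute` gives the same result as `strip`, which matches the remark in the third option.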

What do you think?

comment:9 Changed 13 years ago by bartvde@…

Hi Mateusz, Frank,

maybe the best option is for Mapserver to define "ms" as the default namespace, i.e.:

xmlns="http://mapserver.gis.umn.edu/mapserver"

then Mapserver does not have to put ms: in front of the elements.
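
A short Python sketch of why this is equivalent from the XML point of view: a namespace-aware parser resolves a prefixed element and a default-namespace element to the same qualified name. The element name `CTE` is borrowed from the attached file's fields; the two one-line documents are made up for illustration and are not real Mapserver output.

```python
import xml.etree.ElementTree as ET

NS = "http://mapserver.gis.umn.edu/mapserver"

# The same element written with an explicit ms: prefix...
prefixed = ET.fromstring(f'<ms:CTE xmlns:ms="{NS}">A</ms:CTE>')
# ...and with the namespace declared as the default one.
default = ET.fromstring(f'<CTE xmlns="{NS}">A</CTE>')

# ElementTree reports tags as {namespace-uri}localname.
print(prefixed.tag)
print(default.tag)
print(prefixed.tag == default.tag)
```

A reader that works on the raw element text rather than on resolved names would still see `ms:CTE` in one case and `CTE` in the other, which is the difference that matters for the DBF column names.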

I will ask this on the mapserver list.

Bart