Opened 7 years ago

Closed 5 years ago

Last modified 4 years ago

#4608 closed defect (fixed)

GDAL 1.9.0 for Python 2.7.2 does not support native unicode

Reported by: invisibleroads Owned by: hobu
Priority: normal Milestone: 2.0.0
Component: PythonBindings Version: 1.9.0
Severity: normal Keywords: unicode utf-8 utf8 SetField GetField
Cc:

Description (last modified by invisibleroads)

Currently, using the Python SWIG bindings for GDAL 1.9.0,

feature.SetField(0, u'xxx')

feature.SetField(0, 'Спасибо'.decode('utf-8'))

raise the following exception

NotImplementedError: Wrong number of arguments for overloaded function 'Feature_SetField'

Meanwhile, feature.SetField2() forcibly converts the string to ascii using str().

Ideally,

SetField() should accept native unicode

SetField2() should accept native unicode

GetField() should return native unicode

GetFieldAsString() should return native unicode
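
Until the bindings accept unicode directly, one possible workaround on Python 2 is to encode unicode values to UTF-8 byte strings before handing them to SetField(), so that the call should match the char const * overload. The helper below is a minimal sketch, not part of GDAL: the name encode_for_setfield and the default UTF-8 encoding are assumptions made for illustration.

```python
import sys

PY2 = sys.version_info[0] == 2
# On Python 3 only ``str`` is evaluated, so ``unicode`` is never looked up there.
text_type = unicode if PY2 else str  # noqa: F821


def encode_for_setfield(value, encoding='utf-8'):
    """Hypothetical helper (not a GDAL API): prepare a value for SetField().

    On Python 2, unicode objects are encoded to a byte string.
    Every other value, and every value on Python 3, is returned unchanged.
    """
    if PY2 and isinstance(value, text_type):
        return value.encode(encoding)
    return value


# Intended use, assuming ``feature`` is an ogr.Feature with a string field 0:
# feature.SetField(0, encode_for_setfield(u'xxx'))
```

Numbers and plain byte strings pass through untouched, so the helper can wrap every value without checking the field type first.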

References:

http://lists.osgeo.org/pipermail/gdal-dev/2010-September/026156.html

http://trac.osgeo.org/gdal/wiki/rfc5_unicode

Change History (9)

comment:1 Changed 7 years ago by invisibleroads

Description: modified (diff)

comment:2 Changed 7 years ago by invisibleroads

Description: modified (diff)

comment:3 Changed 7 years ago by invisibleroads

Component: SWIG (all bindings) → PythonBindings

Ari Jolma reports that Perl already supports unicode attributes in GDAL. This means that GDAL stores unicode attributes, but there is something wrong with the Python typemaps.

On 11 April 2012 at 10:01, Ari Jolma <ari.jolma@…> wrote:

In the Perl bindings all strings going to GDAL internals are upgraded from Perl internal format to utf-8 and all strings coming from GDAL internals are marked for Perl to be utf-8.

This is done in the Perl typemaps after a change last November. http://trac.osgeo.org/gdal/changeset/23405

I don't see a similar problem in the Perl bindings as below. However utf-8 strings are handled in Python differently from Perl.

comment:4 Changed 7 years ago by invisibleroads

Maybe one can apply a variation of Ari's changes in http://trac.osgeo.org/gdal/changeset/23405 to http://trac.osgeo.org/gdal/browser/trunk/gdal/swig/include/python

I can't do this now, but I may be able to look at this issue in a few weeks.

comment:5 Changed 5 years ago by Jukka Rahkonen

invisibleroads, what would you say about this ticket now?

comment:6 Changed 5 years ago by invisibleroads

Hmm, I am still getting the same errors with GDAL 1.11.0. Unfortunately, I'm not familiar with Perl and I am not able to make much sense of Ari's changeset.

from osgeo import ogr, osr
data_driver = ogr.GetDriverByName('ESRI Shapefile')
data_source = data_driver.CreateDataSource('/tmp/examples.shp')
spatial_reference = osr.SpatialReference()
spatial_reference.ImportFromProj4('+proj=longlat +datum=WGS84 +no_defs')
data_layer = data_source.CreateLayer('layer', spatial_reference, ogr.wkbPoint)
feature_definition = data_layer.GetLayerDefn()
feature = ogr.Feature(feature_definition)

feature.SetField(0, 'Спасибо'.decode('utf-8'))
NotImplementedError: Wrong number of arguments for overloaded function 'Feature_SetField'.
  Possible C/C++ prototypes are:
    SetField(OGRFeatureShadow *,int,char const *)
    SetField(OGRFeatureShadow *,char const *,char const *)
    SetField(OGRFeatureShadow *,int,int)
    SetField(OGRFeatureShadow *,char const *,int)
    SetField(OGRFeatureShadow *,int,double)
    SetField(OGRFeatureShadow *,char const *,double)
    SetField(OGRFeatureShadow *,int,int,int,int,int,int,int,int)
    SetField(OGRFeatureShadow *,char const *,int,int,int,int,int,int,int)

feature.SetField2(0, 'Спасибо'.decode('utf-8'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)
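
The SetField2() failure above can be reproduced without GDAL. On Python 2, calling str() on a unicode object encodes it implicitly with the ASCII codec, which cannot represent the seven Cyrillic characters. The sketch below performs that encoding step explicitly so that it behaves the same way on Python 2 and 3. The variable names are illustrative only.

```python
# u'Спасибо' written with escapes so the snippet does not depend on file encoding
text = u'\u0421\u043f\u0430\u0441\u0438\u0431\u043e'

try:
    # This is what str(text) does implicitly on Python 2.
    text.encode('ascii')
    ascii_failure = None
except UnicodeEncodeError as error:
    ascii_failure = error

# Encoding to UTF-8 succeeds: each of the seven characters takes two bytes.
utf8_bytes = text.encode('utf-8')
```

The exception's start and end attributes (0 and 7, with the end exclusive) correspond to the "position 0-6" in the error message.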

comment:7 Changed 5 years ago by Even Rouault

Milestone: 2.0
Resolution: fixed
Status: new → closed

trunk r28259 "Python bindings: for Python 2.X, accept unicode string as argument of Feature.SetField(idx_or_name, value) (#4608)"

I'm not really considering returning unicode strings however as this has the potential of breaking a lot of stuff.
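
Since the getters keep returning byte strings on Python 2, callers who want unicode can decode the result themselves. The following is a minimal sketch under the assumption that the value coming back from GetField() or GetFieldAsString() is UTF-8 encoded. The helper name decode_from_getfield is hypothetical and not part of the bindings.

```python
import sys

PY2 = sys.version_info[0] == 2


def decode_from_getfield(value, encoding='utf-8'):
    """Hypothetical helper (not a GDAL API): decode a getter result.

    On Python 2, byte strings (``str``) are decoded to unicode.
    Other values, and every value on Python 3, are returned unchanged.
    """
    if PY2 and isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Intended use, assuming ``feature`` is an ogr.Feature with a string field 0:
# name = decode_from_getfield(feature.GetField(0))
```

Keeping the decoding in caller code leaves existing scripts that expect byte strings unaffected.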

comment:8 Changed 5 years ago by invisibleroads

Cool, thanks rouault. I think at the time we were dealing with accents in French-sounding town names in Senegal.

comment:9 Changed 4 years ago by Even Rouault

Milestone: 2.0 → 2.0.0

Milestone renamed
