Opened 19 years ago

Last modified 13 years ago

#882 closed enhancement

Unicode support in OGR — at Version 6

Reported by: magnus@… Owned by: warmerdam
Priority: normal Milestone: 1.9.0
Component: OGR_SF Version: unspecified
Severity: normal Keywords: Shape
Cc: Markus Neteler, alexbruy, gislab, Jeff McKenna

Description (last modified by Mateusz Łoskot)

A function to get and set the encoding of a shapefile's .dbf is wanted. See URL for an ESRI view of how it can be done. Alternatively, or in addition, the functions should support Unicode everywhere char * is used.

Background: I'm working on qgis, which uses Qt with QStrings. These are Unicode, and when using a non-Unicode library one can choose between mystring.latin1(), mystring.ascii(), mystring.utf8(), mystring.unicode() and mystring.local8Bit() for converting from Unicode. So far, filenames use .local8Bit(), but attributes use .ascii().

I'm thinking I might change it to .local8Bit() too. The ideal(?) solution would be if we could set the encoding to UTF-8, for instance, and also note that somewhere in the .dbf header, then do the reverse on reading.
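
To illustrate why the choice of conversion matters for attribute values, here is a minimal sketch in Python. It is a stand-in for the Qt conversions named above, not Qt code; the helper names `to_bytes` and `round_trips` are hypothetical and exist only for this example.

```python
# Python stand-in for the QString conversions discussed above (not Qt code).
# to_bytes and round_trips are hypothetical helpers for illustration only.

name = "Łoskot"  # an attribute value containing a non-ASCII character


def to_bytes(text: str, encoding: str) -> bytes:
    """Encode text, replacing characters the encoding cannot represent with '?'."""
    return text.encode(encoding, errors="replace")


def round_trips(text: str, encoding: str) -> bool:
    """Return True if encoding and then decoding gives back the original text."""
    return to_bytes(text, encoding).decode(encoding) == text


for enc in ("ascii", "latin-1", "utf-8"):
    # Show the encoded bytes and whether the original text survives.
    print(enc, to_bytes(name, enc), round_trips(name, enc))
```

In this sketch only UTF-8 preserves the value; the ASCII and Latin-1 conversions lose the "Ł". A reader can only decode the bytes correctly if it knows which encoding was used, which is why recording it in the .dbf header is attractive.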

References:
http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT
Discussion on IRC #gdal, 2005-07-06.
A Google search for "gdal org unicode" turns up a gdal-dev discussion on "Shapelib and unicode".

Change History (5)

comment:1 by warmerdam, 19 years ago

I would add that the dbf reference lists byte 30 (offset 29) as the
language driver code.  There is no listed value for unicode though.  

http://www.clicketyclick.dk/databases/xbase/format/dbf.html
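
As a rough sketch of what reading that byte could look like, the following Python reads the first 32 bytes of a .dbf file and returns the value at offset 29. The function name `read_language_driver_id` is hypothetical, this is not OGR or Shapelib code, and mapping the returned value to an actual encoding is left out.

```python
def read_language_driver_id(path: str) -> int:
    """Return the language driver code stored at offset 29 of a .dbf header.

    Hypothetical helper for illustration; assumes the 32-byte header
    layout described in the reference above.
    """
    with open(path, "rb") as f:
        header = f.read(32)  # the fixed part of the header is 32 bytes
    if len(header) < 32:
        raise ValueError("file is too short to contain a .dbf header")
    return header[29]  # byte 30 when counting from 1
```

A caller would still need a lookup table from this code to a code page, and, as noted above, the reference lists no value for Unicode.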

In OGR, for creation, we should support a layer creation option to set
the language code in the shapefile driver.  

There is no obvious means of reporting language code when reading since 
OGR has no metadata facility.  


comment:2 by magnus@…, 19 years ago

One more ref on i18n in Qt:
http://doc.trolltech.com/3.3/i18n.html

comment:3 by neteler@…, 18 years ago

(From update of attachment 296)
sorry, submitted to the wrong bug number. Please delete here.

comment:5 by warmerdam, 17 years ago

Description: modified (diff)
Priority: high → normal
Severity: major → normal
Type: defect → enhancement

An RFC is under development to address this:

http://www.gdal.org/rfc5_unicode.html

Adding Andrey as a cc: in case the information in this report is helpful.

Reclassifying as an enhancement.

comment:6 by Mateusz Łoskot, 17 years ago

Description: modified (diff)