Opened 19 years ago

Last modified 13 years ago

#882 closed enhancement

Unicode support in OGR — at Version 5

Reported by: magnus@… Owned by: warmerdam
Priority: normal Milestone: 1.9.0
Component: OGR_SF Version: unspecified
Severity: normal Keywords: Shape
Cc: Markus Neteler, alexbruy, gislab, Jeff McKenna

Description (last modified by warmerdam)

A function to get and set the encoding in a shapefile's .dbf is wanted. See URL
for an ESRI view of how it can be done. Alternatively, or in addition, the
functions should be able to support Unicode in all places where char * is used.

Background:
I'm working on qgis, which uses Qt with QStrings. They are Unicode, and when
using a non-Unicode library one can choose between mystring.latin1(),
mystring.ascii(), mystring.utf8(), mystring.unicode() and mystring.local8Bit()
for converting from Unicode. So far, filenames use .local8Bit(), but attributes
use .ascii(). I'm thinking I might change it to .local8Bit() too.
The ideal(?) solution would be if we could set the encoding to UTF-8, for
instance, and also note that in the .dbf header somewhere. Reverse on reading.
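
To illustrate why .ascii() is a problem for attributes, here is a minimal
sketch in Python (not Qt or OGR code) of the write/read round trip described
above. The function names are hypothetical and only stand in for "encode on
writing, decode on reading"; the sample strings are made up.

```python
def write_attribute(value: str, encoding: str = "utf-8") -> bytes:
    """Encode a Unicode attribute value to the bytes that would be stored."""
    return value.encode(encoding)


def read_attribute(raw: bytes, encoding: str = "utf-8") -> str:
    """Reverse step on reading: decode stored bytes back to Unicode."""
    return raw.decode(encoding)


def roundtrips(value: str, encoding: str) -> bool:
    """Return True if the value survives a write/read cycle in this encoding."""
    try:
        return read_attribute(write_attribute(value, encoding), encoding) == value
    except UnicodeEncodeError:
        # The encoding cannot represent at least one character in the value.
        return False
```

With this sketch, a value such as "Malmö" survives in UTF-8 and Latin-1 but
not in ASCII, and Greek text survives only in UTF-8, which is why the reader
needs to know which encoding was used on writing.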

References:
http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT 
Discussion on irc #gdal 2005-07-06
Searching for "gdal org unicode" on Google reveals a discussion in gdal-dev on
"Shapelib and unicode".

Change History (4)

comment:1 by warmerdam, 19 years ago

I would add that the dbf reference lists byte 30 (offset 29) as the
language driver code.  There is no listed value for unicode though.  

http://www.clicketyclick.dk/databases/xbase/format/dbf.html
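
As a rough sketch of what reading and setting that byte could look like, the
following Python operates on the fixed 32-byte part of a .dbf header held in
memory. The helper names are hypothetical (this is not shapelib or OGR API),
and the code value used in the usage note is arbitrary; the sketch assigns no
meaning to particular codes.

```python
LANGUAGE_DRIVER_OFFSET = 29   # 0-based offset, i.e. the 30th byte of the header
DBF_HEADER_FIXED_SIZE = 32    # size of the fixed part of the header


def read_language_driver(header: bytes) -> int:
    """Return the language driver code stored in a .dbf header."""
    if len(header) < DBF_HEADER_FIXED_SIZE:
        raise ValueError("DBF header must be at least 32 bytes")
    return header[LANGUAGE_DRIVER_OFFSET]


def set_language_driver(header: bytes, code: int) -> bytes:
    """Return a copy of the header with the language driver code replaced."""
    if len(header) < DBF_HEADER_FIXED_SIZE:
        raise ValueError("DBF header must be at least 32 bytes")
    if not 0 <= code <= 255:
        raise ValueError("language driver code must fit in one byte")
    buf = bytearray(header)
    buf[LANGUAGE_DRIVER_OFFSET] = code
    return bytes(buf)
```

For example, starting from an all-zero 32-byte header, set_language_driver(header, 0x57)
followed by read_language_driver() gives back 0x57 and leaves the other bytes
untouched.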

In OGR, for creation, we should support a layer creation option to set
the language code in the shapefile driver.  

There is no obvious means of reporting language code when reading since 
OGR has no metadata facility.  


comment:2 by magnus@…, 19 years ago

One more ref on i18n in Qt:
http://doc.trolltech.com/3.3/i18n.html

comment:3 by neteler@…, 18 years ago

(From update of attachment 296)
sorry, submitted to the wrong bug number. Please delete here.

comment:5 by warmerdam, 17 years ago

Description: modified (diff)
Priority: high → normal
Severity: major → normal
Type: defect → enhancement

An RFC is under development to address this:

http://www.gdal.org/rfc5_unicode.html

Adding Andrey as a cc: in case the information in this report is helpful.

Reclassifying as an enhancement.
