Opened 12 years ago

Closed 12 years ago

#4739 closed enhancement (wontfix)

Is LDID/87 appropriate as default encoding of the Shapefile driver dataset creation?

Reported by: akaginch Owned by: warmerdam
Priority: normal Milestone:
Component: OGR_SF Version: unspecified
Severity: normal Keywords: Shapefile, Language Driver ID
Cc:

Description (last modified by akaginch)

When a Shapefile dataset is created without the "ENCODING" layer creation option, the Language Driver ID (LDID) of the .dbf file is currently set to LDID/87 (ISO-8859-1).

In some cases, a non-zero LDID appears to take precedence over a code page specified by other means. For example:
QGIS #4343 Shapefile, created in Qgis, encoding not recognized by Esri ArcGIS 10

SHAPEWORKSPACE - ArcXML Programmer's Reference

In the first case, a user who wants to use a dataset whose non-zero LDID does not match the actual encoding of the table contents must first set the correct LDID with a binary editor. Alternatively, the user must reset the LDID to zero by re-saving the file with OpenOffice.org Calc and then create a .cpg file with a text editor. By comparison, it is easier to add a .cpg file to a dataset that was created with LDID/0 as the default.

Therefore, I think LDID/0 would be a more appropriate default than LDID/87.
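
The manual steps described above (inspecting or resetting the LDID and adding a .cpg file) can be sketched in Python. This is only an illustration, not part of GDAL or shapelib: the function names are hypothetical, and it assumes the standard dBASE header layout in which the language driver ID is the single byte at offset 29. Which code page names a reader accepts in the .cpg file depends on the software, so "CP932" below is just a sample value.

```python
from pathlib import Path

# Assumption: standard dBASE header, language driver ID stored at byte 29.
LDID_OFFSET = 29
HEADER_SIZE = 32


def read_ldid(dbf_path) -> int:
    """Return the language driver ID byte of a .dbf file."""
    with open(dbf_path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file is too short to contain a DBF header")
    return header[LDID_OFFSET]


def set_ldid(dbf_path, ldid: int) -> None:
    """Overwrite the language driver ID byte in place (hypothetical helper)."""
    if not 0 <= ldid <= 255:
        raise ValueError("LDID must fit in one byte")
    with open(dbf_path, "r+b") as f:
        f.seek(LDID_OFFSET)
        f.write(bytes([ldid]))


def write_cpg(dbf_path, encoding_name: str) -> Path:
    """Create a .cpg sidecar next to the .dbf, containing the encoding name."""
    cpg_path = Path(dbf_path).with_suffix(".cpg")
    cpg_path.write_text(encoding_name, encoding="ascii")
    return cpg_path
```

For a file created with LDID/87, read_ldid would return 87 (0x57); calling set_ldid(path, 0) followed by write_cpg(path, "CP932") corresponds to the second workaround. Work on a backup copy, since set_ldid modifies the file directly.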

Change History (4)

comment:1 by akaginch, 12 years ago

Here are some examples of the encoding transitions performed by ogr2ogr when the ENCODING option is not specified.

Command example: ogr2ogr -t_srs EPSG:3099 dest source.shp

(Source encoding) ---> (Inner encoding) ---> (Generated encoding)

With LDID/87 as the default encoding:

  1. Source code page is specified.
    CP932 ---> UTF-8 ---> ISO-8859-1

  2. Source code page is NOT specified.
    CP932 ---> CP932 ---> ISO-8859-1
                (wrong conversion from UTF-8 to ISO-8859-1: the CP932 bytes were treated as UTF-8)

With LDID/0 as the default encoding:

  3. Source code page is specified.
    CP932 ---> UTF-8 ---> UTF-8

  4. Source code page is NOT specified.
    CP932 ---> CP932 ---> CP932
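
The four transitions above can be imitated with Python's built-in codecs. This is a rough simulation under stated assumptions, not GDAL's actual recoding logic: simulate_transfer is a hypothetical helper, the sample text is arbitrary, and errors="replace" merely stands in for whatever the driver does with bytes or characters it cannot convert.

```python
from typing import Optional


def simulate_transfer(raw: bytes,
                      source_encoding: Optional[str],
                      output_encoding: Optional[str]) -> bytes:
    """Imitate source -> internal -> generated encoding for one field value.

    source_encoding=None models a source with no code page specified;
    output_encoding=None models an output with no code page (LDID/0).
    """
    # Reading: with a known code page the bytes are recoded to UTF-8,
    # otherwise they pass through unchanged and are assumed to be UTF-8.
    if source_encoding is not None:
        inner = raw.decode(source_encoding).encode("utf-8")
    else:
        inner = raw

    # Writing: without an output code page the internal bytes are kept as is.
    if output_encoding is None:
        return inner

    # With an output code page the internal bytes are interpreted as UTF-8
    # and recoded, which goes wrong if they were never UTF-8 to begin with.
    text = inner.decode("utf-8", errors="replace")
    return text.encode(output_encoding, errors="replace")


if __name__ == "__main__":
    sample = "東京".encode("cp932")  # arbitrary CP932 sample value
    cases = [
        ("1. LDID/87, source specified", "cp932", "iso-8859-1"),
        ("2. LDID/87, source not specified", None, "iso-8859-1"),
        ("3. LDID/0, source specified", "cp932", None),
        ("4. LDID/0, source not specified", None, None),
    ]
    for label, src, out in cases:
        print(label, simulate_transfer(sample, src, out))
```

In this sketch, only cases 3 and 4 preserve the text: case 3 yields valid UTF-8 and case 4 returns the original CP932 bytes untouched, while both LDID/87 cases lose the Japanese characters.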

LDID/0 means that no code page is specified. Here is a patch, if needed:

dbfopen.c (633)
-    return DBFCreateEx( pszFilename, "LDID/87" ); // 0x57
+    return DBFCreateEx( pszFilename, NULL );

When "LDID/0" was passed to DBFCreateEx as pszCodepage (the second argument), the OGR Shapefile driver failed to recode from UTF-8 to LDID/0, so passing NULL is the better choice.

comment:2 by akaginch, 12 years ago

Description: modified (diff)

comment:3 by akaginch, 12 years ago

Description: modified (diff)

comment:4 by akaginch, 12 years ago

Resolution: wontfix
Status: new → closed

I have proposed another solution (see #4808). It will significantly reduce the need for users to edit the LDID value or the CPG file. I cannot confirm the issues cited in the description of this ticket because I have not encountered them myself, so I would like to close this ticket.
