[PATCH] Shapefile: interpreting LDID/87 not as ISO-8859-1 but as no codepage specified
|Reported by:||akaginch||Owned by:||warmerdam|
|Severity:||normal||Keywords:||Language driver ID|
Description
What does LDID/87 mean? The page cited in the source code (ogrshapelayer.cpp) describes LDID/87 as "Current ANSI Codepage", but the Shapefile driver treats it as ISO-8859-1.
When a Shapefile is created, the default LDID is "87". If we create a Shapefile without specifying the ENCODING option, the generated DBF file carries this LDID value. The OGR Shapefile driver then recodes attribute strings from UTF-8 to ISO-8859-1 when the user writes features into the shapefile.
OGR's encoding conversion is useful, but it causes a problem because many applications that use GDAL have not yet adapted to it. As a result, the attribute strings written by such applications are garbled.
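The garbling can be illustrated with a small standalone sketch. It is not GDAL code: `Utf8ToLatin1` is a hypothetical, simplified stand-in for a UTF-8 to ISO-8859-1 recode, and the replacement of unconvertible bytes with `?` is an assumption made for illustration. It shows that genuine UTF-8 input converts cleanly, while bytes in some other encoding (Shift_JIS is assumed here as an example of what an unadapted application might pass) are destroyed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a UTF-8 -> ISO-8859-1 recode (NOT GDAL's CPLRecode).
// Assumption: malformed sequences and code points above U+00FF become '?'.
std::string Utf8ToLatin1(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size();)
    {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80)
        {
            // Plain ASCII is identical in both encodings.
            out += static_cast<char>(c);
            i += 1;
        }
        else if ((c & 0xE0) == 0xC0 && i + 1 < in.size() &&
                 (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80)
        {
            // Well-formed two-byte sequence: decode the code point.
            const unsigned cp = ((c & 0x1Fu) << 6) |
                                (static_cast<unsigned char>(in[i + 1]) & 0x3Fu);
            out += (cp >= 0x80 && cp <= 0xFF) ? static_cast<char>(cp) : '?';
            i += 2;
        }
        else
        {
            // Anything else cannot be represented in this simplified sketch.
            out += '?';
            i += 1;
        }
    }
    return out;
}
```

With this sketch, `Utf8ToLatin1("\xC3\xA9")` (UTF-8 "é") yields the single byte `0xE9`, whereas the two Shift_JIS bytes `"\x93\x8C"` come back as `"??"`, which is the kind of loss described above.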
I would therefore like to propose that the Shapefile driver interpret LDID/87 as "no codepage specified". With this change, the driver does not convert character encodings unless the ENCODING option is specified. It also makes it possible to handle, without any problem, a Shapefile that has "87" in the LDID field but stores its attribute strings in an encoding other than ISO-8859-1.
gdal/ogr/ogrsf_frmts/shape/ogrshapelayer.cpp
CPLString OGRShapeLayer::ConvertCodePage( const char *pszCodePage )
line 221
-    case 87: return CPL_ENC_ISO8859_1;
+    case 87: return osEncoding;
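The effect of the one-line change can be sketched in isolation. The function below is a hypothetical simplification, not the driver's `ConvertCodePage`: it takes the numeric LDID directly, uses `std::string` instead of `CPLString`, and omits every other LDID value. It assumes that `osEncoding` holds the layer's current encoding (empty unless the ENCODING option was given) and that `CPL_ENC_ISO8859_1` expands to `"ISO-8859-1"`.

```cpp
#include <string>

// Hypothetical sketch of the lookup before and after the patch.
// osCurrent stands in for the layer's osEncoding member (assumed to be
// empty when no ENCODING option was specified).
std::string EncodingForLdid(int nLdid, const std::string &osCurrent,
                            bool bProposed)
{
    switch (nLdid)
    {
        case 87:
            // Current behaviour: always ISO-8859-1.
            // Proposed behaviour: keep whatever encoding is already set.
            return bProposed ? osCurrent : std::string("ISO-8859-1");
        default:
            // All other LDID values are left out of this sketch.
            return osCurrent;
    }
}
```

Under these assumptions, an LDID of 87 with no ENCODING option yields an empty encoding after the patch, so no recoding takes place, while an explicitly specified encoding is passed through unchanged.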
Change History (7)
follow-up: 5 comment:4 by , 10 years ago
|Summary:||Shapefile: interpreting LDID/87 not as ISO-8859-1 but as no codepage specified → [PATCH] Shapefile: interpreting LDID/87 not as ISO-8859-1 but as no codepage specified|