Opened 8 years ago
Closed 4 years ago
#5963 closed defect (wontfix)
Reading shape layer attributes in Chinese on a Chinese operating system returns wrong values when using C#
|Reported by:||joker99||Owned by:||tamas|
The bug is in the C# bindings, specifically in the Utf8BytesToString function in the Ogr.cs file.
The function tries to get the length of the unmanaged buffer using this line:
int length = Marshal.PtrToStringAnsi(pNativeData).Length;
It assumes that the number of characters in the string equals the number of bytes. This is wrong: it holds only for pure ASCII content, not in other cases, including ANSI strings in a multi-byte code page.
PtrToStringAnsi decodes the unmanaged buffer using the system default ANSI code page (set in Windows under Regional Settings -> Administrative -> Language for non-Unicode programs). On a Chinese operating system the default code page is not ASCII but GB2312 or something similar. Decoded under that code page, a UTF-8 encoded Chinese string does not yield one character per byte, so PtrToStringAnsi(...).Length does not return the number of bytes in the buffer.
The fix is to count the bytes up to the null terminator manually, i.e.
int length = 0; while (Marshal.ReadByte(pNativeData, length) != 0) length++;
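For illustration, the fix above could be placed in a complete helper along the following lines. This is only a sketch, not the actual code in Ogr.cs: the class names Utf8Marshal and Demo are hypothetical, and it assumes the native buffer is a null-terminated UTF-8 string.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper class; the real function lives in the generated Ogr.cs.
static class Utf8Marshal
{
    // Assumes pNativeData points to a null-terminated UTF-8 buffer.
    public static string Utf8BytesToString(IntPtr pNativeData)
    {
        if (pNativeData == IntPtr.Zero)
            return null;

        // Count raw bytes up to the terminating zero byte,
        // independent of the system ANSI code page.
        int length = 0;
        while (Marshal.ReadByte(pNativeData, length) != 0)
            length++;

        // Copy exactly that many bytes and decode them as UTF-8.
        byte[] buffer = new byte[length];
        Marshal.Copy(pNativeData, buffer, 0, length);
        return Encoding.UTF8.GetString(buffer);
    }
}

static class Demo
{
    static void Main()
    {
        // Two Chinese characters, which take 6 bytes in UTF-8.
        string original = "\u4E2D\u6587";
        byte[] utf8 = Encoding.UTF8.GetBytes(original);

        // Build a null-terminated unmanaged copy to simulate the native string.
        IntPtr p = Marshal.AllocHGlobal(utf8.Length + 1);
        try
        {
            Marshal.Copy(utf8, 0, p, utf8.Length);
            Marshal.WriteByte(p, utf8.Length, 0);
            Console.WriteLine(Utf8Marshal.Utf8BytesToString(p) == original);
        }
        finally
        {
            Marshal.FreeHGlobal(p);
        }
    }
}
```

Because the length is taken from the raw bytes, the result should not depend on the regional settings of the machine.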
To reproduce, it is enough to change "Language for non-Unicode programs" to Chinese on an English Windows installation.
Attached are a UTF-8 encoded shapefile with Chinese attribute values and a PNG that shows the actual values of the attributes.
Change History (3)
by , 8 years ago
by , 8 years ago
actual values that should be returned
comment:1 by , 4 years ago
|Status:||new → closed|
This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
shape file with chinese attributes