Opened 5 years ago
Last modified 5 years ago
#3925 new defect
winGRASS 7.8.1dev: 'charmap' codec can't decode byte 0x9d - issue in vector attribute data handling (e.g. opening attribute table, v.report, etc)
Reported by: | hellik | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 7.8.3 |
Component: | Vector | Version: | git-releasebranch78 |
Keywords: | python3, py3, wingrass | Cc: | |
CPU: | x86-64 | Platform: | MSWindows |
Description
tested with
GRASS Version: 7.8.1dev
Code revision: d1c4ad132
Build date: 2019-10-22
Build platform: x86_64-w64-mingw32
GDAL: 2.4.1
PROJ: 5.2.0
GEOS: 3.8.0
SQLite: 3.29.0
Python: 3.7.0
wxPython: 4.0.3
Platform: Windows-10-10.0.18362-SP0 (OSGeo4W)
downloaded data from geonames.org and imported data by v.in.geonames
v.report map=at_out@data option=coor
Traceback (most recent call last):
  File "C:\OSGEO4~1\apps\grass\grass78/scripts/v.report.py", line 226, in <module>
    main()
  File "C:\OSGEO4~1\apps\grass\grass78/scripts/v.report.py", line 108, in main
    cols = decode(line).rstrip('\r\n').split('|')
  File "C:\OSGEO4~1\apps\grass\grass78\etc\python\grass\script\utils.py", line 193, in decode
    return bytes_.decode(enc)
  File "C:\OSGEO4~1\apps\Python37\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 195: character maps to <undefined>
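The error in the traceback can be reproduced outside GRASS. The snippet below is a minimal sketch, not taken from the ticket: it assumes the offending byte comes from UTF-8 encoded text, and uses U+201D (right double quotation mark) purely as a hypothetical sample, since it is one of several characters whose UTF-8 form contains 0x9D. Which character in the geonames data actually triggered the error is not known from the report.

```python
# Hypothetical sample: U+201D is only one example of a character whose
# UTF-8 encoding contains the byte 0x9D (b'\xe2\x80\x9d').
raw = "\u201d".encode("utf-8")

try:
    # cp1252 has no mapping for 0x9D, so this raises UnicodeDecodeError
    # with the same "'charmap' codec can't decode byte 0x9d" message.
    raw.decode("cp1252")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(repr(error))

# Decoding the same bytes with the encoding they were written in works.
print(repr(raw.decode("utf-8")))
```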
Change History (14)
comment:1 by , 5 years ago
Priority: | major → blocker |
---|---|
Summary: | v.report - UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d → winGRASS 7.8.1dev: 'charmap' codec can't decode byte 0x9d - issue in vector attribute data handling (e.g. opening attribute table, v.report, etc) |
comment:2 by , 5 years ago
comment:3 by , 5 years ago
maybe related #3220 WinGRASS not recognizing accented utf-8 (nor cp1252) attribute values
follow-up: 5 comment:4 by , 5 years ago
v.db.select output of the geonames data:
v.db.select map=at_geonames@data
cat|geonameid|name|asciiname|alternatename|latitude|longitude|featureclass|featurecode|countrycode|cc2|admin1code|admin2code|admin3code|admin4code|population|elevation|gtopo30|timezone|modification
1|2598245|Sandgatterl|Sandgatterl||47.75|14.56667|T|PASS|AT||04|415|41522||0||1490|Europe/Vienna|2014-05-02
2|2598246|Viehtalalm|Viehtalalm||47.75|14.56667|L|GRAZ|AT||04||||0||1490|Europe/Vienna|1999-04-30
3|2598247|Adlmoarstein|Adlmoarstein||47.75|14.55|T|CLF|AT||04||||0||1023|Europe/Vienna|1999-04-30
4|2598248|Waldbaueralm|Waldbaueralm||47.75|14.56667|L|GRAZ|AT||04||||0||1490|Europe/Vienna|1999-04-30
5|2598249|Federeck|Federeck||47.75|14.56667|T|PK|AT||04|415|41522||0||1490|Europe/Vienna|2014-05-02
6|2598250|Mooshöhe|Mooshoehe||47.75|14.55|P|PPL|AT||04|415|41522||0||1023|Europe/Vienna|2014-05-02
7|2598251|Antonihütte|Antonihuette||47.75|14.53333|S|HUT|AT||04|415|41522||0||866|Europe/Vienna|2014-05-02
8|2598252|Bergeralm|Bergeralm||47.75|14.51667|L|GRAZ|AT||04||||0||780|Europe/Vienna|1999-04-30
9|2598253|Blabergalm|Blabergalm||47.75|14.5|L|GRAZ|AT||04||||0||885|Europe/Vienna|1999-04-30
10|2598254|Nattereck|Nattereck||47.75|14.48333|T|PK|AT||04|409|40914||0||721|Europe/Vienna|2014-05-02
11|2598255|Langeck|Langeck||47.75|14.48333|T|PK|AT||04|409|40914||0||721|Europe/Vienna|2014-05-02
12|2598256|Zorngraben|Zorngraben||47.75|14.46667|H|STMI|AT||04||||0||951|Europe/Vienna|1999-04-30
13|2598257|Gugler|Gugler||47.75|14.45|T|PK|AT||04|409|40914||0||1019|Europe/Vienna|2014-05-02
14|2598258|Zorngrabenklause|Zorngrabenklause||47.75|14.45|T|SLP|AT||04||||0||1019|Europe/Vienna|1999-04-30
15|2598259|Sitzenbacher Klause|Sitzenbacher Klause||47.75|14.45|T|PK|AT||04|409|40914||0||1019|Europe/Vienna|2014-05-02
16|2598260|Sitzenbachhütte|Sitzenbachhuette||47.75|14.45|S|HUT|AT||04|409|40914||0||1019|Europe/Vienna|2014-05-02
17|2598261|Deckleitnerbach|Deckleitnerbach||47.75|14.45|H|STM|AT||04||||0||1019|Europe/Vienna|1999-04-30
18|2598262|Hundseck|Hundseck||47.75|14.41667|T|PK|AT||04|409|40914||0||1120|Europe/Vienna|2014-05-02
19|2598263|Schafgraben|Schafgraben||47.75|14.4|H|STMI|AT||04||||0||1081|Europe/Vienna|1999-04-30
20|2598264|Maierreut|Maierreut||47.75|14.4|L|GRAZ|AT||04||||0||1081|Europe/Vienna|1999-04-30
21|2598265|Rumpelmayrreut|Rumpelmayrreut||47.75|14.38333|L|GRAZ|AT||04||||0||1094|Europe/Vienna|1999-04-30
22|2598266|Bloßboden|Blossboden||47.75|14.36667|T|SLP|AT||04||||0||1439|Europe/Vienna|1999-04-30
23|2598267|Weiße Ries|Weisse Ries||47.75|14.36667|T|CLF|AT||04||||0||1439|Europe/Vienna|1999-04-30
v.db.select seems to work, but some encoding issues also there, e.g. Weiße Ries|Weisse Ries
follow-up: 12 comment:5 by , 5 years ago
Replying to hellik:
v.db.select seems to work, but some encoding issues also there, e.g. Weiße Ries|Weisse Ries
starting v.db.select pops up the same encoding error:
Exception in thread Thread-20:
Traceback (most recent call last):
  File "C:\OSGEO4~1\apps\Python37\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gconsole.py", line 162, in run
    self.resultQ.put((requestId, self.requestCmd.run()))
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gcmd.py", line 606, in run
    self._redirect_stream()
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gcmd.py", line 631, in _redirect_stream
    line = recv_some(self.module, e=0, stderr=0)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gcmd.py", line 335, in recv_some
    y.append(decode(r))
  File "C:\OSGEO4~1\apps\grass\grass78\etc\python\grass\script\utils.py", line 193, in decode
    return bytes_.decode(enc)
  File "C:\OSGEO4~1\apps\Python37\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 840: character maps to <undefined>
follow-up: 7 comment:6 by , 5 years ago
This is a larger problem. The geonames data are encoded in UTF-8, but the decode function uses the local encoding. There is GRASS_DB_ENCODING, which should in theory help, but it's not used in v.report, and more generally it's not tied to individual tables and it's not user-friendly. One practical way, which wouldn't solve this for all cases but perhaps for the majority, is to have a new decode function for attribute data, which would first try GRASS_DB_ENCODING if specified, then try decoding with the local encoding, and if that doesn't work, use UTF-8.
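The fallback order described above could look roughly like the following. This is only a sketch under stated assumptions, not code from GRASS: the function name `decode_attribute` and its `env` and `local_encoding` parameters are hypothetical, added so the behaviour can be tried without touching the real environment or locale.

```python
import locale
import os


def decode_attribute(bytes_, env=None, local_encoding=None):
    """Hypothetical helper: decode attribute bytes by trying
    GRASS_DB_ENCODING (if set), then the local encoding, then UTF-8."""
    env = os.environ if env is None else env
    if local_encoding is None:
        local_encoding = locale.getpreferredencoding(False)

    candidates = []
    db_encoding = env.get("GRASS_DB_ENCODING")
    if db_encoding:
        candidates.append(db_encoding)
    candidates.append(local_encoding)
    candidates.append("utf-8")

    for enc in candidates:
        try:
            return bytes_.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Undecodable bytes or an unknown encoding name: try the next one.
            continue

    # Last resort (an assumption of this sketch): never raise,
    # mark undecodable bytes with U+FFFD instead.
    return bytes_.decode("utf-8", errors="replace")


# UTF-8 bytes containing 0x9D fail under cp1252 and fall through to UTF-8.
print(repr(decode_attribute("\u201d".encode("utf-8"), env={}, local_encoding="cp1252")))
```

One caveat with this order: Python's cp1252 codec leaves only five byte values undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so UTF-8 data that avoids those bytes decodes "successfully" under cp1252 and comes out as mojibake rather than reaching the UTF-8 step. That matches the remark that this would not solve all cases.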
follow-up: 8 comment:7 by , 5 years ago
Replying to annakrat:
One practical way, which wouldn't solve this for all cases but perhaps for the majority, is to have a new decode function for attribute data, which would first try GRASS_DB_ENCODING if specified, then try decoding with the local encoding, and if that doesn't work, use UTF-8.
Conditional GRASS_DB_ENCODING which might be used for inspiration in this regard:
comment:8 by , 5 years ago
Replying to neteler:
Conditional GRASS_DB_ENCODING which might be used for inspiration in this regard:
Based on this, if you don't have a DB encoding specified (in GUI preferences or through the environment variable), then it uses UTF-8. That's fine on systems with UTF-8, but on Windows? Should it use the local encoding instead? Since we need to work with Python 3 and unicode strings, the garbage-in, garbage-out approach doesn't work anymore, and at the same time we don't know the encoding of the attributes.
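To make the two options concrete, here is a small sketch of how the default could be chosen. It is an illustration only: `default_db_encoding` and the `prefer_local_on_windows` switch are hypothetical names, and the UTF-8 default simply mirrors the behaviour described above rather than any specific GRASS code.

```python
import locale
import os
import sys


def default_db_encoding(env=None, prefer_local_on_windows=False):
    """Hypothetical helper: pick the encoding used for attribute data."""
    env = os.environ if env is None else env

    # An explicitly configured encoding always wins.
    configured = env.get("GRASS_DB_ENCODING")
    if configured:
        return configured

    # Option under discussion: use the local encoding on Windows.
    if prefer_local_on_windows and sys.platform == "win32":
        return locale.getpreferredencoding(False)

    # Behaviour described in the comment: default to UTF-8.
    return "utf-8"


print(default_db_encoding(env={}))
print(default_db_encoding(env={"GRASS_DB_ENCODING": "cp1250"}))
```

Either default is still a guess, since the actual encoding of a given attribute table is not recorded anywhere.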
comment:11 by , 5 years ago
Milestone: | → 7.8.3 |
---|
comment:14 by , 5 years ago
Priority: | blocker → major |
---|
There have been some fixes in the GUI, which could help a little bit. The general problem persists, but I don't think it's a blocker.
now tested on a Windows 10 box with a German locale:
download geonames data of an AT dump
clicking on a point with a German umlaut
trying v.report
or trying to open the vector attribute table
opening the table freezes the attribute table window
it seems to be an encoding issue of attribute data handling