Opened 5 years ago
Last modified 4 years ago
#3925 new defect
winGRASS 7.8.1dev: 'charmap' codec can't decode byte 0x9d - issue in vector attribute data handling (e.g. opening attribute table, v.report, etc)
| Reported by: | hellik | Owned by: | |
|---|---|---|---|
| Priority: | major | Milestone: | 7.8.3 |
| Component: | Vector | Version: | git-releasebranch78 |
| Keywords: | python3, py3, wingrass | Cc: | |
| CPU: | x86-64 | Platform: | MSWindows |
Description
tested with
GRASS Version: 7.8.1dev
Code revision: d1c4ad132
Build date: 2019-10-22
Build platform: x86_64-w64-mingw32
GDAL: 2.4.1
PROJ: 5.2.0
GEOS: 3.8.0
SQLite: 3.29.0
Python: 3.7.0
wxPython: 4.0.3
Platform: Windows-10-10.0.18362-SP0 (OSGeo4W)
downloaded data from geonames.org and imported it with v.in.geonames
v.report map=at_out@data option=coor
Traceback (most recent call last):
  File "C:\OSGEO4~1\apps\grass\grass78/scripts/v.report.py", line 226, in <module>
    main()
  File "C:\OSGEO4~1\apps\grass\grass78/scripts/v.report.py", line 108, in main
    cols = decode(line).rstrip('\r\n').split('|')
  File "C:\OSGEO4~1\apps\grass\grass78\etc\python\grass\script\utils.py", line 193, in decode
    return bytes_.decode(enc)
  File "C:\OSGEO4~1\apps\Python37\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 195: character maps to <undefined>
Change History (14)
comment:1 by , 5 years ago
| Priority: | major → blocker |
|---|---|
| Summary: | v.report - UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d → winGRASS 7.8.1dev: 'charmap' codec can't decode byte 0x9d - issue in vector attribute data handling (e.g. opening attribute table, v.report, etc) |
comment:2 by , 5 years ago
comment:3 by , 5 years ago
Maybe related: #3220, WinGRASS not recognizing accented utf-8 (nor cp1252) attribute values
follow-up: 5 comment:4 by , 5 years ago
v.db.select output of the geonames data:
v.db.select map=at_geonames@data
cat|geonameid|name|asciiname|alternatename|latitude|longitude|featureclass|featurecode|countrycode|cc2|admin1code|admin2code|admin3code|admin4code|population|elevation|gtopo30|timezone|modification
1|2598245|Sandgatterl|Sandgatterl||47.75|14.56667|T|PASS|AT||04|415|41522||0||1490|Europe/Vienna|2014-05-02
2|2598246|Viehtalalm|Viehtalalm||47.75|14.56667|L|GRAZ|AT||04||||0||1490|Europe/Vienna|1999-04-30
3|2598247|Adlmoarstein|Adlmoarstein||47.75|14.55|T|CLF|AT||04||||0||1023|Europe/Vienna|1999-04-30
4|2598248|Waldbaueralm|Waldbaueralm||47.75|14.56667|L|GRAZ|AT||04||||0||1490|Europe/Vienna|1999-04-30
5|2598249|Federeck|Federeck||47.75|14.56667|T|PK|AT||04|415|41522||0||1490|Europe/Vienna|2014-05-02
6|2598250|Mooshöhe|Mooshoehe||47.75|14.55|P|PPL|AT||04|415|41522||0||1023|Europe/Vienna|2014-05-02
7|2598251|Antonihütte|Antonihuette||47.75|14.53333|S|HUT|AT||04|415|41522||0||866|Europe/Vienna|2014-05-02
8|2598252|Bergeralm|Bergeralm||47.75|14.51667|L|GRAZ|AT||04||||0||780|Europe/Vienna|1999-04-30
9|2598253|Blabergalm|Blabergalm||47.75|14.5|L|GRAZ|AT||04||||0||885|Europe/Vienna|1999-04-30
10|2598254|Nattereck|Nattereck||47.75|14.48333|T|PK|AT||04|409|40914||0||721|Europe/Vienna|2014-05-02
11|2598255|Langeck|Langeck||47.75|14.48333|T|PK|AT||04|409|40914||0||721|Europe/Vienna|2014-05-02
12|2598256|Zorngraben|Zorngraben||47.75|14.46667|H|STMI|AT||04||||0||951|Europe/Vienna|1999-04-30
13|2598257|Gugler|Gugler||47.75|14.45|T|PK|AT||04|409|40914||0||1019|Europe/Vienna|2014-05-02
14|2598258|Zorngrabenklause|Zorngrabenklause||47.75|14.45|T|SLP|AT||04||||0||1019|Europe/Vienna|1999-04-30
15|2598259|Sitzenbacher Klause|Sitzenbacher Klause||47.75|14.45|T|PK|AT||04|409|40914||0||1019|Europe/Vienna|2014-05-02
16|2598260|Sitzenbachhütte|Sitzenbachhuette||47.75|14.45|S|HUT|AT||04|409|40914||0||1019|Europe/Vienna|2014-05-02
17|2598261|Deckleitnerbach|Deckleitnerbach||47.75|14.45|H|STM|AT||04||||0||1019|Europe/Vienna|1999-04-30
18|2598262|Hundseck|Hundseck||47.75|14.41667|T|PK|AT||04|409|40914||0||1120|Europe/Vienna|2014-05-02
19|2598263|Schafgraben|Schafgraben||47.75|14.4|H|STMI|AT||04||||0||1081|Europe/Vienna|1999-04-30
20|2598264|Maierreut|Maierreut||47.75|14.4|L|GRAZ|AT||04||||0||1081|Europe/Vienna|1999-04-30
21|2598265|Rumpelmayrreut|Rumpelmayrreut||47.75|14.38333|L|GRAZ|AT||04||||0||1094|Europe/Vienna|1999-04-30
22|2598266|Bloßboden|Blossboden||47.75|14.36667|T|SLP|AT||04||||0||1439|Europe/Vienna|1999-04-30
23|2598267|Weiße Ries|Weisse Ries||47.75|14.36667|T|CLF|AT||04||||0||1439|Europe/Vienna|1999-04-30
v.db.select seems to work, but there are some encoding issues there as well, e.g. Weiße Ries|Weisse Ries
follow-up: 12 comment:5 by , 5 years ago
Replying to hellik:
starting v.db.select pops up the same encoding error:
Exception in thread Thread-20:
Traceback (most recent call last):
  File "C:\OSGEO4~1\apps\Python37\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gconsole.py", line 162, in run
    self.resultQ.put((requestId, self.requestCmd.run()))
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gcmd.py", line 606, in run
    self._redirect_stream()
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gcmd.py", line 631, in _redirect_stream
    line = recv_some(self.module, e=0, stderr=0)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\core\gcmd.py", line 335, in recv_some
    y.append(decode(r))
  File "C:\OSGEO4~1\apps\grass\grass78\etc\python\grass\script\utils.py", line 193, in decode
    return bytes_.decode(enc)
  File "C:\OSGEO4~1\apps\Python37\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 840: character maps to <undefined>
follow-up: 7 comment:6 by , 5 years ago
This is a larger problem. The geonames data are encoded in UTF-8, but the decode function uses the local encoding. There is GRASS_DB_ENCODING, which should in theory help, but it's not used in v.report; more generally, it's not tied to individual tables and it's not user-friendly. One practical way, which wouldn't solve this in all cases but perhaps in the majority, is to have a new decode function for attribute data, which would first try GRASS_DB_ENCODING if specified, then try decoding with the local encoding, and if that doesn't work, use UTF-8.
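The fallback order proposed above could be sketched roughly as follows. This is only an illustration under assumptions: `decode_attribute` and its `env` / `local_encoding` parameters are hypothetical names (not part of `grass.script.utils`), and the final `errors="replace"` step is an added assumption so that the helper never raises.

```python
import locale
import os


def decode_attribute(data, env=None, local_encoding=None):
    """Hypothetical helper (not an existing GRASS function).

    Tries GRASS_DB_ENCODING if set, then the local encoding,
    and finally falls back to UTF-8.
    """
    if isinstance(data, str):
        return data  # already text, nothing to decode

    env = os.environ if env is None else env
    if local_encoding is None:
        local_encoding = locale.getpreferredencoding(False)

    candidates = []
    if env.get("GRASS_DB_ENCODING"):
        candidates.append(env["GRASS_DB_ENCODING"])
    candidates.append(local_encoding)

    for enc in candidates:
        try:
            return data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Undecodable bytes or unknown encoding name: try the next one.
            continue

    # Assumption: replace undecodable bytes instead of raising.
    return data.decode("utf-8", errors="replace")
```

Note the limitation already mentioned: with a cp1252 locale and no GRASS_DB_ENCODING, UTF-8 bytes that happen to be valid cp1252 (e.g. the two bytes of "ö") decode without error into wrong characters, so this only catches the cases where the local decoding actually fails.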
follow-up: 8 comment:7 by , 5 years ago
Replying to annakrat:
One practical way, which wouldn't solve this in all cases but perhaps in the majority, is to have a new decode function for attribute data, which would first try GRASS_DB_ENCODING if specified, then try decoding with the local encoding, and if that doesn't work, use UTF-8.
Conditional GRASS_DB_ENCODING handling, which might be used for inspiration in this regard:
comment:8 by , 5 years ago
Replying to neteler:
Conditional GRASS_DB_ENCODING handling, which might be used for inspiration in this regard:
Based on this, if you don't have a DB encoding specified (in the GUI preferences or through the environment variable), then it uses UTF-8. That's fine on systems with UTF-8, but on Windows? Should it use the local encoding instead? Since we need to work with Python 3 and Unicode strings, garbage in, garbage out no longer works, and at the same time we don't know the encoding of the attributes.
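For illustration, the error from the tracebacks can be reproduced outside GRASS with a few lines of Python 3. This is a minimal sketch; `try_decode` is a hypothetical helper, and "Ý" is just one example of a character whose UTF-8 encoding contains byte 0x9d (the actual offending character in the geonames table is not known from this ticket).

```python
def try_decode(data, encoding):
    """Return the decoded text, or a short error description on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        return "UnicodeDecodeError: " + err.reason


# "Ý" (U+00DD) is encoded in UTF-8 as the two bytes 0xC3 0x9D.
sample = "Ý".encode("utf-8")

# Decoding with the encoding the data was written in works.
print(repr(try_decode(sample, "utf-8")))

# Python's cp1252 codec leaves 0x9d undefined, so strict decoding fails.
print(repr(try_decode(sample, "cp1252")))

# latin-1 maps every byte to a character, so it never fails,
# but the result is not the original text.
print(repr(try_decode(sample, "latin-1")))
```

Bytes such as the UTF-8 encoding of "ö" (0xC3 0xB6) are all defined in cp1252, which is why most umlauts come out as wrong characters rather than raising an error.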
comment:11 by , 5 years ago
| Milestone: | → 7.8.3 |
|---|
comment:14 by , 4 years ago
| Priority: | blocker → major |
|---|
There have been some fixes in the GUI, which could help a little. The general problem persists, but I don't think it's a blocker.

Now tested on a Windows 10 box with a German locale:
Downloaded the geonames data of an AT dump.
Clicking on a point whose name contains a German umlaut:
east, north: 11.31700474110463, 47.23295282435024
at_geonames@data:
Type: Point
Id: 10136
Layer: 1
Category: 10136
Driver: sqlite
Database: D:\grassdata\loc_test_vingeonames\data\sqlite\sqlite.db
Table: at_geonames
Key_column: cat
Attributes:
cat: 10136
geonameid: 2762446
name: Vellenberg
asciiname: Vellenberg
latitude: 47.23333
longitude: 11.31667
featureclass: S
featurecode: FRM
countrycode: AT
admin1code: 07
admin2code: 703
admin3code: 70312
population: 0
gtopo30: 865
timezone: Europe/Vienna
modification: 2014-05-03
at_geonames@data:
Type: Point
Id: 25781
Layer: 1
Category: 25781
Driver: sqlite
Database: D:\grassdata\loc_test_vingeonames\data\sqlite\sqlite.db
Table: at_geonames
Key_column: cat
Attributes:
cat: 25781
geonameid: 2778215
name: Götznerberg
asciiname: Goetznerberg
latitude: 47.23333
longitude: 11.31667
featureclass: S
featurecode: FRM
countrycode: AT
admin1code: 07
admin2code: 703
admin3code: 70312
population: 0
gtopo30: 865
timezone: Europe/Vienna
modification: 2014-05-03
Trying v.report:
v.report map=at_geonames@data option=coor
Traceback (most recent call last):
  File "C:\OSGEO4~1\apps\grass\grass78/scripts/v.report.py", line 226, in <module>
    main()
  File "C:\OSGEO4~1\apps\grass\grass78/scripts/v.report.py", line 108, in main
    cols = decode(line).rstrip('\r\n').split('|')
  File "C:\OSGEO4~1\apps\grass\grass78\etc\python\grass\script\utils.py", line 193, in decode
    return bytes_.decode(enc)
  File "C:\OSGEO4~1\apps\Python37\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 195: character maps to <undefined>
Or trying to open the vector attribute table:
Traceback (most recent call last):
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\lmgr\frame.py", line 2060, in OnShowAttributeTable
    selection=selection)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\dbmgr\manager.py", line 112, in __init__
    self.CreateDbMgrPage(parent=self, pageName='browse')
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\dbmgr\base.py", line 811, in CreateDbMgrPage
    parent=parent, parentDbMgrBase=self, onlyLayer=onlyLayer)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\dbmgr\base.py", line 1095, in __init__
    self.AddLayer(layer)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\dbmgr\base.py", line 1138, in AddLayer
    self.dbMgrData, layer, self.pages)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\dbmgr\base.py", line 113, in __init__
    keyColumn = self.LoadData(layer)
  File "C:\OSGEO4~1\apps\grass\grass78\gui\wxpython\dbmgr\base.py", line 278, in LoadData
    record = decode(outFile.readline().strip()).replace('\n', '')
  File "C:\OSGEO4~1\apps\grass\grass78\etc\python\grass\script\utils.py", line 193, in decode
    return bytes_.decode(enc)
  File "C:\OSGEO4~1\apps\Python37\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 219: character maps to <undefined>
Opening the table freezes the attribute table window.
It seems to be an encoding issue in attribute data handling.