Opened 11 years ago

Closed 9 years ago

Last modified 9 years ago

#2120 closed defect (fixed)

wxgui: encoding errors

Reported by: mlennert Owned by: grass-dev@…
Priority: major Milestone: 6.4.6
Component: wxGUI Version: svn-releasebranch64
Keywords: locale encoding Cc:
CPU: All Platform: All

Description

Using grass64release in French (locale: fr_BE.utf8), I seem to be seeing more encoding errors than before in the wxGUI. Examples are when running v.in.ogr, r.in.gdal, v.build. In fact whenever a module emits a message in the locale's language which contains special characters, I get errors such as

'ascii' codec can't decode byte 0xef in position 16: ordinal not in range(128)

I don't get such errors on the command line.

These errors do not keep the module from functioning, but it makes reading its messages nearly impossible unless you really know what to look for.

I don't have the time now to go back and try to find out if this worked better before, but I do think so. It might just be that more strings have been translated to French and that is why I see these errors now.

I think this also adds to the recent discussions about testing: if we want to take multi-locale usage of GRASS seriously, we probably need a series of tests that developers can run to make sure their changes do not create such encoding problems.

Moritz

Change History (18)

comment:1 by marisn, 11 years ago

Is this still an issue in GRASS 7? There has been huge progress in ironing out encoding issues in wxGUI.

in reply to:  1 ; comment:2 by mlennert, 11 years ago

Replying to marisn:

Is this still an issue in GRASS 7? There has been huge progress in ironing out encoding issues in wxGUI.

I've been testing over the last few days, and everything seems to be working smoothly now in GRASS7. Great job!

However, in GRASS6release I still get the same errors.

Maybe some of what was done in grass7 should be backported to grass6?

Moritz

in reply to:  2 comment:3 by neteler, 10 years ago

Replying to mlennert:

I've been testing over the last few days, and everything seems to be working smoothly now in GRASS7. Great job!

However, in GRASS6release I still get the same errors. Maybe some of what was done in grass7 should be backported to grass6?

See also #1856

comment:4 by hcho, 10 years ago

I see the same error message in 7 trunk when I check "Redirect to console" in the wx monitor.

Traceback (most recent call last):
  File "/home/grass/trunk/dist.x86_64-unknown-linux-gnu/gui/wxpython/gui_core/query.py", line 65, in <lambda>
    self.redirect.Bind(wx.EVT_CHECKBOX, lambda evt: self._onRedirect(evt.IsChecked()))
  File "/home/grass/trunk/dist.x86_64-unknown-linux-gnu/gui/wxpython/gui_core/query.py", line 143, in _onRedirect
    self.redirectOutput.emit(output=self._textToRedirect())
  File "/home/grass/trunk/dist.x86_64-unknown-linux-gnu/gui/wxpython/gui_core/query.py", line 148, in _textToRedirect
    text = printResults(self._model, self._colNames[1])
  File "/home/grass/trunk/dist.x86_64-unknown-linux-gnu/gui/wxpython/gui_core/query.py", line 213, in printResults
    return '\n'.join(textList)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xea in position 2: ordinal not in range(128)

comment:5 by hcho, 10 years ago

I mean Redirect to console in Query results.

comment:6 by hcho, 10 years ago

Maybe, this error message is useful:

/usr/lib64/python2.7/site-packages/wx-2.8-gtk2-unicode/wx/lib/mixins/treemixin.py:463: UnicodeWarning: Unicode unequal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if getattr(self, 'Get%s'%attribute)(item, *args) != value:

comment:7 by hcho, 10 years ago

OK. I solved this issue by creating sitecustomize.py. Python's default encoding is 'ascii', so whenever it tries to compare a string (not u"...", but "...") with a translated message, which may not be ASCII, it complains about different encodings and throws an exception. My sitecustomize.py file looks like

import sys
sys.setdefaultencoding("utf-8")

and I put it in gui/wxpython.

I'm not sure if it's a good idea to include this file in GRASS SVN because it overrides Python's default and forces UTF-8 encoding. I don't know why the encoding setting in Preferences doesn't work. It seems like we cannot change the default encoding outside sitecustomize.py.

comment:8 by hcho, 10 years ago

FYI, Python 3's default encoding is UTF-8, so I guess it doesn't hurt to add sitecustomize.py in GRASS SVN.

comment:9 by hcho, 10 years ago

Fixed in r60307

in reply to:  7 comment:10 by glynn, 10 years ago

Replying to hcho:

I'm not sure if it's a good idea to include this file in GRASS SVN

It isn't.

It's working around a bug, namely that Python's default encoding is being used. Fix the bug, not the symptoms.

I don't know why the encoding setting in Preferences doesn't work. It seems like we cannot change the default encoding outside sitecustomize.py.

That's correct. The reason is related to the use of hashtables to implement dictionaries (including those which underlie Python objects).

When a Unicode object is hashed, if it can be converted to a byte string using the default encoding then it is, and the resulting string's hash is used. This allows a dictionary key which is a string to be matched using the string's Unicode equivalent, and vice-versa.

However, if the default encoding is changed after any non-ASCII keys have been added to dictionaries, any lookups for those keys would use the wrong hash and fail. To protect against this possibility, the site module deletes the setdefaultencoding() function from the sys module after loading the sitecustomize and/or usercustomize modules. Basically, once the interpreter is fully "up and running", the default encoding is fixed.

Note that Python's default encoding (as per sys.getdefaultencoding()) is only used for implicit conversions between str and unicode objects, e.g. via the str() and unicode() constructors, or via the C API (e.g. PyString_AsString). But not for filenames (see below).

It is not the same thing as the locale's default encoding as per locale.getpreferredencoding() or locale.getdefaultlocale(). Nor is it the same thing as the filesystem encoding as per sys.getfilesystemencoding(), which is used for converting filenames (from unicode to str on Unix, Mac and Windows 9x, from str to Unicode on Windows NT). These are what application code should be using.
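To make the distinction concrete, here is a minimal sketch (not part of the original comment) that reports the three encodings side by side. The helper name describe_encodings is hypothetical; the sys and locale calls are standard library functions. It also checks that setdefaultencoding() is not reachable from sys in a normally started interpreter, which is the case both on Python 2 (removed by the site module) and on Python 3 (where it does not exist).

```python
import locale
import sys


def describe_encodings():
    """Return the three distinct encodings discussed above (hypothetical helper)."""
    return {
        # Used only for implicit str/unicode conversions on Python 2.
        "default": sys.getdefaultencoding(),
        # What the user's locale asks for; False avoids calling setlocale().
        "locale": locale.getpreferredencoding(False),
        # Used for converting file names.
        "filesystem": sys.getfilesystemencoding(),
    }


encodings = describe_encodings()
for name in sorted(encodings):
    print("%-10s %s" % (name, encodings[name]))

# In a normally started interpreter the default encoding cannot be changed:
# the function is gone from the sys module.
print(hasattr(sys, "setdefaultencoding"))
```

On a Python 2 system with a UTF-8 locale, the "default" entry would typically differ from the other two, which is the mismatch behind the errors reported in this ticket.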

So, the implicit conversions should never be used by wxGUI. Conversions between Unicode strings (which is what wxPython normally uses) and byte strings should be explicit, with a specified encoding (e.g. that from the locale, unless information is available to indicate a different encoding).
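As an illustration of the explicit conversion recommended above, the following is a sketch under stated assumptions, not the wxGUI implementation. The function name decode_output is hypothetical, and using errors="replace" is a choice made here so that a GUI console can still show a message with an undecodable byte; strict decoding may be preferable elsewhere.

```python
import locale


def decode_output(data, encoding=None):
    """Convert module output to text using an explicit encoding.

    Hypothetical helper: byte strings are decoded, text is passed through.
    """
    if not isinstance(data, bytes):
        return data  # already text, nothing to convert
    if encoding is None:
        # Assumption: module messages are encoded according to the locale.
        encoding = locale.getpreferredencoding(False)
    return data.decode(encoding, errors="replace")


# A translated message as it might arrive from a module, encoded as UTF-8.
raw = u"Cr\xe9ation de la topologie".encode("utf-8")

# Decoding with the ASCII codec, which is what an implicit conversion does
# on Python 2, fails with the kind of error reported in this ticket.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Converting every item explicitly before joining avoids mixing types.
lines = [decode_output(raw, "utf-8"), decode_output(u"termin\xe9")]
joined = u"\n".join(lines)
```

Converting at the boundary, where bytes enter the GUI, keeps the rest of the code working with Unicode strings only.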

comment:11 by hcho, 10 years ago

CPU: Unspecified → All
Platform: Unspecified → All
Priority: normal → major

There seems to be a real fix other than changing the default encoding. Hopefully your explanation is helpful to the wxGUI devs for fixing the bug in the correct way.

Meanwhile the users who are bothered by the issue, including me, can work around it by creating usercustomize.py.

Also, I would say this is major or even a blocker, not normal, because some functionality does not work at all because of this issue. It's not simply an annoyance.

comment:12 by hcho, 10 years ago

For our records, we still have this bug in 7 trunk.

comment:13 by marisn, 10 years ago

Milestone: 6.4.4 → 7.0.1
Version: svn-releasebranch64 → 7.0.0

As fixing all issues would involve too invasive changes in GRASS 6.x maintenance releases, bumping up versions to 7.

Reporters: please file a separate bug report for each case, as it is hard to track progress by comments alone. Add the bug # here for easier tracking. As most of the issues are spotted on Windows running a non-English GRASS version, do NOT close any of them before testing with a translated version of GRASS (insert your favourite language) on Windows.

  • Raster query redirection to console issue reported as #2617
  • Raster query (no redirection) #2601
  • Save console output #2614
  • Command console fails if username is not ascii only #2390
  • Network analysis tool fails to start #2145

comment:14 by mlennert, 9 years ago

I know that this will probably not get fixed in the grass6 branch, but in view of the upcoming release of 6.4.5, I just wanted to note that this problem still exists. When I create a new map with any command (e.g. v.surf.idw, v.net, ...) and this new map is then automatically displayed, I get the following type of error message in the gui console:

v.surf.idw --overwrite input=elev_lid792_randpts@PERMANENT output=elev column=value

Traceback (most recent call last):
  File "/data/home/mlennert/SRC/GRASS/grass-6.4.5RC1/dist.x86_64-unknown-linux-gnu/etc/wxpython/gui_core/goutput.py", line 759, in OnCmdOutput
    self.cmdOutput.AddTextWrapped(message, wrap = None)
  File "/data/home/mlennert/SRC/GRASS/grass-6.4.5RC1/dist.x86_64-unknown-linux-gnu/etc/wxpython/gui_core/goutput.py", line 1238, in AddTextWrapped
    txt = EncodeString(txt)
  File "/data/home/mlennert/SRC/GRASS/grass-6.4.5RC1/dist.x86_64-unknown-linux-gnu/etc/wxpython/core/gcmd.py", line 103, in EncodeString
    return string.encode(enc)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 22: ordinal not in range(128)

v.net op=connect

Traceback (most recent call last):
  File "/data/home/mlennert/SRC/GRASS/grass-6.4.5RC1/dist.x86_64-unknown-linux-gnu/etc/wxpython/gui_core/goutput.py", line 759, in OnCmdOutput
    self.cmdOutput.AddTextWrapped(message, wrap = None)
  File "/data/home/mlennert/SRC/GRASS/grass-6.4.5RC1/dist.x86_64-unknown-linux-gnu/etc/wxpython/gui_core/goutput.py", line 1238, in AddTextWrapped
    txt = EncodeString(txt)
  File "/data/home/mlennert/SRC/GRASS/grass-6.4.5RC1/dist.x86_64-unknown-linux-gnu/etc/wxpython/core/gcmd.py", line 103, in EncodeString
    return string.encode(enc)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 16: ordinal not in range(128)

All this in a French locale setting:

> locale
LANG=fr_BE.utf8
LANGUAGE=fr_BE.utf8
LC_CTYPE="fr_BE.utf8"
LC_NUMERIC="fr_BE.utf8"
LC_TIME="fr_BE.utf8"
LC_COLLATE="fr_BE.utf8"
LC_MONETARY="fr_BE.utf8"
LC_MESSAGES="fr_BE.utf8"
LC_PAPER="fr_BE.utf8"
LC_NAME="fr_BE.utf8"
LC_ADDRESS="fr_BE.utf8"
LC_TELEPHONE="fr_BE.utf8"
LC_MEASUREMENT="fr_BE.utf8"
LC_IDENTIFICATION="fr_BE.utf8"
LC_ALL=
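
Both tracebacks in this comment end in string.encode(enc) raising a UnicodeDecodeError rather than an encode error. On Python 2, calling encode() on a byte string first decodes it implicitly with the default (ASCII) codec, so this pattern fails whenever the value is already encoded and contains non-ASCII bytes. The sketch below shows one way to guard against that; encode_text is a hypothetical name and this is not the code that is in gcmd.py.

```python
import locale


def encode_text(value, encoding=None):
    """Encode text to bytes; pass byte strings through untouched.

    Hypothetical guard: encoding a value that is already bytes would
    trigger an implicit ASCII decode on Python 2.
    """
    if isinstance(value, bytes):
        return value  # already encoded, leave as-is
    if encoding is None:
        # Assumption: output should follow the locale's encoding.
        encoding = locale.getpreferredencoding(False)
    return value.encode(encoding, "replace")


already_encoded = u"cr\xe9\xe9".encode("utf-8")
print(encode_text(already_encoded) == already_encoded)
print(encode_text(u"cr\xe9\xe9", "utf-8") == already_encoded)
```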

comment:15 by neteler, 9 years ago

Milestone: 7.0.1 → 7.0.2

Ticket retargeted after 7.0.1 milestone closed

comment:16 by neteler, 9 years ago

Milestone: 7.0.2 → 7.0.3

Ticket retargeted after milestone closed

comment:17 by mlennert, 9 years ago

Resolution: fixed
Status: new → closed

For me this has been a grass6 problem, not grass7.

Testing rapidly (v.surf.idw, v.net), I don't see the error message anymore. I don't see any relevant changes in the grass6 code that explain why this should work now, but it does.

So, closing this for now, as it seems to be fixed for me in grass6 and I never had these issues in grass7.

Huidae's issue seems to be more related to #2617.

comment:18 by mlennert, 9 years ago

Milestone: 7.0.3 → 6.4.6
Version: 7.0.0 → svn-releasebranch64