Opened 10 years ago

Closed 10 years ago

#882 closed defect (fixed)

i18n enabled winGRASS: properties dialog not opening

Reported by: neteler
Owned by: grass-dev@…
Priority: blocker
Milestone: 6.4.0
Component: wxGUI
Version: svn-releasebranch64
Keywords: wingrass
Cc:
CPU: x86-32
Platform: All

Description

In the current 6.4.svn version, with NLS enabled, the wxGUI dialog window for selecting raster and vector maps does not open. The Properties dialog reached through the right mouse button also fails to open.

Using Japanese XP.

Attachments (1)

update_menudata.diff (789 bytes) - added by neteler 10 years ago.
patch as in comment 4

Change History (20)

comment:2 Changed 10 years ago by neteler

Unfortunately the problem persists (WinGRASS-6.4.SVN-r40553-1-Setup.exe). Using XP (Japanese).

Markus

comment:3 Changed 10 years ago by neteler

I just discovered that in the terminal some problems are indicated:

Traceback (most recent call last):
  File "C:/Program Files/GRASS-64-SVN/etc/wxpython/wxgui.py", line 1260, in OnAddRaster
    self.AddRaster(event)
  File "C:/Program Files/GRASS-64-SVN/etc/wxpython/wxgui.py", line 1385, in AddRaster
    self.curr_page.maptree.AddLayer('raster')
  File "C:\Program Files\GRASS-64-SVN\etc\wxpython\gui_modules\wxgui_utils.py", line 792, in AddLayer
    self.PropertiesDialog(layer, show=True)
  File "C:\Program Files\GRASS-64-SVN\etc\wxpython\gui_modules\wxgui_utils.py", line 858, in PropertiesDialog
    parentframe=self)
  File "C:\Program Files\GRASS-64-SVN\etc\wxpython\gui_modules\menuform.py", line 1819, in ParseCommand
    handler)
  File "C:\Program Files\GRASS-64-SVN\Python25\lib\xml\sax\__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "C:\Program Files\GRASS-64-SVN\Python25\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Program Files\GRASS-64-SVN\Python25\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Program Files\GRASS-64-SVN\Python25\lib\xml\sax\expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "C:\Program Files\GRASS-64-SVN\Python25\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:1:30: unknown encoding

Maybe that gives an idea.

Markus

comment:4 in reply to:  3 Changed 10 years ago by glynn

Replying to neteler:

I just discovered that in the terminal some problems are indicated:

SAXParseException: <unknown>:1:30: unknown encoding

Maybe that gives an idea.

Look at the first line of the output from d.rast --interface-description. The encoding= parameter is probably something Windows-specific like "CP932". In which case, we may need to use something other than locale_charset() (e.g. bind_textdomain_codeset() with NULL for the codeset parameter), or even convert the text to UTF-8.

Except ... GUI.ParseCommand() assumes that the text is in the locale's encoding and is explicitly converting it to UTF-8. But it doesn't change the encoding= in the XML header, which will still contain the locale's encoding (maybe this pre-dates r17034 being back-ported?).

The fact that it's xml.sax.parseString() which is failing rather than the ".decode(enc)" in the argument suggests that Python understands the encoding, so it may suffice to do something like:

- ... .decode(enc).encode("utf-8") ...
+ ... .decode(enc).split('\n',1)[1].replace('', '<?xml version="1.0" encoding="utf-8"?>\n', 1).encode("utf-8") ...

The situation is complicated by the fact that 6.5/7.0 has abandoned xml.sax in favour of xml.etree.ElementTree.
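
As a rough illustration of the one-line change above, the sketch below decodes the command output from the locale's encoding, swaps the first line for a UTF-8 XML declaration, and re-encodes, so that the header matches the data again. The function name, the handler class, and the sample XML are hypothetical; it assumes the declaration occupies exactly the first line, and it is written for a current Python rather than the Python 2.5 in the traceback.

```python
import xml.sax


def redeclare_as_utf8(raw: bytes, enc: str) -> bytes:
    """Return `raw` re-encoded as UTF-8 with a matching XML declaration.

    Assumes the original XML declaration is exactly the first line of `raw`.
    """
    text = raw.decode(enc)
    body = text.split("\n", 1)[1]  # drop the old declaration line
    return ('<?xml version="1.0" encoding="utf-8"?>\n' + body).encode("utf-8")


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(raw: bytes, enc: str) -> str:
    """Parse `raw` after re-declaring it as UTF-8 and return its text content."""
    handler = TextCollector()
    xml.sax.parseString(redeclare_as_utf8(raw, enc), handler)
    return "".join(handler.chunks)


# Made-up sample standing in for `d.rast --interface-description` output.
sample = (
    '<?xml version="1.0" encoding="CP932" ?>\n'
    "<task><description>日本語</description></task>"
).encode("cp932")
```

Calling `parse_text(sample, "cp932")` hands the parser UTF-8 bytes with a UTF-8 header, so the parser no longer has to recognise the name "CP932" at all.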

comment:5 Changed 10 years ago by neteler

On a Japanese Linux box, the current encoding is:

r.cost --interface-description
<?xml version="1.0" encoding="EUC-JP"?>

but should be UTF8.

comment:6 Changed 10 years ago by neteler

On a Japanese Windows-XP box, the current encoding is:

<?xml version="1.0" encoding="CP932" ?>

comment:7 in reply to: 5 Changed 10 years ago by glynn

Replying to neteler:

On a Japanese Linux box, the current encoding is:

r.cost --interface-description
<?xml version="1.0" encoding="EUC-JP"?>

but should be UTF8.

What do you mean by "should be"? Are the descriptions encoded in EUC-JP or in UTF-8? First and foremost, the encoding specified in the header must be the encoding which is actually used for the data.

The strings to which opt->label and opt->description point will be in the locale's encoding, as that's how dgettext() works, and that's how they need to be encoded so that the --help output displays correctly on the user's terminal. G__usage_xml() doesn't recode the data; it just copies the strings into the output.

If you want the --interface-description output to always use UTF-8, then G__usage_xml() will need to recode any localised strings with iconv().

There shouldn't be any problem with encoding=EUC-JP, but encoding=CP932 may be problematic as "CP932" isn't an IANA-registered encoding (the IANA name is "Windows-31J"), so it may not be recognised by XML parsers.

IANA-registered encodings

comment:8 in reply to: 7 Changed 10 years ago by neteler

Keywords: wingrass added
Platform: MSWindows XP → All

Replying to glynn:

Replying to neteler:

On a Japanese Linux box, the current encoding is:

r.cost --interface-description
<?xml version="1.0" encoding="EUC-JP"?>

but should be UTF8.

What do you mean by "should be"?

OK: s/should be/IMHO/

Are the descriptions encoded in EUC-JP or in UTF-8?

The grass*_ja.po files are in UTF8.

First and foremost, the encoding specified in the header must be the encoding which is actually used for the data.

That's why I assumed a mismatch (both for Win and Linux, but in different ways).

The strings to which opt->label and opt->description point will be in the locale's encoding, as that's how dgettext() works, and that's how they need to be encoded so that the --help output displays correctly on the user's terminal. G__usage_xml() doesn't recode the data; it just copies the strings into the output.

If you want the --interface-description output to always use UTF-8, then G__usage_xml() will need to recode any localised strings with iconv().

There shouldn't be any problem with encoding=EUC-JP, but encoding=CP932 may be problematic as "CP932" isn't an IANA-registered encoding (the IANA name is "Windows-31J"), so it may not be recognised by XML parsers.

I did not have success with encoding=EUC-JP while the .po files are in UTF8. Same of course for encoding=CP932.

IANA-registered encodings

comment:9 in reply to:  8 Changed 10 years ago by glynn

Replying to neteler:

Are the descriptions encoded in EUC-JP or in UTF-8?

The grass*_ja.po files are in UTF8.

That's not what I'm asking. Is the text in the <description> and <label> elements in the --interface-description output encoded in EUC-JP or in UTF-8? If it's EUC-JP, then the header should say encoding=EUC-JP. If it's in UTF-8, the header should say encoding=UTF-8.

I did not have success with encoding=EUC-JP while the .po files are in UTF8. Same of course for encoding=CP932.

The encoding of the .po files is irrelevant. dgettext() converts the strings from the .mo files to the locale's encoding, so that you don't need separate .mo files for e.g. LANG=ja_JP (Japanese, EUC-JP) and LANG=ja_JP.utf-8 (Japanese, UTF-8).

comment:10 Changed 10 years ago by neteler

Testing on Linux (EN is default locale):

GRASS 6.4.0svn (nc_spm_08):~ > r.walk --interface-description  | grep -v encoding | grep -v DOCTYPE > en.txt

GRASS 6.4.0svn (nc_spm_08):~ > . ja.sh
GRASS 6.4.0svn (nc_spm_08):~ > r.walk --interface-description  | grep encoding
<?xml version="1.0" encoding="EUC-JP"?>
GRASS 6.4.0svn (nc_spm_08):~ > r.walk --interface-description  | grep -v encoding | grep -v DOCTYPE > ja.txt

GRASS 6.4.0svn (nc_spm_08):~ > file en.txt ja.txt
en.txt: ASCII English text
ja.txt: ISO-8859 English text

GRASS 6.4.0svn (nc_spm_08):~ > r.walk
Traceback (most recent call last):
  File "/home/neteler/grass64/dist.x86_64-unknown-linux-gnu/etc/wxpython/gui_modules/menuform.py", line 1937, in <module>
...
xml.sax._exceptions.SAXParseException: <unknown>:12:4: not well-formed (invalid token)

So: while it advertises EUC-JP, the content appears to be ISO-8859, if I got the test right (thanks to Hamish for the hint).

comment:11 in reply to:  10 Changed 10 years ago by glynn

Replying to neteler:

GRASS 6.4.0svn (nc_spm_08):~ > file en.txt ja.txt
en.txt: ASCII English text
ja.txt: ISO-8859 English text

So: while it advertises EUC-JP, the content appears to be ISO-8859, if I got the test right (thanks to Hamish for the hint).

Nah. "file"'s encoding detection is rather rudimentary. If it sees a mix of ASCII and 8-bit characters, and it isn't UTF-8, it concludes that it's ISO-8859.

Based upon this I'd say that it's EUC-JP; a Japanese locale on Unix is likely to use either UTF-8 or EUC-JP, and file can detect UTF-8. If you want a more accurate test, try iconv with each encoding in turn and see which one works, or load the file in Firefox and see which encoding it selects.

comment:12 Changed 10 years ago by neteler

Here is the test:

[neteler@north ~]$  iconv -f utf-8 -t utf-8 ja.txt >/dev/null
iconv: illegal input sequence at position 465

[neteler@north ~]$  iconv -f euc-jp -t utf-8 ja.txt >/dev/null
[neteler@north ~]$
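
The same trial-decoding check can be sketched in Python. This is only an illustration of the iconv test above: the helper name and the sample bytes are made up, and the sample stands in for the contents of ja.txt.

```python
def decodable_as(data: bytes, encodings):
    """Return the names from `encodings` under which `data` decodes without error."""
    matches = []
    for name in encodings:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        matches.append(name)
    return matches


# Made-up sample: Japanese text encoded the way a LANG=ja_JP locale would emit it.
sample = "日本語".encode("euc_jp")
candidates = decodable_as(sample, ["utf-8", "euc_jp"])
```

A single-byte encoding such as ISO-8859-1 accepts every byte sequence, so a successful decode under it proves very little; only the multi-byte candidates are worth trying.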

comment:13 in reply to: 12 Changed 10 years ago by glynn

Replying to neteler:

Here is the test:

Yep; that's EUC-JP, which is what I'd expect for LANG=ja_JP (UTF-8 locales normally have a .UTF-8 suffix, e.g. ja_JP.UTF-8).

comment:14 in reply to: 13 Changed 10 years ago by neteler

Platform: All → MSWindows XP

Replying to glynn:

Replying to neteler:

Here is the test:

Yep; that's EUC-JP, which is what I'd expect for LANG=ja_JP (UTF-8 locales normally have a .UTF-8 suffix, e.g. ja_JP.UTF-8).

I see, so it was an improper setting here. So, setting on Linux:

export LANG=ja_JP.UTF-8
export LANGUAGE=ja_JP.UTF-8
export LC_MESSAGES=ja_JP.UTF-8

enables r.walk to come up with a translated wxGUI. Excellent.

This reduces the problem to the Windows encoding issue.

comment:15 in reply to:  14 Changed 10 years ago by glynn

Replying to neteler:

I see, so it was an improper setting here. So, setting on Linux:

export LANG=ja_JP.UTF-8
export LANGUAGE=ja_JP.UTF-8
export LC_MESSAGES=ja_JP.UTF-8

enables r.walk to come up with a translated wxGUI. Excellent.

This reduces the problem to the Windows encoding issue.

Hmm; no. If it doesn't work with LANG=ja_JP, that's a bug.

Did you try the fix suggested in comment:4?

Changed 10 years ago by neteler

Attachment: update_menudata.diff added

patch as in comment 4

comment:16 Changed 10 years ago by neteler

Platform: MSWindows XP → All

I have tried the (now attached) patch from comment 4 on a Linux box.

# session 1:
GRASS 6.4.0svn (nc_spm_08):~ > set | grep ja_
LANG=ja_JP
LANGUAGE=ja_JP
LC_MESSAGES=ja_JP
GRASS 6.4.0svn (nc_spm_08):~ > r.walk --interface-description  | grep encoding
<?xml version="1.0" encoding="UTF-8"?>

and

# session 2:
GRASS 6.4.0svn (nc_spm_08):~ > set | grep ja_
LANG=ja_JP.UTF-8
LANGUAGE=ja_JP.UTF-8
LC_MESSAGES=ja_JP.UTF-8
GRASS 6.4.0svn (nc_spm_08):~ > r.walk --interface-description  | grep encoding
<?xml version="1.0" encoding="UTF-8"?>

In both cases the wxGUI of r.walk now comes up in Japanese.

comment:17 in reply to:  16 Changed 10 years ago by glynn

Replying to neteler:

I have tried the (now attached) patch from comment 4 on a Linux box.

In both cases the wxGUI of r.walk now comes up in Japanese.

The more interesting question is whether it works on Windows (I suspect that GUI.ParseCommand() may need similar treatment).

Also, it should probably be using the encoding from the XML header rather than locale.getdefaultlocale().
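
One possible way to take the encoding from the XML header instead of the locale is sketched below. The function name, the regular expression, and the UTF-8 fallback are assumptions made for illustration, not the code that was committed; it relies on the declaration being ASCII-compatible, which holds for the encodings discussed in this ticket.

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the value of its encoding= attribute.
_DECL_RE = re.compile(
    rb"""<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declared_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration of `raw`.

    Falls back to `default` when there is no declaration or no encoding=.
    """
    match = _DECL_RE.match(raw)
    if match is None:
        return default
    return match.group(1).decode("ascii")


# Made-up sample standing in for --interface-description output.
sample = b'<?xml version="1.0" encoding="CP932" ?>\n<task/>'
```

The result could then be passed to `decode()` in place of the locale's encoding, so the data is always decoded with the encoding its own header claims.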

comment:18 Changed 10 years ago by neteler

Attached patch submitted as r40630 (6.4 only) to get it into the automated winGRASS package.

comment:19 Changed 10 years ago by neteler

Resolution: fixed
Status: new → closed

It needed to be fixed in one more place: r40640. Now it works. Closing.

Markus