Opened 15 years ago
Closed 8 years ago
#943 closed defect (wontfix)
wxpython gui hangs after switching to cairo display driver
Reported by: | epatton | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 6.4.6 |
Component: | wxGUI | Version: | svn-develbranch6 |
Keywords: | cairo, driver, gui, wxpython | Cc: | |
CPU: | All | Platform: | Linux |
Description
The wxpython Map Display window freezes and becomes unresponsive after changing the display driver from 'default' to 'cairo'.
I have compiled Grass 6.5 from a fresh checkout from svn, set --with-cairo, --with-cairo-libs=/usr/lib, and --with-cairo-includes=/usr/include. One thing I just noticed is that the configure line
checking for cairo linking flags...
appears as it does above, with no confirmation about whether or not it found cairo linking flags. But then I'm not sure if it's supposed to print anything in a successful case, either.
I've checked my package manager, and I have libcairo2 and libcairo2-dev installed, versions 1.8.8.
Anything else I should check? Can anyone confirm this error?
Thanks,
~ Eric.
Change History (27)
follow-up: 3 comment:1 by , 15 years ago
comment:3 by , 15 years ago
Replying to hamish:
I don't know if it's related, but I can get d.mon -L to die:
...
export GRASS_RENDER_IMMEDIATE=TRUE
...
d.mon -L
I wouldn't expect GRASS_RENDER_IMMEDIATE and d.mon to work well together.
In particular, mon.status calls R_open_driver() repeatedly, which may not work with direct rendering (i.e. R_close_driver() may not restore everything sufficiently for a subsequent R_open_driver() to work; this case won't have had much testing).
comment:4 by , 15 years ago
Replying to epatton:
I have compiled Grass 6.5 from a fresh checkout from svn, set --with-cairo, --with-cairo-libs=/usr/lib, and --with-cairo-includes=/usr/include.
You should never need to specify /usr/lib or /usr/include explicitly (however, recent versions of GCC will explicitly ignore them, so it's harmless).
On Linux, you should only need --with-cairo, which obtains the compiling and linking flags via pkg-config. The --with-cairo-includes, --with-cairo-libs, and --with-cairo-ldflags switches are there for platforms lacking pkg-config.
One thing I just noticed is that the configure line
checking for cairo linking flags...
appears as it does above, with no confirmation about whether or not it found cairo linking flags. But then I'm not sure if it's supposed to print anything in a successful case, either.
It prints whatever you specified for --with-cairo-ldflags. If you didn't use it, then it won't print anything.
If the cairo driver gets built, there's nothing wrong with the configure switches. If it doesn't work, that suggests a bug in either the driver or the GUI.
follow-up: 7 comment:5 by , 15 years ago
this still locks up with 64bit grass 6.5 + debian/stable.
I wonder if it is the different enviro vars needed for Cairo in 6.5?
export GRASS_CAIRO_READ=TRUE
export GRASS_CAIRO_MAPPED=TRUE
export GRASS_CAIROFILE=map.png
in 6.4svn & 7svn on the same machine it works. does 6.5svn + 32bit work?
Hamish
follow-up: 8 comment:6 by , 15 years ago
Yep, still locks up here too - 64bit, 6.5.svn on Ubuntu 9.10.
I tried exporting the variables you posted to .grass.bashrc, to no avail.
I have no way of testing on a 32-bit system.
~ Eric.
comment:7 by , 15 years ago
Replying to hamish:
this still locks up with 64bit grass 6.5 + debian/stable.
I wonder if it is the different enviro vars needed for Cairo in 6.5?
export GRASS_CAIRO_READ=TRUE
export GRASS_CAIRO_MAPPED=TRUE
export GRASS_CAIROFILE=map.png
GRASS_CAIRO_MAPPED only works with BMP images, not any of the other formats. If the filename doesn't end in ".bmp", GRASS_CAIRO_MAPPED is silently ignored.
The file must have the correct size (width * height * 4 + HEADER_SIZE bytes).
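The two constraints above can be sketched as a small pre-flight check in Python. This is only an illustration, not code from the driver: the function names are hypothetical, the value of HEADER_SIZE is not stated in this ticket and is therefore passed in as a parameter, and treating the suffix test as case-sensitive is an assumption.

```python
def mapped_mode_applies(filename):
    """True if GRASS_CAIRO_MAPPED would be honoured for this file name.

    Assumption: the driver checks for a literal ".bmp" suffix.
    """
    return filename.endswith(".bmp")


def expected_bmp_size(width, height, header_size):
    """Size in bytes a mapped BMP must have: 4 bytes per pixel plus the header.

    header_size stands in for the driver's HEADER_SIZE constant, whose
    value is not given in this ticket.
    """
    return width * height * 4 + header_size
```

With the settings quoted above (GRASS_CAIROFILE=map.png), `mapped_mode_applies("map.png")` is false, so the mapped setting would have no effect.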
comment:8 by , 15 years ago
CPU: | x86-64 → All |
---|
Replying to epatton:
I have no way of testing on a 32-bit system.
ok, tried on 32-bit debian/lenny (stable). locks up, need to use xkill.
so both 32 and 64 bit, ubuntu and debian, grass 6.5 only.
do any non-.deb family users have cairo working with the wxGUI?
Hamish
comment:9 by , 11 years ago
Component: | Display → wxGUI |
---|---|
Milestone: | 6.5.0 → 6.4.3 |
Priority: | normal → critical |
ok, seems like two bugs,
[debian/squeeze on amd64, testing in both relbr64 and devbr6, cairo 1.10]
with GRASS_PNG_READ as in the example in comment:1, d.mon still crashes with,
G643svn> export GRASS_PNGFILE=map.bmp
G643svn> export GRASS_PNG_READ=TRUE
G643svn> d.mon -L
*** glibc detected *** status: free(): invalid pointer: ...
G643svn> echo $?
1
(as above)
If I touch map.bmp first then d.mon runs ok.
So it either needs to test if the file exists and exit with an error, or test if it exists and skip trying to read it, or create it if it's missing. (maybe d.mon shouldn't be creating it..)
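The three options listed above can be sketched in Python as follows. This is purely illustrative: the real fix would live in the C driver code, and the function name and policy strings are hypothetical.

```python
import os


def prepare_read_file(path, policy):
    """Decide what to do about a read-back file before trying to read it.

    Returns True if the file exists (or was created) and may be read,
    False if reading should be skipped.
    """
    if os.path.exists(path):
        return True
    if policy == "error":
        # Option 1: fail cleanly instead of crashing later.
        raise IOError("File <%s> not found" % path)
    if policy == "skip":
        # Option 2: carry on without reading the previous image.
        return False
    if policy == "create":
        # Option 3: same effect as `touch`, which made d.mon run ok above.
        open(path, "wb").close()
        return True
    raise ValueError("unknown policy: %r" % policy)
```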
---
in the wxGUI preferences, after changing the Map Display -> Display driver to "cairo" I get this error in the command console:
Settings applied to current session but not saved
ERROR: Rendering failed.
Details: File </tmp/tmpsF0WNE.ppm> not found
however, there is an empty file in /tmp/ called tmpsF0WNE. (no .ppm)
?, Hamish
comment:10 by , 11 years ago
the following patch against devbr6 changes the error message from "File <...> not found" to "ERROR: Rendering failed. Details: Error reading PPM file" (the file exists but is empty).
Index: gui/wxpython/core/render.py
===================================================================
--- gui/wxpython/core/render.py (revision 56191)
+++ gui/wxpython/core/render.py (working copy)
@@ -87,13 +87,14 @@
                                  self.opacity, self.hidden))
 
         # generated file for each layer
-        self.gtemp = tempfile.mkstemp()[1]
-        self.maskfile = self.gtemp + ".pgm"
         if self.type == 'overlay':
-            self.mapfile = self.gtemp + ".png"
+            tempfile_sfx =".png"
         else:
-            self.mapfile = self.gtemp + ".ppm"
-
+            tempfile_sfx =".ppm"
+        self.mapfile = tempfile.mkstemp(suffix = tempfile_sfx)[1]
+        # do we need to `touch` the maskfile so it exists?
+        self.maskfile = self.mapfile.rsplit(".",1)[0] + ".pgm"
+
     def __del__(self):
         Debug.msg (3, "Layer.__del__(): layer=%s, cmd='%s'" %
                    (self.name, self.GetCmd(string = True)))
shooting in the dark: I guess the error means that g.pnmcomp is trying to read from the existing GRASS_CAIROFILE before adding to it as GRASS_CAIRO_READ is set(?), or maybe the GRASS_RENDER_IMMEDIATE=TRUE is causing the PNG driver to be used instead?
(I notice GRASS_PNG_READ is referenced in a few places, but GRASS_CAIRO_READ is not at all)
Hamish
comment:11 by , 11 years ago
I notice that wxpython/core/render.py sets GRASS_COMPRESSION=0, but that doesn't exist, GRASS_PNG_COMPRESSION does. Having that mis-set probably doesn't help rendering speed & cpu load any.
broken since forever; fixed in all branches.
wrt the tempfiles mentioned in comment:10, I wonder if we can at least mitigate the left-over-file problem (#560) by putting them in the GISRC tempdir with something like:
tempfile.mkstemp(suffix = '.ppm', prefix = os.path.basename(os.path.dirname(os.getenv('GISRC'))) + os.sep + 'tmp_')
(? is the GISRC enviro var reset in core/render.py there for the GCP tool, or..?)
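A minimal sketch of that idea, using mkstemp()'s own dir argument rather than a path separator inside prefix. The helper names are hypothetical, and it assumes (as the suggestion above does) that the GISRC file lives inside the session's temporary directory. Note that mkstemp() returns an open file descriptor as well as the path, so the sketch closes it instead of discarding it with `[1]`.

```python
import os
import tempfile


def session_tmpdir():
    """Directory holding the GISRC file, or None to use the system default.

    Assumption: GISRC points into the per-session temp directory.
    """
    gisrc = os.getenv("GISRC")
    directory = os.path.dirname(gisrc) if gisrc else ""
    return directory or None


def make_layer_tempfile(suffix, tmpdir=None):
    """Create an empty temp file with the given suffix and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix, prefix="tmp_", dir=tmpdir)
    os.close(fd)  # mkstemp() hands back an open descriptor; don't leak it
    return path
```

Usage would look like `make_layer_tempfile(".ppm", session_tmpdir())`; whether files created there are cleaned up at session exit is not something this sketch checks.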
Hamish
comment:12 by , 11 years ago
Hi,
a few cleanups and fixes applied to devbr6 in r56443. The main problem there was GRASS_RENDER_IMMEDIATE always being set to TRUE, which in GRASS 6 always triggers the PNG driver instead.
Now the error message is by d.rast (e.g.) and has to do with the monitor not accepting connections. Interestingly enough, if I have the x0 XMONITOR started and selected (on linux) and the cairo driver selected as the wxGUI rendering mode, it renders to the Xmon much like the old tcl/tk GUIs would.
But as for the Cairo driver, trying to start it still locks up the wxGUI, as per the original bug report, so is currently commented out.
RunCommand('d.mon', start = 'cairo')
I had an hourglass mouse cursor on the Map Display window, and could not change tabs or do anything with the Layer Manager window, although if I dragged it around and moved other windows over the top of it it still looked ok after dragging them off. But I had to use xkill to get rid of it, as neither the Map Display nor the Layer Manager's "X" window decoration worked.
Hamish
comment:13 by , 11 years ago
oh, and GRASS_RENDER_IMMEDIATE is now not set at the time that d.mon is called to start the cairo driver, but the GUI still locks up anyway.
comment:14 by , 11 years ago
Hi,
a little more debug (I had to log it to a file, since I couldn't switch back to the output tab after the lock-up).
start g.gui, prefs -> map display -> display driver: cairo -> apply. (blank map display)
g.pnmcomp opacity=1.0 mask=/tmp/tmprA1z8Q.pgm height=537 width=698 \
  background=255:255:255 input=/tmp/tmprA1z8Q.ppm output=/tmp/tmpxRSoaJ.ppm
(those .ppm,.pgm are the right size, but all pixels black)
Sometimes I get no rendered map, but do see this error on the layer manager output tab: (the message is a G_fatal_error() from g.pnmcomp)
ERROR: Rendering failed. Details: Error reading PPM file
and sometimes this in the terminal:
.: Fatal IO error 0 (Success) on X server :0.0.
when it tries to run "d.mon start=cairo select=cairo" the GUI lockup happens on this line of gui/wxpython/core/gcmd.py:
Debug.msg(3, "gcmd.RunCommand(): decoding string")
stdout, stderr = map(DecodeString, ps.communicate())
so something is holding the stderr pipe open?
it's a bit funny, the order of RunCommand()s has the driver not started the first time:
d.mon -p
d.rast --q -o map=elevation.dem@PERMANENT
d.mon stop=cairo
g.pnmcomp opacity=1.0 mask=/tmp/tmpEdfEuF.pgm ...
ERROR: Rendering failed.
Details: Error reading PPM file
[hit the eyeball redraw button]
d.mon -p
d.mon start=cairo select=cairo
[lockup]
... ah, the cairo driver was still running since the last crash, so it didn't get started a second time before the "d.rast". ..nevermind.
adding quiet=True to the 'd.mon start=' command doesn't help, so I suspect it is the stderr pipe which is holding things up?
Hamish
follow-up: 18 comment:15 by , 11 years ago
re the X error,
quote:
There could be several problems here first one is related to this I found by doing a search http://developer.gnome.org/gtk-faq/stable/x505.html. So if you use fork() and then use exit() you have effectively closed the socket to the X11 server. Another problem could be that you have a X11 socket connection created in the parent via GTK and then trying to reuse it in the child processes with the data entering the socket in a bad order.
(another web search mentions that X is perhaps not thread-safe)
I'm testing with Python 2.6.6 btw.
Hamish
comment:16 by , 11 years ago
Adding a ps.poll() right before the ps.communicate() in wxpython/core/gcmd.py RunCommand() doesn't help much, it returns "None" (i.e. the process is still running).
looking at "ps fax" during the lockup shows that d.mon has become a zombie:
22954 pts/0 SNl 0:02 wxgui /home/hamish/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/wxpython/wxgui.py
23018 pts/0 ZN 0:00 \_ [d.mon] <defunct>
23020 pts/0 SN 0:00 /home/hamish/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/driver/cairo cairo dev/fifo.8a dev/fifo.8b
the d.mon goes away along with the wxgui as soon as I xkill the GUI window, the cairo process keeps running until I manually do "d.mon stop=cairo", as expected.
Hamish
comment:17 by , 11 years ago
If I run "d.mon stop=cairo" from the terminal's command line while the GUI is locked up, life returns to the GUI, although there is a traceback to do with opacity shown in the output tab after:
Command 'd.rast -o map=elevation.dem@PERMANENT' failed
Details: No graphics monitor has been selected for output.
and the mouse cursor remains as an hourglass. But you can use the menus and go to prefs and switch back to the "default" display driver, then render maps the old way, etc.
Hamish
comment:18 by , 11 years ago
Replying to hamish:
The content of that page is nonsense.
The difference between exit() and _exit() is that the former flushes streams (FILE*) and calls any handlers registered with atexit() while the latter doesn't. Unless something creates a FILE* for the socket with fdopen() (Xlib doesn't), the distinction doesn't matter.
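The flushing difference can be observed from Python, which exposes the same pair as sys.exit() (normal interpreter shutdown, buffered streams flushed) and os._exit() (immediate termination, nothing flushed). The sketch below is an analogue rather than a C demonstration; the helper name is hypothetical, and PYTHONUNBUFFERED is removed from the child's environment so that stdout really is buffered when it is a pipe.

```python
import os
import subprocess
import sys


def child_output(exit_call):
    """Run a child that writes to buffered stdout, then exits via exit_call."""
    code = "import os, sys; sys.stdout.write('buffered'); " + exit_call
    # Make sure the child does not run with unbuffered stdio.
    env = dict((k, v) for k, v in os.environ.items()
               if k != "PYTHONUNBUFFERED")
    ps = subprocess.Popen([sys.executable, "-c", code],
                          stdout=subprocess.PIPE, env=env)
    out, _ = ps.communicate()
    return out


if __name__ == "__main__":
    print(repr(child_output("sys.exit(0)")))   # buffer is flushed at exit
    print(repr(child_output("os._exit(0)")))   # buffer is discarded
```

Only the buffered text differs between the two runs; the pipe itself is closed in both cases when the child terminates.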
Terminating a process implicitly close()s all open file descriptors, but the underlying object (file, socket, etc) isn't actually closed until no process has it open. This is true for an explicit close() call as well as an implicit close due to process termination.
(another web search mentions that X is perhaps not thread-safe)
None of this has anything to do with display drivers in general or the cairo driver in particular. None of them use GTK, GDK, or threads.
follow-up: 20 comment:19 by , 11 years ago
by adding a little debug write to d.mon just before its final exit(), I can see that the module was completing ok; i.e. not getting hung up in the G_spawn(). But then something is still holding it open and it becomes a zombie until you manually stop the cairo driver process. At which point it releases and all is back to normal (except wxgui mouse cursor remains an hourglass)
doing the same thing from the command line works fine.
if it was some "Graphics driver [cairo] started" text from d.mon waiting to be flushed from the stderr buffer by python's popen.communicate(), then for one thing the proc.communicate() should flush it, and for another why would externally stopping the cairo driver suddenly free it?
Glynn wrote:
Terminating a process implicitly close()s all open file descriptors, but the underlying object (file, socket, etc) isn't actually closed until no process has it open.
is there a chance that the G_spawn()'d mon.start or mon.select program has gotten a hold of one of the stdin/out/err pipes, and is holding it open, and then in the wxgui the proc.communicate() can't let its end go until that happens?
but in that case perhaps the zombie wouldn't exist? since the zombie exists perhaps it is d.mon's exit() which can't let go of the pipe?
thanks, Hamish
comment:20 by , 11 years ago
Replying to hamish:
is there a chance that the G_spawn()'d mon.start or mon.select program has gotten a hold of one of the stdin/out/err pipes, and is holding it open, and then in the wxgui the proc.communicate() can't let its end go until that happens?
Yes. fork() duplicates practically everything, including all descriptors. G_spawn() will only close descriptors which are explicitly requested to be closed (with SF_CLOSE_DESCRIPTOR) or which are implicitly closed by being the target of a redirection (SF_REDIRECT_FILE or SF_REDIRECT_DESCRIPTOR). Any descriptors which have the close-on-exec flag set will be closed in the child process when the new program is exec()d.
ISTR that this is/was actually a problem for the d.resize script, as the new monitor inherits any descriptors from d.resize, so if d.resize inherits the write end of a pipe (e.g. if its stdout or stderr are pipes), the reader won't see EOF so long as the monitor process survives.
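The inherited-pipe effect described above can be reproduced without GRASS. In this sketch (POSIX only; `sh` and `sleep` stand in for d.mon and the monitor process, and the function name is hypothetical) the shell exits immediately after backgrounding a long-lived process, yet communicate() returns only once every copy of the pipe's write end has been closed.

```python
import subprocess
import time


def seconds_until_eof(background_cmd):
    """Start a shell that backgrounds background_cmd and exits at once.

    Returns how long communicate() took to see EOF on stdout and stderr.
    """
    start = time.time()
    ps = subprocess.Popen(["sh", "-c", background_cmd + " &"],
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    ps.communicate()  # blocks until no process holds the write ends open
    return time.time() - start


if __name__ == "__main__":
    # Background process inherits the pipes: the reader waits for it to end.
    print("inherited:  %.1f s" % seconds_until_eof("sleep 2"))
    # Background process redirects its own stdout/stderr: EOF arrives at once.
    print("redirected: %.1f s" % seconds_until_eof("sleep 2 >/dev/null 2>&1"))
```

The first case mirrors the lock-up reported in this ticket, where stopping the long-lived process by hand releases the reader.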
but in that case perhaps the zombie wouldn't exist? since the zombie exists perhaps it is d.mon's exit() which can't let go of the pipe?
A zombie is a process which has terminated (completely), but whose parent is still alive and hasn't yet "reaped" it by retrieving the child's exit status with wait(), waitpid() etc. If the parent is no longer alive, the child will be "adopted" by the init process which will reap it.
For a child spawned with Python's subprocess.Popen, the .wait() or .poll() methods should be called on the Popen object (in the case of .poll(), it must be called until it returns a value other than None). Popen objects which are garbage-collected will have the .poll() method called from the finaliser; if the child process is still alive at that point, it is added to a list (subprocess._active) which is pruned (by calling subprocess._cleanup()) whenever a new Popen object is constructed. So if a Popen object is created, abandoned and gc'd before it completes, the process will remain in the zombie state until another Popen object is constructed.
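A minimal sketch of reaping a child by polling, as described above. The helper name and the polling interval are arbitrary choices; calling .wait() instead would block until the child exits.

```python
import subprocess
import sys
import time


def reap(ps, interval=0.05):
    """Poll a Popen object until the child has exited, then return its status.

    poll() returns None while the child is still running; the first
    non-None result means the exit status has been collected and the
    child is no longer a zombie.
    """
    while ps.poll() is None:
        time.sleep(interval)
    return ps.returncode


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "raise SystemExit(3)"])
    print(reap(child))
```

A single poll() that returns None, as reported in comment:16, only shows that the exit status was not yet available at that moment.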
For a child spawned with GRASS' G_spawn() function with the SF_BACKGROUND option, G_spawn() returns the child's PID which must be passed to G_wait().
Other than in relation to wait() etc, the existence of a zombie has no side effects other than occupying a slot in the process table (and counting toward any limit on the maximum number of processes). Descriptors have already been closed at that point.
comment:21 by , 11 years ago
should general/g.gui/main.c (using SF_BACKGROUND) or lib/db/dbmi_client/start.c (using SF_*_DESCRIPTOR) be used as a model?
thanks, Hamish
follow-up: 23 comment:22 by , 11 years ago
it looks like the fix will need to be in d.mon. That's not something I want to risk breaking just before release, as it's too critical a component.
cairo driver support from the wxGUI commented out in 6.4svn r56679 so we can move forward with the release. Will continue to try and get it working in devbr6.
tbc, Hamish
comment:23 by , 11 years ago
Milestone: | 6.4.3 → 6.4.5 |
---|
comment:25 by , 9 years ago
Milestone: | → 6.4.6 |
---|
comment:26 by , 8 years ago
The 7 branch works well (you can switch between cairo and png and back). Should we close it as wontfix? See also #2066.
comment:27 by , 8 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
As the cairo driver is commented out for the GUI display, users will not be confronted with the issue, and since it works well in GRASS 7, I'm closing this as wontfix.
same here. grass64 and grass7 work fine. I'm on debian/stable amd64, yesterday's svn.
wxGUI freezes after you change the preferences->display mode to cairo and first try to render a map.
I don't know if it's related, but I can get d.mon -L to die:
apparently GRASS_PNG_READ being set is what triggers it.
?, Hamish