Opened 9 years ago

Closed 3 years ago

#943 closed defect (wontfix)

wxpython gui hangs after switching to cairo display driver

Reported by: epatton
Owned by: grass-dev@…
Priority: critical
Milestone: 6.4.6
Component: wxGUI
Version: svn-develbranch6
Keywords: cairo, driver, gui, wxpython
Cc:
CPU: All
Platform: Linux

Description

The wxpython Map Display window freezes and becomes unresponsive after changing the display driver from 'default' to 'cairo'.

I have compiled GRASS 6.5 from a fresh checkout from svn, set --with-cairo, --with-cairo-libs=/usr/lib, and --with-cairo-includes=/usr/include. One thing I just noticed is that the configure line

checking for cairo linking flags...

appears as it does above, with no confirmation about whether or not it found cairo linking flags. But then I'm not sure if it's supposed to print anything in a successful case, either.

I've checked my package manager, and I have libcairo2 and libcairo2-dev installed, both at version 1.8.8.

Anything else I should check? Can anyone confirm this error?

Thanks,

~ Eric.

Change History (27)

comment:1 Changed 9 years ago by hamish

same here. grass64 and grass7 work fine. I'm on debian/stable amd64, yesterday's svn.

wxGUI freezes after you change the preferences->display mode to cairo and first try to render a map.

I don't know if it's related, but I can get d.mon -L to die:

# grass65
export GRASS_PNGFILE=map.bmp
export GRASS_WIDTH=640
export GRASS_HEIGHT=480
export GRASS_RENDER_IMMEDIATE=TRUE
export GRASS_PNG_MAPPED=TRUE
export GRASS_PNG_READ=TRUE
d.mon -L

name            description                    status
----            -----------                    ------
PNG: GRASS_TRUECOLOR status: TRUE
PNG: collecting to file: map.bmp,
GRASS_WIDTH=640, GRASS_HEIGHT=480
HTMLMAP         Create HTML Image Map          running
PNG: GRASS_TRUECOLOR status: TRUE
PNG: collecting to file: map.bmp,
GRASS_WIDTH=640, GRASS_HEIGHT=480
*** glibc detected *** /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/mon.status: free(): invalid pointer: 0x00007f06499b3036 ***
======= Backtrace: =========
/lib/libc.so.6[0x7f064802d928]
/lib/libc.so.6(cfree+0x76)[0x7f064802fa36]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_gis.so(G_free+0x15)[0x7f0649072611]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_pngdriver.so[0x7f06494d89fe]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_pngdriver.so(PNG_Graph_set+0x3f8)[0x7f06494d8e1f]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_driver.so(COM_Graph_set+0x39)[0x7f06492cabc9]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_driver.so(LIB_init+0x1be)[0x7f06492cc426]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_raster.so(LOC_open_driver+0x62)[0x7f06496e4fdc]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_raster.so(R_open_driver+0x18)[0x7f06496e41ff]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/mon.status(main+0x8f)[0x400b4b]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f0647fd81a6]
/usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/mon.status[0x4009f9]
======= Memory map: ========
00400000-00401000 r-xp 00000000 09:02 34916230                           /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/mon.status
00601000-00602000 rw-p 00001000 09:02 34916230                           /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/mon.status
02496000-024b7000 rw-p 02496000 00:00 0                                  [heap]
7f0640000000-7f0640021000 rw-p 7f0640000000 00:00 0 
7f0640021000-7f0644000000 ---p 7f0640021000 00:00 0 
7f0647b9f000-7f0647bb5000 r-xp 00000000 09:00 553795                     /lib/libgcc_s.so.1
7f0647bb5000-7f0647db5000 ---p 00016000 09:00 553795                     /lib/libgcc_s.so.1
7f0647db5000-7f0647db6000 rw-p 00016000 09:00 553795                     /lib/libgcc_s.so.1
7f0647db6000-7f0647db8000 r-xp 00000000 09:00 555034                     /lib/libdl-2.7.so
7f0647db8000-7f0647fb8000 ---p 00002000 09:00 555034                     /lib/libdl-2.7.so
7f0647fb8000-7f0647fba000 rw-p 00002000 09:00 555034                     /lib/libdl-2.7.so
7f0647fba000-7f0648104000 r-xp 00000000 09:00 555046                     /lib/libc-2.7.so
7f0648104000-7f0648303000 ---p 0014a000 09:00 555046                     /lib/libc-2.7.so
7f0648303000-7f0648306000 r--p 00149000 09:00 555046                     /lib/libc-2.7.so
7f0648306000-7f0648308000 rw-p 0014c000 09:00 555046                     /lib/libc-2.7.so
7f0648308000-7f064830d000 rw-p 7f0648308000 00:00 0 
7f064830d000-7f0648311000 r-xp 00000000 09:02 34916046                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_psdriver.6.5.svn.so
7f0648311000-7f0648510000 ---p 00004000 09:02 34916046                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_psdriver.6.5.svn.so
7f0648510000-7f0648511000 rw-p 00003000 09:02 34916046                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_psdriver.6.5.svn.so
7f0648511000-7f0648593000 r-xp 00000000 09:00 555051                     /lib/libm-2.7.so
7f0648593000-7f0648792000 ---p 00082000 09:00 555051                     /lib/libm-2.7.so
7f0648792000-7f0648794000 rw-p 00081000 09:00 555051                     /lib/libm-2.7.so
7f0648794000-7f06487b9000 r-xp 00000000 09:00 151710                     /usr/lib/libpng12.so.0.27.0
7f06487b9000-7f06489b8000 ---p 00025000 09:00 151710                     /usr/lib/libpng12.so.0.27.0
7f06489b8000-7f06489b9000 rw-p 00024000 09:00 151710                     /usr/lib/libpng12.so.0.27.0
7f06489b9000-7f0648a38000 r-xp 00000000 09:00 151676                     /usr/lib/libfreetype.so.6.3.18
7f0648a38000-7f0648c37000 ---p 0007f000 09:00 151676                     /usr/lib/libfreetype.so.6.3.18
7f0648c37000-7f0648c3d000 rw-p 0007e000 09:00 151676                     /usr/lib/libfreetype.so.6.3.18
7f0648c3d000-7f0648c53000 r-xp 00000000 09:00 148994                     /usr/lib/libz.so.1.2.3.3
7f0648c53000-7f0648e53000 ---p 00016000 09:00 148994                     /usr/lib/libz.so.1.2.3.3
7f0648e53000-7f0648e54000 rw-p 00016000 09:00 148994                     /usr/lib/libz.so.1.2.3.3
7f0648e54000-7f0648e5d000 r-xp 00000000 09:02 34915909                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_datetime.6.5.svn.so
7f0648e5d000-7f064905d000 ---p 00009000 09:02 34915909                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_datetime.6.5.svn.so
7f064905d000-7f064905e000 rw-p 00009000 09:02 34915909                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_datetime.6.5.svn.so
7f064905e000-7f06490c2000 r-xp 00000000 09:02 34915953                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_gis.6.5.svn.so
7f06490c2000-7f06492c2000 ---p 00064000 09:02 34915953                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_gis.6.5.svn.so
7f06492c2000-7f06492c4000 rw-p 00064000 09:02 34915953                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_gis.6.5.svn.so
7f06492c4000-7f06492c6000 rw-p 7f06492c4000 00:00 0 
7f06492c6000-7f06492d2000 r-xp 00000000 09:02 34916041                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_driver.6.5.svn.so
7f06492d2000-7f06494d2000 ---p 0000c000 09:02 34916041                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_driver.6.5.svn.so
7f06494d2000-7f06494d3000 rw-p 0000c000 09:02 34916041                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_driver.6.5.svn.so
7f06494d3000-7f06494d5000 rw-p 7f06494d3000 00:00 0 
7f06494d5000-7f06494dc000 r-xp 00000000 09:02 34916043                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_pngdriver.6.5.svn.so
7f06494dc000-7f06496dc000 ---p 00007000 09:02 34916043                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_pngdriver.6.5.svn.so
7f06496dc000-7f06496dd000 rw-p 00007000 09:02 34916043                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_pngdriver.6.5.svn.so
7f06496dd000-7f06496de000 rw-p 7f06496dd000 00:00 0 
7f06496de000-7f06496ea000 r-xp 00000000 09:02 34916048                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_raster.6.5.svn.so
7f06496ea000-7f06498e9000 ---p 0000c000 09:02 34916048                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_raster.6.5.svn.so
7f06498e9000-7f06498eb000 rw-p 0000b000 09:02 34916048                   /usr/local/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/lib/libgrass_raster.6.5.svn.so
7f06498eb000-7f0649907000 r-xp 00000000 09:00 555038                     /lib/ld-2.7.so
7f06499b3000-7f0649ae0000 rw-s 00000000 09:02 34760511                   /usr/local/src/grass/tests/imgview/map.bmp
7f0649ae0000-7f0649ae5000 rw-p 7f0649ae0000 00:00 0 
7f0649b00000-7f0649b06000 rw-p 7f0649b00000 00:00 0 
7f0649b06000-7f0649b08000 rw-p 0001b000 09:00 555038                     /lib/ld-2.7.so
7fff50ba7000-7fff50bbc000 rw-p 7ffffffea000 00:00 0                      [stack]
7fff50bff000-7fff50c00000 r-xp 7fff50bff000 00:00 0                      [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

apparently GRASS_PNG_READ being set is what triggers it.

?, Hamish

comment:2 Changed 9 years ago by hamish

(and I never started the HTMLMAP driver manually)

comment:3 in reply to:  1 Changed 9 years ago by glynn

Replying to hamish:

I don't know if it's related, but I can get d.mon -L to die:

...
export GRASS_RENDER_IMMEDIATE=TRUE
...
d.mon -L

I wouldn't expect GRASS_RENDER_IMMEDIATE and d.mon to work well together.

In particular, mon.status calls R_open_driver() repeatedly, which may not work with direct rendering (i.e. R_close_driver() may not restore everything sufficiently for a subsequent R_open_driver() to work; this case won't have had much testing).

comment:4 in reply to:  description Changed 9 years ago by glynn

Replying to epatton:

I have compiled GRASS 6.5 from a fresh checkout from svn, set --with-cairo, --with-cairo-libs=/usr/lib, and --with-cairo-includes=/usr/include.

You should never need to specify /usr/lib or /usr/include explicitly (however, recent versions of GCC will explicitly ignore them, so it's harmless).

On Linux, you should only need --with-cairo, which obtains the compiling and linking flags via pkg-config. The --with-cairo-includes, --with-cairo-libs, and --with-cairo-ldflags switches are there for platforms lacking pkg-config.
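As a rough illustration of what a pkg-config lookup amounts to, the sketch below asks pkg-config for the compile and link flags from Python. It assumes the pkg-config module name is `cairo` and that `pkg-config` is on the PATH; `cairo_flags` is a hypothetical helper, not part of the GRASS build system, and the configure script may well query pkg-config differently.

```python
import shutil
import subprocess


def cairo_flags(module="cairo"):
    """Return (cflags, libs) as reported by pkg-config, or None if unavailable.

    Hypothetical helper for illustration; the module name 'cairo' is an assumption.
    """
    if shutil.which("pkg-config") is None:
        # Platform without pkg-config: this is where the manual
        # --with-cairo-includes / --with-cairo-libs switches would apply.
        return None
    results = []
    for option in ("--cflags", "--libs"):
        proc = subprocess.run(
            ["pkg-config", option, module],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        if proc.returncode != 0:
            return None  # pkg-config does not know this module
        results.append(proc.stdout.split())
    return tuple(results)


print(cairo_flags())
```

If this prints a pair of flag lists, pkg-config can see cairo and the extra include/lib switches should not be needed.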

One thing I just noticed is that the configure line

checking for cairo linking flags...

appears as it does above, with no confirmation about whether or not it found cairo linking flags. But then I'm not sure if it's supposed to print anything in a successful case, either.

It prints whatever you specified for --with-cairo-ldflags. If you didn't use it, then it won't print anything.

If the cairo driver gets built, there's nothing wrong with the configure switches. If it doesn't work, that suggests a bug in either the driver or the GUI.

comment:5 Changed 9 years ago by hamish

this still locks up with 64bit grass 6.5 + debian/stable.

I wonder if it is the different enviro vars needed for Cairo in 6.5?

export GRASS_CAIRO_READ=TRUE
export GRASS_CAIRO_MAPPED=TRUE
export GRASS_CAIROFILE=map.png

in 6.4svn & 7svn on the same machine it works. does 6.5svn + 32bit work?

Hamish

comment:6 Changed 9 years ago by epatton

Yep, still locks up here too - 64bit, 6.5.svn on Ubuntu 9.10.

I tried exporting the variables you posted to .grass.bashrc, to no avail.

I have no way of testing on a 32-bit system.

~ Eric.

comment:7 in reply to:  5 Changed 9 years ago by glynn

Replying to hamish:

this still locks up with 64bit grass 6.5 + debian/stable.

I wonder if it is the different enviro vars needed for Cairo in 6.5?

export GRASS_CAIRO_READ=TRUE
export GRASS_CAIRO_MAPPED=TRUE
export GRASS_CAIROFILE=map.png

GRASS_CAIRO_MAPPED only works with BMP images, not any of the other formats. If the filename doesn't end in ".bmp", GRASS_CAIRO_MAPPED is silently ignored.

The file must have the correct size (width * height * 4 + HEADER_SIZE bytes).
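The size rule above can be checked before pointing the driver at a file. This is only a sketch: `expected_mapped_size` and `has_expected_size` are hypothetical helper names, and the header size of 54 bytes used in the example is an assumption (the classic 14 + 40 byte BMP headers); the real HEADER_SIZE constant should be taken from the driver source.

```python
import os


def expected_mapped_size(width, height, header_size):
    """Bytes needed for a mapped image: 4 bytes per pixel plus the header."""
    return width * height * 4 + header_size


def has_expected_size(path, width, height, header_size):
    """True only if the file exists and has exactly the expected size."""
    return (os.path.isfile(path)
            and os.path.getsize(path) == expected_mapped_size(width, height, header_size))


# header_size=54 is an assumption for illustration, not read from the GRASS source
print(expected_mapped_size(640, 480, 54))  # 1228854
```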

comment:8 in reply to:  6 Changed 9 years ago by hamish

CPU: x86-64 → All

Replying to epatton:

I have no way of testing on a 32-bit system.

ok, tried on 32-bit debian/lenny (stable). locks up, need to use xkill.

so both 32 and 64 bit, ubuntu and debian, grass 6.5 only.

do any non-.deb family users have cairo working with the wxGUI?

Hamish

comment:9 Changed 6 years ago by hamish

Component: Display → wxGUI
Milestone: 6.5.0 → 6.4.3
Priority: normal → critical

ok, seems like two bugs,

[debian/squeeze on amd64, testing in both relbr64 and devbr6, cairo 1.10]

with GRASS_PNG_READ as in the example in comment:1, d.mon still crashes with,

G643svn> export GRASS_PNGFILE=map.bmp
G643svn> export GRASS_PNG_READ=TRUE
G643svn> d.mon -L
*** glibc detected *** status: free(): invalid pointer:
...
G643svn> echo $?
1

(as above)

If I touch map.bmp first then d.mon runs ok. So it either needs to test if the file exists and exit with an error, or test if it exists and skip trying to read it, or create it if it's missing. (maybe d.mon shouldn't be creating it..)

---

in the wxGUI preferences, after changing the Map Display -> Display driver to "cairo" I get this error in the command console:

Settings applied to current session but not saved                               
ERROR: Rendering failed. Details: File </tmp/tmpsF0WNE.ppm>
not found

however, there is an empty file in /tmp/ called tmpsF0WNE. (no .ppm)

?, Hamish

comment:10 Changed 6 years ago by hamish

the following patch against devbr6 changes the error message from "File <...> not found" to "ERROR: Rendering failed. Details: Error reading PPM file" (the file exists but is empty).

Index: gui/wxpython/core/render.py
===================================================================
--- gui/wxpython/core/render.py	(revision 56191)
+++ gui/wxpython/core/render.py	(working copy)
@@ -87,13 +87,14 @@
                         self.opacity, self.hidden))
         
         # generated file for each layer
-        self.gtemp = tempfile.mkstemp()[1]
-        self.maskfile = self.gtemp + ".pgm"
         if self.type == 'overlay':
-            self.mapfile  = self.gtemp + ".png"
+            tempfile_sfx =".png"
         else:
-            self.mapfile  = self.gtemp + ".ppm"
-        
+            tempfile_sfx =".ppm"
+        self.mapfile = tempfile.mkstemp(suffix = tempfile_sfx)[1]
+        # do we need to `touch` the maskfile so it exists?
+        self.maskfile = self.mapfile.rsplit(".",1)[0] + ".pgm"
+
     def __del__(self):
         Debug.msg (3, "Layer.__del__(): layer=%s, cmd='%s'" %
                    (self.name, self.GetCmd(string = True)))

shooting in the dark: I guess the error means that g.pnmcomp is trying to read from the existing GRASS_CAIROFILE before adding to it as GRASS_CAIRO_READ is set(?), or maybe the GRASS_RENDER_IMMEDIATE=TRUE is causing the PNG driver to be used instead?

(I notice GRASS_PNG_READ is referenced in a few places, but GRASS_CAIRO_READ is not at all)
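The "empty file without .ppm" symptom follows from how tempfile.mkstemp() behaves, which the sketch below tries to show in isolation. The function names are hypothetical and only mimic the before/after logic of the patch; this is not the render.py code itself.

```python
import os
import tempfile


def old_style_names():
    """Mimic the pre-patch logic: create a temp file, then append an extension to its name."""
    fd, base = tempfile.mkstemp()
    os.close(fd)                 # mkstemp() returns an open descriptor; close it
    return base, base + ".ppm"   # the ".ppm" path is only a string, no such file exists yet


def patched_style_name(suffix=".ppm"):
    """Mimic the patched logic: let mkstemp() create the file with the suffix."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    return path


base, mapfile = old_style_names()
print(os.path.exists(base), os.path.exists(mapfile))       # True False
patched = patched_style_name()
print(os.path.exists(patched), os.path.getsize(patched))   # True 0

# Clean up the files this demonstration created
for leftover in (base, patched):
    os.remove(leftover)
```

With the old naming, a reader of the .ppm path gets "not found"; with the patched naming the file exists but is empty until something renders into it, which matches the change in error message.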

Hamish

comment:11 Changed 6 years ago by hamish

I notice that wxpython/core/render.py sets GRASS_COMPRESSION=0, but that variable doesn't exist; GRASS_PNG_COMPRESSION does. Having that mis-set probably doesn't help rendering speed & cpu load any.

broken since forever; fixed in all branches.

wrt the tempfiles mentioned in comment:10, I wonder if we can at least mitigate the left-over-file problem (#560) by putting them in the GISRC tempdir with something like:

tempfile.mkstemp(suffix = '.ppm',
   prefix = os.path.basename(os.path.dirname(os.getenv('GISRC'))) + os.sep + 'tmp_')

(? is the GISRC enviro var reset in core/render.py there for the GCP tool, or..?)
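One caveat with the snippet above: mkstemp() joins the prefix onto the default temp directory, so a prefix containing os.sep only works if that subdirectory already exists there. A possible variant uses the dir= argument instead. This is a sketch under the assumption that GISRC points at a file inside the session's temp directory; `session_tempfile` is a hypothetical name.

```python
import os
import tempfile


def session_tempfile(suffix, gisrc=None):
    """Create an empty temp file in the directory holding the GISRC file.

    Hypothetical helper; falls back to the default temp dir if GISRC is unset.
    """
    gisrc = gisrc or os.getenv("GISRC")
    # dirname() may be empty for a bare filename; treat that like "unset"
    session_dir = (os.path.dirname(gisrc) or None) if gisrc else None
    fd, path = tempfile.mkstemp(suffix=suffix, prefix="tmp_", dir=session_dir)
    os.close(fd)  # only the path is needed; close the descriptor mkstemp() opened
    return path


# Demonstration with a stand-in session directory (not a real GRASS session)
demo_dir = tempfile.mkdtemp()
demo_file = session_tempfile(".ppm", gisrc=os.path.join(demo_dir, "gisrc"))
print(os.path.dirname(demo_file) == demo_dir)  # True
os.remove(demo_file)
os.rmdir(demo_dir)
```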

Hamish

comment:12 Changed 6 years ago by hamish

Hi,

a few cleanups and fixes applied to devbr6 in r56443. The main problem there was GRASS_RENDER_IMMEDIATE always being set to TRUE, which in GRASS 6 always triggers the PNG driver instead.

Now the error message is by d.rast (e.g.) and has to do with the monitor not accepting connections. Interestingly enough, if I have the x0 XMONITOR started and selected (on linux) and the cairo driver selected as the wxGUI rendering mode, it renders to the Xmon much like the old tcl/tk GUIs would.

But as for the Cairo driver, trying to start it still locks up the wxGUI, as per the original bug report, so the call that starts it is currently commented out:

   RunCommand('d.mon', start = 'cairo')

I had an hourglass mouse cursor on the Map Display window, and could not change tabs or do anything with the Layer Manager window, although if I dragged it around and moved other windows over the top of it it still looked ok after dragging them off. But I had to use xkill to get rid of it, as neither the Map Display's nor the Layer Manager's "X" window decoration worked.

Hamish

comment:13 Changed 6 years ago by hamish

oh, and GRASS_RENDER_IMMEDIATE is now not set at the time that d.mon is called to start the cairo driver, but the GUI still locks up anyway.

comment:14 Changed 6 years ago by hamish

Hi,

a little more debug (I had to log it to a file, since I couldn't switch back to the output tab after the lock-up).

start g.gui, prefs -> map display -> display driver: cairo -> apply. (blank map display)

g.pnmcomp opacity=1.0 mask=/tmp/tmprA1z8Q.pgm height=537 width=698 \
  background=255:255:255 input=/tmp/tmprA1z8Q.ppm output=/tmp/tmpxRSoaJ.ppm

(those .ppm,.pgm are the right size, but all pixels black)

Sometimes I get no rendered map, but do see this error on the layer manager output tab: (the message is a G_fatal_error() from g.pnmcomp)

ERROR: Rendering failed. Details: Error reading PPM file

and sometimes this in the terminal:

.: Fatal IO error 0 (Success) on X server :0.0.

when it tries to run "d.mon start=cairo select=cairo" the GUI lockup happens on this line of gui/wxpython/core/gcmd.py:

    Debug.msg(3, "gcmd.RunCommand(): decoding string")
    stdout, stderr = map(DecodeString, ps.communicate())

so something is holding the stderr pipe open?

it's a bit funny, the order of RunCommand()s has the driver not started the first time:

d.mon -p
d.rast --q -o map=elevation.dem@PERMANENT
d.mon stop=cairo
g.pnmcomp opacity=1.0 mask=/tmp/tmpEdfEuF.pgm ...
 ERROR: Rendering failed. Details: Error reading PPM file
[hit the eyeball redraw button]
d.mon -p
d.mon start=cairo select=cairo
[lockup]

... ah, the cairo driver was still running since the last crash, so it didn't get started a second time before the "d.rast". ..nevermind.

adding quiet=True to the 'd.mon start=' command doesn't help, so I suspect it is the stderr pipe which is holding things up?

Hamish

comment:15 Changed 6 years ago by hamish

re the X error,

http://www.gtkforums.com/viewtopic.php?f=3&t=55792

quote:

There could be several problems here first one is related to this I found
by doing a search http://developer.gnome.org/gtk-faq/stable/x505.html. So
if you use fork() and then use exit() you have effectively closed the socket
to the X11 server.

Another problem could be that you have a X11 socket connection created in
the parent via GTK and then trying to reuse it in the child processes with
the data entering the socket in a bad order.

(another web search mentions that X is perhaps not thread-safe)

I'm testing with Python 2.6.6 btw.

Hamish

comment:16 Changed 6 years ago by hamish

Adding a ps.poll() right before the ps.communicate() in wxpython/core/gcmd.py RunCommand() doesn't help much; it returns "None" (i.e. the process is still running).

looking at "ps fax" during the lockup shows that d.mon has become a zombie:

22954 pts/0    SNl    0:02 wxgui /home/hamish/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/etc/wxpython/wxgui.py
23018 pts/0    ZN     0:00  \_ [d.mon] <defunct>
23020 pts/0    SN     0:00 /home/hamish/src/grass/svn/grass65/dist.x86_64-unknown-linux-gnu/driver/cairo cairo        dev/fifo.8a dev/fifo.8b

the d.mon goes away along with the wxgui as soon as I xkill the GUI window; the cairo process keeps running until I manually do "d.mon stop=cairo", as expected.

Hamish

comment:17 Changed 6 years ago by hamish

If I run "d.mon stop=cairo" from the terminal's command line while the GUI is locked up, life returns to the GUI, although there is a traceback to do with opacity shown in the output tab after:

Command 'd.rast -o map=elevation.dem@PERMANENT' failed
Details: No graphics monitor has been selected for output.

and the mouse cursor remains as an hourglass. But you can use the menus and go to prefs and switch back to the "default" display driver, then render maps the old way, etc.

Hamish

comment:18 in reply to:  15 Changed 6 years ago by glynn

Replying to hamish:

http://developer.gnome.org/gtk-faq/stable/x505.html.

The content of that page is nonsense.

The difference between exit() and _exit() is that the former flushes streams (FILE*) and calls any handlers registered with atexit() while the latter doesn't. Unless something creates a FILE* for the socket with fdopen() (Xlib doesn't), the distinction doesn't matter.

Terminating a process implicitly close()s all open file descriptors, but the underlying object (file, socket, etc) isn't actually closed until no process has it open. This is true for an explicit close() call as well as an implicit close due to process termination.

(another web search mentions that X is perhaps not thread-safe)

None of this has anything to do with display drivers in general or the cairo driver in particular. None of them use GTK, GDK, or threads.

comment:19 Changed 6 years ago by hamish

by adding a little debug write to d.mon just before its final exit(), I can see that the module was completing ok; i.e. not getting hung up in the G_spawn(). But then something is still holding it open and it becomes a zombie until you manually stop the cairo driver process, at which point it releases and all is back to normal (except that the wxgui mouse cursor remains an hourglass).

doing the same thing from the command line works fine.

if it was some "Graphics driver [cairo] started" text from d.mon waiting to be flushed from the stderr buffer by python's popen.communicate(), then for one thing the proc.communicate() should flush it, and for another why would externally stopping the cairo driver suddenly free it?

Glynn wrote:

Terminating a process implicitly close()s all open file descriptors, but the underlying object (file, socket, etc) isn't actually closed until no process has it open.

is there a chance that the G_spawn()'d mon.start or mon.select program has gotten a hold of one of the stdin/out/err pipes, and is holding it open, and then in the wxgui the proc.communicate() can't let its end go until that happens?

but in that case perhaps the zombie wouldn't exist? since the zombie exists perhaps it is d.mon's exit() which can't let go of the pipe?

thanks, Hamish

comment:20 in reply to:  19 Changed 6 years ago by glynn

Replying to hamish:

is there a chance that the G_spawn()'d mon.start or mon.select program has gotten a hold of one of the stdin/out/err pipes, and is holding it open, and then in the wxgui the proc.communicate() can't let its end go until that happens?

Yes. fork() duplicates practically everything, including all descriptors. G_spawn() will only close descriptors which are explicitly requested to be closed (with SF_CLOSE_DESCRIPTOR) or which are implicitly closed by being the target of a redirection (SF_REDIRECT_FILE or SF_REDIRECT_DESCRIPTOR). Any descriptors which have the close-on-exec flag set will be closed in the child process when the new program is exec()d.

ISTR that this is/was actually a problem for the d.resize script, as the new monitor inherits any descriptors from d.resize, so if d.resize inherits the write end of a pipe (e.g. if its stdout or stderr are pipes), the reader won't see EOF so long as the monitor process survives.
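The inheritance effect described above can be reproduced without GRASS. In this sketch the outer script stands in for the wxGUI, a short-lived Python child stands in for d.mon, and a sleeping grandchild stands in for the monitor process. All names are hypothetical, the behaviour shown assumes a POSIX system, and redirecting the grandchild's output to DEVNULL is just one way to stop it inheriting the pipes, not necessarily the fix d.mon would need.

```python
import subprocess
import sys
import time

SLEEP = 3  # seconds the stand-in "monitor" process stays alive


def run_parent(detach):
    """Start a short-lived child that leaves a long-lived grandchild behind.

    Returns how long communicate() took to see EOF on the child's pipes.
    """
    # When detach is True the grandchild gets its own stdout/stderr,
    # so it does not inherit the write ends of the parent's pipes.
    redirect = "stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL" if detach else ""
    child_code = (
        "import subprocess, sys\n"
        "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(%d)'], %s)\n"
        "print('child exiting')\n" % (SLEEP, redirect)
    )
    start = time.time()
    ps = subprocess.Popen([sys.executable, "-c", child_code],
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    ps.communicate()  # returns only once every holder of the write ends has closed them
    return time.time() - start
```

Calling run_parent(False) should take roughly SLEEP seconds even though the child exits at once, because the grandchild still holds the pipes open; run_parent(True) should return almost immediately.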

but in that case perhaps the zombie wouldn't exist? since the zombie exists perhaps it is d.mon's exit() which can't let go of the pipe?

A zombie is a process which has terminated (completely), but whose parent is alive and hasn't yet "reaped" it by retrieving the child's exit status with wait(), waitpid() etc. If the parent is no longer alive, the child will be "adopted" by the init process which will reap it.

For a child spawned with Python's subprocess.Popen, the .wait() or .poll() methods should be called on the Popen object (in the case of .poll(), it must be called until it returns a value other than None). Popen objects which are garbage-collected will have the .poll() method called from the finaliser; if the child process is still alive at that point, it is added to a list (subprocess._active) which is pruned (by calling subprocess._cleanup()) whenever a new Popen object is constructed. So if a Popen object is created, abandoned and gc'd before it completes, the process will remain in the zombie state until another Popen object is constructed.
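A minimal sketch of the reaping pattern for a Popen child, using only the standard library; `run_and_reap` is a hypothetical name and the child here is a trivial Python one-liner rather than a GRASS module.

```python
import subprocess
import sys
import time


def run_and_reap(exit_code=3):
    """Spawn a child that exits with exit_code and poll until it has been reaped."""
    proc = subprocess.Popen([sys.executable, "-c", "raise SystemExit(%d)" % exit_code])
    # poll() returns None while the child is still running; the first
    # non-None result means the exit status was collected (no zombie left).
    while proc.poll() is None:
        time.sleep(0.05)
    return proc.returncode


print(run_and_reap())  # 3
```

Between the child's exit and the first non-None poll() result, the child would show up as <defunct> in ps output, as in the listing from comment:16.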

For a child spawned with GRASS' G_spawn() function with the SF_BACKGROUND option, G_spawn() returns the child's PID which must be passed to G_wait().

Other than in relation to wait() etc, the existence of a zombie has no side effects other than occupying a slot in the process table (and counting toward any limit on the maximum number of processes). Descriptors have already been closed at that point.

comment:21 Changed 6 years ago by hamish

should general/g.gui/main.c (using SF_BACKGROUND) or lib/db/dbmi_client/start.c (using SF_*_DESCRIPTOR) be used as a model?

thanks, Hamish

comment:22 Changed 6 years ago by hamish

it looks like the fix will need to be in d.mon. That's not something I want to risk breaking just before release, as it's too critical a component.

cairo driver support from the wxGUI commented out in 6.4svn r56679 so we can move forward with the release. Will continue to try and get it working in devbr6.

tbc, Hamish

comment:23 in reply to:  22 Changed 5 years ago by neteler

Milestone: 6.4.3 → 6.4.5

Replying to hamish:

cairo driver support from the wxGUI commented out in 6.4svn r56679 so we can move forward with the release. Will continue to try and get it working in devbr6.

Milestone updated to 6.4.5 for the time being.

comment:24 Changed 4 years ago by martinl

Milestone: 6.4.5

Ticket retargeted after milestone closed

comment:25 Changed 4 years ago by martinl

Milestone: 6.4.6

comment:26 Changed 3 years ago by wenzeslaus

The 7 branch works well (you can switch between cairo and png and back). Should we close it as wontfix? See also #2066.


comment:27 Changed 3 years ago by mlennert

Resolution: wontfix
Status: new → closed

As the cairo driver is commented out for the GUI display, users will not be confronted with the issue, and since it works well in GRASS 7, I'm closing this as wontfix.
