Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#2151 closed defect (fixed)

g.gui.* modules which use temporal framework leave processes after exiting

Reported by: annakrat
Owned by: grass-dev@…
Priority: normal
Milestone: 7.0.0
Component: Python
Version: svn-trunk
Keywords: g.gui.animation, temporal, RCP
Cc:
CPU: All
Platform: Linux

Description

During startup, g.gui.animation and g.gui.timeline create two subprocesses; specifically, the call to tgis.init() does. I suppose these are related to the recently implemented RPC server. The problem is that these processes are not terminated when the GUI module is closed. This is caused by the fact that g.gui.* modules are launched in the background by the GuiModuleMain function. Is there anything we can do about it? I guess it's not a big deal when a user runs the module a few times, but for me it's pretty annoying.

This applies to Linux and probably Mac OSX, too.

Change History (13)

comment:1 Changed 6 years ago by huhabla

This is really strange. The subprocesses created by tgis.init() for the messenger and the C-interface are marked as daemon processes (they are terminated when the parent process exits). In addition, the destructors of the messenger and C-interface objects explicitly terminate the subprocesses.

It seems that neither mechanism works with the GUI? AFAICS the destructors are not called when the GUI process exits. I have no idea why.
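
For reference, here is a minimal sketch of one way daemon subprocesses can outlive their parent. It assumes the temporal framework's helpers behave like Python multiprocessing daemon processes (the thread does not show that code); PARENT, worker, is_alive and run_parent are hypothetical names. multiprocessing terminates daemonic children from an exit handler, and os._exit() ends a process without running exit handlers, so a parent that leaves through os._exit() can orphan them. The sketch is POSIX only and written for current Python 3.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Parent script, run in a fresh interpreter so the demo does not fork this one.
PARENT = """
import multiprocessing, os, sys, time

def worker():
    time.sleep(60)  # stand-in for a long-running helper subprocess

ctx = multiprocessing.get_context("fork")
proc = ctx.Process(target=worker, daemon=True)
proc.start()
with open(sys.argv[1], "w") as pid_file:
    pid_file.write(str(proc.pid))
if sys.argv[2] == "hard":
    os._exit(0)  # skips multiprocessing's exit handler
"""


def is_alive(pid):
    """Return True if a process with this pid still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True


def run_parent(mode):
    """Run the parent, then report whether its daemon child survived it."""
    with tempfile.TemporaryDirectory() as tmp:
        pid_path = os.path.join(tmp, "child.pid")
        subprocess.run([sys.executable, "-c", PARENT, pid_path, mode], check=True)
        with open(pid_path) as pid_file:
            child_pid = int(pid_file.read())
    survived = is_alive(child_pid)
    if survived:
        os.kill(child_pid, signal.SIGTERM)  # clean up the orphan
    return survived


normal_exit_orphan = run_parent("normal")
hard_exit_orphan = run_parent("hard")
print("child survives normal exit:", normal_exit_orphan)
print("child survives os._exit():", hard_exit_orphan)
```

Whether this is the mechanism at work in the GUI modules is not established at this point in the thread; the sketch only shows that the daemon flag alone does not protect against a hard exit of the parent.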

comment:2 Changed 6 years ago by huhabla

I tried to solve the problem by registering a function at exit that explicitly terminates the subprocesses. I added this code to lib/python/temporal/core.py:

import atexit

def stop_subprocesses():
    """!Stop the messenger and C-interface subprocesses
    """
    global message_interface
    global c_library_interface
    if message_interface:
        message_interface.stop()
    if c_library_interface:
        c_library_interface.stop()

atexit.register(stop_subprocesses)

This works for temporal modules, but not for the GUI.

The registered function is not called at exit when g.gui.animation terminates! How does the GUI framework terminate so that this functionality does not work?

Besides that, it seems to me that tgis.init() is called several times when g.gui.animation is opened and a space time raster dataset is chosen for visualization. I hope that this can be reduced to a single tgis.init() call?

comment:3 Changed 6 years ago by huhabla

None of the methods that I have tried work when the GUI module uses os.fork() to start the module in the background.

Why do we need fork() at all? Unix already supports job management in the shell, and fork() does not work on Windows?

If we really need it, can this be implemented in a better way?

comment:4 in reply to:  2 Changed 6 years ago by annakrat

Replying to huhabla:

Besides that, it seems to me that tgis.init() is called several times when g.gui.animation is opened and a space time raster dataset is chosen for visualization. I hope that this can be reduced to a single tgis.init() call?

Sure, done in r58578.

comment:5 in reply to:  3 Changed 6 years ago by wenzeslaus

Replying to huhabla:

None of the methods that I have tried work when the GUI module uses os.fork() to start the module in the background.

Why do we need fork() at all? Unix already supports job management in the shell, and fork() does not work on Windows?

The idea was to start the application in the background (r57388, r57370 - r57378 launch GUI in the background). There was also a discussion on the mailing list with reasoning that g.gui.* should behave in the same way as g.gui, that these (g.gui*) are special cases (not able to find link now).

While GuiModuleMain is using Python os.fork, g.gui is using G_spawn from grass/spawn.h.

When the generated GUI for a module is shown (v.pack or g.region --ui), no special functionality is provided and it blocks the terminal unless you use &.

If we really need it, can this be implemented in a better way?

comment:6 Changed 6 years ago by annakrat

I just tried this Python recipe and it worked for me; the processes are terminated. I tried to find the difference, and it seems that just this change is enough (in the GuiModuleMain function):

--- core/utils.py	(revision 58561)
+++ core/utils.py	(working copy)
@@ -1064,7 +1064,8 @@
         child_pid = os.fork()
         if child_pid == 0:
             mainfn()
-        os._exit(0)
+        else:
+            os._exit(0)
     else:
         mainfn()

Can I apply it safely, or could it cause other problems?
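
The effect of the change can be sketched outside the GUI. The following is a simplified stand-in, not the actual GuiModuleMain code: LAUNCHER, cleanup, mainfn and run_launcher are hypothetical names, cleanup plays the role of a function registered with atexit (such as stop_subprocesses in comment 2), and the parent waits for the child only so that the demo can inspect the result. In the original flow the child also reaches os._exit(0), which ends the process without running exit handlers; with the else branch the child exits normally. The sketch is POSIX only and written for current Python 3.

```python
import os
import subprocess
import sys
import tempfile

# Launcher script, run in a fresh interpreter so the demo does not fork this one.
LAUNCHER = """
import atexit, os, sys

log_path, mode = sys.argv[1], sys.argv[2]

def cleanup():
    # Stand-in for a registered exit handler; leaves a marker in the log file.
    with open(log_path, "a") as log:
        log.write("cleanup")

def mainfn():
    # Stand-in for the GUI main function.
    atexit.register(cleanup)

child_pid = os.fork()
if child_pid == 0:
    mainfn()
    if mode == "original":
        os._exit(0)  # before the patch: the child falls through to os._exit(0)
else:
    os.waitpid(child_pid, 0)  # demo only; the launcher in the diff exits at once
    os._exit(0)
"""


def run_launcher(mode):
    """Run the launcher in the given mode and return what cleanup() logged."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "log.txt")
        subprocess.run([sys.executable, "-c", LAUNCHER, log_path, mode], check=True)
        if not os.path.exists(log_path):
            return ""
        with open(log_path) as log:
            return log.read()


original_log = run_launcher("original")
patched_log = run_launcher("patched")
print("original flow logged:", repr(original_log))
print("patched flow logged:", repr(patched_log))
```

The parent still leaves through os._exit(0) in both modes, which is harmless in this sketch because the handler is registered only in the child.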

comment:7 in reply to:  6 Changed 6 years ago by annakrat

Replying to annakrat:

Soeren had the same idea (r58580). Unfortunately, I can't say it's solved completely. I still have a problem with g.gui.timeline. The subprocesses don't terminate when it's called with the input parameter, or when it's called without any parameter and you then display a dataset. When it's just opened and closed, the subprocesses terminate. For g.gui.animation it seems OK now. So I am confused about what's different in g.gui.timeline. Well, at least some progress...

comment:8 Changed 6 years ago by huhabla

It seems to me that the g.gui.timeline child process is still alive after the fork. The init process is the new parent of the forked child??

I have patched g.gui.timeline to make it a bit faster and to ensure at least the termination of the messenger and C-interface subprocesses of the temporal framework:

--- timeline/frame.py
+++ timeline/frame.py
@@ -78,7 +78,18 @@
         self._layout()
         self.temporalType = None
         self.unit = None
+        # We create a database interface here to speedup the GUI
+        self.dbif = tgis.SQLDatabaseInterfaceConnection()
+        self.dbif.connect()
 
+    def __del__(self):
+        """!Close the database interface and stop the messenger and C-interface
+           subprocesses.
+        """
+        if self.dbif.connected is True:
+            self.dbif.close()
+        tgis.stop_subprocesses()
+
     def _layout(self):
         """!Creates the main panel with all the controls on it:
              * mpl canvas
@@ -145,15 +156,16 @@
         self.timeData = {}
         mode = None
         unit = None
+
         for series in timeseries:
             name = series[0] + '@' + series[1]
             etype = series[2]
             sp = tgis.dataset_factory(etype, name)
-            if not sp.is_in_db():
+            if not sp.is_in_db(dbif=self.dbif):
                 GError(self, message=_("Dataset <%s> not found in temporal database") % (name))
                 return
 
-            sp.select()
+            sp.select(dbif=self.dbif)
 
             self.timeData[name] = {}
             self.timeData[name]['elementType'] = series[2]
@@ -167,8 +179,8 @@
                 return
 
             # check topology
-            maps = sp.get_registered_maps_as_objects()
-            self.timeData[name]['validTopology'] = sp.check_temporal_topology(maps)
+            maps = sp.get_registered_maps_as_objects(dbif=self.dbif)
+            self.timeData[name]['validTopology'] = sp.check_temporal_topology(maps=maps, dbif=self.dbif)
 
             self.timeData[name]['temporalMapType'] = sp.get_map_time()  # point/interval
             self.timeData[name]['unit'] = None  # only with relative
@@ -194,7 +206,7 @@
                                 'north', 'south', 'west', 'east'])
 
             rows = sp.get_registered_maps(columns=columns, where=None,
-                                          order='start_time', dbif=None)
+                                          order='start_time', dbif=self.dbif)
             if rows is None:
                 rows = []
             for row in rows:
@@ -385,7 +397,7 @@
         @return (mapName, mapset, type)
         """
         validated = []
-        tDict = tgis.tlist_grouped('stds', group_type=True)
+        tDict = tgis.tlist_grouped('stds', group_type=True, dbif=self.dbif)
         # nested list with '(map, mapset, etype)' items
         allDatasets = [[[(map, mapset, etype) for map in maps]
                      for etype, maps in etypesDict.iteritems()]

The patch is not applied yet, since it does not solve the issue that the child process is still alive. Maybe an explicit os.exit() should be added to g.gui.timeline? How about an exit button?

comment:9 in reply to:  8 ; Changed 6 years ago by annakrat

Replying to huhabla:

The patch is not applied yet, since it does not solve the issue that the child process is still alive. Maybe an explicit os.exit() should be added to g.gui.timeline? How about an exit button?

os.exit would close the whole wxGUI when the timeline tool is called from the wxGUI. An exit button would not solve anything; it just does the same as the standard x close button or Alt+F4. I tried it to be sure and there is no difference. I think you can commit your changes; it's an improvement. Could the problem come from matplotlib? Maybe I am using it in an incorrect way.

comment:10 in reply to:  9 ; Changed 6 years ago by huhabla

Replying to annakrat:

os.exit would close the whole wxGUI when the timeline tool is called from the wxGUI. An exit button would not solve anything; it just does the same as the standard x close button or Alt+F4. I tried it to be sure and there is no difference. I think you can commit your changes; it's an improvement. Could the problem come from matplotlib? Maybe I am using it in an incorrect way.

I have applied the patch and included an explicit kill call in the destructor of TimelineFrame, see r58597. The kill signal will only be sent if g.gui.timeline works in stand-alone mode (hopefully, please cross-check that).

Killing the process is a crude solution, but I have no better idea how to avoid orphaned daemon g.gui.timeline processes.

comment:11 in reply to:  10 ; Changed 6 years ago by annakrat

Replying to huhabla:

I have applied the patch and included an explicit kill call in the destructor of TimelineFrame, see r58597. The kill signal will only be sent if g.gui.timeline works in stand-alone mode (hopefully, please cross-check that).

Killing the process is a crude solution, but I have no better idea how to avoid orphaned daemon g.gui.timeline processes.

I found the problem and removed it (r58598, r58599). Now it should terminate all processes, but please check. It was caused by a line of code which actually did nothing for me, but I thought it might work with some other version of matplotlib. I should have looked at it earlier. What about the changes you made? Some of them, like the os.kill, are redundant now; not sure about tgis.stop_subprocesses()?

comment:12 in reply to:  11 ; Changed 6 years ago by huhabla

Resolution: fixed
Status: new → closed

Replying to annakrat:

I found the problem and removed it (r58598, r58599). Now it should terminate all processes, but please check. It was caused by a line of code which actually did nothing for me, but I thought it might work with some other version of matplotlib. I should have looked at it earlier. What about the changes you made? Some of them, like the os.kill, are redundant now; not sure about tgis.stop_subprocesses()?

I kept tgis.stop_subprocesses() within the destructor to ensure the termination of the messenger and C-interface subprocesses when the timeline tool is started from the main GUI. The explicit kill and dependent code were removed in svn (r58602 and r58609). It works for me: no orphaned processes after closing g.gui.timeline.

I will close this ticket, marking it as fixed.

comment:13 in reply to:  12 Changed 6 years ago by annakrat

Replying to huhabla:

I will close this ticket, marking it as fixed.

Great, thanks for help
