Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#2127 closed enhancement (fixed)

Python implementation of g.message

Reported by: huhabla
Owned by: grass-dev@…
Priority: normal
Milestone: 7.0.0
Component: Python
Version: svn-trunk
Keywords:
Cc:
CPU: All
Platform: Unspecified

Description

The Python grass script library uses g.message to provide warning, error, debug, verbose, info and percent messages. When these functions are called many times (> 100), the overhead of starting g.message for every call adds up and can slow the actual processing down massively.

I suggest implementing the behavior of g.message directly in Python to reduce the overhead, replacing the functions that make use of g.message:

  • grass.core.message()
  • grass.core.debug()
  • grass.core.verbose()
  • grass.core.info()
  • grass.core.percent()
  • grass.core.error()
  • grass.core.warning()
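
To get a feel for the overhead described above, the following rough sketch compares starting one child process per message with writing the messages in-process. It assumes a Python interpreter child as a stand-in for g.message so that it runs without GRASS; the function names are hypothetical and the absolute timings will differ between systems.

```python
import io
import subprocess
import sys
import time

# Stand-in for g.message: a child process that writes one line to stderr.
# (Assumption: the real g.message has a comparable process start-up cost.)
CHILD = "import sys; sys.stderr.write(sys.argv[1] + '\\n')"


def time_spawned(n):
    """Return the seconds needed to emit n messages, one child process each."""
    start = time.perf_counter()
    for i in range(n):
        subprocess.run(
            [sys.executable, "-c", CHILD, f"message {i}"],
            stderr=subprocess.DEVNULL,  # discard the output, we only time it
            check=True,
        )
    return time.perf_counter() - start


def time_in_process(n):
    """Return the seconds needed to emit n messages without leaving Python."""
    sink = io.StringIO()
    start = time.perf_counter()
    for i in range(n):
        sink.write(f"message {i}\n")
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 20
    print(f"spawned:    {time_spawned(n):.4f} s")
    print(f"in-process: {time_in_process(n):.4f} s")
```

The gap grows linearly with the number of messages, which is why it only becomes noticeable for scripts that report on hundreds or thousands of steps.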

Attachments (1)

__init__.py (7.2 KB) - added by huhabla 5 years ago.
Sample implementation of a GRASS messaging interface


Change History (7)

comment:1 in reply to: description Changed 5 years ago by glynn

Replying to huhabla:

In case these messages are called many times (> 100), the overhead of calling g.message rises and can slow the actual processing massively down.

Why would you call them so many times?

I can just about understand it for debug(), in which case it might be better to use native Python equivalents (e.g. the logging module). If you're calling anything else >100 times (even verbose()), the script is probably too chatty.

I would suggest to implement the behavior of g.message directly in Python to reduce the overhead, replacing the functions that make use of g.message:

Then we would need to keep the two in sync.

G_message() etc. aren't exactly trivial; they support multiple output formats, word-wrapping, configuration of verbosity and output format via environment variables and command-line switches, and reporting of messages via stderr, log file and/or email.

If performance is a genuine issue, I would rather see g.message enhanced so that it can be used as a server, with the script spawning a single persistent g.message process which can accept multiple messages (of varying priorities) read from stdin.

comment:2 in reply to: 1 Changed 5 years ago by wenzeslaus

Replying to glynn:

Replying to huhabla:

In case these messages are called many times (> 100), the overhead of calling g.message rises and can slow the actual processing massively down.

Why would you call them so many times?

I can just about understand it for debug(), in which case it might be better to use native Python equivalents (e.g. the logging module). If you're calling anything else >100 times (even verbose()), the script is probably too chatty.

I would suggest to implement the behavior of g.message directly in Python to reduce the overhead, replacing the functions that make use of g.message:

Then we would need to keep the two in sync.

G_message() etc. aren't exactly trivial; they support multiple output formats, word-wrapping, configuration of verbosity and output format via environment variables and command-line switches, and reporting of messages via stderr, log file and/or email.

This is certainly an issue. The Python logging module could help in implementing the functionality, but it would still be necessary to implement the GRASS interface (the environment variables, etc.).
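
As a minimal sketch of that combination, the following configures a standard logging logger from the GRASS_VERBOSE environment variable. The mapping from verbosity values to logging levels, the default of 2 when the variable is unset, and the helper names are assumptions made for illustration; output formats, word-wrapping, log files and email reporting are not covered.

```python
import logging
import os

# Assumed, lossy mapping from GRASS_VERBOSE values to logging levels.
VERBOSITY_TO_LEVEL = {
    "-1": logging.CRITICAL + 1,  # silence everything
    "0": logging.WARNING,        # errors and warnings only
    "1": logging.INFO,
    "2": logging.INFO,
    "3": logging.DEBUG,          # additional verbose output
}


def level_from_env(environ=None):
    """Translate GRASS_VERBOSE into a logging level (default assumed: 2)."""
    environ = os.environ if environ is None else environ
    value = environ.get("GRASS_VERBOSE", "2")
    return VERBOSITY_TO_LEVEL.get(value, logging.INFO)


def make_logger(name, environ=None):
    """Return a logger that writes 'LEVEL: message' lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(level_from_env(environ))
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()  # defaults to sys.stderr
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

A script would then call, for example, make_logger("t.rast.example").debug("processing map %s", name); suppressed messages cost a level check only, with no process being started.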

It seems to me that the better option is to use G_message() etc. through ctypes. There is always some complexity in using ctypes, but we need it to work anyway for many things (some scripts, pygrass, parts of the GUI), so making it a requirement for all scripts is not such an issue from my point of view.
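
For readers unfamiliar with ctypes, the mechanics look roughly as follows. Since the GRASS libraries may not be installed, this sketch calls strlen() from the C runtime as a stand-in for a function that takes a C string, such as G_message(); it assumes a POSIX system, and c_strlen is a hypothetical helper name.

```python
import ctypes
import ctypes.util

# Load the C runtime. find_library() may return None, in which case
# CDLL(None) exposes the symbols already loaded into the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the prototype: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Encode a Python str and pass it to C as a NUL-terminated char*."""
    return libc.strlen(text.encode("utf-8"))
```

The call runs inside the Python process, so no child process is created, but a crash or exit in the C code also takes the Python process down with it.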

If performance is a genuine issue, I would rather see g.message enhanced so that it can be used as a server, with the script spawning a single persistent g.message process which can accept multiple messages (of varying priorities) read from stdin.

Although this approach might be beneficial in more places in GRASS (mainly GUI-related things; g.message is actually a GUI thing too), and I would like to have a nice way to do it when necessary, I don't think that it is better than ctypes. I think ctypes is the better option.

I think that it is as fragile as ctypes. The difference I can see is the number of processes created. For a single call of one Python module there is not much of a difference: one g.message server process versus zero when using ctypes. But when a Python module is called many times (> 100), we get a lot of g.message server processes versus zero in the case of ctypes.

comment:3 in reply to:  1 Changed 5 years ago by huhabla

Replying to glynn:

Replying to huhabla:

In case these messages are called many times (> 100), the overhead of calling g.message rises and can slow the actual processing massively down.

Why would you call them so many times?

I would like to add debug messages, verbose messages and percentage output to many processing steps in the temporal framework, so that the user can follow the processing of time-stamped maps. I usually handle hundreds to many thousands of maps. With the current approach, g.message is called at every single step to evaluate the debug level, verbosity level and so on.

I can just about understand it for debug(), in which case it might be better to use native Python equivalents (e.g. the logging module). If you're calling anything else >100 times (even verbose()), the script is probably too chatty.

I would suggest to implement the behavior of g.message directly in Python to reduce the overhead, replacing the functions that make use of g.message:

Then we would need to keep the two in sync.

G_message() etc. aren't exactly trivial; they support multiple output formats, word-wrapping, configuration of verbosity and output format via environment variables and command-line switches, and reporting of messages via stderr, log file and/or email.

If performance is a genuine issue, I would rather see g.message enhanced so that it can be used as a server, with the script spawning a single persistent g.message process which can accept multiple messages (of varying priorities) read from stdin.

That is a great idea indeed. I suggest implementing this message server in Python, calling the G_message() functions through ctypes. It can be part of PyGRASS, which would implement the server process and the client access functions that send text messages to the server process, which writes them to stderr.

Example interface:

    import grass.pygrass.messages as messages

    # Create the messenger object that starts the server process.
    # As Glynn said, the server accepts multiple messages with different priorities.
    msgr = messages.Messenger()

    # In addition to the stdout/stderr output, the server can write the messages into a logfile
    msgr.set_logfile("logfile.txt")

    # Send an info message to the server, the server will call the G_info() function using ctypes
    msgr.info("This is an info message")

    # Send a verbose message, the server will call the G_verbose() function using ctypes
    msgr.verbose("This is a verbose message")

    # Send a warning message, the server will call the G_warning() function using ctypes
    msgr.warning("This is the last warning")

    # Send an error message, the server will call the G_error() function using ctypes
    msgr.error("This is an error message")

    # Send a percentage message, the server will call the G_percent() function using ctypes
    msgr.percent(1, 1, 1)  # 100%

The PyGRASS implementation could respawn the server process in case a G_fatal_error() occurs while using the ctypes interface. The server process will be shut down when the Python object gets deleted. The communication between server and client functions should be as fast as possible, hence the client simply sends out a message and does not wait for a server response.
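
The behavior described above (one persistent server, fire-and-forget sends, respawn after a fatal error) can be sketched without GRASS as follows. This is an illustration under stated assumptions, not the proposed implementation: it uses subprocess and a child Python interpreter that writes plain text to stderr instead of calling the G_*() functions through ctypes, and the "LEVEL|text" line protocol, the restarts counter and the stderr parameter are invented for the example.

```python
import subprocess
import sys

# Server loop run in the child process. It reads "LEVEL|text" lines from
# stdin and writes them to stderr. A FATAL line makes the child exit, standing
# in for G_fatal_error() terminating a real server.
SERVER_CODE = r'''
import sys
PREFIX = {"WARNING": "WARNING: ", "ERROR": "ERROR: ", "FATAL": "ERROR: "}
for line in sys.stdin:
    level, _, text = line.rstrip("\n").partition("|")
    sys.stderr.write(PREFIX.get(level, "") + text + "\n")
    sys.stderr.flush()
    if level == "FATAL":
        sys.exit(1)
'''


class Messenger:
    """Client that forwards messages to one persistent child process."""

    def __init__(self, stderr=None):
        self._stderr = stderr  # None lets the child inherit our stderr
        self.restarts = 0
        self._start()

    def _start(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", SERVER_CODE],
            stdin=subprocess.PIPE, stderr=self._stderr, text=True)

    def _write(self, level, text):
        self._proc.stdin.write(f"{level}|{text}\n")
        self._proc.stdin.flush()

    def _send(self, level, text):
        # Fire and forget: no reply is read back from the server.
        if self._proc.poll() is not None:  # server died, e.g. fatal error
            self._proc.stdin.close()
            self.restarts += 1
            self._start()
            self._write("WARNING", "Needed to restart the messenger server")
        self._write(level, text)

    def info(self, text):
        self._send("INFO", text)

    def warning(self, text):
        self._send("WARNING", text)

    def error(self, text):
        self._send("ERROR", text)

    def test_fatal_error(self):
        self._send("FATAL", "this is a fatal error")
        self._proc.wait()  # wait so the next call reliably sees the dead server

    def stop(self):
        """Close the pipe so the server sees EOF; return its exit code."""
        self._proc.stdin.close()
        return self._proc.wait()
```

A real implementation would additionally have to handle messages containing newlines and a server that dies between the liveness check and the write, which would surface here as a BrokenPipeError.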

As Vaclav pointed out: such an interface would be very useful for Python libraries, modules and the GUI.

This kind of client-server approach, where the server calls GRASS functions through ctypes, can also be useful in other applications: for example, the GUI could send vector editing messages (ctypes objects?) to a GRASS vector edit server process that can be respawned in case of a fatal error. Hence the GUI will not crash when an error occurs.

Sounds like a kind of remote procedure call interface to the functions of the GRASS C-libraries.

Changed 5 years ago by huhabla

Attachment: __init__.py added

Sample implementation of a GRASS messaging interface

comment:4 Changed 5 years ago by huhabla

I have implemented a prototype of the GRASS messenger interface. The file is attached to this ticket. It uses the Python multiprocessing interface, and the IPC is handled via pipes. Here is a usage example:

    msgr = Messenger()
    msgr.message("message")
    msgr.verbose("verbose message")
    msgr.important("important message")
    msgr.test_fatal_error()
    msgr.percent(1, 1, 1)
    msgr.debug(0, "debug 0")
    msgr.warning("Ohh")
    msgr.debug(1, "debug 1")
    msgr.error("Ohh no")
    msgr.stop()
    # This should result in:
    """
    message
    important message
    ERROR: this is a fatal error
    WARNING: Needed to restart the messenger server
     100%
    D0/0: debug 0
    WARNING: Ohh
    ERROR: Ohh no
    """
    # Test of the percentage creation
    msgr = Messenger()
    num = 100000
    for i in range(num):
        msgr.percent(i, num, 10)
    msgr.stop()
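
The percentage test above sends 100000 updates with a step of 10. The following sketch shows which of them would actually be printed under an assumed throttling rule: print the first value, then print again only once the integer percentage has advanced by at least the step. The rule is a simplification made for illustration and is not taken from the GRASS source; reported_percentages is a hypothetical helper.

```python
def reported_percentages(total, step):
    """Return the percentages a throttled reporter would print.

    Assumed rule: always print the first value, afterwards print only when
    the integer percentage has advanced by at least `step`.
    """
    printed = []
    last = None
    for i in range(total):
        pct = 100 * i // total
        if last is None or pct >= last + step:
            printed.append(pct)
            last = pct
    return printed


if __name__ == "__main__":
    print(reported_percentages(100000, 10))
```

Under that assumption only ten of the 100000 calls produce output, so the per-call cost of handing the message to the server matters far more than the printing itself.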

What do you think? Any suggestions for improvement or enhancement requests? :)

I would like to put the attached file __init__.py into a new pygrass directory "lib/python/pygrass/messenger", if there are no objections.

comment:5 Changed 5 years ago by huhabla

Resolution: fixed
Status: new → closed

The fast and exit-safe interface to GRASS C-library message functions is now available in trunk revision r58201.

comment:6 in reply to:  2 Changed 5 years ago by glynn

Replying to wenzeslaus:

It seems to me that better option is to use G_message() etc through ctypes.

Using ctypes for this is overkill.

Python's standard library intentionally doesn't use ctypes, so that sites can remove it if they so wish (it's considered a risk factor).

grass.script is supposed to be a support library for scripts, not a Python wrapper around the GRASS libraries.
