Opened 6 years ago

Last modified 3 years ago

#3772 new enhancement

Make the grass library importable outside of a GRASS session

Reported by: pmav99 Owned by: grass-dev@…
Priority: normal Milestone: 7.8.3
Component: Python Version: svn-trunk
Keywords: Cc:
CPU: Unspecified Platform: Unspecified

Description

Note

This came out rather lengthy. Before we get started, I should say that comments and feedback are more than welcome.

Objective

The grass library is not currently importable outside of a GRASS session. This is easy to demonstrate:

$ PYTHONPATH=dist.x86_64-pc-linux-gnu/etc/python python -c 'import grass.script; print("OK")'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/feanor/Prog/git/grass-p2/repo/dist.x86_64-pc-linux-gnu/etc/python/grass/script/__init__.py", line 5, in <module>
    from .core   import *
  File "/home/feanor/Prog/git/grass-p2/repo/dist.x86_64-pc-linux-gnu/etc/python/grass/script/core.py", line 38, in <module>
    gettext.install('grasslibs', os.path.join(os.getenv("GISBASE"), 'locale'))
  File "/home/feanor/Prog/git/grass-p2/.direnv/python-2.7.15/lib/python2.7/posixpath.py", line 70, in join
    elif path == '' or path.endswith('/'):
AttributeError: 'NoneType' object has no attribute 'endswith'

The cause is fairly obvious: $GISBASE is not defined, so the os.getenv() call returns None instead of the path to the GRASS distribution.
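To make the failure mode concrete, here is a minimal sketch that reproduces it without GRASS. The helper names `locale_dir` and `try_locale_dir` are hypothetical and exist only for this illustration. Note that Python 3 raises a TypeError at this point, where Python 2.7 raised the AttributeError shown in the traceback.

```python
import os


def locale_dir(environ):
    """Mimic the failing expression: join $GISBASE with 'locale'."""
    # environ.get() returns None when the variable is missing,
    # just like os.getenv() does.
    return os.path.join(environ.get("GISBASE"), "locale")


def try_locale_dir(environ):
    """Return the locale path, or None when GISBASE is not set."""
    try:
        return locale_dir(environ)
    except (AttributeError, TypeError):
        # Python 2.7 raises AttributeError, Python 3 raises TypeError.
        return None


missing = try_locale_dir({})
present = try_locale_dir({"GISBASE": "/opt/grass"})
```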

IMHO having an importable library will be significantly beneficial both to GRASS devs and to those who want to script GRASS since, among other things, it gives you the ability to programmatically create a GRASS session without having to jump through hoops.

So I decided to have a look.

tl;dr

Making this work on Linux is not too hard but it needs feedback/testing on Win and Mac which unfortunately I cannot do.

Necessary note about gettext

When you are developing internationalized Python applications, the very first thing you usually need to do is call gettext.install(). This call injects _() into the builtins namespace, effectively making it globally available in the codebase.

If you do this at the very top of your package's __init__.py you then don't need to do anything else.
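A small sketch of that behaviour, written for Python 3 (the module is called `__builtin__` on Python 2). The domain name `demo_domain` and the empty temporary directory are made up for the example. Because no catalog exists there, gettext falls back to returning the untranslated string.

```python
import builtins
import gettext
import tempfile

# An empty directory stands in for a real locale directory, so no catalog
# is found and gettext falls back to the untranslated strings.
empty_locale_dir = tempfile.mkdtemp()

# "demo_domain" is a made-up translation domain used only for illustration.
gettext.install("demo_domain", empty_locale_dir)

# After install(), _ lives in the builtins module ...
installed = hasattr(builtins, "_")

# ... so it can be called without importing it anywhere.
greeting = _("Hello")
```

One consequence worth keeping in mind: every call to gettext.install() rebinds the same `_` name, so the most recent call decides which domain is used.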

What is the problem?

There is nothing really preventing us from importing the GRASS library apart from this line:

gettext.install('grasslibs', os.path.join(os.getenv("GISBASE"), 'locale'))

How to fix?

$GISBASE is the (absolute?) path to the GRASS installation/distribution directory. The python library is located at $GISBASE/etc/python/grass while the locale directory is located at $GISBASE/locale.

All we need to do to get this working is to get rid of the os.getenv call and replace it with a relative path to the locale directory.

Assuming that the directory structure of a GRASS distribution is stable, we can easily do this using os.path.dirname and __file__.

E.g. by using something like this:

# contents of lib/python/__init__.py
import gettext
import os

from os.path import dirname

# _ROOT_DIR points to the root directory of the GRASS installation/distribution
# Yeap, calling 4 times dirname is not really elegant, but we want to go from:
#     dist.x86_64-pc-linux-gnu/etc/python/grass/__init__.py
# to:
#     dist.x86_64-pc-linux-gnu/
#
_ROOT_DIR = dirname(dirname(dirname(dirname(os.path.abspath(__file__)))))

# Setup i18N
#
# Calling `gettext.install()` installs `_()` in the builtins namespace and thus it
# becomes available globally (i.e. in the same process). We need to do this here
# to ensure that the injection happens before anything else gets imported.
#
# For more info please check the following links:
# - https://docs.python.org/2/library/gettext.html#gettext.install
# - https://pymotw.com/2//gettext/index.html#application-vs-module-localization
# - https://www.wefearchange.org/2012/06/the-right-way-to-internationalize-your.html
#
_LOCALE_DIR = os.path.join(_ROOT_DIR, "locale")

# XXX not really necessary + it will fail when GRASS is compiled without NLS
# XXX but it does make debugging easier.
if not os.path.exists(_LOCALE_DIR):
    raise ValueError("LOCALE could not be found: %s" % _LOCALE_DIR)

gettext.install('grasslibs', _LOCALE_DIR)
gettext.install('grassmods', _LOCALE_DIR)
gettext.install('grasswxpy', _LOCALE_DIR)

# ...
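As a side note, the four nested dirname() calls can be expressed with pathlib on Python 3. This is only a sketch under the same assumption about the directory layout. The function name `grass_root_from_init` is hypothetical and not part of the proposed patch.

```python
from pathlib import PurePosixPath


def grass_root_from_init(init_file):
    """Return the distribution root for a given grass/__init__.py path.

    Assumes the layout <root>/etc/python/grass/__init__.py.
    """
    # parents[0] is grass/, parents[1] is python/, parents[2] is etc/,
    # parents[3] is the distribution root.
    return PurePosixPath(init_file).parents[3]


root = grass_root_from_init(
    "dist.x86_64-pc-linux-gnu/etc/python/grass/__init__.py"
)
locale_dir = root / "locale"
```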

After doing that, any code that uses import grass will be able to use _() and, consequently, the various gettext.install() calls throughout the codebase are no longer needed. The only exception to that is lib/init/grass.py since it does not import the grass library.

WRT Win and Mac

No idea if this works OK on Win & Mac. I would appreciate any feedback from users of those platforms.

WRT Python 2 and Python 3 compat

I have not extensively tested this yet. But if any boilerplate code is needed for Python 2/3 compatibility, e.g. like the one introduced here: https://trac.osgeo.org/grass/browser/grass/trunk/gui/wxpython/core/globalvar.py?rev=73930#L40 then this code should also be moved to lib/python/__init__.py

Show me the code!

The code is here (branch "importable"): https://github.com/pmav99/grass-ci/tree/importable

There are only two commits currently, but more may be added. Be warned, this is WIP so I will be rebasing this branch.

How to test

The quick test is of course:

$ PYTHONPATH=dist.x86_64-pc-linux-gnu/etc/python python -c 'import grass.script; print("OK")'

which should now work just fine. If you can run any tests and report any issues I would be really grateful.

Attachments (1)

importable.patch (2.5 KB ) - added by pmav99 6 years ago.

Change History (16)

comment:1 by mankoff, 6 years ago

How does the proposed change differ from https://github.com/zarch/grass-session ?

comment:2 by pmav99, 6 years ago

The proposed change is not directly comparable to grass-session. They have different scopes. To make this more clear:

grass-session is a 3rd party library that can be used to programmatically create a GRASS session.

If grass-session was implemented in the grass library itself, then, since you can't import the grass library unless you already are inside a GRASS session, we would have a chicken-egg problem. I.e. in order to create a GRASS session we would need to already be inside a GRASS session. That's why grass-session needs to be a 3rd party library.

So, if the grass library becomes importable, then, grass-session would have a cleaner implementation and eventually, grass-session could be implemented in the grass library itself, with obvious benefits (e.g. keeping the code up to date, easier deployments etc).

comment:3 by pmav99, 6 years ago

Question: Just in case I have completely misunderstood this, is the grass library non-importable by design? If yes, can someone please explain the motivation for this?

I am asking because I consider myself to be a fairly experienced Python dev and I find it really strange that I can't import the library. But I don't have that much experience with GRASS, so I might be missing something.

comment:4 by mmetz, 6 years ago

This is a question about the general design of GRASS GIS.

There are two requirements in order to use GRASS GIS:

Some environment variables must be set:

At the very least GISBASE, LD_LIBRARY_PATH must be expanded to include the path to GRASS libraries, and PATH must be expanded to include the path to both all GRASS core modules and all GRASS scripts. GISRC must point to a valid gisrc file.

See also https://grasswiki.osgeo.org/wiki/Working_with_GRASS_without_starting_it_explicitly
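As a rough illustration of the variables listed above, the following sketch builds such an environment as a plain dictionary. The function name `build_grass_env` is hypothetical, and the sub-directories `bin`, `scripts` and `lib` under GISBASE are an assumption about the installation layout. Existing values of PATH and LD_LIBRARY_PATH are kept and the GRASS paths are prepended.

```python
import os


def build_grass_env(gisbase, gisrc, environ):
    """Return a copy of environ extended with the variables GRASS expects."""
    env = dict(environ)
    env["GISBASE"] = gisbase
    env["GISRC"] = gisrc

    def prepend(name, *paths):
        # Keep whatever the variable already contained.
        existing = env.get(name)
        parts = list(paths) + ([existing] if existing else [])
        env[name] = os.pathsep.join(parts)

    # Assumed layout: modules in bin/, scripts in scripts/, libraries in lib/.
    prepend("PATH", os.path.join(gisbase, "bin"), os.path.join(gisbase, "scripts"))
    prepend("LD_LIBRARY_PATH", os.path.join(gisbase, "lib"))
    return env


env = build_grass_env("/opt/grass", "/tmp/gisrc", {"PATH": "/usr/bin"})
```

Returning a new dictionary instead of modifying os.environ makes it possible to pass the result to subprocess calls via their env argument without touching the current process.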

comment:5 by Nikos Alexandris, 6 years ago

Making the grass library importable will be of great value. Imagine an HPC infrastructure set up in a way that allows only for arbitrary Python code execution, albeit in a secure way. Having GRASS GIS importable would fit such a case very well.

in reply to:  4 ; comment:6 by pmav99, 6 years ago

Replying to mmetz:

This is a question about the general design of GRASS GIS.

Mmm... yes and no.

You are right of course that in order to use GRASS you first need to set up the GRASS session. But I think that this issue has a slightly different scope, since it is about making it easier to use the grass library.

More specifically, being able to import the grass library from a normal bash session means that we will be able to:

  • programmatically create a GRASS session without relying on 3rd party libraries. This might seem limited, but not having to pip install packages does simplify large scale GRASS deployments.
  • write tests that can be run outside of a GRASS session (note: creating the session should ideally be part of a test's fixture, since different tests might need different sessions etc). For me this is the most important aspect.

Nevertheless, once the ability is there, more use cases might become evident.

Anyway, this is currently blocked by #3790 because what practically makes grass "unimportable" is the gettext code + this line. Once that is fixed, I will upload a concrete patch for review

in reply to:  6 comment:7 by neteler, 6 years ago

Replying to pmav99:

Anyway, this is currently blocked by #3790 because what practically makes grass "unimportable" is the gettext code + this line. Once that is fixed, I will upload a concrete patch for review

Thanks for your patch to fix #3790 (applied in r74307), now this ticket should be unblocked.

by pmav99, 6 years ago

Attachment: importable.patch added

comment:8 by pmav99, 6 years ago

Added a patch that makes it possible to import the GRASS library even from a normal shell session.

After applying the patch and compiling, the code can be tested with:

LD_LIBRARY_PATH=dist.x86_64-pc-linux-gnu/lib PYTHONPATH=dist.x86_64-pc-linux-gnu/etc/python python -c 'import grass.script' && echo 'OK!'

The patch itself does 2 things:

  1. It does not resolve the path to GISBASE by using the ENV variable but by using the relative path from dist.x86_64-pc-linux-gnu/etc/python/grass/__init__.py
  2. It moves the caching of the GRASS commands from the lib/python/pygrass/modules/shortcuts module to the MetaModule class itself.
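The second point is presumably needed because a lookup done at module level runs at import time, when no session exists yet. The sketch below shows the general idea of deferring such a lookup to first use. It is not the code from the patch: the class name `CommandCatalog` and the injected `list_commands` callable are hypothetical.

```python
class CommandCatalog:
    """Looks up the available commands once, on first access."""

    def __init__(self, list_commands):
        # list_commands is any zero-argument callable returning command names.
        # It is injected here so the sketch runs without a GRASS session.
        self._list_commands = list_commands
        self._cache = None

    @property
    def commands(self):
        # Nothing is looked up until somebody actually asks for the list.
        if self._cache is None:
            self._cache = sorted(self._list_commands())
        return self._cache


calls = []


def fake_list_commands():
    # Stand-in for the real lookup; records how often it is invoked.
    calls.append(1)
    return ["r.info", "g.region"]


catalog = CommandCatalog(fake_list_commands)
calls_after_creation = len(calls)
first = catalog.commands
second = catalog.commands
```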

comment:9 by pmav99, 6 years ago

According to the docs, lib/python/script/setup.py contains functions that

can be used in Python scripts to setup a GRASS environment and session without using grassXY

Not having the ability to import the grass library from a normal session means that you need to already be in a GRASS session in order to use the functions that can create a session (chicken-egg problem).

After making the grass library importable, the whole problem of bootstrapping the GRASS session will finally be much easier to resolve.

Of course, we will need to review and, if necessary, improve the setup/cleanup functions in lib/python/script/setup.py and, most importantly, write tests for them. But after that is done, it will finally be possible to refactor/simplify the bootstrapping code in lib/python/init/grass.py

in reply to:  8 comment:10 by mlennert, 6 years ago

Replying to pmav99:

Added a patch that makes it possible to import the GRASS library even from a normal shell session.

After applying the patch and compiling, the code can be tested with:

LD_LIBRARY_PATH=dist.x86_64-pc-linux-gnu/lib PYTHONPATH=dist.x86_64-pc-linux-gnu/etc/python python -c 'import grass.script' && echo 'OK!'

The patch itself does 2 things:

  1. It does not resolve the path to GISBASE by using the ENV variable but by using the relative path from dist.x86_64-pc-linux-gnu/etc/python/grass/__init__.py

I have to admit I don't really understand why all this code is necessary just to be able to do

LD_LIBRARY_PATH=/usr/lib/grass74/lib PYTHONPATH=/usr/lib/grass74/etc/python python -c 'import grass.script' && echo 'OK!'

instead of

GISBASE=/usr/lib/grass76 LD_LIBRARY_PATH=/usr/lib/grass76/lib PYTHONPATH=/usr/lib/grass76/etc/python python -c 'import grass.script' && echo 'OK!'

or even

GISBASE=/usr/lib/grass76 LD_LIBRARY_PATH=$GISBASE/lib PYTHONPATH=$GISBASE/etc/python python -c 'import grass.script' && echo 'OK!'

After all, you have to know the GISBASE path anyhow to be able to set LD_LIBRARY_PATH and PYTHONPATH.

Moritz

comment:11 by pmav99, 6 years ago

Well, the next steps are to get rid of PYTHONPATH and LD_LIBRARY_PATH too. PYTHONPATH should be rather easy via a path configuration file (e.g. grass.pth - source). Not sure how easy it is going to be to get rid of LD_LIBRARY_PATH, but at least in theory, if you figure out the appropriate linker options, it is doable.

But still, you need to start from somewhere.
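For readers unfamiliar with path configuration files, here is a minimal sketch of how a grass.pth file could work. Both directories are temporary stand-ins created for the demonstration. In a real installation the file would live in site-packages and would name the actual $GISBASE/etc/python directory.

```python
import os
import site
import sys
import tempfile

# Stand-ins for the real directories; both are temporary and hypothetical.
site_dir = tempfile.mkdtemp()          # plays the role of site-packages
grass_python_dir = tempfile.mkdtemp()  # plays the role of $GISBASE/etc/python

# A .pth file is plain text: every line naming an existing directory
# is appended to sys.path.
with open(os.path.join(site_dir, "grass.pth"), "w") as handle:
    handle.write(grass_python_dir + "\n")

# The interpreter scans site-packages at startup; addsitedir() does the
# same for an arbitrary directory, which is handy for a demonstration.
site.addsitedir(site_dir)

on_path = grass_python_dir in sys.path
```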

in reply to:  11 comment:12 by mmetz, 6 years ago

Replying to pmav99:

Well, the next steps are to get rid of PYTHONPATH and LD_LIBRARY_PATH too. PYTHONPATH should be rather easy via a path configuration file (e.g. grass.pth - source). Not sure how easy it is going to be to get rid of LD_LIBRARY_PATH, but at least in theory, if you figure out the appropriate linker options, it is doable.

I still don't understand why you want to load the GRASS Python library outside a GRASS session: you can't use it unless a GRASS session with a GISRC file containing GISDBASE, LOCATION_NAME, and MAPSET has been defined, otherwise GRASS executables will fail.

Setting PYTHONPATH and LD_LIBRARY_PATH is not sufficient: PATH must include the path to GRASS executables, otherwise the GRASS Python library will not work because it calls GRASS executables which are expected to be in PATH.

Moreover, you can't just set PYTHONPATH and LD_LIBRARY_PATH, you need to prepend or append the corresponding GRASS paths if PYTHONPATH or LD_LIBRARY_PATH are already set.
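To illustrate the GISRC requirement mentioned in this comment, the sketch below writes a minimal rc file with the three keys. The `KEY: value` line format is assumed here, and the function name `write_gisrc` as well as the example values are hypothetical.

```python
import os
import tempfile


def write_gisrc(path, gisdbase, location_name, mapset):
    """Write a minimal rc file containing the three required keys."""
    with open(path, "w") as handle:
        handle.write("GISDBASE: %s\n" % gisdbase)
        handle.write("LOCATION_NAME: %s\n" % location_name)
        handle.write("MAPSET: %s\n" % mapset)
    return path


# Example values only; none of these directories need to exist for the sketch.
gisrc_path = write_gisrc(
    os.path.join(tempfile.mkdtemp(), "gisrc"),
    "/data/grassdata",
    "demo_location",
    "PERMANENT",
)

with open(gisrc_path) as handle:
    gisrc_lines = handle.read().splitlines()
```

The GISRC environment variable would then have to point at the resulting file before any GRASS executable is called.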

comment:13 by pmav99, 6 years ago

According to the docs, lib/python/script/setup.py contains functions that

can be used in Python scripts to setup a GRASS environment and session without using grassXY

Perhaps I am missing something, but can these functions be actually used in order to create a session?

Because unless I am mistaken, you need to already be inside a session before you can import them and call them. Am I wrong?

comment:14 by neteler, 5 years ago

Milestone: 7.8.3

comment:15 by wenzeslaus, 3 years ago

If PR:1838 works, it will fix the AttributeError issue. Instead of deriving the path from the package path as suggested here, it delays the initialization of translations to the first translation function call. However, the new approach can be combined with or enhanced by the solution suggested here.
