Opened 9 years ago
Closed 6 years ago
#2873 closed enhancement (fixed)
Simplify usage of GRASS in Python from outside
Reported by: | wenzeslaus | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 7.4.2 |
Component: | Python | Version: | svn-trunk |
Keywords: | startup, installation, scripts, interpreter, windows installer, pygrass, temporal, bootstrap, boilerplate | Cc: | |
CPU: | Unspecified | Platform: | All |
Description
To use GRASS GIS functionality in Python from outside of a GRASS session, i.e. without starting GRASS GIS explicitly and running a script (or an actual module) in the session, one needs to include approximately 50 lines as described in the grass.script.setup manual (or in the lengthy related wiki page).
Ideally, one would just do import and then one or two lines of initialization, for example:
    import os
    import grass.script as gscript
    rcfile = gscript.init_data("~/grassdata", "nc_spm", "user1")
    do_what_ever_with_grass()
    os.remove(rcfile)
Suggestion
The attached code is a prototype implementation of the Python part which would allow something like this. The code can only work if the following variables are set:
GRASS_EXECUTABLE does not have to be set if grass is on the path or, in theory, if it is on some standard path (e.g. C:\Program Files (x86)\GRASS GIS 7.0.0\grass70.bat on MS Windows). However, the dynamic library path must always be set beforehand (as described in #2424) for ctypes to work (both PyGRASS and temporal depend on ctypes). The path to Python packages must be set beforehand as well if you want to use the initialization functions from the package.
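As a rough illustration of the settings discussed above, the following sketch assembles an environment for a child process. The helper name `grass_environment` and the example paths are hypothetical; the `etc/python` and `lib` subdirectories are assumed to match the layout of the GRASS installation directory, and macOS is not handled separately.

```python
import os
import sys


def grass_environment(gisbase, executable=None, base_env=None, platform=sys.platform):
    """Return a copy of the environment prepared for a GRASS-using child process.

    gisbase: GRASS installation directory (assumed layout: etc/python and lib).
    executable: optional path stored as GRASS_EXECUTABLE.
    """
    env = dict(os.environ if base_env is None else base_env)

    def prepend(name, value):
        # Put the new entry first, keeping whatever was there before.
        current = env.get(name)
        env[name] = value + os.pathsep + current if current else value

    # Python packages (grass.script, pygrass, temporal).
    prepend("PYTHONPATH", os.path.join(gisbase, "etc", "python"))

    # Dynamic libraries needed by ctypes: PATH on MS Windows,
    # LD_LIBRARY_PATH elsewhere (an assumption of this sketch).
    lib_var = "PATH" if platform.startswith("win") else "LD_LIBRARY_PATH"
    prepend(lib_var, os.path.join(gisbase, "lib"))

    if executable:
        env["GRASS_EXECUTABLE"] = executable
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run`, which sidesteps the problem that the dynamic library path has to be in place before the Python process starts.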
Making it simple
The three lines above might be good enough on Linux where you just dump them to the command line or .bashrc, but for MS Windows users it is too complicated. The QGIS project also considers this too much work (PyQGIS bootstrap is complicated).
GRASS Python packages could go to the system packages directory, so that we avoid the need for setting PYTHONPATH. This might work well on MS Windows when usage of system Python is implemented as described in #2333.
On Linux, LD_LIBRARY_PATH can be avoided if the libraries are installed into the system path. On MS Windows, putting more things on PATH is standard procedure from what I have seen. On Mac OS X, DYLD_LIBRARY_PATH can't be used anyway since El Capitan.
The GRASS GIS executable should be on the path on all platforms in the same way as it is on the path in Linux. This is maybe not standard on MS Windows but in the end this is what users want (they want GRASS to be available right away).
Challenges
When putting dynamic libraries and Python packages directly into system paths, installing more than one GRASS version becomes more complicated. However, that's OK because only advanced users would have more than one version, so the hard work of making it work (perhaps just not using the default settings in the installer) will be on them. Beginners will likely have just one version. The exception might be MS Windows, where it is possible that a beginner has a standalone GRASS GIS, the one from QGIS and one from OSGeo4W.
GRASS Python packages are not prepared to be imported when GISBASE is not set and may require even more. We would need to change the code to not require anything from GRASS session at import time. So far, I needed to add lazy initialization for the translate function (underscore) to be able to import grass.script.core (patch attached).
There is already some duplication between grass.script.setup and the grass.py executable. To create a full session (e.g. Mapset locking) we would need even more duplication. We could move some parts from grass.py to grass.script.setup if we are sure that we can import the right grass.script.setup during the initialization phase (we already rely on it when creating a Location).
Attachments (5)
Change History (21)
by , 9 years ago
Attachment: | lazy_gettext.patch added |
---|
by , 9 years ago
Attachment: | run_grass.py added |
---|
First prototype of API for simplified GRASS startup of standalone scripts
follow-up: 2 comment:1 by , 9 years ago
Replying to wenzeslaus:
To use GRASS GIS functionality in Python from outside of a GRASS session, i.e. without starting GRASS GIS explicitly and running a script (or an actual module) in the session, one needs to include approximately 50 lines as described in the grass.script.setup manual (or in the lengthy related wiki page).
Ideally, one would just do import and then one or two lines of initialization, for example:
    import os
    import grass.script as gscript
    rcfile = gscript.init_data("~/grassdata", "nc_spm", "user1")
    do_what_ever_with_grass()
    os.remove(rcfile)
Suggestion
The attached code is a prototype implementation of the Python part which would allow something like this. The code can only work if the following variables are set:
GRASS_EXECUTABLE does not have to be set if grass is on the path or, in theory, if it is on some standard path (e.g. C:\Program Files (x86)\GRASS GIS 7.0.0\grass70.bat on MS Windows). However, the dynamic library path must always be set beforehand (as described in #2424) for ctypes to work (both PyGRASS and temporal depend on ctypes). The path to Python packages must be set beforehand as well if you want to use the initialization functions from the package.
Making it simple
The three lines above might be good enough on Linux where you just dump them to the command line or .bashrc, but for MS Windows users it is too complicated. The QGIS project also considers this too much work (PyQGIS bootstrap is complicated).
GRASS Python packages could go to the system packages directory, so that we avoid the need for setting PYTHONPATH. This might work well on MS Windows when usage of system Python is implemented as described in #2333.
There is no system Python on Windows. Users always have to install Python manually system-wide, possibly interfering with other Python installations installed by other software.
On Linux, LD_LIBRARY_PATH can be avoided if the libraries are installed into the system path. On MS Windows, putting more things on PATH is standard procedure from what I have seen. [...] On Mac OS X, DYLD_LIBRARY_PATH can't be used anyway since El Capitan.
The GRASS GIS executable should be on the path on all platforms in the same way as it is on the path in Linux. This is maybe not standard on MS Windows but in the end this is what users want (they want GRASS to be available right away).
IMHO it is not good practice to put everything in %PATH% on Windows. Poisoning %PATH% should be avoided.
comment:2 by , 9 years ago
Replying to hellik:
Replying to wenzeslaus:
GRASS Python packages could go to the system packages directory, so that we avoid the need for setting PYTHONPATH. This might work well on MS Windows when usage of system Python is implemented as described in #2333.
There is no system Python on Windows. Users always have to install Python manually system-wide, possibly interfering with other Python installations installed by other software.
By system Python I mostly mean what #2333 is talking about. Possible interference is an inherent issue of the Windows operating system. I'm OK with including multiple options in the installer, giving the choice to the user (with the last option being "Don't know what to choose? Install Ubuntu and let the package managers solve it for you." ;-).
On MS Windows, putting more things on PATH is standard procedure from what I have seen... The GRASS GIS executable should be on the path on all platforms in the same way as it is on the path in Linux. This is maybe not standard on MS Windows but in the end this is what users want (they want GRASS to be available right away).
IMHO it is not good practice to put everything in %PATH% on Windows. Poisoning %PATH% should be avoided.
I'm not sure if we can avoid it. Is there another way to set the path to dynamic libraries for a process (so that Python scripts using GRASS ctypes work)?
comment:3 by , 9 years ago
See also Glynn's comment in ticket 580 speaking about "fixing the installation process to make GRASS 'sessions' an optional feature".
by , 9 years ago
Attachment: | lib_python_script_init_data.patch added |
---|
Second prototype of API for simplified GRASS startup of standalone scripts as a patch for lib/python/script
follow-up: 5 comment:4 by , 9 years ago
I uploaded a second prototype of the API as a patch. Before executing the code outside of GRASS GIS, the following is needed:
The minimal script is:
    import grass.script as gscript
    import grass.script.setup as gsetup
    session = gsetup.init_data("~/grassdata", "nc_spm", "user1")
    # code goes here, e.g. gscript.run_command(...)
    session.close()
The names and the overall API are not final, nor is the implementation, and it might be nicer to have one import instead of two. However, I think the basic structure is right. Please comment.
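To make the proposed structure concrete, here is a minimal sketch of what an `init_data` returning a closable session object could look like. It is not the code from the attached patch: the `Session` class is hypothetical, and the sketch only writes a temporary rc file with the usual GISDBASE, LOCATION_NAME and MAPSET keys and points GISRC at it, without locking the Mapset or checking that it exists.

```python
import os
import tempfile


class Session:
    """Hypothetical handle for a session created by init_data()."""

    def __init__(self, rcfile, previous_gisrc):
        self.rcfile = rcfile
        self._previous_gisrc = previous_gisrc

    def close(self):
        """Remove the rc file and restore the previous GISRC value."""
        if os.path.exists(self.rcfile):
            os.remove(self.rcfile)
        if self._previous_gisrc is None:
            os.environ.pop("GISRC", None)
        else:
            os.environ["GISRC"] = self._previous_gisrc


def init_data(gisdbase, location, mapset):
    """Write a temporary rc file for the given Mapset and point GISRC at it."""
    fd, rcfile = tempfile.mkstemp(prefix="grass_rc_")
    with os.fdopen(fd, "w") as rc:
        rc.write("GISDBASE: %s\n" % os.path.expanduser(gisdbase))
        rc.write("LOCATION_NAME: %s\n" % location)
        rc.write("MAPSET: %s\n" % mapset)
    previous = os.environ.get("GISRC")
    os.environ["GISRC"] = rcfile
    return Session(rcfile, previous)
```

Returning an object rather than the bare rc file name leaves room for later additions such as Mapset locking without changing the calling code.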
comment:5 by , 9 years ago
Replying to wenzeslaus:
...it might be nicer to have one import instead two.
On the other hand, separating the stuff into its own module might have some advantages. The API could look like:
    import grass.script as gscript
    import grass.session as gsession
    session = gsession.create_session("~/grassdata", "nc_spm", "user1")
    gscript.run_command(...)
    session.close()
And in future perhaps allow:
    session_a = gsession.create_session("~/grassdata", "nc_spm", "user1")
    session_b = gsession.create_session("~/grassdata", "nc_spf", "PERMANENT")
    session_a.run_command(...)
    session_b.run_command(...)
    session_b.close()
    session_a.close()
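Running two sessions side by side implies that a session cannot simply overwrite the global GISRC. One possible approach, sketched here under that assumption, is for each session to keep its own copy of the environment and pass it to the commands it runs. The class name `IsolatedSession` and its `run` method are hypothetical and not part of any of the prototypes.

```python
import os
import subprocess
import tempfile


class IsolatedSession:
    """Hypothetical session that keeps its own environment copy."""

    def __init__(self, gisdbase, location, mapset):
        fd, self.rcfile = tempfile.mkstemp(prefix="grass_rc_")
        with os.fdopen(fd, "w") as rc:
            rc.write("GISDBASE: %s\n" % os.path.expanduser(gisdbase))
            rc.write("LOCATION_NAME: %s\n" % location)
            rc.write("MAPSET: %s\n" % mapset)
        # Copy of the process environment with a session-specific GISRC;
        # os.environ itself is left untouched.
        self.env = dict(os.environ, GISRC=self.rcfile)

    def run(self, args):
        """Run a command line (list of strings) in this session's environment."""
        return subprocess.run(args, env=self.env, check=True,
                              stdout=subprocess.PIPE, universal_newlines=True)

    def close(self):
        """Remove the session's rc file."""
        os.remove(self.rcfile)
```

With this design, `session_a.run([...])` and `session_b.run([...])` see different rc files even when interleaved, at the cost of every call having to go through the session object.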
Now uploading a third prototype which is in a separate package; you can do something like:
    session = gsession.init_data(location="test_xy", mapset="test1", geostring='XY')
    # ...
    session.close()
Code is still messy and naming is not final but should work. Some documentation provided. Please test.
by , 9 years ago
Attachment: | grass_session_package.patch added |
---|
Third prototype with a separate package, location creation functionality and some documentation (patch for lib/python directory)
follow-up: 8 comment:6 by , 9 years ago
Replying to wenzeslaus:
GRASS_EXECUTABLE does not have to be set if grass is on the path or, in theory, if it is on some standard path
IMHO, this "solution" is yet more duct tape on top of the existing heap. Any real solution would start by simply deleting the GRASS startup script then figuring out what needs to be done to make everything still work without it.
It should not be necessary to "start" GRASS.
On Unix, GRASS modules and libraries should be installed in system directories. Python packages should go in Python's site-packages directory. Environment variables should be set in /etc/profile (or whatever mechanism the distribution uses, e.g. /etc/profile.d).
On Windows, configuration settings should probably be stored in the registry.
GISRC should have a system-wide default setting such as $(HOME)/.grass/rc. Users who never need more than one session at a time can just use that file always, changing the database, location or mapset with g.mapset.
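A sketch of how such a default could be resolved on the Python side: use GISRC when it is set, otherwise fall back to a per-user file. The function names are hypothetical, and `~/.grass/rc` is simply the location suggested in this comment, not a path GRASS is known to use.

```python
import os


def default_gisrc(env=None):
    """Return the rc file path: GISRC if set, else the assumed per-user default."""
    env = os.environ if env is None else env
    return env.get("GISRC") or os.path.join(os.path.expanduser("~"), ".grass", "rc")


def read_gisrc(path):
    """Parse 'KEY: value' lines of an rc file into a dictionary."""
    settings = {}
    with open(path) as rc:
        for line in rc:
            # Split on the first colon only, so values such as Windows
            # paths with a drive letter stay intact.
            key, sep, value = line.partition(":")
            if sep:
                settings[key.strip()] = value.strip()
    return settings
```

A script could then call `read_gisrc(default_gisrc())` to find out which database, Location and Mapset are currently selected.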
comment:7 by , 9 years ago
I think that the scope of this proposal is very wide. IMHO importing dynamic libraries in a cross platform way and providing an official API are different issues.
With respect to providing an official API for working with GRASS Locations / Mapsets, I believe that the proper Python idiom is to use a context manager. In other words, the user should not have to do anything when exiting, e.g.:
    from grass.some_namespace import GrassSession
    with GrassSession("/path/to/gisdb/location/mapset"):
        # work with the specified Location/Mapset
        ...
This could be expanded to creating temporary Locations / Mapsets:
    from grass.some_namespace import GrassSession
    # not cleaning up might make sense when you debug a script.
    with GrassSession.temporary(cleanup=False):
        # create a temporary location/mapset and optionally clean up when exiting the context
        ...
or even creating new Locations / Mapsets:
    from grass.some_namespace import GrassSession
    with GrassSession.create_from_epsg(mapset_path, epsg):
        # create a new location/mapset
        ...
    with GrassSession.create_from_geofile(mapset_path, geofile_path):
        # create a new location/mapset
        ...
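The context manager idiom could be sketched as below. This is only an illustration of the cleanup-on-exit behaviour, assuming that a session boils down to a temporary rc file referenced by GISRC; the function name `grass_session` is hypothetical and no Location or Mapset is created or validated.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def grass_session(mapset_path):
    """Point GISRC at a temporary rc file for mapset_path within the block."""
    # Split /path/to/gisdb/location/mapset into its three components.
    location_path, mapset = os.path.split(os.path.normpath(mapset_path))
    gisdbase, location = os.path.split(location_path)

    fd, rcfile = tempfile.mkstemp(prefix="grass_rc_")
    with os.fdopen(fd, "w") as rc:
        rc.write("GISDBASE: %s\n" % gisdbase)
        rc.write("LOCATION_NAME: %s\n" % location)
        rc.write("MAPSET: %s\n" % mapset)

    previous = os.environ.get("GISRC")
    os.environ["GISRC"] = rcfile
    try:
        yield rcfile
    finally:
        # Runs on normal exit and on exceptions alike.
        if previous is None:
            os.environ.pop("GISRC", None)
        else:
            os.environ["GISRC"] = previous
        os.remove(rcfile)
```

Used as `with grass_session("/path/to/gisdb/location/mapset"):`, the caller never has to call a `close()` method, which is the point made above.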
comment:8 by , 9 years ago
Replying to glynn:
Replying to wenzeslaus:
GRASS_EXECUTABLE does not have to be set if grass is on the path or, in theory, if it is on some standard path
IMHO, this "solution" is yet more duct tape on top of the existing heap.
I'll try to prepare a patch without this workaround which will expect the grass executable, Python packages and libraries to be already on the path. The API is the easy part here I guess.
Any real solution would start by simply deleting the GRASS startup script then figuring out what needs to be done to make everything still work without it.
It should not be necessary to "start" GRASS.
Back when you suggested that in #580 it seemed strange to me, but now I think it would be much better than the current situation.
On Unix, GRASS modules and libraries should be installed in system directories. Python packages should go in Python's site-packages directory. Environment variables should be set in /etc/profile (or whatever mechanism the distribution uses, e.g. /etc/profile.d).
I can see this working for dynamic libraries and Python packages and I think this would be good enough for now, but how would this work for modules? For example, I have 141 PCL tools installed (pcl_*) but we have >500 modules plus addons which actually have to be on a separate site. It would be good to get opinions from some packagers.
Anyway, dynamic libraries and Python packages in system paths are a place to start. Can the compile/install process be set to this now?
GISRC should have a system-wide default setting such as $(HOME)/.grass/rc. Users who never need more than one session at a time can just use that file always, changing the database, location or mapset with g.mapset.
For me this is different because it is related to the data being used and I'm not so convinced about it in comparison to the need for a ready-to-use runtime environment. I often work in more than one Mapset and I actually use Mapset locks to see where I already have open sessions. However, I can see that for many users a permanent connection to a given Mapset plus additional sessions (GISRCs) upon request (starting the GRASS application or API in some special way) might work, although particular details matter a lot here.
comment:10 by , 9 years ago
Support for scripting GRASS GIS in Ruby (github.com/jgoizueta/grassgis) is something to take some inspiration from when creating something like a Session class:
    GrassGis.session configuration do
      g.list 'vect'
      puts output # will print list of vector maps
    end
comment:11 by , 9 years ago
Milestone: | 7.2.0 → 7.3.0 |
---|
comment:14 by , 7 years ago
Milestone: | 7.4.1 → 7.4.2 |
---|
comment:15 by , 6 years ago
Meanwhile pip install grass-session is available (source: https://github.com/zarch/grass-session), can the ticket be closed?
comment:16 by , 6 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Now we have pip install grass-session for Python (and grass ... --exec in command line). New suggestions and requests would need to specify the relation to these and their evaluation. A new ticket would be more appropriate. Closing as fixed since there is pip install grass-session.
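For the command line route, a script can wrap the `grass ... --exec` call with the standard library. The sketch below assumes the executable is called `grass` (versioned names such as grass74 are common) and that the Mapset path is passed before `--exec`; the helper names and the example path are hypothetical.

```python
import shutil
import subprocess


def exec_command(mapset_path, module, *args, grass="grass"):
    """Build the command line for running one module via `grass ... --exec`."""
    return [grass, mapset_path, "--exec", module] + list(args)


def run_in_mapset(mapset_path, module, *args, grass="grass"):
    """Run the module if the grass executable is on PATH; return its stdout."""
    if shutil.which(grass) is None:
        raise FileNotFoundError("%s executable not found on PATH" % grass)
    result = subprocess.run(
        exec_command(mapset_path, module, *args, grass=grass),
        check=True, stdout=subprocess.PIPE, universal_newlines=True,
    )
    return result.stdout
```

For example, `run_in_mapset("/data/nc_spm/user1", "g.list", "type=vector")` would list vector maps in that Mapset, provided the executable and the data exist.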
Trac management: bash processor for Trac which worked at one point seems to be missing now (Error: Failed to load processor bash)
Patch for the lazy initialization of the underscore function (does not require GISBASE at import time)