[[TOC]]

= Testing framework for GRASS GIS =

|| Title: || '''Testing framework for GRASS GIS''' ||
|| Student: || Vaclav Petras, [http://gis.ncsu.edu/osgeorel/ North Carolina State University, Open Source Geospatial Research and Education Laboratory] ||
|| Organization: || [http://www.osgeo.org OSGeo - Open Source Geospatial Foundation] ||
|| Mentors: || [http://grasswiki.osgeo.org/wiki/User:Huhabla Sören Gebbert], [http://www4.ncsu.edu/~hmitaso/ Helena Mitasova] ||
|| GSoC link: || [https://www.google-melange.com/gsoc/project/details/google/gsoc2014/wenzeslaus/5741031244955648 abstract] ||

== Abstract ==

GRASS GIS is one of the core projects in the OSGeo Foundation. GRASS provides a wide range of geospatial analyses including raster and vector analyses and image processing. However, there is no system for regular testing of its algorithms. To ensure software quality and reliability, a standardized way of testing needs to be introduced. This project will implement a testing framework which can be used for writing and running tests of GRASS GIS modules, C/C++ libraries and Python libraries.

== Introduction ==

GRASS GIS is one of the core projects in the OSGeo Foundation and is used by several other free and open source projects to perform geoprocessing tasks. Its software quality and reliability are therefore crucial, and proper testing is needed. So far, testing has been done manually by both developers and users. This is questionable in terms of test coverage and test frequency, and it is also inconvenient.

This project will implement a testing framework which can be used for writing and running tests for GRASS GIS. This will benefit not only the quality of GRASS GIS but also its everyday development, because it will help to identify problems with new code at the time the change is made.
== Background ==

There have already been several attempts to establish a testing infrastructure for GRASS GIS, namely the [http://lists.osgeo.org/listinfo/grass-qa quality assessment and monitoring mailing list], which has been inactive for several years, an older test suite which was never integrated into GRASS GIS itself, and most recently a [http://grasswiki.osgeo.org/wiki/Test_Suite test suite proposal] which tried to interpret shell scripts as test cases. Also, [source:grass/trunk/raster/r.category/test_rcategory_doctest.txt?rev=58814 experience] with using Python [https://docs.python.org/2/library/doctest.html doctest] in different circumstances shows that this solution is not applicable everywhere.

These previous experiences give us a clear idea of what does not work (e.g. tests outside the main source code), what is overcomplicated (e.g. reimplementing shell) and what is oversimplified (e.g. shell scripts without clear set up and tear down steps). They point toward an implementation which is efficient (general but simple enough), integrated in the GRASS source code, and acceptable to the GRASS development team. The long preceding discussions also showed what the testing framework must include and what should be left out.

== The idea ==

The purpose of this project is to develop a general mechanism which is applicable for testing GRASS modules, libraries or workflows with different data sets. Tests will be part of the main GRASS source code, cross-platform, and as easy to write and run as possible. The testing framework will enable the use of different testing data sets because different test cases might need special data. The testing framework will be implemented in Python and based on testing tools included in the standard Python distribution (most notably [https://docs.python.org/2/library/unittest.html unittest]), which will not bring a new dependency and will also avoid writing everything from scratch.
The usage of the Makefile system will be limited to triggering the test or tests with the right parameters for a particular location in the source tree; everything else will be implemented in Python to ensure maximum re-usability, maintainability, and availability across platforms.

This project will focus on building infrastructure to test modules, C/C++ libraries (using the ctypes interface), and Python libraries. It is expected that testing of Python GUI code will be limited to pure Python parts. The focus will be on the majority of GRASS modules and functionality, while special cases such as rendering, creation of locations, external data sources and databases, and downloading of extensions from GRASS Addons will be left for future work.

Moreover, this project will not cover tests of the graphical user interface, server side automatic testing (e.g. commit hooks), using [source:grass/trunk/testsuite/raster/rmapcalc_test.sh?rev=58814 testing shell scripts] or [source:grass/trunk/lib/raster3d/test?rev=58025 C/C++ programs], or testing of internal functions in C/C++ code (e.g. static functions in libraries and functions in modules). Creation of HTML, XML, or other rich outputs will not be completely solved, but the implementation will consider the need for a presentation of test results.

Finally, writing the tests for particular parts will not be part of this project. However, several sample tests for different parts of the code, especially modules, will be written to test the testing framework.
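The ctypes-based testing of C libraries mentioned above could look roughly like the following sketch. It is only an illustration under stated assumptions: `strlen` from the C standard library stands in for a GRASS library function, loading the library with `ctypes.CDLL(None)` assumes a POSIX system, and the class name `StrlenTestCase` is hypothetical.

```python
import ctypes
import unittest

# Stand-in for a GRASS C library (assumption): on POSIX systems, CDLL(None)
# gives access to symbols already loaded into the process, which includes
# the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


class StrlenTestCase(unittest.TestCase):
    """Hypothetical test case calling a C function through ctypes."""

    def test_regular_string(self):
        # The C function is called like a normal Python function.
        self.assertEqual(libc.strlen(b"elevation"), 9)

    def test_empty_string(self):
        self.assertEqual(libc.strlen(b""), 0)
```

Declaring `argtypes` and `restype` explicitly keeps the test from silently passing wrongly converted values to the C function, which is likely to matter even more for library functions with pointer or floating point arguments.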
== Project plan ==

|| date || proposed task ||
|| 2014-05-19 - 2014-05-23 (week 01) || Designing a basic template for the test case and interface of test suite class(es) ||
|| 2014-05-26 - 2014-05-30 (week 02) || Basic implementation ||
|| 2014-06-02 - 2014-06-06 (week 03) || Dealing with evaluation and comparison of textual and numerical outputs ||
|| 2014-06-09 - 2014-06-13 (week 04) || Dealing with evaluation and comparison of map outputs and other outputs ||
|| 2014-06-16 - 2014-06-20 (week 05) || Re-writing some existing tests using testing framework ||
|| 2014-06-23 - 2014-06-27 (week 06) || Testing of what was written so far and evaluating current design and implementation ||
|| June 23 || Mentors and students can begin submitting mid-term evaluations ||
|| June 27 || Mid-term evaluations deadline ||
|| 2014-06-30 - 2014-07-04 (week 07) || Integration with GRASS source code, documentation and build system ||
|| 2014-07-07 - 2014-07-11 (week 08) || Implementation of location switching ||
|| 2014-07-14 - 2014-07-18 (week 09) || Dealing with evaluation and comparison of so far unresolved outputs ||
|| 2014-07-21 - 2014-07-25 (week 10) || Implementing the basic test results reports ||
|| 2014-07-28 - 2014-08-01 (week 11) || Re-writing some other existing tests using testing framework ||
|| 2014-08-04 - 2014-08-08 (week 12) || Writing documentation of framework internals and guidelines how to write tests ||
|| 2014-08-11 - 2014-08-15 (week 13) || Polish the code and documentation ||
|| August 11 || Suggested 'pencils down' date. Take a week to scrub code, write tests, improve documentation, etc. ||
|| 2014-08-18 - 2014-08-22 (week 14) || Submit evaluation and code to Google ||
|| August 18 || Firm 'pencils down' date. Mentors, students and organization administrators can begin submitting final evaluations to Google. ||
|| August 22 || Final evaluation deadline ||
|| August 22 || Students can begin submitting required code samples to Google ||

== Design of testing API ==

{{{
#!python
import unittest
import grass.pygrass.modules as gmodules

# alternatively, these can be private to module with setter and getter
# or it can be in a class
USE_VALGRIND = False


class GrassTestCase(unittest.TestCase):
    """Base class for GRASS test cases."""

    def run_module(self, module):
        """Method to run the module. It will probably use some class or instance variables"""
        # get command from pygrass module
        command = module.make_cmd()

        # run command using valgrind if desired and module is not python script
        # see also valgrind notes at the end of this section
        # (is_not_python_script is a helper which is not defined in this sketch)
        if is_not_python_script(command[0]) and USE_VALGRIND:
            command = ['valgrind', '--tool=...', '--xml=...', '--xml-file=...'] + command

        # run command
        # store valgrind output (memcheck has XML output to a file)
        # store module return code, stdout and stderr, how to distinguish from valgrind?
        # return code, stdout and stderr could be returned in tuple

    def assertRasterMap(self, actual, reference, msg=None):
        """Fail if the raster map `actual` does not match `reference`."""
        # e.g. g.compare.md5 from addons
        # uses msg if provided, generates its own if not,
        # or both if self.longMessage is True (unittest.TestCase.longMessage)
        # precision should be considered too (for FCELL and DCELL but perhaps also for CELL)
        # pseudocode for the comparison itself:
        #     if check sums not equal:
        #         self.fail(...)  # unittest.TestCase.fail
}}}

{{{
#!python
class SomeModuleTestCase(GrassTestCase):
    """Example of test case for a module."""

    def test_flag_g(self):
        """Test to validate the output of r.info using flag "g" """
        # Configure an r.info test
        module = gmodules.Module("r.info", map="test", flags="g", run_=False)

        self.run_module(module=module)
        # it is not clear where to store stdout and stderr
        self.assertStdout(actual=module.stdout, reference="r_info_g.ref")

    def test_something_complicated(self):
        """Test something which has several outputs"""
        # Configure an r.complex test
        module = gmodules.Module("r.complex", rast="test", vect="test", flags="p", run_=False)

        (ret, stdout, stderr) = self.run_module(module=module)
        self.assertEqual(ret, 0, "Module should have succeeded but return code is not 0")
        self.assertStdout(actual=stdout, reference="r_complex_stdout.ref")
        self.assertRasterMap(actual=module.rast, reference="r_complex_rast.ref")
        self.assertVectorMap(actual=module.vect, reference="r_complex_vect.ref")
}}}

Compared to the suggestion in ticket:2105#comment:4, this design does not solve everything in a `test_module` (`run_module`) function. Instead, it uses `self.assert*` methods similarly to `unittest.TestCase`, which (syntactically) allows checking more than one thing.

Modules (or any tests?) can run with `valgrind` (probably `--tool=memcheck`). This could be done on the level of testing classes but the better option is to integrate this functionality (optional running with `valgrind`). An environment variable (`GRASS_PYGRASS_VALGRIND`) or an additional option `valgrind_=True` (similar to overwrite) would invoke the module with `valgrind` (this works for both binaries and scripts). Additional options can be passed to `valgrind` using `valgrind`'s environment variable `$VALGRIND_OPTS`. The output would be saved in a file so that it does not interfere with module output.
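The optional `valgrind` invocation described above might be sketched as follows. This is a minimal sketch, not the implementation: the function name `wrap_with_valgrind`, the default log file name, and the rule that any non-empty value of `GRASS_PYGRASS_VALGRIND` enables the wrapping are all assumptions made for illustration. The `--tool=memcheck` and `--log-file` options are standard `valgrind` options.

```python
import os


def wrap_with_valgrind(command, env=None, log_file="valgrind.log"):
    """Return `command` prefixed with valgrind when requested by environment.

    `command` is a list such as the one returned by a module's make_cmd().
    Any non-empty value of GRASS_PYGRASS_VALGRIND enables valgrind
    (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    if not env.get("GRASS_PYGRASS_VALGRIND"):
        # valgrind not requested: return an unchanged copy of the command
        return list(command)
    # Write valgrind messages to a file so that they do not mix
    # with the stdout and stderr of the module itself.
    return ["valgrind", "--tool=memcheck", "--log-file=" + log_file] + list(command)
```

Because `valgrind` reads `$VALGRIND_OPTS` by itself, additional options do not have to be handled in the wrapper. With the variable set, `wrap_with_valgrind(["r.info", "map=test", "-g"])` would return the original command preceded by the three `valgrind` items.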
We may also want to use (runtime checking) tools other than `valgrind`, for example clang/LLVM sanitizers (as [https://docs.python.org/devguide/clang.html Python does]). However, it is unclear how to handle more than one tool, and it is also unclear how to store the results for any of them (including `valgrind`), because one test can have multiple module calls (or none), module calls can be indirect (a function in a Python library which calls a module, or a module calling another module), and there is no standard way in `unittest` to pass additional test details.

== Data types to be checked ==

We must deal especially with GRASS-specific files such as raster maps. We consider comparison of simple things such as strings and individual numbers to be already implemented by [https://docs.python.org/2/library/unittest.html#test-cases unittest].

 * raster map
   * composite? reclassified map?
   * color table included
 * vector map
 * 3D raster map
 * color table
 * SQL table
 * file

Most of the outputs can be checked with different numerical precision.

Resources:
 * http://grasswiki.osgeo.org/wiki/Test_Suite#Modules

== Naming conventions ==

The [https://docs.python.org/2/library/unittest.html#unittest.TestLoader.discover unittest.TestLoader.discover] function requires that module names ''are importable (i.e. are valid Python identifiers)''. Consequently, names of files with tests should not contain dots (except for the `.py` suffix). This means that a file cannot be named directly after a module such as `r.info`.

Methods with tests must start with `test` (by convention `test_`) to be recognized by the [https://docs.python.org/2/library/unittest.html#organizing-test-code unittest] framework.
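The file naming rule can be checked mechanically. The following is a small sketch of such a check; the function name `is_discoverable_test_file` and the regular expression are illustrative assumptions and are not taken from `unittest` itself.

```python
import keyword
import re

# A module name must be a valid Python identifier: letters, digits and
# underscores only, and it must not start with a digit.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def is_discoverable_test_file(filename):
    """Return True if `filename` (without directory) is importable as a module."""
    if not filename.endswith(".py"):
        return False
    module_name = filename[:-len(".py")]
    if keyword.iskeyword(module_name):
        # reserved words such as "class" cannot be imported as modules
        return False
    return bool(_IDENTIFIER.match(module_name))
```

For example, a file named `test_r_info.py` passes this check while `test_r.info.py` does not, because the dot makes `test_r.info` an invalid identifier. Note that test discovery additionally filters files by a pattern (by default `test*.py`), which this sketch does not check.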