= Gotchas in the GDAL and OGR Python Bindings = This page lists aspects of GDAL's and OGR's Python bindings that may catch Python programmers by surprise. If you find something new, feel free to add it to the list, but consider discussing it on the [http://www.osgeo.org/mailman/listinfo/gdal-dev/ gdal-dev mailing list] first, to make sure you fully understand the issue and that others agree that it is unexpected, "non-Pythonic", or something that would catch many Python programmers by surprise. Be sure to reference email threads, Trac tickets, and other sources of additional information. This list is not the place to report bugs. If you believe something is a bug, please [http://trac.osgeo.org/gdal/newticket open a ticket] and report the problem to gdal-dev. Then consider listing it here if it is something related to Python specifically. Do not list it here if it relates to GDAL or OGR generally, and not the Python bindings specifically. Not all items listed here are bugs. Some of these are just how GDAL and OGR work and will never be "fixed" per se. If you don't like how something works and think it should be changed, feel free to discuss it on gdal-dev and see what can be done. = Gotchas that are by design = These are unexpected behaviors that are not considered by the GDAL and OGR teams to be bugs and are unlikely to be changed due to effort required, the need to maintain backward compatibility, etc. === Python bindings do not raise exceptions unless you explicitly call {{{UseExceptions()}}} === By default, the GDAL and OGR Python bindings do not raise exceptions when errors occur. Instead they return an error value such as {{{None}}} and write an error message to {{{sys.stdout}}}. For example, when you try to open a non-existing dataset with GDAL: {{{ #!python >>> from osgeo import gdal >>> gdal.Open('C:\\foo.img') ERROR 4: `C:\foo.img' does not exist in the file system, and is not recognised as a supported dataset name. >>> }}} In Python, it is traditional to report errors by raising exceptions. You can enable this behavior in GDAL and OGR by calling the {{{UseExceptions()}}} function: {{{ #!python >>> from osgeo import gdal >>> gdal.UseExceptions() # Enable exceptions >>> gdal.Open('C:\\foo.img') Traceback (most recent call last): File "", line 1, in RuntimeError: `C:\foo.img' does not exist in the file system, and is not recognised as a supported dataset name. >>> }}} The GDAL team acknowledges that Python programmers expect exceptions to be enabled by default, but says that exceptions are disabled by default to [http://lists.osgeo.org/pipermail/gdal-dev/2010-September/026031.html preserve backward compatibility]. === Python crashes if you use an object after deleting an object it was related to === Consider this example: {{{ #!python >>> from osgeo import gdal >>> dataset = gdal.Open('C:\\RandomData.img') >>> band = dataset.GetRasterBand(1) >>> print band.Checksum() 31212 }}} In this example, {{{band}}} has a relationship with {{{dataset}}} that requires {{{dataset}}} to remain allocated in order for {{{band}}} to work. If we delete {{{dataset}}} and then try to use {{{band}}}, Python will crash: {{{ #!python >>> from osgeo import gdal >>> dataset = gdal.Open('C:\\RandomData.img') >>> band = dataset.GetRasterBand(1) >>> del dataset # This will cause the Python garbage collector to deallocate dataset >>> band.GetChecksum() # This will now crash Python because the band's dataset is gone < Python crashes > }}} This problem can manifest itself in subtle ways. For example, can occur if you try to instantiate a temporary dataset instance within a single line of code: {{{ #!python >>> from osgeo import gdal >>> print gdal.Open('C:\\RandomData.img').GetRasterBand(1).Checksum() < Python crashes > }}} In this example, the dataset instance was no longer needed after the call to {{{GetRasterBand()}}} so Python deallocated it ''before'' calling {{{Checksum()}}}. This problem occurs because the GDAL and OGR objects are implemented in C++ and the relationships between them are maintained in C++ using pointers. When you delete the dataset instance in Python it causes the C++ object behind it to be deallocated. But the C++ object behind the band instance does not know that this happened, so it contains a pointer to the C++ dataset object that no longer exists. When the band tries to access the non-existing object, the process crashes. The GDAL team knows that this design is not what Python programmers expect. Unfortunately the design is difficult to correct so it is likely to remain for some time. Please consult the GDAL team for more information. The problem is not restricted to the GDAL band and dataset objects. It happens in other areas where objects have relationships with each other. Unfortunately there is no complete list, so you have to watch for it yourself. One other known place involves the OGR {{{GetGeometryRef()}}} function: {{{ #!python >>> feat = lyr.GetNextFeature() >>> geom = feat.GetGeometryRef() # geom contains a reference into the C++ geometry object maintained by the C++ feature object >>> del feat # This deallocates the C++ feature object, and its C++ geometry >>> print(geom.ExportToWkt()) # Crash here. The C++ geometry no longer exists < Python crashes > }}} If you read the GDAL and OGR API documentation carefully, you will see that the functions that end in "Ref" mention that they transfer ownership of objects. This is a clue that the problem could occur. Be careful when using the "Ref" functions. Also watch out for functions that end in "Directly", such as {{{SetGeometryDirectly()}}}. {{{ #!python >>> point = ogr.Geometry(ogr.wkbPoint) >>> feature = ogr.Feature(layer_defn) >>> feature.SetGeometryDirectly(point) # Transfers ownership of the C++ geometry from point to feature >>> del feature # point becomes implicitly invalid, because feature owns the C++ geometry >>> print point.ExportToWkt() # Crash here < Python crashes > }}} The advantage of the "Ref" and "Directly" functions is they provide faster performance because a duplicate object does not need to be created. The disadvantage is that you have to watch out for this problem. The examples here are based on [http://lists.osgeo.org/pipermail/gdal-dev/2010-September/026027.html email from Even Rouault]. === Certain objects contain a {{{Destroy()}}} method, but you should never use it === = Gotchas fixed in GDAL 1.8.0 = These are bugs that were fixed or designs that were changed in GDAL 1.8.0 and later. If you use an older version, watch out for these. === {{{gdal.ErrorReset()}}} must be called after an error occurs, or it will keep happening === In this example, we use OGR to create a geometry object using valid WKT. Then we try some invalid WKT, which is expected to fail, and then try the valid WKT again but it fails. Other OGR functions will also fail. {{{ #!python >>> from osgeo import ogr >>> ogr.UseExceptions() >>> ogr.CreateGeometryFromWkt('POINT(1 2)') # Create a point using valid WKT > >>> ogr.CreateGeometryFromWkt('blah blah blah') # Now try to create one using invalid WKT Traceback (most recent call last): File "", line 1, in File "C:\Python25\lib\site-packages\osgeo\ogr.py", line 2885, in CreateGeometryFromWkt return _ogr.CreateGeometryFromWkt(*args, **kwargs) RuntimeError: OGR Error: Unsupported geometry type >>> ogr.CreateGeometryFromWkt('POINT(1 2)') # Now try to to create one using valid WKT again Traceback (most recent call last): File "", line 1, in File "C:\Python25\lib\site-packages\osgeo\ogr.py", line 2885, in CreateGeometryFromWkt return _ogr.CreateGeometryFromWkt(*args, **kwargs) RuntimeError: OGR Error: Unsupported geometry type }}} The problem is that OGR and GDAL maintain an internal state variable that tracks whether an error occurred during the last operation, but that this variable is not automatically cleared by OGR or GDAL. You must manually clear it by calling the {{{gdal.ErrorReset()}}} function: {{{ #!python >>> ogr.CreateGeometryFromWkt('blah blah blah') Traceback (most recent call last): File "", line 1, in File "C:\Python25\lib\site-packages\osgeo\ogr.py", line 2885, in CreateGeometryFromWkt return _ogr.CreateGeometryFromWkt(*args, **kwargs) RuntimeError: OGR Error: Unsupported geometry type >>> from osgeo import gdal >>> gdal.ErrorReset() >>> ogr.CreateGeometryFromWkt('POINT(1 2)') > }}} This function only appears in the GDAL Python bindings, not the OGR Python bindings. Even if you are only using OGR, you must use GDAL to clear the error. This problem is ackowledged by the GDAL team as a bug. Please see #3077. = Gotchas that result from bugs or behaviors of other software = === Python crashes in GDAL functions when you upgrade or downgrade numpy === Much of GDAL's Python bindings are implemented in C++. Much of the core of numpy is also implemented in C++. The C++ part of GDAL's Python bindings interacts with the C++ part of numpy through numpy's ABI (application binary interface). This requires GDAL's Python bindings to be compiled using numpy header files that define numpy C++ data structures. Those data structures sometimes change between numpy versions. When this happens, the new version of numpy is not be compatible at the binary level with the old version, and the GDAL Python bindings must be recompiled before they will work with the new verison of numpy. And when they are recompiled, they probably won't work with the old verison. If you obtain a precompiled version of GDAL's Python bindings, such as the Windows packages from http://vbkto.dyndns.org/sdk/, be sure you look up what version of numpy was used to compile them, and install that version of numpy on your machine. === Python bindings cannot be used successfully from ArcGIS in-process geoprocessing tools (ArcGIS 9.3 and later) === ArcGIS allows the creation of custom, Python-based geoprocessing tools. Until ArcGIS 10, there was no easy way to read raster data into memory. GDAL provides such a mechanism. Starting with ArcGIS 9.3, geoprocessing tools can either run in the ArcGIS process itself (!ArcCatalog.exe or !ArcMap.exe) or run in a separate python.exe worker process. Unfortunately ArcGIS contains a bug in how it runs in-process tools. Thus, if you use GDAL from an in-process tool, it will run fine the first time but after that it may fail with {{{TypeError}}} exceptions until you restart the ArcGIS process. For example, band.!ReadAsArray() fails with: {{{TypeError: in method 'BandRasterIONumPy', argument 1 of type 'GDALRasterBandShadow *'}}} This is a bug in ArcGIS. Please see #3672 for complete details and advice on workarounds.