This page intends to discuss some security issues that users or developers may face while using GDAL. It can also serve as a check-list of items to take care of, for developers coding new drivers.
DISCLAIMER: It is neither a comprehensive nor a vetted cookbook for being "safe" in all circumstances, and should be considered a work in progress that benefits from the feedback of people deploying GDAL.
Classes of potential vulnerabilities
- Arbitrary code execution
- Theft, tampering or destruction of data
- Denial of service: abnormal termination of a process, or high consumption of CPU, memory or I/O resources
- Unwanted Web access
Root causes of vulnerabilities
- Software bugs, in GDAL code itself or in third-party dependencies, in code that processes user data while assuming that it conforms to the file format specification, without properly verifying it. Classic bugs are:
- stack or heap buffer overflows
- excessive memory allocation
- infinite, or very long, loops in code
- Software functionalities themselves: see "Known issues in drivers" below
Situations at risk
This essentially comes down to situations where one processes untrusted data that has been crafted to trigger a vulnerability:
- web services accepting input files provided by clients,
- desktop use of GDAL where a hostile party manages to convince the victim user to process hostile data.
Mitigation
- Upgrade to the latest versions of GDAL and its dependencies, which might contain bug fixes for some vulnerabilities.
- Build GDAL with the hardening options of compilers, e.g. with -D_FORTIFY_SOURCE=2 (some Linux distributions turn it on by default), to minimize the effect of buffer overflows.
- Restrict access to only the parts of the file system that are needed: the chroot mechanism, or other sandboxing solutions, might be a technical solution for this.
- Compile only the subset of drivers really needed. ./configure or nmake.opt options can be used for that. Otherwise, for full control, edit the frmts/gdalallregister.cpp and ogr/ogrsf_frmts/generic/ogrregisterall.cpp and disable the registrations of unneeded drivers.
- Disable at run-time unneeded drivers by setting the GDAL_SKIP and OGR_SKIP configuration options / environment variables
- Disable curl at compilation time to prevent risk of unwanted web access.
- Process untrusted data (the act of opening with the GDAL/OGR API or utilities is a form of processing) in a dedicated user account with no access to other local sensitive data (data files, passwords, encryption/signing keys, etc.).
- For automatic services, place restrictions on the CPU time and memory consumption allowed to the process using GDAL. Check the options of your HTTP servers.
- Check the raster dimensions and number of bands just after opening a GDAL dataset and before processing it. Several GDAL drivers use the GDALCheckDatasetDimensions() and GDALCheckBandCount() functions (gcore/gdal_misc.cpp) to do early sanity checks, but user checks can also be useful. For example, a dataset with huge raster dimensions but a very small file size might be suspicious (but not always: consider a VRT file, or highly compressed data).
- Using the GDAL API Proxy mechanism can prevent crashes in drivers from propagating to the code using GDAL. Note however that it will not protect against other issues (arbitrary code execution, etc.).
- Do not allow arbitrary arguments to be passed to command line utilities. In particular "--config GDAL_DRIVER_PATH xxx" or "--config OGR_DRIVER_PATH xxx" could be used to trigger arbitrary code execution.
- (More a philosophical consideration) Avoid using closed-source dependency libraries that cannot be audited for vulnerabilities.
- The seccomp_launcher utility is a (still experimental) sandbox mechanism for Linux that can be used to run GDAL/OGR binaries.
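Several of the measures above (skipping drivers at run time, limiting CPU time and memory, refusing arbitrary command-line arguments, checking dimensions early) can be combined in a small wrapper around the command-line utilities. The following is only a minimal sketch for a POSIX system: the function names, the limits and the minimal environment are assumptions made for illustration, not part of GDAL, and would need adapting to a real deployment.

```python
import os
import subprocess

# Hypothetical policy values; tune them for your own service.
MAX_PIXELS = 50_000 * 50_000
MAX_BANDS = 1_000
FORBIDDEN_PREFIXES = ("--config", "--optfile")


def check_user_args(args):
    """Return args as a list, or raise ValueError on a forbidden switch."""
    for arg in args:
        if arg.lower().startswith(FORBIDDEN_PREFIXES):
            raise ValueError(f"argument not allowed: {arg!r}")
    return list(args)


def restricted_env(skip_raster=(), skip_vector=()):
    """Build a minimal environment that disables the listed drivers."""
    # Assumption: PATH alone is enough to run the utility on your system.
    env = {"PATH": os.environ.get("PATH", "")}
    # Assumption: comma-separated lists; older GDAL releases may expect
    # spaces in GDAL_SKIP, so check the documentation of your version.
    if skip_raster:
        env["GDAL_SKIP"] = ",".join(skip_raster)
    if skip_vector:
        env["OGR_SKIP"] = ",".join(skip_vector)
    return env


def limit_resources(cpu_seconds, max_bytes):
    """Return a callable that caps CPU time and address space (POSIX only)."""
    def apply():
        import resource
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
    return apply


def run_utility(program, user_args, skip_raster=(), skip_vector=()):
    """Run a GDAL/OGR utility with checked arguments and capped resources."""
    cmd = [program] + check_user_args(user_args)
    return subprocess.run(
        cmd,
        env=restricted_env(skip_raster, skip_vector),
        preexec_fn=limit_resources(cpu_seconds=30, max_bytes=1 << 30),
        capture_output=True,
        timeout=60,
        check=False,
    )


def dimensions_look_sane(xsize, ysize, bands):
    """Early check on the values reported right after opening a dataset."""
    if xsize <= 0 or ysize <= 0 or not 1 <= bands <= MAX_BANDS:
        return False
    return xsize * ysize <= MAX_PIXELS
```

A service would call, for instance, run_utility("gdalinfo", ["input.tif"], skip_raster=("PDF",)). With the Python bindings, dimensions_look_sane() could be fed the RasterXSize, RasterYSize and RasterCount attributes of the freshly opened dataset.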
Known issues in drivers
- GDAL and OGR drivers do not always use file extensions to determine which file must be handled by which driver (this is a feature in most situations!). But, for example, a VRT file might be disguised as a .tif, .png, or .jpg file. So you cannot know which driver will handle a file just by looking at its extension. Using "gdalmanage identify the.file" can be a way to find the driver without attempting a full open of the file, but drivers that lack a specialized implementation of the Identify() method will fall back to the Open() method.
- Drivers depending on third-party libraries whose code has been embedded in GDAL. Binary builds might rely on the internal version or on the external version. If using the internal version, they might use an obsolete version of the third-party library that contains known vulnerabilities. Potentially concerned drivers are GTiff (libtiff, libgeotiff), PNG (libpng), GIF (giflib), JPEG (libjpeg), PCRaster (libcsf), GeoJSON (libjson-c), MapInfo File (MITAB lib), AVCBin/AVCE00 (AVCE00 lib). An internal version of ZLib is also contained in GDAL sources. Packagers of GDAL are recommended to use the external version of the libraries when possible (this might be impractical with libtiff due to the libtiff 4.X vs libtiff 3.X issue), so that GDAL benefits from security upgrades of those dependencies.
- Drivers using GDALOpen() or OGROpen() internally cause other drivers to be used (and their possible flaws exploited), without it being obvious at first sight. VRT, MBTiles, KMLSuperOverlay, RasterLite, PCIDSK, PDF, RPFTOC, RS2, WMS, WCS, WFS, ... are such drivers.
- Drivers depending on downloaded data (HTTP, DODS, WMS, WCS, WFS). These are a subset of the previously mentioned drivers, but the hostile payload might come from the Web, so local inspection of content is not sufficient.
- Other drivers can access remote resources. On the GDAL side: ECW (through ecwp://), PostGISRaster, RASDAMAN, GeoRaster, SDE. On the OGR side: to be done.
- XML based drivers: might be subject to denial of service by "billion laughs"-like attacks (though most OGR XML based drivers can detect such patterns).
- SQL injections: services that would accept untrusted SQL requests could trigger SQL injection vulnerabilities in OGR database-based drivers.
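Since the driver that opens a file cannot be deduced from its extension, a service may want to check that the first bytes of an uploaded file match the format the extension announces before passing it to GDAL. The sketch below is an assumption-laden illustration, not what GDAL does internally: the function name and the short allow-list of signatures are hypothetical, and a file with a valid signature can of course still be hostile.

```python
# Well-known leading bytes of a few formats (allow-list, deliberately short).
SIGNATURES = {
    "tif": (b"II*\x00", b"MM\x00*", b"II+\x00", b"MM\x00+"),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}
ALIASES = {"tiff": "tif", "jpeg": "jpg"}


def header_matches_extension(filename, header):
    """Return True only if header starts with a signature for the extension."""
    if "." not in filename:
        return False
    ext = filename.rsplit(".", 1)[-1].lower()
    ext = ALIASES.get(ext, ext)
    signatures = SIGNATURES.get(ext)
    if signatures is None:
        # Unknown extension: not on the allow-list, so refuse it.
        return False
    return header.startswith(signatures)


def file_matches_extension(path):
    """Read the first bytes of path and compare them with its extension."""
    with open(path, "rb") as f:
        return header_matches_extension(path, f.read(16))
```

With this check, a VRT file renamed to "image.tif" is refused because its content starts with XML text rather than a TIFF signature.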
GDAL MEM driver
- The opening syntax MEM:::DATAPOINTER=some_address can access any valid virtual memory of the process. Feeding it with a random address can cause a crash, or a read of unwanted virtual memory. The MEM driver is used by various algorithms and drivers in creation mode (which is not vulnerable to the DATAPOINTER issue), so completely disabling the driver might be detrimental to other areas of GDAL. It is possible to define the GDAL_NO_OPEN_FOR_MEM_DRIVER *compilation* flag to disable the MEM:::DATAPOINTER= syntax only.
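When rebuilding GDAL with that flag is not an option, a service can avoid handing user-supplied names to GDAL verbatim. One possible approach, sketched below under the assumption of a POSIX system and a dedicated upload directory (the function name is hypothetical), is to resolve every name to an absolute path inside that directory and pass only the resolved path on. A name such as MEM:::DATAPOINTER=0x1234 then becomes an ordinary (and nonexistent) file path that no longer starts with the MEM prefix.

```python
import os


def resolve_upload(base_dir, name):
    """Return the absolute path of name inside base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # An absolute name replaces base in os.path.join(), and realpath()
    # collapses ".." components and symlinks, so both escape attempts
    # end up outside base and are rejected below.
    candidate = os.path.realpath(os.path.join(base, name))
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"name escapes the upload directory: {name!r}")
    return candidate
```

The same check also rejects names starting with /vsicurl/ or another absolute prefix, since they resolve outside the upload directory.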
GDAL PDF driver
- The OGR_DATASOURCE creation option accepts a file name. So any OGR datasource, and potentially any file (see OGR VRT), could be read through this option and its content embedded in the generated PDF.
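A service that lets clients choose creation options can guard against this by accepting only options from an explicit allow-list. The sketch below assumes options arrive as NAME=VALUE strings; the function name and the content of the allow-list are illustrative assumptions.

```python
# Hypothetical allow-list; adapt it to the options your service needs.
ALLOWED_PDF_OPTIONS = frozenset({"COMPRESS", "DPI"})


def filter_creation_options(options, allowed=ALLOWED_PDF_OPTIONS):
    """Return options unchanged, or raise ValueError on a disallowed one."""
    kept = []
    for option in options:
        name, sep, _value = option.partition("=")
        # Compare names case-insensitively so "ogr_datasource" is caught too.
        if not sep or name.strip().upper() not in allowed:
            raise ValueError(f"creation option not allowed: {option!r}")
        kept.append(option)
    return kept
```

An allow-list is used rather than a deny-list of OGR_DATASOURCE alone, so that options added in later versions are refused until they have been reviewed.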
GDAL VRT driver
- Can be used to access any valid GDAL dataset. If a hostile party, knowing the location on the file system of a valid GDAL dataset, convinces a victim user to run gdal_translate on a VRT file and send back the result, that party might be able to steal data.
- The VRTRawRasterBand mechanism can allow access to any accessible file (not necessarily a valid GDAL dataset), which extends the scope of the issue mentioned above.
- /vsicurl/ filenames can be used, thus causing remote data to be downloaded.
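Before opening an untrusted VRT file, a service can list the sources it references and refuse those that point outside the directory of the VRT. The following is a rough sketch only: the function names are hypothetical, the set of element names (SourceFilename, SourceDataset, and SrcDataSource for OGR VRT) may be incomplete, and the plain substring test for DTD declarations is a simplistic guard against entity-expansion attacks.

```python
import xml.etree.ElementTree as ET

# Elements assumed to carry a dataset name; the list may be incomplete.
SOURCE_TAGS = {"SourceFilename", "SourceDataset", "SrcDataSource"}


def vrt_sources(vrt_text):
    """Return the dataset names referenced by a VRT document."""
    # Refuse DTDs outright, so that no entity can be declared and expanded.
    if "<!DOCTYPE" in vrt_text or "<!ENTITY" in vrt_text:
        raise ValueError("DTD declarations are not accepted")
    root = ET.fromstring(vrt_text)
    return [(el.text or "").strip() for el in root.iter()
            if el.tag in SOURCE_TAGS]


def is_suspicious(source):
    """Flag absolute paths, parent references and prefixed names."""
    parts = source.replace("\\", "/").split("/")
    # ":" covers URLs, drive letters and prefixes such as "MEM:::".
    return source.startswith(("/", "\\")) or ".." in parts or ":" in source


def suspicious_sources(vrt_text):
    """Return the referenced sources that fail the checks above."""
    return [s for s in vrt_sources(vrt_text) if is_suspicious(s)]
```

A VRT whose sources are all plain relative file names yields an empty list. Any /vsicurl/ name, absolute path or ".." component is reported, and the service can then reject the file.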
OGR VRT driver
- Similar issues to those of the GDAL VRT driver.
- <SrcSQL> could be used to modify data.
- Writing drivers compatible with the Linux seccomp mechanism could be a way of limiting the effects of bugs in the driver. Conceptually, this could be an extension of the GDAL API Proxy mechanism (GDAL core communicating via a pipe with the drivers), with low-level routines redirected as well.