Version 63 (modified by Robert Coup, 9 years ago)

Google Summer of Code

GDAL participates in the Google Summer of Code under the OSGeo umbrella.

2015 Ideas List

*To be completed...*

1. Security enhancements

Because GDAL/OGR deals essentially with external input, mostly in the form of files but also network exchanges, it can be exposed to various security threats. The page http://trac.osgeo.org/gdal/wiki/SecurityIssues summarizes such issues.

The project consists of:

  • manual code auditing to detect and fix issues,
  • use of automatic fuzzing tools to stress-test the library (such as afl),
  • development and use of generic classes/methods/practices (e.g. detection of integer overflows) to ease robust development,
  • exploring sandboxing solutions: writing drivers compatible with the Linux seccomp mechanism could be a way of limiting the effects of bugs in drivers. Conceptually, this could be an extension of the GDAL API Proxy mechanism (GDAL core communicating via a pipe with the drivers), with low-level routines redirected as well.
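
As an illustration of the "detection of integer overflows" item, the check-before-multiply idiom can be sketched as follows. This is only a sketch in Python to show the logic; `checked_mul`, `buffer_size` and `INT32_MAX` are hypothetical names, not GDAL functions, and real helpers for the library would be written in C/C++.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

def checked_mul(a, b, limit=INT32_MAX):
    """Return a * b, or None if the product would exceed `limit`.

    Mirrors the usual C idiom of testing before multiplying.
    Only non-negative operands are handled in this sketch.
    """
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    if a != 0 and b > limit // a:
        return None  # the product would not fit in a signed 32-bit integer
    return a * b

def buffer_size(width, height, bytes_per_pixel):
    """Size in bytes of a raster buffer, or None if it overflows."""
    pixels = checked_mul(width, height)
    if pixels is None:
        return None
    return checked_mul(pixels, bytes_per_pixel)
```

A driver reading dimensions from an untrusted file header would reject the file when `buffer_size` returns None instead of allocating a wrapped-around size.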

Students who want to apply for this subject will first have to demonstrate their capabilities and interest in the topic, for example by identifying a few existing defects in the code base and proposing ways of addressing them.

Skills:

  • programming skills needed - C/C++, awareness of software security issues and practices related to those languages
  • difficulty level - moderate/high

Possible mentor/co-mentor: Even Rouault (even.rouault at spatialys.com)

2. Integration of cpp GDAL utilities into GDAL core library

Interest has been expressed several times in having, e.g., the VRTBuilder class included in GDAL core. This proposal extends the idea by integrating all functionality of the C++ GDAL utilities into GDAL core, providing a unique entry point like

int GDALRunUtility(const char* pszArguments, GDALProgressFunc pfnProgress, void* pProgressArg, ...)

allowing something like the following syntax:

GDALRunUtility("gdal_translate in.tif out.tif -of PNG", NULL, NULL)
or
GDALRunUtility("gdalwarp $1 out.tif -t_srs EPSG:4326", NULL, NULL, hSrcDS)
or
GDALRunUtility("gdalwarp $1 &$2 -t_srs EPSG:4326 -of MEM", NULL, NULL, hSrcDS, &hOutDS)
or
GDALRunUtility("gdalbuildvrt out.vrt *.tif", NULL, NULL)
or
GDALRunUtility("gdalbuildvrt out.vrt $*", NULL, NULL, hSrcDS1, hSrcDS2, NULL)
or
GDALRunUtility("gdalbuildvrt &$1 $*", NULL, NULL, &hOutVRT, hSrcDS1, hSrcDS2, NULL)

where:
$X is substituted with a GDAL object (input)
&$X is substituted with a GDAL object (output)
$* is substituted with a NULL-terminated list of GDAL objects
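
The substitution rules above can be sketched as a small argument binder. This is a hypothetical illustration in Python, not a proposed implementation: `bind_arguments` is an invented name, and it assumes that `$*` consumes every object after the highest numbered placeholder (the examples above are consistent with this, but the proposal does not spell it out). In Python no NULL terminator is needed for the list.

```python
import re
import shlex

# Matches "$1", "&$2" or "$*" as a whole token.
PLACEHOLDER = re.compile(r"^(&?)\$(\d+|\*)$")

def bind_arguments(command, *objects):
    """Split `command` and replace placeholders with entries of `objects`.

    Returns (utility_name, args). Each arg is either a plain string or a
    tuple ("in" | "out" | "list", value).
    """
    tokens = shlex.split(command)
    utility, raw_args = tokens[0], tokens[1:]

    # Assumption: "$*" takes the objects after the highest numbered placeholder.
    highest = 0
    for token in raw_args:
        match = PLACEHOLDER.match(token)
        if match and match.group(2) != "*":
            highest = max(highest, int(match.group(2)))

    args = []
    for token in raw_args:
        match = PLACEHOLDER.match(token)
        if not match:
            args.append(token)  # ordinary command-line argument
        elif match.group(2) == "*":
            args.append(("list", list(objects[highest:])))
        else:
            index = int(match.group(2)) - 1
            if index < 0 or index >= len(objects):
                raise IndexError("no object supplied for " + token)
            args.append(("out" if match.group(1) else "in", objects[index]))
    return utility, args
```

For example, `bind_arguments("gdalbuildvrt &$1 $*", vrt, src1, src2)` yields one output slot bound to `vrt` and a list holding the two sources.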

Each utility would then be just a simple main() that calls the library code.

Besides other advantages, the incorporation would allow more flexible application development, e.g. calling GDAL utilities without the need to make system calls.

The project consists of:

  • extracting the code of the utilities, i.e. removing any exit(), global variables, etc., and converting them to proper functions
  • integrating the code in libgdal (or a libgdal_utilities)
  • developing a proper syntax for GDALRunUtility()
  • replacing the utility code with a simple main calling GDALRunUtility()
  • checking compatibility with SWIG bindings
  • updating the build system and bindings

Possible mentor/co-mentor: Even Rouault (even.rouault at spatialys.com)

3. GDAL image stretch/filter utility

GDAL utilities currently have limited built-in support for image stretching and filtering beyond simple linear scaling via “-scale” in gdal_translate. The VRT format has a method to accept a filter algorithm but code still must be written. Here I propose that faster C++ image stretching and filtering implementations (based on existing methods and already sand-boxed within a Python library) be written more formally for GDAL.

In short, various image stretching and filtering techniques allow one to bring out details in satellite or airborne images. As various GDAL applications mature, such as image matching and feature extraction, having a well-tested set of stretches and filters can only help these methods evolve further.

Skills:

  • programming skills needed - C/C++
  • difficulty level - moderate

As a start I recommend building C++ implementations of the stretches and filters as available in the sand-box code pystretch (from: https://pythonhosted.org/PyStretch/examples.html ):

Linear Stretches:

  • Linear Contrast Stretch, Binary Contrast Stretch, Inverse Contrast Stretch
  • Standard Deviation Stretch
  • High Cut Stretch, Low Cut Stretch

Non-linear Stretches:

  • Gamma Stretch
  • Histogram Equalization
  • Logarithmic Stretch

Filters:

  • Laplacian Filter
  • High Pass Filter (3x3 Kernel, 5x5 Kernel)
  • Gaussian Filter, Gaussian High Pass Filter
  • Mean Filter, Conservative Filter, Median Filter
  • Running Standard deviation
  • Custom
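
To give a feel for the scale of the work, three of the items listed above are sketched below in plain Python. These are illustrative only and are not taken from pystretch: the function names are invented, the gamma stretch uses one common convention (exponent 1/gamma on values normalized by `max_value`), and the mean filter averages only the neighbours that exist at the image border. The project itself would implement these in C++.

```python
def linear_stretch(pixels, out_min=0.0, out_max=255.0):
    """Linearly rescale values so the input range maps onto [out_min, out_max]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [out_min for _ in pixels]  # flat image: nothing to stretch
    scale = (out_max - out_min) / (hi - lo)
    return [out_min + (p - lo) * scale for p in pixels]

def gamma_stretch(pixels, gamma, max_value=255.0):
    """Apply out = max_value * (in / max_value) ** (1 / gamma)."""
    return [max_value * (p / max_value) ** (1.0 / gamma) for p in pixels]

def mean_filter_3x3(image):
    """Mean filter over a 3x3 window on a list-of-rows image."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Clamp the window to the image so borders use fewer neighbours.
            window = [image[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, rows))
                      for cc in range(max(c - 1, 0), min(c + 2, cols))]
            out[r][c] = sum(window) / len(window)
    return out
```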

Possible mentor/co-mentor: Trent Hare (thare at usgs.gov) and Jay Laura (also USGS)

4. Tool(s) for performance profiling and option tuning

Some GDAL options can have enormous effects on the performance of some operations, depending on dataset size/complexity/source and all sorts of other factors. I'm imagining a tuning tool that looks at which caches/limits are being "hit" (or not) during an operation (e.g. a specific gdalwarp) and where time is being spent (IO, CPU, ...), and suggests better settings for your datasets & host configuration. Maybe this could later be expanded to select better "defaults" automatically. I'm thinking of settings like: GDAL_MAX_DATASET_POOL_SIZE, GDAL_CACHEMAX, GDAL_SWATH_SIZE, VSI_CACHE, GDAL_DISABLE_READDIR_ON_OPEN, OSM_MAX_TMPFILE_SIZE, warp memory/threading/options, and possibly per-format options: for GTiff, things like tiling, interleaving, overviews, GTIFF_VIRTUAL_MEM_IO, GTIFF_DIRECT_IO. Creating a structure that future measurements and options can fit into will be an important design issue, and this project will require digging deep into the implementations of various settings & caches used by GDAL.

For reporting, possibly something along the lines of MySQLTuner's output. Invocation might look like gdalwarp --tune ..., but it would also be good if usage via the library/bindings could be profiled in the same way and the output dumped somewhere for later analysis/reporting (e.g. gdaltune my_warp.gdaltune).
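
One possible shape for the "structure that future measurements and options can fit into" is a list of per-cache records plus rules that turn them into suggestions. The sketch below is purely hypothetical: `CacheStats`, `suggest`, the counters and the 90% threshold are all invented for illustration, and GDAL does not currently expose hit/miss counters in this form.

```python
class CacheStats:
    """Hit/miss counters collected for one cache during a profiled run."""

    def __init__(self, option, setting, hits, misses):
        self.option = option    # configuration option that governs the cache
        self.setting = setting  # value in effect during the run
        self.hits = hits
        self.misses = misses

    @property
    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 1.0

def suggest(stats, min_ratio=0.9):
    """Return one human-readable tip per cache whose hit ratio is too low."""
    tips = []
    for s in stats:
        if s.hit_ratio < min_ratio:
            tips.append("%s=%s: hit ratio %.0f%%, consider raising it"
                        % (s.option, s.setting, s.hit_ratio * 100))
    return tips
```

New measurements (time spent in IO versus CPU, for instance) could be added as further record types with their own rules, without changing the reporting code.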

Skills:

  • programming skills needed - C/C++
  • difficulty level - hard

Possible mentor/co-mentor: Robert Coup (robert.coup at koordinates.com)

5. Promoting VSI

The virtual filesystem (VSI) functionality in GDAL (vsizip, vsicurl, vsimem, vsisubfile, etc.) is pretty cool, and is useful for a lot of things outside GDAL. Look at whether an external project could be a better place for it to live (even if it's just a separate build/packaging from code that continues to live in GDAL): adding tests & CI, looking at cross-platform issues, documenting vsi_preload.so, making a library other apps could use for the functionality (libvsi?), and possibly creating a FUSE implementation that maps onto the VSI code (à la `mount -t vsi /vsizip/vsicurl/http://example.com/foo.zip foo/`).
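
A FUSE layer or a standalone library would first need to split a chained path such as the one in the mount example into its handlers and the underlying resource. A minimal sketch in Python follows; `split_vsi_path` and `KNOWN_HANDLERS` are hypothetical names, only the four handlers named above are recognised, and a relative directory that happens to share a handler name would be misread by this simplification.

```python
KNOWN_HANDLERS = ("vsizip", "vsicurl", "vsimem", "vsisubfile")

def split_vsi_path(path):
    """Split a chained VSI path into (handlers, remainder).

    Handlers are returned outermost first; the remainder is whatever
    the innermost handler is asked to open.
    """
    handlers = []
    rest = path
    while True:
        name, sep, tail = rest.lstrip("/").partition("/")
        if sep and name in KNOWN_HANDLERS:
            handlers.append(name)
            rest = tail
        else:
            return handlers, rest
```

With the path from the mount example, this returns `["vsizip", "vsicurl"]` and `"http://example.com/foo.zip"`.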

Skills:

  • programming skills needed - C/C++
  • experience with build, test, CI, packaging tools
  • difficulty level - moderate

Possible mentor/co-mentor: Robert Coup (robert.coup at koordinates.com)

6. OpenFileGDB Write support

The existing OpenFileGDB driver doesn't implement writing, but is more stable for reading than the proprietary ESRI driver in most cases.

This project would aim to add some level of write support to the OpenFileGDB driver. The primary goal would be to create files that the OpenFileGDB driver can read again, and the secondary goal would be to improve compatibility so that ArcGIS itself can read the files. As Even describes it, reverse engineering and black box testing aren't always fun, but they are great skills to have, and there'd be as much software, support & test data as we can get:

Hum, that depends on the perseverance of the student to not give up if some proprietary software refuse to read its neat generated geodatabase or crashes, whereas it can be read with the openfilegdb read side ;-)

Skills:

  • programming skills needed - C/C++
  • some experience with reverse engineering of file formats/protocols
  • difficulty level - hard

Possible mentor/co-mentor: Robert Coup (robert.coup at koordinates.com)

2014 Ideas List

1. Automatic Geo-referencer

  • Presently, images are geo-referenced by selecting points manually.
  • This project aims to automate this task.

This project has two parts:

  1. Modify the existing GDAL-Correlator project, which implements a simple SURF algorithm, to support multi-band images and large datasets.
  2. Apply the modified algorithm to geo-referencing.

Mentor: Chaitanya Kumar CH (chaitanya[dot]ch[at]gmail.com)

2. AutoCAD DWG OGR Driver based on libredwg

Note that libredwg development seems to be stalled, but libdwg, the original library that served as a starting point for libredwg, saw some activity in late 2013. Some review of status would be appropriate before committing. This would be an alternative to the more proprietary Open Design Alliance library based driver.

Currently GDAL supports AutoCAD DWG files via the Open Design Alliance Teigha library, which is not freely available and has a strict license. The libredwg project (GNU GPL v3) provides DWG support (R13, R14 and R2000 versions). The existing driver (http://gdal.org/ogr/drv_dwg.html) needs to be rewritten, or a new DWG driver written.

  • programming skills needed - C/C++
  • difficulty level - high

Possible mentor/co-mentor: Dmitry Baryshnikov (polimax@…)

3. Adding support for "M" dimension in OGR Geometries

Support should be in line with modern OGC and ISO simple features geometry standards. Note that GEOS does not yet support M, which will be a limiting factor.

  • programming skills needed - C/C++
  • difficulty level - moderate

Possible mentor/co-mentor: Even Rouault (even.rouault at mines-paris.org)

4. Adding support for VERT_CS

GDAL support for VERT_CS is very limited. OGRSpatialReference::SetVertCS only supports a name, a datum name and a VertDatumClass. Moreover, drivers such as GeoTIFF and SHP cannot store vertical CS information, and there is no reprojection between different vertical CSs. Better VERT_CS support in GDAL is needed to process SAR and LAS data and to reproject more accurately using vertical coordinate system transformations.

  • programming skills needed - C/C++
  • difficulty level - moderate

Possible mentor/co-mentor: Dmitry Baryshnikov (polimax@…)

5. Geography Network support

Use any OGR driver to create an abstract network model with capabilities such as routing, rules and references. There is a lack of abstract open source network models suitable for storing network data (engineering networks, road routing, etc.). Such a model is needed to store network data in the preferred OGR format. The model should also support some algorithms: computing the shortest path (Dijkstra), creating a path array using different criteria (K shortest paths), searching for disconnected segments, etc.
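
For reference, the shortest-path requirement corresponds to Dijkstra's algorithm, sketched below in Python over a plain adjacency dictionary. The graph representation and the name `shortest_path` are assumptions made for illustration; the project would read edges and costs from OGR layers instead, and edge costs must be non-negative.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on {node: [(neighbour, cost), ...]}.

    Returns (total_cost, [nodes on the path]) or (None, []) if the
    target cannot be reached from the source.
    """
    dist = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry for an already settled node
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in dist:
        return None, []
    # Walk back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return dist[target], path[::-1]
```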

Related projects : PostGIS topology, PgRouting, ...

  • programming skills needed - C/C++
  • difficulty level - high

Possible mentor/co-mentor: Dmitry Baryshnikov (polimax@…)

Note: done as GSoC 2014 project

6. Bring the OGR style support up to speed

The Feature Style Specification is outdated. Several popular formats, such as SLD and SVG, could be supported by OGRLayer, and some format conversion may be implemented. The work is to select a new style specification format, rewrite GDAL code to support it, and add the functionality to drivers such as DXF, KML, MapInfo tab, etc. Possible further work would be to study what changes are needed in the MapServer project so that it can use it.

  • programming skills needed - C/C++
  • difficulty level - moderate

Possible mentor/co-mentor: Even Rouault (even.rouault at mines-paris.org)

7. Develop a GDAL Raster driver for GeoPackage

The GeoPackage specification has just been adopted by the OGC as a standard portable container for raster and vector content. A GeoPackage driver for the vector part already exists in OGR. The project consists of developing support for the raster part of the specification, in both reading and creation modes.

  • programming skills needed - C/C++
  • difficulty level - moderate/high

Possible mentor/co-mentor: Even Rouault (even.rouault at mines-paris.org)

Note: done in GDAL 2.0dev

8. Add new supported data types

GDAL supports a limited set of data types in attribute fields, while modern formats can support more data types in attributes. The student should check the attribute types of all the OGR drivers and create a list of new types that must be implemented in GDAL; the GUID type is one candidate. OGR drivers that do not support such types must convert the data to the closest available type (e.g. guid -> string). Deprecated types (e.g. OFTWideString, OFTWideStringList) should also be removed. The main OGR drivers should be updated to support the new types.

  • programming skills needed - C/C++
  • difficulty level - low

Possible mentor/co-mentor: Dmitry Baryshnikov (polimax@…)

Note: partly done with 64 bit integers, boolean, 16 bit integers

9. OGR Driver for MongoDB

MongoDB, a document database that provides high performance, high availability and easy scalability, can be a good platform for storing extremely large spatial datasets and for supporting high-performance geo-computation and real-time spatial analysis at large scale. This project aims to develop an OGR driver for MongoDB so that GDAL-based applications such as QGIS, GeoServer, MapServer and ArcGIS can read and write spatial data stored in it, thus enabling an open source GIS ecosystem powered by this NoSQL database.

  • programming skills needed - C/C++
  • difficulty level - moderate/high

Possible mentor/co-mentor: Even Rouault (even.rouault at mines-paris.org)

Note: a MongoDB driver has been developed and is available in an out-of-tree repository

2013 Ideas List

  1. OGC WMTS Driver - likely as an extension to the existing WMS driver which does various tiling schemes too. OGC WMTS is the OGC Web Map Tile Service
  1. AutoCAD DWG OGR Driver based on libredwg. Note that libredwg development seems to be stalled. Some review of status would be appropriate before committing. This would be an alternative to the more proprietary Open Design Alliance library based driver.
  1. Adding support for "M" dimension in OGR Geometries, in line with modern OGC and ISO simple features geometry standards. Note that GEOS does not yet support M, which will be a limiting factor.
  1. OSGeo4W 64 builds - update scripts, OSGeo4W installer, etc to build GDAL and related components for 64bit windows deployments.
  1. WCS time series (1D dataset) support - compatible with RASDAMAN & compliant with the WCS spec. In line with this: https://wiki.services.eoportal.org/tiki-download_forum_attachment.php?attId=53
  1. GDAL/OGR on the Web - A Django/GeoDjango based Web Application for doing ogr or gdal conversions on the Web.
  1. Raster / Vector Geo-referencer on the Web: upload your unreferenced vectors/rasters and georeference them. Similar to this: http://www.youtube.com/watch?feature=player_embedded&v=88gt1gj2dbs but entirely based on GDAL and completely reusable.
  1. Support for Multiple Geometry libraries in GDAL/OGR: in addition to GEOS (LGPL), we could add support for the Boost Geometry Library (http://www.boost.org/doc/libs/1_53_0/libs/geometry/doc/html/index.html). This would allow a licensing scheme that would enable GDAL/OGR apps on iOS.

2010 Mentors

  • Frank Warmerdam (warmerdam at pobox.com)
  • Howard Butler (hobu.inc at gmail.com) OGR-related items
  • Philippe Vachon (philippe at cowpig.ca) - general raster, threading/parallelism items

2010 Ideas List

Google will be sponsoring another Summer of Code for 2010.

  1. OpenEV2: OpenEV is a GUI tool for efficiently displaying and analyzing geospatial data formats supported by GDAL/OGR (including GeoTIFF, MrSID, ECW, ..., ESRI Shape files, ...). It can convert between file formats, reproject, crop, and display 3D terrain with OpenGL based on elevation from DEM files, on Linux, Windows and Mac. Have a look at the screenshots or try it. The port to GTK 2.0 and GDAL python-ng is almost finished but needs a bit of work, packaging and improvement; more info is in the OpenEV2 updated post. Knowledge: Python, GNU tools, GTK 2.0, Linux and partly C, C++.
  1. PNG Driver: Implement an efficient PNG driver using libpng, with support for optimization of PNG images exported by GDAL (info: tutorial + png driver). You can reuse the existing open-source NeuQuant algorithm for RGBA and RGB export. NeuQuant is an easy-to-use practical demonstration of the power of Kohonen neural networks. Its source code can be included into GDAL from the original implementation of the algorithm and/or from the pngnq utility. There are also other open-source tools implementing PNG optimization which can be reused (AdvPNG, OptiPNG, PNGcrush) or can be a source of inspiration (PNGOut). The result should be an improved PNG driver for GDAL together with color quantization functions which produce PNG files with optimized file size. The result of this work is going to be usable in MapServer, GRASS, MapTiler and other GDAL-based projects. Knowledge: C, C++
  1. ODBC Driver: Implement write support for the ODBC driver and include support for MSSQL2008 spatial database. The current ODBC driver should be extended to create or transfer spatial data into the ODBC data sources like Microsoft SQL Server. The driver must have built-in support to auto create the XMIN, YMIN, XMAX and YMAX shape envelope values for the non-spatial databases and the spatial index for the MSSQL2008 spatial databases. (info: http://www.gdal.org/ogr/drv_odbc.html). Result of this work is going to be usable in any project using the OGR libraries. Knowledge: C, C++, MSSQL2008 Spatial

Note: an MSSQLSpatial driver now exists.

  1. OGR SQL .NET Data Provider: Implement a .NET Data Provider interface for the OGR SQL API. This sample application would allow the user to use OgrSqlDataAdapter to read the result of an OGR SQL query into a DataSet which could be used as the datasource of the bindable .NET controls. The provider would also support transactions and batch queries and read the multiple results into multiple data tables within the same DataSet. The OgrSqlDataReader would provide a convenient way to retrieve the records sequentially for the user. (info: http://www.gdal.org/ogr/ogr_sql.html). Result of this work would go to the sample application section of the C# interface. Knowledge: C#, .NET Framework Class Libraries
  1. OGR WFS read (or read/write) driver using existing OGR GML driver for feature parsing.

Note: an OGR WFS driver now exists.

  1. Develop a driver for IBM DB2 and its Spatial Extender
  1. GDAL_CALC.PY - Development of a simple raster calculator based on Python+GDAL. Use sample:

% gdal_calc a=img1.tif b=img2.tif c=img3.tif -calc c=((a+b)/2)

Note: a gdal_calc.py script now exists.
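
The idea behind the sample above, evaluating an expression pixel by pixel over named inputs, can be sketched in a few lines of Python. This is an illustration only and not the code of the gdal_calc.py script: `raster_calc` is a hypothetical name, bands are plain lists rather than GDAL rasters, and `eval` with emptied builtins is used for brevity, which is not a safe way to handle untrusted expressions.

```python
def raster_calc(expression, **bands):
    """Evaluate `expression` for each pixel position of the named bands.

    Every band is a flat list of pixel values; all bands must have the
    same length. Band names are the variables used in the expression.
    """
    lengths = {len(values) for values in bands.values()}
    if len(lengths) != 1:
        raise ValueError("all bands must have the same number of pixels")
    code = compile(expression, "<calc>", "eval")
    names = list(bands)
    return [eval(code, {"__builtins__": {}}, dict(zip(names, pixel)))
            for pixel in zip(*bands.values())]
```

For instance, `raster_calc("(a+b)/2", a=[2, 4], b=[4, 8])` averages the two bands position by position.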

  1. Multithreading - Work to make GDAL and/or OGR threadsafe, and develop a test suite to validate.

or your own ideas..

2009 Ideas List

  1. KML Driver: Develop an enhanced KML driver using Google's libkml library. A new driver based around Google's reference implementation would fix some limitations in the current driver (no multi-geometries, no KMZ support, etc), make it easier to keep the driver up to date as KML evolves, and should generally make it easier for GDAL software to exchange data with Google Earth. Both OGR (vector) and GDAL (raster) KML drivers could be developed.

Note: an OGR LIBKML driver now exists.

  1. OpenEV2: OpenEV is a GUI tool for efficiently displaying and analyzing geospatial data formats supported by GDAL/OGR (including GeoTIFF, MrSID, ECW, ..., ESRI Shape files, ...). It can convert between file formats, reproject, crop, and display 3D terrain with OpenGL based on elevation from DEM files, on Linux, Windows and Mac. Have a look at the screenshots or try it. The port to GTK 2.0 and GDAL python-ng is almost finished but needs a bit of work, packaging and improvement; more info is in the OpenEV2 updated post. Knowledge: Python, GNU tools, GTK 2.0, Linux and partly C, C++.
  1. PNG Driver: Implement an efficient PNG driver using libpng, with support for optimization of PNG images exported by GDAL (info: tutorial + png driver). You can reuse the existing open-source NeuQuant algorithm for RGBA and RGB export. NeuQuant is an easy-to-use practical demonstration of the power of Kohonen neural networks. Its source code can be included into GDAL from the original implementation of the algorithm and/or from the pngnq utility. There are also other open-source tools implementing PNG optimization which can be reused (AdvPNG, OptiPNG, PNGcrush) or can be a source of inspiration (PNGOut). The result should be an improved PNG driver for GDAL together with color quantization functions which produce PNG files with optimized file size. The result of this work is going to be usable in MapServer, GRASS, MapTiler and other GDAL-based projects. Knowledge: C, C++
  1. ODBC Driver: Implement write support for the ODBC driver and include support for MSSQL2008 spatial database. The current ODBC driver should be extended to create or transfer spatial data into the ODBC data sources like Microsoft SQL Server. The driver must have built-in support to auto create the XMIN, YMIN, XMAX and YMAX shape envelope values for the non-spatial databases and the spatial index for the MSSQL2008 spatial databases. (info: http://www.gdal.org/ogr/drv_odbc.html). Result of this work is going to be usable in any project using the OGR libraries. Knowledge: C, C++, MSSQL2008 Spatial
  1. GDAL2Tiles/MapTiler: Implementation of the coming OGC WMTS standard, implementation of the pixel-precise warping (by warped VRT editing), implementation of the opacity slider control as one of the official OpenLayers Addins, direct tiling of global maps into Spherical Mercator from WGS84 (bug), support for JPEG tiles, support for cutline clipping, NODATA transparency, bug-fixing of open issues. Code is going to be submitted into GDAL SVN and MapTiler SVN. Project MapTiler (the GUI for GDAL2Tiles) is going to be published as a stable version 1.0 with all binary installers (Windows/Linux/Mac) and with support for localization. Knowledge: Python, JavaScript, GNU tools, partly C, C++
  1. PostGIS / WKT Raster Driver: Implementation of a read-only GDAL driver for the WKT Raster extension to PostGIS. The WKT Raster is an ongoing project aiming at developing raster support in PostGIS (...) goal is to implement the RASTER type as much as possible like the GEOMETRY type is implemented in PostGIS. WKT Raster crash course #1 gives a detailed overview of the project background. In short, implementation steps will include parsing the WKB format, defining specializations of GDALDataset and GDALRasterBand, and implementing read operations for bands. The idea behind WKT Raster in GSoC 2009 is to deliver a prototype (or proof of concept) that can be used as a base for further development, so only two raster types are considered: 1 band of GDT_Byte (Grayscale), 3 x 1 band of GDT_Byte (RGB). Knowledge: raster graphics, basic understanding of Well-Known-Binary-like formats, libpq, C (strong), C++ (basic).
  1. OGR SQL .NET Data Provider: Implement a .NET Data Provider interface for the OGR SQL API. This sample application would allow the user to use OgrSqlDataAdapter to read the result of an OGR SQL query into a DataSet which could be used as the datasource of the bindable .NET controls. The provider would also support transactions and batch queries and read the multiple results into multiple data tables within the same DataSet. The OgrSqlDataReader would provide a convenient way to retrieve the records sequentially for the user. (info: http://www.gdal.org/ogr/ogr_sql.html). Result of this work would go to the sample application section of the C# interface. Knowledge: C#, .NET Framework Class Libraries

or your own ideas..

inspiration also from:

2008 Ideas List

These are suggestions. Students are encouraged to come up with their own ideas as well.

  1. Implement GeoPNG/GeoJPEG by embedding coordinate system and geotransformation information (possibly in GML) as chunks in PNG and JPEG files (see GML JP2 for a model of how this might be done).
  1. PNG Driver (using libpng)
  1. OGR WFS read (or read/write) driver using existing OGR GML driver for feature parsing.
  1. Implementation of alternative driver for GML 2 and GML 3 using Expat XML Parser
  1. Implementation of read-only WFS driver using GML driver based on the Expat XML Parser.
  1. Develop a driver for GeoRSS
  1. Develop a driver for OpenStreetMap protocol
  1. Develop a driver for IBM DB2 and its Spatial Extender
  1. Extend the GeoJSON driver with caching capabilities when accessing a remote datasource.
  1. Development of Java bindings.
  1. GDAL/OGR for Windows CE: porting new drivers
  1. Development of C# language SWIG bindings for .NET Compact Framework and creating test applications for it.
  1. Creating an ASP.NET multithreading testbed for the GDAL C# bindings, by using the thread pool approach.
  1. Development of bindings for new programming languages: Lua, Ada (GNAT), put your favorite language here
  1. Development of GDAL Read/Write Driver for tiles, derived from GDAL2Tiles utility.
  1. GDAL2Tiles - support for TMS tiles with global-mercator profile to make overlays with Google Maps, MS Virtual Earth, etc. possible.
  1. Extend the HDF5 driver to support writing datasets. This should include an effort to produce HDF5 datasets according to NASA HDF5 metadata conventions where possible. HDF5 is a new generation format expected to be widely used for science data products from NASA and other agencies.
  1. GDAL_CALC.PY - Development of a simple raster calculator based on Python+GDAL. Use sample:

% gdal_calc a=img1.tif b=img2.tif c=img3.tif -calc c=((a+b)/2)

  1. Develop an enhanced KML driver using Google's libkml library. A new driver based around Google's reference implementation would fix some limitations in the current driver (no multi-geometries, no KMZ support, etc), make it easier to keep the driver up to date as KML evolves, and should generally make it easier for GDAL software to exchange data with Google Earth. Both OGR (vector) and GDAL (raster) KML drivers could be developed.