| 108 | '''4. Tool(s) for performance profiling and option tuning''' |
| 109 | |
| 110 | Some [http://trac.osgeo.org/gdal/wiki/ConfigOptions GDAL options] can have enormous effects on the performance of some |
| 111 | operations, depending on dataset size/complexity/source and all sorts of |
| 112 | other factors. I'm imagining a tuning tool that looks at which |
| 113 | caches/limits are being "hit" (or not) during an operation (e.g. a specific |
| 114 | gdalwarp run), where time is being spent (IO, CPU, ...), and suggests better |
| 115 | settings for your datasets & host configuration. Maybe this could be |
| 116 | expanded in future to select better "defaults" automatically. I'm thinking |
| 117 | settings like: `GDAL_MAX_DATASET_POOL_SIZE`, `GDAL_CACHEMAX`, `GDAL_SWATH_SIZE`, |
| 118 | `VSI_CACHE`, `GDAL_DISABLE_READDIR_ON_OPEN`, `OSM_MAX_TMPFILE_SIZE`, warp |
| 119 | memory/threading/options, and possibly per-format options -- for GTiff |
| 120 | things like tiling, interleaving, overviews, `GTIFF_VIRTUAL_MEM_IO`, |
| 121 | `GTIFF_DIRECT_IO`. Creating a structure that future measurements and options can fit into |
| 122 | will be an important design issue, and this project will require digging deep |
| 123 | into the implementations of various settings & caches used by GDAL. |
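For context, this is roughly how such settings are passed by hand today. The sketch below only assembles a `gdalwarp` argument list and does not run anything. The helper name `build_gdalwarp_command` and the values used are made up for illustration; `--config KEY VALUE`, `-wm`, `-multi` and `-wo NUM_THREADS=...` are existing GDAL/gdalwarp command line options.

```python
def build_gdalwarp_command(src, dst, config=None, warp_memory_mb=None,
                           multithreaded=False):
    """Return an argv list for gdalwarp with the given tuning settings.

    Hypothetical helper: it only builds the command, it does not execute it.
    """
    argv = ["gdalwarp"]
    # Each configuration option is passed as: --config KEY VALUE
    for key, value in sorted((config or {}).items()):
        argv += ["--config", key, str(value)]
    # -wm sets the memory available to the warp API (in MB for small values)
    if warp_memory_mb is not None:
        argv += ["-wm", str(warp_memory_mb)]
    # -multi enables multithreaded warping; NUM_THREADS is a warp option
    if multithreaded:
        argv += ["-multi", "-wo", "NUM_THREADS=ALL_CPUS"]
    argv += [src, dst]
    return argv


if __name__ == "__main__":
    # Illustrative values only, not recommendations.
    print(" ".join(build_gdalwarp_command(
        "in.tif", "out.tif",
        config={"GDAL_CACHEMAX": 512, "GDAL_DISABLE_READDIR_ON_OPEN": "TRUE"},
        warp_memory_mb=256,
        multithreaded=True,
    )))
```

A tuning tool would in effect be choosing these values for you instead of leaving them to trial and error.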
| 124 | |
| 125 | In terms of reporting, possibly something along the lines of MySQLTuner ([https://www.thomas-krenn.com/en/wiki/MySQL_Performance_Tuning example output]). Maybe |
| 126 | an invocation like `gdalwarp --tune ...`, but it would also be good if usage via |
| 127 | library/bindings could be profiled in the same way and the output dumped |
| 128 | somewhere for later analysis/reporting (e.g. `gdaltune my_warp.gdaltune`). |
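As a very rough sketch of the "dump measurements, analyse later" idea, something like the following could be a starting point. Everything here is an assumption made for illustration: the metric names (`block_cache_hits`, `io_seconds`, ...), the thresholds, the JSON layout and the suggestion rules are invented, and none of it reflects counters that GDAL currently exposes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Measurement:
    name: str      # hypothetical metric name, e.g. "block_cache_misses"
    value: float
    unit: str


def to_json(measurements):
    """Serialise measurements so they can be dumped for later analysis."""
    return json.dumps([asdict(m) for m in measurements])


def from_json(text):
    """Load measurements previously written by to_json()."""
    return [Measurement(**d) for d in json.loads(text)]


def suggest(measurements):
    """Apply a few illustrative rules and return human-readable hints."""
    by_name = {m.name: m.value for m in measurements}
    hints = []

    hits = by_name.get("block_cache_hits", 0)
    misses = by_name.get("block_cache_misses", 0)
    total = hits + misses
    # Assumed rule: many cache misses suggest the block cache is too small.
    if total and misses / total > 0.5:
        hints.append("block cache miss ratio is high: "
                     "consider raising GDAL_CACHEMAX")

    io = by_name.get("io_seconds", 0.0)
    cpu = by_name.get("cpu_seconds", 0.0)
    # Assumed rule: a mostly CPU-bound run may benefit from threading.
    if io + cpu > 0 and cpu / (io + cpu) > 0.8:
        hints.append("mostly CPU-bound: consider warp threading options")

    return hints
```

Keeping measurements as generic name/value/unit records is one way to leave room for metrics and options added later; the rules can then evolve separately from the collection code.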
| 129 | |
| 130 | Skills: |
| 131 | * programming skills needed - C/C++ |
| 132 | * difficulty level - hard |
| 133 | |
| 134 | Possible mentor/co-mentor: Robert Coup (robert.coup at koordinates.com) |
| 135 | |
| 136 | '''5. Promoting VSI''' |
| 137 | |
| 138 | The virtual filesystem (VSI) functionality in GDAL ([http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip vsizip, vsicurl], [http://erouault.blogspot.co.nz/2012/05/new-gdal-virtual-file-system-to-read.html vsimem, vsisubfile], etc.) |
| 139 | is pretty cool, and is useful for a lot of things |
| 140 | outside GDAL. This project would look at whether an external project could |
| 141 | be a better place for it to live (even if it's just a separate |
| 142 | build/packaging from code that continues to live in GDAL) -- adding tests & |
| 143 | CI, looking at cross-platform issues, documenting `vsi_preload.so`, making a |
| 144 | library other apps could utilise for the functionality (libvsi?), and possibly |
| 145 | creating a FUSE implementation that maps onto the VSI code (à la |
| 146 | `mount -t vsi /vsizip/vsicurl/http://example.com/foo.zip foo/`). |
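To make the chained path in that example concrete, here is a small sketch of how it breaks down into stacked handlers plus an underlying target. It is not how GDAL parses these paths internally; `split_vsi_chain` is a hypothetical helper, and it deliberately ignores special forms such as the `/vsisubfile/` offset syntax and would misread a relative target whose first directory starts with `vsi`.

```python
import re

# A handler prefix such as "/vsizip/" or "vsicurl/" (leading slash optional
# so that chained handlers like "/vsizip/vsicurl/..." are recognised).
_HANDLER = re.compile(r"/?(vsi[a-z0-9_]+)/")


def split_vsi_chain(path):
    """Split a VSI path into (list of handler names, underlying target)."""
    if not path.startswith("/vsi"):
        return [], path          # ordinary file name, no VSI handler
    handlers = []
    rest = path
    while True:
        match = _HANDLER.match(rest)
        if not match:
            break
        handlers.append(match.group(1))
        rest = rest[match.end():]
    return handlers, rest


if __name__ == "__main__":
    print(split_vsi_chain("/vsizip/vsicurl/http://example.com/foo.zip"))
```

Read outermost first: `vsizip` looks inside an archive that `vsicurl` fetches over HTTP. A FUSE mount or a standalone library would need to expose the same layering.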
| 147 | |
| 148 | Skills: |
| 149 | * programming skills needed - C/C++ |
| 150 | * experience with build, test, CI, packaging tools |
| 151 | * difficulty level - moderate |
| 152 | |
| 153 | Possible mentor/co-mentor: Robert Coup (robert.coup at koordinates.com) |
| 154 | |
| 155 | '''6. OpenFileGDB Write support''' |
| 156 | |
| 157 | The existing [http://www.gdal.org/drv_openfilegdb.html OpenFileGDB] driver doesn't implement writing, but is more |
| 158 | stable for reading than the [http://www.gdal.org/drv_filegdb.html proprietary ESRI driver] in most cases. |
| 159 | |
| 160 | This project would aim to add some level of write support to the OpenFileGDB |
| 161 | driver. The primary goal would be to create files that the OpenFileGDB |
| 162 | driver can read back, and the secondary goal would be to improve compatibility |
| 163 | so that ArcGIS itself can read the files. |
| 164 | As Even describes it, [http://erouault.blogspot.co.nz/2013/10/filegdb-format-reverse-engineered.html reverse engineering] |
| 165 | and black box testing isn't always fun, but it is a great skill to have and |
| 166 | there'd be as much software & support & test data as we can get: |
| 167 | |
| 168 | > Hum, that depends on the perseverance of the student to not give up if some |
| 169 | > proprietary software refuse to read its neat generated geodatabase or crashes, |
| 170 | > whereas it can be read with the openfilegdb read side ;-) |
| 171 | |
| 172 | Skills: |
| 173 | * programming skills needed - C/C++ |
| 174 | * some experience with reverse engineering of file formats/protocols |
| 175 | * difficulty level - hard |
| 176 | |
| 177 | Possible mentor/co-mentor: Robert Coup (robert.coup at koordinates.com) |
| 178 | |