| 1 | = RFC 51: RasterIO() improvements : resampling and progress callback = |
| 2 | |
| 3 | Author: Even Rouault[[BR]] |
| 4 | Contact: even dot rouault at spatialys dot com[[BR]] |
| 5 | Status: In development |
| 6 | |
| 7 | == Summary == |
| 8 | |
| 9 | This RFC aims to extend the RasterIO() API to allow specifying a resampling |
| 10 | algorithm when doing requests involving subsampling or oversampling. A progress |
| 11 | callback can also be specified, to be notified of progress and to allow |
| 12 | the user to interrupt the operation. |
| 13 | |
| 14 | == Core changes == |
| 15 | |
| 16 | === Addition of GDALRasterIOExtraArg structure === |
| 17 | |
| 18 | A new structure GDALRasterIOExtraArg is added to contain the new options. |
| 19 | |
| 20 | {{{ |
| 21 | /** Structure to pass extra arguments to RasterIO() method |
| 22 | * @since GDAL 2.0 |
| 23 | */ |
| 24 | typedef struct |
| 25 | { |
| 26 | /*! Version of structure (to allow future extensions of the structure) */ |
| 27 | int nVersion; |
| 28 | |
| 29 | /*! Resampling algorithm */ |
| 30 | GDALRIOResampleAlg eResampleAlg; |
| 31 | |
| 32 | /*! Progress callback */ |
| 33 | GDALProgressFunc pfnProgress; |
| 34 | /*! Progress callback user data */ |
| 35 | void *pProgressData; |
| 36 | |
| 37 | /*! Indicate if dfXOff, dfYOff, dfXSize and dfYSize are set. |
| 38 | Mostly reserved for the VRT driver to communicate a more precise |
| 39 | source window. Must be such that dfXOff - nXOff < 1.0 and |
| 40 | dfYOff - nYOff < 1.0 and nXSize - dfXSize < 1.0 and nYSize - dfYSize < 1.0 */ |
| 41 | int bFloatingPointWindowValidity; |
| 42 | /*! Pixel offset to the top left corner. Only valid if bFloatingPointWindowValidity = TRUE */ |
| 43 | double dfXOff; |
| 44 | /*! Line offset to the top left corner. Only valid if bFloatingPointWindowValidity = TRUE */ |
| 45 | double dfYOff; |
| 46 | /*! Width in pixels of the area of interest. Only valid if bFloatingPointWindowValidity = TRUE */ |
| 47 | double dfXSize; |
| 48 | /*! Height in pixels of the area of interest. Only valid if bFloatingPointWindowValidity = TRUE */ |
| 49 | double dfYSize; |
| 50 | } GDALRasterIOExtraArg; |
| 51 | |
| 52 | #define RASTERIO_EXTRA_ARG_CURRENT_VERSION 1 |
| 53 | |
| 54 | /** Macro to initialize an instance of GDALRasterIOExtraArg structure. |
| 55 | * @since GDAL 2.0 |
| 56 | */ |
| 57 | #define INIT_RASTERIO_EXTRA_ARG(s) \ |
| 58 | do { (s).nVersion = RASTERIO_EXTRA_ARG_CURRENT_VERSION; \ |
| 59 | (s).eResampleAlg = GRIORA_NearestNeighbour; \ |
| 60 | (s).pfnProgress = NULL; \ |
| 61 | (s).pProgressData = NULL; \ |
| 62 | (s).bFloatingPointWindowValidity = FALSE; } while(0) |
| 63 | |
| 64 | }}} |
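A minimal sketch of how a caller might fill the structure, assuming the declarations above. The typedefs are trimmed local stand-ins so that the fragment compiles without the GDAL headers; `ProgressState`, `ReportProgress` and `PrepareRequest` are hypothetical names used only for illustration. The callback follows the usual GDALProgressFunc convention of returning TRUE to continue and FALSE to request interruption.

```c
#include <assert.h>
#include <stddef.h>

#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

/* Local stand-ins mirroring the declarations above (normally from gdal.h). */
typedef int (*GDALProgressFunc)(double dfComplete, const char *pszMessage,
                                void *pProgressArg);
typedef enum { GRIORA_NearestNeighbour = 0, GRIORA_Bilinear = 1 } GDALRIOResampleAlg;

typedef struct
{
    int nVersion;
    GDALRIOResampleAlg eResampleAlg;
    GDALProgressFunc pfnProgress;
    void *pProgressData;
    int bFloatingPointWindowValidity;
    double dfXOff, dfYOff, dfXSize, dfYSize;
} GDALRasterIOExtraArg;

#define RASTERIO_EXTRA_ARG_CURRENT_VERSION 1
#define INIT_RASTERIO_EXTRA_ARG(s) \
    do { (s).nVersion = RASTERIO_EXTRA_ARG_CURRENT_VERSION; \
         (s).eResampleAlg = GRIORA_NearestNeighbour; \
         (s).pfnProgress = NULL; \
         (s).pProgressData = NULL; \
         (s).bFloatingPointWindowValidity = FALSE; } while(0)

/* Hypothetical user state: ask for cancellation once dfCancelAt is reached. */
typedef struct
{
    double dfCancelAt;
    double dfLastSeen;
} ProgressState;

/* Hypothetical callback: records progress, returns FALSE to interrupt. */
int ReportProgress(double dfComplete, const char *pszMessage, void *pProgressArg)
{
    ProgressState *psState = (ProgressState *)pProgressArg;
    (void)pszMessage; /* unused in this sketch */
    psState->dfLastSeen = dfComplete;
    return dfComplete < psState->dfCancelAt ? TRUE : FALSE;
}

/* Hypothetical helper: set the defaults first, then only the fields of interest. */
void PrepareRequest(GDALRasterIOExtraArg *psArg, ProgressState *psState)
{
    assert(psArg != NULL && psState != NULL);
    INIT_RASTERIO_EXTRA_ARG(*psArg);
    psArg->eResampleAlg = GRIORA_Bilinear;
    psArg->pfnProgress = ReportProgress;
    psArg->pProgressData = psState;
}
```

Calling the macro before touching individual members keeps the caller correct if members are later appended to the structure, since any field the caller does not know about still receives its default.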
| 65 | |
| 66 | There are several reasons to prefer a structure rather than new parameters to |
| 67 | the RasterIO() methods : |
| 68 | * code readability (GDALDataset::IRasterIO() already has 14 parameters...) |
| 69 | * allow future extensions without changing the prototype in all drivers |
| 70 | * to a lesser extent, efficiency: it is common for RasterIO() calls to be |
| 71 | chained between generic/specific and/or dataset/rasterband implementations. |
| 72 | Passing just the pointer is more efficient. |
| 73 | |
| 74 | The structure is versioned. In the future, if further options are added, the |
| 75 | new members will be added at the end of the structure and the version number |
| 76 | will be incremented. Code in GDAL core and drivers can check the version number to |
| 77 | determine which options are available. |
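To illustrate the version check, the sketch below assumes a purely hypothetical version 2 that appends an `nHypotheticalOption` member. Neither that member, the `ExtraArgV2Sketch` type nor `GetHypotheticalOption()` exists in GDAL; they only show the pattern of reading a newer member when the declared version guarantees it is present.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical future layout: version 1 members first, additions at the end. */
typedef struct
{
    int nVersion;
    int eResampleAlg;        /* stand-in for the version 1 members */
    int nHypotheticalOption; /* imaginary member, valid only if nVersion >= 2 */
} ExtraArgV2Sketch;

#define HYPOTHETICAL_OPTION_DEFAULT 0

/* Read the newer member only when the caller declared a version that has it;
   otherwise fall back to a default, as older callers never set the member. */
int GetHypotheticalOption(const ExtraArgV2Sketch *psArg)
{
    assert(psArg != NULL);
    if (psArg->nVersion >= 2)
        return psArg->nHypotheticalOption;
    return HYPOTHETICAL_OPTION_DEFAULT;
}
```

Appending members at the end matters here: the offsets of the version 1 members never change, so code that only knows version 1 keeps working against a larger structure.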
| 78 | |
| 79 | === Addition of GDALRIOResampleAlg enumeration === |
| 80 | |
| 81 | The following resampling algorithms are available : |
| 82 | |
| 83 | {{{ |
| 84 | /** RasterIO() resampling method. |
| 85 | * @since GDAL 2.0 |
| 86 | */ |
| 87 | typedef enum |
| 88 | { |
| 89 | /*! Nearest neighbour */ GRIORA_NearestNeighbour = 0, |
| 90 | /*! Bilinear (2x2 kernel) */ GRIORA_Bilinear = 1, |
| 91 | /*! Cubic Convolution Approximation (4x4 kernel) */ GRIORA_Cubic = 2, |
| 92 | /*! Cubic B-Spline Approximation (4x4 kernel) */ GRIORA_CubicSpline = 3, |
| 93 | /*! Lanczos windowed sinc interpolation (6x6 kernel) */ GRIORA_Lanczos = 4, |
| 94 | /*! Average */ GRIORA_Average = 5, |
| 95 | /*! Mode (selects the value which appears most often of all the sampled points) */ |
| 96 | GRIORA_Mode = 6, |
| 97 | /*! Gauss blurring */ GRIORA_Gauss = 7 |
| 98 | } GDALRIOResampleAlg; |
| 99 | }}} |
| 100 | |
| 101 | Those new resampling methods can be used by the GDALRasterBand::IRasterIO() default |
| 102 | implementation when the size of the buffer (nBufXSize x nBufYSize) is different |
| 103 | from the size of the area of interest (nXSize x nYSize). The code relies |
| 104 | heavily on the algorithms used for overview computation, with adjustments so that |
| 105 | it can also deal with oversampling. Bilinear, CubicSpline and Lanczos are |
| 106 | now available in overview computation as well, and rely on the generic infrastructure |
| 107 | for convolution computation recently introduced for improved cubic overviews. |
| 108 | Some algorithms are not available on raster bands with a color palette. If one of |
| 109 | them is requested on such a band, a warning will be emitted and nearest neighbour will |
| 110 | be used as a fallback. |
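The effect of the algorithm choice can be seen on a tiny subsampling request. The sketch below is a simplified illustration and not GDAL's implementation: it reduces a 4x4 source to a 2x2 buffer, once by picking the source pixel under the centre of each destination pixel (nearest neighbour) and once by averaging each 2x2 source block. `SubsampleNearest` and `SubsampleAverage` are hypothetical names, and the centre-of-pixel convention is an assumption of this sketch.

```c
#include <assert.h>

#define SRC_SIZE 4
#define DST_SIZE 2

/* Nearest neighbour: take the source pixel under the centre of each
   destination pixel. Arrays are stored row by row. */
void SubsampleNearest(const double *padfSrc, double *padfDst)
{
    const double dfRatio = (double)SRC_SIZE / DST_SIZE;
    for (int iY = 0; iY < DST_SIZE; iY++)
    {
        const int iSrcY = (int)((iY + 0.5) * dfRatio);
        for (int iX = 0; iX < DST_SIZE; iX++)
        {
            const int iSrcX = (int)((iX + 0.5) * dfRatio);
            padfDst[iY * DST_SIZE + iX] = padfSrc[iSrcY * SRC_SIZE + iSrcX];
        }
    }
}

/* Average: mean of the block of source pixels covered by each destination
   pixel. This sketch only handles an integer reduction factor. */
void SubsampleAverage(const double *padfSrc, double *padfDst)
{
    const int nFactor = SRC_SIZE / DST_SIZE;
    assert(SRC_SIZE % DST_SIZE == 0);
    for (int iY = 0; iY < DST_SIZE; iY++)
    {
        for (int iX = 0; iX < DST_SIZE; iX++)
        {
            double dfSum = 0.0;
            for (int jY = 0; jY < nFactor; jY++)
                for (int jX = 0; jX < nFactor; jX++)
                    dfSum += padfSrc[(iY * nFactor + jY) * SRC_SIZE +
                                     iX * nFactor + jX];
            padfDst[iY * DST_SIZE + iX] = dfSum / (nFactor * nFactor);
        }
    }
}
```

Nearest neighbour keeps original pixel values and discards the rest, while averaging uses every source pixel but produces values that may not occur in the source. The latter point is one way to see why some algorithms do not suit paletted bands, where pixel values are indices into a color table.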
| 111 | |
| 112 | The GDAL_RASTERIO_RESAMPLING configuration option can be set as an alternate |
| 113 | way of specifying the resampling algorithm. This is mainly useful for testing |
| 114 | applications that do not yet use the new API. |
| 115 | |
| 116 | Currently, the new resampling methods are only available for GF_Read operations. |
| 117 | The use case for GF_Write operations isn't obvious, but could be added without |
| 118 | API changes if needed. |
| 119 | |
| 120 | === C++ changes === |
| 121 | |
| 122 | GDALDataset and GDALRasterBand (non virtual) RasterIO() and (virtual) |
| 123 | IRasterIO() methods have a new final argument psExtraArg of type GDALRasterIOExtraArg*. |
| 124 | |
| 125 | GDALDataset::RasterIO() and GDALRasterBand::RasterIO() can accept a NULL pointer |
| 126 | for that argument, in which case they will instantiate a default GDALRasterIOExtraArg |
| 127 | structure to be passed to IRasterIO(). Any other code that calls IRasterIO() |
| 128 | directly (a few IReadBlock() implementations) must do the same, so |
| 129 | that IRasterIO() can assume that its psExtraArg is not NULL. |
| 130 | |
| 131 | As a provision to be able to deal with very large requests with buffers larger |
| 132 | than several gigabytes, the nPixelSpace, nLineSpace and nBandSpace parameters |
| 133 | have been promoted from the int datatype to the new GSpacing datatype, which |
| 134 | is an alias of a signed 64-bit integer. |
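A short sketch of why a 32-bit int is not enough for the spacing parameters. It assumes a band-sequential buffer and uses `int64_t` as a local stand-in for GSpacing; `ComputeBandSpace` is a hypothetical helper, not part of GDAL.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for GDAL's GSpacing, described above as a signed 64-bit integer. */
typedef int64_t GSpacing;

/* Hypothetical helper: byte offset between two consecutive bands of a
   band-sequential buffer of nBufXSize x nBufYSize pixels. */
GSpacing ComputeBandSpace(int nBufXSize, int nBufYSize, int nPixelBytes)
{
    assert(nBufXSize >= 0 && nBufYSize >= 0 && nPixelBytes > 0);
    /* Cast the first operand so the whole product is computed in 64 bits;
       multiplying the ints first could overflow before the conversion. */
    return (GSpacing)nBufXSize * nBufYSize * nPixelBytes;
}
```

For a 50000 x 50000 request of single-byte pixels the band spacing is 2,500,000,000 bytes, which exceeds the largest value a 32-bit signed int can hold (2,147,483,647).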
| 135 | |
| 136 | GDALRasterBand::IRasterIO() and GDALDataset::BlockBasedRasterIO() now use the |
| 137 | progress callback when available. |
| 138 | |
| 139 | === C API changes === |
| 140 | |
| 141 | Only additions : |
| 142 | {{{ |
| 143 | CPLErr CPL_DLL CPL_STDCALL GDALDatasetRasterIOEx( |
| 144 | GDALDatasetH hDS, GDALRWFlag eRWFlag, |
| 145 | int nDSXOff, int nDSYOff, int nDSXSize, int nDSYSize, |
| 146 | void * pBuffer, int nBXSize, int nBYSize, GDALDataType eBDataType, |
| 147 | int nBandCount, int *panBandCount, |
| 148 | GSpacing nPixelSpace, GSpacing nLineSpace, GSpacing nBandSpace, |
| 149 | GDALRasterIOExtraArg* psExtraArg); |
| 150 | |
| 151 | CPLErr CPL_DLL CPL_STDCALL |
| 152 | GDALRasterIOEx( GDALRasterBandH hRBand, GDALRWFlag eRWFlag, |
| 153 | int nDSXOff, int nDSYOff, int nDSXSize, int nDSYSize, |
| 154 | void * pBuffer, int nBXSize, int nBYSize,GDALDataType eBDataType, |
| 155 | GSpacing nPixelSpace, GSpacing nLineSpace, |
| 156 | GDALRasterIOExtraArg* psExtraArg ); |
| 157 | }}} |
| 158 | |
| 159 | Those are the same as the existing functions with a final |
| 160 | GDALRasterIOExtraArg* psExtraArg argument, and the spacing parameters promoted |
| 161 | to GSpacing. |
| 162 | |
| 163 | == Changes in drivers == |
| 164 | |
| 165 | * All in-tree drivers that implemented or used RasterIO have been edited to |
| 166 | accept the GDALRasterIOExtraArg* psExtraArg parameter, and forward it when |
| 167 | needed. Those that had a custom RasterIO() implementation now use the |
| 168 | progress callback when available. |
| 169 | * VRT: the <SimpleSource> and <ComplexSource> elements can accept a 'resampling' |
| 170 | attribute. The VRT driver will also set the dfXOff, dfYOff, dfXSize and dfYSize |
| 171 | fields of GDALRasterIOExtraArg to describe the source window with sub-pixel accuracy, so that |
| 172 | GDALRasterBand::IRasterIO() gives consistent results whether operating on |
| 173 | a small area of interest or on the whole raster. Without this, the chunking |
| 174 | done in GDALDatasetCopyWholeRaster() or other algorithms could lead to repeated |
| 175 | lines due to integer rounding issues. |
| 176 | |
| 177 | == Changes in utilities == |
| 178 | |
| 179 | * gdal_translate: accept a -r parameter to specify the resampling algorithm. |
| 180 | Defaults to NEAR. Can be set to bilinear, cubic, cubicspline, lanczos, average |
| 181 | or mode. (Under the hood, this sets the new resampling property at the VRT source |
| 182 | level.) |
| 183 | * gdaladdo: -r parameter now accepts bilinear, cubicspline and lanczos as algorithms. |
| 184 | |
| 185 | == Changes in SWIG bindings == |
| 186 | |
| 187 | * For Python and Perl bindings: Band.ReadRaster(), Dataset.ReadRaster() now |
| 188 | accept optional resample_alg, callback and callback_data arguments. (untested for |
| 189 | Perl, but the existing tests pass) |
| 190 | * For Python bindings, Band.ReadAsArray() and Dataset.ReadAsArray() now |
| 191 | accept optional resample_alg, callback and callback_data arguments. |
| 192 | |
| 193 | == Compatibility == |
| 194 | |
| 195 | * C API/ABI preserved. |
| 196 | |
| 197 | * C++ users of the GDALRasterBand::RasterIO() and |
| 198 | GDALDataset::RasterIO() API must add the new GDALRasterIOExtraArg* psExtraArg |
| 199 | argument (possibly set to NULL). It would have been possible to declare it as |
| 200 | optional, but forgetting to forward psExtraArg would then have been an easier |
| 201 | mistake to make. This was especially true during the conversion phase; now that |
| 202 | the conversion is done, this precaution is perhaps no longer needed. |
| 203 | |
| 204 | * Out-of-tree drivers that implement IRasterIO() must be changed to accept the new |
| 205 | GDALRasterIOExtraArg* psExtraArg argument. Note: failing to do so will go undetected |
| 206 | at compile time, because a method with the old signature no longer overrides the base class virtual method and is silently treated as a separate method. |
| 207 | |
| 208 | Both issues will be mentioned in MIGRATION_GUIDE.TXT. |
| 209 | |
| 210 | == Documentation == |
| 211 | |
| 212 | All new methods are documented. |
| 213 | |
| 214 | == Testing == |
| 215 | |
| 216 | The various aspects of this RFC are tested in the Python bindings: |
| 217 | * use of the new options of Band.ReadRaster(), Dataset.ReadRaster(), |
| 218 | Band.ReadAsArray() and Dataset.ReadAsArray(). |
| 219 | * resampling algorithms in subsampling and oversampling RasterIO() requests. |
| 220 | * "-r" option of gdal_translate |
| 221 | |
| 222 | == Implementation == |
| 223 | |
| 224 | Implementation will be done by Even Rouault ([http://spatialys.com Spatialys]), |
| 225 | and sponsored by [http://r3-gis.com R3 GIS]. |
| 226 | |
| 227 | The proposed implementation lies in the "rasterio" branch of the |
| 228 | https://github.com/rouault/gdal2/tree/rasterio repository. |
| 229 | |
| 230 | The list of changes : https://github.com/rouault/gdal2/compare/rasterio |
| 231 | |
| 232 | == Voting history == |
| 233 | |
| 234 | TBD |