Version 1 (modified by petebunting, 11 years ago)

RFC 40: Improving performance of Raster Attribute Table implementation for large tables.

Summary:

Raster Attribute Tables from some applications (notably segmentation) can be very large and are slow to access with the current API, because only one element can be read or written at a time. Also, when an application requests an attribute table, the whole table must be read; there is no way of deferring this so that only the required subset is read from disk. The changes proposed here address these limitations and will bring attribute table support more in line with the way raster data is accessed.

Implementation:

It is proposed that additional methods be provided in the GDALRasterAttributeTable class that allow 'chunks' of data from a column to be read or written in one call. As with the GetValueAs functions, a column of one type could be read as values of a different type (e.g., an int column read as double), with the appropriate conversion taking place. The following overloaded methods will be available:

CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, double *pdfData);
CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, int *pnData);
CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, char **papszStrList);
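
The intended call pattern can be sketched without GDAL itself. In the following self-contained example, MockRat, RWFlag, Err and SumColumnInChunks are hypothetical stand-ins invented for illustration (they are not the GDAL types GDALRasterAttributeTable, GDALRWFlag and CPLErr), and the table holds a single integer column. It shows a caller walking a column in fixed-size chunks and reading the integer data as doubles, which is the kind of conversion described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for GDALRWFlag and CPLErr (not the real GDAL types).
enum RWFlag { RW_Read, RW_Write };
enum Err { Err_None, Err_Failure };

// Hypothetical in-memory table with one integer column (field 0).
class MockRat {
public:
    explicit MockRat(std::vector<int> values) : column_(std::move(values)) {}

    int GetRowCount() const { return static_cast<int>(column_.size()); }

    // Chunked access to the int column through a double buffer.
    Err ValuesIO(RWFlag eRWFlag, int iField, int iStartRow, int iLength,
                 double *pdfData) {
        if (pdfData == nullptr || !InRange(iField, iStartRow, iLength))
            return Err_Failure;
        for (int i = 0; i < iLength; ++i) {
            int &cell = column_[static_cast<std::size_t>(iStartRow + i)];
            if (eRWFlag == RW_Read)
                pdfData[i] = static_cast<double>(cell);
            else
                cell = static_cast<int>(pdfData[i]);  // truncates toward zero
        }
        return Err_None;
    }

    // Chunked access in the column's native type.
    Err ValuesIO(RWFlag eRWFlag, int iField, int iStartRow, int iLength,
                 int *pnData) {
        if (pnData == nullptr || !InRange(iField, iStartRow, iLength))
            return Err_Failure;
        for (int i = 0; i < iLength; ++i) {
            int &cell = column_[static_cast<std::size_t>(iStartRow + i)];
            if (eRWFlag == RW_Read)
                pnData[i] = cell;
            else
                cell = pnData[i];
        }
        return Err_None;
    }

private:
    bool InRange(int iField, int iStartRow, int iLength) const {
        return iField == 0 && iStartRow >= 0 && iLength >= 0 &&
               iStartRow <= GetRowCount() - iLength;
    }

    std::vector<int> column_;
};

// Sums field 0 by reading nChunk rows per call instead of one value per call.
bool SumColumnInChunks(MockRat &rat, int nChunk, double &sumOut) {
    if (nChunk <= 0)
        return false;
    std::vector<double> buffer(static_cast<std::size_t>(nChunk));
    sumOut = 0.0;
    const int nRows = rat.GetRowCount();
    for (int start = 0; start < nRows; start += nChunk) {
        const int n = std::min(nChunk, nRows - start);
        if (rat.ValuesIO(RW_Read, 0, start, n, buffer.data()) != Err_None)
            return false;
        for (int i = 0; i < n; ++i)
            sumOut += buffer[static_cast<std::size_t>(i)];
    }
    return true;
}
```

The chunk size is the caller's choice in this sketch; a real application would presumably pick it to balance memory use against the number of calls.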

It is also proposed that a Boolean data type be added, so the following functions would be added:

CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, bool *pbData);
void SetValue(int iRow, int iField, bool bValue);
bool GetValueAsBoolean(int iRow, int iField) const;

The CSLString type will be used for reading and writing strings. When reading, existing data in the string will be destroyed.

These methods will be available from C as GDALRATValuesIOAsDouble, GDALRATValuesIOAsInteger, GDALRATValuesIOAsBoolean and GDALRATValuesIOAsString.

It is also proposed that methods on the GDALRasterAttributeTable class be made virtual so a driver may return a derived implementation that only reads data from the file when it is actually required.
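
A minimal sketch of what such a derived implementation might look like follows. It is not GDAL code: RatBase stands in for a GDALRasterAttributeTable with virtual methods, LazyRat is a hypothetical driver-side subclass, and the "file read" is simulated by filling a cache with made-up values. The point is only that the row count can be answered without touching the data, and that the data is loaded once, on first access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified interface standing in for a
// GDALRasterAttributeTable whose methods are virtual.
class RatBase {
public:
    virtual ~RatBase() = default;
    virtual int GetRowCount() const = 0;
    virtual int GetValueAsInt(int iRow, int iField) const = 0;
};

// Hypothetical driver-side table that defers loading until a value is needed.
class LazyRat : public RatBase {
public:
    explicit LazyRat(int nRows) : nRows_(nRows < 0 ? 0 : nRows) {}

    // Answered from metadata alone; no column data is loaded.
    int GetRowCount() const override { return nRows_; }

    // Loads the column on first use; returns 0 for bad indices in this sketch.
    int GetValueAsInt(int iRow, int iField) const override {
        if (iField != 0 || iRow < 0 || iRow >= nRows_)
            return 0;
        EnsureLoaded();
        return cache_[static_cast<std::size_t>(iRow)];
    }

    // Exposed only so the sketch can show how often the "file" was read.
    int LoadCount() const { return loadCount_; }

private:
    void EnsureLoaded() const {
        if (loaded_)
            return;
        // Simulated disk read: row i holds the made-up value i * 10.
        cache_.resize(static_cast<std::size_t>(nRows_));
        for (int i = 0; i < nRows_; ++i)
            cache_[static_cast<std::size_t>(i)] = i * 10;
        loaded_ = true;
        ++loadCount_;
    }

    int nRows_;
    mutable bool loaded_ = false;
    mutable int loadCount_ = 0;
    mutable std::vector<int> cache_;
};
```

Because callers hold only a RatBase reference, the deferral is invisible to them. A real driver could go further and load only the requested rows rather than the whole column.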

Language Bindings:

The Python bindings will be altered so ValuesIO will be supported using numpy arrays for the data with casting of types as appropriate. Strings will be supported using the numpy support for string arrays.

Backward Compatibility:

The proposed additions will have an impact on C binary compatibility because they change the API. GDAL 2.0 is suggested as an appropriate time to introduce the changes. The C++ binary interface will also be broken, due to the addition of new members in the GDALRasterAttributeTable class and methods being marked as virtual.

The base GDALRasterAttributeTable implementation of ValuesIO() will use SetValue/GetValue to read and write values, so the behaviour will be sensible even if a derived implementation does not exist.
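
The fallback described here could look roughly like the sketch below. Again this is not GDAL code: RatBase, SimpleRat, RWFlag and Err are hypothetical names, only the double overload is shown, and the table has a single column. The base class implements ValuesIO() as a loop over the per-element accessors, so a subclass that supplies only GetValueAsDouble and SetValue still gets working chunked access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for GDALRWFlag and CPLErr (not the real GDAL types).
enum RWFlag { RW_Read, RW_Write };
enum Err { Err_None, Err_Failure };

// Hypothetical base class: chunked IO is built on per-element accessors.
class RatBase {
public:
    virtual ~RatBase() = default;
    virtual int GetRowCount() const = 0;
    virtual double GetValueAsDouble(int iRow, int iField) const = 0;
    virtual void SetValue(int iRow, int iField, double dfValue) = 0;

    // Default implementation; a driver may override it with a faster one.
    virtual Err ValuesIO(RWFlag eRWFlag, int iField, int iStartRow,
                         int iLength, double *pdfData) {
        if (pdfData == nullptr || iStartRow < 0 || iLength < 0 ||
            iStartRow > GetRowCount() - iLength)
            return Err_Failure;
        for (int i = 0; i < iLength; ++i) {
            if (eRWFlag == RW_Read)
                pdfData[i] = GetValueAsDouble(iStartRow + i, iField);
            else
                SetValue(iStartRow + i, iField, pdfData[i]);
        }
        return Err_None;
    }
};

// Hypothetical subclass that supplies only the per-element accessors.
class SimpleRat : public RatBase {
public:
    explicit SimpleRat(int nRows)
        : values_(static_cast<std::size_t>(nRows < 0 ? 0 : nRows), 0.0) {}

    int GetRowCount() const override {
        return static_cast<int>(values_.size());
    }
    double GetValueAsDouble(int iRow, int /*iField*/) const override {
        return values_[static_cast<std::size_t>(iRow)];
    }
    void SetValue(int iRow, int /*iField*/, double dfValue) override {
        values_[static_cast<std::size_t>(iRow)] = dfValue;
    }

private:
    std::vector<double> values_;
};
```

The fallback is no faster than calling the accessors directly; its value in this sketch is that callers can use ValuesIO() uniformly, whether or not a driver provides an optimised override.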

At the source level, the changes are purely extensions and should have no impact on existing code.

Impact on Drivers:

The HFA driver will be updated to support the new function calls.

Timeline:

We (Sam Gillingham and Pete Bunting) are prepared to undertake the work required and have it ready for inclusion in GDAL 2.0.

There needs to be a discussion on the names of the methods and on the internal logic of the methods.
