Implementation of read-only GDAL driver for WKT Raster extension to PostGIS
********** UPDATED ON 2009/07/26 ***********
This is one of the selected projects for Google Summer of Code 2009. Links to the weekly reports will be posted here during the project development, as well as useful information and conclusions.
Weekly report #1 (23/05 - 29/05)
Weekly report #2 (29/05 - 05/06)
Weekly report #3 (05/06 - 12/06)
Weekly report #4 (12/06 - 19/06)
Weekly report #5 (19/06 - 26/06)
Weekly report #6 (26/06 - 03/07)
Weekly report #7 (03/07 - 10/07)
Weekly report #8 (10/07 - 17/07)
Weekly report #9 (17/07 - 24/07)
The main goal of this project is to create a new raster driver for the GDAL library. This driver will deal with a new type of data: the PostGIS WKT Raster type (WKT Raster is an extension of PostGIS that aims to add raster support).
The first issue is to match the GDAL Dataset architecture with the new WKT Raster type. So, basically, what is a "WKT Raster"? It can be:
- A 'complete' image.
- An image 'tile'.
- A raster object. A new type of object, resulting from the rasterization of a vector coverage.
And a WKT Raster always has:
- One or more raster bands.
- Associated metadata, which includes georeference information.
Now, what is a GDAL Dataset? An assembly of related raster bands and some information common to them all (metadata).
So, the relation between "WKT Raster object" and "GDAL Dataset" seems to be very clear.
But there is an important issue here. The WKT Raster objects will be stored in PostgreSQL tables. So, a table with a column of type WKT Raster may be seen as:
- a) An image warehouse of untiled and (possibly) unrelated images.
- b) An irregularly tiled raster coverage.
- c) A regularly tiled raster coverage.
- d) A rectangular regularly tiled raster coverage.
- e) A tiled image. --> NOT CONSIDERED FOR NOW. ONLY 1-TABLE RASTERS
- f) A raster object coverage resulting from the rasterization of a vector coverage.
Options c and d should be the easiest ones for the GDAL driver to read. They are rasters with a "regular blocking" structure. When a raster layer of these types is loaded:
- All loaded tiles have the same width and height,
- No tiles overlap, and their upper-left corners follow a regular block grid,
- The global extent of the layer is rectangular and not rotated.
Then, for the basic version of the GDAL WKT Raster driver, we can focus on reading rasters of types c and d. For this reason, we specify several working modes for the driver. They aren't totally defined yet but, at least, we should use:
- REGULARLY_BLOCKING mode: the raster is regularly tiled, so it fulfills the three previous properties.
- NON_REGULARLY_BLOCKING mode: the raster is non-regularly tiled. The tiles can have different sizes and may overlap, so the global extent of the raster isn't necessarily rectangular.
Maybe we'll need more working modes in the future, for non-tiled rasters.
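The distinction between the two working modes can be sketched as a check of the regular blocking properties. This is only an illustration in Python, not the driver's code: the `Tile` type and the `detect_mode` function are hypothetical names, tiles are described by their upper-left corner in pixel units of the whole coverage, and sizes are assumed to be positive. Because tiles are axis-aligned here, the "not rotated" condition is not modelled.

```python
from dataclasses import dataclass
from enum import Enum


class WorkingMode(Enum):
    REGULARLY_BLOCKING = "regularly_blocking"
    NON_REGULARLY_BLOCKING = "non_regularly_blocking"


@dataclass(frozen=True)
class Tile:
    # Hypothetical tile description: upper-left corner and size, in pixels.
    x: int
    y: int
    width: int
    height: int


def detect_mode(tiles):
    """Return REGULARLY_BLOCKING only if the tiles share one size,
    sit on a regular block grid, and never occupy the same grid cell."""
    if not tiles:
        return WorkingMode.NON_REGULARLY_BLOCKING
    w, h = tiles[0].width, tiles[0].height
    x0 = min(t.x for t in tiles)
    y0 = min(t.y for t in tiles)

    # Property 1: all tiles have the same width and height.
    same_size = all(t.width == w and t.height == h for t in tiles)
    # Property 2a: upper-left corners follow a regular block grid.
    on_grid = all((t.x - x0) % w == 0 and (t.y - y0) % h == 0 for t in tiles)
    # Property 2b: no two tiles fall on the same grid cell (no overlap).
    cells = {((t.x - x0) // w, (t.y - y0) // h) for t in tiles}
    no_overlap = len(cells) == len(tiles)

    if same_size and on_grid and no_overlap:
        return WorkingMode.REGULARLY_BLOCKING
    return WorkingMode.NON_REGULARLY_BLOCKING
```

A table whose tiles fail any of these checks would fall back to NON_REGULARLY_BLOCKING mode, which the basic version of the driver does not handle yet.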
Implementing the Dataset
NOTE: ONLY REGULAR-BLOCKING RASTERS, FOR NOW
The Dataset performs the following operations:
- Check the connection string format.
- Parse the connection string, extracting useful information (table name, optional SQL WHERE clause, working mode).
- Check the working mode. If it is not regularly_tiled mode, finish (for now).
- Open a database connection.
- Perform some security and integrity checks: check whether the database has the PostGIS extensions, whether the table exists, whether it is registered in the RASTER_COLUMNS table, whether it has a GiST index, etc. Suggestions accepted and appreciated.
- Fetch raster properties from the RASTER_COLUMNS table. If the table is not registered in RASTER_COLUMNS, as we are working only in regularly_tiled mode, we should finish.
- Populate its georeference information, so that the GetProjectionRef and GetGeoTransform methods provide correct information.
- Create the raster bands, paying attention to pixel types and nodata values.
I'll need more testing, anyway.
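The first three steps (checking and parsing the connection string, then checking the working mode) could look roughly like the sketch below. The connection string syntax is an assumption made for this example: a `PG:` prefix followed by space-separated `key=value` pairs, with `table`, `where` and `mode` as the driver-specific keys. The function name `parse_connection_string` and the default mode are hypothetical as well.

```python
import shlex

PREFIX = "PG:"  # assumed prefix, not confirmed by this page
KNOWN_MODES = {"REGULARLY_BLOCKING", "NON_REGULARLY_BLOCKING"}


def parse_connection_string(conn):
    """Split an assumed 'PG:key=value ...' string into driver parameters."""
    # Step 1: check the overall format.
    if not conn.upper().startswith(PREFIX):
        raise ValueError("not a WKT Raster connection string")

    # Step 2: extract key=value pairs; shlex keeps quoted values together,
    # so where='rid > 10' is read as a single parameter.
    params = {}
    for token in shlex.split(conn[len(PREFIX):]):
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed parameter: {token!r}")
        params[key.lower()] = value

    table = params.pop("table", None)
    if not table:
        raise ValueError("a table name is required")
    where = params.pop("where", None)

    # Step 3: check the working mode (the default is an assumption).
    mode = params.pop("mode", "REGULARLY_BLOCKING").upper()
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown working mode: {mode}")

    # Whatever is left (host, dbname, user, ...) goes to the database client.
    return {"table": table, "where": where, "mode": mode, "connection": params}
```

For example, `parse_connection_string("PG:host=localhost dbname=gis table=tiles where='rid > 10'")` would return the table name `tiles`, the WHERE clause `rid > 10`, and leave `host` and `dbname` for opening the connection.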
Implementing the RasterBand
While the Dataset opens the connection to the database, the RasterBand reads blocks of data. So, the key method here is IReadBlock. This method:
- Gets the pixel size.
- Transforms pixel/line coordinates into coordinates of the raster's reference system, using the proper Dataset methods, and gets the coordinates of the lower-left and upper-right corners in map units.
- Queries for the blocks that contain this block. As we only consider regularly_blocking rasters, the result will be exactly one block.
- Fetches the block in HEXWKB format and parses it to get the data. The HEXWKB format includes a raster information header in each block, so this information can be used for integrity checks.
- Looks for the correct raster band and copies its block data into the buffer, taking care of the pixel size and endianness of the band data.
I'll need more testing, anyway.
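The coordinate transformation step can be illustrated with the six-coefficient affine geotransform that GDAL datasets expose through GetGeoTransform. The sketch below is written in Python for brevity and is not the driver's code; the helper names `pixel_to_map` and `block_corners` are hypothetical.

```python
def pixel_to_map(gt, pixel, line):
    """Apply a GDAL-style affine geotransform to a pixel/line position."""
    x = gt[0] + pixel * gt[1] + line * gt[2]
    y = gt[3] + pixel * gt[4] + line * gt[5]
    return x, y


def block_corners(gt, block_x_off, block_y_off, block_width, block_height):
    """Return the lower-left and upper-right corners of a block, in map units.

    block_x_off and block_y_off are block indexes, as received by IReadBlock.
    """
    left = block_x_off * block_width
    top = block_y_off * block_height
    right = left + block_width
    bottom = top + block_height
    lower_left = pixel_to_map(gt, left, bottom)
    upper_right = pixel_to_map(gt, right, top)
    return lower_left, upper_right


# A north-up raster with 2-unit pixels whose upper-left corner is (100, 500).
gt = (100.0, 2.0, 0.0, 500.0, 0.0, -2.0)
print(block_corners(gt, 1, 2, 10, 10))
```

These two corners describe the extent that the block query has to match. With regular blocking, only one stored block should cover it.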
Wrappers for Dataset and RasterBand
The code was really complex after implementing full read-only support for regular-blocking rasters, and it was really inconvenient to deal with the HEXWKB format in the RasterBand methods. So, I created two wrappers: WKTRasterWrapper and WKTRasterBandWrapper.
The WKTRasterWrapper represents a WKT Raster, with a raster header and an array of raster bands. The constructor of the class takes a HEXWKB string as input, fills in all the raster header's properties and creates the raster band wrappers. The WKTRasterBand class has been declared as a friend class, so it can modify any raster field. The most important part is a method called GetHexWkbRepresentation, which returns the updated HEXWKB string each time it is called. It will be useful to perform the in-place update, or to update the RASTER_COLUMNS table if needed.
The WKTRasterBandWrapper represents a raster band. This class is managed by the WKTRasterWrapper class. It has one method, SetData, that updates the data of the band.
I think these wrappers will simplify the rest of the work.
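The idea behind the wrappers (parse the hex string once, modify the in-memory objects, serialize again on demand) can be sketched as follows. Everything about the byte layout here is a toy assumption for the example and is NOT the real WKT Raster serialization: one endianness byte (1 = little endian, following the usual WKB convention), then width, height and band count as 16-bit unsigned integers, then each band as width * height 16-bit unsigned pixels. The Python class and method names only mirror the ones described above.

```python
import struct


class RasterBandWrapper:
    """Holds the pixels of one band (toy layout: 16-bit unsigned values)."""

    def __init__(self, pixels):
        self.pixels = list(pixels)

    def set_data(self, pixels):
        # The band size is fixed by the raster header, so it cannot change.
        if len(pixels) != len(self.pixels):
            raise ValueError("band size cannot change")
        self.pixels = list(pixels)


class RasterWrapper:
    """Parses a hex string in the toy layout and can serialize it back."""

    def __init__(self, hexwkb):
        raw = bytes.fromhex(hexwkb)
        self.little_endian = raw[0] == 1
        order = "<" if self.little_endian else ">"
        header = order + "HHH"  # width, height, number of bands
        self.width, self.height, n_bands = struct.unpack_from(header, raw, 1)
        offset = 1 + struct.calcsize(header)
        band_fmt = f"{order}{self.width * self.height}H"
        self.bands = []
        for _ in range(n_bands):
            pixels = struct.unpack_from(band_fmt, raw, offset)
            self.bands.append(RasterBandWrapper(pixels))
            offset += struct.calcsize(band_fmt)

    def get_hex_wkb_representation(self):
        # Rebuilt on every call, so it always reflects the latest band data.
        order = "<" if self.little_endian else ">"
        out = bytes([1 if self.little_endian else 0])
        out += struct.pack(order + "HHH", self.width, self.height, len(self.bands))
        for band in self.bands:
            out += struct.pack(f"{order}{len(band.pixels)}H", *band.pixels)
        return out.hex().upper()
```

Keeping the byte-order handling inside the wrapper means the RasterBand code only ever sees decoded pixel values, which is the simplification the wrappers are meant to provide.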
If the RASTER_COLUMNS "regular_blocking" value is true then "all blocks are equal sized, abutted and non-overlapping, started at top left origin", plus additional constraints. This regular blocking capability raises the possibility of having very large contiguous raster coverages (made up of many individual WKTRaster-s) which, in turn, raises potential performance problems. Other raster formats counter this by having overviews; a concept that is already supported by GDAL.
I think that this driver must support overviews, because I suppose it will be common to have large images stored in the database, which will produce large datasets when read. Anyway, this issue was being discussed until May 20th ( http://postgis.refractions.net/pipermail/postgis-devel/2009-May/005629.html), and the final decision seems to be to have an additional table for overviews (this is the way Mateusz's script manages it). To follow this issue: http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_OVERVIEWSMetadataTable.
My implementation of overviews (under development right now) follows these steps:
- Create one Dataset per overview in the Open method of the WKTRasterDataset class. It is necessary to check the number of overviews of the given table, using the RASTER_OVERVIEWS metadata table, and to set the name of the overviews table in each created Dataset.
- Create the raster bands for each overview Dataset. Change the pixel size by using the overview factor.
- Create metadata entries for each overview as elements of SUBDATASETS metadata domain in Dataset.
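The pixel size change in the second step can be sketched as a scaling of the dataset geotransform by the overview factor. This is a Python illustration with a hypothetical helper name, `overview_geometry`. Rounding the overview dimensions up is an assumption of this example, not something stated in the specification.

```python
def overview_geometry(gt, width, height, factor):
    """Return the geotransform and size of an overview reduced by `factor`.

    The origin (gt[0], gt[3]) stays the same; every pixel of the overview
    covers `factor` times more ground in each direction.
    """
    if factor < 1:
        raise ValueError("overview factor must be >= 1")
    ov_gt = (
        gt[0], gt[1] * factor, gt[2] * factor,
        gt[3], gt[4] * factor, gt[5] * factor,
    )
    # Integer ceiling division, so partially covered pixels are kept.
    ov_width = (width + factor - 1) // factor
    ov_height = (height + factor - 1) // factor
    return ov_gt, ov_width, ov_height


# A 1000 x 600 raster with 2-unit pixels and an overview factor of 4.
gt = (100.0, 2.0, 0.0, 500.0, 0.0, -2.0)
print(overview_geometry(gt, 1000, 600, 4))
```

Each overview Dataset would then report this scaled geotransform and size while reading its blocks from the corresponding overviews table.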
I'm also working on the implementation of support for out-db rasters. The point is to read the band data from a file in the filesystem, instead of fetching it from the database. The files will be fully-qualified raster files, maybe TIFF files. This issue is undecided and under development right now (July 26th).
Project plan (The tasks will be revised for the end of the GSoC)
|Objectives and tasks||Approx. Schedule||Status|
|Objective 0 - Prepare basic environment||28th June||Done|
|Create basic environment for new GDAL driver||21st June||Done|
|Create testing environment for debugging during driver development||28th June||Done|
|Objective 1 - Prototype read only support for regular blocking rasters||15th July||Done||midterm evaluation|
|Dataset: Connection with database and creation of Raster Bands objects (regular blocking rasters)||6th July||Done|
|RasterBand?: Query the correct block and fetch the raster data||15th July||Done|
|Objective 1.1 - Support of different pixel data types, take care of byte swapping||17th July||Done|
|Objective 2 - Support access to overviews||26th July||Done|
|Objective 3 - Rasters inplace update||26th July||Done|
|Objective 4 - Support for out-db rasters||2nd August||Ongoing|
|Objective 5 - Read only support for non-regular blocking rasters||2nd-10th August||todo|
|Objective 5.1 - Block caching in Raster Band||2nd-10th August||todo|
|Objective 6 - Support for creating new rasters||10th - 17th August||undecided||final evaluation|
- Until 13th July, I'll dedicate about 20h per week. From 13th July to 10th (or 17th) August, I'll dedicate about 40 hours per week.
- The necessary subtasks will be added when needed.
- Undecided tasks will be completed only if time permits.
The quotes of this page are taken from:
- The WKT Raster specification documents, from Pierre Racine.
- The GDAL Driver implementation tutorial: http://www.gdal.org/gdal_drivertut.html, from Frank Warmerdam
- The GDAL Data model: http://www.gdal.org/gdal_datamodel.html, from Frank Warmerdam (?)
- Comments sent by mail or posted in my blog by Tamas Szekeres, Mateusz Loskot, Frank Warmerdam, Even Rouault, Pierre Racine.
I'm sure I'm forgetting someone, but many thanks :-)
- Student: Jorge Arévalo (jorgearevalo at gis4free.org)
- Mentors: Tamas Szekeres, Frank Warmerdam