Version 46 (modified by jorgearevalo, 15 years ago)

Implementation of read-only GDAL driver for WKT Raster extension to PostGIS

LAST UPDATE ON 2009/08/02

This is one of the selected projects for Google Summer of Code 2009. Links to the weekly reports will be posted here during the project development, as well as useful information and conclusions.

Weekly reports

Weekly report #1 (23/05 - 29/05)
Weekly report #2 (29/05 - 05/06)
Weekly report #3 (05/06 - 12/06)
Weekly report #4 (12/06 - 19/06)
Weekly report #5 (19/06 - 26/06)
Weekly report #6 (26/06 - 03/07)
Weekly report #7 (03/07 - 10/07)
Weekly report #8 (10/07 - 17/07)
Weekly report #9 (17/07 - 24/07)
Weekly report #10 (24/07 - 31/07)
Weekly report #11 (31/07 - 07/08)

General overview

The main goal of this project is to create a new type of raster driver in the GDAL library. This driver will deal with a new type of data: the PostGIS WKT Raster type (an extension of PostGIS aimed at developing support for rasters).

The first issue is to match the GDAL Dataset architecture with the new WKT Raster type. So, basically, what is a "WKT Raster"?

  • A 'complete' image.
  • An image 'tile'.
  • A raster object. A new type of object, resulting from the rasterization of a vector coverage.

And a WKT Raster always has:

  • One or more raster bands.
  • Associated metadata, which includes georeference information.

Now, what is a GDAL Dataset? An assembly of related raster bands and some information common to them all (metadata).

So, the relation between "WKT Raster object" and "GDAL Dataset" seems to be very clear.

But there is an important issue here. The WKT Raster objects will be stored in PostgreSQL tables. So, a table with a column of type WKT Raster may be seen as:

  a. An image warehouse of untiled and (possibly) unrelated images.
  b. An irregularly tiled raster coverage.
  c. A regularly tiled raster coverage.
  d. A rectangular regularly tiled raster coverage.
  e. A tiled image. (NOT CONSIDERED FOR NOW. ONLY 1-TABLE RASTERS.)
  f. A raster object coverage resulting from the rasterization of a vector coverage.

Options c and d should be the easiest ones for the GDAL driver to read. They are rasters with a "regular blocking" structure. When a raster layer of these types is loaded:

  1. All loaded tiles have the same width and height,
  2. No tiles overlap, and their upper left corners follow a regular block grid,
  3. The global extent of the layer is rectangular and not rotated.
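
As an illustration only, the three properties above could be checked along these lines. This is a minimal sketch, not the driver's code: the `Tile` class, its fields (offsets and sizes in pixels, plus skew terms) and the function name are all hypothetical. Missing tiles are not treated as a violation here, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical tile description; offsets and sizes are in pixels."""
    x_off: int
    y_off: int
    width: int
    height: int
    skew_x: float = 0.0
    skew_y: float = 0.0


def is_regularly_blocked(tiles):
    """Return True if the tiles satisfy the three regular-blocking properties."""
    if not tiles:
        return False
    w, h = tiles[0].width, tiles[0].height
    # Property 1: all tiles have the same width and height.
    if any(t.width != w or t.height != h for t in tiles):
        return False
    # Property 3: the layer is not rotated (the bounding box of
    # axis-aligned tiles is then rectangular).
    if any(t.skew_x != 0 or t.skew_y != 0 for t in tiles):
        return False
    min_x = min(t.x_off for t in tiles)
    min_y = min(t.y_off for t in tiles)
    seen = set()
    for t in tiles:
        dx, dy = t.x_off - min_x, t.y_off - min_y
        # Property 2: upper left corners follow a regular block grid...
        if dx % w or dy % h:
            return False
        cell = (dx // w, dy // h)
        # ...and two tiles on the same grid cell would overlap.
        if cell in seen:
            return False
        seen.add(cell)
    return True
```

A layer that passes this check would be a candidate for the regular-blocking case; anything else would fall into the non-regular case.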

So, for the basic version of the GDAL WKT Raster driver, we can focus on reading rasters of types c and d. For this reason, we specify several working modes for the driver. They aren't totally defined yet, but we should have at least:

  • REGULARLY_BLOCKING mode: In this mode, our raster is regularly tiled. So, it fulfills the three previous properties.
  • NON_REGULARLY_BLOCKING mode: In this mode, our raster is non-regularly tiled. The tiles can have different sizes and may overlap, so the global extent of the raster isn't necessarily rectangular.

Maybe we'll need more working modes in the future, for non-tiled rasters.

Implementing the Dataset

NOTE: ONLY REGULAR-BLOCKING RASTERS, FOR NOW

The Dataset performs the following operations:

  1. Check connection string format
  2. Parse connection string, extracting useful information (table name, optional sql where part, working mode)
  3. Check working mode. If it is not regularly_tiled mode, finish (for now).
  4. Open a database connection.
  5. Perform some security and integrity checks: check whether the database has the PostGIS extensions, whether the table exists, whether it is registered in the RASTER_COLUMNS table, whether it has a GIST index, etc. Suggestions accepted and appreciated.
  6. Fetch raster properties from the RASTER_COLUMNS table. If the table is not registered in RASTER_COLUMNS, we should finish, as we are working only in regularly_tiled mode.
  7. Populate its georeference information, so that the GetProjectionRef and GetGeoTransform methods return correct information.
  8. Try to fetch all the blocks covered by raster extent and store them as WKTRasterWrapper objects in a Dataset's array.
  9. Create the raster bands, paying attention to pixel types and nodata values.
  10. Create the overviews as children datasets, if needed.
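
Steps 1 to 3 above could be sketched as follows. This is a hedged illustration only: the page does not define the connection string syntax, so the `PG:` prefix, the `key=value` layout, the parameter names `table`, `where` and `mode`, and the default mode are all assumptions, as are the function names.

```python
import shlex

REGULARLY_BLOCKING = "REGULARLY_BLOCKING"
NON_REGULARLY_BLOCKING = "NON_REGULARLY_BLOCKING"


def parse_connection_string(conn):
    """Split an assumed 'PG:key=value ...' string into its parts."""
    # Step 1: check the connection string format.
    if not conn.startswith("PG:"):
        raise ValueError("connection string must start with 'PG:'")
    # Step 2: extract table name, optional WHERE clause and working mode.
    params = {}
    for token in shlex.split(conn[3:]):  # shlex keeps quoted values together
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed parameter: {token!r}")
        params[key.lower()] = value
    if "table" not in params:
        raise ValueError("missing 'table' parameter")
    # Assumption: regular blocking is the default mode when none is given.
    mode = params.pop("mode", REGULARLY_BLOCKING).upper()
    if mode not in (REGULARLY_BLOCKING, NON_REGULARLY_BLOCKING):
        raise ValueError(f"unknown working mode: {mode}")
    return {
        "table": params.pop("table"),
        "where": params.pop("where", None),
        "mode": mode,
        "connection": params,  # remaining keys go to the database connection
    }


def open_dataset(conn):
    """Return the parsed info, or None for modes that are not handled yet."""
    info = parse_connection_string(conn)
    # Step 3: only the regular-blocking mode is supported for now.
    if info["mode"] != REGULARLY_BLOCKING:
        return None
    return info
```

The remaining steps (opening the connection, the integrity checks and fetching the blocks) depend on the database and are left out of this sketch.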

I'll need more testing, anyway.

Implementing the RasterBand

While the Dataset opens the connection to the database, the RasterBand reads blocks of data. So, the key method here is IReadBlock. This method:

  1. Tries to get the block from the Dataset cache (the array of WKTRasterWrapper objects). If that fails, it performs the rest of the steps.
  2. Gets the pixel size.
  3. Transforms pixel/line coordinates into coordinates of the raster reference system, using the proper Dataset methods, and gets the coordinates of the lower left and upper right corners in map units.
  4. Queries for the blocks that contain this block. As we only consider regularly_blocking rasters, the result will be only 1 block.
  5. Fetches the block in HEXWKB format and parses it to get the data. The HEXWKB format includes a raster information header in each block, so this information can be used for integrity checks.
  6. Looks for the correct raster band and copies its block data into the buffer, taking care of the pixel size and endianness of the band data.
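
Step 3 relies on the standard GDAL affine geotransform, where a six-element tuple maps pixel/line positions to map coordinates. A minimal sketch of computing the lower left and upper right corners of a block could look like this; the function names and the block-index arguments are hypothetical, only the geotransform formula is GDAL's.

```python
def pixel_to_map(gt, pixel, line):
    """Apply a GDAL-style geotransform (6 coefficients) to a pixel/line pair."""
    x = gt[0] + pixel * gt[1] + line * gt[2]
    y = gt[3] + pixel * gt[4] + line * gt[5]
    return x, y


def block_extent(gt, block_x, block_y, block_width, block_height):
    """Return ((min_x, min_y), (max_x, max_y)) of a block in map units."""
    pixel0 = block_x * block_width
    line0 = block_y * block_height
    # Transform the four block corners, then take the bounding box, so the
    # result is also valid for north-up rasters with a negative pixel height.
    corners = [
        pixel_to_map(gt, pixel0 + dx, line0 + dy)
        for dx in (0, block_width)
        for dy in (0, block_height)
    ]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys)), (max(xs), max(ys))
```

The two corners returned would then be the ones used to build the spatial query of step 4.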

I'll need more testing, anyway.

Wrappers for Dataset and RasterBand

The code became really complex after implementing full read-only support for regular-blocking rasters, and it was really inconvenient to deal with the hexwkb format in the RasterBand methods. So, I created two wrappers: WKTRasterWrapper and WKTRasterBandWrapper.

The WKTRasterWrapper represents a WKTRaster, with a raster header and an array of raster bands. The class constructor takes a hexwkb string as input, fills in all the raster header properties and creates the raster band wrappers. The WKTRasterBand class has been declared as a friend class, so it can modify any raster field. The most important part is a method called GetHexWkbRepresentation, which returns the updated hexwkb string each time it is called. It will be useful for performing the in-place update, or for updating the RASTER_COLUMNS table if needed.

The WKTRasterBandWrapper represents a raster band. This class is managed by the WKTRasterWrapper class. It has one method, SetData, that updates the data of the band.

I think these wrappers will simplify the rest of the work.
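
To give an idea of what the header-parsing part of such a wrapper involves, here is a rough sketch. It assumes the header layout documented for the later PostGIS raster WKB serialization (one endianness byte, then version, number of bands, scale, upper left corner, skew, SRID, width and height); the layout in use in 2009 may have differed, and the names `RasterHeader` and `parse_hexwkb_header` are hypothetical.

```python
import struct
from collections import namedtuple

RasterHeader = namedtuple(
    "RasterHeader",
    "version n_bands scale_x scale_y ip_x ip_y skew_x skew_y srid width height",
)

# Assumed field layout after the endianness byte:
# 2 x uint16, 6 x float64, 1 x int32, 2 x uint16.
_HEADER_FIELDS = "HHddddddiHH"


def parse_hexwkb_header(hexwkb):
    """Decode the raster header found at the start of a hexwkb string."""
    raw = bytes.fromhex(hexwkb)
    if not raw:
        raise ValueError("empty hexwkb string")
    # As in WKB, the first byte gives the byte order:
    # 1 = little endian (NDR), 0 = big endian (XDR).
    order = "<" if raw[0] == 1 else ">"
    fmt = order + _HEADER_FIELDS
    if len(raw) < 1 + struct.calcsize(fmt):
        raise ValueError("hexwkb string is too short to hold a raster header")
    return RasterHeader(*struct.unpack_from(fmt, raw, 1))
```

Reading the byte order first and passing it to every later decode is what lets the band data be byte-swapped correctly when the database and the client differ in endianness.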

Overviews

If the RASTER_COLUMNS "regular_blocking" value is true then "all blocks are equal sized, abutted and non-overlapping, started at top left origin", plus additional constraints. This regular blocking capability raises the possibility of having very large contiguous raster coverages (made up of many individual WKTRaster-s) which, in turn, raises potential performance problems. Other raster formats counter this by having overviews; a concept that is already supported by GDAL.

I think that this driver must support overviews, because I expect it will be common to have large images stored in the database, which will produce large datasets when read. This issue was being discussed until May 20th (http://postgis.refractions.net/pipermail/postgis-devel/2009-May/005629.html), and the final decision seems to be to have an additional table for overviews (this is how Mateusz' script manages it). To follow this issue: http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_OVERVIEWSMetadataTable.

My implementation of overviews (under development right now) follows these steps:

  1. Create one Dataset per overview in the Open method of the WKTRasterDataset class. It is necessary to check the number of overviews of the given table, using the RASTER_OVERVIEWS metadata table, and to set the name of the overviews table in each created Dataset.
  2. Create the raster bands for each overview Dataset. Change the pixel size by using the overview factor.
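
The pixel size change in step 2 is simple arithmetic, sketched below under stated assumptions: the overview factor is taken to be an integer reduction factor, and the overview dimensions are rounded up. The function name and the returned dictionary are hypothetical.

```python
import math


def overview_geometry(width, height, pixel_size_x, pixel_size_y, factor):
    """Derive size and pixel size of an overview from the base raster."""
    if factor < 1:
        raise ValueError("overview factor must be >= 1")
    return {
        # Fewer pixels cover the same extent (rounded up, an assumption)...
        "width": math.ceil(width / factor),
        "height": math.ceil(height / factor),
        # ...so each overview pixel is 'factor' times larger in map units.
        "pixel_size_x": pixel_size_x * factor,
        "pixel_size_y": pixel_size_y * factor,
    }
```

For example, a factor of 4 applied to a raster with 2-unit pixels would give overview pixels of 8 units.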

Out-db rasters

I'm also working on support for out-db rasters. The point is to read the band data from a file in the filesystem instead of fetching it from the database. The files will be fully-qualified raster files, maybe TIFF files. This issue is undecided and under development right now (July 26th).

Project plan (The tasks will be revised for the end of the GSoC)

Objectives and tasks | Approx. schedule | Status
Objective 0 - Prepare basic environment | 28th June | Done
Create basic environment for new GDAL driver | 21st June | Done
Create testing environment for debugging the driver during development | 28th June | Done
Objective 1 - Prototype read-only support for regular blocking rasters | 15th July | Done (midterm evaluation)
Dataset: Connection with database and creation of Raster Band objects (regular blocking rasters) | 6th July | Done
RasterBand: Query the correct block and fetch the raster data | 15th July | Done
Objective 1.1 - Support of different pixel data types, take care of byte swapping | 17th July | Done
Objective 2 - Support access to overviews | 26th July | Done
Objective 3 - Rasters in-place update | 26th July | Done
Objective 4 - Read-only support for non-regular blocking rasters | 2nd-10th August | To do
Objective 4.1 - Block caching in Raster Band | 2nd-10th August | Ongoing
Objective 5 - Support for out-db rasters | 2nd-10th August | Ongoing
Objective 6 - Testing code and documentation | 17th August | Ongoing (final evaluation)
Objective 7 - Support for creating new rasters | Undecided | To do

Notes:

  • Undecided tasks will be completed after the end of GSoC.

Acknowledgments

The quotes of this page are taken from:

I'm sure that I'm forgetting someone, but many thanks :-)

Participants info

  • Student: Jorge Arévalo (jorgearevalo at gis4free.org)
  • Mentors: Tamas Szekeres, Frank Warmerdam