Changes between Version 47 and Version 48 of WKTRasterDriver


Timestamp:
Aug 16, 2009, 5:06:19 PM (15 years ago)
Author:
jorgearevalo
Comment:

--

  • WKTRasterDriver

    v47 v48  
    1 = Implementation of read-only GDAL driver for WKT Raster extension to PostGIS =
     1= GDAL WKT Raster format driver =
    22
    33[[TOC]]
    44
     5This was one of the projects selected for Google Summer of Code 2009. The goal is to implement a read-only GDAL driver for the WKT Raster extension to PostGIS. A prototype was working by the end of GSoC (August 17th 2009), but the code is still under development at the time of the last update: '''August 17th 2009'''
    56
    6 ********** LAST UPDATE ON 2009/08/02 ***********
     7== PostGIS WKT Raster type ==
     8
     9The PostGIS WKT Raster type is an extension of PostGIS that aims to add raster support. Each raster is read from disk and stored in a PostgreSQL database with the PostGIS and WKT Raster extensions using the [https://svn.osgeo.org/postgis/spike/wktraster/scripts/gdal2wktraster.py gdal2wktraster] script. In the future, creating new WKT Rasters directly from GDAL code will be supported too.
     10
     11Each WKT Raster is represented by a PostgreSQL table with a column of the new type "raster". All the tables with a "raster" column are registered in the RASTER_COLUMNS table. This new table:
     12  "(...) is a mean for applications to get a quick overview of which table have a raster column and to get the main characteristics (metadata) of the rasters stored in these columns"
     13
     14You can check the structure of the RASTER_COLUMNS table in the [http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_COLUMNSMetadataTable WKTRaster final specification page].
     15
     16This "raster" format takes different forms depending on the level at which it is referenced. The raster is written to the database in the [https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat serialized format], but the GDAL WKT Raster driver reads it in the [https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC2-WellKnownBinaryFormat Hexadecimal WKB format]. This is the format you get when outputting the value of a raster field with a query like "SELECT rast FROM table".
     17
     18NOTE: The "int8" data type in Hex WKB format is interpreted as the "Byte" data type by the GDAL WKT Raster driver.
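
As an illustration of what reading the Hexadecimal WKB format involves, here is a minimal sketch that decodes only the raster header. The field order and sizes used here (one endianness byte, then version, number of bands, scale, upper-left corner, skew, SRID, width and height) are an assumption about the RFC2 layout and should be checked against the linked document; the function and field names are hypothetical and are not part of the driver.

```python
import struct

# Assumed header layout (verify against RFC2; names are hypothetical):
#   endianness (uint8: 1 = little endian), version (uint16), nBands (uint16),
#   scaleX, scaleY, ipX, ipY, skewX, skewY (float64 each),
#   srid (int32), width (uint16), height (uint16)
HEADER_FIELDS = ("version", "n_bands", "scale_x", "scale_y", "ip_x", "ip_y",
                 "skew_x", "skew_y", "srid", "width", "height")
HEADER_BODY = "HH6diHH"


def parse_hexwkb_header(hexwkb):
    """Decode the raster header from a hexadecimal WKB string."""
    raw = bytes.fromhex(hexwkb)
    # The first byte tells how the remaining fields are byte-ordered.
    byte_order = "<" if raw[0] == 1 else ">"
    fmt = byte_order + HEADER_BODY
    size = struct.calcsize(fmt)
    values = struct.unpack(fmt, raw[1:1 + size])
    return dict(zip(HEADER_FIELDS, values))
```

The band data that follows the header is not handled here; this sketch only shows how the byte order flag drives the decoding of the remaining fields.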
     19
     20Each row of a table with a column of type "raster" represents an '''image tile''', a '''raster object coverage''' (the result of rasterizing a vector coverage) or a '''whole image''' unrelated to the rest of the table's rows. Only the first table arrangement (one row = one image tile) is fully understood by the GDAL WKT Raster driver in its current state (see last update). The remaining arrangements will be discussed with the WKT Raster development team.
     21
     22Focusing on this first arrangement, there are 2 different cases:
     23  1. Regularly tiled images
     24  1. Irregularly tiled images
     25
     26In the first case, we say that the table with this arrangement is a "regularly blocked table". This implies that:
     27  1. All loaded tiles have the same width and height
     28  1. Tiles do not overlap and their upper left corners follow a regular block grid
     29  1. The global extent of the layer is rectangular and not rotated.
     30
     31If a table does not meet all of these conditions, it is an "irregularly blocked table".
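
The first two conditions can be sketched in a few lines. This is only an illustration, not the driver's code: the tile representation (pixel offsets plus size) and the function name are hypothetical, and the third condition (rectangular, non-rotated extent) is not checked because it depends on the georeference metadata.

```python
def is_regularly_blocked(tiles):
    """Check conditions 1 and 2 for tiles given as (x_off, y_off, width, height).

    Offsets and sizes are integers in pixels (a hypothetical representation).
    """
    tiles = list(tiles)
    if not tiles:
        return False
    _, _, width, height = tiles[0]
    if width <= 0 or height <= 0:
        return False
    # Condition 1: every tile has the same width and height.
    if any((w, h) != (width, height) for _, _, w, h in tiles):
        return False
    origin_x = min(t[0] for t in tiles)
    origin_y = min(t[1] for t in tiles)
    seen_cells = set()
    for x_off, y_off, _, _ in tiles:
        dx, dy = x_off - origin_x, y_off - origin_y
        # Condition 2a: upper left corners fall on the block grid.
        if dx % width or dy % height:
            return False
        cell = (dx // width, dy // height)
        # Condition 2b: equal-sized tiles on the grid overlap only
        # when two of them occupy the same cell.
        if cell in seen_cells:
            return False
        seen_cells.add(cell)
    return True
```

For example, three 100x100 tiles at offsets (0, 0), (100, 0) and (0, 100) pass the check, while a tile at offset (50, 0) next to one at (0, 0) does not.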
     32
     33Each WKT Raster can have zero or more overviews. Each overview is another raster table, like the original one. Currently, all the information about the overviews of a WKT Raster table is stored in another metadata table, called [http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_OVERVIEWSMetadataTable RASTER_OVERVIEWS]. However, overview support is still a working specification, not a final accepted one.
     34 
     35As for raster data storage, the WKT Raster format offers 2 ways of storing the data of a raster band:
     36  * in-db storage: The band data is stored in the database, in serialized format with the rest of the band's data.
     37  * out-db storage: The band data is registered in an external file residing in the file system.
     38
     39At present, only the first storage system is allowed in the WKT Raster code, but the second one is planned to be ready this year (2009). In the meantime, I've modified the loader script to add out-db support. The patch for the original code is in [http://trac.osgeo.org/postgis/ticket/227 PostGIS ticket 227].
    740
    841
    9 This is one of the selected projects for Google Summer of Code 2009. Links to the weekly reports will be posted here during the project development, as well as useful information and conclusions.
     42You can find further information on the WKT Raster type on the [http://trac.osgeo.org/postgis/wiki/WKTRaster WKT Raster home page].
    1043
    11 == Weekly reports ==
     44
     45== Using the GDAL WKT Raster driver ==
     46
     47To use the GDAL WKT Raster driver, you must provide a '''connection string''' as the Dataset's name. The syntax of this connection string is (respect the quotes):
     48
     49
     50{{{
     51 PG:"host='<host>' dbname='<dbname>' user='<user>' password='<password>' table='<raster_table>' [where='sql_where' mode='working_mode']"
     52}}}
     53
     54Note that the part of the string before "table='" is a libpq-style connection string. That means you can change the order of these fields (dbname, user, password, host), or leave out unnecessary ones (like password, in some cases). The rest of the connection string must follow the syntax and order shown above.
     55
     56The "where" option filters the rows of the raster table. Any SQL WHERE expression is valid. The "mode" option tells the driver the expected arrangement of the raster table. As the driver currently works with only one table arrangement (regularly blocked tables), you can omit this option or set it to "REGULARLY_TILED_MODE". With any other value, the driver won't work.
     57
     58Use this dataset name format with all the GDAL tools, like gdalinfo, gdal_translate, gdalwarp, etc.
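
A small helper can assemble such a dataset name programmatically. This is a sketch under stated assumptions: the helper name and the example values (database, user, table and the "rid" column) are hypothetical, and it assumes the name is the "PG:" prefix followed by the key/value pairs, with the outer double quotes of the syntax above only needed to protect the string from the shell. Values containing single quotes are not escaped.

```python
def build_connection_string(table, where=None, mode=None, **libpq_params):
    """Build a dataset name for the GDAL WKT Raster driver (hypothetical helper).

    libpq_params holds the libpq-style fields (host, dbname, user, password);
    their order is free, but table, where and mode must come last, in that order.
    """
    parts = ["{}='{}'".format(key, value) for key, value in libpq_params.items()]
    parts.append("table='{}'".format(table))
    if where is not None:
        parts.append("where='{}'".format(where))
    if mode is not None:
        parts.append("mode='{}'".format(mode))
    return "PG:" + " ".join(parts)


# Example with made-up values; "rid" is a hypothetical column name.
name = build_connection_string(
    "rasters",
    where="rid < 10",
    mode="REGULARLY_TILED_MODE",
    host="localhost",
    dbname="gisdb",
    user="gis",
)
print(name)
```

The resulting string would then be passed as the dataset name to a tool such as gdalinfo (wrapped in double quotes on the command line) or to the GDAL bindings.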
     59
     60== TODO Tasks ==
     61
     62The relevant TODO tasks, in planned order of implementation, are:
     63
     64  1. Block reading improvement. Avoid one server round trip for each block reading call. (IRasterIO overriding).
     65  1. Out-db data reading support. (IRasterIO overriding).
     66  1. RASTER_COLUMNS table update with the values read on data blocks, if needed.
     67  1. Support for reading non-regularly blocked rasters.
     68  1. Bulk update of data blocks when any raster-related value changes on RASTER_COLUMNS table.
     69  1. Support for creating new WKT Rasters.
     70
     71This list is always under revision.
     72
     73== Development status at last update ==
     74
     75The GDAL WKT Raster driver is able to read regularly blocked rasters stored in-db.
     76
     77== Project plan ==
     78
     79||'''Objectives and tasks'''||'''Approx. Schedule'''||'''Status'''||
     80||'''Objective 1 - Prototype of GDAL WKT Raster read-only driver'''||'''17th August'''||'''Done'''||
     81||'''Objective 2 - Block reading improvement'''||'''undecided'''||'''On going'''||
     82||'''Objective 3 - Out-db data reading support'''||'''undecided'''||'''On going'''||
     83||'''Objective 4 - RASTER_COLUMNS table update'''||'''undecided'''||'''Todo'''||
     84||'''Objective 5 - Support for reading non-regularly blocked rasters'''||'''undecided'''||'''Todo'''||
     85||'''Objective 6 - Bulk update of data blocks'''||'''undecided'''||'''Todo'''||
     86||'''Objective 7 - Support for creating new WKT Rasters'''||'''undecided'''||'''Todo'''||
     87
     88
     89== GSoC 09 Weekly reports ==
    1290[http://www.gis4free.org/blog/2009/05/30/gsoc-09-weekly-report-1-2305-2905/ Weekly report #1 (23/05 - 29/05)][[BR]]
    1391[http://www.gis4free.org/blog/2009/06/09/gsoc-09-weekly-report-2-2905-0506/ Weekly report #2 (29/05 - 05/06)][[BR]]
     
    2098[http://www.gis4free.org/blog/2009/07/26/gsoc-09-weekly-report-9-1707-2407/ Weekly report #9 (17/07 - 24/07)][[BR]]
    2199[http://www.gis4free.org/blog/2009/08/02/gsoc-09-weekly-report-10-2407-3107/ Weekly report #10 (24/07 - 31/07)][[BR]]
    22 [http://www.gis4free.org/blog/2009/08/11/gsoc-09-weekly-report-11-3107-0708/ Weekly report #11 (31/07 - 07/08)]
    23 
    24 == General overview ==
    25 The main goal of this project is to create a new type of raster driver in GDAL library. This new type of raster driver will deal with a new type of data: the new PostGIS WKT Raster type (an extension of PostGIS aiming at developing support for raster).
    26 
    27 First issue is to match the GDAL Dataset architecture with the new WKT Raster type. So, basically, what is a "WKT Raster"?
    28   * A 'complete' image.
    29   * An image 'tile'.
    30   * A raster object. A new type of object, resulting from the rasterization of a vector coverage.
    31  
    32 And a WKT Raster always have:
    33   * '''one or more raster bands'''.
    34   * Associated metadata, that includes '''georeference information'''
    35 
    36 Now, what is a GDAL Dataset? ''An assembly of related raster bands and some information common to them all'' (metadata)
    37 
    38 So, the relation between "WKT Raster object" and "GDAL Dataset" seems to be very clear.
    39 
    40 But there is an important issue here. The WKT Raster objects will be stored at PostgreSQL tables. So, a table with a column of type WKT Raster may be seen as:
    41   a. An image warehouse of untiled and (possibly) unrelated images.
    42   a. An irregularly tiled raster coverage.
    43   a. A regularly tiled raster coverage.
    44   a. A rectangular regularly tiled raster coverage.
    45   a. A tiled image. --> NOT CONSIDERED FROM NOW. ONLY 1-TABLE-RASTERS
    46   a. A raster object coverage resulting from the rasterization of a vector coverage.
    47 
    48 Options c and d should be the easiest ones for the GDAL driver to read. They are rasters with a "regular blocking" structure. When a raster layer of these types is loaded:
    49   1. All loaded tiles have the same width and height,
    50   1. All tiles do not overlap and their upper left corners follow a regular block grid,
    51   1. The global extent of the layer is rectangular and not rotated.
    52 
    53 Then, for the basic version of the GDAL WKT Raster driver, we can focus on these tasks (reading raster of types c and d). For this reason, we specify several working modes for it. They aren't totally defined yet, but, at least, we should use:
    54 - REGULARLY_BLOCKING mode: In this mode, our raster is regularly tiled. So, it fulfills the three previous properties.
    55 - NON_REGULARLY_BLOCKING mode: In this mode, our raster is non-regularly tiled. The tiles can have different size and may overlap, so, the global extent of the raster isn't necessarily rectangular.
    56 
    57 Maybe we'll need more working modes in the future, for non-tiled rasters.
    58 
    59 == Implementing the Dataset ==
    60 
    61 NOTE: ONLY REGULAR-BLOCKING RASTERS, FROM NOW
    62 
    63 The Dataset performs the following operations:
    64 
    65   1. Check connection string format
    66   1. Parse connection string, extracting useful information (table name, optional sql where part, working mode)
    67   1. Check working mode. If not regularly_tiled mode, from now, finish.
    68   1. Open a database connection.
    69   1. Perform some security and integrity checks: check whether the database has the PostGIS extensions, whether the table exists, whether it is registered in the RASTER_COLUMNS table, whether it has a GIST index, etc. Suggestions accepted and appreciated.
    70   1. Fetch raster properties from RASTER_COLUMNS table. If the table is not registered in RASTER_COLUMNS, as we are working only in regularly_tiled mode, we should finish.
    71   1. Populate its georeference information, to allow GetProjectionRef and GetGeoTransform methods provide correct information.
    72   1. Try to fetch all the blocks covered by raster extent and store them as WKTRasterWrapper objects in a Dataset's array.
    73   1. Create the raster bands, paying attention to pixel types and nodata values.
    74   1. Create the overviews as children datasets, if needed.
    75 
    76 I'll need more testing, anyway.
    77 
    78 == Implementing the RasterBand ==
    79 
    80 If Dataset opens the connection with database, RasterBand reads blocks of data. So, the key method here is IReadBlock. This method:
    81 
    82   1. Try to get the blocks from the Dataset cache (the array of WKTRasterWrapper objects). If fails, do the rest of the steps:
    83   1. Get pixel size.
    84   1. Transform pixel,line coordinates into coordinates of the raster reference systems, by using the proper methods of Dataset, and get the coordinates of the lower left and upper right corners in map units
    85   1. Query for blocks that contains this block. As we only consider regularly_blocking rasters, the result will only be 1 block.
    86   1. Fetch the block in HEXWKB format and parse it to get the data. The HEXWKB format includes a raster information header on each block, so this information can be used for integrity checks.
    87   1. Look for the correct raster band and copy its block data into the buffer, taking care of the pixel size and endianness of the band data.
    88 
    89 I'll need more testing, anyway.
    90 
    91 == Wrappers for Dataset and RasterBand ==
    92 
    93 The code was really complex after implementing full read-only support for regular-blocking rasters, and it was really inconvenient to deal with the hexwkb format in the RasterBand methods. So, I created two wrappers: WKTRasterWrapper and WKTRasterBandWrapper.
    94 
    95 The WKTRasterWrapper represents a WKTRaster, with a raster header and an array of raster bands. The constructor of the class takes a hexwkb string as input, fills in all the raster header's properties and creates the raster band wrappers. The ''WKTRasterBand'' class has been declared as a ''friend'' class, so it can modify any raster field. The most important part is a method called ''GetHexWkbRepresentation'' that returns the updated hexwkb string each time it is called. It will be useful to perform the in-place update, or to update the RASTER_COLUMNS table if needed.
    96 
    97 The WKTRasterBandWrapper represents a raster band. This class is managed by WKTRasterWrapper class. It has one method, ''SetData'', that updates the data of the band.
    98 
    99 I think these wrappers will simplify the rest of the work.
    100 
    101 == Overviews ==
    102 
    103 ''If the RASTER_COLUMNS "regular_blocking" value is true then "all blocks are equal sized, abutted and non-overlapping, started at top left origin", plus additional constraints. This regular blocking capability raises the possibility of having very large contiguous raster coverages (made up of many individual WKTRaster-s) which, in turn, raises potential performance problems. Other raster formats counter this by having overviews; a concept that is already supported by GDAL.''
    104 
    105 I think that this driver must support overviews, because I suppose it will be common to have large images stored in the database, which will produce large datasets when read. Anyway, this issue was being discussed until May 20th (http://postgis.refractions.net/pipermail/postgis-devel/2009-May/005629.html), and the final decision seems to be to have an additional table for overviews (this is the way Mateusz' script manages it). To follow this issue: http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_OVERVIEWSMetadataTable.
    106 
    107 My implementation of overviews (under development just now) follows the next steps:
    108   1. Create one Dataset per overview in the Open method of the WKTRasterDataset class. It is necessary to check the number of overviews of the given table, using the RASTER_OVERVIEWS metadata table, and to set the name of the overviews table in each created Dataset.
    109   1. Create the rasterbands for each overview Dataset. Change the pixel size by using the overview factor.
     100[http://www.gis4free.org/blog/2009/08/11/gsoc-09-weekly-report-11-3107-0708/ Weekly report #11 (31/07 - 07/08)][[BR]]
     101[http://www.gis4free.org/blog/2009/08/17/gsoc-09-final-report-0708-1708/ Final GSoC 09 Weekly report (07/08 - 17/08)]
    110102
    111103
    112 == Out-db rasters ==
    113 
    114 I'm also working on support for out-db rasters. The point is to read the band data from a file in the filesystem, instead of fetching it from the database. The files will be fully-qualified raster files, maybe TIFF files. This issue is undecided and under development just now (July 26th)
    115 
    116 == Project plan (The tasks will be revised for the end of the GSoC) ==
    117 
    118 ||'''Objectives and tasks'''||'''Approx. Schedule'''||'''Status'''||
    119 ||'''Objective 0 - Prepare basic environment'''||'''28th June'''||'''Done'''||
    120 ||Create basic environment for new GDAL driver|| 21st June||Done||
    121 ||Create testing environment for debugging driver development|| 28th June||Done||
    122 ||'''Objective 1 - Prototype read only support for regular blocking rasters'''||'''15th July'''||'''Done'''||'''midterm evaluation'''||
    123 ||Dataset: Connection with database and creation of Raster Bands objects (regular blocking rasters)||6th July||Done||
    124 ||RasterBand: Query the correct block and fetch the raster data||15th July||Done||
    125 ||'''Objective 1.1 - Support of different pixel data types, take care of byte swapping'''||'''17th July'''||'''Done'''||
    126 ||'''Objective 2 - Support access to overviews'''||'''26th July'''||'''Done'''||
    127 ||'''Objective 3 - Rasters inplace update'''||'''26th July'''||'''Done'''||
    128 ||Objective 3.1 - Block caching in Raster Band||2nd-10th August||on going||
    129 ||'''Objective 4 - Support for out-db rasters'''||'''2nd-10th August'''||'''on going'''||
    130 ||'''Objective 5 - Testing code and documentation'''||'''17th August'''||'''on going'''||'''final evaluation'''||
    131 ||'''Objective 6 - Read only support for non-regular blocking rasters'''||'''undecided'''||'''todo'''||
    132 ||'''Objective 7 - Support for creating new rasters'''||'''undecided'''||'''todo'''||
    133 
    134 Notes:
    135   * Undecided tasks will be completed after the end of GSoC.
    136 
    137 == Acknowledgments ==
    138 
    139 The quotes of this page are taken from:
    140   * The WKT Raster specification documents, from Pierre Racine.
    141   * The GDAL Driver implementation tutorial: http://www.gdal.org/gdal_drivertut.html, from Frank Warmerdam
    142   * The GDAL Data model: http://www.gdal.org/gdal_datamodel.html, from Frank Warmerdam
    143   * Comments sent by mail or posted in my blog by Tamas Szekeres, Mateusz Loskot, Frank Warmerdam, Even Rouault, Pierre Racine.
    144 
    145 I'm sure that I forget someone, but many thanks :-)
    146 
    147 == Participants info ==
     104== GSoC 09 Participants info ==
    148105  * Student: Jorge Arévalo (jorgearevalo at gis4free.org)
    149106  * Mentors: Tamas Szekeres, Frank Warmerdam