6 | | ********** LAST UPDATE ON 2009/08/02 *********** |
| 7 | == PostGIS WKT Raster type == |
| 8 | |
| 9 | The PostGIS WKT Raster type is an extension of PostGIS that aims to add raster support. Each raster is read from disk and stored in a PostgreSQL database with the PostGIS and WKT Raster extensions using the [https://svn.osgeo.org/postgis/spike/wktraster/scripts/gdal2wktraster.py gdal2wktraster] script. In the future, creating new WKT Rasters directly from GDAL code will be allowed too. |
| 10 | |
| 11 | Each WKT Raster is represented by a PostgreSQL table with a column of the new type "raster". All the tables with a "raster" column are registered in the RASTER_COLUMNS table. This new table: |
| 12 | "(...) is a mean for applications to get a quick overview of which table have a raster column and to get the main characteristics (metadata) of the rasters stored in these columns" |
| 13 | |
| 14 | You can check the structure of the RASTER_COLUMNS table in the [http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_COLUMNSMetadataTable WKTRaster final specification page]. |
| 15 | |
| 16 | The "raster" format takes different forms depending on the level at which it is referred to. The format in which the raster is written to the database is the [https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat serialized format], but the GDAL WKT Raster driver reads the raster in the [https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC2-WellKnownBinaryFormat Hexadecimal WKB format]. This is the format you get when outputting the value of a raster field with a query like "SELECT rast FROM table". |
| 17 | |
| 18 | NOTE: The GDAL WKT Raster driver interprets the "int8" data type of the Hex WKB format as the "Byte" data type. |
| 19 | |
| 20 | Each row of a table with a column of the type "raster" represents an '''image tile''', a '''raster object coverage''' (the result of rasterizing a vector coverage) or a '''whole image''' unrelated to the rest of the table's rows. Only the first table arrangement (one row = one image tile) is fully understood by the GDAL WKT Raster driver in its current state (see last update). The remaining arrangements will be discussed with the WKT Raster development team. |
| 21 | |
| 22 | Focusing on this first arrangement, there are two different cases: |
| 23 | 1. Regularly tiled images |
| 24 | 1. Irregularly tiled images |
| 25 | |
| 26 | In the first case, we say that the table with this arrangement is a "regularly blocked table". This implies that: |
| 27 | 1. All loaded tiles have the same width and height |
| 28 | 1. No tiles overlap, and their upper left corners follow a regular block grid |
| 29 | 1. The global extent of the layer is rectangular and not rotated. |
| 30 | |
| 31 | If a table does not satisfy all of these conditions, it is an "irregularly blocked table". |
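
The three conditions above can be sketched as a small check. This is only an illustration, not code from the driver: the `Tile` class and `is_regularly_blocked` function are hypothetical names, tile positions are given in pixel coordinates (so rotation is not modelled), and the third condition is interpreted here as "every cell of the bounding grid is covered", which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical tile description: upper left corner and size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def is_regularly_blocked(tiles):
    """Return True if the tiles satisfy the three regular blocking conditions."""
    if not tiles:
        return False
    w, h = tiles[0].width, tiles[0].height
    # Condition 1: all tiles have the same width and height.
    if any(t.width != w or t.height != h for t in tiles):
        return False
    x0 = min(t.x for t in tiles)
    y0 = min(t.y for t in tiles)
    cells = set()
    for t in tiles:
        dx, dy = t.x - x0, t.y - y0
        # Condition 2a: upper left corners follow a regular block grid.
        if dx % w or dy % h:
            return False
        cell = (dx // w, dy // h)
        # Condition 2b: equal-sized tiles on the grid overlap only if
        # two of them occupy the same cell.
        if cell in cells:
            return False
        cells.add(cell)
    # Condition 3 (assumed interpretation): the covered cells fill the
    # whole bounding rectangle of the grid.
    cols = max(c for c, _ in cells) + 1
    rows = max(r for _, r in cells) + 1
    return len(cells) == cols * rows
```

A 2x2 grid of 10x10 tiles passes this check; removing one tile, shifting a tile by half a block, or changing one tile's height makes it fail.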
| 32 | |
| 33 | Each WKT Raster can have zero or more overviews. Each overview is another raster table, like the original one. Currently, all the information about the overviews of a WKT Raster table is stored in another metadata table, called [http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_OVERVIEWSMetadataTable RASTER_OVERVIEWS]. However, overview support is still a working specification, not a final accepted one. |
| 34 | |
| 35 | As for raster data storage, the WKT Raster format offers two ways of storing the data of a raster band: |
| 36 | * in-db storage: The band data is stored in the database, in serialized format with the rest of the band's data. |
| 37 | * out-db storage: The band data is registered in an external file residing in the file system. |
| 38 | |
| 39 | At present, only the first storage system is allowed in the WKT Raster code, but the second one is planned to be ready this year (2009). In the meantime, I've modified the loader script to add out-db support. The patch for the original code is in [http://trac.osgeo.org/postgis/ticket/227 PostGIS ticket 227].
11 | | == Weekly reports == |
| 44 | |
| 45 | == Using the GDAL WKT Raster driver == |
| 46 | |
| 47 | If you want to use the GDAL WKT Raster driver, you must provide a '''connection string''' as the Dataset's name. The syntax of this connection string is (mind the quotes): |
| 48 | |
| 49 | |
| 50 | {{{ |
| 51 | PG":host='<host>' dbname='<dbname>' user='<user>' password='<password>' table='<raster_table>' [where='sql_where' mode='working_mode']" |
| 52 | }}} |
| 53 | |
| 54 | Note that the part of the string before "table='" is a libpq-style connection string. That means that you can change the order of these fields (dbname, user, password, host), or leave out unnecessary ones (like password, in some cases). The rest of the connection string must follow the syntax and order shown above. |
| 55 | |
| 56 | The "where" option filters the rows of the raster table. Any SQL WHERE expression is valid. The "mode" option tells the driver the expected arrangement of the raster table. As the driver currently works with only one table arrangement (regularly blocked tables), you can omit this option, or use it with the value "REGULARLY_TILED_MODE". With any other value, the driver won't work. |
| 57 | |
| 58 | You must use this dataset name format with all the GDAL tools, such as gdalinfo, gdal_translate, gdalwarp, etc. |
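
As a minimal sketch of the rules above, the following helper assembles a connection string with the libpq-style fields first and then "table", "where" and "mode" in the required order. The function name `build_connection_string` and its parameters are hypothetical, and the result is the string as the driver would receive it after a shell removes the outer double quotes. No escaping of quotes inside the values is attempted, since how the driver handles them is not stated here.

```python
def build_connection_string(table, dbname, host=None, user=None,
                            password=None, where=None, mode=None):
    """Build a dataset name for the GDAL WKT Raster driver (illustrative helper)."""
    def quote(value):
        # Values are wrapped in single quotes, as in the syntax shown above.
        return "'" + str(value) + "'"

    parts = []
    # libpq-style fields: their order is free and unneeded ones may be omitted.
    for key, value in (("host", host), ("dbname", dbname),
                       ("user", user), ("password", password)):
        if value is not None:
            parts.append(key + "=" + quote(value))
    # Driver-specific fields: the order table, where, mode must be kept.
    parts.append("table=" + quote(table))
    if where is not None:
        parts.append("where=" + quote(where))
    if mode is not None:
        parts.append("mode=" + quote(mode))
    return "PG:" + " ".join(parts)


print(build_connection_string("my_rasters", "gisdb", host="localhost",
                              user="gis", mode="REGULARLY_TILED_MODE"))
```

The printed string is what would be passed as the dataset name to a tool such as gdalinfo.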
| 59 | |
| 60 | == TODO Tasks == |
| 61 | |
| 62 | The relevant TODO tasks, in planned order of implementation, are: |
| 63 | |
| 64 | 1. Block reading improvement. Avoid one server round trip for each block read call. (IRasterIO overriding). |
| 65 | 1. Out-db data reading support. (IRasterIO overriding). |
| 66 | 1. RASTER_COLUMNS table update with the values read on data blocks, if needed. |
| 67 | 1. Support for reading non-regularly blocked rasters. |
| 68 | 1. Bulk update of data blocks when any raster-related value changes on RASTER_COLUMNS table. |
| 69 | 1. Support for creating new WKT Rasters. |
| 70 | |
| 71 | This list is always under revision. |
| 72 | |
| 73 | == Development status at last update == |
| 74 | |
| 75 | The GDAL WKT Raster driver is able to read regularly blocked rasters stored in-db. |
| 76 | |
| 77 | == Project plan == |
| 78 | |
| 79 | ||'''Objectives and tasks'''||'''Approx. Schedule'''||'''Status'''|| |
| 80 | ||'''Objective 1 - Prototype of GDAL WKT Raster read-only driver'''||'''17th August'''||'''Done'''|| |
| 81 | ||'''Objective 2 - Block reading improvement'''||'''undecided'''||'''On going'''|| |
| 82 | ||'''Objective 3 - Out-db data reading support'''||'''undecided'''||'''On going'''|| |
| 83 | ||'''Objective 4 - RASTER_COLUMNS table update'''||'''undecided'''||'''Todo'''|| |
| 84 | ||'''Objective 5 - Support for reading non-regularly blocked rasters'''||'''undecided'''||'''Todo'''|| |
| 85 | ||'''Objective 6 - Bulk update of data blocks'''||'''undecided'''||'''Todo'''|| |
| 86 | ||'''Objective 7 - Support for creating new WKT Rasters'''||'''undecided'''||'''Todo'''|| |
| 87 | |
| 88 | |
| 89 | == GSoC 09 Weekly reports == |
22 | | [http://www.gis4free.org/blog/2009/08/11/gsoc-09-weekly-report-11-3107-0708/ Weekly report #11 (31/07 - 07/08)] |
23 | | |
24 | | == General overview == |
25 | | The main goal of this project is to create a new raster driver in the GDAL library. This driver will deal with a new type of data: the new PostGIS WKT Raster type (an extension of PostGIS aiming at developing support for raster).
26 | | |
27 | | First issue is to match the GDAL Dataset architecture with the new WKT Raster type. So, basically, what is a "WKT Raster"? |
28 | | * A 'complete' image. |
29 | | * An image 'tile'. |
30 | | * A raster object. A new type of object, resulting from the rasterization of a vector coverage. |
31 | | |
32 | | And a WKT Raster always has:
33 | | * '''one or more raster bands'''. |
34 | | * Associated metadata, that includes '''georeference information''' |
35 | | |
36 | | Now, what is a GDAL Dataset? ''An assembly of related raster bands and some information common to them all'' (metadata) |
37 | | |
38 | | So, the relation between "WKT Raster object" and "GDAL Dataset" seems to be very clear. |
39 | | |
40 | | But there is an important issue here. The WKT Raster objects will be stored in PostgreSQL tables. So, a table with a column of type WKT Raster may be seen as:
41 | | a. An image warehouse of untiled and (possibly) unrelated images. |
42 | | a. An irregularly tiled raster coverage. |
43 | | a. A regularly tiled raster coverage. |
44 | | a. A rectangular regularly tiled raster coverage. |
45 | | a. A tiled image. --> NOT CONSIDERED FROM NOW. ONLY 1-TABLE-RASTERS |
46 | | a. A raster object coverage resulting from the rasterization of a vector coverage. |
47 | | |
48 | | Options c and d should be the easiest ones for the GDAL driver to read. They are rasters with a "regular blocking" structure. When a raster layer of these types is loaded:
49 | | 1. All loaded tiles have the same width and height,
50 | | 1. No tiles overlap, and their upper left corners follow a regular block grid,
51 | | 1. The global extent of the layer is rectangular and not rotated. |
52 | | |
53 | | So, for the basic version of the GDAL WKT Raster driver, we can focus on these tasks (reading rasters of types c and d). For this reason, we specify several working modes for it. They aren't totally defined yet, but, at least, we should use:
54 | | - REGULARLY_BLOCKING mode: In this mode, our raster is regularly tiled. So, it fulfills the three previous properties. |
55 | | - NON_REGULARLY_BLOCKING mode: In this mode, our raster is non-regularly tiled. The tiles can have different size and may overlap, so, the global extent of the raster isn't necessarily rectangular. |
56 | | |
57 | | Maybe we'll need more working modes in the future, for non-tiled rasters. |
58 | | |
59 | | == Implementing the Dataset == |
60 | | |
61 | | NOTE: ONLY REGULAR-BLOCKING RASTERS, FROM NOW |
62 | | |
63 | | The Dataset performs the following operations: |
64 | | |
65 | | 1. Check connection string format |
66 | | 1. Parse connection string, extracting useful information (table name, optional sql where part, working mode) |
67 | | 1. Check working mode. If it is not regularly_tiled mode, stop (for now).
68 | | 1. Open a database connection.
69 | | 1. Perform some security and integrity checks: whether the database has the PostGIS extensions, whether the table exists, whether it is registered in the RASTER_COLUMNS table, whether it has a GIST index, etc. Suggestions accepted and appreciated.
70 | | 1. Fetch raster properties from RASTER_COLUMNS table. If the table is not registered in RASTER_COLUMNS, as we are working only in regularly_tiled mode, we should stop.
71 | | 1. Populate its georeference information, so that the GetProjectionRef and GetGeoTransform methods provide correct information.
72 | | 1. Try to fetch all the blocks covered by the raster extent and store them as WKTRasterWrapper objects in a Dataset's array.
73 | | 1. Create the raster bands, paying attention to pixel types and nodata values. |
74 | | 1. Create the overviews as children datasets, if needed. |
75 | | |
76 | | I'll need more testing, anyway. |
77 | | |
78 | | == Implementing the RasterBand == |
79 | | |
80 | | While the Dataset opens the connection to the database, the RasterBand reads blocks of data. So, the key method here is IReadBlock. This method:
81 | | |
82 | | 1. Try to get the blocks from the Dataset cache (the array of WKTRasterWrapper objects). If that fails, do the rest of the steps:
83 | | 1. Get pixel size.
84 | | 1. Transform pixel,line coordinates into coordinates of the raster's reference system, by using the proper methods of Dataset, and get the coordinates of the lower left and upper right corners in map units
85 | | 1. Query for blocks that contain this block. As we only consider regularly_blocking rasters, the result will only be 1 block.
86 | | 1. Fetch the block in HEXWKB format and parse it to get the data. The HEXWKB format includes a raster information header on each block, so this information can be used for integrity checks.
87 | | 1. Look for the correct raster band and copy its block data into the buffer, taking care of the pixel size and endianness of the band data.
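
The coordinate transformation step above can be sketched with the standard GDAL geotransform formula. This is an illustration only, not the driver's code: `pixel_to_map` and `block_corners` are hypothetical names, and the corner computation assumes a non-rotated raster, as regular blocking requires.

```python
def pixel_to_map(gt, pixel, line):
    """Apply a GDAL-style 6-element geotransform to pixel/line coordinates."""
    x = gt[0] + pixel * gt[1] + line * gt[2]
    y = gt[3] + pixel * gt[4] + line * gt[5]
    return x, y


def block_corners(gt, block_col, block_row, block_width, block_height):
    """Return (lower_left, upper_right) of a block in map units.

    Assumes a non-rotated raster (gt[2] == gt[4] == 0).
    """
    # Upper left corner of the block's first pixel.
    ulx, uly = pixel_to_map(gt, block_col * block_width,
                            block_row * block_height)
    # Lower right corner of the block's last pixel.
    lrx, lry = pixel_to_map(gt, (block_col + 1) * block_width,
                            (block_row + 1) * block_height)
    lower_left = (min(ulx, lrx), min(uly, lry))
    upper_right = (max(ulx, lrx), max(uly, lry))
    return lower_left, upper_right


# Example: origin (100, 500), 2-unit pixels, north-up raster, 10x10 blocks.
gt = (100.0, 2.0, 0.0, 500.0, 0.0, -2.0)
print(block_corners(gt, 1, 0, 10, 10))
```

The two corners returned are the values that would go into the spatial query of the next step.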
88 | | |
89 | | I'll need more testing, anyway. |
90 | | |
91 | | == Wrappers for Dataset and RasterBand == |
92 | | |
93 | | The code became really complex after implementing full read-only support for regular-blocking rasters, and it was really inconvenient to deal with the hexwkb format in the RasterBand methods. So, I created two wrappers: WKTRasterWrapper and WKTRasterBandWrapper.
94 | | |
95 | | The WKTRasterWrapper represents a WKTRaster, with a raster header and an array of raster bands. The constructor of the class takes a hexwkb string as input, fills all the raster header's properties and creates the raster band wrappers. The ''WKTRasterBand'' class has been declared as a ''friend'' class, so it can modify any raster field. The most important part is a method called ''GetHexWkbRepresentation'', which returns the updated hexwkb string each time it is called. It will be useful to perform the inplace update, or to update the RASTER_COLUMNS table if needed.
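
To illustrate what filling the raster header from a hexwkb string involves, here is a rough Python sketch. It is not the wrapper's actual code: `RasterHeader` and `parse_hexwkb_header` are hypothetical names, and the field order (endianness byte, version, number of bands, scale, upper left corner, skew, SRID, width, height) is assumed from my reading of the RFC2 document linked on this page, so it should be checked against that document.

```python
import struct
from collections import namedtuple

# Assumed header fields, in the order they are assumed to appear after
# the leading endianness byte.
RasterHeader = namedtuple(
    "RasterHeader",
    "version n_bands scale_x scale_y ip_x ip_y skew_x skew_y srid width height",
)


def parse_hexwkb_header(hexwkb):
    """Decode the raster header found at the start of a hexwkb string."""
    data = bytes.fromhex(hexwkb)
    # First byte: 1 means little endian (NDR), otherwise big endian (XDR).
    byte_order = "<" if data[0] == 1 else ">"
    # uint16 version, uint16 band count, six float64 values, int32 SRID,
    # uint16 width, uint16 height; no padding with an explicit byte order.
    fields = struct.unpack_from(byte_order + "HH6diHH", data, 1)
    return RasterHeader(*fields)


# Build a sample little endian header and decode it again.
sample = struct.pack("<BHH6diHH", 1, 0, 1, 2.0, -2.0, 100.0, 500.0,
                     0.0, 0.0, 4326, 10, 10).hex()
print(parse_hexwkb_header(sample))
```

Choosing the byte order from the first byte is also the place where the endianness handling mentioned for the band data would start.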
96 | | |
97 | | The WKTRasterBandWrapper represents a raster band. This class is managed by WKTRasterWrapper class. It has one method, ''SetData'', that updates the data of the band. |
98 | | |
99 | | I think these wrappers will simplify the rest of the work. |
100 | | |
101 | | == Overviews == |
102 | | |
103 | | ''If the RASTER_COLUMNS "regular_blocking" value is true then "all blocks are equal sized, abutted and non-overlapping, started at top left origin", plus additional constraints. This regular blocking capability raises the possibility of having very large contiguous raster coverages (made up of many individual WKTRaster-s) which, in turn, raises potential performance problems. Other raster formats counter this by having overviews; a concept that is already supported by GDAL.'' |
104 | | |
105 | | I think that this driver must support overviews, because I suppose it will be common to have large images stored in the database, which will produce large datasets when read. Anyway, this issue was being discussed until May 20th (http://postgis.refractions.net/pipermail/postgis-devel/2009-May/005629.html), and the final decision seems to be to have an additional table for overviews (this is the way in which Mateusz' script manages it). To follow this issue: http://trac.osgeo.org/postgis/wiki/WKTRaster/SpecificationWorking01#RASTER_OVERVIEWSMetadataTable.
106 | | |
107 | | My implementation of overviews (under development just now) follows these steps:
108 | | 1. Create one Dataset per overview in the Open method of WKTRasterDataset class. It is necessary to check the number of overviews of the given table, using the RASTER_OVERVIEWS metadata table, and to set the name of the overviews table in each created Dataset.
109 | | 1. Create the rasterbands for each overview Dataset. Change the pixel size by using the overview factor. |
| 100 | [http://www.gis4free.org/blog/2009/08/11/gsoc-09-weekly-report-11-3107-0708/ Weekly report #11 (31/07 - 07/08)][[BR]] |
| 101 | [http://www.gis4free.org/blog/2009/08/17/gsoc-09-final-report-0708-1708/ Final GSoC 09 Weekly report (07/08 - 17/08)] |
112 | | == Out-db rasters == |
113 | | |
114 | | I'm working on the implementation of the support for out-db rasters too. The point is to read the band data from a file in the filesystem, instead of fetching it from the database. The files will be fully-qualified raster files, maybe TIFF files. This issue is undecided and under development just now (July 26th).
115 | | |
116 | | == Project plan (The tasks will be revised for the end of the GSoC) == |
117 | | |
118 | | ||'''Objectives and tasks'''||'''Approx. Schedule'''||'''Status'''|| |
119 | | ||'''Objective 0 - Prepare basic environment'''||'''28th June'''||'''Done'''||
120 | | ||Create basic environment for new GDAL driver|| 21st June||Done||
121 | | ||Create testing environment for debugging driver development|| 28th June||Done||
122 | | ||'''Objective 1 - Prototype read only support for regular blocking rasters'''||'''15th July'''||'''Done'''||'''midterm evaluation'''|| |
123 | | ||Dataset: Connection with database and creation of Raster Bands objects (regular blocking rasters)||6th July||Done|| |
124 | | ||RasterBand: Query the correct block and fetch the raster data||15th July||Done|| |
125 | | ||'''Objective 1.1 - Support of different pixel data types, take care of byte swapping'''||'''17th July'''||'''Done'''|| |
126 | | ||'''Objective 2 - Support access to overviews'''||'''26th July'''||'''Done'''|| |
127 | | ||'''Objective 3 - Rasters inplace update'''||'''26th July'''||'''Done'''|| |
128 | | ||Objective 3.1 - Block caching in Raster Band||2nd-10th August||on going||
129 | | ||'''Objective 4 - Support for out-db rasters'''||'''2nd-10th August'''||'''on going'''||
130 | | ||'''Objective 5 - Testing code and documentation'''||'''17th August'''||'''on going'''||'''final evaluation'''|| |
131 | | ||'''Objective 6 - Read only support for non-regular blocking rasters'''||'''undecided'''||'''todo'''|| |
132 | | ||'''Objective 7 - Support for creating new rasters'''||'''undecided'''||'''todo'''|| |
133 | | |
134 | | Notes: |
135 | | * Undecided tasks will be completed after the end of GSoC.
136 | | |
137 | | == Acknowledgments == |
138 | | |
139 | | The quotes of this page are taken from: |
140 | | * The WKT Raster specification documents, from Pierre Racine. |
141 | | * The GDAL Driver implementation tutorial: http://www.gdal.org/gdal_drivertut.html, from Frank Warmerdam |
142 | | * The GDAL Data model: http://www.gdal.org/gdal_datamodel.html, from Frank Warmerdam |
143 | | * Comments sent by mail or posted in my blog by Tamas Szekeres, Mateusz Loskot, Frank Warmerdam, Even Rouault, Pierre Racine. |
144 | | |
145 | | I'm sure I've forgotten someone, but many thanks :-)
146 | | |
147 | | == Participants info == |
| 104 | == GSoC 09 Participants info == |