wiki:WKTRaster/SpecificationWorking01

Version 127 (modified by pracine, 14 years ago) ( diff )

PostGIS WKT Raster Working Specifications for Beta 0.1.6

  1. Objective 0.1.6c - Being able to load rasters in the database
  2. Objective 0.1.6d - Being able to get all the properties of a raster (all the “Get” functions).
  3. Objective 0.1.6e - Being able to intersect vector and raster to produce vector.
  4. Objective 0.1.6g - Being able to read and write WKT Raster with GDAL driver.
  5. Objective 0.1.6h - Being able to set all the properties of a raster.
  6. RASTER_COLUMNS Metadata Table
  7. RASTER_OVERVIEWS Metadata Table
  8. Why is it important to avoid linking with PostGIS?
  9. The pros and cons of using GDAL vs implementing our own raster services
  10. Specification Comments following the Code Sprint

Objective 0.1.6c - Being able to load rasters in the database

gdal2wktraster

A prototype of the translation utility implemented in Python and with use of GDAL and its bindings to Python. Create an SQL commands output to create a table of raster. As input raster file, all GDAL formats are accepted. The script is available as gdal2wktraster.py script.

Open Questions:

  1. Should we change names of options to strictly follow those used by raster2pgsql and shp2pgsql? Pierre: Yes! We should follow PostGIS tracks as much as possible in everything we do in this project.
  2. How can I import all the bands from a multiband TIFF?

USAGE:

gdal2wktraster.py -r <RASTERFILE> -t [<SCHEMA>].<TABLE> [<OPTIONS>]

-r <RASTERFILE> Specifies input raster file. Multiple -r options can be specified for a number of input files or wildcards can be used (? and *).
-t [<SCHEMA>].<TABLE> Name of destination table in with or without target schema specified.

OPTIONS:

Raster processing: Optional parameters used to manipulate input raster dataset

-s <SRID> Set the SRID field of output raster. Default is -1.
-b <NBBAND> Specify the number of band to be extracted from input raster file.
-k <BLOCK_SIZE> Specify the size of the block of the input raster, assuming regular blocking mode. Must be specified as WIDTHxHEIGHT.
-R Register the raster as a filesystem (out-db) raster. Absolute path to the raster and georeferencing informations are stored instead of the raster actual data. This storage system is only allowed in SVN trunk version right now.
-l <OVERVIEW_LEVEL> Create overview tables named as o_<LEVEL>_<RASTER_TABLE> and populate it with GDAL-provided overviews. Regular blocking only.

Database processing: Optional parameters used to manipulate database objects

-c Creates a new table and populates it with raster input file(s), this is the default if you do not specify any options.
-d Drops the table, then recreates it and populates it with current raster file(s) data.
-f <COLUMN> Name of target column for raster data. Default column name is rast.
-F Add a "filename" column containing the original name of the loaded raster file.
-I Create a GiST index on the raster column.
-M Issue VACUUM command against all generated tables.
-V Create RASTER_OVERVIEWS table used to store overviews metadata.

Miscellanous: Other optional parameters

-e <ENDIAN> Control endianness of generated binary output of raster; specify 0 for XDR (big-endian) and 1 for NDR (little-endian). Only NDR output is supported right now.
-w <VERSION> Specify version of WKT Raster protocol. Default is 0; only this value is supported right now.
-o <FILE> Output file for generated SQL commands. If not specified, stdout is assumed.
-v Switch on excessively verbose mode, useful for debugging.
-h Display this help screen.

raster2pgsql

Should be as much similar to shp2pgsql as it is possible.

  • For Beta 0.1: TIFF only, only the first band, options for tiling.

Open Questions:


  1. Pierre: Should there be an option to be able to write directly to the database instead of writing a .sql file by default?

  2. Pierre: If we use GDAL, should we allow more file format than only JPEG and TIFF?

    Pierre:I guess this comes for free. We might encounter some problems with some format however. I don't know which one yet.

USAGE:
raster2pgsql [<options>] rasterfile [rasterfile…] [<schema>.]<table>
Create an SQL commands file to create a table of raster. If rasterfile is multiband and –b is not specified, every band are inserted. Multiple band can also be specified using multiple filenames (rasterfile1 is the first band, rasterfile2 the second, etc…). Can process multiple file from a folder. georeference (and pixel size) must exist directly in the files or in a companion World File.

OPTIONS:
-s <srid> Set the SRID field. Default is -1.
-b <nbband> Specify the number of band. The number of rasterfile must correspond to this number.
-P <pixeltypes> Specify the pixels types in which to store each band. Ex. ‘8-bit unsigned integer,16-bit float’. conversion may happens.
-n <nodata values> Specify the nodata value for each bands. Ex. ‘0,0.0’. Default to ‘none’ for each band.
-t <pixels> Divide rasters into <pixels>x<pixels> tiles, one tile per row. Default is to store whole rasters as one row.

(-d|a|b|c|p) Mutually exclusive inserting options:

-d Drops the table, then recreates it and populates it with current raster file data.
-a Appends raster file into current table, must be exactly the same pixel size, number of band, NODATA value and pixel type.
-B Appends raster files as a new bands. When tiled with the –t option, the new band is inserted tiled in the same way as the original band.
-c Creates a new table and populates it, this is the default if you do not specify any options.
-p Prepare mode, only creates the table.

-r <raster_column> Specify the name of the raster column (mostly useful in append mode).
-D Use PostgreSQL dump format (defaults to sql insert statements).
-I Create a GiST index on the bounding box of the raster column.
-? Display this help screen


Objective 0.1.6d - Being able to get all the properties of a raster (all the “Get” functions).

ST_SRID(raster|geometry) → integer

Return the SRID associated with the raster.

ST_Width(raster) → integer

Return the width of the raster.

ST_Height(raster) → integer

Return the height of the raster.

ST_PixelSizeX(raster) → float64

Return the georeference's X pixel size of the raster. See.

ST_PixelSizeY(raster) → float64

Return the georeference's Y pixel size of the raster. See.

ST_SkewX(raster) → float64

Return the georeference's X skew (or rotation parameter in some documentation).See.

ST_SkewY(raster) → float64

Return the georeference's Y skew (or rotation parameter in some documentation). See.

ST_UpperLeftX(raster) → float64

Return the georeference's X-coordinate of the center of the upper left pixel. See.

ST_UpperLeftY(raster) → float64

Return the georeference's Y-coordinate of the center of the upper left pixel. See.

ST_GeoReference(raster, 'ESRI'|'GDAL') → string

Return the georeference of the raster as a string representing the 6 doubles of an equivalent world file (including the carriage return). See.

ST_NumBands(raster) → integer

Return the number of band included in the raster.

ST_BandPixelType(raster, integer) → string

Return the pixel type of the specified 1-based Nth band of raster. Band index is 1-based. The function returns one of the following values:

  • 1BB - 1-bit boolean
  • 2BUI - 2-bit unsigned integer
  • 4BUI - 4-bit unsigned integer
  • 8BSI - 8-bit signed integer
  • 8BUI - 8-bit unsigned integer
  • 16BSI - 16-bit signed integer
  • 16BUI - 16-bit unsigned integer
  • 32BSI - 32-bit signed integer
  • 32BUI - 32-bit unsigned integer
  • 16BF - 16-bit float
  • 32BF - 32-bit float
  • 64BF - 64-bit float

ST_BandNoDataValue(raster, integer) → float32

Return the NoDataValue of the specified 1-based Nth band of raster. Band index is 1-based. The value is always returned as a float32 even if the pixel type is integer.


Objective 0.1.6e - Being able to intersect vector and raster to produce vector.

List of PostGIS functions similar or related to ST_GetBBox(), ST_Envelope() and ST_Polygon():

ST_Boundary(geometry) (not really - always return a geometry a dimension lower - i.e. the boundary of a polygon is a polyline.)

ST_box(geometry) (return a PostgreSQL box object)

ST_box2d(geometry) (return a box2d object)

ST_box3d(geometry) (return a box3d object)

getBBOX(geometry) (Deprecation in 1.2.3)

ST_Envelope(geometry) Returns the minimum bounding box for the supplied geometry, as a geometry. The polygon is defined by the corner points of the bounding box ((MINX, MINY), (MINX, MAXY), (MAXX, MAXY), (MAXX, MINY), (MINX, MINY)). (PostGIS will add a ZMIN/ZMAX coordinate as well). In PostGIS, the bounding box of a geometry is represented internally using float4s instead of float8s that are used to store geometries. The bounding box coordinates are floored, guarenteeing that the geometry is contained entirely within its bounds. This has the advantage that a geometry's bounding box is half the size as the minimum bounding rectangle, which means significantly faster indexes and general performance. But it also means that the bounding box is NOT the same as the minimum bounding rectangle that bounds the geometry.

envelope(geometry) (Deprecation in 1.2.3)

ST_extent(geometry set)

ST_ExteriorRing(geometry)

ST_ConvexHull(geometry)

Function definitions for WKT Raster 0.1.6

ST_GetBBox(raster) (not yet implemented but planned) is replaced with ST_Envelope() since there is no more an equivalent ST_GetBBox function in PostGIS.

ST_Raster_to_box2d(raster) should be renamed ST_box2d(raster).

ST_Box2D(raster) → BOX2D - Returns a BOX2D representing the extent of the raster.

If the raster is rotated, the result is a box2d enclosing the rotated raster.

The only difference with ST_Envelope(raster) is that ST_Box2D(raster) returns a BOX2D, not a geometry.

Implementation details

This function is not anymore available in PostGIS 1.5 and is therefore probably not worth implementing.

This function replaces ST_Raster_to_box2d() and should be used for the "raster AS box2d" cast. It should also be used instead of ST_raster_envelope() when creating indexes in gdal2wktraster.py (to be tested).

ST_Envelope(raster) → polygon geometry - Returns the minimum bounding box for the supplied raster, as a geometry.

If the raster is rotated, the envelope is a non-rotated box enclosing the rotated raster.

The only difference with ST_Box2D(raster) is that ST_Envelope(raster) returns a geometry, not a BOX2D.

Implementation details

This function should be implemented as a SQL or a PL/pgSQL function looking like: 'SELECT ST_geometry(ST_Box2D(rast))'

There is already a ST_raster_envelope() function but this has actually the behavior of the ST_ConvexHull(raster) described below. There is also a ticket (#348) related to this function: http://trac.osgeo.org/postgis/ticket/348

ST_ConvexHull(raster) → polygon geometry - Returns the minimum convex geometry that encloses every pixel from the raster.

The resulting polygon DOES NOT TAKE the NODATA values into account. The resulting polygon enclose every pixel, even those containing NODATA values. The resulting polygon can be rotated or not depending on the rotation of the raster. If the raster is not rotated ST_ConvexHull(geometry) is (almost) equivalent to ST_Envelope(raster) (ST_Envelope floor the coordinates and hence add a little buffer around the raster).

Implementation details

This function is actually already implemented by the ST_raster_envelope() function which is wrongly named. There is also a ticket (#348) related to this function: http://trac.osgeo.org/postgis/ticket/348

ST_Polygon(raster, integer) → polygon geometry - Returns a geometry encomparsing every pixel having a significant value (different than the NODATA value) for the provided band.

This polygon geometry might contain holes if some internal areas of the raster contain pixels with NODATA values.

Variant 1: ST_Polygon(raster) → polygon geometry — default to band # 1

Implementation details

Could also be named ST_Outline().

This function could be roughly implemented as a SQL function looking like 'SELECT ST_Collect((ST_DumpAsPolygons(raster)).geom)'. For sure a more specialised version could be faster than ST_AsPolygon.

ST_DumpAsPolygons(raster, integer) → geomval set - Returns a set of "geomval" value, one for each contiguous group of pixel sharing the same value for the provided band.

This is a set-returning function (SRF). A "geomval" value is a complex type composed of a polygon geometry (one for each contiguous group of pixel sharing the same value) and the value associated with this geometry. These values are always returned as a value of type double precision. The shape of each polygon follow pixels edges.

Variant 1: ST_DumpAsPolygons(raster) → geomval set — default to band # 1

This function should be used with precaution on raster with float pixel type as it could return one "geomval" (or polygon) per pixel. This kind of raster should be reclassified (using the planned ST_Reclassify(raster,…) function) before using ST_DumpAsPolygons().

Implementation details

This function is at the base of the first version of ST_Intersection which compute the intersection between a raster and a geometry.

To avoid linking directly with PostGIS (see "Why avoid to link with PostGIS?" below) It should be implemented as a SQL wrapper around DumpAsWKTPolygons() looking something like this:

CREATE TYPE geomval AS (geom geometry, val float8);
CREATE TYPE wktgeomval AS (wktgeom text, val float8);

CREATE OR REPLACE FUNCTION ST_DumpAsPolygons(rast raster) RETURNS SETOF geomval AS $$

SELECT ST_GeomFromText(wktgeomval.wktgeom), wktgeomval.val FROM DumpAsWKTPolygons(%1) AS wktgeomval;

$$ LANGUAGE SQL;

It should then be used like this:

SELECT (gv).val, (gv).geom FROM (SELECT ST_DumpAsPolygons(rast) gv FROM sometable) foo

DumpAsWKTPolygons(raster, integer) → wktgeomval set - Returns a set of "geomval" value, one for each group of pixel sharing the same value for the provided band.

Variant 1: DumpAsWKTPolygons(raster) → wktgeomval set — default to band # 1

This is a set-returning function (SRF). A "wktgeomval " value is a complex type composed of a the wkt representation of a geometry (one for each group of pixel sharing the same value) and the value associated with this geometry. These values are always returned as a value of type double precision. The shape of each polygon follow pixels edges. Areas with NODATA values are not included in the resulting set.

This function should be used with precaution on raster with float pixel type as it could return one "geomval" (or polygon) per pixel. This kind of raster should be reclassified (using the planned ST_Reclassify(raster,…) function) before using DumpAsWKTPolygons().

DumpAsWKTPolygons() should not be used directly. Use ST_DumpAsPolygons() instead.

Implementation details

This function construct the WKT strings representing the geometries grouping pixels of a raster having the same value. It does not directly construct geometries and therefore prevent WKT Raster from having to link with liblwgeom.a (see "Why avoid to link with PostGIS?" below) It should also return the value associated with this WKT polygon.

It should be implemented as a C function passing the raster to GDAL and converting each polygon produced by the GDALPolygonize function to a WKT string accompagned by its corresponding value in a wktgeomval type. See http://www.postgresql.org/docs/8.4/interactive/xfunc-c.html#AEN44970 on how to return rows and sets in a C function.

ST_Intersects(raster, integer, geometry) → boolean - Returns TRUE if the geometry and the raster "spatially intersect" - (share any portion of space) and FALSE if they don't (they are disjoint).

Variant 1: ST_Intersects(geometry, raster, integer) → boolean — the third parameter is the raster band number

Variant 2: ST_Intersects(geometry, raster) → boolean — default to band # 1

Variant 3: ST_Intersects(raster, geometry) → boolean — default to band # 1

Variant 4: ST_Intersects(raster, raster) → boolean

This function permform a full intersection test and proceed in three steps to determine if the raster intersects with the geometry:

1) First, it checks if the bounding box of the raster intersects with the bounding box of the geometry.

2) If the first test returns TRUE, it checks if the geometry returned by ST_ConvexHull(raster) intersects with the geometry.

3) If the second test returns TRUE, it checks if the geometry returned by ST_Polygon(raster, integer) intersects with the geometry. This test is slower since it involve the computation of the raster shape and it might involve the geometry shape. This test takes NODATA values into account. i.e. A geometry intersecting only with a NODATA value area of a raster is NOT actually not intersecting with this raster. This behavior may be controled by limiting the test to certain conditions as stated below.

Variant 4 proceeds in a very similar way except that convex hulls of both rasters are computed and compared in step 2) and both shape of raster are computed and compared in step 3).

If you want to limit the intersection test to the first condition, simply use the && operator. The raster and the geometry will be casted to their respective bounding box (box2d objects):

raster && geometry

If you want to limit the intersection test to the second condition you can write:

ST_Intersects(ST_ConvexHull(raster), geometry)

Implementation details

This function should be implemented as a pl/pgSQL function performing the three test described one after and conditionally to the other.

It might be faster to skip test 2) if this test is not signicantly faster than test 3).

ST_Intersection(raster, integer, geometry) → geometry - Returns a set of "geomval" value representing the shared portion of the geometry and the raster band areas sharing a common meaningfull values. The integer parameter is the band number of the raster.

Variant 1:

ST_Intersection(raster, geometry) → geometry — default to band # 1
ST_Intersection(geometry, raster, integer) → geometry — the integer parameter is the band number of the raster
ST_Intersection(geometry, raster) → geometry — default to band # 1

Variant 2:

ST_Intersection(raster, integer, geometry, 'raster') → raster — the integer parameter is the band number of the raster
ST_Intersection(raster, geometry, 'raster') → raster — default to band # 1

Variant 3:

ST_Intersection(geometry, raster, integer, 'raster') → raster — the integer parameter is the band number of the raster
ST_Intersection(geometry, raster, 'raster') → raster — default to band # 1

Variant 4:

ST_Intersection(raster, integer, raster, integer) → raster — the integer parameters are the band number of the rasters
ST_Intersection(raster, raster, integer) → raster — default first raster to band # 1
ST_Intersection(raster, integer, raster) → raster — default second raster to band # 1
ST_Intersection(raster, raster) → raster — default both rasters to band # 1

Variant 5:

ST_Intersection(raster, integer, raster, integer, 'geometry') → geometry
ST_Intersection(raster, raster, integer, 'geometry') → geometry — default first raster to band # 1
ST_Intersection(raster, integer, raster, 'geometry') → geometry — default second raster to band # 1
ST_Intersection(raster, raster, 'geometry') → geometry — default both raster to band # 1

This is a set-returning function (SRF). It returns a set of "geomval" representing the point set intersection of the geometry and the areas of the raster sharing a common meaningfull value (NODATA values ARE taken into account). The raster is first polygonised using ST_DumpAsPolygons(raster) and ST_Intersection(geometry, geometry) is then performed between the provided geometry and all the geometries resulting from the polygonisation of the raster.

If the geometries do not share any space (are disjoint), then an empty geometry collection is returned.

ST_Intersection in conjunction with ST_Intersects is very useful for clipping geometries such as in bounding box, buffer, region queries where you only want to return that portion of a geometry that sits in a country or region of interest.

If you only want to compute the intersection between the convex hull of the raster without polygonising it to group of pixels sharing a common value, do:

ST_Intersection(ST_ConvexHull(raster), geometry)

If you only want to compute the intersection between the shape of the raster without polygonising it to group of pixels sharing a common value, do:

ST_Intersection(ST_Polygon(raster), geometry)

Implementation details

This function should be implemented as a pl/pgSQL function (possibly only a SQL function) performing the intersection between the provided geometry and the table generated by ST_DumpAsPolygons(raster).

Priority 1 should be given to the raster/geometry variants (original and variant 1) returning a geometry.

Priority 2 should be given to the raster/raster variants (5) returning a geometry.

Priority 3 should be given to the other variants (2, 3 & 4) returning a raster. Variant 2 & 3 should be implemented after the planned ST_AsRaster(geometry) function. What should be the raster result of the intersection of two rasters is still to be specified.


Objective 0.1.6g - Being able to read and write WKT Raster with GDAL driver.

Currently (January 2010) a basic version of GDAL WKT Raster driver is available

The final GDAL WKT Raster driver version must be able to:

  • Read a WKT Raster from database (in-db/out-db, regular/irregular blocking)
  • Create a new WKT Raster in database (in-db/out-db, regular/irregular blocking)

Objective 0.1.6h - Being able to set all the properties of a raster.

ST_SetSRID(raster|geometry, integer)

Set the spatial reference on the specified raster. This corresponds with the ST_SetSRID method in regular PostGIS.

ST_SetPixelSize(raster, pixelsize)

Variant 1: ST_SetPixelSize(raster, xsize, ysize)

Set the pixel size (cell size) on the specified raster. The base method sets the X and Y pixel size to the same value. The first variant sets the pixel sizes to the individually specified x and y sizes. The second variant sets the pixel size to the pixel size of the 2nd raster parameter.

Open Question:

DZ: Should there be corresponding methods for X/Y sizes? ST_SetXPixelSize(…) and ST_SetYPixelSize(…)

PR: In addition to ST_SetPixelSize(raster, xsize, ysize) or in replacement of?

DZ: In addition to?
PR: Maybe we are fine with ST_SetPixelSize(raster, xsize, ysize)…

PR: Could you provide a full SQL example of ST_SetPixelSize(raster, raster)? And in general of a function getting arguments for the first raster from another raster (like the below ST_Resample(raster, method, raster))

DZ: A SQL example for this is in the regression tests in patch #431: test/regress/rt_set_properties.sql, line 52. I can come up with some more examples, if needed.

DZ: After discussion, will drop the variant that copies the pixel size from a raster parameter.

ST_SetBandNoDataValue(raster, band, value)

Set the data value that corresponds to NODATA in the given raster's band.

ST_BandHasNoDataValue(raster, band)

Return true if the value stored as a NODATA value must be interpreted as a real NODATA value.

ST_SetBandHasNoDataValue(raster, band)

Set the NODATA value as a real NODATA value.

ST_SetGeoReference(raster, georef)

Variant 1: ST_SetGeoReference(raster, georef, format)

Set the georeference fields of the raster (pixel size, rotation, upper left location) all in one pass. The 'format' parameter is optional, and defaults to 'GDAL' georeference format if it is not specified.

If this method does not recieve a georeference string with 6 parameters (or if one of the parameters is not numeric), it raises an exception "st_setgeoreference requires a string with 6 floating point values". If this method recieves a null raster, it does nothing, and returns a null raster.

ST_SetSkew(raster, skewx, skewy)

Variant 1: ST_SetSkew(raster, skew)

Set the skewx and the skewy of the raster's georeference. Variant 1 sets the X and Y skew to the same value.

Open Question:

DZ: Should there be corresponding methods for X/Y skews in addition to this general method? ST_SetSkewX(…) and ST_SetSkewY(…)?

DZ: No, follow the pattern for ST_SetPixelSize

ST_SetRotation(raster, angle)

Set the rotation of the raster. This method actually derive values for pixelsizex, pixelsizey, skewx and skewy based on the provided rotation angle.

Open Question:

PR: The angle should be provided in radian or in degree?

ST_Rotation(raster) → float64

Return the georeference's rotation angle in (degree or radiant?) derived from the pixel size and the skew.

ST_SetUpperLeft(raster, left, top)

Variant 1: ST_SetUpperLeft(raster, raster)

Set the location of the upper left pixel (cell). The base method sets the X and Y coordinates to the individually specified left and top coordinates. The first variant sets the upper left coordinate to the upper left coordinate taken from the 2nd raster parameter.

Open Question:

DZ: Should there be corresponding methods for X/Y rotations in addition to this general method? ST_SetUpperLeftX(…) and ST_SetUpperLeftY()?

DZ: No, follow the pattern for ST_SetPixelSize


RASTER_COLUMNS Metadata Table

The following metadata table provides external applications and possibly internal tools the ability to discover tables with raster data, and to learn various information about those datasets.

Column Name Type Constraints Comments
r_table_catalog character varying(256) NOT NULL
r_table_schema character varying(256) NOT NULL
r_table_name character varying(256) NOT NULL
r_column character varying(256) NOT NULL
srid integer NOT NULL Use 0 for unknown SRID
pixel_types ARRAY=VARCHAR= NOT NULL an array of pixeltypes, one per band. The band count is implicit in the size of this array.
out_db boolean NOT NULL false if internal tiles, true if tiles are references to files outside the database
regular_blocking boolean NOT NULL false by default, true if all blocks are equal sized, abutted and non-overlapping, started at top left origin (see below)
nodata_values ARRAY=DOUBLE= an array of nodata values, one per band. The entry may be NULL to indicate no NODATA values.
pixelsize_x double width of a pixel in geounits (per SRID)
pixelsize_y double height of a pixel in geounits (per SRID)
blocksize_x integer the width of a block in pixels (NULL if irregular)
blocksize_y integer the height of a block in pixels (NULL if irregular)
extent GEOMETRY a polygon geometry containing all raster tiles, or NULL if predefined bounds are not known. For "regular_blocking" cases this geometry will be a simple rectangle. In other cases it might be an irregular polygon.

If the regular_blocking field is true a number of restrictions are placed on the raster column that is defined:

  1. All tiles must have the same size (blocksize_x and blocksize_y).
  2. All tiles must be non-overlapping, and appear on regular block grid.
  3. The top left block must start at the top left corner of the extent.
  4. The right most column, and bottom row of blocks may have portions that extend beyond the raster extent. These areas will be assumed to be NODATA and not part of the described raster.
  5. The extent field must be a simple rectangle (non-rotated).

It is permissible to for regular_blocking rasters to omit some blocks/tiles (sparse rasters) in which case the missing tiles should be assumed to be all NODATA or zero valued.


RASTER_OVERVIEWS Metadata Table

If the RASTER_COLUMNS "regular_blocking" value is true then "all blocks are equal sized, abutted and non-overlapping, started at top left origin", plus additional constraints. This regular blocking capability raises the possibility of having very large contiguous raster coverages (made up of many individual WKTRaster-s) which, in turn, raises potential performance problems. Other raster formats counter this by having overviews; a concept that is already supported by GDAL.

The overviews support was discussed on Martin Daly's (Cadcorp) initiative and is available in the postgis-devel archives under thread WKTRaster & overviews.

The following metadata table provides external applications and possibly internal tools the ability to discover tables that contain overview raster data, and to learn various information about those datasets.

Column Name Type Constraints Comments
o_table_catalog character varying(256) NOT NULL
o_table_schema character varying(256) NOT NULL
o_table_name character varying(256) NOT NULL
o_column character varying(256) NOT NULL Together define the column containing overview data
r_table_catalog character varying(256) NOT NULL
r_table_schema character varying(256) NOT NULL
r_table_name character varying(256) NOT NULL
r_column character varying(256) NOT NULL Together define the base raster coverage
out_db boolean NOT NULL false if internal tiles, true if tiles are references to files outside the database
overview_factor integer NOT NULL The integral factor of the overview, e.g. 2 means an overview is ½ resolution of the base raster coverage

The following restrictions apply:

  1. blocksize_x and blocksize_y for the overview are the same as that of the original, base raster coverage.
  2. pixelsize_x and pixelsize_y for the overview are calculated from the original, base raster coverage pixelsize_x and pixelsize_y (using overview_factor).
  3. Overviews should not be registered in the raster_columns metadata table.
  4. The raster_overviews metadata table can have more than one entry for any given base raster coverage, but they must all have different overview_factor values, e.g. 2, 4, 8, 16, etc.

No provision is provided, or suggested, for creating, updating, or deleting overviews.

Pros

  1. Overviews and base raster data tables are independent of each other, so overviews do not muddy the water for casual queries ("SELECT * FROM RASTER_COLUMNS").
  2. Overview and base raster data metadata will always match.

Cons

  1. Two metadata tables instead of one.
  2. Overviews have to share blocksizes with the base raster column.

Alternative Overviews Metadata

An alternative design is to extend the RASTER_COLUMNS metadata table with the following columns:

Column Name Type Constraints Comments
or_table_catalog character varying(256) NOT NULL
or_table_schema character varying(256) NOT NULL
or_table_name character varying(256) NOT NULL
or_column character varying(256) NOT NULL Together define the original ("or_") raster coverage
overview_factor integer NOT NULL The integral factor of the overview, e.g. 2 means an overview is ½ resolution of the base raster coverage

The overview_factor column could be removed by enforcing a divide by two rule, each overview being half the resolution of its parent.

Original rasters would be queried like this:

SELECT * FROM raster_columns WHERE overview_factor = 0

Overview rasters would be queried like this:

SELECT * FROM raster_columns WHERE or_table_name = 'myraster'

Pros

  1. Only one metadata table.
  2. Overviews can have blocksizes independent of their base raster data.

Cons

  1. A casual metadata query ("SELECT * FROM RASTER_COLUMNS") will return overviews and base raster data.
  2. Metadata for base data and overviews could conflict, e.g. different SRID-s, mis-matched pixelsizes, etc.

Why is it important to avoid linking with PostGIS?

There is three level of dependency over PostGIS:

1) Compilation dependency. This is only a problem as far as we have to first compile PostGIS in order to compile WKT Raster. This remains our problem, not a user problem.

2) Linking dependency. This is the major problem since it prevent us from releasing when we want as we always have to link with the last release of PostGIS and that this might brake support for older PostGIS. WKT Raster independency over PostGIS means WKT Raster can be installed over (almost) every version of PostGIS past and future (as far as their SQL API do not change and it (should) change less often than the C API). This give us much more flexibility (we can release when we want/need) and to our users (they can install over any version of PostGIS i.e. They don<t change their installation. We take for granted that WKT Raster users are already experienced PostGIS users). It means that even if they have a fairly old version of PostGIS installed on their system, they can install the very last WKT Raster. It also means that they can keep their version of WKT Raster even if they upgrade PostGIS. We can not take for granted that PostGIS folks will integrate WKT Raster into the main PostGIS trunk soon and fix this problem. There might be years before WKT Raster is officially integrated in PostGIS releases. Some functions still link with PostGIS and this does not seems to prevent WKT Raster to work with many versions of PostGIS. This is to be tested.

3) SQL Dependency. WKT Raster is very much dependent on PostGIS at the SQL level and it will remains like this. Most (every) WKT Raster functions handling geometries rely on PostGIS SQL functions. This does not prevent WKT Raster to work with past/future versions of PostGIS as far as the SQL API does not change, and it is not in the POSTGIS folks interest to change it too much so it should remain very stable.

So the general idea is: When we need PostGIS services to manipulate geometries, we should always use PL/pgSQL functions instead of implementing them in C. This prevent us from having to link with any PostGIS lib (mostly liblwgeom.a). This has the following advantages:

  1. We facilitate backward and forward compatibility with older and newer versions of PostGIS - The SQL API is much more stable than the internal C API.
  1. This makes WKT Raster releases as much as possible independent from PostGIS releases. We don't have to make sure WKT Raster is compatible (compile) with the very last PostGIS C function version.
  1. This prevent us from constantly having to adjust our code to changes in the PostGIS C API.

The pros and cons of using GDAL vs implementing our own raster services

A discussion is going on to find if we are better to use GDAL for most raster operations or if we should reimplement our own C raster functions them being based on GDAL algorithms or anything else. Here are the pros and cons of using GDAL:

Pros

  • We do not have to reinvent the wheel for many WKT Raster functions (AsPolygon, AsRaster, AsJPEG, AsTIFF, AsPNG, etc…). This means a shorter WKT Raster development time.
  • We can contribute to GDAL by fixing bugs and sometime implement new features. Actually most new WKT Raster features (eg. ST_MapAlgebra()) could be implemented in GDAL by WKT Raster folks and then used by WKT Raster.

Cons

  • We must translation WKT Raster WKB to GDALDataset with bands (likely using GDAL MEM driver).
  • We must compile and link with GDAL.

Specification Comments following the Code Sprint

This section is dedicated to discussion that took place during the Toronto Code Sprint 2009.

  1. Bug tracking - Launch separate Google project or join PostGIS bug tracker, as a module? Answer: We will track our bug in the PostGIS Google Code project. PostGIS plan on moving (back) to Trac soon.

  2. Wiki: is there a chance to have a better wiki for documentation, schedule, specs, road map, users inputs? Actual PostGIS wiki limitations: security, no images, no tables (or sophisticated page structure) (For this Leo and Regina are drafting up a PSC and then once we have a PSC we'll vote to locate on OSGEO. Main issue is Refractions server is kind of old)

    1. Is Google Code Wiki well working?

      (remember export issues with Google code - see gvSIG - Cuba)

  3. Windows build. What is the plan?

    1. First step is to review Windows/Visual C++ native build configuration for PostGIS. ML: I will try to investigate and solve outstanding issues like PGXS makefiles generation, soon.

    2. ML: Prefer native build toolsets on supported platform (Windows/Visual C++, GCC/Linux, etc.). On Windows, avoid Cygwin, MinGW, wherever possible. Issue directly related to point a).

    3. Is cmake a possibility?

  4. GiST indexing: is wrapping PostGIS Gist function in PL/PgSQL sufficient? From Paul comments it seems actually that cast are sufficients.

  5. API: Any other SQL raster function you would like to see?

    1. ML: ST_AsJPEG - What JPEG implementation to use?

    2. ML: ST_AsTIFF - Do we aim to depend on libgeotiff?

    3. ML: Is there any function that would require/fit GDAL dependency?

    4. Pierre: We should maybe investigate the cost of using GDAL and have a general ST_AsImage(format, params) instead of using the individual tiff, jpeg and png libraries. Those will anyway also have a cost I guess… And we will probably have to link with GDAL for DumpAsPolygons(rast) and AsRaster(geom) anyway.

  6. Implementing raster → vector conversion for intersections(). How to do it? Possible using only plpgsql? So we don't have to link with liblwgeos and we can have our own release schedule.
    • polygonize with GDAL or GRASS, that generates some internal structure for those libraries, and you turn that into a WKT form of polygon. That can be passed to PostGIS, and you don't need to link to liblwgeom.

  7. Best practice for band referencing: having a band parameter (index?) or always use ST_Band()

    -e.g. SELECT ST_AsTIFF(ST_Band(raster,2),60) FROM coverandtemp

    SELECT ST_AsTIFF(ST_Band(raster,2),ST_Band(raster,1),60) FROM coverandtemp
    or
    SELECT ST_AsTIFF(raster,2,60) FROM coverandtemp

    Pierre: Having a band parameter (or a list of band) for each function trying to access a band would be preferable to avoid useless memcopy of band data.

  8. Architecture & design issues:

    • How to handle no NODATA? Pierre from Frank input: We'll need a HasNoDataValue flag for each band.

  9. Documentation

    • Setup a structure (likely on a Wiki), sections for developers and for users

    • Update existing documents like RFCs - they are out of date.

    • Describe key terms: canonical format, serialized format, transfer format (WKB), in-db raster, out-db raster, etc. This should help people to grasp the WKT Raster key elements quickly, and help them to decide which document/RFC to read in details.

  10. Development Roadmap

    • Review current schedule.
    • Add note on a website where users should (or should not) expect 1 release - clear statement.

Attachments (2)

Download all attachments as: .zip

Note: See TracWiki for help on using the wiki.