Changes between Version 10 and Version 11 of WKTRasterTutorial01


Timestamp: Jun 10, 2010, 1:54:15 PM
Author: pracine
  • WKTRasterTutorial01

v10 → v11 (lines marked "v10:" were removed, lines marked "v11:" were added)
= PostGIS WKT Raster Tutorial 1 - Intersecting vector buffers with large raster coverage using PostGIS WKT Raster =

v11: Pierre Racine, June 2010

This tutorial will show you how to do a very classic raster/vector analysis with PostGIS WKT Raster. Basically, the problem is to compute the mean elevation for buffers surrounding a series of points representing caribou observations. You can do this kind of analysis pretty easily in any GIS package, but what is special here, and is not easy to do in ANY GIS package, is the size of the datasets used (900 MB of raster data), the simplicity of the queries and the speed at which we will get the results.

v10: We will describe the steps mainly on Windows, but Linux gurus will have no problem writing the equivalent commands in their favourite OS. We assume that PostgreSQL, PostGIS and PostGIS WKT Raster are well installed. Refer to the readme file of each software to install them properly. The version used are:
v11: We will describe the steps mainly on Windows, but Linux gurus will have no problem writing the equivalent commands in their favorite OS. We assume that PostgreSQL, PostGIS and PostGIS WKT Raster are properly installed. Refer to the readme file of each package to install it. The versions used are:

 * PostgreSQL 8.4
     
Here is the procedure to import the caribou point shapefile:

v10: Launch a command prompt (or a shell under Linux) and make sure you can execute "shp2pgsql". If this works, you should get the help for the shp2pgsql command. If not, make sure the folder where it is located ("PostgreSQL_Installation_Path/bin) is in you PATH environement variable or simply type the complete path to shp2pgsql like this:
v11: Launch a command prompt (or a shell under Linux) and make sure you can execute "shp2pgsql". If this works, you should get the help for the shp2pgsql command. If not, make sure the folder where it is located ("PostgreSQL_Installation_Path/bin") is in your PATH environment variable, or simply type the complete path to shp2pgsql like this:

{{{

}}}
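For example, on a default Windows setup the full-path call might look like the sketch below (the installation path is only an assumption; running shp2pgsql with no arguments simply prints its help):

{{{
REM adjust the path to your own PostgreSQL installation
"C:\Program Files\PostgreSQL\8.4\bin\shp2pgsql"
}}}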
v10: shp2pgsql works in two steps: The first step is to create a .sql file that you then have to execute using psql, an application which comes with PostgreSQL. To convert the caribou point shapefile in cariboupoint.sql, we do the following:
v11: shp2pgsql works in two steps: the first step is to create a .sql file that you then have to execute using "psql", an application included with PostgreSQL. To convert the caribou point shapefile into "cariboupoint.sql", we do the following:

{{{
     
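As an illustration only, the two steps could look like the following sketch (the shapefile name, the target table, the database name and the SRID 32198 assumed here for "NAD 83/Quebec Lambert" should all be adapted to your data):

{{{
REM step 1: generate the SQL file (file name, table name and SRID are examples)
shp2pgsql -s 32198 -I cariboupoint.shp cariboupoint > cariboupoint.sql
REM step 2: run the SQL file with psql (database name is an example)
psql -d tutorial01 -f cariboupoint.sql
}}}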
The -t option specifies the table into which we want to load them.

v10: Similar to shp2pgsql.exe, the -s option is required to specify the spatial reference system ID. In this case the raster are in WGS 84 having the SRID number 4326. Unlike some GIS software, PostGIS does not support on the fly reprojection so that we cannot do operations on table stored in different spatial reference systems. As we could see, the caribou point layer was in the projected spatial reference system NAD 83/Quebec Lambert", when the SRTM images are in the geographic spatial reference system "WGS 84". We will have to deal with this later.
v11: Similar to shp2pgsql.exe, the -s option is required to specify the spatial reference system ID. In this case the rasters are in "WGS 84", which has SRID 4326. Unlike some GIS software, PostGIS does not support on-the-fly reprojection, so we cannot do operations on tables stored in different spatial reference systems. As we saw, the caribou point layer is in the projected spatial reference system "NAD 83/Quebec Lambert", while the SRTM images are in the geographic spatial reference system "WGS 84". We will have to deal with this later.

The last option, -k, specifies the size of the tiles we want to load into PostGIS.
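Putting the three options together, a call to gdal2wktraster.py might look like the sketch below. Only -s, -t and -k are described above, so the -r input option, the file pattern, the 50x50 tile size and the redirection to srtm.sql are assumptions to adapt to your setup:

{{{
REM input pattern, tile size and output file are examples only
python gdal2wktraster.py -r "C:\temp\srtm\*.tif" -t srtm_tiled -s 4326 -k 50x50 > srtm.sql
}}}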
     
The result of the gdal2wktraster.py command is a 1.8 GB .sql file produced in 1 minute on my brand new Lenovo X201 laptop (Intel Core i5, 1.17 GHz, 3 GB of RAM :-).

v10: The same way we loaded the caribou point sql command file, we will load this sql file using psql:
v11: In the same way we loaded the caribou point sql command file, we will load this sql file using "psql":

{{{

}}}
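For example, such a load could look like this (the database name is an assumption):

{{{
psql -d tutorial01 -f srtm.sql
}}}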
v10: This process took less than 4 minutes. You can quickly verify the success of the loading operation by looking at the number of row present in the srtm_tiled table. There should be 46800 rows.
v10: You can then visualise the extent of each of those 46800 rasters by typing the following command in the OpenJUMP "Run Datastore Query" dialog:
v11: This process took less than 4 minutes. You can quickly verify the success of the loading operation by looking at the number of rows present in the "srtm_tiled" table. There should be 46800 rows.
v11: You can then visualize the extent of each of those 46800 rasters by typing the following command in the OpenJUMP "Run Datastore Query" dialog:

{{{

}}}
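As a sketch, the row count check, together with an extent query assuming the rid and rast column names created by the loader, could look like this:

{{{
-- should return 46800
SELECT count(*) FROM srtm_tiled;

-- one geometry per tile extent (rid and rast are assumed column names)
SELECT rid, st_convexhull(rast) AS geom FROM srtm_tiled;
}}}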
v10: As you can see, unlike most raster coverage loaded into a GIS, the area covered by this single table raster coverage is not stricly rectangular and the two big missing square areas are not filled with nodata values. Here you have only one table for the whole coverage which is build from many raster files loaded in a single step. You don't need 13 tables to store 13 rasters. We did this with one band SRTM files in TIF format, but it could have been 3 bands BMP files of JPEG and the total size of the coverage does not really matter (PostgreSQL has a limit of 32 Terabytes)...
v11: As you can see, unlike most raster coverages loaded into a GIS, the area covered by this single-table raster coverage is not strictly rectangular, and the two big missing square areas are not filled with nodata values. Here you have only one table for the whole coverage, built from many raster files loaded in a single step. You don't need 13 tables to store 13 rasters. We did this with one-band SRTM files in TIF format, but it could have been three-band BMP or JPEG files, and the total size of the coverage does not really matter (PostgreSQL has a limit of 32 terabytes)...

If you want to go to the pixel level and verify the integrity of the values associated with a sample of the raster (thanks to Jorge Arevalo and GDAL), you can view a vectorization of one of the tiles (vectorizing the whole table would take way too long) in OpenJUMP. Just type the following SQL query in the same query dialog:

}}}
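As a sketch, a query that vectorizes a single tile, assuming the st_dumpaspolygons() function and the rid/rast column names, could be:

{{{
-- rid = 34 is an arbitrary example tile
SELECT (pix).val AS elevation, (pix).geom AS geom
FROM (SELECT st_dumpaspolygons(rast) AS pix
      FROM srtm_tiled
      WHERE rid = 34) AS onetile;
}}}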
v10: You will notice that a small area in the upper left corner of the tile is missing. This is the nodata value area. Any analysis function in PostGIS WKT Raster takes the nodata value into account. That means those area will be ommited from the operation. For example in the upcoming intersection operation, if one buffer falls in a nodata value area, st_intersection() and st_intersects() will ignore any part of the raster with nodata values possibly resulting in no intersection at all.
v11: You will notice that a small area in the upper left corner of the tile is missing. This is the nodata value area. Any analysis function in PostGIS WKT Raster takes the nodata value into account. That means those areas will be omitted from the operation. For example, in the upcoming intersection operation, if one buffer falls in a nodata value area, st_intersection() and st_intersects() will ignore any part of the raster with nodata values, possibly resulting in no intersection at all.

You can verify that your raster has a nodata value set, and what that value is, by doing this query:
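A sketch of such a check, assuming the st_bandnodatavalue() accessor:

{{{
-- band 1 is assumed; any tile will do
SELECT st_bandnodatavalue(rast, 1) FROM srtm_tiled LIMIT 1;
}}}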
     
}}}

v10: One approach to make this operation without PostGIS WKT Raster would have been to first convert our 13 rasters to 13 shapefiles, import them in PostGIS and do a traditionnal intersection query between geometries only. The problem is that converting only one of those SRTM file is very difficult... The vector version of only one of those files exceed the limit of 2 GB imposed on shapefiles and the GML equivalent file is more than 10 GB... The WKT Raster approach goes like this: "since it is mostly impossible to convert those raster to vector, load them all into PostGIS as is and only the smallest part of them raster will be vectorized to do the requested operation", this using the very efficient GiST index, the versatility of GDAL and of power of the postGIS vector engine.
v11: One approach to performing this operation without PostGIS WKT Raster would have been to first convert our 13 rasters to 13 shapefiles, import them into PostGIS and do a traditional intersection query between geometries only. The problem is that converting even one of those SRTM files is very difficult... The vector version of a single one of those files exceeds the 2 GB limit imposed on shapefiles, and the equivalent GML file is more than 10 GB... The WKT Raster approach goes like this: "since it is mostly impossible to convert those rasters to vector, load them all into PostGIS as is, and only the smallest necessary part of the rasters will be vectorized to do the requested operation", all this using the very efficient GiST index, the versatility of GDAL and the power of the PostGIS vector engine.
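A sketch of the kind of query this approach leads to is shown below; the buffer table and its column names are assumptions, and st_intersection(raster, geometry) is assumed to return (geom, val) pairs:

{{{
-- mean elevation per caribou buffer; table and column names are examples only
SELECT id, avg((gv).val) AS mean_elevation
FROM (SELECT buf.id, st_intersection(srtm.rast, buf.geom) AS gv
      FROM srtm_tiled srtm, cariboupoint_buffers_wgs84 buf
      WHERE st_intersects(srtm.rast, buf.geom)) AS inter
GROUP BY id;
}}}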