Changes between Version 2 and Version 3 of WKTRasterTutorial01


Ignore:
Timestamp:
06/10/10 13:30:25 (15 years ago)
Author:
pracine
Comment:

Legend:

Unmodified
Added
Removed
Modified
  • WKTRasterTutorial01

    v2 v3  
    2828Launch a command prompt (or a shell under Linux) and make sure you can execute "shp2pgsql". If this works, you should get the help for the shp2pgsql command. If not, make sure the folder where it is located ("PostgreSQL_Installation_Path/bin) is in you PATH environement variable or simply type the complete path to shp2pgsql like this:
    2929
     30{{{
    3031        >"c:/Program Files/PostgreSQL/8.X/bin/shp2pgsql"
     32}}}
    3133       
    3234shp2pgsql works in two steps: The first step is to create a .sql file that you then have to execute using psql, an application which comes with PostgreSQL. To convert the caribou point shapefile in cariboupoint.sql, we do the following:
    3335
    34         >shp2pgsql.exe -s 32198 -I C:\Temp\Pierre\Data\CaribouPoints\cariboupoints.shp > C:\Temp\
     36{{{     >shp2pgsql.exe -s 32198 -I C:\Temp\Pierre\Data\CaribouPoints\cariboupoints.shp > C:\Temp\
    3537Pierre\Data\CaribouPoints\cariboupoints.sql
     38}}}
    3639
    3740The -s option tells the loader that the shapefile points are located in the Quebec Lambert spatial reference system which has SRID number 32198 in the PostGIS spatial_ref_sys table.
     
    4346You can then execute the sql file with psql like this:
    4447
     48{{{
    4549        >psql -f C:\Temp\Pierre\Data\CaribouPoints\cariboupoints.sql demodb     
     50}}}
    4651       
    4752== Visualizing the caribou point in OpenJUMP ==
     
    5762 3. In the query field, type the following command:
    5863
     64{{{
    5965        SELECT id, ST_AsBinary(the_geom)
    60         FROM cariboupoints
     66        FROM cariboupoints;
     67}}}
    6168       
    6269You should see your caribou observation point appear in the OpenJUMP map window.
     
    6875To display gdal2wktraster.py help, simply do:
    6976
     77{{{
    7078        >gdal2wktraster.py -h
     79}}}
    7180
    7281We downloaded 8 zipped files. If you have 7-Zip installed, you can simply select all the zip files, right-click on them and select "7-Zip->Extract files..." 7-Zip will extract each TIF file in the same folder. You can then create another folder and move them there. Each file is 6000 pixels wide by 6000 pixel high. You can quickly preview the images if you have IrfanView installed. Efficient raster/vector analysis with PostGIS WKT Raster requires the raster files to be split into tiles in the database. Fortunately gdal2wktraster.py has an option to tile the images the size we like. We will tile them into 100x100 pixels resulting in 60 * 60 * 13 = 46800 tiles. gdal2wktraster.py also accepts wildcard so we will be able to load our 8 big raster in one sole command.
     
    7483To load the raster files in PostGIS WKT Raster, do something like:
    7584
     85{{{
    7686        >gdal2wktraster.py -r C:\Temp\Pierre\Data\SRTM\tif\*.tif -t srtm_tiled -s 4326 -k 100x100 -I > C:\Temp\Pierre\Data\SRTM\srtm.sql
    77        
     87}}}
     88
    7889The -r option indicate the series of raster we want to load in the database.
    7990
     
    8697The -I option tells the loader to create a spatial index on the raster tile table. If you forget to add this option you can always add the index by executing a SQL command similar to this one:
    8798
     99{{{
    88100        CREATE INDEX srtm_tiled_gist_idx ON srtm_tiled USING GIST (ST_ConvexHull(rast));
     101}}}
    89102
    90103The result of the gdal2wktraster.py command is a 1.8 GB .sql file produced in 1 minute on my brand new Lenovo X201 labtop (Intel Core i5, 1.17 GHz, 3 GB or RAM :-).
     
    92105The same way we loaded the caribou point sql command file, we will load this sql file using psql:
    93106
     107{{{
    94108        >psql -f C:\Temp\Pierre\Data\SRTM\srtm.sql demodb
     109}}}
    95110
    96111This process took less than 4 minutes. You can quickly verify the success of the loading operation by looking at the number of row present in the srtm_tiled table. There should be 46800 rows.
     
    98113You can then visualise the extent of each of those 46800 rasters by typing the following command in the OpenJUMP "Run Datastore Query" dialog:
    99114
    100         SELECT rid, ST_AsBinary(rast::geometry) FROM srtm_tiled
     115{{{
     116        SELECT rid, ST_AsBinary(rast::geometry) FROM srtm_tiled;
     117}}}
    101118       
    102119You can also view the complete area covered by the raster coverage as one geometry by doing this:
    103120
    104         SELECT ST_AsBinary(ST_Buffer(ST_Union(rast::geometry), 0.000001)) FROM srtm_tiled
     121{{{
     122        SELECT ST_AsBinary(ST_Buffer(ST_Union(rast::geometry), 0.000001)) FROM srtm_tiled;
     123}}}
    105124       
    106125As you can see, unlike most raster coverage loaded into a GIS, the area covered by this single table raster coverage is not stricly rectangular and the two big missing square areas are not filled with nodata values. Here you have only one table for the whole coverage which is build from many raster files loaded in a single step. You don't need 13 tables to store 13 rasters. We did this with one band SRTM files in TIF format, but it could have been 3 bands BMP files of JPEG and the total size of the coverage does not really matter (PostgreSQL has a limit of 32 Terabytes)...
     
    108127If you want to go to the pixel level and verify the integrity of the values associated to a sample of the raster, (thanks to Jorge Arevalo and GDAL) you can view a vectorization of one of the tile (vectorizing all the table would be way too long) in OpenJUMP. Just type the following SQL query in the same query dialog:
    109128
     129{{{
    110130        SELECT ST_AsBinary((ST_DumpAsPolygons(rast)).geom), (ST_DumpAsPolygons(rast)).val
    111131        FROM srtm_tiled
    112         WHERE rid=34
     132        WHERE rid=34;
     133}}}
    113134       
    114135You will notice that a small area in the upper left corner of the tile is missing. This is the nodata value area. Any analysis function in PostGIS WKT Raster takes the nodata value into account. That means those area will be ommited from the operation. For example in the upcoming intersection operation, if one buffer falls in a nodata value area, st_intersection() and st_intersects() will ignore any part of the raster with nodata values possibly resulting in no intersection at all.
     
    116137You can verify that your raster has a nodatavalues set and the effective value of the nodata value by doing this query:
    117138
     139{{{
    118140        SELECT ST_BandHasNodataValue(rast), ST_BandNodataValue(rast)
    119141        FROM srtm_tiled
    120         LIMIT 1
     142        LIMIT 1;
     143}}}
    121144
    122145I you want a query to ignore the nodata value set and treat it like any other value, you can always replace any rast argument with ST_SetBandHasNodataValue(rast, FALSE) like in this query:
    123146
     147{{{
    124148        SELECT ST_AsBinary((ST_DumpAsPolygons(ST_SetBandHasNodataValue(rast, FALSE) )).geom), (ST_DumpAsPolygons(ST_SetBandHasNodataValue(rast, FALSE))).val
    125149        FROM srtm_tiled
    126         WHERE rid=XXX
     150        WHERE rid=34;
     151}}}
    127152
    128153== Producing a random point table similar to the caribou shapefile ==
     
    130155Now that we know the extent of our elevation data, we can produce a table similar to the caribou_point one. For this we will use a function that generate a certain number of point geometry inside a polygon. Copy this function in the pgAdmin III query window and execute it:
    131156
    132 CREATE OR REPLACE FUNCTION ST_RandomPoints(geom geometry, nb int)
     157{{{
     158    CREATE OR REPLACE FUNCTION ST_RandomPoints(geom geometry, nb int)
    133159    RETURNS SETOF geometry AS
    134160    $$
     
    170196    $$
    171197    LANGUAGE 'plpgsql' IMMUTABLE STRICT;
     198}}}
    172199
    173200You can then generate a fake cariboupoints table with 814 points doing this query:
    174201
     202{{{
    175203        CREATE TABLE cariboupoints AS
    176204        SELECT generate_series(1,1000) id,
    177205               ST_RandomPoint(the_geom, 1000) the_geom
    178         FROM (SELECT ST_Transform(ST_Extent(rast::geometry), 32198) FROM srtm_tiled) foo
     206        FROM (SELECT ST_Transform(ST_Extent(rast::geometry), 32198) FROM srtm_tiled) foo;
     207}}}
    179208
    180209Similarly you can visualise your own caribou point layer with this OpenJUMP query:
    181210
     211{{{
    182212        SELECT id, ST_AsBinary(the_geom)
    183         FROM cariboupoints
     213        FROM cariboupoints;
     214}}}
    184215
    185216Now that we have our two coverages loaded in the database and that we are confident about their values and spatial reference, let's start doing some analysis queries...
     
    189220This operation is also very straightforward for any casual PostGIS user. The caribou point table is in a spatial reference system suitable for measurements in meters so we can build our buffers in this system and then convert them to the raster one later in order to intersects them. We will make 1 km buffers around our points and save them as a new table. We do this operation in a pgAdmin III query window (not in OpenJUMP):
    190221
     222{{{
    191223        CREATE TABLE cariboupoint_buffers AS
    192224        SELECT id, ST_Buffer(the_geom, 1000) the_geom
    193         FROM cariboupoints
    194        
     225        FROM cariboupoints;
     226}}}
     227
    195228Again you can display your buffers in OpenJUMP with this query:
    196229
     230{{{
    197231        SELECT id, ST_AsBinary(the_geom)
    198         FROM cariboupoint_buffers
    199        
     232        FROM cariboupoint_buffers;
     233}}}
     234
    200235We then reproject our buffers to WGS 84 so we can intersect them with our raster coverage:
    201236
     237{{{
    202238        CREATE TABLE cariboupoint_buffers_wgs AS
    203239        SELECT id, ST_Transform(the_geom, 4326) the_geom
    204         FROM cariboupoint_buffers
    205        
     240        FROM cariboupoint_buffers;
     241}}}
     242
    206243Again the same kind of query to display them in OpenJUMP and confirm that they overlap with the raster coverage and see that now they are a bit squashed:
    207244
     245{{{
    208246        SELECT id, ST_AsBinary(the_geom)
    209         FROM cariboupoint_buffers_wgs
    210        
     247        FROM cariboupoint_buffers_wgs;
     248}}}
     249
    211250We could have done those two queries in one:
    212251
     252{{{
    213253        CREATE TABLE cariboupoint_buffers_wgs AS
    214254        SELECT id, ST_Transform(ST_Buffer(the_geom, 1000), 4326) the_geom
    215         FROM cariboupoints
    216        
     255        FROM cariboupoints;
     256}}}
     257
    217258We then create a spatial index on this last table to make sure the next intersection operation will be done as quickly as possible:
    218259
     260{{{
    219261        CREATE INDEX cariboupoint_buffers_wgs_geom_idx ON cariboupoint_buffers_wgs USING GIST (the_geom);
    220        
     262}}}
     263
    221264== Intersecting the caribou buffers with the elevation rasters ==
    222265
    223266Our two coverages are now ready to be intersected. We will use the brand new st_intersection() function and st_intersects() operator which operates directly on raster and geometries by vectorizing only the necessary part of the raster before doing the intersection and is one of the main feature of PostGIS WKT Raster: 
    224267
     268{{{
    225269        CREATE TABLE caribou_srtm_inter AS
    226270        SELECT id,
     
    229273        FROM cariboupoint_buffers_wgs,
    230274             srtm_tiled
    231         WHERE st_intersects(rast, the_geom)
     275        WHERE st_intersects(rast, the_geom);
     276}}}
    232277       
    233278This query takes about 30 minutes to complete. It can greatly be optimized by invoquing the st_intersection only once with the help of a subquery producing record values that we split later in the main query like this:
    234279
     280{{{
    235281        CREATE TABLE caribou_srtm_inter AS
    236282        SELECT id,
     
    242288                   cariboupoint_buffers_wgs
    243289              WHERE st_intersects(rast, the_geom)
    244              ) foo
     290             ) foo;
     291}}}
    245292       
    246293This version takes only 17 minutes. Almost half the time spent by the preceding one... This little increase in complexity certainly worths the gain in performance.
     
    248295Once again you can visualise the result of the query by doing this query in OpenJUMP (you might have to increase RAM allowed to OpenJUMP by changing "-Xmx256M" to "-Xmx1024M" in "OpenJUMPInstallPath/bin/openjump.bat"):
    249296
     297{{{
    250298        SELECT id, val, ST_AsBinary(the_geom)
    251         FROM caribou_srtm_inter
     299        FROM caribou_srtm_inter;
     300}}}
    252301
    253302One approach to make this operation without PostGIS WKT Raster would have been to first convert our 13 rasters to 13 shapefiles, import them in PostGIS and do a traditionnal intersection query between geometries only. The problem is that converting only one of those SRTM file is very difficult... The vector version of only one of those files exceed the limit of 2 GB imposed on shapefiles and the GML equivalent file is more than 10 GB... The WKT Raster approach goes like this: "since it is mostly impossible to convert those raster to vector, load them all into PostGIS as is and only the smallest part of them raster will be vectorized to do the requested operation", this using the very efficient GiST index, the versatility of GDAL and of power of the postGIS vector engine. XXXXXWe will later examine alternative solutions to this particular vectorization problem as well as how we could do the complete raster/vector operation using other software.
     
    258307The last step of our analysis is to summarize the elevation for each buffer by computing 814 weighted mean elevations. "Weighted mean elevation" means that if the area occupied by pixels with an elevation equal to 234 m is greater than the area occupy pixels with an elevation equal to 56 m, then 234 has a higher weight in the equation (proportional to it relative area).This query should do the work:
    259308
     309{{{
    260310        CREATE TABLE result01 AS
    261311        SELECT id,
     
    263313        FROM caribou_srtm_inter1
    264314        GROUP BY id
    265         ORDER BY id
     315        ORDER BY id;
     316}}}